radsqlrelay performance issues with network latency

Dr. Uwe Meyer-Gruhl freeradius_email at congenio.de
Wed Dec 19 13:49:07 CET 2012


Yes, you can safely use it:

Such INSERT statements will not be collected, however, since they do not 
match the specified regex pattern

/^\s*insert\s+into\s+`?\w+`?\s+(?:\(.*?\)\s+)?
       values\s*\(.*\)\s*;\s*$/ix

because of the characters "ON DUPLICATE KEY UPDATE ..." trailing the 
brackets after the VALUES. Thus, they are handled as any other 
non-INSERT statements by the script: they are copied verbatim.

I think there is no point in making the script recognize this special 
case, since the timestamps most probably differ on every line. Even if 
they were the same, I do not know whether the SQL syntax allows for a 
multi-row INSERT with "ON DUPLICATE KEY UPDATE".


Regards,


Uwe Meyer-Gruhl


am 19.12.2012 12:00 schrieb freeradius-devel-request at lists.freeradius.org:
> Message: 1
> Date: Tue, 18 Dec 2012 14:11:59 +0100
> From: Stefan Winter <stefan.winter at restena.lu>
> To: freeradius-devel at lists.freeradius.org,
> 	freeradius_email at congenio.de
> Subject: Re: radsqlrelay performance issues with network latency
> Message-ID: <50D06B9F.4000508 at restena.lu>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
>
> that's a very interesting script, thanks!
>
> We've been looking for ways to improve radsqlrelay's performance on our
> side, too, and this certainly comes in handy.
>
> One question though: our statements are like:
>
> INSERT INTO lastusage   (username,   service,   type)  VALUES (
> 'username',   'IMAP',   'Access-Accept')  ON DUPLICATE KEY UPDATE
> last_seen='2012-12-18 14:04:44'
>
> Not being a Perl expert (so I can't read the code and make sense of it),
> I wonder if the algorithm that detects "similar" statements is also
> prepared to cope with such statements with a suffix ON DUPLICATE ... .
>
> Can I safely use the new radsqlrelay for these statements?
>
> Greetings,
>
> Stefan Winter
>
> On 13.12.2012 14:47, Dr. Uwe Meyer-Gruhl wrote:
>> Hi,
>>
>>
>> as probably most of you know, the rlm_sql_log/radsqlrelay pair of
>> programs was invented to solve to problem of intermittent database
>> downtimes. Furthermore, it is supposed to make FreeRADIUS answer faster
>> because of the decoupling of the database operations.
>>
>> However, I found out that with latency of 25ms between a MySQL database
>> server and the FreeRADIUS machine, there still is a performance problem:
>> In our large-scale installation with > 100 log entries per seconds, we
>> saw that the sqlrelay log file was constantly growing. It grew faster
>> than radsqlrelay was able to shift the data to the database server.
>> Another FreeRADIUS instance with < 1ms latency had no problem.
>>
>> The root cause is that each SQL statement is handled individually within
>> radsqlrelay. So, with a network latency of 25ms, there can be at most
>> 1000ms/25ms = 40 database operations per second, while with 1ms latency,
>> there can be 1000 per second.
>>
>> Since the rlm_sql_log output consists mainly of INSERT statements to the
>> same table and with the same structure, we can collect several INSERT
>> statements into one with multiple rows. This can be done by the
>> radsqlrelay Perl script very elegantly, since Perl was invented for such
>> purposes. I have appended a modified version of radsqlrelay which
>> collects up to 100 consecutive INSERT statements into one before doing
>> the actual database submittal.
>>
>> I have only tested the script for MySQL. It cannot optimize in this way
>> for Oracle databases, since Oracle does not conform to the SQL standard
>> for multi-row INSERTs, but the script automatically reduces the maximum
>> number of collected statements to 1 in that case. PostgreSQL databases
>> should be able to cope with multi-row INSERTs as well and benefit likewise.
>>
>> The theoretical number of INSERT statements per second is lifted by a
>> factor of 100 by this adapted script. In our own scenario, there will be
>> no bottlenecks any more, even if the latency increased to 250ms.
>>
>> Maybe this improvement can be included in one of the next FreeRADIUS
>> releases.
>>
>>
>> Kind regards,
>>
>>
>> Uwe Meyer-Gruhl
>>
>>
>> ---
>> #!/usr/bin/perl
>> ##
>> ##  radsqlrelay.pl    This program tails a SQL logfile and forwards
>> ##            the queries to a database server. Used to
>> ##            replicate accounting records to one (central)
>> ##            database, even if the database has extended
>> ##            downtime.
>> ##
>> ##  Version:    $Id$
>> ##
>> ##  Author:     Nicolas Baradakis <nicolas.baradakis at cegetel.net>
>> ##
>> ##  Copyright (C) 2005 Cegetel
>> ##
>> ##  This program is free software; you can redistribute it and/or
>> ##  modify it under the terms of the GNU General Public License
>> ##  as published by the Free Software Foundation; either version 2
>> ##  of the License, or (at your option) any later version.
>> ##
>> ##  This program is distributed in the hope that it will be useful,
>> ##  but WITHOUT ANY WARRANTY; without even the implied warranty of
>> ##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> ##  GNU General Public License for more details.
>> ##
>> ##  You should have received a copy of the GNU General Public License
>> ##  along with this program; if not, write to the Free Software
>> ##  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
>> 02110-1301, USA
>> ##
>>
>> use DBI;
>> use Fcntl;
>> use Getopt::Std;
>> use POSIX qw(:unistd_h :errno_h);
>>
>> use warnings;
>> use strict;
>>
>> my $maxcollect = 100;    # tunable, works for MySQL!
>>
>> my $lastinsert;
>> my @values;
>>
>> my $need_exit = 0;
>>
>> sub got_signal()
>> {
>>      $need_exit = 1;
>> }
>>
>> # /!\ OS-dependent structure
>> # Linux struct flock
>> #   short l_type;
>> #   short l_whence;
>> #   off_t l_start;
>> #   off_t l_len;
>> #   pid_t l_pid;
>> # c2ph says: typedef='s2 l2 i', sizeof=16
>> my $FLOCK_STRUCT = 's2l2i';
>>
>> sub setlock($;$$)
>> {
>>      my ($fh, $start, $len) = @_;
>>      $start = 0 unless defined $start;
>>      $len = 0 unless defined $len;
>>
>>                                      #type     whence    start till  pid
>>      my $packed = pack($FLOCK_STRUCT, F_WRLCK, SEEK_SET, $start, $len, 0);
>>      if (fcntl($fh, F_SETLKW, $packed)) { return 1 }
>>      else { return 0 }
>> }
>>
>> sub usage()
>> {
>>      print STDERR <<HERE;
>> usage: radsqlrelay [options] file_path
>> options:
>>      -?        Print this help message.
>>      -1        One-shot mode: push the file to database and exit.
>>      -b database    Name of the database to use.
>>      -d sql_driver    Driver to use: mysql, pg, oracle.
>>      -f file        Read password from file, instead of command line.
>>      -h host        Connect to host.
>>      -P port        Port number to use for connection.
>>      -p passord    Password to use when connecting to server.
>>      -u user        User for login.
>>      -x        Turn on debugging.
>> HERE
>> }
>>
>> sub connect_wait($)
>> {
>>      my $dbinfo = shift;
>>      my $dbh;
>>      while (!$dbh) {
>>      $dbh = DBI->connect($dbinfo->{base}, $dbinfo->{user}, $dbinfo->{pass},
>>                  { RaiseError => 0, PrintError => 0,
>>                    AutoCommit => 1 });
>>      sleep (1) if !$dbh;
>>      exit if $need_exit;
>>      }
>>      $dbinfo->{handle} = $dbh;
>> }
>>
>>
>>
>> sub process_file($$)
>> {
>>      my ($dbinfo, $path) = @_;
>>
>>      sub do_inserts($) {
>>          my $dbinfo = shift;
>>          if (scalar(@values) > 0) {
>>              my $query = $lastinsert . " ";
>>              $query .= join(" ), ( ", at values);
>>              $query .= " );";
>>              do_query($dbinfo,$query);
>>          }
>>          @values = ();
>>      }
>>
>>      sub do_query($$) {
>>          my ($dbinfo,$query) = @_;
>>          until ($dbinfo->{handle}->do($query)) {
>>              print $dbinfo->{handle}->errstr."\n";
>>              if ($dbinfo->{handle}->ping) {
>>              sleep (1);
>>              } else {
>>              print "error: Lost connection to database\n";
>>              $dbinfo->{handle}->disconnect;
>>              connect_wait($dbinfo);
>>              }
>>          }
>>      }
>>
>>
>>
>>      unless (-e $path.'.work') {
>>      until (rename($path, $path.'.work')) {
>>          if ($! == ENOENT) {
>>          sleep(1);
>>          return if $need_exit;
>>          } else {
>>          print STDERR "error: Couldn't move $path to $path.work: $!\n";
>>          exit 1;
>>          }
>>      }
>>      }
>>
>>      open(FILE, "+< $path.work") or die "error: Couldn't open $path.work:
>> $!\n";
>>      setlock(\*FILE) or die "error: Couldn't lock $path.work: $!\n";
>>
>>      $lastinsert = "";
>>      @values = ();
>>
>>      while (<FILE>) {
>>          chomp (my $line = $_);
>>
>>          if (!($line =~ /^\s*insert\s+into\s+`?\w+`?\s+(?:\(.*?\)\s+)?
>>                              values\s*\(.*\)\s*;\s*$/ix)) {
>>              # This is no INSERT, so start new collection
>>              do_inserts($dbinfo);
>>              $lastinsert = "";
>>              # must output this line
>>              do_query($dbinfo, "$line");
>>          }
>>          else {
>>              # This is an INSERT, so collect it
>>              my $insert = $line;
>>              my $values = $line;
>>              $insert =~ s/^\s*(insert\s+into\s+`?\w+`?\s+(?:\(.*?\)\s+)?
>>                                values\s*\().*\)\s*;\s*$/$1/ix;
>>              $values =~ s/^\s*insert\s+into\s+`?\w+`?\s+(?:\(.*?\)\s+)?
>>                               values\s*\((.*)\)\s*;\s*$/$1/ix;
>>
>>              if (($lastinsert ne "") && ($insert ne $lastinsert)) {
>>                  # This is different from the last one
>>                  do_inserts($dbinfo);
>>              }
>>              push(@values, $values);
>>              $lastinsert = $insert; # start new collection
>>          }
>>
>>          # limit to $maxcollect collected lines
>>          if (scalar(@values) >= $maxcollect) {
>>              do_inserts($dbinfo);
>>          }
>>      }
>>
>>      # Cleanup
>>      do_inserts($dbinfo);
>>
>>      unlink($path.'.work');
>>      close(FILE); # and unlock
>> }
>>
>> # sub main()
>>
>> my %args = (
>>          b => 'radius',
>>          d => 'mysql',
>>          h => 'localhost',
>>          p => 'radius',
>>          u => 'radius',
>> );
>> my $ret = getopts("b:d:f:h:P:p:u:x1?", \%args);
>> if (!$ret or @ARGV != 1) {
>>      usage();
>>      exit 1;
>> }
>> if ($args{'?'}) {
>>      usage();
>>      exit 0;
>> }
>>
>> my $data_source;
>> if (lc($args{d}) eq 'mysql') {
>>      $data_source = "DBI:mysql:database=$args{b};host=$args{h}";
>> } elsif (lc($args{d}) eq 'pg') {
>>      $data_source = "DBI:Pg:dbname=$args{b};host=$args{h}";
>> } elsif (lc($args{d}) eq 'oracle') {
>>      $data_source = "DBI:Oracle:$args{b}";
>>      # Oracle does not conform to the SQL standard for multirow INSERTs
>>      $maxcollect = 1;
>> } else {
>>      print STDERR "error: SQL driver not supported yet: $args{d}\n";
>>      exit 1;
>> }
>> $data_source .= ";port=$args{P}" if $args{'P'};
>>
>> my $pw;
>> if($args{f}) {
>>      open(FILE, "< $args{f}") or die "error: Couldn't open $args{f}: $!\n";
>>      $pw = <FILE>;
>>      chomp($pw);
>>      close(FILE);
>> } else {
>>      # args{p} is always defined.
>>      $pw = $args{p};
>> }
>>
>> $SIG{INT} = \&got_signal;
>> $SIG{TERM} = \&got_signal;
>>
>> my %dbinfo = (
>>            base => $data_source,
>>            user => $args{u},
>>            pass => $pw,
>> );
>> connect_wait(\%dbinfo);
>>
>> my $path = shift @ARGV;
>>
>> until ($need_exit) {
>>      process_file(\%dbinfo, $path);
>>      last if ($args{1} || $need_exit);
>>      sleep(10);
>> }
>>
>> $dbinfo{handle}->disconnect;
>>
>>
>> -
>> List info/subscribe/unsubscribe? See
>> http://www.freeradius.org/list/devel.html
>




More information about the Freeradius-Devel mailing list