Freeradius 3.0.4 Accounting [sql-relay problem]

Ibrahim Almahfooz ibrahim.nezar at gorannet.net
Mon Apr 4 20:27:47 CEST 2016


Hi Everyone,

We are currently running FreeRADIUS 3.0.4 as an accounting server installed
from the repository on CentOS 7 server.

We have almost 25k adsl,vdsl,gpon users in the network. The BRAS [NAS] is
sending the accounting data to the FreeRADIUS.

Instead of writing accounting data directly to the MySQL db, we configured
it to write the sql statements to the sql-relay file and set a crontab to
dump it to the remote db every 15 minutes using *radsqlrelay* command:

radsqlrelay -x -h x.x.x.x -d mysql -b mysqldb -u root -pdbpassword -1
/var/log/radius/radacct/sql-relay

While the above command run by the crontab, we see sql-relay.work file
writing the background and we assume this is normal in order to write the
incoming accounting data while doing the sql-relay transfer to the remote
db.

After a couple of days of setting everything, our [BRAS] NAS device
has reported accounting errors in its statistics like accounting start not
successfully sent or accounting stop not successfully sent and statistics
shows the errors are happening randomly every one, two or six hours.

We start doing investigations and found that in the radius.log there are
many errors happening in the same time the radsqlrelay running by the
crontab, the error is below:

Fri Apr  1 03:59:23 2016 : Info: rlm_sql (sql): 0 of 3 connections in use.
Need more spares

Fri Apr  1 03:59:23 2016 : Info: rlm_sql (sql): Opening additional
connection (73115)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 4 of 4 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73116)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 5 of 5 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73117)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 6 of 6 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73118)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 7 of 7 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73119)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 7 of 7 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73120)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 9 of 9 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73121)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 10 of 10 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73122)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 11 of 11 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73123)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 12 of 12 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73124)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 13 of 13 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73125)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 14 of 14 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73126)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 15 of 15 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73127)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 16 of 16 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73128)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 17 of 17 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73129)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 18 of 18 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73130)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 19 of 19 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73131)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 20 of 20 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73132)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 21 of 21 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73133)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 22 of 22 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73134)

Fri Apr  1 04:00:02 2016 : Warning: rlm_sql (sql): 23 of 23 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:02 2016 : Info: rlm_sql (sql): Opening additional
connection (73135)

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 24 of 24 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 24 of 24 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73136)

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73137)

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 26 of 26 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73138)

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 27 of 27 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73139)

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 28 of 28 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73140)

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 29 of 29 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73141)

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 30 of 30 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73142)

Fri Apr  1 04:00:03 2016 : Warning: rlm_sql (sql): 31 of 31 connections in
use.  You probably need to increase "spare"

Fri Apr  1 04:00:03 2016 : Info: rlm_sql (sql): Opening additional
connection (73143)

Fri Apr  1 04:00:09 2016 : Error: 1 requests have been waiting in the
processing queue for 6 seconds.  Check that all databases are running
properly!

>From the radius.log we see this error start happening exactly while the
crontab of dumping the sql-relay happens but not on every crontab job! We
are not running any db in the same FreeRADIUS since we are writing to the
file. Obviously, the FR server is considering sql-relay as database and
report the error "Check that all databases are running properly!"

Seems like the radius stops responding (not always) when dumping the
sql-relay file to the mysql db.

Also we sometimes see error of more than 50 requests have been waiting in
the processing… in the radius.log. And that actually what triggers the NAS
device to consider the freeRadius as down status. Surprisingly, our NAS
start again sending accounting packet and the server goes normally. and
this error keep showing on both radius.log and the NAS from now and then.
And this probably means "Loss of accounting data"

Actually we have no idea why this behaviour is occurring. We start thinking
if there is any size or performance limitation in the sql-relay.work file?
Or may be a bug in this version and we are unaware of the fix if exists?

Any ideas?

Best,

Ibrahim
System Administrator


More information about the Freeradius-Users mailing list