Detail file handling
Alan DeKok
aland at deployingradius.com
Sun May 6 08:54:06 CEST 2007
Peter Nixon wrote:
> I think you may have misunderstood the problem and my solution slightly. The
> CPU on the RADIUS servers NEVER gets pegged. It's the system load on the
> backend DB server which goes high, slowing everything down. In fact, it's
> not even the CPU that is being pegged, but rather the hard disk(s).
Ah, OK.
> One possible optimisation I could make is to have my duplicate_session_killer
> use data from the sqlippool tables (which are fixed in length and therefore
> should be faster to update indexes) rather than radacct. I need to benchmark
> this though, and haven't seen the need yet to break a working system.
I like Kostas' idea
http://kkalev.wordpress.com/2007/03/25/radius-server-performance-tips/
Always believe the accounting DB for duplicate logins, but write
duplicate logins to a separate DB, too. Have another process read the
second DB, check the NAS, and generate "stop" records if the user isn't
really online.
Or maybe that's a separate problem from what you have.
> My current config uses 2 separate sql modules, one for auth queries and one
> for acct. Both hit the same database. This ensures that auth can never
> overwhelm acct and vice versa. (Maybe some of your new changes will make
> this redundant, but it certainly improved things when I did it 6 months ago)
That's the hope. We'll have to see if it works in practice. I
suspect that the prioritization of auth versus acct matters only in
high-load situations. If you're only getting 10-20 packets per second,
there's always a gap between packets, and the auth/acct prioritization
doesn't matter.
> However, when you have a single thread poking accounting data into the
> system at full speed, things start to break down. A single thread can push
> data into the radacct table at > 1000 records per second, which pegs the
> hard disks (and works the CPU pretty well also), making the DB spend all
> its time updating indexes (as well as writing to disk, of course).
The goal, then, is to write to the DB in such a way that there's
bandwidth left over for authentication packets. That's hard for the
RADIUS server to know.
> So, again.. Having the detail reader not inject packets back into the
> system unless there are > X threads free is a good idea, but I still think
> that reader needs a configurable delay in between each packet that it
> injects...
Probably, yes.
> It's not the lack of threads that's the issue (although that IS an issue),
> it's the amount of load one dedicated thread can create...
All right. I'd like to rate-limit the detail packets, but also keep
it adaptable to changing DB/hardware conditions. We can perhaps fix this by
having only one detail packet "in-flight" at a time, and tracking the
round-trip time (RTT) to the DB. We send the "next" detail packet only
after 2*RTT has elapsed, ensuring that the load on the DB due to detail
file handling is always less than 50%.
That should be relatively easy to implement. The configuration item
would be "load_factor", expressed as a percentage (1..100). The server
then calculates: delay = (RTT * (100 - load_factor)) / load_factor
That's adaptive, and should be easy for administrators to understand
and configure.
Alan DeKok.
--
http://deployingradius.com - The web site of the book
http://deployingradius.com/blog/ - The blog
More information about the Freeradius-Devel
mailing list