s.cantos at neopost.com
Thu Nov 25 17:50:38 CET 2010
I'm using FreeRADIUS 1.x to authenticate users against an LDAP backend. I see lots of "unresponsive child" errors under request load. After reading through the mailing list archive, I understand that this is typical of one module being slow.
I first thought it was LDAP, but I've now verified that the LDAP server answers correctly. In fact the slowdown happens in the post-auth section, when rlm_ippool tries to allocate an IP. I created a /16 network pool (more than 65000 IPs) to make sure I won't run out of addresses when traffic gets heavy. This is far more than I need.
I've traced the process (truss on Solaris) and noticed that the thread spends a lot of time doing what looks like a walk over the whole database. In the rlm_ippool code I can see this comment:
* Walk through the database searching for an active=0 entry.
* We search twice. Once to see if we have an active entry with the same callerid
* so that MPPP can work ok and then once again to find a free entry.
Does this mean that each time we need to allocate a new IP, we have to walk through the complete database twice?
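To make the question concrete, here is a minimal Python sketch of the two-pass walk as that comment describes it. It is only a model under stated assumptions, not the rlm_ippool source: the in-memory list stands in for the on-disk database, and the names `Entry`, `make_pool` and `allocate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """One pool record (hypothetical layout, not the module's real one)."""
    ip: str
    active: bool = False
    callerid: str = ""


def make_pool(size: int) -> List[Entry]:
    """Build a pool of `size` free addresses under 10.0.0.0/16."""
    return [Entry(ip=f"10.0.{i // 256}.{i % 256}") for i in range(size)]


def allocate(pool: List[Entry], callerid: str) -> Tuple[Optional[Entry], int]:
    """Return (entry, number of entries visited) for one allocation."""
    visited = 0
    # Pass 1: look for an active entry with the same caller ID (the MPPP case).
    for entry in pool:
        visited += 1
        if entry.active and entry.callerid == callerid:
            return entry, visited
    # Pass 2: look for the first free (active=0) entry and claim it.
    for entry in pool:
        visited += 1
        if not entry.active:
            entry.active = True
            entry.callerid = callerid
            return entry, visited
    return None, visited  # pool exhausted


if __name__ == "__main__":
    pool = make_pool(65536)
    entry, visited = allocate(pool, "caller-1")
    print(entry.ip, visited)
```

In this model a caller with no existing session always pays for a full first pass over the pool, so the cost of one allocation grows with the pool size rather than with the number of addresses actually in use.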
For information, when I use rlm_ippool_tool to show the active entries in the DB, it takes around 12 seconds. I've also noticed that a mutex locks the DB each time we need to access it. I therefore conclude that each time I need to answer an auth request, I'm stuck for around 12 seconds.
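The effect of the mutex can be put into numbers with a short sketch. It assumes that allocations are fully serialized behind one lock and that one allocation costs about as much as the 12-second rlm_ippool_tool listing; both are assumptions, not measurements, and `completion_times` is a hypothetical helper.

```python
from typing import List


def completion_times(queued_requests: int, seconds_per_allocation: float) -> List[float]:
    """Time at which each of several simultaneous requests finishes
    when every allocation must wait for the previous one to release the lock."""
    return [(i + 1) * seconds_per_allocation for i in range(queued_requests)]


if __name__ == "__main__":
    # Three requests arriving together, each holding the lock for 12 seconds.
    print(completion_times(3, 12.0))
```

Under these assumptions the delay seen by a request grows with the number of requests queued ahead of it, which would be consistent with the "unresponsive child" errors appearing only under load.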
Am I missing something?
Do I need to reduce the number of IPs in the pool, or is there some other solution?