2.2.0 & dhcp: regression

Phil Mayers p.mayers at imperial.ac.uk
Fri Jul 12 13:39:45 CEST 2013


On 12/07/13 11:55, Eugene Grosbein wrote:
> On 12.07.2013 17:38, Phil Mayers wrote:
>> On 12/07/13 11:17, Eugene Grosbein wrote:
>>>
>>> Please help. We need at least 1000 concurrent threads to deal with the load here.
>>
>> 1000 threads is a crazy number. Can you explain why you think you need
>> that many? Are you doing very slow logic/lookups or something?
>
> Our database is powerful enough to deal with so many requests.
> We may easily get that many requests and want to be able to process
> them in parallel without needless queueing.

With respect, this is pretty basic logic.

The figures of merit here are the offered load in requests/sec and
the average/max processing time per request.

If you have 1000 requests/sec and each request takes 1 millisecond to
process, a single thread is sufficient.

If you have 10,000 requests/sec and each request takes 10 milliseconds,
then you need 100 threads.

If your database is "so powerful" it shouldn't be taking too long, so
unless you have a truly enormous number of requests/sec, you don't need
1000 threads.
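
The arithmetic above is just rate multiplied by per-request time (Little's
law). A minimal sketch, assuming a steady arrival rate and a fixed processing
time; the function name threads_needed is made up for illustration and is not
anything in the server:

```python
import math

def threads_needed(requests_per_sec, millis_per_request):
    """Average number of requests in flight = arrival rate * service time.

    Rounded up, with a floor of one thread. Ignores bursts and lock
    contention, so treat the result as a lower bound, not a tuning value.
    """
    in_flight = requests_per_sec * millis_per_request / 1000
    return max(1, math.ceil(in_flight))

# 1000 requests/sec at 1 ms each, then 10,000 requests/sec at 10 ms each
print(threads_needed(1000, 1))
print(threads_needed(10_000, 10))
```

Turned around, 1000 busy threads only makes sense if rate multiplied by
per-request time really comes out near 1000.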

If you really do have that many requests/sec, you should probably look at
some form of load balancing, rather than having an enormous thread pool.
The thread pool performance will not scale linearly: various server
internal data structures are locked, and you will probably run into lock
contention at high thread counts.

I assert that 1000 (posix, shared-memory) threads is always the wrong 
answer to pretty much *any* problem ;o)

>
>> Anyway, the problem is almost certainly system ulimits. I don't know why
>> it's different under 2.2 to 2.1, but I would look into ulimit.
>
> ulimit for files is over 11000 here. And radiusd successfully opens more
> than 1024 files. It just breaks afterwards.

It could be the use of select() then. If rlm_perl opens FDs 5-1023 and
the radius server then needs to open some sockets, it will get FD #1024,
which is outside what select() can handle and might cause it to complain.
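
A rough way to see the select() ceiling for yourself, assuming a Unix-like
system where FD_SETSIZE is 1024 (the usual glibc value; that is an assumption
here, not something checked against this server). Python's select.select()
rejects any descriptor number at or above FD_SETSIZE with ValueError before it
ever calls the OS. The helper name fits_in_select is hypothetical:

```python
import select
import socket

def fits_in_select(fd):
    """Return False if select() refuses fd for being out of range."""
    try:
        select.select([fd], [], [], 0)  # zero timeout: poll and return
    except ValueError:
        # Raised for descriptor numbers >= FD_SETSIZE on Unix-like systems
        return False
    return True

a, b = socket.socketpair()             # two real, low-numbered descriptors
print(fits_in_select(a.fileno()))      # an ordinary open socket
print(fits_in_select(1 << 20))         # far above any usual FD_SETSIZE
a.close()
b.close()
```

This says nothing about what radiusd itself does with FD #1024; it only shows
that the limit belongs to select() and not to the files ulimit.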

Try running the server under "strace" and see if you can see where it 
goes wrong, and what it calls just before it does.

