rihad at mail.ru
Mon Mar 1 14:10:37 CET 2010
We have FreeRADIUS 2.1.3 servicing four Cisco NASes, which in turn
service hundreds of PPPoE clients. rlm_perl with a custom-written script
is used for authorization/accounting, performing at about 10 auth
requests/sec on a Dell PowerEdge 2950 box. At times, when a NAS is
rebooted, triggering reauthentication of hundreds of PPPoE users, the
server log is swamped with many lines of this kind:
Error: Received conflicting packet from client 10.10.70.3 port 1645 -
ID: 86 due to unfinished request 273963. Giving up on old request.
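For anyone unfamiliar with the message: as far as I understand it, the server tracks in-flight requests by client address, source port and the one-byte RADIUS ID. A packet that reuses that key with different contents while the old request is still running is treated as a conflict. Below is a simplified Python model of that bookkeeping. It is my own sketch, not FreeRADIUS code, and the names `in_flight`, `receive` and `finish` are made up:

```python
# Simplified model of conflict detection; NOT FreeRADIUS source code.
# Key: (client IP, source port, RADIUS ID). Value: authenticator of the
# request that is still being processed.
in_flight = {}

def receive(client_ip, port, radius_id, authenticator):
    """Classify an incoming Access-Request against the in-flight table."""
    key = (client_ip, port, radius_id)
    old = in_flight.get(key)
    if old is None:
        in_flight[key] = authenticator
        return "new"
    if old == authenticator:
        return "duplicate"      # plain retransmit of the same request
    # Same key, different contents: the NAS has moved on and reused the ID.
    in_flight[key] = authenticator
    return "conflict"           # the old request is abandoned

def finish(client_ip, port, radius_id):
    """Called once a reply has been sent; frees the key."""
    in_flight.pop((client_ip, port, radius_id), None)

print(receive("10.10.70.3", 1645, 86, b"A" * 16))
print(receive("10.10.70.3", 1645, 86, b"A" * 16))
print(receive("10.10.70.3", 1645, 86, b"B" * 16))
```

Because the ID is a single byte, a NAS has only 256 IDs per source port, so a slow server plus a large burst makes ID reuse against unfinished requests quite likely.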
From that point on the server makes no progress: it only gives up on
old requests and burns CPU time. The only way out is to turn off RADIUS
authorization on the Ciscos, which stops the flood of duplicate packets,
and then turn it back on a part at a time until everything recovers. I
don't think the NAS is to blame; it's the hundreds of ADSL modems
retrying authentication every few seconds.
Some current settings:
max_request_time = 6
cleanup_delay = 10
max_requests = 1024
max_servers = 5 #threads are used
I've tried all sorts of combinations for the above, with max_requests as
low as 50, to no avail.
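A rough back-of-the-envelope calculation of why a burst hurts, using the figures above. The burst size of 500 and the 5-second NAS retransmit timeout are illustrative guesses on my part, not measured values, and the model naively assumes the whole burst arrives at once:

```python
# Back-of-the-envelope figures; burst size and NAS timeout are ASSUMPTIONS.
service_rate = 10        # auth requests/sec the server sustains (from above)
max_request_time = 6     # seconds, from the current settings
burst = 500              # ASSUMED number of users reauthenticating at once
nas_timeout = 5          # ASSUMED seconds before the NAS retransmits

# Seconds needed to answer every request in the burst exactly once.
drain_time = burst / service_rate
# Requests that can finish within max_request_time of arriving.
answered_in_time = service_rate * max_request_time
# Requests answered before the NAS sends its first retransmit.
answered_before_retry = service_rate * nas_timeout

print(f"time to drain burst: {drain_time:.0f} s")
print(f"finishable within max_request_time: {answered_in_time}")
print(f"answered before first NAS retransmit: {answered_before_retry}")
```

Under these assumptions only a small fraction of the burst is answered before the rest is retransmitted, and every retransmit adds to the backlog.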
Can FreeRADIUS be configured to reply with a (temporary) reject to an
auth request when max_requests is reached, instead of just ignoring the
request? I think that would allow the server to make steady progress.
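To make clear what behaviour I am asking about, here is a toy sketch. It is purely hypothetical: `admit`, `reject_when_full` and `MAX_REQUESTS` are names I made up, and I don't know whether FreeRADIUS has an equivalent option.

```python
# Toy illustration of the requested behaviour; hypothetical, not a
# FreeRADIUS feature or API.
MAX_REQUESTS = 1024      # mirrors max_requests from the settings above

def admit(queue_len, reject_when_full):
    """Decide what happens to a new auth request given the queue length."""
    if queue_len < MAX_REQUESTS:
        return "queue"           # normal processing
    if reject_when_full:
        return "reject"          # immediate reject: the client gets an answer
    return "drop"                # silently ignored: the NAS will retransmit

print(admit(10, reject_when_full=True))
print(admit(1024, reject_when_full=True))
print(admit(1024, reject_when_full=False))
```

The point of the "reject" branch is that the NAS receives a definite answer and stops retransmitting that request, rather than piling more copies onto an already full server.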
What else should I do in terms of RADIUS configuration? Please do not
suggest that I fix the code to make it faster; I believe this is more of
a misconfiguration issue (RADIUS or Cisco). The server should be making
some progress no matter how slowly it runs, unlike what we see in times
of trouble.
Thanks for any tips.