FR suddenly doesn't respond any more and eats all cpu

Benedikt Panzer benedikt.panzer at rus.uni-stuttgart.de
Fri Aug 19 15:11:11 CEST 2005


Hi,

I really enjoy replying to myself ;-)

I found that the problem is not on the LDAP server side but really in FR 
(configuration?). It depends on the number of concurrent RADIUS requests: 
two clients querying FR at the same time cause no problems for me, but 
when three clients query it, FreeRADIUS hangs within 2 minutes. Every 
time. However, the error messages in the log file (like "All ldap 
connections are in use", see my last posting) did not show up again. 
More precisely, most of the time no error was shown at all. FR handles 
one request normally and then just hangs.
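
A minimal sketch of how the "two clients fine, three clients hang" 
observation could be reproduced in a repeatable way. It is an 
illustration under assumptions, not the test setup used above: the 
names run_once, client_loop and hammer are made up for this example, 
and the client command is a placeholder for whatever RADIUS test 
client is at hand.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd, timeout):
    """Run one client command; True only if it exits with status 0 in time."""
    try:
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        # The client got no answer in time: this is what a hung server looks like.
        return False
    return result.returncode == 0


def client_loop(cmd, timeout, deadline):
    """Repeat the command until the deadline; return (successes, failures)."""
    ok = failed = 0
    while time.monotonic() < deadline:
        if run_once(cmd, timeout):
            ok += 1
        else:
            failed += 1
    return ok, failed


def hammer(cmd, clients, duration, timeout=5.0):
    """Run `clients` independent loops in parallel for `duration` seconds."""
    deadline = time.monotonic() + duration
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(client_loop, cmd, timeout, deadline)
                   for _ in range(clients)]
        return [f.result() for f in futures]


# Hypothetical usage (the command list is a placeholder, not a real tool):
#   for n in (2, 3):
#       print(n, "clients:", hammer(["my-radius-client", "..."], n, 120))
```

Each tuple in the result is one client's (successes, failures) count, so 
a run where the failures start piling up only with three clients would 
match the behaviour described above.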

So I tried different combinations of the options max_requests, 
max_servers, max_requests_per_server, ldap_connections_number, the LDAP 
timeouts and so on (see below). Nothing changed. FR _always_ crashed 
(i.e. hung) after about 2 minutes when queried by 3 clients.

Then I enabled debugging mode again (-x) and noticed that FR doesn't 
crash any longer! I set all other options back to their default values 
and still FR doesn't crash (nor does it show any error message)! I also 
tested the switch -s, with the same result: the error doesn't occur. 
Back in normal mode (without -x or -s) FR crashes again; with either of 
the two switches it doesn't. Strange to me. Is this normal for you 
experts?

Have a nice weekend!
regards, Benedikt


The combinations of options I tested (all of them failed, i.e. FR 
crashed). The asterisks only highlight values and are not part of the 
configuration:

max_request_time = 30
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = 0
ldap {
    ldap_connections_number = 5
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = *100*
ldap {
    ldap_connections_number = *10*
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = no
max_requests = *4096*
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = *100*
ldap {
    ldap_connections_number = *10*
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = *yes*
max_requests = *4096*
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = *100*
ldap {
    ldap_connections_number = *10*
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = 0
ldap {
    ldap_connections_number = *10*
    timeout = *2*
    timelimit = *1*
    net_timeout = *2*
}
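
To see at a glance what actually differs between two of the listings 
above, something like the following could be used. It is only a 
sketch: parse_settings and changed are names invented for this 
example, the nested ldap { } section is flattened into the same 
namespace for simplicity, and the emphasis asterisks are stripped 
before comparing. FIRST and LAST are the first and last listings as 
posted.

```python
def parse_settings(text):
    """Collect `name = value` pairs, ignoring braces and emphasis asterisks."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skips "ldap {", "}" and blank lines
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip("*")
    return settings


def changed(base, other):
    """Return {name: (base_value, other_value)} for settings that differ."""
    return {name: (base.get(name), other.get(name))
            for name in sorted(set(base) | set(other))
            if base.get(name) != other.get(name)}


FIRST = """
max_request_time = 30
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = 0
ldap {
    ldap_connections_number = 5
    timeout = 4
    timelimit = 3
    net_timeout = 4
}
"""

LAST = """
max_request_time = 5
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = 0
ldap {
    ldap_connections_number = *10*
    timeout = *2*
    timelimit = *1*
    net_timeout = *2*
}
"""

for name, (old, new) in changed(parse_settings(FIRST), parse_settings(LAST)).items():
    print(f"{name}: {old} -> {new}")
```

For these two listings it reports ldap_connections_number, 
max_request_time, net_timeout, timelimit and timeout as changed; the 
thread pool settings (start_servers, max_servers, min_spare_servers, 
max_spare_servers) are identical in both.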


> Fri Aug 19 09:22:02 2005 : Error: rlm_ldap: All ldap connections are 
> in use
> Fri Aug 19 09:22:03 2005 : Error: rlm_ldap: ldap_search() failed: 
> Timed out while waiting for server to respond. Please increase the 
> timeout.
> Fri Aug 19 09:22:37 2005 : Error: rlm_ldap: 
> uid=ilebraun,ou=accounts,dc=SIAM bind to 
> lanldap1.rus.uni-stuttgart.de:389 failed: timeout
> Fri Aug 19 09:24:32 2005 : Info: The maximum number of threads (32) 
> are active,cannot spawn new thread to handle request
> Fri Aug 19 09:24:41 2005 : Error: WARNING: Unresponsive child (id 
> 1123056560) for request 47
>
>> I've configured a FreeRADIUS 1.0.4 here and I'm running it now to 
>> handle test requests. At first, everything looked OK. FR responded 
>> to all requests correctly. But suddenly it stopped responding to 
>> RADIUS requests, and I saw that it used one of my 2 CPUs completely. 
>> Before that it took between 1 and 2 percent of the CPU. FreeRADIUS 
>> could not even be killed by a normal kill; I needed kill -9 to 
>> terminate it. It's very strange to me that this happened after half 
>> an hour of normal behavior. Then I started FreeRADIUS in debugging 
>> mode (-X), but then the error didn't occur until I stopped it 1 day 
>> later. Just now I ran it again in non-debugging mode and again, 
>> after about half an hour, the same strange error: processor load 
>> about 99% and no responses to any requests.
>
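
The quoted log excerpt above is wrapped by the mailer, which makes it 
awkward to count entries. A rough sketch of how such a quoted excerpt 
could be unwrapped and tallied by severity follows. The helper names 
unwrap and tally are invented for this example, and the regular 
expression only assumes the timestamp layout visible in the excerpt, 
not any documented FreeRADIUS log format.

```python
import re
from collections import Counter

# "Fri Aug 19 09:22:02 2005 : Error: message", as seen in the excerpt above
ENTRY = re.compile(r"^\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4} : (\w+): (.*)$")

QUOTED_LOG = """\
> Fri Aug 19 09:22:02 2005 : Error: rlm_ldap: All ldap connections are
> in use
> Fri Aug 19 09:22:03 2005 : Error: rlm_ldap: ldap_search() failed:
> Timed out while waiting for server to respond. Please increase the
> timeout.
> Fri Aug 19 09:22:37 2005 : Error: rlm_ldap:
> uid=ilebraun,ou=accounts,dc=SIAM bind to
> lanldap1.rus.uni-stuttgart.de:389 failed: timeout
> Fri Aug 19 09:24:32 2005 : Info: The maximum number of threads (32)
> are active,cannot spawn new thread to handle request
> Fri Aug 19 09:24:41 2005 : Error: WARNING: Unresponsive child (id
> 1123056560) for request 47
"""


def unwrap(quoted_text):
    """Strip '>' quote markers and re-join entries wrapped by the mailer."""
    entries = []
    for raw in quoted_text.splitlines():
        line = raw.lstrip("> ").rstrip()
        if not line:
            continue
        if ENTRY.match(line) or not entries:
            entries.append(line)       # a new timestamped entry starts here
        else:
            entries[-1] += " " + line  # continuation of the previous entry
    return entries


def tally(entries):
    """Count entries per severity label (Error, Info, ...)."""
    return Counter(m.group(1) for m in map(ENTRY.match, entries) if m)


entries = unwrap(QUOTED_LOG)
for entry in entries:
    print(entry)
print(dict(tally(entries)))
```

Applied to a full log file instead of this short excerpt, the same 
tally could show whether the LDAP timeouts always precede the 
"maximum number of threads" message.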



