FR suddenly doesn't respond any more and eats all cpu
Benedikt Panzer
benedikt.panzer at rus.uni-stuttgart.de
Fri Aug 19 15:11:11 CEST 2005
Hi,
I really enjoy replying to myself ;-)
I found that the problem is not on the LDAP server side but really in FR
(configuration?). It depends on the number of concurrent RADIUS requests:
two clients querying FR at the same time cause no problems for me, but
when three clients query it, FreeRADIUS hangs within 2 minutes. Every
time. However, the error messages in the log file (like "All ldap
connections are in use", see my last posting) were not shown again.
More precisely, most of the time no error was shown at all. FR handles one
request normally and then just hangs.
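
For anyone who wants to reproduce this load pattern, here is a minimal Python sketch of N clients sending Access-Requests in parallel and counting replies and timeouts. It is only an illustration, not necessarily how the clients above were driven. The packet layout and password hiding follow RFC 2865; the function names, the user, the password and the shared secret are hypothetical placeholders, and replies are counted without being validated.

```python
import hashlib
import os
import socket
import struct
import threading
import time


def encrypt_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Hide a User-Password value as described in RFC 2865, section 5.2."""
    # Pad with NULs to a non-zero multiple of 16 octets.
    size = max(16, (len(password) + 15) // 16 * 16)
    padded = password.ljust(size, b"\x00")
    out, prev = b"", authenticator
    for i in range(0, size, 16):
        digest = hashlib.md5(secret + prev).digest()
        block = bytes(p ^ d for p, d in zip(padded[i:i + 16], digest))
        out += block
        prev = block  # the next block is chained on this ciphertext block
    return out


def build_access_request(user, password, secret, ident, authenticator=None):
    """Build a PAP Access-Request (code 1) with User-Name and User-Password."""
    authenticator = authenticator or os.urandom(16)
    hidden = encrypt_password(password, secret, authenticator)
    attrs = bytes([1, 2 + len(user)]) + user        # User-Name (type 1)
    attrs += bytes([2, 2 + len(hidden)]) + hidden   # User-Password (type 2)
    header = struct.pack("!BBH", 1, ident & 0xFF, 20 + len(attrs))
    return header + authenticator + attrs


def client_loop(addr, user, password, secret, duration, timeout, stats, lock):
    """One client: send a request, wait for any reply, repeat until the deadline."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    deadline = time.monotonic() + duration
    ident = 0
    try:
        while time.monotonic() < deadline:
            try:
                sock.sendto(build_access_request(user, password, secret, ident), addr)
                sock.recvfrom(4096)  # reply is counted, not validated
                key = "replies"
            except OSError:          # includes socket timeouts
                key = "timeouts"
            with lock:
                stats[key] += 1
            ident += 1
    finally:
        sock.close()


def run_clients(addr, n_clients=3, duration=120.0, timeout=3.0,
                user=b"testuser", password=b"testpass", secret=b"testing123"):
    """Run n_clients in parallel threads and return the reply/timeout counts."""
    stats = {"replies": 0, "timeouts": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=client_loop,
                         args=(addr, user, password, secret,
                               duration, timeout, stats, lock))
        for _ in range(n_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats
```

Calling run_clients(("127.0.0.1", 1812), n_clients=3) against a test server should show the timeout count climbing once the server stops answering; with n_clients=2 the same run gives a baseline for comparison.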
So I tried different combinations of the options max_requests,
max_servers, max_requests_per_server, ldap_connections_number, the LDAP
timeouts and so on (see below). Nothing changed. FR _always_ crashed
after about 2 minutes when queried by 3 clients.
Then I enabled debugging mode again (-x) and noticed that FR
doesn't crash any longer! I set all other options back to their default
values and still FR doesn't crash (and it doesn't show any error
message either). I also tested the switch -s, with the same result: the
error doesn't occur. Back in normal mode (without -x or -s) FR crashes
again; with either of the two switches it doesn't. Strange to me. Is this
normal for you experts?
Have a nice weekend!
regards, Benedikt
The combinations of options I tested (all of them failed, meaning
FR crashed):

max_request_time = 30
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = 0
ldap {
    ldap_connections_number = 5
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = *100*
ldap {
    ldap_connections_number = *10*
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = no
max_requests = *4096*
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = *100*
ldap {
    ldap_connections_number = *10*
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = *yes*
max_requests = *4096*
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = *100*
ldap {
    ldap_connections_number = *10*
    timeout = 4
    timelimit = 3
    net_timeout = 4
}

max_request_time = 5
delete_blocked_requests = no
max_requests = 1024
start_servers = 5
max_servers = 32
min_spare_servers = 3
max_spare_servers = 20
max_requests_per_server = 0
ldap {
    ldap_connections_number = *10*
    timeout = *2*
    timelimit = *1*
    net_timeout = *2*
}

> Fri Aug 19 09:22:02 2005 : Error: rlm_ldap: All ldap connections are
> in use
> Fri Aug 19 09:22:03 2005 : Error: rlm_ldap: ldap_search() failed:
> Timed out while waiting for server to respond. Please increase the
> timeout.
> Fri Aug 19 09:22:37 2005 : Error: rlm_ldap:
> uid=ilebraun,ou=accounts,dc=SIAM bind to
> lanldap1.rus.uni-stuttgart.de:389 failed: timeout
> Fri Aug 19 09:24:32 2005 : Info: The maximum number of threads (32)
> are active,cannot spawn new thread to handle request
> Fri Aug 19 09:24:41 2005 : Error: WARNING: Unresponsive child (id
> 1123056560) for request 47
>
>> I've configured a FreeRADIUS 1.0.4 here and I'm running it now to
>> handle test requests. At first, everything looked ok. FR responded to
>> all requests correctly. But suddenly it didn't respond any more to
>> RADIUS requests and I saw it used 1 of my 2 cpus completely. Before
>> that it took between 1-2 percent of the cpu. FreeRADIUS could not even
>> be killed by a normal kill, I needed kill -9 to terminate it. It's very
>> strange to me that this happened after half an hour of normal behavior.
>> Then I started FreeRADIUS in debugging mode (-X) but then the error
>> didn't occur until I stopped it 1 day later. Just now I ran it again in
>> non-debugging mode and again after about half an hour the same
>> strange error: processor load about 99% and no responses to any
>> requests.
>