issue with reaped processes timing out in rad_waitpid

Alex Sharaz alex.sharaz at york.ac.uk
Fri Oct 31 09:54:29 CET 2014


Hi,

I found this thread:

https://www.marshut.net/kqyrtz/reaped-processes-still-timing-out-in-rad-waitpid.html


It concerns the error quoted below, which shows up in radius.log. We've just discovered that this is happening a lot here, and I wondered whether anyone has had a chance to look at the proposed patch, or has produced another one that fixes the problem.

Regards,
Alex

....I have hit upon a case where some ntlm_auth processes would return (and write the NT_KEY to the connecting pipe) but FR still complains that it failed and denies authentication (this is on 2.2.5).

This manifests in the logs like the following:

Tue Oct 28 11:10:15 2014 : Auth: Login incorrect (mschap: External script says NT_KEY: 4BCE6CA72058BA7EE500D1A68A8771C0): [tstRad9] (from client 155.98.204.47 port 0 cli 02-00-00-00-00-01 via TLS tunnel)

Since this is actually the output for a valid and successful authentication, it appears that the exit code is the real issue.
That exit code is either that of the process itself or 2 if rad_waitpid times out while waiting for the child.

After adding some debugging statements and recompiling, I found that there were cases where reap_children would reap a child process but the pid would not be found in thread_pool.waiters. This only happened when there were a significant number of auths per second, and even then not consistently. Some head scratching ensued, and a colleague then suggested there may be a race condition between rad_fork (where it calls fr_hash_table_insert) and reap_children (where it calls fr_hash_table_finddata).
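
The suspected interleaving is: fork() returns, the child exits and is reaped, and only then is its pid inserted into the table, so the lookup misses. A minimal sketch in C of one way to close that window is to hold a single lock across both the fork-and-insert step and the reap-and-lookup step. This is not the FreeRADIUS code and not necessarily the proposed patch: waiters, waiters_lock, spawn_registered and reap_one are hypothetical names, and a flat array stands in for the hash table.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WAITERS 64

/* Hypothetical stand-in for the table of pids that have a waiter. */
static pid_t waiters[MAX_WAITERS];
static int num_waiters = 0;
static pthread_mutex_t waiters_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Fork a child that exits at once with exit_code, and register its pid
 * before the lock is released. Returns the pid, or -1 on failure.
 */
pid_t spawn_registered(int exit_code)
{
    pid_t pid;

    pthread_mutex_lock(&waiters_lock);
    if (num_waiters == MAX_WAITERS) {
        pthread_mutex_unlock(&waiters_lock);
        return -1;
    }
    pid = fork();
    if (pid == 0) {
        _exit(exit_code);   /* child: real code would exec a helper here */
    }
    if (pid > 0) {
        waiters[num_waiters++] = pid;   /* insert while still locked */
    }
    pthread_mutex_unlock(&waiters_lock);
    return pid;
}

/*
 * Reap at most one exited child without blocking.
 * Returns 1 if the reaped pid was registered, 0 if it was unknown,
 * and -1 if no child could be reaped.
 */
int reap_one(pid_t *pid_out, int *status_out)
{
    int i, found = 0;
    pid_t pid;

    pthread_mutex_lock(&waiters_lock);
    pid = waitpid(-1, status_out, WNOHANG);
    if (pid <= 0) {
        pthread_mutex_unlock(&waiters_lock);
        return -1;
    }
    for (i = 0; i < num_waiters; i++) {
        if (waiters[i] == pid) {
            waiters[i] = waiters[--num_waiters];   /* remove the entry */
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&waiters_lock);
    *pid_out = pid;
    return found;
}
```

Because the reaper cannot take the lock until the insert has happened, a return value of 0 from reap_one should no longer occur for children started through spawn_registered. The cost is that fork() runs with the lock held, so the child must go straight to exec or _exit without touching the lock.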

.......





