Zombie accounting processes
Phil Mayers
p.mayers at imperial.ac.uk
Mon Feb 13 21:08:22 CET 2006
George-Cristian Bîrzan wrote:
> On Mon, 2006-02-13 at 13:39 -0500, Alan DeKok wrote:
>>> waitpid(22058, NULL, WNOHANG) = -1 ECHILD (No child processes)
>>> 22058 exists, and is another process that hung.
>> That's definitely a bug in your OS somewhere. If 22058 exists, and
>> is "defunct", then the parent process calling "waitpid" on it should
>> *never* get ECHILD.
>
> It's not the parent, from what I can tell. At which point it should do
> that, no?
>
No, it should not do that, and that's why you are having problems.

POSIX MANDATES that any thread in a thread group can reap any child
process, not just the thread that created the process. The Linux manpage
for waitpid says this has worked since the 2.4 kernel, but clearly that
is not what you are seeing. Possibly the problem lies in the pthread
library (LinuxThreads) you're using, which is bundled with glibc.
There's also the morass of SIGCHLD handling...
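
To check whether a given box honours this, a minimal sketch along these
lines might help. One thread forks a child, and a different thread calls
waitpid() on it. The names (cross_thread_reap, struct reap_info) and the
exit code 7 are made up for illustration; this is not FreeRADIUS code.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations explicitly */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping shared between the two threads. */
struct reap_info {
    pid_t pid;     /* child forked by thread A */
    pid_t ret;     /* what waitpid() returned in thread B */
    int   status;  /* raw wait status filled in by waitpid() */
    int   err;     /* errno seen in thread B (errno is per-thread) */
};

/* Thread A: fork a child that exits straight away with code 7. */
static void *spawn_child(void *arg)
{
    struct reap_info *r = arg;
    r->pid = fork();
    if (r->pid == 0)
        _exit(7);  /* in the child: leave without touching stdio */
    return NULL;
}

/* Thread B: reap the child that thread A created. */
static void *reap_child(void *arg)
{
    struct reap_info *r = arg;
    r->ret = waitpid(r->pid, &r->status, 0);
    r->err = (r->ret == -1) ? errno : 0;
    return NULL;
}

/* Returns the child's exit code, or -1 if anything went wrong. */
int cross_thread_reap(void)
{
    pthread_t a, b;
    struct reap_info r = { .pid = -1, .ret = -1, .status = 0, .err = 0 };

    if (pthread_create(&a, NULL, spawn_child, &r) != 0)
        return -1;
    pthread_join(a, NULL);  /* the forking thread is gone from here on */
    if (r.pid < 0)
        return -1;          /* fork() itself failed */

    if (pthread_create(&b, NULL, reap_child, &r) != 0)
        return -1;
    pthread_join(b, NULL);

    if (r.ret != r.pid || !WIFEXITED(r.status)) {
        fprintf(stderr, "waitpid(%ld): %s\n", (long)r.pid, strerror(r.err));
        return -1;
    }
    return WEXITSTATUS(r.status);
}
```

Call cross_thread_reap() from main() and build with -pthread. On a system
that behaves as POSIX requires it should return 7; on a setup with the
problem described here I would expect the waitpid() to fail with ECHILD,
as in the strace output quoted above.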
I would suggest trying a system with proper NPTL threading (2.6 kernel,
recent libc compiled appropriately) which will almost certainly work.
Upgrading your main system to that may be a little more... involved.
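
If you are not sure which thread library a box is actually using, glibc
can report it through confstr(). A rough sketch, assuming a glibc that
defines _CS_GNU_LIBPTHREAD_VERSION (a GNU extension, not POSIX); the
helper name pthread_impl_name is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations explicitly */
#include <stddef.h>
#include <unistd.h>

/* Copy the thread library's version string into buf.
 * Returns 0 on success, -1 if the C library does not report one
 * or if buf is too small to hold the whole string. */
int pthread_impl_name(char *buf, size_t len)
{
    if (len == 0)
        return -1;
    buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* confstr() returns the size needed including the trailing NUL,
     * or 0 if the name is not supported. */
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    if (n <= 1 || n > len) {
        buf[0] = '\0';
        return -1;
    }
    return 0;
#else
    (void)buf;
    return -1;  /* not glibc, or too old to know this name */
#endif
}
```

On success the string names the implementation and its version, so it
should start with "NPTL" on an NPTL system and mention linuxthreads
otherwise. Running "getconf GNU_LIBPTHREAD_VERSION" from the shell
reports the same string.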