How to disable threads in 2.1.7
Alan DeKok
aland at deployingradius.com
Fri Oct 16 16:01:11 CEST 2009
Craig Campbell wrote:
> I was hoping to build a version that could fork children, but not spawn
> threads.
The server can "exec" child shell scripts. It *cannot* run multiple
RADIUS servers as child processes.
> There are known 'challenges' in using the fork command in multi threaded
> environments. (As opposed to a process that forks children for
> different processing branches.) A couple of years ago I had an
> extremely challenging time modifying an existing threaded application to
> additionally fork off children to perform certain other tasks.
The challenge is in ensuring that the right thread catches the right
child exit.
If you run the server with "radiusd -s", it won't spawn threads.
> The issue I am seeing of stranded/hung children looks similar (that is
> not to say I have caught the culprit... just suspicion at this point).
> The issue seems to happen only sometimes during bursts of increased
> load. (Same as my previous experience.)
It may be a race condition under heavy load. But I don't see why...
the thread that forks then waits for the child to exit, and grabs the
exit code. This should ensure that the child dies, rather than staying
as a zombie.
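
A minimal sketch of that fork-then-wait pattern, for illustration only. It
is not the FreeRADIUS source; the helper name run_and_reap() is
hypothetical. The thread that calls fork() also calls waitpid() on that
specific pid, so the child is reaped and its exit code is collected.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not FreeRADIUS code): run a program, wait for it,
 * and return its exit code, or -1 on error or abnormal termination. */
int run_and_reap(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed, no child to reap */

    if (pid == 0) {
        execv(path, argv);         /* only returns on failure */
        _exit(127);                /* exec failed: the child still dies */
    }

    /* Parent: wait for *this* child, retrying if a signal interrupts us. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Waiting on the specific pid, rather than on "any child", is what keeps one
thread from collecting another thread's child in this sketch.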
> If I were to GUESS, at this point I'd look for interrupts that result in
> children when mutex locks are in place and unintentionally inherited by
> the child process.
Except that the server doesn't fork... and continue running. It
forks, and immediately exec's the shell script. If the shell script
fails to be executed, the child *still* dies.
The child doesn't obtain *or* check mutexes in between the fork() and
exec(). It does almost *nothing*, as there are only about 100 lines of code
between the fork() and exec().
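
To illustrate why an inherited lock does not matter here, the following
sketch forks while a mutex is held and has the child go straight to exec.
It is an assumption-laden example, not the server's code: demo_lock and
exec_while_locked() are hypothetical names. The child's copy of the mutex
is still locked, but the child never touches it, and exec replaces the
whole address space.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock standing in for any mutex the parent may hold. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: fork while demo_lock is held, exec immediately in
 * the child, and return the child's exit code (or -1 on error). */
int exec_while_locked(const char *path, char *const argv[])
{
    pthread_mutex_lock(&demo_lock);    /* lock is held across fork() */
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: no mutex is obtained or checked before exec. */
        execv(path, argv);
        _exit(127);                    /* exec failed: die anyway */
    }

    pthread_mutex_unlock(&demo_lock);  /* parent releases its own copy */
    if (pid < 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```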
> (My solution was to acquire ALL locks before a fork,
> then have the child and parent clear them all after) - see man
> pthread_atfork section: RATIONALE if you have access to a Linux system).
That is for long-running children. We don't do that.
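
For comparison, the approach Craig describes can be sketched with
pthread_atfork(). This is a hedged illustration of the technique for a
child that keeps running after fork(), not something the server does;
state_lock and the function names are hypothetical. The prepare handler
takes the lock before the fork, and both parent and child release it
afterwards, so the child starts with the lock in a known state.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock protecting state a long-running child would use. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

static void before_fork(void)       { pthread_mutex_lock(&state_lock); }
static void after_fork_parent(void) { pthread_mutex_unlock(&state_lock); }
static void after_fork_child(void)  { pthread_mutex_unlock(&state_lock); }

/* Register the handlers once, early in program start-up. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Fork a child that does not exec; returns 0 if the child was able to
 * take state_lock, 1 if it was not, -1 on error. */
int child_can_take_lock(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* A long-running child would carry on here using state_lock. */
        int ok = (pthread_mutex_trylock(&state_lock) == 0);
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a fork-then-exec child, none of this is needed, because the child
never uses the inherited locks.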
> I cannot explain why apparently no one else is seeing the issue I am
> chasing. As far as I can tell, my configuration is quite basic.
Kernel bugs? Possible race conditions in the code?
> I am now trying a run with the -s option but, if successful, it won't
> tell us much about why.
If it works...
Alan DeKok.