rlm_exec triggered hang or local weirdness?
Joe Maimon
jmaimon at ttec.com
Fri Oct 28 05:25:48 CEST 2005
(long)
I have been running radiusd cvs 9/29 + local mods for some time. (mostly
under gdb with -fxx), to see if I can upgrade all systems to this or
something later (plan to recheckpoint as soon as 9/29 is stable for me).
As illustrated by previous unrelated posts, this server forks a lot (for
every accounting which is confed for 5 minute updates per logged in user
on the nas) to implement a ddns scheme.
One issue that appears to be cropping up is where debug output and the
server appears to freeze up.
=====================================================
thread apply all bt shows this :
Thread 11 (Thread -1290798160 (LWP 7382)):
#0 0xb7ca438f in fork () from /lib/tls/libc.so.6
#1 0xb7f73064 in fork () from /lib/tls/libpthread.so.0
#2 0x08058a03 in rad_fork (exec_wait=0) at threads.c:1114
#3 0xb7fa8205 in radius_exec_program () from
/usr/lib/freeradius/rlm_exec-1.1.0-pre0.so
#4 0xb7fa8d8c in radius_exec_program () from
/usr/lib/freeradius/rlm_exec-1.1.0-pre0.so
#5 0x08054d26 in modcall (component=3, c=0x81b1188, request=0xb51033c8)
at modcall.c:254
#6 0x080552d9 in call_one (component=Variable "component" is not available.
) at modcall.c:287
#7 0x08054eb5 in modcall (component=3, c=0x81afa08, request=0xb51033c8)
at modcall.c:342
#8 0x08053f5f in indexed_modcall (comp=3, idx=Variable "idx" is not
available.
) at modules.c:428
#9 0x0804f009 in rad_accounting (request=0xb51033c8) at acct.c:65
#10 0x08060f32 in rad_respond (request=0xb51033c8, fun=0x804efa0
<rad_accounting>) at request_process.c:456
#11 0x08057dba in request_handler_thread (arg=0x81b5828) at threads.c:585
#12 0xb7f6cccd in start_thread () from /lib/tls/libpthread.so.0
#13 0xb7ce3b0e in clone () from /lib/tls/libc.so.6
<the rest of the threads are at sem_wait>
Thread 1 (Thread -1212102976 (LWP 27527)):
#0 0xb7cf0d69 in pthread_setcanceltype () from /lib/tls/libc.so.6
#1 0xb7c7ac05 in malloc () from /lib/tls/libc.so.6
#2 0xb7f5a744 in rad_recv (fd=8) at radius.c:1063
#3 0x0805ec18 in proxy_socket_recv (listener=0xfffffffc,
pfun=0xfffffffc, prequest=0xfffffffc) at listen.c:802
#4 0x0804d3c1 in main (argc=2, argv=0xbfabde44) at radiusd.c:665
============================================
This tentatively appears to occur with -fxx NOT under gdb, normal is not
currently known.
cvs 3/16 does not have this issue when running normally.
The only change I could detect as being obviously relevant is an added
call to reap_children() in rad_fork()
Now reap_children implements some magic voodoo tail head queue scheme
that I havent managed to wrap my head around yet.
debug output shows that radutmp (which is before the ddns exec module)
is the last module to returns normally before.
So I have added copious debug output. And I am having trouble
understanding what I am seeing (nothing new there).
In reap_children I have added some calls like this:
==================================
---
/home/joe/download/radius/cvs/cvs-20050929/radiusd/src/main/threads.c
2005-09-14 21:21:14.000000000 -0400
+++ src/main/threads.c 2005-10-27 22:45:28.000000000 -0400
@@ -246,8 +248,10 @@
{
if (thread_pool.wait_head != thread_pool.wait_tail) {
int num;
+ int rval;
pthread_mutex_lock(&thread_pool.wait_mutex);
+ FUNC_LINE_FILE(" mutex locked");
for (num = ((thread_pool.wait_tail + MAX_WAITERS) -
thread_pool.wait_head) % MAX_WAITERS;
num != 0;
num--) {
@@ -259,18 +263,35 @@
/*
* Child is still alive: move it to the tail.
*/
- if (waitpid(pid, NULL, WNOHANG) == 0) {
+ FUNC_LINE_FILE(" wait for %d", pid);
+ if (waitpid(pid, &rval, WNOHANG) == 0) {
if (((thread_pool.wait_tail + 1) %
MAX_WAITERS)
== thread_pool.wait_head) {
rad_assert(0 == 1);
}
+ FUNC_LINE_FILE(" wait for %d returned val\n"
+ " WIFEXITED(%d)"
+ " WEXITSTATUS(%d)"
+ " WIFSIGNALED(%d)"
+ " WTERMSIG(%d)"
+ " WIFSTOPPED(%d)"
+ " WSTOPSIG(%d)"
+ , pid, WIFEXITED(rval),
+ WIFEXITED(rval) ?
WEXITSTATUS(rval) : 0,
+ WIFSIGNALED(rval),
+ WIFSIGNALED(rval) ?
WTERMSIG(rval) : 0,
+ WIFSTOPPED(rval),
+ WIFSTOPPED(rval) ?
WSTOPSIG(rval) : 0
+ );
thread_pool.wait[thread_pool.wait_tail]
= pid;
thread_pool.wait_tail++;
thread_pool.wait_tail %= MAX_WAITERS;
} /* else no child, or was already reaped */
+ else FUNC_LINE_FILE(" wait for %d returned
noval", pid);
}
pthread_mutex_unlock(&thread_pool.wait_mutex);
+ FUNC_LINE_FILE(" mutex unlocked");
}
}
===============================================================
A sample of the output looks like this:
================================================================
reap_children() threads.c:254 mutex locked
reap_children() threads.c:266 wait for 9240
reap_children() threads.c:285 wait for 9240 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:266 wait for 9241
reap_children() threads.c:285 wait for 9241 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:266 wait for 14565
reap_children() threads.c:285 wait for 14565 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:266 wait for 14566
reap_children() threads.c:285 wait for 14566 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:266 wait for 17694
reap_children() threads.c:285 wait for 17694 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:266 wait for 19290
reap_children() threads.c:285 wait for 19290 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:266 wait for 19291
reap_children() threads.c:285 wait for 19291 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:266 wait for 20399
reap_children() threads.c:285 wait for 20399 returned val
WIFEXITED(0) WEXITSTATUS(0) WIFSIGNALED(1) WTERMSIG(84) WIFSTOPPED(0)
WSTOPSIG(0)
reap_children() threads.c:294 mutex unlocked
========================================================================
Question 1:
Signal 84?
Question 2:
Appears that reap_children waits for thread pids, in addition to forked
children? (see 9241 and 9240 and 20399..... below)
======================================================
freerad 6760 32641 6760 0 7 22:45 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 6760 32641 6791 0 7 22:45 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 6760 32641 6792 0 7 22:45 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 6760 32641 6793 0 7 22:45 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 6760 32641 6794 0 7 22:45 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 6760 32641 6795 0 7 22:45 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 6760 32641 19287 0 7 22:58 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 9240 6760 9240 0 1 22:48 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 9241 6760 9241 0 1 22:48 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 14565 6760 14565 0 1 22:53 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 14566 6760 14566 0 1 22:53 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 17694 6760 17694 0 1 22:56 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 19290 6760 19290 0 1 22:58 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 19291 6760 19291 0 1 22:58 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 20399 6760 20399 0 1 22:59 pts/15 00:00:00
src/main/.libs/radiusd -fxx
freerad 22467 6760 22467 0 1 23:01 pts/15 00:00:00
src/main/.libs/radiusd -fxx
=============================================================
How this has anything todo with my possible issue, I am not quite certain.
misc:
FUNC_LINE_FILE() is a local debug modification looks like this:
#define FUNC_LINE_FILE(format, ...) do {\
printf("%s() %s:%d" format
"\n",__func__,__FILE__,__LINE__, ## __VA_ARGS__);\
} while (0);
(yes its missing a mutex lock, but since I put it into libradius.h I
couldnt easily use DEBUG2)
If anyone has any tips I would appreciate it.
Joe
More information about the Freeradius-Devel
mailing list