Unexpected "Exiting normally" 2.1.8?

Wed Nov 4 21:47:21 CET 2009

Craig Campbell <craig at ccraft.ca> wrote:
> 
>  Thanks for the update - I was concluding I'd have to wait for the release 
> of 2.1.8 to pursue this.  I am currently in a situation where I can help 
> debug 2.1.8, since the 'new' systems aren't yet in production.
> 
Well I can see no reason not to run FreeRADIUS in a debugger all the 
time, even when in production.  However my nickname is "Rambo Clouter" 
so maybe you do not want to follow my advice. :)

When you compile FreeRADIUS you simply make sure you leave 
debugging symbols in and turn off compiler optimisations (so your CFLAGS 
should be '-O0 -g').  You can probably do this by running configure as 
follows:
----
CFLAGS='-O0 -g' ./configure --all-your-usual-options-that-you-want
----

> Looking at your debug output (and I am in no way an expert at that) it seems 
> as though the process received a signal?
>
Well FreeRADIUS is sending it to herself according to gdb:
---- src/main/radiusd.c line 419 ----
        /*
         *      Send a TERM signal to all
         *      associated processes
         *      (including us, which gets
         *      ignored.)
         */
#ifndef __MINGW32__
        if (spawn_flag) kill(-radius_pid, SIGTERM);
#endif  
----

For whatever reason, it is not getting ignored.  At first I thought it 
was because I run my FreeRADIUS (even in production) in gdb, but as you 
do not I am wondering what is actually going on.

To run it in the debugger just run 'gdb freeradius' and you will get the 
gdb prompt.  There you want to type 'run -f' and wait for it to puke.  
When it does you could type 'where' for it to tell you what happened, 
but we already know what is happening; we want to find which patch is 
doing it :)  Oh, familiarise yourself with screen[2] if you do not know 
it already.  You should run the debugger in a screen'd session so you 
can return to it later without having to remain logged in.

> I am running a 'custom' module (event.c as I recall) from Alan that resolves 
> an issue with hung children (very exciting!), and I followed Alan's 
> instructions to get to this point.  I would really like to try to 'give 
> back' if I can and assist in identifying the cause of the program exiting 
> (assuming it is a new and as of yet unidentified bug).
> 
> Would copying the steps you have below on my two redhat systems be a good 
> way to proceed?
> 
Pretty much follow:

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

I had been running with the cherry-pick'ed patches for weeks and had no 
problems up to 9261f3e0026323b2c397af13d02fbc5780908143, so I am certain 
that the issue is the result of the patches between 
12ead56dffca9b3ecddc8a7860a1ef5b5361b374 and 
9dbc8974fdd2300a70293eda9c62bce20a3c9165.  The problem is you *have* to 
apply my listed cherry-picks, because if you add *any* of the TCP-related 
code Alan has been working on, it all stops compiling[1].

Cheers

[1] I am pretty sure Alan has stashed a number of patches that he has 
	not put into the publicly available git trees, as things like 
	the jumbo socket clean up patch 
	(e04b62f1bd257489bd92ccc584b0886c7e2011e8) refer to 
	my_ipaddr/my_port, which are not in any header files I have or 
	can find in 'master'
[2] http://blogamundo.net/code/screen/
-- 
Alexander Clouter
.sigmonster says: Simplicity does not precede complexity, but follows it.