Crashes in 2.1.8 when handling received auth packets
aland at deployingradius.com
Wed Feb 3 13:49:28 CET 2010
John Morrissey wrote:
> We recently upgraded from 2.0.4 to 2.1.8 and are now noticing occasional
> segfaults when handling received auth packets. Representative backtraces are
> below. In all cases, all threads are idle except one, which is receiving an
> auth packet.
> In the first case, auth_socket_recv() passes a NULL packet to
> received_request(), which is strange since auth_socket_recv() checks for
> that case immediately before.
I don't see that in the output you posted...
> In the second case, received_request() gets a bogus pointer to the packet,
> apparently from rad_recv().
> #0 fr_packet_cmp (a=0x7f7179f94070, b=0x0) at packet.c:139
> #3 0x00007f71e33ad4cb in fr_packet_list_find (pl=<value optimized out>, request=0x7f7179f94070) at packet.c:581
Note that it's the same pointer: "request == a". The issue is that *b*
is NULL, i.e. an entry in the cache of "known" packets has been
mysteriously free'd, *without* removing it from the cache!
> #4 0x0000000000427b49 in received_request (listener=0x15505d0, packet=0x0, prequest=0x5b0000, client=0x7f71e33b0fc0) at event.c:2775
> packet_p = <value optimized out>
> request = (REQUEST *) 0x0
That can be NULL for the first few lines of received_request(). It's
not a problem.
> Thread 1 (Thread 0x7f90c1422ae0 (LWP 11822)):
> #0 fr_packet_cmp (a=0x7f90213b6ee0, b=0x144) at packet.c:139
That isn't a good value for the 'b' pointer. Here, it looks like the
hash entry was free'd, and that memory is now being used for something else.
i.e. there's a "free request" call which does *not* first do "remove
packet from the cache".
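A minimal sketch of the ordering that has to hold, using a toy fixed-size
cache rather than the real packet list. Every name here (toy_packet,
cache_insert, cache_find, cache_remove, packet_free) is hypothetical and
is not the FreeRADIUS API; the only point is that the free path removes
the entry from the cache before releasing the memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SLOTS 8

/* Hypothetical stand-in for a cached packet; not a FreeRADIUS type. */
typedef struct toy_packet {
    int id;                     /* slot index in the toy cache */
} toy_packet;

/* Toy "known packets" cache: one pointer per id. */
static toy_packet *cache[CACHE_SLOTS];

/* Insert a packet. Returns 0 on success, -1 if the id is out of
 * range or the slot is already taken. */
static int cache_insert(toy_packet *p)
{
    assert(p != NULL);
    if (p->id < 0 || p->id >= CACHE_SLOTS) return -1;
    if (cache[p->id] != NULL) return -1;
    cache[p->id] = p;
    return 0;
}

/* Look up a packet by id. Returns NULL if nothing is cached there. */
static toy_packet *cache_find(int id)
{
    if (id < 0 || id >= CACHE_SLOTS) return NULL;
    return cache[id];
}

/* Drop the cache's pointer to this packet, if it holds one. */
static void cache_remove(toy_packet *p)
{
    if (p->id >= 0 && p->id < CACHE_SLOTS && cache[p->id] == p) {
        cache[p->id] = NULL;
    }
}

/* Free a packet. The cache entry is removed FIRST, so the cache
 * never holds a pointer into freed memory. */
static void packet_free(toy_packet *p)
{
    if (p == NULL) return;
    cache_remove(p);
    free(p);
}
```

If packet_free() skipped the cache_remove() call, a later cache_find()
would hand back a pointer to freed (and possibly reused) memory, which
is the kind of bogus 'b' value seen in the backtrace above.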
My $0.02 is to re-build with all of the debugging turned on (i.e.
without -DNDEBUG, and with any "-O" deleted from the CFLAGS in
Make.inc). This leaves the assertions in the code, and keeps the
variables from showing up as "<value optimized out>" in gdb.
There are assertions in the "free request" function that try to make
sure the request has been removed from the packet cache. If those
assertions are hit, then the stack trace will show the bug.
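As a rough illustration of that kind of check (a sketch only, not the
actual server code): the toy_request type, its in_cache flag, and both
functions below are made-up names. The assert() is compiled out when
NDEBUG is defined, which is why the rebuild has to leave it undefined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical request object; not the FreeRADIUS REQUEST struct. */
typedef struct toy_request {
    bool in_cache;      /* true while the packet cache points at us */
} toy_request;

/* Allocate a zeroed request (in_cache starts out false). */
static toy_request *toy_request_alloc(void)
{
    return calloc(1, sizeof(toy_request));
}

/* Free a request and clear the caller's pointer.
 *
 * The assert() aborts if the request is still referenced by the
 * cache, so the stack trace points at the caller that forgot to
 * remove it. With -DNDEBUG the check disappears entirely. */
static void toy_request_free(toy_request **req)
{
    if (req == NULL || *req == NULL) return;

    assert(!(*req)->in_cache);

    free(*req);
    *req = NULL;
}
```

A caller that clears in_cache (i.e. removes the request from the cache)
before calling toy_request_free() passes the check; one that does not
will abort right at the faulty call site instead of crashing later in
some unrelated lookup.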
If they aren't hit... it will be a lot harder to track this down.