Crash due to fr_packet_cmp

Alan DeKok aland at deployingradius.com
Sun Nov 15 21:19:55 CET 2009


fabiana marvani wrote:
> After some time with load the freeradius crashes
> 
> We first noticed this crash with our plugins activated, but then we deactivated
> all plugins and used "default" configuration:

  There have been a few reports similar to this.  They all require
sending the server tens of millions of packets over a long time.
This makes it hard to reproduce and debug.

  It's likely a race condition.  But it's hard to say where, or why.

> core.8555
> Program terminated with signal 11, Segmentation fault.
> (gdb) bt
> #0  fr_packet_cmp (a=0xb4897cd8, b=0x0) at packet.c:139
...
> #6  0x0806cf83 in received_request (listener=0x8d4f608, packet=0xb4897cd8,
> prequest=0xbf89d0dc, client=0x8d2fc80)
>     at event.c:2723

  The server keeps all packets in a hash, to ensure it catches
duplicates, etc.  The hash is keyed by the packet's source and
destination IP address and port.  The crash happens because the
"request" structure is still in the hash, though the "packet" entry in
that structure has become NULL (hence b=0x0 in frame #0 above).
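
  As a rough illustration of that failure mode, here is a minimal
sketch.  It is NOT the server's code: "struct pkt", "struct req",
pkt_cmp() and req_cmp_checked() are hypothetical names made up for this
example.  It only shows why a comparison keyed on src/dst ip/port
faults when one side is NULL, and what a guarded comparison could look
like.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a packet's addressing fields. */
struct pkt {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Hypothetical stand-in for a request that points at its packet. */
struct req {
    struct pkt *packet;   /* NULL once the packet has been released */
};

/*
 * Three-way compare on the src/dst ip/port tuple.  Both arguments are
 * dereferenced unconditionally, so calling this with b == NULL is
 * undefined behaviour (in practice, usually a segmentation fault).
 */
int pkt_cmp(const struct pkt *a, const struct pkt *b)
{
    if (a->src_ip != b->src_ip)     return (a->src_ip < b->src_ip) ? -1 : 1;
    if (a->dst_ip != b->dst_ip)     return (a->dst_ip < b->dst_ip) ? -1 : 1;
    if (a->src_port != b->src_port) return (a->src_port < b->src_port) ? -1 : 1;
    if (a->dst_port != b->dst_port) return (a->dst_port < b->dst_port) ? -1 : 1;
    return 0;
}

/*
 * Guarded variant for this sketch: returns -2 when either request has
 * no packet, instead of dereferencing NULL.
 */
int req_cmp_checked(const struct req *a, const struct req *b)
{
    if (!a || !b || !a->packet || !b->packet) return -2;
    return pkt_cmp(a->packet, b->packet);
}
```

  A guard like this would only hide the symptom.  It does not explain
how a request with a NULL packet ended up in the hash in the first
place.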

  The only problem is... the packet entry is *only* set to NULL after
the request has been deleted from the hash.  And *only* the main thread
adds/deletes entries to the hash.  And *only* the main thread allocates
or frees request data structures.

  So this is a problem that the design of the server should rule out
completely.
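
  The ordering described above can be sketched as follows.  Again,
this is only an illustration under assumptions: the fixed-size table
and the names tracked_pkt, tracked_req, table_insert(), table_remove(),
req_release() and table_count_stale() are invented for this example and
are not the server's data structures.  The point is the invariant: a
request is unlinked from the table first, and only then is its packet
pointer cleared, so no entry in the table should ever have a NULL
packet.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical stand-ins for a packet and the request that owns it. */
struct tracked_pkt { int id; };
struct tracked_req { struct tracked_pkt *packet; };

/* Toy replacement for the hash: a fixed array of request pointers. */
struct table { struct tracked_req *slot[TABLE_SIZE]; };

/* Insert into the first free slot; 0 on success, -1 if the table is full. */
int table_insert(struct table *t, struct tracked_req *r)
{
    assert(r->packet != NULL);   /* never track a request without a packet */
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (t->slot[i] == NULL) {
            t->slot[i] = r;
            return 0;
        }
    }
    return -1;
}

/* Unlink a request; returns 1 if it was present, 0 otherwise. */
int table_remove(struct table *t, struct tracked_req *r)
{
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (t->slot[i] == r) {
            t->slot[i] = NULL;
            return 1;
        }
    }
    return 0;
}

/* Release path: unlink from the table FIRST, then clear the packet. */
void req_release(struct table *t, struct tracked_req *r)
{
    table_remove(t, r);
    r->packet = NULL;
}

/* Count entries that violate the invariant (still tracked, packet NULL). */
size_t table_count_stale(const struct table *t)
{
    size_t stale = 0;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (t->slot[i] != NULL && t->slot[i]->packet == NULL) stale++;
    }
    return stale;
}
```

  With only one thread calling table_insert() and req_release(), a
check like table_count_stale() should always return 0.  A non-zero
count would mean something cleared the packet pointer without going
through the release path, which is what the backtrace suggests.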

  Some questions:

- which OS and CPU (32-bit or 64-bit)?
- which version of the server?
- which command line was used to run the server?
- is this reproducible in non-threaded mode (radiusd -fs)?

  If you are using an older version of the server, please also try with
the current git "stable" branch (see git.freeradius.org).  It has some
changes which give it only one code path for doing certain kinds of
request mangling.  This makes errors, race conditions, etc. less
likely.

  Alan DeKok.


