talloc & threads in rlm_eap
Phil Mayers
p.mayers at imperial.ac.uk
Mon Jun 23 21:29:25 CEST 2014
On 23/06/2014 17:51, Phil Mayers wrote:
> Also, don't see what happens with marker->obj - thrown away? So don't
> see how backtrace_print() can ever find it?
>
So I rigged a copy of FR to open a thread-local file descriptor and log
every pairalloc & vp->da->name, as well as _pairfree & backtrace. It
looks like the request might be being freed in two threads - a worker
and the main. The process bombs out in the main thread as per below:
Mon Jun 23 20:25:16 2014 : Info: talloc: access after free error - first
free may be at src/lib/valuepair.c:194
Mon Jun 23 20:25:16 2014 : Info: Bad talloc magic value - access after free
Mon Jun 23 20:25:16 2014 : Info: talloc abort: Bad talloc magic value -
access after free
Mon Jun 23 20:25:16 2014 : Info: CAUGHT SIGNAL: Aborted
Mon Jun 23 20:25:16 2014 : Info: Backtrace of last 17 frames:
/opt/fr3/lib/libfreeradius-radius.so(fr_fault+0xd2)[0x7fbbbe279729]
/opt/fr3/lib/libfreeradius-radius.so(+0xaa64)[0x7fbbbe279a64]
/usr/lib64/libtalloc.so.2(talloc_get_name+0x58)[0x36ad402dd8]
/usr/lib64/libtalloc.so.2(_talloc_get_type_abort+0x2b)[0x36ad4057eb]
/opt/fr3/lib/libfreeradius-radius.so(fr_verify_vp+0x8d)[0x7fbbbe27a1ee]
/opt/fr3/lib/libfreeradius-radius.so(_fr_cursor_init+0x67)[0x7fbbbe2789df]
/opt/fr3/lib/libfreeradius-radius.so(fr_verify_list+0x2e)[0x7fbbbe27a68e]
/opt/fr3/lib/libfreeradius-server.so(+0x209eb)[0x7fbbbe4d19eb]
/opt/fr3/lib/libfreeradius-server.so(verify_request+0xd4)[0x7fbbbe4d1ac9]
/opt/fr3/sbin/radiusd[0x435799]
/opt/fr3/sbin/radiusd[0x433531]
/opt/fr3/lib/libfreeradius-radius.so(fr_event_run+0x142)[0x7fbbbe29c75d]
/opt/fr3/lib/libfreeradius-radius.so(fr_event_loop+0x509)[0x7fbbbe29d06a]
/opt/fr3/sbin/radiusd(radius_event_process+0x26)[0x43d768]
/opt/fr3/sbin/radiusd(main+0xc46)[0x42a43a]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x379dc1ed1d]
/opt/fr3/sbin/radiusd[0x40cee9]
...backtrace reveals the vp pointer was alloc'ed and free'ed in another
thread a bit earlier (18695 being the linux tid, which is != pid 18675
so is not the main thread):
/tmp/threadlog-18675-18695:allocing 0x7fbb64001820 User-Name
...
/tmp/threadlog-18675-18695:freeing 0x7fbb64001820
/tmp/threadlog-18675-18695-/opt/fr3/lib/libfreeradius-radius.so(+0x2678b)[0x7fbbbe29578b]
/tmp/threadlog-18675-18695-/usr/lib64/libtalloc.so.2[0x36ad4025a4]
/tmp/threadlog-18675-18695-/opt/fr3/lib/libfreeradius-radius.so(pairfree+0x63)[0x7fbbbe295a59]
/tmp/threadlog-18675-18695-/opt/fr3/sbin/radiusd[0x4354ee]
/tmp/threadlog-18675-18695-/opt/fr3/sbin/radiusd[0x4358f8]
/tmp/threadlog-18675-18695-/opt/fr3/sbin/radiusd[0x431073]
/tmp/threadlog-18675-18695-/lib64/libpthread.so.0[0x379e4079d1]
/tmp/threadlog-18675-18695-/lib64/libc.so.6(clone+0x6d)[0x379dce8b7d]
...unfortunately the backtrace for the thread that did doesn't tell me much.
I can extend this instrumentation further - what would be useful to know?
More information about the Freeradius-Devel
mailing list