fr_packet_cmp again
Josip Almasi
joe at vrspace.org
Wed Apr 27 21:17:07 CEST 2011
Hi,
I compiled the current 2.1.11 branch on RHEL 6.0 and RHEL 5.5 and hit it
with about 7.5K req/s; it dies on both after about 1-3.5M requests.
Hit with about 1000 req/s, it still dies.
It's always a segfault in libfreeradius.
(There are more issues on RHEL 6, but they're not critical.)
It's got nothing to do with max_request_time; tried changing it just in
case, but I get response time in milliseconds anyway.
It's not about threads; tried with -s, and it still dies.
I'm reasonably sure it's not about load either; I just can't wait
something like 100000 secs for it to fail.
All the time, I get decent response times (<20ms auth, <200ms acct) and
no retries. At least, there's no 'ignoring dupes' reported in
radius.log; in fact there's nothing except 'ready' in radius.log.
In fact, it seems bug #35 is still on:
-----------------
Program received signal SIGSEGV, Segmentation fault.
fr_packet_cmp (a=0x11edc90, b=0xfb0b098dddf10f89) at packet.c:139
139 if (a->sockfd < b->sockfd) return -1;
(gdb) bt
#0 fr_packet_cmp (a=0x11edc90, b=0xfb0b098dddf10f89) at packet.c:139
#1 0x00007ffff7bc551b in list_find (ht=0x85aa80, data=0x7fffffffe1c8)
at hash.c:191
#2 fr_hash_table_find (ht=0x85aa80, data=0x7fffffffe1c8) at hash.c:454
#3 0x00007ffff7bc5569 in fr_hash_table_finddata (ht=<value optimized out>,
data=<value optimized out>) at hash.c:484
#4 0x00007ffff7bd30ea in fr_packet_list_find (pl=<value optimized out>,
request=0x11edc90) at packet.c:583
#5 0x0000000000428569 in received_request (listener=0x8686a0,
packet=0x11edc90, prequest=0x7fffffffe320, client=0x79d680) at
event.c:2833
#6 0x0000000000414ee3 in auth_socket_recv (listener=0x8686a0,
pfun=0x7fffffffe328, prequest=0x7fffffffe320) at listen.c:857
#7 0x00000000004291a0 in event_socket_handler (xel=<value optimized out>,
fd=<value optimized out>, ctx=0x8686a0) at event.c:3423
#8 0x00007ffff7bd437b in fr_event_loop (el=0x854c50) at event.c:413
#9 0x000000000041beb4 in main (argc=<value optimized out>,
argv=<value optimized out>) at radiusd.c:408
(gdb) print a=0x11edc90
$1 = (const RADIUS_PACKET *) 0x11edc90
(gdb) print $1->sockfd
$2 = 10
(gdb) print b=0xfb0b098dddf10f89
$3 = (const RADIUS_PACKET *) 0xfb0b098dddf10f89
(gdb) print $3->sockfd
Cannot access memory at address 0xfb0b098dddf10f89
-----------------
I did this a number of times; sometimes it's auth_socket_recv, but mostly
acct_socket_recv.
So it seems something still frees the data before yanking it from the
list...?
Well, beats me.
Any hints?
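To make the suspicion concrete, here is a minimal sketch of the ordering I
mean. It is not FreeRADIUS code: packet_t, list_insert, list_find and
list_yank_and_free are hypothetical stand-ins, and only the sockfd test
mirrors the line shown at packet.c:139. If an entry were freed while still
linked in the bucket chain, the next find would compare against a dangling
pointer, much like b above. Unlinking first avoids that:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for RADIUS_PACKET; only the fields compared here. */
typedef struct packet {
    int sockfd;
    int id;
    struct packet *next;   /* link in one hash bucket's chain */
} packet_t;

/* Compare by sockfd first (as at packet.c:139), then by id (assumed). */
static int packet_cmp(const packet_t *a, const packet_t *b)
{
    if (a->sockfd < b->sockfd) return -1;
    if (a->sockfd > b->sockfd) return +1;
    return a->id - b->id;
}

/* Push an entry onto the front of the chain. */
static void list_insert(packet_t **head, packet_t *p)
{
    p->next = *head;
    *head = p;
}

/* Walk the chain; every node visited is dereferenced by packet_cmp. */
static packet_t *list_find(packet_t *head, const packet_t *key)
{
    for (packet_t *cur = head; cur != NULL; cur = cur->next) {
        if (packet_cmp(key, cur) == 0) return cur;
    }
    return NULL;
}

/* Unlink the matching entry from the chain FIRST, and only then free it.
 * Returns 1 if an entry was removed, 0 if nothing matched. */
static int list_yank_and_free(packet_t **head, const packet_t *key)
{
    for (packet_t **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (packet_cmp(key, *pp) == 0) {
            packet_t *dead = *pp;
            *pp = dead->next;   /* chain no longer references dead */
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

With this ordering, a later list_find for the same key walks only live
nodes and returns NULL instead of touching freed memory.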
Now, how to reproduce:
- get spizd 0.5 from http://sourceforge.net/projects/spizd/files/
- unzip, have java in path
- edit etc/dictionary.txt, enter at least one username:password pair
- run bin/spizd-radius.sh <server>, verify you get Access-Accept
- edit etc/spizd.properties: change verboseThis and verboseThat to
false, change spizd.circular to true
- run spizd-radius.sh again and wait, it dies about 10-15 mins later
- optionally, increase maxThreads to kill it faster
What this test does: Access-Request, Acct Start, (optionally a number of
Interim-Update), Acct Stop - sending each request immediately after the
response to the previous request is received, one thread per session.
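For anyone who wants to reproduce this without spizd, the per-session
sequence can be sketched roughly like this. This is an illustration only,
not spizd's code: step_t and build_session are made-up names, and the
actual sending, waiting for each response, and the thread per session are
left out.

```c
#include <stddef.h>

/* Hypothetical labels for the requests one test session sends, in order. */
typedef enum {
    STEP_ACCESS_REQUEST,
    STEP_ACCT_START,
    STEP_ACCT_INTERIM,
    STEP_ACCT_STOP
} step_t;

/* Fill out[] with one session's steps: Access-Request, Acct Start,
 * interim_updates x Interim-Update, Acct Stop.
 * Returns the number of steps written, or 0 if cap is too small. */
static size_t build_session(step_t *out, size_t cap, size_t interim_updates)
{
    size_t need = 3 + interim_updates;
    size_t n = 0;

    if (cap < need) return 0;

    out[n++] = STEP_ACCESS_REQUEST;
    out[n++] = STEP_ACCT_START;
    for (size_t i = 0; i < interim_updates; i++) {
        out[n++] = STEP_ACCT_INTERIM;
    }
    out[n++] = STEP_ACCT_STOP;
    return n;
}
```

A driver would walk the array, send each step, and block for the reply
before sending the next; running one such loop per thread gives the load
pattern described above.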
Regards...