RADIUS server crash due to bug #35

Liu Linyong liuly at ffcs.cn
Wed Apr 7 04:15:36 CEST 2010


Hi,

Conclusion:
Bug #35 can be reproduced rather easily.
 
We found that the FreeRADIUS server sometimes crashes under load.
To test the stability of the FreeRADIUS architecture, I modified two functions, rad_authenticate() and rad_accounting(),
so that they do only very simple things and do not call any modules.

int rad_accounting(REQUEST *request)
{
	request->reply->code = PW_ACCOUNTING_RESPONSE;
	return RLM_MODULE_OK;
}

int rad_authenticate(REQUEST *request)
{
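	VALUE_PAIR *reply_tmp = NULL;

	/* Always add a fixed Service-Type attribute to the reply. */
	reply_tmp = pairmake("Service-Type", "Framed-User", T_OP_EQ);
	if (reply_tmp) {
		pairxlatmove(request, &request->reply->vps, &reply_tmp);
		pairfree(&reply_tmp);	/* free anything left in reply_tmp */
	}

	/* Accept the request unless a reply code has already been set. */
	if (request->reply->code == 0) {
		request->reply->code = PW_AUTHENTICATION_ACK;
	}
	return RLM_MODULE_OK;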
}

I then ran 'radiusd' under gdb and load-tested it with LoadRunner, which sends thousands of requests per second.
Under this load FreeRADIUS uses little memory (16 MB) and CPU (1.6%).
After an hour or so, the server crashes. This can be reproduced easily. gdb shows the crash location:

Program received signal SIGSEGV, Segmentation fault.
0xff376a6c in fr_packet_cmp (a=0x3d4070, b=0x0) at packet.c:139
139             if (a->sockfd < b->sockfd) return -1;
(gdb) 
(gdb) 
(gdb) bt
#0  0xff376a6c in fr_packet_cmp (a=0x3d4070, b=0x0) at packet.c:139
#1  0xff3680ec in list_find (ht=0x156fc8, head=0x156fe4, reversed=2911412273, data=0xffbff808) at hash.c:191
#2  0xff3685ec in fr_hash_table_finddata (ht=0x2469d0, data=0x156fe4) at hash.c:491
#3  0xff3771d0 in fr_packet_list_find (pl=0x156aa8, request=0x3d4070) at packet.c:557
#4  0x00032b88 in received_request (listener=0x15b848, packet=0x3d4070, prequest=0xffbff938, client=0x1101d8) at event.c:2746
#5  0x0001d5bc in acct_socket_recv (listener=0x15b848, pfun=0xffbff93c, prequest=0xffbff938) at listen.c:908
#6  0x000337d0 in event_socket_handler (xel=0x158450, fd=10, ctx=0x15b848) at event.c:3316
#7  0xff377f84 in fr_event_loop (el=0x158450) at event.c:400
#8  0x000284b4 in main (argc=2, argv=0xffbffb3c) at radiusd.c:398
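
To make frame #0 concrete: the comparison at packet.c:139 reads b->sockfd, and gdb reports b=0x0, so the read faults. The following is a minimal stand-alone sketch of that pattern, not FreeRADIUS code. The names struct pkt, struct node, pkt_cmp and chain_find are hypothetical stand-ins, and the second key field (id) is only an assumption for illustration. It models a chain walk that skips entries whose data pointer is NULL instead of passing them to the comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the FreeRADIUS types. */
struct pkt {
	int sockfd;
	int id;
};

struct node {
	struct node *next;
	void *data;	/* NULL here models the b=0x0 seen in the backtrace */
};

/* Three-way compare in the style of the line gdb shows at packet.c:139. */
int pkt_cmp(const struct pkt *a, const struct pkt *b)
{
	assert(a != NULL && b != NULL);	/* a NULL argument would fault below */
	if (a->sockfd < b->sockfd) return -1;
	if (a->sockfd > b->sockfd) return +1;
	if (a->id < b->id) return -1;
	if (a->id > b->id) return +1;
	return 0;
}

/* Walk a chain and return the matching packet, skipping NULL entries. */
struct pkt *chain_find(struct node *head, const struct pkt *key)
{
	struct node *cur;

	for (cur = head; cur != NULL; cur = cur->next) {
		if (cur->data == NULL) continue;	/* never hand NULL to pkt_cmp */
		if (pkt_cmp(key, cur->data) == 0) return cur->data;
	}
	return NULL;
}
```

A guard like this would only avoid the fault at the comparison. It does not explain how a NULL entry could end up in the table in the first place, which is presumably the real problem behind the bug.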

The problem seems to be the same as bug #35 (https://bugs.freeradius.org/bugzilla/show_bug.cgi?id=35).


Environment:
MEM:    24 GB physical memory
OS:     Solaris 5.10, sparc
GCC:    3.3.2
GMAKE:  3.80
GDB:    6.8
CLIENT: LoadRunner (sending packets: auth, acct start, acct stop)
FreeRADIUS server: 2.1.7

Is there any other information I should post here?

***********************************************
Liu Linyong 2010-04-07



