Segfault on FR 3.0.4 when MySQL server goes away

Jonathan Gazeley Jonathan.Gazeley at bristol.ac.uk
Tue Nov 24 13:19:27 CET 2015


I'm currently running a mixed estate of RADIUS 2 and 3 on CentOS 6 and 7 
(halfway through a migration project). The older servers are running a 
home-built FR 2.2.9 and the newer ones are running stock FR 3.0.4, as 
supplied with CentOS 7.

I've noticed the FR 3 boxes segfault from time to time and I think I've 
pinned it down to odd behaviour when the MySQL server is temporarily 
unavailable. We have clustered SQL behind an F5 load balancer but there 
is a small delay during failover when one of the SQL nodes goes away or 
when the F5 decides to switch to a different node for some other reason. 
The FR2 boxes cope fine with this transition but the FR3 boxes segfault 
in about 50% of cases.

I understand 3.0.4 is not the latest and I'm happy to build 3.1.x if 
this will fix the problem but initially I'm trying to work out if this 
is a bug in the server, or a misconfiguration on my part (although I'd 
hope a misconfiguration wouldn't cause a crash).

Backtrace included below, although this was not built with 
--enable-developer since it is a busy production system. I tried to 
replicate this on a dev system but the much lower load makes it hard to 
cause the crash.

Thanks,
Jonathan

...
rlm_sql (uobsql-write): Reserved connection (4)
rlm_sql (uobsql-write): Executing query: 'select count(*) from 
localprod.all_users_table where suspended is not null and 
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (4)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server 
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading 
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (3)
rlm_sql (uobsql-write): Executing query: 'select count(*) from 
localprod.all_users_table where suspended is not null and 
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (3)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server 
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading 
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (2)
rlm_sql (uobsql-write): Executing query: 'select count(*) from 
localprod.all_users_table where suspended is not null and 
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (2)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server 
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading 
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (1)
rlm_sql (uobsql-write): Executing query: 'select count(*) from 
localprod.all_users_table where suspended is not null and 
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (1)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server 
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading 
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (0)
rlm_sql (uobsql-write): Executing query: 'select count(*) from 
localprod.all_users_table where suspended is not null and 
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (0)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server 
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading 
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Failed to reconnect (0), no free connections are 
available

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff1a5f700 (LWP 23087)]
sql_error (handle=0x0, config=0x555555b58b70) at 
src/modules/rlm_sql/drivers/rlm_sql_mysql/rlm_sql_mysql.c:479
479        rlm_sql_mysql_conn_t *conn = handle->conn;

(gdb) bt
#0  sql_error (handle=0x0, config=0x555555b58b70) at 
src/modules/rlm_sql/drivers/rlm_sql_mysql/rlm_sql_mysql.c:479
#1  0x00007ffff298558b in sql_xlat (instance=0x555555b58b70, 
request=0x555555ca2c90,
     query=0x7fffd4035970 "select count(*) from 
localprod.all_users_table where suspended is not null and 
common_username='it000543'", out=0x7fffd4017980 "", freespace=1024)
     at src/modules/rlm_sql/rlm_sql.c:185
#2  0x00007ffff7bcc606 in xlat_aprint (ctx=ctx at entry=0x555555ca2c90, 
request=0x555555ca2c90, node=0x555555c62b30, escape=escape at entry=0x0, 
escape_ctx=0x0, lvl=0) at src/main/xlat.c:2025
#3  0x00007ffff7bcd1bd in xlat_process (out=0x7ffff1a5df60, 
request=0x555555ca2c90, head=0x555555c62b30, escape=0x0, escape_ctx=0x0) 
at src/main/xlat.c:2103
#4  0x00007ffff7bcd210 in xlat_expand_struct (out=0x7ffff1a5e000, 
outlen=0, request=<optimized out>, node=<optimized out>, 
escape=<optimized out>, escape_ctx=<optimized out>) at src/main/xlat.c:2175
#5  0x00007ffff7bc105e in radius_expand_tmpl 
(out=out at entry=0x7ffff1a5e000, request=request at entry=0x555555ca2c90, 
vpt=0x555555aa28b0) at src/main/evaluate.c:135
#6  0x00007ffff7bc15ca in radius_evaluate_map 
(request=request at entry=0x555555ca2c90, modreturn=modreturn at entry=10, 
depth=depth at entry=0, c=c at entry=0x555555aa2620) at src/main/evaluate.c:807
#7  0x00007ffff7bc1be0 in radius_evaluate_cond 
(request=request at entry=0x555555ca2c90, modreturn=10, 
depth=depth at entry=0, c=0x555555aa2620) at src/main/evaluate.c:913
#8  0x0000555555577dd8 in modcall_recurse (request=0x555555ca2c90, 
component=RLM_COMPONENT_POST_AUTH, depth=3, 
entry=entry at entry=0x7ffff1a5e9d8) at src/main/modcall.c:489
#9  0x0000555555577a70 in modcall_child (request=<optimized out>, 
component=<optimized out>, depth=<optimized out>, entry=0x7ffff1a5e9c0, 
c=<optimized out>, result=0x7ffff1a5e3e4) at src/main/modcall.c:420
#10 0x0000555555577c67 in modcall_recurse (request=0x555555ca2c90, 
component=RLM_COMPONENT_POST_AUTH, depth=2, 
entry=entry at entry=0x7ffff1a5e9c0) at src/main/modcall.c:798
#11 0x0000555555577a70 in modcall_child (request=<optimized out>, 
component=<optimized out>, depth=<optimized out>, entry=0x7ffff1a5e9a8, 
c=<optimized out>, result=0x7ffff1a5e5f4) at src/main/modcall.c:420
#12 0x0000555555577c67 in modcall_recurse (request=0x555555ca2c90, 
component=RLM_COMPONENT_POST_AUTH, depth=1, 
entry=entry at entry=0x7ffff1a5e9a8) at src/main/modcall.c:798
#13 0x0000555555577a70 in modcall_child (request=<optimized out>, 
component=<optimized out>, depth=<optimized out>, entry=0x7ffff1a5e990, 
c=<optimized out>, result=0x7ffff1a5e804) at src/main/modcall.c:420
#14 0x0000555555577c67 in modcall_recurse 
(request=request at entry=0x555555ca2c90, 
component=component at entry=RLM_COMPONENT_POST_AUTH, depth=depth at entry=0, 
entry=entry at entry=0x7ffff1a5e990)
     at src/main/modcall.c:798
#15 0x0000555555578e90 in modcall 
(component=component at entry=RLM_COMPONENT_POST_AUTH, 
c=c at entry=0x555555c33170, request=request at entry=0x555555ca2c90) at 
src/main/modcall.c:1138
#16 0x00005555555762bf in indexed_modcall 
(comp=comp at entry=RLM_COMPONENT_POST_AUTH, idx=idx at entry=0, 
request=request at entry=0x555555ca2c90) at src/main/modules.c:909
#17 0x0000555555576eef in process_post_auth 
(postauth_type=postauth_type at entry=0, 
request=request at entry=0x555555ca2c90) at src/main/modules.c:1930
#18 0x000055555556724e in rad_postauth 
(request=request at entry=0x555555ca2c90) at src/main/auth.c:305
#19 0x000055555558185c in request_finish 
(request=request at entry=0x555555ca2c90, action=action at entry=1) at 
src/main/process.c:1346
#20 0x00005555555866cf in request_running (request=0x555555ca2c90, 
action=<optimized out>) at src/main/process.c:1535
#21 0x0000555555580617 in request_handler_thread (arg=0x555555c5f250) at 
src/main/threads.c:678
#22 0x00007ffff625ddf5 in start_thread (arg=0x7ffff1a5f700) at 
pthread_create.c:308
#23 0x00007ffff5b0e1ad in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:113



More information about the Freeradius-Users mailing list