Segfault on FR 3.0.4 when MySQL server goes away
Jonathan Gazeley
Jonathan.Gazeley at bristol.ac.uk
Tue Nov 24 13:19:27 CET 2015
I'm currently running a mixed estate of RADIUS 2 and 3 on CentOS 6 and 7
(halfway through a migration project). The older servers are running a
home-built FR 2.2.9 and the newer ones are running stock FR 3.0.4, as
supplied with CentOS 7.
I've noticed the FR 3 boxes segfault from time to time and I think I've
pinned it down to odd behaviour when the MySQL server is temporarily
unavailable. We have clustered SQL behind an F5 load balancer but there
is a small delay during failover when one of the SQL nodes goes away or
when the F5 decides to switch to a different node for some other reason.
The FR2 boxes cope fine with this transition but the FR3 boxes segfault
in about 50% of cases.
I understand 3.0.4 is not the latest and I'm happy to build 3.1.x if
this will fix the problem but initially I'm trying to work out if this
is a bug in the server, or a misconfiguration on my part (although I'd
hope a misconfiguration wouldn't cause a crash).
Backtrace included below, although this was not built with
--enable-developer since it is a busy production system. I tried to
replicate this on a dev system but the much lower load makes it hard to
cause the crash.
Thanks,
Jonathan
...
rlm_sql (uobsql-write): Reserved connection (4)
rlm_sql (uobsql-write): Executing query: 'select count(*) from
localprod.all_users_table where suspended is not null and
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (4)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (3)
rlm_sql (uobsql-write): Executing query: 'select count(*) from
localprod.all_users_table where suspended is not null and
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (3)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (2)
rlm_sql (uobsql-write): Executing query: 'select count(*) from
localprod.all_users_table where suspended is not null and
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (2)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (1)
rlm_sql (uobsql-write): Executing query: 'select count(*) from
localprod.all_users_table where suspended is not null and
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (1)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Reserved connection (0)
rlm_sql (uobsql-write): Executing query: 'select count(*) from
localprod.all_users_table where suspended is not null and
common_username='it000543''
rlm_sql_mysql: MYSQL check_error: 2006, returning RLM_SQL_RECONNECT
rlm_sql (uobsql-write): Reconnecting (0)
rlm_sql_mysql: Starting connect to MySQL server
rlm_sql_mysql: Couldn't connect socket to MySQL server
radiusd at db-write.nomadic-core.bris.ac.uk:radius
rlm_sql_mysql: Mysql error 'Lost connection to MySQL server at 'reading
initial communication packet', system error: 104'
rlm_sql (uobsql-write): Failed to reconnect (0), no free connections are
available
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff1a5f700 (LWP 23087)]
sql_error (handle=0x0, config=0x555555b58b70) at
src/modules/rlm_sql/drivers/rlm_sql_mysql/rlm_sql_mysql.c:479
479 rlm_sql_mysql_conn_t *conn = handle->conn;
(gdb) bt
#0 sql_error (handle=0x0, config=0x555555b58b70) at
src/modules/rlm_sql/drivers/rlm_sql_mysql/rlm_sql_mysql.c:479
#1 0x00007ffff298558b in sql_xlat (instance=0x555555b58b70,
request=0x555555ca2c90,
query=0x7fffd4035970 "select count(*) from
localprod.all_users_table where suspended is not null and
common_username='it000543'", out=0x7fffd4017980 "", freespace=1024)
at src/modules/rlm_sql/rlm_sql.c:185
#2 0x00007ffff7bcc606 in xlat_aprint (ctx=ctx at entry=0x555555ca2c90,
request=0x555555ca2c90, node=0x555555c62b30, escape=escape at entry=0x0,
escape_ctx=0x0, lvl=0) at src/main/xlat.c:2025
#3 0x00007ffff7bcd1bd in xlat_process (out=0x7ffff1a5df60,
request=0x555555ca2c90, head=0x555555c62b30, escape=0x0, escape_ctx=0x0)
at src/main/xlat.c:2103
#4 0x00007ffff7bcd210 in xlat_expand_struct (out=0x7ffff1a5e000,
outlen=0, request=<optimized out>, node=<optimized out>,
escape=<optimized out>, escape_ctx=<optimized out>) at src/main/xlat.c:2175
#5 0x00007ffff7bc105e in radius_expand_tmpl
(out=out at entry=0x7ffff1a5e000, request=request at entry=0x555555ca2c90,
vpt=0x555555aa28b0) at src/main/evaluate.c:135
#6 0x00007ffff7bc15ca in radius_evaluate_map
(request=request at entry=0x555555ca2c90, modreturn=modreturn at entry=10,
depth=depth at entry=0, c=c at entry=0x555555aa2620) at src/main/evaluate.c:807
#7 0x00007ffff7bc1be0 in radius_evaluate_cond
(request=request at entry=0x555555ca2c90, modreturn=10,
depth=depth at entry=0, c=0x555555aa2620) at src/main/evaluate.c:913
#8 0x0000555555577dd8 in modcall_recurse (request=0x555555ca2c90,
component=RLM_COMPONENT_POST_AUTH, depth=3,
entry=entry at entry=0x7ffff1a5e9d8) at src/main/modcall.c:489
#9 0x0000555555577a70 in modcall_child (request=<optimized out>,
component=<optimized out>, depth=<optimized out>, entry=0x7ffff1a5e9c0,
c=<optimized out>, result=0x7ffff1a5e3e4) at src/main/modcall.c:420
#10 0x0000555555577c67 in modcall_recurse (request=0x555555ca2c90,
component=RLM_COMPONENT_POST_AUTH, depth=2,
entry=entry at entry=0x7ffff1a5e9c0) at src/main/modcall.c:798
#11 0x0000555555577a70 in modcall_child (request=<optimized out>,
component=<optimized out>, depth=<optimized out>, entry=0x7ffff1a5e9a8,
c=<optimized out>, result=0x7ffff1a5e5f4) at src/main/modcall.c:420
#12 0x0000555555577c67 in modcall_recurse (request=0x555555ca2c90,
component=RLM_COMPONENT_POST_AUTH, depth=1,
entry=entry at entry=0x7ffff1a5e9a8) at src/main/modcall.c:798
#13 0x0000555555577a70 in modcall_child (request=<optimized out>,
component=<optimized out>, depth=<optimized out>, entry=0x7ffff1a5e990,
c=<optimized out>, result=0x7ffff1a5e804) at src/main/modcall.c:420
#14 0x0000555555577c67 in modcall_recurse
(request=request at entry=0x555555ca2c90,
component=component at entry=RLM_COMPONENT_POST_AUTH, depth=depth at entry=0,
entry=entry at entry=0x7ffff1a5e990)
at src/main/modcall.c:798
#15 0x0000555555578e90 in modcall
(component=component at entry=RLM_COMPONENT_POST_AUTH,
c=c at entry=0x555555c33170, request=request at entry=0x555555ca2c90) at
src/main/modcall.c:1138
#16 0x00005555555762bf in indexed_modcall
(comp=comp at entry=RLM_COMPONENT_POST_AUTH, idx=idx at entry=0,
request=request at entry=0x555555ca2c90) at src/main/modules.c:909
#17 0x0000555555576eef in process_post_auth
(postauth_type=postauth_type at entry=0,
request=request at entry=0x555555ca2c90) at src/main/modules.c:1930
#18 0x000055555556724e in rad_postauth
(request=request at entry=0x555555ca2c90) at src/main/auth.c:305
#19 0x000055555558185c in request_finish
(request=request at entry=0x555555ca2c90, action=action at entry=1) at
src/main/process.c:1346
#20 0x00005555555866cf in request_running (request=0x555555ca2c90,
action=<optimized out>) at src/main/process.c:1535
#21 0x0000555555580617 in request_handler_thread (arg=0x555555c5f250) at
src/main/threads.c:678
#22 0x00007ffff625ddf5 in start_thread (arg=0x7ffff1a5f700) at
pthread_create.c:308
#23 0x00007ffff5b0e1ad in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:113
More information about the Freeradius-Users
mailing list