multiple freeradius instances with sqlippool, is it safe?

Matteo Sgalaberni sgala at sgala.com
Tue Mar 22 10:24:14 UTC 2022


----- On 22 Mar, 2022, at 10:00, Nathan Ward lists+freeradius at daork.net wrote:
> If you want some extra confidence, you may note that a single FreeRADIUS server
> establishes multiple connections to the database server and runs transactions
> in parallel - database connections from different RADIUS servers is no
> different in terms of concurrency in the database than multiple connections
> from one server.
> 
> Note there has been work in the past several years on those sqlippool queries to
> improve performance and so on - you should make sure you’re running the latest
> code.

Ok thanks Nathan for your feedback! In my tests seems that some weird events happen.

See this log:
Mar 22 10:25:01 HOST patroni[2142563]: 2022-03-22 10:25:01 CET 623995bc.20b163 pg_user_radius1 X.X.X.X(36340)LOG:  statement: UPDATE radippool SET nasipaddress = 'Y.Y.Y.Y', pool_key = '28905475', callingstationid = '', username = 'USER1', expiry_time = 'now'::timestamp(0) + '3600 second'::interval WHERE framedipaddress = 'Z.Z.Z.Z'
Mar 22 10:25:01 HOST patroni[2130290]: 2022-03-22 10:25:01 CET 62397479.208172 pg_user_radius2 X.X.X.X(38266)LOG:  statement: UPDATE radacct SET AcctStopTime = TO_TIMESTAMP(1647941081), AcctUpdateTime = TO_TIMESTAMP(1647941081), AcctSessionTime = COALESCE(26332, (1647941081 - EXTRACT(EPOCH FROM(AcctStartTime)))), AcctInputOctets = (('0'::bigint << 32) + '39282628'::bigint), AcctOutputOctets = (('0'::bigint << 32) + '603682375'::bigint), AcctTerminateCause = 'Port-Error', FramedIPAddress = NULLIF('Z.Z.Z.Z', '')::inet, FramedIPv6Address = NULLIF('', '')::inet, FramedIPv6Prefix = NULLIF('', '')::inet, FramedInterfaceId = NULLIF('', ''), DelegatedIPv6Prefix = NULLIF('', '')::inet, ConnectInfo_stop = '' WHERE AcctUniqueId = '355a47bc7d94803b22248312771a6d18' AND AcctStopTime IS NULL
Mar 22 10:25:01 HOST patroni[2130291]: 2022-03-22 10:25:01 CET 6239747a.208173 pg_user_radius2 X.X.X.X(38268)LOG:  statement: UPDATE radippool SET nasipaddress = '', pool_key = 0, callingstationid = '', expiry_time = 'now'::timestamp(0) - '1 second'::interval WHERE nasipaddress = 'Y.Y.Y.Y' AND pool_key = '28905475' AND username = 'USER1' AND callingstationid = '' AND framedipaddress = 'Z.Z.Z.Z'
Mar 22 10:25:21 HOST patroni[2130288]: 2022-03-22 10:25:21 CET 62397479.208170 pg_user_radius2 X.X.X.X(38262)LOG:  statement: INSERT INTO radacct (AcctSessionId, AcctUniqueId, UserName, Realm, NASIPAddress, NASPortId, NASPortType, AcctStartTime, AcctUpdateTime, AcctStopTime, AcctSessionTime, AcctAuthentic, ConnectInfo_start, ConnectInfo_Stop, AcctInputOctets, AcctOutputOctets, CalledStationId, CallingStationId, AcctTerminateCause, ServiceType, FramedProtocol, FramedIpAddress, FramedIpv6Address, FramedIpv6Prefix, FramedInterfaceId, DelegatedIpv6Prefix) VALUES('0/0/1/2961.3_0180D016', 'e417ebdf297f3734c255d14089716c22', 'USER1', NULLIF('', ''), 'Y.Y.Y.Y', NULLIF('0/0/1/2961.3', ''), 'Ethernet', TO_TIMESTAMP(1647941101), TO_TIMESTAMP(1647941101), NULL, 0, 'RADIUS', '', NULL, 0, 0, '', '', NULL, 'Framed-User', 'PPP', NULLIF('Z.Z.Z.Z', '')::inet, NULLIF('', '')::inet, NULLIF('', '')::inet, NULLIF('', ''), NULLIF('', '')::inet) ON CONFLICT (AcctUniqueId) DO UPDATE SET AcctStartTime = TO_TIMESTAMP(1647941101), AcctUpdateTime = TO_TIMESTAMP(1647941101), ConnectInfo_start = '' WHERE radacct.AcctUniqueId = 'e417ebdf297f3734c255d14089716c22' AND radacct.AcctStopTime IS NULL

This timeline happened:
- radius1 received the auth packet and sqlippool assigned the ip (correct information on radippool)
- radius2 received the accounting stop and sqlippool freed the ip (destroying the information on radippool of that ip address previously assigned)
- radius2 received the accounting start and do nothing on 

So at the end of this timeline I have a wrong entry on the table (the user is actually connected but not for the radipoool):
  id  |    pool_name    | framedipaddress | nasipaddress | pool_key | calledstationid | callingstationid |     expiry_time     |   username
------+-----------------+-----------------+--------------+----------+-----------------+------------------+---------------------+---------------
 1686 | dyn_ipv4        | Z.Z.Z.Z         |              | 0        |                 |                  | 2022-03-22 10:25:01 | USER1


Seems to be a concurrency issue occurred because the BRAS (Cisco ASR1001X) sent the auth packet and the acct packets to different radius servers and executed the queries in a logical wrong order. 

Can this scenario happen also in a single server setup? Eg: the accounting stop packet is processed after the authentication packet.

My impression is that the BRAS should send the auth and acct to the same server. Also the radius should process the packets in the right order to prevent this issue.

Can you share with me your BRAS "aaa *" configuration?

Thanks

Matteo


More information about the Freeradius-Users mailing list