2.1.8 proxy zombie/dead/alive loops

Craig Campbell craig at ccraft.ca
Mon Jan 4 17:31:09 CET 2010


In testing the new 2.1.8 for accounting packets I have discovered the following strange situation,

There are 2 radius servers (radius-a and radius-b).
Each server will relay packets it receives to the other server. (Currently only accounting packets are being received)
The packets are collected in detail-relay file.
The packets are then relayed via the sites/enabled/copy-acct-to-home-server config.

What I observe is a single packet being read from the detail-relay.work file on radius-b and being sent radius-a.
I do not see any response from radius-a being returned to radius-b. 
After what seems to be about 30 seconds the packet is resent from radius-b to radius-a.  Again and again...

On radius-b the following messages are logged (status_check = status-server)....

  Mon Jan  4 10:10:15 2010 : Info: rlm_sql (sql): Driver rlm_sql_mysql (module rlm_sql_mysql) loaded and linked
  Mon Jan  4 10:10:15 2010 : Info: rlm_sql (sql): Attempting to connect to radiusd at localhost:/radius 
  Mon Jan  4 10:10:15 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #0
  Mon Jan  4 10:10:15 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #1
  Mon Jan  4 10:10:15 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #2
  Mon Jan  4 10:10:15 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #3
  Mon Jan  4 10:10:15 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #4
  Mon Jan  4 10:10:15 2010 : Info: Loaded virtual server inner-tunnel
  Mon Jan  4 10:10:15 2010 : Info: Loaded virtual server copy-acct-to-home-server
  Mon Jan  4 10:10:15 2010 : Info: Loaded virtual server <default>
  Mon Jan  4 10:10:15 2010 : Info: Ready to process requests.
  Mon Jan  4 10:10:42 2010 : Proxy: Marking home server 192.168.1.225 port 1813 as zombie (it looks like it is dead).
  Mon Jan  4 10:10:42 2010 : Proxy: Received response to status check 5938 (1 in current sequence)
  Mon Jan  4 10:11:11 2010 : Proxy: Received response to status check 6013 (2 in current sequence)
  Mon Jan  4 10:11:40 2010 : Proxy: Received response to status check 6048 (3 in current sequence)
  Mon Jan  4 10:11:40 2010 : Proxy: Marking home server 192.168.1.225 port 1813 alive
  Mon Jan  4 10:11:43 2010 : Proxy: Marking home server 192.168.1.225 port 1813 as zombie (it looks like it is dead).
  Mon Jan  4 10:11:43 2010 : Proxy: Received response to status check 6051 (4 in current sequence)
  Mon Jan  4 10:11:43 2010 : Proxy: Marking home server 192.168.1.225 port 1813 alive
  Mon Jan  4 10:12:13 2010 : Proxy: Marking home server 192.168.1.225 port 1813 as zombie (it looks like it is dead).
  Mon Jan  4 10:12:13 2010 : Proxy: Received response to status check 6086 (5 in current sequence)
  Mon Jan  4 10:12:13 2010 : Proxy: Marking home server 192.168.1.225 port 1813 alive
  Mon Jan  4 10:12:44 2010 : Proxy: Marking home server 192.168.1.225 port 1813 as zombie (it looks like it is dead).
  Mon Jan  4 10:12:44 2010 : Proxy: Received response to status check 6118 (6 in current sequence)
  Mon Jan  4 10:12:44 2010 : Proxy: Marking home server 192.168.1.225 port 1813 alive
  Mon Jan  4 10:13:14 2010 : Proxy: Marking home server 192.168.1.225 port 1813 as zombie (it looks like it is dead).
  Mon Jan  4 10:13:14 2010 : Proxy: Received response to status check 6153 (7 in current sequence)
  Mon Jan  4 10:13:14 2010 : Proxy: Marking home server 192.168.1.225 port 1813 alive

On radius-a for the same time frame...

  Mon Jan  4 10:11:55 2010 : Proxy: Marking home server 192.168.1.226 port 1813 as zombie (it looks like it is dead).
  Mon Jan  4 10:11:55 2010 : Proxy: Received response to status check 669399 (6 in current sequence)
  Mon Jan  4 10:11:55 2010 : Proxy: Marking home server 192.168.1.226 port 1813 alive



I then tried using the alternate "status_check = request " method, but I think I have it configured incorrectly since it is also not receiving a reply...

  Mon Jan  4 11:52:16 2010 : Info: rlm_sql (sql): Driver rlm_sql_mysql (module rlm_sql_mysql) loaded and linked
  Mon Jan  4 11:52:16 2010 : Info: rlm_sql (sql): Attempting to connect to radiusd at localhost:/radius
  Mon Jan  4 11:52:16 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #0
  Mon Jan  4 11:52:16 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #1
  Mon Jan  4 11:52:16 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #2
  Mon Jan  4 11:52:16 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #3
  Mon Jan  4 11:52:16 2010 : Info: rlm_sql_mysql: Starting connect to MySQL server for #4
  Mon Jan  4 11:52:16 2010 : Info: Loaded virtual server inner-tunnel
  Mon Jan  4 11:52:16 2010 : Info: Loaded virtual server copy-acct-to-home-server
  Mon Jan  4 11:52:16 2010 : Info: Loaded virtual server <default>
  Mon Jan  4 11:52:16 2010 : Info: Ready to process requests.
  Mon Jan  4 11:52:43 2010 : Proxy: Marking home server 192.168.1.225 port 1813 as zombie (it looks like it is dead).
  Mon Jan  4 11:52:44 2010 : Info: [sql] stop packet with zero session length. [user 'test_user_please_reject_me', nas '192.168.1.226']
  Mon Jan  4 11:52:47 2010 : Error: No response to status check 5969 for home server 192.168.1.225 port 1813
  Mon Jan  4 11:53:13 2010 : Info: [sql] stop packet with zero session length. [user 'test_user_please_reject_me', nas '192.168.1.226']
  Mon Jan  4 11:53:16 2010 : Error: No response to status check 6007 for home server 192.168.1.225 port 1813
  Mon Jan  4 11:53:43 2010 : Proxy: Marking home server 192.168.1.225 port 1813 as dead.
  Mon Jan  4 11:53:46 2010 : Info: [sql] stop packet with zero session length. [user 'test_user_please_reject_me', nas '192.168.1.226']

      on radius-a, for status_check=request I added to acct_user,

      test_user_please_reject_me      Auth-Type := Reject
          Reply-Message = "Status check only",
          Fall-Through = No


I suspect I SHOULD be using status_check=status-server.
Which then leads to why my server keeps getting marked as zombie/dead/alive....

It seems like the accounting stop packet being sent is not generating a reply...?

Below is the only packet being sent... again and again....

  Mon Jan  4 11:53:45 2010
          Acct-Status-Type = Stop
          User-Name = "not_real_id at somerealm"
          Event-Timestamp = "Dec 31 2009 15:41:36 DST"
          Acct-Delay-Time = 331887
          NAS-Identifier = "ERX-2"
          Acct-Session-Id = "0378168264"
          NAS-IP-Address = 192.168.1.101
          Service-Type = Framed-User
          Framed-Protocol = PPP
          Framed-Compression = None
          ERX-Pppoe-Description = "pppoe 00:1d:68:ec:ee:64"
          Framed-IP-Address = 66.247.201.49
          Framed-IP-Netmask = 255.255.255.255
          ERX-Ingress-Policy-Name = "SOMEREALM_UP"
          ERX-Egress-Policy-Name = "SOMEREALM_DOWN"
          Calling-Station-Id = "ERX-08000000283"
          Acct-Input-Gigawords = 0
          Acct-Input-Octets = 96
          Acct-Output-Gigawords = 0
          Acct-Output-Octets = 60
          ERX-Input-Gigapkts = 0
          Acct-Input-Packets = 0
          ERX-Output-Gigapkts = 0
          Acct-Output-Packets = 0
          NAS-Port-Type = Ethernet
          NAS-Port = 2147483931
          NAS-Port-Id = "GigabitEthernet 8/0.283:283"
          Acct-Authentic = RADIUS
          Acct-Session-Time = 0
          Acct-Terminate-Cause = User-Request
          Proxy-State = 0x3534313734
          Acct-Unique-Session-Id = "038113f04620fe75"
          Timestamp = 1262620425
          Request-Authenticator = Verified

I'm not certain what additional info would be helpful at this point.

Thanks,
-craig


--------------------------------------------------------------------------------
Craig Campbell 
craig.campbell at ccraft.ca 
CampbellCraft Consulting Inc
2 Kenny Court 
Whitby, Ontario 
Canada 
L1R 2L8 
905 922-2789 

 



__________ Information from ESET Smart Security, version of virus signature database 4743 (20100104) __________

The message was checked by ESET Smart Security.

http://www.eset.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freeradius.org/pipermail/freeradius-users/attachments/20100104/2f9c4c26/attachment.html>


More information about the Freeradius-Users mailing list