2.1.8 proxy zombie/dead/alive loops
Alan DeKok
aland at deployingradius.com
Mon Jan 4 17:46:46 CET 2010
Craig Campbell wrote:
> There are 2 radius servers (radius-a and radius-b).
> Each server will relay packets it receives to the other server.
> (Currently only accounting packets are being received)
> The packets are collected in detail-relay file.
> The packets are then relayed via the
> sites/enabled/copy-acct-to-home-server config.
OK...
> What I observe is a single packet being read from the detail-relay.work
> file on radius-b and being sent to radius-a.
> I do not see any response from radius-a being returned to radius-b.
> After what seems to be about 30 seconds the packet is resent from
> radius-b to radius-a. Again and again...
If the home server doesn't respond, the packet will be retried forever.
> On radius-b the following messages are logged (status_check =
> status-server)....
..
> Mon Jan 4 10:10:42 2010 : Proxy: Marking home server 192.168.1.225
> port 1813 as zombie (it looks like it is dead).
> Mon Jan 4 10:10:42 2010 : Proxy: Received response to status check
> 5938 (1 in current sequence)
> Mon Jan 4 10:11:11 2010 : Proxy: Received response to status check
> 6013 (2 in current sequence)
> Mon Jan 4 10:11:40 2010 : Proxy: Received response to status check
> 6048 (3 in current sequence)
> Mon Jan 4 10:11:40 2010 : Proxy: Marking home server 192.168.1.225
> port 1813 alive
> Mon Jan 4 10:11:43 2010 : Proxy: Marking home server 192.168.1.225
> port 1813 as zombie (it looks like it is dead).
Uh... the "response_window" is 3 seconds? Why?
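
For reference, the timers discussed here are set per home server in
proxy.conf. The fragment below is only a sketch. The address and port come
from the log above, the section name reuses the poster's "radius-a", the
secret is a placeholder, and the timer values are what the stock 2.1.x
proxy.conf ships with as far as I recall, so check them against your own
copy. In the log, the gap between "alive" (10:11:40) and "zombie" (10:11:43)
suggests response_window is set far lower than this.

```
home_server radius-a {
	type = acct
	ipaddr = 192.168.1.225
	port = 1813
	secret = changeme            # placeholder, not from the original post

	# How long to wait for a reply before the home server is
	# considered unresponsive and marked zombie.
	response_window = 20

	# How long a zombie server may stay silent before it is marked dead.
	zombie_period = 40

	# Probe with Status-Server packets. The server is marked alive
	# again after num_answers_to_alive replies, sent check_interval
	# seconds apart.
	status_check = status-server
	check_interval = 30
	num_answers_to_alive = 3
}
```

With these values a single slow reply does not flip the server to zombie
three seconds after it was marked alive.
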
> Mon Jan 4 10:11:43 2010 : Proxy: Received response to status check
> 6051 (4 in current sequence)
> Mon Jan 4 10:11:43 2010 : Proxy: Marking home server 192.168.1.225
> port 1813 alive
Hmm... that last bit shouldn't happen. The timing is such that it
*can* send more than 3 Status-Server packets. But it should use those
to make only *one* transition from "dead" to "alive".
> Mon Jan 4 10:12:13 2010 : Proxy: Marking home server 192.168.1.225
> port 1813 as zombie (it looks like it is dead).
> Mon Jan 4 10:12:13 2010 : Proxy: Received response to status check
> 6086 (5 in current sequence)
OK... the home server isn't getting the sequence numbers re-set.
That's an issue, but probably a minor one.
> Mon Jan 4 11:53:46 2010 : Info: [sql] stop packet with zero session
> length. [user 'test_user_please_reject_me', nas '192.168.1.226']
Why are you trying to log a test packet to SQL? Just make sure that
the 'accounting' section returns 'ok' for the test packet.
> I suspect I SHOULD be using status_check=status-server.
> Which then leads to why my server keeps getting marked as
> zombie/dead/alive....
Because the home server is down, and isn't responding.
> It seems like the accounting stop packet being sent is not generating a
> reply...?
Yes. Go fix that. In 2.1.8, see raddb/sites-available/default. Go
to the "accounting" section, and read the comments after the "sql" module.
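
The comments referred to there describe roughly the following change. This
is a sketch from memory of the default file, not a verbatim copy, and the
rest of the "accounting" section is left out. For a Stop packet with zero
session length the sql module returns "noop", so no Accounting-Response is
sent and the sender keeps retransmitting. Turning "noop" into "ok" makes the
server reply.

```
accounting {
	# ... other modules from your existing section ...

	sql

	# sql returns "noop" for Stop packets with zero session length.
	# Map that to "ok" so an Accounting-Response is still sent.
	if (noop) {
		ok
	}
}
```

Once radius-a answers the relayed packet, radius-b stops retrying it and the
home server should no longer bounce between zombie and alive.
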
Alan DeKok.