Home servers constantly zombied, and I can't figure out how to fix it

Alan DeKok aland at deployingradius.com
Fri Jul 16 03:27:02 CEST 2010


Adam Bultman wrote:
> I have FreeRADIUS 2.1.3 servers that are proxying accounting information
> to two remote RADIUS servers (radiator, if it matters.)

  It could matter.

http://www.cesnet.cz/doc/techzpravy/2008/eduroam-authentication-over-jammed-network/

40% packet loss (client to server) => 78% success rate for PEAP-MSCHAPv2
40% packet loss (server to client) => 20% success rate for PEAP-MSCHAPv2

  Huh?  40% packet loss means that when the client sends a request, 40%
of the time it doesn't see a response to that request.  It shouldn't
matter if the request is lost, or if the response is lost.  But it does.

  The paper says:

"The more favorable results for client-to-server jamming are caused by
the aggressive packet re-sending strategy of wpa_supplicant compared to
the behavior of Radiator."

  I have no idea what that means.  If the server receives a packet, it
should respond to it.  The graphs in the paper seem to indicate that
client retransmits are being *ignored* by radiator.  i.e. If there's
packet loss client to server, then when a packet is lost, the server
doesn't see it.  The client retransmits, the server sees the packet, and
responds.

  On the other hand, if there's packet loss server to client, the client
sends a packet, the server responds.  The client doesn't see the
response, so it retransmits the packet.  The server then sees the
retransmit, and ignores the packet.

  As an indication that this is happening, see:

http://www.open.com.au/radiator/history.html

  See Revision 4.0 (2008-01-14), reference to RFC 5080.  They
implemented the duplicate detection cache which has been in FreeRADIUS
since day 1 (1999).
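
  For illustration, a duplicate detection cache of the kind RFC 5080
describes can be sketched roughly as below.  This is not the FreeRADIUS
or Radiator code.  The class name DuplicateCache, the handle() method
and the 30 second lifetime are all assumptions made up for the example.
The idea is only that a retransmit (keyed here on source IP, source
port, ID and Request Authenticator) gets the cached reply sent again,
instead of being ignored or processed a second time.

```python
import time


class DuplicateCache:
    """Hypothetical sketch: remember the reply sent for each recent request."""

    def __init__(self, lifetime=30.0, clock=time.monotonic):
        # lifetime is an assumed value, not taken from any real server.
        self.lifetime = lifetime
        self.clock = clock
        self.entries = {}  # key -> (expiry time, cached reply)

    def handle(self, src_ip, src_port, packet_id, authenticator, process):
        """Return (reply, was_duplicate).

        process is a callable that builds the reply for a new request.
        It is only called when the request is not already in the cache.
        """
        now = self.clock()

        # Throw away entries whose lifetime has run out.
        self.entries = {k: v for k, v in self.entries.items() if v[0] > now}

        key = (src_ip, src_port, packet_id, authenticator)
        if key in self.entries:
            # Retransmit: send the same reply again.
            return self.entries[key][1], True

        reply = process()
        self.entries[key] = (now + self.lifetime, reply)
        return reply, False
```

  A server without something like this either ignores the retransmit or
treats it as a brand new request.  Either one hurts when replies are
being lost on the way back to the client.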

  My suggestion is to try replacing one of the home servers with
FreeRADIUS.  It will respond to any retransmit.  So if there are packet
loss problems, they should be less problematic.
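
  A back-of-the-envelope model shows why answering retransmits matters.
It is a sketch under stated assumptions, not a reproduction of the
numbers in the paper: packet loss is independent per packet, only one
direction is lossy, and the client sends each request up to "tries"
times.  The function names and the direction strings are hypothetical.

```python
def exchange_success(loss, tries, direction, answers_retransmits):
    """Probability that one request/response exchange completes.

    loss: probability that a packet is dropped in the lossy direction.
    tries: how many times the client sends the request in total.
    direction: "client_to_server" or "server_to_client" (the lossy side).
    answers_retransmits: True if the server replies to every copy it sees.
    """
    if direction not in ("client_to_server", "server_to_client"):
        raise ValueError("unknown direction: %r" % (direction,))

    if direction == "client_to_server":
        # One copy of the request has to get through.  The reply is
        # never lost in this model, so the server's behaviour is moot.
        return 1 - loss ** tries

    if answers_retransmits:
        # Every retransmit produces a new reply; one has to get through.
        return 1 - loss ** tries

    # The server replies once and ignores the retransmits, so the
    # exchange depends on that single reply surviving.
    return 1 - loss


def session_success(loss, tries, round_trips, direction, answers_retransmits):
    """Probability that every round trip in a multi-packet exchange works."""
    return exchange_success(loss, tries, direction, answers_retransmits) ** round_trips


if __name__ == "__main__":
    print(round(exchange_success(0.4, 3, "server_to_client", True), 3))   # 0.936
    print(round(exchange_success(0.4, 3, "server_to_client", False), 3))  # 0.6
```

  With 40% loss and three tries, a server that answers retransmits
completes a single exchange about 94% of the time in either direction.
One that ignores them drops to 60% when the loss is server to client,
and that gap compounds over every round trip of an EAP conversation.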

> My problem is that the two servers I am sending to are constantly
> declared zombies.  Perhaps related is that in packet traces on the
> RADIUS servers, I see my RADIUS servers sending duplicate packets. I do
> not know if the duplicate packets are because the NAS is sending
> duplicate packets to me (it is indeed sending duplicate packets,
> according to wireshark), or if it is something on the RADIUS server's
> end.

  The NAS is likely sending retransmits because it isn't seeing a response.

> Furthermore, in wireshark, I also see plenty of 'Malformed
> Packets', but I don't know if that's because the packet is *truly*
> Malformed, or if it is because wireshark is having some issues (the
> RADIUS servers are VMWare Virtual Machines, and I've seen previously
> that various things can cause  wireshark to detect malformed packets
> when they actually are fine.)

  I'm not sure.  If FreeRADIUS isn't complaining that the packets are
malformed, then they should be OK.  See the statistics (radmin / snmp)
for counts of malformed packets.

> I have been making a lot of configuration changes (esp. with regard to
> the check interval, number of responses before alive, etc) - so if
> anything is seriously out of whack, let me know - but it seems that no
> matter what, those systems get marked as zombies by my RADIUS servers a
> half a dozen times a minute.

  The servers are marked zombie when the proxy sends a request, and
doesn't see a response in 30s.

  That's a little aggressive.  The current logic doesn't take into
account if *other* packets have received responses.  It should probably
mark the home server "zombie" only if there have been *no* responses in
the "zombie" time interval.
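
  The difference between the two rules can be sketched as follows.
This is a toy model, not the FreeRADIUS source: the HomeServer class and
its method names are invented for the example, and the 30 second window
is simply the figure mentioned above.

```python
class HomeServer:
    """Hypothetical bookkeeping for one home server, as seen by the proxy."""

    def __init__(self, window=30.0):
        self.window = window        # seconds to wait for a response
        self.pending = {}           # request id -> time the request was sent
        self.last_response = None   # time of the most recent response, if any

    def sent(self, request_id, now):
        self.pending[request_id] = now

    def received(self, request_id, now):
        self.pending.pop(request_id, None)
        self.last_response = now

    def zombie_current(self, now):
        """Current rule: any one request unanswered for the whole window."""
        return any(now - t >= self.window for t in self.pending.values())

    def zombie_proposed(self, now):
        """Proposed rule: a request timed out AND nothing at all came back
        from this home server during the window."""
        if not self.zombie_current(now):
            return False
        if self.last_response is None:
            return True
        return now - self.last_response >= self.window
```

  For example, send request 1 at t=0 and request 2 at t=10, and get a
response only to request 2 at t=12.  At t=31 the current rule marks the
server zombie because request 1 has gone unanswered for 30 seconds.  The
proposed rule does not, because the server answered something 19
seconds ago.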

  I'd suggest replacing one of the home servers with FreeRADIUS.  If
that makes a big difference for the proxy, then the Radiator server is
borked.

  Alan DeKok.


