Issue with EAP authentication on packet loss

Alan DeKok aland at
Wed Apr 25 14:08:46 CEST 2018

On Apr 25, 2018, at 4:45 AM, Stefan Winter <stefan.winter at> wrote:
> To be fair, this is not limited to packet loss.
> We've seen this in normal operations - the story goes like:
> - server sends Access-Accept with an attribute X via a chain of proxies
> - some proxy takes offence by the presence of attribute X, discards

  Such proxies are broken and should not be used.  <mumble IAS>

> - client times out and re-sends

  That's the issue... resends *what*?

  Not the same packet, because the original request has timed out.  So the client sends a new packet, with a new ID.  And the proxies send another new packet when they proxy it.  And the home server receives a brand new RADIUS packet.

  That RADIUS packet contains a State attribute from (say) 10 seconds ago.  But the original RADIUS request and reply are long gone.

> - server has forgotten all about the session state, rejects

  As it should. (mostly)

> I believe the underlying issue is that FreeRADIUS thinks "fire and
> forget" when the final packet is out.

  The server caches replies for 5s.  If it receives an *identical* request, it resends the identical response.

  In a sequence of packets, the server caches requests and replies for *all* of them, for 5s.  But that has limitations.  Let's say we have:

packet N		ID X		State S1
packet N+1	ID Y		State S2

* if the client (or proxy) sends a NEW packet which re-uses ID X, then the server tosses the cached request / reply, and processes the new packet
* the same goes for ID Y, of course
* States S1 and S2 are tracked by the EAP module (for the sake of discussion) so that it can track ongoing EAP sessions

  The problem comes when ID X is re-used, and the client "retransmits" packet N, with ID Z, and State S1.  This packet isn't detected as a retransmission, because it has ID Z (not ID X).  So it gets passed to the EAP module.
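
  To make the failure concrete, here is a toy Python model of the behaviour described above.  It is only a sketch, not FreeRADIUS code: the names ReplyCache, make_eap_module and demo are invented for illustration, "identical" is approximated by comparing attributes, and the EAP module is reduced to a set of live State values.

```python
import time

CACHE_LIFETIME = 5.0  # seconds; the 5s figure comes from the text above


class ReplyCache:
    """Toy duplicate detection keyed on (client, packet ID). Hypothetical."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}  # (client, packet_id) -> (request attrs, reply, expiry)

    def handle(self, client, packet_id, attrs, process):
        key = (client, packet_id)
        now = self.clock()
        hit = self.entries.get(key)
        # Same ID, same contents, still fresh: resend the cached reply.
        if hit is not None and hit[2] > now and hit[0] == attrs:
            return hit[1], "retransmit"
        # Anything else is treated as a new packet and replaces the old entry.
        reply = process(attrs)
        self.entries[key] = (dict(attrs), reply, now + CACHE_LIFETIME)
        return reply, "processed"


def make_eap_module():
    """Stand-in for the EAP module: it only remembers the current State."""
    live = {"S1"}

    def process(attrs):
        state = attrs["State"]
        if state not in live:
            return "Access-Reject"  # the session has already moved on
        live.discard(state)
        next_state = "S%d" % (int(state[1:]) + 1)
        live.add(next_state)
        return "Access-Challenge State=" + next_state

    return process


def demo():
    cache = ReplyCache(clock=lambda: 0.0)  # frozen clock keeps the demo deterministic
    eap = make_eap_module()
    first = cache.handle("nas", 10, {"State": "S1"}, eap)    # packet N, ID X
    same_id = cache.handle("nas", 10, {"State": "S1"}, eap)  # true retransmit
    new_id = cache.handle("nas", 30, {"State": "S1"}, eap)   # "retransmit" with ID Z
    return first, same_id, new_id
```

  In this model the second call is answered from the cache, while the third one (same State, different ID) misses the cache, reaches the EAP stand-in, and is rejected because S1 has already been discarded.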

  For fairly good reasons, the EAP module doesn't keep State attributes around for minutes.  Once it moves to the next state, it discards the previous one.  This makes sense, because the EAP module has no idea what is in the final reply.  For Access-Challenge it's mostly EAP stuff.  But the same issue can be seen with Access-Accept, so we have to deal with that, too.

  There is a solution, I think.  It involves the cache module.  The idea is this:

* when sending an Access-Challenge, the server should:
  * cache all attributes in the reply
  * using a key of State in the request

* when receiving an Access-Request, the server should:
  * look up the State in the cache
  * if found, send back the cached reply attributes
  * using whatever packet type (accept / challenge) the original reply had
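
  The two lists above can be modelled in a few lines of Python.  This is a sketch of the idea only, not the cache module's API: StateReplyCache, handle and run_eap are hypothetical names, and a request without a State (the first packet of a session) is assumed to bypass the cache.

```python
class StateReplyCache:
    """Toy model: cache each reply under the State of the request it answers."""

    def __init__(self):
        self.by_state = {}  # request State -> (packet type, reply attributes)

    def handle(self, request_state, run_eap):
        # On receive: if this State was answered before, replay that answer.
        if request_state is not None:
            cached = self.by_state.get(request_state)
            if cached is not None:
                packet_type, attrs = cached
                return packet_type, dict(attrs)
        # Otherwise process normally; run_eap is a stand-in for the real work.
        packet_type, attrs = run_eap(request_state)
        # On send: remember the whole reply, keyed by the request's State.
        if request_state is not None:
            self.by_state[request_state] = (packet_type, dict(attrs))
        return packet_type, attrs


def demo():
    calls = []

    def run_eap(state):
        calls.append(state)
        return "Access-Challenge", {"State": "S2", "EAP-Message": "<challenge>"}

    cache = StateReplyCache()
    first = cache.handle("S1", run_eap)
    again = cache.handle("S1", run_eap)  # new packet ID, same State
    return first, again, calls
```

  In the demo the second request with State S1 gets the same reply back without the EAP stand-in running again.  Nothing here ever removes an entry.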

  That should mostly work.  Except the cache will get huge, because it isn't cleaned up.  The solution is to clean up the cache entry for State S1 when you get the *next* packet in the sequence, with State S2.


* when sending an Access-Challenge, the server should ALSO
  * add a SECOND cache entry containing only the State from the request - S1
  * and an internal attribute saying "this isn't a real reply"
  * using a key of the State in the *reply* - S2

* when receiving an Access-Request, the server should FIRST
  * look up the State in the cache (this should be S2)
  * if the cache entry is marked as "this isn't a real reply", THEN
  * the cache entry contains State S1, which is a pointer to the previously cached attributes
  * delete the cache entry for the State S1, as we know the end user has received the previous reply

  This lets the server "clean up" old cache entries when the *end user's system* moves to the next packet.
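
  The full scheme, including the marker entries, can be modelled as below.  Again this is a hedged Python sketch rather than unlang or the real cache module: ChainedReplyCache, the "marker" flag (standing in for the internal "this isn't a real reply" attribute) and run_eap are all invented for illustration.

```python
class ChainedReplyCache:
    """Toy model: real replies keyed by request State, plus marker entries
    keyed by reply State that point back at the previous State."""

    def __init__(self):
        self.entries = {}  # State -> entry dict

    def handle(self, request_state, run_eap):
        entry = self.entries.get(request_state)
        if entry is not None and not entry["marker"]:
            # Real cached reply: the client is repeating an old request.
            return entry["packet_type"], dict(entry["attrs"])
        if entry is not None:
            # Marker: the client has seen the previous reply, so drop it.
            self.entries.pop(entry["previous_state"], None)
        packet_type, attrs = run_eap(request_state)
        if request_state is not None:
            # Real entry for this request (overwrites the marker, if any).
            self.entries[request_state] = {
                "marker": False,
                "packet_type": packet_type,
                "attrs": dict(attrs),
            }
            reply_state = attrs.get("State")
            if reply_state is not None:
                # Second entry: "not a real reply", pointing back at S1.
                self.entries[reply_state] = {
                    "marker": True,
                    "previous_state": request_state,
                }
        return packet_type, attrs


def demo():
    calls = []

    def run_eap(state):
        calls.append(state)
        next_state = "S%d" % (int(state[1:]) + 1)
        return "Access-Challenge", {"State": next_state}

    cache = ChainedReplyCache()
    cache.handle("S1", run_eap)
    after_first = sorted(cache.entries)
    cache.handle("S1", run_eap)  # repeated request, served from the cache
    cache.handle("S2", run_eap)  # client moved on: entry for S1 is deleted
    after_second = sorted(cache.entries)
    return after_first, after_second, calls
```

  In this model the entry for S1 disappears as soon as a request carrying S2 arrives.  One thing the sketch makes visible is that the entry for the last reply in a session (one with no State in it) is never followed by a next packet, so it would presumably still need an ordinary timeout.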

  I think that could be done in unlang with a bit of cache, in v3.

  We haven't done it until now because the cache module is fairly new.  And, 99.99% of the time everything works.

  But... even if we did this, it wouldn't solve the problem of intermediate proxies dropping replies.  If they dropped a reply because of a "bad attribute", the next reply will include that attribute, and the proxy will drop it again.

  This change would only help with transient networking issues.  Which I suspect is pretty much every day for Eduroam. :(

  Alan DeKok.
