"No EAP session matching state" - logging
aland at deployingradius.com
Tue Oct 13 21:08:18 CEST 2020
On Oct 13, 2020, at 12:16 PM, Robert Franklin <rcf34 at cam.ac.uk> wrote:
> Please excuse I haven't got a "radiusd -X" debugging output for this -- trying to catch one is quite difficult, given the intermittent nature of the problem, the difficulty matching a problematic supplicant and the volume of traffic I'm sifting through, as well as the issue being on a NAS that's behind a downstream proxy (but I'm responsible for the home server[*]) and with a supplicant I don't have access to, so I thought I'd start with general principles first, just to check my understanding is correct...
Sometimes it's fine to not have debugging issues.
> I am investigating a situation where we're getting an Access-Reject reply being sent, with a Module-Failure-Message of "No EAP session matching state ...". The previous connection successful attempt was several minutes before and the supplicant repeatedly tries for a minute, with errors showing this state number, then I get messages with another state number, etc. Eventually it successfully connects, so I get streams such as:
> 2020-10-13 15:44:41 [various challenges] + Access-Accept
> 2020-10-13 15:47:41 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
> 2020-10-13 15:47:43 Access-Reject: No EAP session matching state 0xc600f921c5e0e014
Something is doing retransmits *after* the previous packets have been replied to, or have expired. This is bad.
It's generally the supplicant, or maybe the NAS. And there's little you can do to "fix" FreeRADIUS when something *else* is broken.
Note also that the state is 16 bytes, and not all of it is being printed. So the "No EAP session" messages may be for multiple packets.
From my magic 8-ball, I can tell that the State attributes are for the first or second packets in an EAP session. So the session isn't really getting established.
> Is there anything I can infer from this, or would I need a full debug log for an affected supplicant? I assume these sessions either never existed (and the supplicant is just plain broken) or they were previous sessions that have been cleaned up (which I believe happens after 5 seconds - these go on for longer).
Pretty much. The server is replying, and the reply never makes it to the supplicant. So the supplicant retransmits, going "hey, what's up?". The NAS sees that, re-uses the old State... and the server goes "no, I replied already".
> Does getting these these new state numbers off the bat (i.e. with no previous EAP messages from the supplicant) imply the supplicant is doing something odd and continuing EAP conversations without me having seen the start of them? I linelog a message every time I receive a packet (so a full authentication stream has several lines) and have nothing for this supplicant aside from the above messages.=
Each packet containing EAP has a State attribute. The only ones missing it are the first one, and the last one.
> Is there any way I can log the 'EAP session state' field in my linelog from previous messages so I can see if they're related to ones I've seen before?
The State attribute is largely random garbage. There isn't much point in logging that.
> One thing that does look odd is that, from this particular downstream RADIUS proxy, we are seeing a lot of reauthentications (albeit mostly successful): I would normally only expect to see supplicants reauthenticating occasionally, but I'm often seeing things reauthenticate every 2-3 minutes, which looks a bit suspiciously like there are local problems at the visited site.
Pretty much. If they have radio layer issues, or AP issues, supplicants will retry a LOT.
Call them and tell them that their equipment is crap. :(
> Apologies if there's nothing you can really do to help me without the full debug log and, if I need it, I'll try and get in touch with an affected user and see if we can get it to fail - let me know and I'll come back when I have it. Because problems often only get reported after the event, I try to get as much as possible logged that I can look back through easily (a full debug log probably being too much), if someone reports something.
Debug logs won't really help here. Something in the network is broken, and it isn't FreeRADIUS.
More information about the Freeradius-Users