Re: EAP-TLS Failed in handler question

PENZ Robert ROBERT.PENZ at TIROL.GV.AT
Wed Nov 21 13:00:33 CET 2012


Hi!

First, thanks for your response.

> My first question is, how can I decode a EAP-Message from the debug
> Wireshark, or read the EAP RFC and decode it manually (see below)

OK, I believe I got lucky and captured a tcpdump trace on a client yesterday. I need to check it, and if it shows the same problem I'll provide more info.

> > log to check if the request is itself ok. Here is first packet from
> No, this is *not* the first packet, because it has a "State" attribute, 
> which is only present in 2nd and subsequent packets of the EAP exchange.

By "first packet" I meant the first packet the RADIUS server had seen in some time. The switch forces a reauthentication every 2 hours.

> The reason you're getting the error message is that the "State" 
> attribute is unknown, so FR can't proceed with the EAP session and has 
> no choice but to drop it.
> Check you haven't reduced the "timer_expire" value in eap.conf to a 
> too-low value.

                #  A list is maintained to correlate EAP-Response
                #  packets with EAP-Request packets.  After a
                #  configurable length of time, entries in the list
                #  expire, and are deleted.
                #
                timer_expire     = 120

The default was 60. I doubled it some weeks ago, after I saw "No EAP session matching the State variable" entries in the log.

> How many FR servers do you have serving this NAS? Is it possible the NAS 
> is sending packets in a round-robin fashion (which is bad) which is why 
> you're seeing a packet for which you don't have State?

In this case there is only one. We're running in pre-production with the IT department's clients (about 100) to make sure it is stable before rollout. In production there will be more than one, though. Good point, we need to check that too before going into production.
 
> I guess it's possible something is mangling the State attribute from the 
> previous packet (which is *actually* the first packet).
> Otherwise, the client or NAS is doing something odd.

> It *could* be that the client just got stuck and is responding (very) 
> late. But I'm quite surprised the NAS didn't timeout the EAP auth before 
> that.

We're running Extreme Networks switches with the following timers set:

configure netlogin dot1x timers quiet-period 30
configure netlogin dot1x timers reauth-period 7200

The following other timers are left at their default values:

  server-timeout     Configure RADIUS server timeout for 802.1X
  supp-resp-timeout  Configure supplicant response timeout

> > rad_recv: Access-Request packet from host 10.xxx.xxx.4 port 44519,
> > id=151, length=244 User-Name = "host/xxxxxxxxxxxxx.tirol.local"
> > EAP-Message = 0x02ff00690d800000005f160301005a01
> >
> 
> Ok so this says:
> 
> 02 - eap response
> ff - eap ID 255 - bit odd..
> 0069 - length in hex
> 0d - eap type 13 (EAP-TLS)
> 80 - eap TLS flags = length included
> 0000005f - tls length
> 160301 - TLS packet 0x16==22==handshake record, version 3,1 (TLS 1.0)
> 005a - record length
> 01 - handshake=client hello

cool !!
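
For my own notes, the manual decoding above can be sketched in a few lines of Python. This is only a rough helper of mine, not anything from FreeRADIUS: the function name decode_eap_tls_prefix and the dictionary keys are made up for illustration, and it only looks at the first bytes of the message (EAP header, EAP-TLS flags, start of the first TLS record).

```python
import struct

# EAP codes from the EAP header (1 byte).
EAP_CODES = {1: "Request", 2: "Response", 3: "Success", 4: "Failure"}


def decode_eap_tls_prefix(hex_value):
    """Decode the leading bytes of an EAP-Message attribute (hypothetical helper)."""
    if hex_value.startswith("0x"):
        hex_value = hex_value[2:]
    data = bytes.fromhex(hex_value)

    # EAP header: code (1), identifier (1), length (2), then the type (1).
    code, ident, length, eap_type = struct.unpack_from("!BBHB", data, 0)
    fields = {
        "code": EAP_CODES.get(code, code),
        "id": ident,
        "length": length,
        "type": eap_type,
    }
    if eap_type != 13:  # 13 is EAP-TLS; nothing more to decode otherwise
        return fields

    # EAP-TLS flags byte: L (length included), M (more fragments), S (start).
    flags = data[5]
    fields["length_included"] = bool(flags & 0x80)
    fields["more_fragments"] = bool(flags & 0x40)
    fields["start"] = bool(flags & 0x20)

    offset = 6
    if flags & 0x80:
        # Four-byte total TLS message length, present only when L is set.
        (fields["tls_message_length"],) = struct.unpack_from("!I", data, offset)
        offset += 4

    # TLS record header (5 bytes) plus the first byte of the handshake message.
    if len(data) >= offset + 6:
        ctype, major, minor, rec_len, hs_type = struct.unpack_from(
            "!BBBHB", data, offset
        )
        fields["tls_content_type"] = ctype       # 22 = handshake
        fields["tls_version"] = (major, minor)   # (3, 1) = TLS 1.0
        fields["tls_record_length"] = rec_len
        fields["handshake_type"] = hs_type       # 1 = ClientHello
    return fields


if __name__ == "__main__":
    for key, value in decode_eap_tls_prefix(
        "0x02ff00690d800000005f160301005a01"
    ).items():
        print(key, "=", value)
```

Run against the EAP-Message from my log, it reports the same values as the breakdown quoted above.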

> 
> etc. etc.
> 
> So, it's the start of an EAP-TLS exchange, but as above, it's *not* the 
> first packet. If you start a tcpdump on the server, you'll see how this 
> works:
> 
> C: Access-Request, no state, EAP-Identity=abc
> S: Access-Challenge, state=xxxx, EAP-TLS blah
> C: Access-Request, state=xxxx, EAP-TLS blah

ok
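
To check my understanding of the exchange above, here is a toy model in Python of how I picture the server correlating State values with pending EAP sessions. It is an assumption-laden sketch, not the real rlm_eap code: the class EapSessionTable and its methods are invented for illustration, and only the timer_expire name comes from eap.conf.

```python
import os


class EapSessionTable:
    """Toy model of a State -> EAP session list (not FreeRADIUS code)."""

    def __init__(self, timer_expire=120):
        self.timer_expire = timer_expire  # seconds, as in eap.conf
        self._sessions = {}               # State value -> creation time

    def challenge(self, now):
        """Server sends an Access-Challenge and remembers the new State."""
        state = os.urandom(16)
        self._sessions[state] = now
        return state

    def lookup(self, state, now):
        """Return True if an Access-Request's State matches a live session."""
        created = self._sessions.get(state)
        if created is None or now - created > self.timer_expire:
            # Unknown or expired State: this is the situation in which
            # the server would log that no EAP session matches.
            self._sessions.pop(state, None)
            return False
        return True


if __name__ == "__main__":
    table = EapSessionTable(timer_expire=120)
    state = table.challenge(now=0)
    print(table.lookup(state, now=10))    # NAS reflects State in time
    print(table.lookup(state, now=200))   # reply arrives after expiry

    restarted = EapSessionTable(timer_expire=120)
    print(restarted.lookup(state, now=10))  # table is empty after a restart
```

In this model a request is dropped in three cases: the State was never issued by this server, the entry has expired, or the table was lost (for example by a restart).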

> i.e. the NAS has to reflect the "State" back to FreeRADIUS on each 
> packet. Something is interfering with that, or erasing the "State" at 
> your end (a timer or restart).
> 
> > rlm_eap: No EAP session matching the State variable
> See?

But I didn't see a reason for it ;-)

> > Invalid means I return a reject ... should I return something else?
> No.

But a reject means the switch moves the port to the guest VLAN, and therefore the PC loses its connections. Is there a way to request a new, full EAP-TLS handshake from the client instead?

> > Is this a client problem or a misconfiguration on my part?
> It's probably a client or NAS problem, unless you've set timer_expire 
> too low.

> However: I guess this could also happen right after the server is 
> restarted. Could that be it - is a cron job restarting it maybe?

No, the server has been running for more than 10 days.

But if I restarted the server, every client would be rejected into the guest VLAN on its next reauthentication. That can't be the intended design.

Robert

