EAPTLS Stress test: 2.1.7

leopold vova_b at yahoo.com
Thu Oct 1 14:17:01 CEST 2009


Hello Alan,
Thanks as always for your reply.
Yes, regarding the load balancer, we know that we need "IP stickiness" and we
do have it.
In this case we disabled IP stickiness just to test how FreeRADIUS reacts
when it receives messages intended for another FR server.
This scenario could happen during failover from Radius1 to Radius2: the
EAPTLS handshake starts against R1, R1 goes down in the middle of the
handshake (low probability, but it can still occur), and failover is done to
R2. We expected R2 to drop the message instead of sending Access-Reject.
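
To illustrate what "IP stickiness" means in the scenario above, here is a
minimal sketch. It assumes the balancer hashes the client source IP; the
function name pick_server and the server names are hypothetical and are not
taken from our load balancer or from FreeRADIUS.

```python
import hashlib
from typing import Sequence


def pick_server(client_ip: str, servers: Sequence[str]) -> str:
    """Map a client source IP to one server in a deterministic way."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["radius1", "radius2"]  # hypothetical names
first = pick_server("192.0.2.10", servers)

# Every packet of one handshake from the same source IP reaches the same server.
assert all(pick_server("192.0.2.10", servers) == first for _ in range(10))

# After a failover only radius2 is left, so a handshake that started on
# radius1 continues on a server that holds no state for it.
assert pick_server("192.0.2.10", ["radius2"]) == "radius2"
```

The last assertion is the case we were testing: stickiness keeps a handshake
on one server only as long as that server stays in the pool.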

Another test case we did was stressing one FreeRADIUS server (no load
balancers in the middle). It coped gracefully with a load of 200 EAPTLS
authentications/sec, but when we increased the load to 300 auth/sec things
went really bad:
1. We could consistently reproduce this error:
Wed Sep 30 17:33:28 2009 : Error: rlm_eap: Failed to store handler
Wed Sep 30 17:33:28 2009 : Error: rlm_eap: Failed to store handler
Wed Sep 30 17:33:28 2009 : Error: rlm_eap: Failed to store handler

2. Around 50% of the clients received Access-Reject


Yes, I understand your point that RADIUS dropping/not responding to invalid
EAPTLS messages causes client retries and even more load on the RADIUS
infrastructure. Unfortunately, due to our own business requirements we can't
send Access-Reject to a user/machine that "tries" to present a valid
certificate during load conditions. We view a failure for a valid client as
an outage.

At some point, when no answer is received from RADIUS, a valid client will
retry and get onto the network. On the other hand, when it receives an
Access-Reject, the client state machine goes into a state where the retry
timeout is too long, and that causes an outage for the client machine.

We think that when a client presents an invalid certificate (signed by an
untrusted CA, expired, or revoked) it should get Access-Reject, which is
good. But when the error is caused by load or by other infrastructure or
network problems, we feel that not responding is a better choice.
Unfortunately, the RADIUS protocol has no reply code besides Access-Reject,
something like an Access-CriticalError, that would indicate that sort of
error condition.
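
The behaviour we are asking for can be written down as a small decision
table. This is only a sketch of the policy, not FreeRADIUS code: the Failure
and Action names are hypothetical, and it assumes the server can tell the two
kinds of failure apart.

```python
from enum import Enum


class Failure(Enum):
    # Problems with the certificate itself.
    UNTRUSTED_CA = "untrusted CA"
    EXPIRED = "expired"
    REVOKED = "revoked"
    # Problems on our side (load, infrastructure, network).
    OVERLOAD = "overload"
    SESSION_NOT_FOUND = "session not found"


class Action(Enum):
    ACCESS_REJECT = "Access-Reject"
    NO_RESPONSE = "no response"


CERTIFICATE_FAILURES = {Failure.UNTRUSTED_CA, Failure.EXPIRED, Failure.REVOKED}


def choose_action(failure: Failure) -> Action:
    """Reject bad certificates; stay silent on infrastructure errors."""
    if failure in CERTIFICATE_FAILURES:
        return Action.ACCESS_REJECT
    return Action.NO_RESPONSE


for failure in Failure:
    print(failure.value, "->", choose_action(failure).value)
```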

If we still want to proceed with the do-not-respond path, do you think it is
doable?
Would something like the config changes below help? We are not sure whether
they break other things; we still need to test.
authorize {
..
        eap {
                ok = return
                updated = return
                handled = return
                invalid = 1
                fail = 2
        }
        if (invalid || fail) {
                do_not_respond
        }

..
}
authenticate {
..
        Auth-Type eap {
                eap {
                        invalid = 1
                        fail = 2
                }
                if (invalid || fail) {
                        do_not_respond
                }
        }

..
}


Alan DeKok-2 wrote:
> 
> leopold wrote:
>> Hi,
>> We tried to stress test (EAPTLS) FreeRADIUS 2.1.7 which sits behind Load
>> Balancer 
> 
>   That doesn't work.  Don't bother trying to "fix" FreeRADIUS.  Instead,
> use a load balancer that is aware of EAP.  e.g. FreeRADIUS.
> 
>> Now we understand that if EAPTLS session started (we have 10
>> Access-Challenge messages because of our certificate chain) against
>> Radius_1 and then continued to Radius_2 because load balancer reverted
>> it there then EAPTLS handshake cannot succeed, but we expected that
>> FreeRADIUS should drop packets and NOT RESPOND instead of sending
>> Access-Reject when it cannot find STATE variable
> 
>   No.  Users that fail authentication get Access-Reject.  The
> alternative is worse.
> 
>> By looking at the code we think eap_tls module returns
>> RLM_MODULE_INVALID or RLM_MODULE_FAIL when it cannot find EAP session
>> in the tree.
> 
>   Yes.  This is what it's supposed to do.
> 
>> What is proper configuration that we can do?
> 
>   Use a load balancer that is aware of EAP.  FreeRADIUS can do this.
> See "keyed-balance" in proxy.conf.  You can load-balance over the client
> IP && the User-Name.  This is often good enough to get most EAP sessions
> working.
> 
>> Is something like this recommended?
> 
>   No.
> 
>   All it does is force the client to retransmit.  At which point,
> there's a 50% chance that it will go to the SAME server, and *again* be
> thrown away.  For an EAP session of 10 packets, the odds are 1/(2^10)
> that *all* EAP packets will go to the right server.  So you'll have 1
> out of 1000 authentications succeed.  The rest will be rejected, after
> processing many, many, duplicates.
> 
>   Don't increase your load by a factor of 1000 for nothing.  Use a
> load-balancer that is aware of EAP, *or* configure your load balancer to
> hash on source IP, and balance based on that.  It means that *most* EAP
> sessions will go to the same RADIUS server.
> 
>   It's not perfect, but it's a lot better than having a 99.9% failure
> rate.
> 
>   Alan DeKok.
> 
> 
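
The arithmetic in the quoted reply can be checked with a short sketch. It
follows the model used there: each of the 10 packets independently has a
1-in-2 chance of reaching the right server. The function name is
hypothetical.

```python
def all_packets_right_server(packets: int, servers: int) -> float:
    """Chance that every packet of one EAP session hits the right server,
    assuming each packet is sent to a uniformly random server."""
    return (1.0 / servers) ** packets


p = all_packets_right_server(packets=10, servers=2)
print(p)                    # 0.0009765625
print(round(1 / p))         # 1024, i.e. roughly 1 success in 1000
print(f"{1 - p:.1%}")       # 99.9% of sessions fail
```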
