Antw: Re: How many NAS kann radius take?

Phil Mayers p.mayers at imperial.ac.uk
Thu Feb 13 19:01:13 CET 2014


On 13/02/14 17:03, Anja Ruckdaeschel wrote:

> Tests with exactly one ldap server showed no differences. Ldap response times
> went e.g. from 0.1 s to 0.2 s when the problem appears, but then we do a lot
> more ldap queries, too.
> So I think that's quite normal?

0.2 sec for a query is quite slow. Our median time for an AD MSCHAP auth 
is around 35 milliseconds, and our SQL lookups are closer to 5 milliseconds.

If you do the maths, a thread pool of 100 threads can only serve 500 
auths/sec if each auth blocks for 0.2 sec (100 / 0.2 = 500). Scale the 
numbers appropriately for the number of lookups, divided by 11/3 (the usual 
ratio of PEAP outer to PEAP inner packets).
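
To make the arithmetic concrete, here is a small Python sketch of the same upper bound. The function name is made up for illustration, and it assumes every auth holds one thread for the whole blocking time and nothing else limits the server, so treat the results as a rough ceiling and not a prediction.

```python
def max_auths_per_sec(threads: int, blocking_secs: float) -> float:
    """Upper bound on auths/sec when each auth blocks one thread for blocking_secs."""
    return threads / blocking_secs


# 100 threads at the latencies mentioned in this thread:
# 0.2 s (slow LDAP), 35 ms (AD MSCHAP), 5 ms (SQL lookup).
for latency in (0.2, 0.035, 0.005):
    rate = max_auths_per_sec(100, latency)
    print(f"{latency * 1000:.0f} ms blocking -> about {rate:.0f} auths/sec")
```

The point is only that the ceiling moves in inverse proportion to backend latency: halving the lookup time doubles what the same thread pool can serve.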

>
> Here are our authentication methods:
> PAP with or without ldap for switches, routers and dial-in servers (cisco, hp,
> max ascend) and for Juniper VPN controllers (only those devices do accounting)
> CHAP PAP with or without ldap for switches and routers (cisco, hp)
> For Wi-Fi: PEAP/MSCHAPv2 and EAP/TTLS-PAP with 350 lancom aps and 1 Colubris
> Controller which does 2 different SSIDs with ldap for our users at home and
> those proxied in via the eduroam-project,
> for which we are service provider and identity provider

Consider splitting these out into separate processes, listening on 
separate IP/ports. This will help with brief thread pool exhaustion 
spikes, packet ID space exhaustion from common NAS source IPs and so on.

>
> 90% of our radius requests are coming in over Wi-Fi.
> ldap, sql and so on are only called in inner-tunnel (3 Packets)

You can, and should, reduce that even further. In later server versions, 
you can do:

authorize {
   eap {
     # stop processing "authorize" here when eap returns ok
     ok = return
   }
   ...
}

...in the inner tunnel as well, which will skip LDAP/SQL lookups for 
EAP-Identity and EAP-MSCHAP success/fail packets. You only need the 
LDAP/SQL for EAP-MSCHAP auth.

In earlier versions of the server you can do crazy hacks to match the 
EAP-Message against a regexp and ignore identity / mschap success/fail 
packets - search the archives.
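
As an illustration of which inner-tunnel packets actually need a backend lookup, here is a Python sketch that classifies a raw EAP-Message. This is not the regexp from the archives and not server config; the function name is hypothetical, and it assumes the inner method is EAP-MSCHAPv2 (EAP type 26) with the usual header layout of Code, Identifier, Length, Type, then OpCode.

```python
EAP_CODE_RESPONSE = 2       # EAP Code: Response (RFC 3748)
EAP_TYPE_MSCHAPV2 = 26      # EAP Type: EAP-MSCHAPv2
MSCHAPV2_OP_RESPONSE = 2    # MSCHAPv2 OpCode: Response (carries the credentials)


def needs_backend_lookup(eap_message: bytes) -> bool:
    """Return True only for an EAP-MSCHAPv2 Response packet.

    Layout assumed: Code(1) Identifier(1) Length(2) Type(1) OpCode(1) ...
    EAP-Identity and MSCHAPv2 success/failure acks return False.
    """
    if len(eap_message) < 6:
        return False  # too short to carry an MSCHAPv2 OpCode
    code = eap_message[0]
    eap_type = eap_message[4]
    opcode = eap_message[5]
    return (
        code == EAP_CODE_RESPONSE
        and eap_type == EAP_TYPE_MSCHAPV2
        and opcode == MSCHAPV2_OP_RESPONSE
    )
```

Of the three inner packets, only the one carrying the MSCHAPv2 Response would pass this check; the identity and the success/fail packets would be skipped.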

Also investigate rlm_cache. Cache every possible lookup, even if it's 
just for a few seconds.
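
To show why even a few seconds of caching helps, here is a generic short-TTL cache in Python. It is not rlm_cache and says nothing about how that module is configured; the class and method names are invented for this sketch, and `loader` stands in for whatever LDAP/SQL query you would otherwise run.

```python
import time


class TTLCache:
    """Tiny time-limited cache: repeated lookups within ttl_secs skip the backend."""

    def __init__(self, ttl_secs: float, clock=time.monotonic):
        self.ttl_secs = ttl_secs
        self.clock = clock          # injectable so the cache can be tested
        self.entries = {}           # key -> (expiry_time, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]           # still fresh: no backend query
        value = loader(key)         # expired or missing: query the backend once
        self.entries[key] = (now + self.ttl_secs, value)
        return value
```

A device that re-authenticates several times in quick succession would then cost one backend query per TTL window instead of one per attempt.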

> I think we have quite a default config, but with a few modules commented out
> (e.g. unix, radutmp, ... we don't need) and a few policies in place to reduce
> unnecessary stuff like querying ldap or sql
> in default, when it's peap.
>
> Short eap timeouts on client devices and sleep mode increased our radius
> requests for the last two years. There are single users doing up to ~ 1500
> login requests per day from one device.
> But there is nothing we can do about that :-(

Maybe, but you need to be aware of it. Keep an eye on fast roaming, 
discuss the issues with your wireless vendor, etc.

>
> "Some NASes e.g. Cisco lightweight wireless use a single UDP source  port, so
> at most 256 requests can be in-flight at any given time" could be our problem
> too, because our lancom access points do the same.
>
> Does this mean that when one NAS is sending more than 256 Access-Requests from
> one port freeradius cannot process one more at that time from this NAS?

It's not a freeradius thing - *no* RADIUS server could. It's a protocol 
limitation.

The RADIUS Identifier field is a single octet, so a given source/dest 
ip/port 4-tuple can only carry 256 requests in-flight; no software could 
handle more. The NAS would have to open more ports.

It would be rare - but not impossible, under heavy load - for 256 
requests to be outstanding. When that happens, the NAS will either start 
dropping auth requests, or re-using IDs. In the latter case, that makes 
things worse, and the backlog can grow until the backend database (SQL, 
LDAP, AD, whatever) unblocks the thread pool.
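
For a rough feel of when the 256 limit bites, here is a Python sketch relating request rate, response time and in-flight requests. The function names are made up, and it assumes a steady request rate and a fixed response time per request (server queuing included), which real traffic will not give you.

```python
RADIUS_ID_SPACE = 256  # Identifier is one octet, so 256 values per 4-tuple


def in_flight(requests_per_sec: float, response_secs: float) -> float:
    """Average outstanding requests at a steady rate (rate x response time)."""
    return requests_per_sec * response_secs


def max_requests_per_sec(response_secs: float, source_ports: int = 1) -> float:
    """Highest steady rate before a NAS runs out of IDs on its source ports."""
    return RADIUS_ID_SPACE * source_ports / response_secs


# One source port, responses taking 0.2 s versus 35 ms.
for latency in (0.2, 0.035):
    print(f"{latency} s -> about {max_requests_per_sec(latency):.0f} req/sec")
```

Under those assumptions a single-port NAS is a long way from the limit in normal operation, which fits the point above: it only becomes a problem when the backend stalls and response times balloon.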

TBH, I think 0.2 sec is slow for an LDAP lookup; can you run a local 
replica?

