recommendations for max_servers
john.douglass at oit.gatech.edu
Tue Sep 23 19:31:42 CEST 2014
Be aware that there are some major design flaws in the Cisco WLC
software (Georgia Tech is working with Cisco to resolve them)
regarding the number of requests a controller can field.
The flaws in the controller software cause an "overrun" of RADIUS IDs
if you have too many authentications per second, which manifests as
"duplicate" and "discard" messages in the logs. No amount of tweaking
on the RADIUS side will fix this. You can, however, improve server
performance to try to improve the client experience.
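As a rough illustration of why the request rate matters: the RADIUS
Identifier field is a single octet (RFC 2865), so at most 256 requests
can be outstanding per source port and server pair. The sketch below
turns that into an upper bound on authentications per second. The
function name, the round-trip figures and the single-source-port
default are assumptions for illustration, not measurements from any
controller, and this is not a description of the actual WLC flaw.

```python
# Rough ceiling on authentications/second imposed by the one-octet
# RADIUS Identifier field: 256 outstanding requests per
# (source port, server) pair. All inputs are assumed values.

RADIUS_ID_SPACE = 256  # one-octet Identifier field (RFC 2865)


def max_auth_rate(round_trip_seconds, source_ports=1):
    """Upper bound on requests/second before IDs must be reused.

    round_trip_seconds: assumed mean time for the server to answer
    source_ports: assumed number of UDP source ports the NAS uses
    """
    if round_trip_seconds <= 0:
        raise ValueError("round_trip_seconds must be positive")
    return RADIUS_ID_SPACE * source_ports / round_trip_seconds


if __name__ == "__main__":
    # Slower back ends (e.g. a slow ntlm_auth) lower the ceiling.
    for rtt in (0.05, 0.5, 2.0):
        rate = max_auth_rate(rtt)
        print(f"{rtt:.2f}s round trip -> at most {rate:.0f} auths/s")
```

The point of the arithmetic is that the slower the server answers, the
sooner a busy controller runs out of IDs, which is why faster back ends
help the client experience even though they cannot fix the controller.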
On the AD side, Phil Mayers had some great suggestions for improving
ntlm_auth performance. Here are his recommendations:
1. Upgraded the radius servers.
Old spec: 3 GB RAM, 2x P4-based Xeon 1 core @ 3.2 GHz, RHEL5
New spec: 16 GB RAM, 1x Xeon E5-2620 6 core @ 2 GHz, RHEL6
2. Upgraded Samba - went from RHEL5 samba3x-3.5.4 to RHEL6 samba-3.6.9
3. Set "winbind max domain connections = 12" in smb.conf (restart
winbind). (At GT we have so many authentications that we set this to
128, because we hit the limit during peak times.)
4. Forced our smb.conf to talk to specific AD controllers which are
physical, not VMware (most of our DCs are VMware)
5. Spent a *lot* of time debugging and tracking the Samba->DC RPC
round-trip times and hassling our AD people to keep these stable; not
sure what they did, if anything.
6. Increased the radiusd.conf setting to "max_requests = 16384"
7. Worked really, really hard on getting the Cisco APs, AP radios and
controllers to STOP CRASHING; their software quality has been abysmal,
and this was a contributing factor - APs or controllers would crash
under load, and this would trigger a burst of auths, which would trigger
As Alan said before, there are lots of moving parts where issues can
occur. Improving performance in each of the pieces (AD, database,
winbind, etc.) is a start.
If you are in a large-scale Cisco deployment, you may find yourself
having issues regardless, depending on how many APs and users you
have. It's a hard problem to advise on, but adding RADIUS servers and
optimizing ours for performance has helped us immensely.
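On the original question of a rule of thumb for max_servers: one
generic starting point (not something FreeRADIUS prescribes) is
Little's law, where the number of busy workers is roughly the arrival
rate multiplied by the time each request takes. The sketch below is a
minimal version of that estimate. The function name, the headroom
factor and the sample numbers are assumptions for illustration, so
measure your own peak rate and ntlm_auth latency before trusting the
result.

```python
import math


def estimate_threads(peak_requests_per_second, mean_service_seconds,
                     headroom=1.5):
    """Estimate worker threads via Little's law.

    busy workers ~= arrival rate * mean time per request, scaled by an
    assumed headroom factor to absorb bursts.
    """
    if peak_requests_per_second < 0 or mean_service_seconds < 0:
        raise ValueError("rate and service time must be non-negative")
    if headroom < 1:
        raise ValueError("headroom must be at least 1")
    busy = peak_requests_per_second * mean_service_seconds
    # Always keep at least one worker, and round up to a whole thread.
    return max(1, math.ceil(busy * headroom))


if __name__ == "__main__":
    # Hypothetical load: 100 auths/s, 0.25 s per ntlm_auth call.
    print(estimate_threads(100, 0.25))
```

If the estimate comes out far below the configured value, the extra
threads are probably adding contention rather than capacity, which
would be consistent with the improvement seen when going from 192 to
64.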
On 09/19/2014 02:58 PM, Louis Munro wrote:
> While troubleshooting a system I came upon a case of 'hung children'
> and duplicate requests.
> I would usually ascribe this to a database issue but in this case the
> database is mostly unused and properly indexed.
> Accounting is not used, so that's one less thing to consider.
> On the other hand the max_servers setting had been set as high as 192
> by someone with good intentions.
> Tuning it down to 64 seemed to significantly reduce the load on the
> system and the number of hung children was reduced by a factor of
> about 100.
> While there remains an issue with some (intermittent) slow ntlm_auth
> to take care of, I wondered how others tune the value of max_servers
> other than by trial and error. Most of the time the default of 32 has
> been enough for me. Higher is not necessarily better in my experience
> since at least in this case it seems to have led to the main thread
> working harder when under load (with most of the work done in the
> "system" space).
> This is a system running 2.2.5 on RHEL 6.4 in VMware. It's got 24 GB
> of RAM and 16 cores so it should still be pretty capable.
> Does anyone have an algorithm, rule of thumb or other ballpark way of
> estimating the "ideal" maximum number of threads?
> Louis Munro
> lmunro at inverse.ca <mailto:lmunro at inverse.ca> :: www.inverse.ca
> +1.514.447.4918 x125 :: +1 (866) 353-6153 x125
> Inverse inc. :: Leaders behind SOGo (www.sogo.nu <http://www.sogo.nu>)
> and PacketFence (www.packetfence.org <http://www.packetfence.org>)