Discarding Duplicate Request

John Douglass john.douglass at oit.gatech.edu
Thu Oct 2 18:56:11 CEST 2014


On 10/02/2014 12:35 PM, Rando Nakarmi wrote:
> Hello John,
>
> Thanks
>
> you increased max_request = 16384 (so you have only 64 clients ?)

That was a cut/paste from Phil Huxley who responded to my question. I'm
still figuring out how to optimize. I can say that raising winbind max
domain connections helped A LOT. However, the faster you churn, the more
you might hit the Cisco WLC bug. We've seen _less_ of it, but we've also
added radius servers and moved some controllers to their own radius
server pairs. I hate adding radius servers as I feel it masks the real
problem and it doesn't solve peak (change of classes) issues.
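
To put rough numbers on the churn problem described in my earlier mail
(quoted below): the RADIUS ID is 8 bits, so a client sending everything
from one source port has only 256 IDs before they come around again.
Here is a minimal sketch of that arithmetic. The function names are
mine, and any rates or latencies you feed it are yours to measure; this
is a back-of-the-envelope model, not a description of what the WLC or
FreeRADIUS does internally.

```python
# The RADIUS Identifier field is one octet: 256 values per source port.
RADIUS_ID_SPACE = 256


def seconds_until_id_reuse(requests_per_second: float) -> float:
    """Time before a client cycling IDs from one source port reuses an ID."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return RADIUS_ID_SPACE / requests_per_second


def id_reused_while_pending(requests_per_second: float,
                            backend_latency_s: float) -> bool:
    """True if an ID comes around again before the earlier request
    carrying that ID has been answered by the backend."""
    return backend_latency_s > seconds_until_id_reuse(requests_per_second)
```

For example, at a hypothetical 512 requests/second from one source port
the IDs wrap every half second, so a backend that takes a full second
to answer would still be working on a request when its ID shows up
again.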

> you set winbind max domain connections = 12 (how do I know which value
> is right?) (we have around 300 clients (WAPs))
It's really mainly about handling peak connections. With 300 WAPs you 
probably won't go that high. I have about 500 aps/controller but we have 
30k users online at once spread across maybe 20 controllers with 
multiple controllers on each radius server.

Actually I increased winbind max domain connections to 128. The way I
kind of felt that out was to run this on the Linux/Unix server:

lsof | grep winbind | grep TCP

You can see the number of TCP connections to the AD server. We were 
hitting our initial limit of 50 during peak times. I just increased it to 
a high enough number so that I probably won't reach it. The number of 
connections goes up and down. On a radius failover we might be 
generating a lot of connections but they eventually close and die off.
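
If you want a number to log over time instead of eyeballing the output,
the same filter can be scripted. This is a rough sketch that assumes
lsof prints one open file per line and that winbind socket lines
contain both "winbind" and "TCP", exactly like the grep pipeline above.
The function names are made up for illustration.

```python
import subprocess


def count_winbind_tcp(lsof_output: str) -> int:
    """Count lines mentioning both winbind and TCP,
    mirroring `lsof | grep winbind | grep TCP`."""
    return sum(
        1
        for line in lsof_output.splitlines()
        if "winbind" in line and "TCP" in line
    )


def current_winbind_tcp_count() -> int:
    """Run lsof and count winbind TCP sockets.
    Needs lsof installed and usually root to see winbindd's files."""
    result = subprocess.run(
        ["lsof"], capture_output=True, text=True, check=False
    )
    return count_winbind_tcp(result.stdout)
```

Sampling current_winbind_tcp_count() every minute or so during a class
change and comparing the peak against winbind max domain connections
should show how close you get to the limit.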
>
> so I set max_request = 300*256 (I use 256, the value which is in the
> radiusd.conf file)
>
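
For what it's worth, the 256-per-client arithmetic you are using here
and in your "64 clients" question works out as below. A tiny sketch,
assuming the multiplier of 256 you quote from the config file; the
function names are only for illustration.

```python
REQUESTS_PER_CLIENT = 256  # multiplier quoted from the radius config file


def max_requests_for(clients: int) -> int:
    """Suggested max_request value for a given number of clients."""
    return clients * REQUESTS_PER_CLIENT


def clients_covered_by(max_requests: int) -> int:
    """How many clients a given max_request value allows for."""
    return max_requests // REQUESTS_PER_CLIENT
```

max_requests_for(300) gives 76800, and clients_covered_by(16384) gives
64, which is where the 64 in your question comes from.
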
> winbind max clients = 1200 (has anybody used this parameter? does this
> mean how many winbind clients can connect to AD?)

I'm actually not 100% sure on that stat/setting. :) I don't think I
really care about it enough. Not really sure how to determine this one.
The documentation (on samba.org) doesn't really spell out what it means.
>
> --cheers
> Rando
>
>
>
> On Thu, Oct 2, 2014 at 3:27 PM, John Douglass
> <john.douglass at oit.gatech.edu> wrote:
>
>     :) Rando,
>
>     There has been much discussion on this list about that problem. IF
>     you are using Cisco WLC, there is a flaw in the way radius is
>     processed which could lead to these log messages. Here is the
>     previous set of threads that have some pointers as to what to look at.
>
>     Cisco WLCs use the same source port for their requests, and during
>     peak times the 8-bit ID that is used to track radius conversations
>     gets cycled so fast that it creates duplicates where there really
>     shouldn't be any.
>     We are pushing Cisco hard to fix this flaw in their design
>     especially since they are creating controllers with more and more
>     capacity. The problem is only going to get worse.
>
>     I highly suggest you move to radius 2.2.5 and enable the ntlm_auth
>     timeout and upgrade your samba to 3.6 where you can add some
>     additional parameters. Here are some hints that Phil Huxley shared
>     with us that have been helpful in making our services better. The
>     issues haven't been handled 100%, and there are other things to
>     consider like if using a Cisco WLC, enabling client exclusion,
>     etc, etc but I don't have a ton of info on that as I just run the
>     radius servers.
>
>     http://lists.freeradius.org/pipermail/freeradius-users/2014-September/073929.html
>
>     - John Douglass @ Georgia Tech
>
>     PS: I really need to write up a blog post about this :)
>     PPS: Yes we know AD is slow and it sucks as a backend but for a
>     lot of us, it's what we have to deal with :)
>
>
>
>     On 10/02/2014 11:10 AM, Rando Nakarmi wrote:
>>     I have been seeing quite a large number of messages like the one
>>     below logged in radius.log lately.
>>
>>     Discarding duplicate request from client classroom98 port 32880 -
>>     ID: 131 due to unfinished request 241848
>>
>>     I read in some threads that this might happen when the back-end
>>     servers (i.e. auth servers) are too slow to respond.
>>
>>     My back-end is AD, using ntlm_auth.
>>     radius version 2.1.12-4
>>     samba version 3.5.8-68
>>
>>     Any hints or suggestion how to resolve this would be very helpful.
>>
>>     Most of the users get authenticated (I don't think ntlm_auth is
>>     responding slowly), so I could not figure this out.
>>
>>     --cheers,
>>     Rando
>>
>>
>>     -
>>     List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
>
>
