Discarding Duplicate Request
John Douglass
john.douglass at oit.gatech.edu
Thu Oct 2 18:56:11 CEST 2014
On 10/02/2014 12:35 PM, Rando Nakarmi wrote:
> Hello John,
>
> Thanks
>
> you increased max_requests = 16384 (so you have only 64 clients?)
That was a cut/paste from Phil Huxley who responded to my question. I'm
still figuring out how to optimize. I can say that raising winbind max
domain connections helped A LOT. However, the faster you churn, the more
you might hit the Cisco WLC bug. We've seen _fewer_ of those messages,
but we've also added radius servers and moved some controllers to their
own radius server pairs. I hate adding radius servers as I feel it masks
the real problem and it doesn't solve peak (change of classes) issues.
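A rough back-of-the-envelope sketch of why faster churn makes the WLC
problem worse, assuming (as described in my earlier message quoted below)
that the controller sends everything from one source port, so only the
8-bit ID tells requests apart. The function names, the request rates and
the 2 second backend latency are made up for illustration; they are not
measurements from our servers.

```python
# Rough model (an assumption, not measured data): one controller sends every
# request from a single UDP source port, so the server can only tell
# requests apart by the 8-bit RADIUS Identifier.
ID_SPACE = 256  # the Identifier field is one octet, so 256 possible values


def seconds_until_id_reuse(requests_per_second: float) -> float:
    """Time before the same (source port, ID) pair comes around again."""
    return ID_SPACE / requests_per_second


def id_reused_while_pending(requests_per_second: float,
                            backend_latency_s: float) -> bool:
    """True if an ID can wrap while the original request is still unanswered."""
    return seconds_until_id_reuse(requests_per_second) < backend_latency_s


if __name__ == "__main__":
    # Hypothetical rates (requests/second) against a hypothetical 2 s backend.
    for rate in (50, 200):
        print(rate, seconds_until_id_reuse(rate),
              id_reused_while_pending(rate, 2.0))
```

In this model the only two knobs are the request rate and how long the
backend takes to answer, which is why speeding up ntlm_auth/winbind
helps but cannot fully fix a controller that cycles IDs quickly enough.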
> you set winbind max domain connections = 12 (how do I know which value
> is right?) (we have around 300 clients (WAPs))
It's really mainly about handling peak connections. With 300 WAPs you
probably won't go that high. I have about 500 aps/controller but we have
30k users online at once spread across maybe 20 controllers with
multiple controllers on each radius server.
Actually I increased winbind max domain connections to 128. The way I
kind of felt that out was to run this on the linux/unix server:

    lsof | grep winbind | grep TCP

You can see the number of TCP connections to the AD server. We were
hitting our initial limit of 50 during peak times. I just increased it to
a high enough number so that I probably won't reach it. The number of
connections goes up and down. On a radius failover we might be
generating a lot of connections but they eventually close and die off.
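If you'd rather script that check than eyeball it, here is a minimal
sketch in Python that applies the same two filters as the grep pipeline
to captured lsof output and compares the count against the configured
limit. The sample lines, the host names and the 90% warning threshold
are hypothetical; only the limit of 50 comes from our setup.

```python
def count_winbind_tcp(lsof_output: str) -> int:
    """Count lines that mention both winbind and TCP, like the two greps."""
    return sum(
        1
        for line in lsof_output.splitlines()
        if "winbind" in line and "TCP" in line
    )


def near_limit(connections: int, limit: int, threshold: float = 0.9) -> bool:
    """True once the connection count reaches threshold * limit."""
    return connections >= limit * threshold


# Made-up sample of lsof output, for illustration only.
SAMPLE = """\
winbindd 2101 root 23u IPv4 41873 0t0 TCP radius1.example.edu:51234->dc1.example.edu:445 (ESTABLISHED)
winbindd 2101 root 24u IPv4 41901 0t0 TCP radius1.example.edu:51240->dc1.example.edu:445 (ESTABLISHED)
winbindd 2101 root 9u unix 0xffff8800 0t0 40112 /var/run/samba/winbindd/pipe
radiusd 1987 radiusd 7u IPv4 39001 0t0 UDP *:1812
"""

if __name__ == "__main__":
    n = count_winbind_tcp(SAMPLE)
    print(n, near_limit(n, limit=50))
```

Run something like this from cron during a class change and you get a
feel for the peak, which is the number that matters when picking the
limit.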
>
> so I set max_requests = 300*256 (I use 256, the value which is in the
> radiusd.conf file)
>
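For what it's worth, the arithmetic behind those numbers can be sketched
as below, assuming the 256-per-client figure you quote is the right
multiplier. The helper names are hypothetical and this is only the
sizing rule of thumb, not a tuning recommendation.

```python
REQUESTS_PER_CLIENT = 256  # per-client figure quoted in the message above


def max_requests_for(clients: int) -> int:
    """Size the request limit as clients * 256."""
    return clients * REQUESTS_PER_CLIENT


def clients_covered_by(max_requests: int) -> int:
    """Invert the rule: how many clients a given limit allows for."""
    return max_requests // REQUESTS_PER_CLIENT


if __name__ == "__main__":
    print(max_requests_for(300))      # the 300 WAP case
    print(clients_covered_by(16384))  # the value I pasted earlier
```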
> winbind max clients = 1200 (has anybody used this parameter? does this
> mean how many winbind clients can connect to AD?)
I'm actually not 100% sure on that stat/setting. :) I don't think I
really care about it enough. Not really sure how to determine this one.
The documentation on samba.org doesn't really spell out what it means.
>
> --cheers
> Rando
>
>
>
> On Thu, Oct 2, 2014 at 3:27 PM, John Douglass
> <john.douglass at oit.gatech.edu>
> wrote:
>
> :) Rando,
>
> There has been much discussion on this list about that problem. IF
> you are using Cisco WLC, there is a flaw in the way radius is
> processed which could lead to these log messages. Here is the
> previous set of threads that have some pointers as to what to look at.
>
> Cisco WLCs use the same source port, and the 8-bit ID that is used
> to track radius conversations gets cycled so fast during peak times
> that it creates duplicates where there really shouldn't be any.
> We are pushing Cisco hard to fix this flaw in their design
> especially since they are creating controllers with more and more
> capacity. The problem is only going to get worse.
>
> I highly suggest you move to radius 2.2.5 and enable the ntlm_auth
> timeout and upgrade your samba to 3.6 where you can add some
> additional parameters. Here are some hints that Phil Huxley shared
> with us that have been helpful in making our services better. The
> issues haven't been handled 100%, and there are other things to
> consider like if using a Cisco WLC, enabling client exclusion,
> etc, etc but I don't have a ton of info on that as I just run the
> radius servers.
>
> http://lists.freeradius.org/pipermail/freeradius-users/2014-September/073929.html
>
> - John Douglass @ Georgia Tech
>
> PS: I really need to write up a blog post about this :)
> PPS: Yes, we know AD is slow and it sucks as a backend but for a
> lot of us, it's what we have to deal with :)
>
>
>
> On 10/02/2014 11:10 AM, Rando Nakarmi wrote:
>> I have been seeing quite a large number of messages like the one
>> below logged in radius.log lately.
>>
>> Discarding duplicate request from client classroom98 port 32880 -
>> ID: 131 due to unfinished request 241848
>>
>> I read in some threads that this might be the case when the back-end
>> servers (i.e. auth servers) are too slow to respond.
>>
>> My back-end is AD, using ntlm_auth.
>> radius version 2.1.12-4
>> samba version 3.5.8-68
>>
>> Any hints or suggestions on how to resolve this would be very helpful.
>>
>> Most of the users get authenticated ( I don't think ntlm_auth is
>> responding slow), I could not figure this out
>>
>> --cheers,
>> Rando
>>
>>
>> -
>> List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html