error discarding packet

Borislav Dimitrov b.dimitrov at ngsystems.net
Wed Dec 23 14:46:07 CET 2009


Just to add: I hope you are starting FR without the debug option
(i.e. without -X). When started in debug mode (radiusd -X &) it runs
in a single thread, so requests obviously have to wait for each other
to finish...
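
To put rough numbers on the single-thread point, here is a minimal sketch in Python. It is a deliberately simplified model, not FreeRADIUS code: it assumes all requests arrive at the same moment, each takes the same time to handle (mostly waiting on a backend such as the DB), and the workers run fully in parallel. The function name and all the figures are made up for illustration.

```python
import math


def worst_case_wait(num_requests: int, service_time: float, workers: int) -> float:
    """Seconds until the last of `num_requests` simultaneous requests is answered.

    Simplified model (an assumption, not how FreeRADIUS schedules work):
    requests are handled in batches of `workers`, each batch taking
    `service_time` seconds.
    """
    if num_requests < 0 or service_time < 0 or workers < 1:
        raise ValueError("need num_requests >= 0, service_time >= 0, workers >= 1")
    batches = math.ceil(num_requests / workers)
    return batches * service_time


if __name__ == "__main__":
    # Hypothetical burst: 20 accounting requests, 0.5 s of backend time each.
    for workers in (1, 5):
        wait = worst_case_wait(20, 0.5, workers)
        print(f"{workers} worker(s): last reply after {wait} s")
```

Under these assumptions, if the NAS retransmits after a few seconds, the single-worker case is already long enough to trigger retries (and so duplicate requests), while a handful of workers keeps the last reply well inside the timeout.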

Best regards,

Borislav Dimitrov
e-mail: b.dimitrov at ngsystems.net
GSM: 0888 51 55 45; 0889 28 54 57
NG Systems
Lavele 32 str, fl: 4,
Sofia, Bulgaria




On 23.12.2009, at 15:43, Borislav Dimitrov wrote:

> In radiusd.conf:
>
> # THREAD POOL CONFIGURATION
> thread pool {
> 	start_servers = 1
> 	max_servers = 1
> 	min_spare_servers = 1
> 	max_spare_servers = 1
> 	max_requests_per_server = 0
> }
>
> ...but instead of the 1s, use values better suited to your load
> (e.g. 5 or 7). It's similar to Apache's thread pool settings...
> Keep monitoring and tuning until the "Discarding duplicate request"
> error disappears or becomes very rare. Also look at the
> Acct-Delay-Time attribute that the NAS sends to FR. It should be
> 0. If it's greater than 0, there is some delay. When you increase
> your thread pool settings, CPU usage will start to rise as FR
> processes more requests concurrently. Also check your NAS
> documentation for its timeout, retransmit and similar options.
> For Cisco they look like this:
> "radius-server retransmit 0" etc.
>
> Best regards,
>
> Borislav Dimitrov
> e-mail: b.dimitrov at ngsystems.net
> GSM: 0888 51 55 45; 0889 28 54 57
> NG Systems
> Lavele 32 str, fl: 4,
> Sofia, Bulgaria
>
>
>
>
> On 23.12.2009, at 15:36, Alisson wrote:
>
>> Hi, my DB is OK. I tested it with other programs and it is
>> running well.
>>
>> How do I set the thread pool for better concurrency?
>>
>> 2009/12/23 Borislav Dimitrov <b.dimitrov at ngsystems.net>
>> Hi,
>>
>> This question has been answered many times on this ML. I myself
>> have answered it (or at least tried to) twice. Here are some of my
>> previous messages:
>>
>> Msg1:
>> Hi,
>>
>> I've already tried to answer a similar question some time ago (and
>> I'm probably not the only one), but anyway...
>> The problem is probably caused by some delay or packet loss.
>> Notice the Acct-Delay-Time value increasing as the NAS retries
>> sending the "lost" accounting packet (in my case, at least, it
>> wasn't lost; its processing was just delayed). I've experienced
>> such issues with Cisco VoIP routers: the router's log is flooded
>> with RADIUS Server DEAD and then ... ALIVE messages, and in the FR
>> log you can see the retries with increasing Acct-Delay-Time
>> values. The root cause may differ, so you'll have to check it in
>> your case. In my case it was the thread pool settings not being
>> appropriate for the load. When that happens the CPU usage stays
>> low, but the CPU sits idle because there is no real concurrency
>> and requests have to wait for each other to finish. So find the
>> root cause of your problem and eliminate it. The other thing is
>> that most NASs have options to configure the RADIUS timeout, dead
>> time, retransmit count etc. E.g. for Cisco you could try
>> "radius-server retransmit 0".
>>
>> Msg2:
>> Hi,
>>
>> As far as I can see, people on the list have given you a lot of
>> very useful suggestions about what could cause the problem. To
>> clarify what I said earlier and help you narrow things down a
>> little: it's probably due to the RADIUS response timing out, so
>> the NAS complains that the server is dead, and later, when the
>> server finally responds, the NAS marks it as alive again. The
>> reasons can differ depending on your setup: a slow network,
>> database or custom module (like rlm_perl/python etc.) or, as I
>> suggested from my personal experience, improperly configured
>> concurrency settings in FR itself. See which component of your
>> setup is causing the slow responses (it can be the backend or a
>> messed-up FR configuration) and fix it. Just for completeness,
>> check your NAS manuals: most have these settings configurable
>> (response timeouts, retransmits, marking the server as dead etc.).
>> But playing with the NAS, while possibly useful, probably won't
>> address the main issue in your setup. Check what is slowing
>> things down.
>>
>> Msg3:
>> Hi there,
>>
>> I may be mistaken, but... these are log messages on the NAS,
>> aren't they?
>> If so, I've experienced similar behavior with Cisco VoIP routers
>> (RADIUS Server DEAD and then... ALIVE). This happens if you
>> haven't properly enabled concurrency in FreeRADIUS: the CPU usage
>> stays low (0-2%), but if there are many requests they are
>> obviously waiting for each other... This happens when you have
>> started FreeRADIUS with the -X option (I think it runs in a single
>> thread then) or have values that are too low for the thread pool
>> parameters (and/or the *_clones options of rlm_perl, which are to
>> be deprecated soon). If you configure proper values for the
>> expected load (concurrent requests), the requests won't wait for
>> each other to finish while the CPU stays unused, and you'll avoid
>> this annoying message in your logs. A sure sign that something
>> like that is going on is the Acct-Delay-Time attribute with values
>> greater than 0 (that is for accounting; I'm not sure about auth
>> etc.). Anyway, if the values of that attribute are high (they are
>> in seconds, I think), then the requests are waiting too long,
>> hence the error messages.
>>
>> Bottom line:
>> 1) Check the ML for more info
>> 2) You can configure when the NAS times out and resends RADIUS
>> packets
>> 3) Something is slowing down your setup. It may be the DB or  
>> something else. If your CPU usage stays low (< 5%), check your  
>> thread pool settings and increase them to achieve better concurrency.
>>
>> Sincerely,
>>
>> Borislav Dimitrov
>> e-mail: b.dimitrov at ngsystems.net
>> GSM: 0888 51 55 45; 0889 28 54 57
>> NG Systems
>> Lavele 32 str, fl: 4,
>> Sofia, Bulgaria
>>
>>
>>
>>
>> On 23.12.2009, at 15:10, Alisson wrote:
>>
>> Hi, the other day I posted about this same error: 'Error: Discarding
>> duplicate request from client'
>>
>> and the answer was 'your database is slow'.
>>
>> So I upgraded my server with more memory and changed the server
>> variables...
>>
>> but I'm still having this problem
>>
>> and I don't know what the cause can be.
>>
>> -- 
>> Regards,
>> Alisson F. Gonçalves
>> Information Systems - UFGD
>> -
>> List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
>
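
The thread above keeps pointing at Acct-Delay-Time values greater than 0 as the sign that accounting requests are queuing. As a rough aid for the "keep monitoring" step, here is a small Python sketch that pulls those values out of a chunk of log text. It assumes the attribute appears on lines of the form "Acct-Delay-Time = 5", as in FreeRADIUS detail or debug output; the function name and the sample records are invented for illustration, so adapt the pattern to whatever your own logs look like.

```python
import re
from typing import List

# Assumed line format: "Acct-Delay-Time = <seconds>"
DELAY_RE = re.compile(r"Acct-Delay-Time\s*=\s*(\d+)")


def delayed_values(text: str) -> List[int]:
    """Return every Acct-Delay-Time value greater than 0 found in `text`."""
    values = (int(v) for v in DELAY_RE.findall(text))
    return [v for v in values if v > 0]


# Made-up sample records, not taken from a real server.
SAMPLE = """\
	Acct-Status-Type = Stop
	Acct-Delay-Time = 0

	Acct-Status-Type = Stop
	Acct-Delay-Time = 5

	Acct-Status-Type = Stop
	Acct-Delay-Time = 10
"""

if __name__ == "__main__":
    delays = delayed_values(SAMPLE)
    print(f"{len(delays)} delayed record(s), worst delay {max(delays, default=0)} s")
```

If the list stays empty after you raise the thread pool values, the retries have most likely stopped; if it keeps growing, something else in the setup is still slow.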
