System frequently stops responding...

Mohamed Lrhazi Mohamed.Lrhazi at georgetown.edu
Fri Jul 24 16:09:33 CEST 2015


Thanks guys... I will find the answers to your questions... It happens
infrequently enough, and goes away fast enough, to be difficult to debug...

Mohamed.

On Fri, Jul 24, 2015 at 6:38 AM, Alan DeKok <aland at deployingradius.com>
wrote:

> On Jul 24, 2015, at 1:34 AM, Mohamed Lrhazi <Mohamed.Lrhazi at georgetown.edu>
> wrote:
> > Still trying to get to the bottom of this issue... To summarize:
> > - Wireless controllers log that the RADIUS server (a load-balanced VIP)
> > did not respond to a query. This is logged in clusters of a dozen or so,
> > several times a day.
> > - We were using Docker containers, so I decided to try without them.
> > - Built two VMs, Red Hat Enterprise Linux 7, running the provided
> > FreeRADIUS RPMs, 3.0.4.
>
>   Please use 3.0.9.  We're not going to debug issues which were tracked
> down and fixed six months ago.
>
> > - Sending a quarter of our traffic to this pool of two VMs.
> > - Issue still occurs on these VMs.
> > - I run radiusd in -Xx mode on both of the RHEL7 VMs, and also run a
> > continuous tcpdump on each VM.
> >
> > - Problem occurrences seem to reliably coincide with:
> > -- tcpdump shows all the requests logged by the controllers having been
> > resent a few times (duplicates in Wireshark).
> > -- radiusd goes silent (no log at all) for 30 seconds, after which it
> > resumes logging and, I presume, working.
>
>   And... what does the debug log say during this time?  You should be
> able to correlate timestamps.
>
>   If there's *nothing* in the debug output, then most likely a database
> is locking up and preventing FreeRADIUS from doing anything.
>
> > - radiusd logs a line for each missed query, I think, like so:
> > Error: (7719) Ignoring duplicate packet from client gu_net_10 port 3010 -
> > ID: 96 due to unfinished request in component <core> module
> >
> > -- Spikes in CPU usage (as seen in sar output).
> >
> > What can I do next to further zoom in on the root cause? Or is this
> > pretty clearly CPU starvation? Just add more VMs?
>
>  Use 3.0.9.
>
>   Alan DeKok.
>
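
The timestamp correlation suggested above could be started with a small script that scans the radiusd debug output for silent periods, which can then be lined up against the tcpdump capture and the sar data. This is only a rough sketch. It assumes each log line begins with a timestamp such as "Fri Jul 24 16:09:33 2015" (the TS_FORMAT and TS_LENGTH values are assumptions to adjust for the actual log), and find_gaps is a hypothetical helper, not part of FreeRADIUS.

```python
from datetime import datetime, timedelta

# Assumed timestamp prefix, e.g. "Fri Jul 24 16:09:33 2015 : Debug: ...".
# Adjust TS_FORMAT and TS_LENGTH if the log lines look different.
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
TS_LENGTH = 24


def find_gaps(lines, threshold=timedelta(seconds=10)):
    """Return (start, end) pairs where consecutive timestamped lines
    are at least `threshold` apart."""
    gaps = []
    previous = None
    for line in lines:
        try:
            current = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            # No parsable timestamp (for example a continuation line): skip it.
            continue
        if previous is not None and current - previous >= threshold:
            gaps.append((previous, current))
        previous = current
    return gaps
```

Passing an open log file to find_gaps (a file object yields lines) returns the start and end of each quiet window. Those two times are the ones to look up in the packet capture and in the database logs.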

