fr_packet_cmp again

Alan DeKok aland at deployingradius.com
Thu Apr 28 15:30:57 CEST 2011


Josip Almasi wrote:
>>   That's terrible.  But at least it helps track down where the problem
>> is.
> 
> Care to elaborate?

  If it's still the same issue in 2.1.x and 3.0, then it's not the
internal state machine.  I thought it was, but it looks like it's the
hash tables.

> But this terrible thing is due to RHEL 6, maybe a kernel issue.
> I mentioned it earlier as non-critical.
> After a few hundred thousand threads, no new threads can start. Be it
> freeradius or java threads.

  Ouch.  The only work-around is to set the number of threads high, and
to set the number of "spare threads" high.  That way few threads will be
created or destroyed.
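
  A minimal sketch of that work-around, using the "thread pool" section
of radiusd.conf.  The numbers are illustrative assumptions, not tuned
values; the point is only that the pool starts large and that spare
threads are never reaped.

```
thread pool {
	# Start with many threads so few need to be created later.
	# (all values here are hypothetical)
	start_servers = 32
	max_servers = 64

	# Keep plenty of idle threads around.  With max_spare_servers
	# equal to max_servers, no thread is destroyed for being idle.
	min_spare_servers = 16
	max_spare_servers = 64

	# 0 means a thread is never retired after a fixed number of
	# requests.
	max_requests_per_server = 0
}
```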

> The workaround is to use a fixed number of threads, as you explained earlier.
> To make it even weirder, if we set radiusd uid and gid to 0, it works :)
> I have no clue how to set this up, but I'm sure it's not due to ulimit.

  It does sound like ulimit...
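
  One quick way to check is to print the limits that the account
running radiusd actually sees, and compare them with what root sees.
The sketch below is only an assumption about which limits are relevant
(RLIMIT_NPROC, RLIMIT_STACK, RLIMIT_AS); on Linux, RLIMIT_NPROC counts
threads as well as processes and is not enforced for privileged
processes, which would be consistent with uid 0 "working".  The helper
name describe_limit is made up for this example.

```python
import resource


def describe_limit(name, limit_id):
    """Return a one-line summary of the soft and hard value of a limit."""
    soft, hard = resource.getrlimit(limit_id)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    # Limits assumed to matter for thread creation; not every platform
    # defines all of them, so skip the ones that are missing.
    for name in ("RLIMIT_NPROC", "RLIMIT_STACK", "RLIMIT_AS"):
        if hasattr(resource, name):
            print(describe_limit(name, getattr(resource, name)))
```

  Run it once as the radiusd user and once as root.  If the output
differs, the limits are the first place to look.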

> Want to try it on our boxes? We could set you up an openvpn account.
> Or, whatever, it's lab anyway, we'll nat it to internet and give you ssh.

  Maybe next week.  I'm busy this week.

> Sorry no graphs.
> But we narrowed it down to rlm_detail.
> For auth requests, performance degrades less, to max 20 ms response
> time. Say, with about 2K auth req/s, it's still about 2ms, quite nice.
> But for acct it can get to more than a second.

  Wow... that's really horrible.  I have no idea why that's happening.

> When we turn off detail, it's the same as for auth requests.
> But when we detail to /dev/null, response time grows by 40 ms.
> With detailing to files, we get an additional 80-120 ms.
> It was at about 5K acct req/s.
> All this, IIRC. We may do it again if you're interested.

  That's astonishing.  I don't know why that's happening.

> OTOH, FR 3 keeps response at 1-2ms at all times.

  Much, much, better.

  Alan DeKok.


