v2.x.x redundant-load-balance broken

Alan DeKok aland at deployingradius.com
Tue Mar 24 23:19:40 CET 2015


On Mar 24, 2015, at 5:07 PM, Brian De Wolf <bldewolf at cpp.edu> wrote:
> I upgraded from v2.1.x to v2.2.6 recently and noticed (while
> upgrading one of the backend DBs) that redundant-load-balance sections
> are broken.  I poked at it a little and found an off-by-one error
> where it would only try N-1 of the group members, instead of all of
> them (which is pretty crippling when N=2).  Here's a quick diff that
> fixes it:

  Added, thanks.

> I'm still a little puzzled by the behavior of the server, though.  I
> was testing by adding this to authorize in the src/tests config:
> 
> redundant-load-balance {
> 	ok
> 	fail
> 	fail
> 	fail
> }
> 
> This causes the other tests to randomly fail, as it sometimes load
> balances to the second member, which causes it to only try the three
> fail modules.

  It should wrap around to the first member of the group if a member fails.  The code is supposed to do that...
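
  To make the expected behaviour concrete, here is a minimal sketch in C. It is not the server's code: the names rcode_t, member_fn, mod_ok, mod_fail and try_group are hypothetical stand-ins. It starts at a picked member, wraps around to the first member after the last one, and stops at the first member that does not fail. The `attempts` argument is there only to show the reported off-by-one: with N-1 attempts, a pick of the second member never reaches the first.

```c
#include <stddef.h>

/* Hypothetical result codes; the real server defines more than two. */
typedef enum { RC_FAIL = 0, RC_OK = 1 } rcode_t;

/* A group member is modelled as a function that returns a result code. */
typedef rcode_t (*member_fn)(void);

static rcode_t mod_ok(void)   { return RC_OK; }
static rcode_t mod_fail(void) { return RC_FAIL; }

/*
 * Start at members[start] and try up to `attempts` members in order,
 * wrapping to index 0 after the last one.  Return the result of the
 * first member that does not fail, or RC_FAIL if every attempt fails.
 */
static rcode_t try_group(const member_fn *members, size_t n,
                         size_t start, size_t attempts)
{
    if (n == 0) return RC_FAIL;

    for (size_t i = 0; i < attempts; i++) {
        rcode_t rc = members[(start + i) % n]();
        if (rc != RC_FAIL) return rc;
    }
    return RC_FAIL;
}
```

  With the group { ok, fail, fail, fail } and a pick of the second member (index 1), try_group(group, 4, 1, 3) only reaches the three fail members, while try_group(group, 4, 1, 4) wraps around and returns RC_OK.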

>  What puzzles me is that, when I add this config instead:
> 
> redundant-load-balance {
> 	fail
> 	ok
> 	fail
> 	fail
> }
> 
> I stop getting random failures.  When I add logging to record which one
> we picked and to identify the module before we call modcall_child, it
> says:
> 
> ++load-balance redundant-load-balance {
> pick is 3
> ++redundant-load-balance group redundant-load-balance {
> trying 0x13a7780
> +++[fail] = fail
> +++[fail] = fail
> trying 0x13a77e0
> +++[fail] = fail
> trying 0x13a7550
> +++[fail] = fail
> +++[ok] = ok
> ++} # redundant-load-balance group redundant-load-balance = ok
> 
> It's not clear to me why it's listing fail multiple times for some
> modcalls, or where that last ok comes from.

  Maybe your instrumentation code is wrong?

> Anyway, I checked v3.x.x for the off-by-one error and it looks like the
> loop was re-done to avoid count entirely.  Maybe more of the v3.x.x
> code needs to be back ported?
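
  The v3.x.x loop is not reproduced in this thread, so the following is only a rough illustration of how a loop can avoid a counter, not the actual v3.x.x code. It assumes the members form a NULL-terminated singly linked list; list_rcode_t, list_member_t and try_from are hypothetical names. The walk starts at the picked member, wraps to the head at the end of the list, and stops when it arrives back where it started, so there is no count to get wrong.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
typedef enum { LIST_FAIL = 0, LIST_OK = 1 } list_rcode_t;

/* Stand-in for a group member: it simply carries the result it returns. */
typedef struct list_member {
    list_rcode_t        result;
    struct list_member *next;   /* NULL at the end of the list */
} list_member_t;

/*
 * Try members starting at `start`, wrapping to `head` after the last
 * one, until a member does not fail or we are back at `start`.
 */
static list_rcode_t try_from(list_member_t *head, list_member_t *start)
{
    if (head == NULL || start == NULL) return LIST_FAIL;

    list_member_t *m = start;
    do {
        if (m->result != LIST_FAIL) return m->result;
        m = m->next ? m->next : head;   /* wrap around at the end */
    } while (m != start);

    return LIST_FAIL;
}
```

  Because termination depends on returning to the starting member rather than on a count, every member is visited exactly once regardless of which one was picked.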

  Try backporting the v3.x.x loop.  If it fixes the problem, send a patch, and I’ll commit it within five minutes.

  I’m out of the office right now, and have limited ability to test.

  Alan DeKok.



