Pre-release of Version 2.1.8

Bjørn Mork bjorn at mork.no
Tue Dec 8 12:34:55 CET 2009


Josip Rodin <joy at entuzijast.net> writes:
> On Tue, Dec 08, 2009 at 10:10:11AM +0100, Bjørn Mork wrote:
>> The symptoms are that all home servers are marked dead/zombie.  Typical
>> obfuscated home_server list in this state:
>> 
>> server(bjorn) ~ 71$ radmin -e "show home_server list"
>> 192.168.8.120   1812    auth    alive   0
>> 192.168.8.246   1812    auth    alive   0
>> 192.168.8.132   1645    auth    dead    0
>> 192.168.8.132   1645    auth    dead    3
>> 192.168.8.14    1812    auth    alive   0
>> 192.168.8.10    1812    auth    alive   0
>> 192.168.8.210   1812    auth    alive   0
>> 192.168.8.50    1812    auth    zombie  0
>> 192.168.8.20    1812    auth    zombie  0
>> 
>> There are a number of servers marked "alive", but these are all servers
>> which have been revived after the fixed period.  When used, they will be
>> marked dead/zombie again.
>
> What does the log file say? There should be many messages marked 'Proxy'
> in the v2.1.x branch since a couple of weeks ago, and definitely in your
> case if they keep changing state so often.

Sure.  You'll find all "Proxy:" prefixed messages since log rotation at
midnight here: http://www.mork.no/~bjorn/fr-218-prerelease-proxying.log
(It's 300 kB, so it was a little over the limit for this list)

The addresses have been replaced using the same pattern as for the
"home_server list", so they are directly comparable.

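For anyone who wants to compare the list against the log, here is a minimal
Python sketch that tallies the states in the "show home_server list" output
quoted above.  It assumes whitespace-separated columns in the order address,
port, type, state, followed by one numeric column that is treated as opaque.
The function names tally_states and servers_in_state are made up for this
illustration and are not part of radmin or FreeRADIUS.

```python
from collections import Counter

# The obfuscated "show home_server list" output quoted earlier in this mail.
SAMPLE = """\
192.168.8.120   1812    auth    alive   0
192.168.8.246   1812    auth    alive   0
192.168.8.132   1645    auth    dead    0
192.168.8.132   1645    auth    dead    3
192.168.8.14    1812    auth    alive   0
192.168.8.10    1812    auth    alive   0
192.168.8.210   1812    auth    alive   0
192.168.8.50    1812    auth    zombie  0
192.168.8.20    1812    auth    zombie  0
"""


def tally_states(text):
    """Count list entries per state (assumed to be the 4th column)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        counts[fields[3]] += 1
    return counts


def servers_in_state(text, states=("dead", "zombie")):
    """Return (address, port) pairs for entries whose state is in `states`."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[3] in states:
            hits.append((fields[0], int(fields[1])))
    return hits


if __name__ == "__main__":
    print(dict(tally_states(SAMPLE)))
    for address, port in servers_in_state(SAMPLE):
        print(address, port)
```

Feeding it periodic snapshots of the list would show how the dead/zombie count
develops over time, which can then be lined up with the timestamps in the log.
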
You'll notice that some of the home servers are truly unavailable.  This
is unfortunately something we have to live with.  There are also some
servers which are unavailable at certain times, but mostly available.

At approximately 08:40 something happens, and a lot of servers are
flagged as dead or zombie.  

This could of course have been caused by network problems, but there was
no such problem at this time. Proxying goes over the same interface as
the rest of the traffic, and non-proxied authentication and accounting
continued to work without problems.  The home servers are in a number of
different networks, and any network incident taking them all out would
be very visible.

The server was restarted at 09:35 and you'll see that only the usual
suspects are logged as zombies after this.

>> But I will test that now, starting with the stable branch from
>> git.freeradius.org, commit d7b4f003477644978f3fefa694305dce9b5dc8bf,
>> which was the last point where things seemed to work
>
> BTW you could probably do a git bisect.

Yes, if I can verify the good versions...  

As I said, I'm not entirely sure that there actually was a good version.
And it looks like each positive test will have to take 3+ days.  Unless
I can find out what triggers this.
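
To put a rough number on that: a bisect over n candidate commits needs about
ceil(log2(n)) test rounds.  The sketch below turns that into days, assuming
every round takes the full 3 days (a round that fails early would be shorter,
and "3+" means a good round may run longer).  The commit count of 100 is purely
hypothetical, since I have not counted the commits in the range, and
rough_bisect_days is just an illustrative helper name.

```python
import math


def rough_bisect_days(candidate_commits, days_per_test=3):
    """Rough estimate: about ceil(log2(n)) bisect rounds, each a full test period."""
    if candidate_commits < 1:
        return 0
    rounds = math.ceil(math.log2(candidate_commits))
    return rounds * days_per_test


if __name__ == "__main__":
    # 100 is a hypothetical commit count, not the real size of the range.
    print(rough_bisect_days(100))
```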


Bjørn
