num_answers_to_alive
Gary Gatten
Ggatten at waddell.com
Thu Aug 4 17:52:13 CEST 2011
Yup. Typically once something fails I consider it questionable / unstable until it proves itself to me again. The routing / circuit analogy is a perfect example.
Many HA "things" let the user configure whether preemption is enabled. That is, once the primary node has failed and the secondary has taken over, does the primary "automatically" become the primary again when it is believed to be healthy, OR must the admin manually make it the primary again? Personally, I disable preemption on all my HA routers, firewalls, etc. Once something fails I want to review / analyze the failure and validate that it's stable before I trust it again and start running traffic through it!
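To make the preemption choice concrete, here is a minimal sketch of a two-node failover pair. The class and method names (FailoverPair, primary_recovered, admin_restore_primary) are hypothetical and are not taken from any particular router or firewall product; the sketch only models which node carries traffic.

```python
class FailoverPair:
    """Hypothetical two-node HA pair; tracks only which node is active."""

    def __init__(self, preempt=False):
        self.preempt = preempt      # assumption: a single on/off preemption knob
        self.active = "primary"

    def primary_failed(self):
        # The secondary takes over as soon as the primary is declared failed.
        self.active = "secondary"

    def primary_recovered(self):
        # With preemption the primary reclaims the active role automatically;
        # without it the secondary keeps running traffic.
        if self.preempt:
            self.active = "primary"

    def admin_restore_primary(self):
        # Manual step an admin takes after reviewing the failure.
        self.active = "primary"


pair = FailoverPair(preempt=False)
pair.primary_failed()
pair.primary_recovered()
print(pair.active)  # secondary stays active until the admin intervenes
```

With preempt=False the recovered primary carries no traffic until admin_restore_primary() is called, which matches the "review before trusting it again" approach described above.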
G
-----Original Message-----
From: freeradius-users-bounces+ggatten=waddell.com at lists.freeradius.org [mailto:freeradius-users-bounces+ggatten=waddell.com at lists.freeradius.org] On Behalf Of Alexander Clouter
Sent: Thursday, August 04, 2011 9:20 AM
To: freeradius-users at lists.freeradius.org
Subject: Re: num_answers_to_alive
Stefan Winter <stefan.winter at restena.lu> wrote:
>
> The documentation says that 3..10 are *useful* ranges, but doesn't
> mention that everything else is forbidden. In particular, I would like
> to use 1, not 3. The idea is: the server was dead before, but now it
> managed to send a reply back - so it must have been fixed. I would like
> to mark it alive immediately. Is that unreasonable?
>
Similar to 'link flapping' (think OSPF/BGP), you should use heuristics,
as things are not just black and white. If a service simply had two
states, "up" and "down", then a value of 1 would probably be okay, but we
also have 'unstable'. Imagine this state coming from:
* overloaded RADIUS server (or backend DB)
* link congestion between RADIUS servers
Having a value of three says not just "alive" but also "alive and has
been for a while"; this can be further interpreted to mean that the
service is stable as well as alive. If the system had briefly come back
and then died again, you would likely have seen a failure on attempt two
or three.
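The heuristic can be sketched as a small counter. This is an illustration only, not FreeRADIUS code: the class HomeServerHealth and its method record_check are hypothetical, and the sketch assumes that a missed check resets the count, so the answers must arrive in a row.

```python
class HomeServerHealth:
    """Hypothetical tracker: a dead server is marked alive only after
    `num_answers_to_alive` successful checks in a row."""

    def __init__(self, num_answers_to_alive=3):
        self.num_answers_to_alive = num_answers_to_alive
        self.alive = False            # the server starts out marked dead
        self.consecutive_answers = 0

    def record_check(self, answered):
        """Record one status check result and return the current state."""
        if self.alive:
            return True               # detecting a new failure is not modelled here
        if answered:
            self.consecutive_answers += 1
            if self.consecutive_answers >= self.num_answers_to_alive:
                self.alive = True
        else:
            # Assumption: one missed check resets the streak.
            self.consecutive_answers = 0
        return self.alive


flapping = [True, False, True, True, True]

strict = HomeServerHealth(num_answers_to_alive=3)
print([strict.record_check(a) for a in flapping])
# [False, False, False, False, True]

eager = HomeServerHealth(num_answers_to_alive=1)
print([eager.record_check(a) for a in flapping])
# [True, True, True, True, True]
```

With a threshold of 1 the server is trusted after its first reply, even though the very next check fails; with 3 it is trusted only once the flapping has stopped.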
Hope I am explaining myself well :)
Cheers
--
Alexander Clouter
.sigmonster says: BOFH excuse #256:
You need to install an RTFM interface.