Proxy realms and home_server_pool fallback not working
Peter Lambrechtsen
peter at crypt.co.nz
Mon Mar 7 09:22:36 CET 2016
On Mon, Mar 7, 2016 at 2:55 PM, Alan DeKok <aland at deployingradius.com>
wrote:
> On Mar 6, 2016, at 6:54 PM, Peter Lambrechtsen <peter at crypt.co.nz> wrote:
> > I'm looking to add more robustness into my proxy architecture and noticed
> > in the home_server_pool there is the option for "fallback = virtualrealm"
> > so if all home servers fail then a last resort home_server is used with
> > some config locally to always accept / reject customers based on the
> > realm they are coming from. I'm not using the status_check
>
> Then you can do "status_check = request". An Access-Accept or
> Access-Reject response will be accepted as an indication that the home
> server is alive.
>
> > as some of the
> > downstream clients don't support status-server, but I will look into that
> > to see if it makes a difference.
>
> It should.
>
> > However, for this situation, I would expect that whether or not you are
> > using Status-Server checks shouldn't have any impact on how the fallback
> > server works.
>
> It does. A lot.
>
> The problem is that without Status-Server, FreeRADIUS has to *guess*
> when the home server is alive. And the guess is usually wrong. Because
> most guesses are wrong.
>
Yes, I have figured that out. I'm now pinging all our downstream RADIUS
clients to see which respond sanely when sent a Status-Server request, and
then turning on status checks for them.
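For reference, a minimal sketch of the "status_check = request" option Alan
mentions, for home servers that don't answer Status-Server (the username /
password values here are placeholders, not anything real):

home_server ProxyDest {
        type = auth+acct
        ipaddr = 192.168.1.113
        port = 1812
        secret = password
        # Send a real Access-Request as the status check; either an
        # Access-Accept or an Access-Reject counts as proof of life.
        status_check = request
        username = "status_check_user"
        password = "status_check_password"
        check_interval = 30
}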
> > In the proxy.conf I have configured:
> >
> > home_server ProxyDest {
> >         type = auth+acct
> >         ipaddr = 192.168.1.113
> >         port = 1812
> >         secret = password
> >         response_window = 1
> >         require_message_authenticator = no
> >         zombie_period = 5
> >         revive_interval = 10
>
> That's really low. After 10s, just mark the home server alive?
>
> It should be 60s at the minimum. Maybe 5min.
>
It was purely for testing, since waiting around for 10 seconds is much
better than waiting around for 2 minutes. Now, with check_interval set and
status checks turned on, things are making more sense.
> > But the second and subsequent requests I would expect to get proxied to
> > the local fallback virtual server, as the home_server has been marked
> > as zombie. But that never seems to happen. It keeps on rejecting the
> > requests and fallback never seems to be used.
>
> Hmm... I'll take a look.
>
> > If I configure a second home server in the pool.
> ...
> > Then the second server is failed over to when the first fails. Which is
> > all good if I wanted to use type fail-over, but if I wanted to use
> > load-balance then I can't have my fallback server as a home server,
> > otherwise a percentage of requests will always be local, which isn't
> > ideal.
>
> Yes. You can't do load-balance and fallback.
>
> You *can* put something into Post-Proxy-Type Fail. Which is probably
> what we should do. And remove the fallback virtual server.
>
What could I do in Post-Proxy-Type? I can't call the virtual server from
there, setting Proxy-To-Realm doesn't proxy to a new destination, and
neither does updating the control list to accept. There doesn't seem to be
a way to turn the Reject from a failed proxy request back into an Accept.
Here's the debug output from my attempt:
(0) ERROR: Failing proxied request for user "peter", due to lack of any
response from home server 192.168.1.113 port 1812
(0) Clearing existing &reply: attributes
(0) Found Post-Proxy-Type Fail-Authentication
(0) # Executing group from file ./sites-enabled/default
(0)   Post-Proxy-Type Fail-Authentication {
(0)     policy accept {
(0)       update control {
(0)         &Response-Packet-Type = Access-Accept
(0)       } # update control = noop
(0)       [handled] = handled
(0)     } # policy accept = handled
(0)   } # Post-Proxy-Type Fail-Authentication = handled
(0) There was no response configured: rejecting request
(0) Using Post-Auth-Type Reject
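For context, the config that produced that output looks roughly like this;
it is reconstructed from the debug output above, so treat it as a sketch.
In sites-enabled/default:

Post-Proxy-Type Fail-Authentication {
        accept
}

And the "accept" policy (e.g. in policy.d):

accept {
        update control {
                &Response-Packet-Type = Access-Accept
        }
        # Return "handled" so no further processing happens
        handled
}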
> This allows the same behaviour for all packets, and simplifies the proxy
> code.
>
> > The other interesting thing with the failover is that I set the
> > check_interval to 10 seconds, or 30 seconds. But it seems the first
> > client is only re-checked after 60 seconds and assumed to be back up.
>
> Because you have revive_interval set.
>
> > Waking up in 0.2 seconds.
> > Marking home server 192.168.1.113 port 1812 alive again... we have no
> > idea if it really is alive or not.
>
> And that message is printed only when you have revive_interval set.
>
> The solution is to *not* set revive_interval. And use Status-Server
> exclusively.
>
> > Waking up in 1.0 seconds.
> >
> > I would have thought that
> >
> > zombie_period = 5
> > revive_interval = 10
> > check_interval = 10
> >
> > Would mean that the client would be re-checked in 10 seconds.
>
> check_interval and revive_interval should be mutually exclusive. It
> just doesn't make sense to both check that the home server is alive every
> 10s, and then *always* mark it as alive after 10s.
>
> > Am I mis-understanding how fallback is supposed to work?
>
> A bit.
>
> But the fallback virtual server should work. Tho I'm inclined to remove
> it in 3.1, as it makes everything more complicated.
>
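Putting that advice together, here's a sketch of what I understand the
recommended shape to be, assuming the home server answers status checks:
keep check_interval, drop revive_interval entirely, and use less aggressive
timers than my test values above:

home_server ProxyDest {
        type = auth+acct
        ipaddr = 192.168.1.113
        port = 1812
        secret = password
        zombie_period = 40
        # Status checks alone decide when the server is marked alive
        # again; no revive_interval, so it is never blindly revived.
        status_check = status-server
        check_interval = 30
}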
Thanks for all your help on this. The fail-over with the second server
being the virtual one seems to work well; it just means I am restricted to
a single server and can't use load-balance. But this config would be my
ideal:
home_server_pool ProxyDestPool {
        type = load-balance
        home_server = ProxyDest1
        home_server = ProxyDest2
        home_server = ProxyDest3
        fallback = cacheuser
}
Where, if all the home servers go AWOL, I use the local virtual server
cacheuser, along the lines of the sketch below.
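Assuming the pool fallback has to name a home_server that points at a local
virtual server, that might look something like this (the always-accept
policy is purely illustrative):

home_server cacheuser {
        # No ipaddr/port: requests are handed to the local virtual server
        virtual_server = cacheuser
}

server cacheuser {
        authorize {
                # Last-resort local policy, e.g. always accept
                update control {
                        &Auth-Type := Accept
                }
        }
}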
Many thanks
Peter