Handling unreliable proxy partners

Wed May 19 20:51:14 CEST 2021

On Thu, 20 May 2021 at 05:37, Alan DeKok <aland at deployingradius.com> wrote:

> On May 19, 2021, at 1:20 PM, Paul Moser via Freeradius-Users <
> freeradius-users at lists.freeradius.org> wrote:
> > We'd also like a manual mechanism that our support team can trigger to
> > cover other failure scenarios, e.g. the remote radius server is incorrectly
> > returning access-reject for all valid users, and those scenarios that we
> > haven't been able to think of but will occur, inevitably at the most
> > inconvenient of times.
> >
> > My first attempt at this was that the support team could use radmin to
> > set the home servers to dead, which would mean packets were routed via the
> > fallback virtual server. I initially thought this worked as a solution, but
> > if FreeRADIUS is doing status checks against the remote servers then it
> > will automatically bring them back into service as long as the status check
> > requests are being answered. If, say, the remote partner is responding with
> > access-rejects to even valid users, that is not what you want.
>
>   Yes.  The server tries very hard to proxy packets.
>
> > One idea I haven't explored is having two copies of each virtual server,
> > in different files, one for the normal situation and one for the failure
> > situation and switching which one to use using symlinks and radmin to
> > reload the configuration.
>
>   That's probably too complex.  I wouldn't recommend it.
>
> > What I have come up with so far is, within a virtual server's pre-proxy
> > section, to use the exec module to call a simple shell script that checks
> > for the presence of flag files indicating which, if any, partners are in a
> > bad state. The support team are responsible for creating these files. If
> > any flag files are present, the script adds a radius attribute for each,
> > with the value indicating which partner. In the pre-proxy section I can
> > then check for this attribute and value. If it indicates that the partner
> > the virtual server is handling is in a failure state, I call accept from
> > the always module, which cancels the proxying attempt and sends an
> > access-accept. We can also call any policy that would get called in the
> > fallback virtual server or Post-Proxy-Type Fail-Authentication if we want
> > common radius attributes to be returned in the response to apply some sort
> > of QoS restriction.
> >
> > The rlm_exec documentation states that using exec is very slow and that
> > something like the perl module would be more appropriate for a live
> > environment. Before I carry on down the path of performance testing this
> > and trying a perl/python/rest/custom C module, does anyone have any
> > thoughts/observations or alternative suggestions?
>
>   Use "rlm_always".   From the Changelog for 3.0.22:
>
>         * New xlat for setting status of rlm_always instances and new
>           resource-check example virtual server for manipulating control
>           flow in unlang policies based on status of some external resource.
>           Patches from Terry Burton.
>
>
> https://github.com/FreeRADIUS/freeradius-server/blob/v3.0.x/raddb/mods-available/always#L35
>
>   You can also use "radmin" to poke the "always" configuration live:
>
> radmin> set module config always rcode fail
>
>   The idea would be to create an instance of the "always" module, for each
> home server / pool you're proxying to.  You can then do something like:
>
> always server_x_status {
>         rcode = ok
> }
>
>   and then
>
>         server_x_status
>         if (ok) {
>                 proxy to server x
>         }
>
>   And then you can set that to ok / fail, depending on whatever you want.
>
>   Alan DeKok.
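
For anyone following along, here is a rough, untested sketch of how the
suggestion above might be wired together in FreeRADIUS 3.0.x. The instance
name server_x_status comes from Alan's example. The realm name "server_x",
the placement in authorize, and the Auth-Type Accept fallback are my own
assumptions and are not confirmed anywhere in this thread. The "fail = 1"
override is there because a module returning fail would otherwise end the
section by default, so the "else" branch would never run.

```
# mods-enabled/always: one instance per home server / pool
# (server_x_status is the name used in Alan's example)
always server_x_status {
	rcode = ok
}

# In the virtual server that handles this partner.
authorize {
	# Lower the priority of "fail" so processing continues past
	# this module instead of returning immediately.
	server_x_status {
		fail = 1
	}

	if (ok) {
		# Partner is considered healthy: proxy as usual.
		# "server_x" is a hypothetical realm from proxy.conf.
		update control {
			&Proxy-To-Realm := "server_x"
		}
	}
	else {
		# Partner has been flagged as bad: answer locally
		# instead of proxying (assumed fallback behaviour).
		update control {
			&Auth-Type := Accept
		}
	}
}
```

Support staff would then flip the instance to fail or back to ok with
radmin, along the lines of the command quoted above but presumably naming
the instance (server_x_status) rather than "always". I have not verified
the exact syntax. The xlat mentioned in the 3.0.22 Changelog is another
option; the comments in mods-available/always describe its form.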

I found Replicate-To-Realm very useful here. I could send a copy of each
request to a secondary server, which processed it, created the database
entries, checked whether the password was correct and decided whether the
database needed updating. That kept the database work out of band from the
main instance, as the database couldn’t write as quickly as FR could
process requests.
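
As a minimal sketch of that kind of setup, assuming FreeRADIUS 3 with the
stock "replicate" module enabled: the realm name "secondary" is
hypothetical and would need a matching realm, home_server_pool and
home_server in proxy.conf. This is not the sanitised production config,
just the general shape.

```
authorize {
	# Send a copy of the request to the secondary realm.
	# "secondary" is a hypothetical realm name.
	update control {
		&Replicate-To-Realm := "secondary"
	}

	# rlm_replicate sends the copy and ignores any reply, so the
	# main instance does not wait on the slower backend.
	replicate
}
```

Because replies to replicated packets are ignored, the main instance
still has to produce its own answer for the request; only the slow
database work moves to the secondary server.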

I can dig up the code and sanitise it and post it publicly if that would
help. It was using FR3 and an LDAP backend and dynamic clients. I remember
asking if dynamic realms would be possible but it turns out it’s just
easier having a static file and doing rolling updates of the proxies when a
new realm came onboard.
