Robust Authentication Proxying

Sun Jul 12 16:06:26 CEST 2009

On Jul 12, 2009, at 1:58 AM, Alan DeKok wrote:

> Philip Molter wrote:
>
>> b) makes the post_proxy_fail_handler optional on a pool-by-pool basis
>
>  If the early "reject" is wrong, it might be best to just delete it.
> Sites with a small number of home servers will still run the
> post_proxy_fail_handler, just a little bit later than they do now.

I have left it in as a configurable option.  I would rather not have
someone upgrade their freeradius codebase with this patch and find that
the behavior they have come to rely on has changed.  Maybe you change
it in the future (heck, maybe you change it now ... it is your code),
but I did not want this patch to break anything for anyone.
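
To illustrate the idea, here is a minimal sketch in Python.  It is not
the attached patch: the `Pool` class, the `reject_on_no_response` flag
and the `handle_no_response` function are hypothetical names invented
for this example.  It only shows a per-pool switch whose default keeps
the existing behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    """Hypothetical home server pool; not a FreeRADIUS structure."""
    name: str
    # Defaults to True so that existing configurations keep the current
    # behavior (an Access-Reject when no home server answers).
    reject_on_no_response: bool = True


def handle_no_response(pool: Pool) -> Optional[str]:
    """Decide what the proxy sends to the NAS when no home server replied.

    Returns the packet type to send, or None to stay silent and let the
    NAS time out on its own.
    """
    if pool.reject_on_no_response:
        return "Access-Reject"
    return None


legacy = Pool("isp_pool")
silent = Pool("partner_pool", reject_on_no_response=False)
print(handle_no_response(legacy))   # Access-Reject
print(handle_no_response(silent))   # None
```

Because the flag defaults to the old behavior, a site that never sets it
sees no change after upgrading.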

>> Does that seem acceptable?  You seem hesitant to accept a solution
>> that you do not think could be used for more than a few people.  This
>> solution is going to be minimally invasive to the code.
>
>  It seems fine.

Patch is attached to this e-mail.  Please let me know if you would  
like it sent somewhere else or in some other format.

>> You make
>> it sound as if the NAS is doing something illegal by using a previous
>> cached accept.  It's not.
>
>  I said it's "outside of RADIUS".  It does not follow the RADIUS
> operational model, which is that the RADIUS server authenticates
> users.  If the NAS caches credentials, it's authenticating users via a
> non-RADIUS method.

Many NASes use a variety of different backends to authenticate users.   
Freeradius should not always assume that it is the only source for  
authentication and make decisions for the NAS.  Note I said "always."   
Freeradius *can* assume if told so (or inversely, it can be told not  
to assume, depending on which way you want the default to work).

>> The NAS can implement whatever logic it
>> wants, and that particular feature is one that leads to a better user
>> experience.  Just because you think a failure-to-contact is the same as
>> a denial does not mean that other vendors have not come up with
>> solutions that can work around it.
>
>  It's a "fail-safe" security practice.  The alternative is to take the
> RADIUS server down, and then what?  Does the NAS let everyone on the
> network?  What happens if their credentials have been cached, but they
> haven't paid their bills?

What happens indeed?  That is not for the RADIUS server to decide.   
That is for the NAS to decide.  The NAS ultimately decides what to do  
with the information it receives from the RADIUS server.  If the  
RADIUS server (or proxy in this case) sends back inaccurate  
information, the NAS cannot correctly make those decisions.  I  
actually have a use-case where if the authentication server is down,  
customers are allowed temporary access to our service with limits.  If  
I cannot keep my service up, I take on the liability of that, rather  
than going dark.
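
A minimal sketch of why the distinction matters on the NAS side,
assuming a hypothetical policy function (`nas_decision`, `Outcome` and
the `allow_degraded` flag are invented for this example and do not
describe any particular NAS).  The point is that "no answer" is a third
input, separate from an explicit accept or reject.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ACCEPT = "Access-Accept"
    REJECT = "Access-Reject"


def nas_decision(reply: Optional[Outcome], allow_degraded: bool) -> str:
    """Hypothetical NAS-side policy.

    reply is None when the request timed out with no answer at all.
    """
    if reply is Outcome.ACCEPT:
        return "full-access"
    if reply is Outcome.REJECT:
        # An explicit reject is authoritative: the user was denied.
        return "deny"
    # No answer: the NAS applies its own site-specific policy.
    return "limited-access" if allow_degraded else "deny"


print(nas_decision(Outcome.REJECT, allow_degraded=True))  # deny
print(nas_decision(None, allow_degraded=True))            # limited-access
print(nas_decision(None, allow_degraded=False))           # deny
```

If the proxy turns a timeout into an Access-Reject, the NAS only ever
sees the second branch and can never reach the third.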

>  The work-around you're talking about is *very* site-specific.  ISPs
> and telcos would go crazy if NAS vendors implemented it.  They could
> lose a lot of money...

ISPs and telcos are not the only organizations that use RADIUS  
servers!  There are many other service providers out there that  
authenticate customers that do not fit into an ISP/telco model.  Of  
course the workaround I am talking about is site-specific, because  
every site has different business logic.  And again, I am not saying  
that the NAS should NOT deny access to individuals.  I am not saying  
they SHOULD either.  All I am saying is that the RADIUS proxy should  
not assume that because it did not receive a response from its home  
server, that the NAS will reject the client.  The RADIUS server does  
not know what its reply is going to be used for, so it should send  
back as accurate a response as possible (in this case, none).

>> RFC 2607 is clear that the proxy should not respond to the client unless
>> it receives a reply from the home server.  At the very least, returning
>> a rejection is not an accurate portrayal of the state of the
>> authentication.  It would be a better representation to just let it
>> time out, but I understand returning the rejection so that the NAS can
>> short-circuit the transaction more quickly.
>
>  Not returning a response also makes the NAS think that the proxy is
> down, when it's not.  See the status-server draft for a discussion of
> this issue, and a solution:
>
> http://tools.ietf.org/internet-drafts/draft-ietf-radext-status-server

Yes, I have read it.  I wish more RADIUS servers and NASes implemented
it.  Still, the NAS has the ability to figure out whether the RADIUS
server is down through its own logic (whether that be through more
Access-Request packets or Status-Server packets or pings or whatever).
I would like the proxy not to return an Access-Reject just because the
proxy thinks the NAS is too dumb to figure out on its own whether or
not the proxy is actually down, just like Freeradius itself is not so
dumb as to assume that one no-response is tantamount to a home server
being down (that is why you have a zombie period after all).  However,
I also know that some people have dumb NASes and they want Freeradius  
to make that determination for them.  That is why I proposed this  
configuration method for the behavior.
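
The zombie-period idea mentioned above can be sketched roughly as
follows.  This is a simplified illustration, not the Freeradius
implementation: the `HomeServer` class, its method names and the
40-second value are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HomeServer:
    """Simplified, hypothetical liveness tracker for one home server."""
    zombie_period: float = 40.0   # seconds; illustrative value
    state: str = "alive"
    zombie_since: Optional[float] = None

    def on_timeout(self, now: float) -> None:
        """A request to this server went unanswered."""
        if self.state == "alive":
            # One missed reply only makes the server suspect.
            self.state = "zombie"
            self.zombie_since = now
        elif (self.state == "zombie"
              and now - self.zombie_since >= self.zombie_period):
            # Still silent after the whole zombie period: give up on it.
            self.state = "dead"

    def on_reply(self) -> None:
        """Any reply proves the server is reachable again."""
        self.state = "alive"
        self.zombie_since = None


hs = HomeServer(zombie_period=40.0)
hs.on_timeout(now=0.0)
print(hs.state)   # zombie
hs.on_timeout(now=10.0)
print(hs.state)   # zombie
hs.on_timeout(now=45.0)
print(hs.state)   # dead
```

A single lost packet moves the server to "zombie" rather than "dead",
which is the same benefit of the doubt being asked for on the NAS side.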

>  Yes, I'm not just a random opinionated guy on the net.  I have 4-5
> RADIUS specifications either published, or on track to be published.

And I am not just some guy who started using RADIUS yesterday.  I have  
run a proxy farm for years that backends to thousands of home servers  
maintained by hundreds of different organizations around the world.  I  
do not fit into the telco/ISP model of RADIUS usage, but I am not  
doing anything fancy with RADIUS.  I just have a different resiliency  
model than an ISP or telco.  I am another use-case for your code.  I  
would very much like to switch the entire farm over to Freeradius  
because I think it is a very well implemented product.  I just need it
to support -- in addition to its current features, not in replacement
of any features -- some more flexible handling of various
proxy-related events.  When you process thousands of authentications a
second over links that range from sub-ms to over 300ms, against  
servers that go up and down for maintenance or network outage or just  
plain packet loss, you require a bit more flexibility than is required  
by an ISP/telco model.

Patch attached.  Let me know what you think.

Philip

-------------- next part --------------
A non-text attachment was scrubbed...
Name: freeradius-proxy-no-response.patch
Type: application/octet-stream
Size: 3211 bytes
Desc: not available
URL: <http://lists.freeradius.org/pipermail/freeradius-users/attachments/20090712/5a95a0a6/attachment.obj>