LDAP timeouts during failure conditions

Thu Jun 23 19:45:19 CEST 2011

If you do build a generic connection pool system, please allow for a maximum number of uses per connection before a reconnection is forced.

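My main reason for wanting this is that after a failure of, say, an LDAP server, the surviving servers keep all the connections, and it can take some unnecessary work to get a more balanced usage of the servers behind load balancers again. If connections are recycled after a fixed number of uses, they get spread back across the servers on their own.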
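
To make the request concrete, here is a minimal sketch of what I mean by "max uses". None of these names exist in FreeRADIUS; pool_conn_t, pool_cfg_t, pool_conn_get and the demo_* stubs are all made up for illustration, and the connect/disconnect callbacks stand in for whatever the real module would do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pooled connection; not a FreeRADIUS structure. */
typedef struct {
    void     *handle;    /* opaque protocol handle, e.g. an LDAP session */
    unsigned  num_uses;  /* requests served since the last (re)connect */
    bool      open;      /* true while handle is believed usable */
} pool_conn_t;

/* Hypothetical pool settings. */
typedef struct {
    unsigned max_uses;                  /* 0 means "never recycle" */
    void  *(*connect)(void *ctx);       /* returns NULL on failure */
    void   (*disconnect)(void *handle);
    void    *ctx;                       /* passed through to connect() */
} pool_cfg_t;

/*
 * Return a usable handle. If the connection is closed, or has already
 * served max_uses requests, (re)connect first so that a load balancer
 * gets the chance to pick a different backend.
 * Returns NULL if the connect attempt fails.
 */
static void *pool_conn_get(pool_conn_t *c, const pool_cfg_t *cfg)
{
    if (c->open && cfg->max_uses != 0 && c->num_uses >= cfg->max_uses) {
        cfg->disconnect(c->handle);
        c->open = false;
    }

    if (!c->open) {
        c->handle = cfg->connect(cfg->ctx);
        if (c->handle == NULL) return NULL;
        c->open = true;
        c->num_uses = 0;
    }

    c->num_uses++;
    return c->handle;
}

/* Stub callbacks for demonstration only: they just count calls. */
static int demo_connects, demo_disconnects;

static void *demo_connect(void *ctx)
{
    (void)ctx;
    demo_connects++;
    return &demo_connects;  /* any non-NULL pointer serves as a handle */
}

static void demo_disconnect(void *handle)
{
    (void)handle;
    demo_disconnects++;
}
```

With max_uses set to 2, the third call to pool_conn_get() closes the old connection and opens a new one, which is the point at which a load balancer can rebalance.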

On 23 Jun 2011, at 17:28, Alan DeKok <aland at deployingradius.com> wrote:

> Phil Mayers wrote:
>> So, some discussion on the JANET-ROAMING list leads me to believe that,
>> during an "ldap server down" condition, rlm_ldap will incur
>> "net_timeout" on every (or many) passes through the module.
> 
>  It's better for the module to track when connections are down, and
> return quickly if all are down.
> 
>> I don't really understand the MAX_FAILED_* logic at the start of
>> perform_search, but it seems to conflict with the comments at the top of
>> the file:
>> 
>> * If conn->failed_conns > MAX_FAILED_CONNS_START then we don't
>> * try to do anything and we just do conn->failed_conns++ and
>> * return RLM_MODULE_FAIL
> 
>  Yeah...
> 
>> ...perform_search has no such logic; in any event, it seems like it
>> would be better to do an optional time-based per-server "fast fail" so
>> that:
>> 
>> redundant {
>>  ldap1
>>  ldap2
>> }
>> 
>> ...fails quickly if ldap1 is down.
> 
>  Sure.  That should be easy to do.
> 
>> In some ways it's a shame we can't use a worker thread to manage the
>> LDAP connection(s); that way, the module could be marked "fast fail"
>> unless and until a live connection exists. Is there any scope for that?
> 
>  I'd really like 3.0 to have generic connection pools.  That would
> solve this problem by having common code, instead of stuff in rlm_sql,
> rlm_ldap, etc.
> 
>  Alan DeKok.
> -
> List info/subscribe/unsubscribe? See http://www.freeradius.org/list/devel.html
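
For the time-based per-server "fast fail" discussed in the quoted text, something along these lines might be enough. This is only a sketch under my own assumptions: server_state_t, server_fast_fail and server_mark are hypothetical names, not the existing rlm_ldap code, and retry_delay would presumably become a config item.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-server state; not the actual rlm_ldap structures. */
typedef struct {
    time_t down_until;   /* 0 = believed up; otherwise fast-fail until then */
    time_t retry_delay;  /* seconds to keep fast-failing after a failure */
} server_state_t;

/* True if the caller should fail at once without touching the network. */
static bool server_fast_fail(const server_state_t *s, time_t now)
{
    return s->down_until != 0 && now < s->down_until;
}

/* Record the outcome of a real connection or search attempt. */
static void server_mark(server_state_t *s, bool ok, time_t now)
{
    s->down_until = ok ? 0 : now + s->retry_delay;
}
```

The module would call server_fast_fail() on entry and return a failure code straight away when it is true, so a redundant block moves on to ldap2 without waiting for net_timeout on ldap1. Once retry_delay has passed, the next request makes a real attempt and server_mark() records the result.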