LDAP timeouts during failure conditions
Phil Mayers
p.mayers at imperial.ac.uk
Tue Jun 28 22:11:58 CEST 2011
On 06/28/2011 08:01 PM, Alexander Clouter wrote:
> Phil Mayers<p.mayers at imperial.ac.uk> wrote:
>>
>>> I'd really like 3.0 to have generic connection pools. That would
>>> solve this problem by having common code, instead of stuff in
>>> rlm_sql, rlm_ldap, etc.
>>
>> Do you have any pointers how to get started on this? Off the top of my
>> head it seems we'd need something like the code below; a struct to hold
>> module-supplied connection create/keepalive/delete functions, some code
>> in the server core to set and re-set "last used" times and call a
>> keepalive function, and delete
>>
> I probably would not bother with keep alive (or 'last used'). I would
The idea of the keepalive was not to "hold the connection open" at the
TCP layer. It's to detect dead server(s) in a timely fashion, i.e.
hopefully _before_ someone tries to run a RADIUS packet through the module.
But now that I think about it, it won't work as I envisaged. The "open"
and "keepalive" call on a connection may block, so can't be run in the
main event loop - it needs to be in a thread. You don't want to just
connect on-demand, since the whole point is that a pool with
connections==0 will fast-fail and let "redundant {}" do its work.
It needs a worker thread, or a proper async LDAP API, which libldap
doesn't have sadly :o(
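A minimal sketch of the pool described above, in Python rather than the server's C, purely to illustrate the shape. The class name, the `create`/`keepalive`/`delete` callbacks and the `check_once` method are all hypothetical, not FreeRADIUS APIs. The assumption is that every blocking call happens in a worker thread, and that `acquire()` never connects on demand, so an empty pool fails immediately.

```python
import threading


class PoolEmpty(Exception):
    """Raised immediately when the pool holds no live connections."""


class ConnectionPool:
    """Hypothetical pool: a worker thread makes all blocking open/keepalive calls."""

    def __init__(self, create, keepalive, delete, size=2, interval=1.0):
        # create/keepalive/delete are module-supplied callbacks (illustrative names).
        self._create, self._keepalive, self._delete = create, keepalive, delete
        self._size, self._interval = size, interval
        self._idle = []
        self._live = 0                      # idle + checked-out connections
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._maintain, daemon=True)

    def start(self):
        self._worker.start()

    def stop(self):
        self._stop.set()
        self._worker.join()

    def acquire(self):
        # Never connects on demand: an empty pool fails fast so the caller
        # (e.g. a "redundant {}" block) can move on to the next module.
        with self._lock:
            if not self._idle:
                raise PoolEmpty("no live connections")
            return self._idle.pop()

    def release(self, conn):
        with self._lock:
            self._idle.append(conn)

    def check_once(self):
        # Probe idle connections one at a time, outside the lock, because
        # the keepalive callback may block.
        with self._lock:
            pending = len(self._idle)
        for _ in range(pending):
            with self._lock:
                if not self._idle:
                    break
                conn = self._idle.pop(0)
            if self._keepalive(conn):
                self.release(conn)
            else:
                self._delete(conn)
                with self._lock:
                    self._live -= 1
        # Top the pool back up; a failed create just leaves the pool short.
        while True:
            with self._lock:
                if self._live >= self._size:
                    break
            try:
                conn = self._create()
            except OSError:
                break
            with self._lock:
                self._live += 1
                self._idle.append(conn)

    def _maintain(self):
        # Worker thread: the only place where blocking calls are made.
        while not self._stop.is_set():
            self.check_once()
            self._stop.wait(self._interval)
```

With the worker running, a dead server drains the pool within roughly one `interval`, after which `acquire()` raises `PoolEmpty` straight away instead of blocking a request thread.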
> imagine in practice your 'idle' time should be shorter than any NAT or
> server daemon concept of idleness? What I'm trying to say is the cost
> of an open idle link is low, tearing it down and rebuilding it is...if
> the connection has been idle for a long time the server (or
> NAT/firewall) would have probably killed it.
>
> If you really want keep alives, it probably would be better to go for
> SO_KEEPALIVE (as NOOP as you can get)? No doubt this would have to be
> done in the driver rather than the layer you are constructing?
Maybe. Depends if the driver layer offers you any ability to tweak
underlying TCP parameters. But as I say, I'm not concerned about TCP
connections; I'm concerned about detecting dead servers.
And SO_KEEPALIVE is way, way too slow to detect dead servers.
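To put a rough number on "too slow", here is a small sketch assuming the driver exposes the underlying socket at all. `enable_keepalive` and `worst_case_detection` are hypothetical helper names; the `TCP_KEEP*` options are platform-specific (available on Linux) and are skipped where the platform does not define them.

```python
import socket


def enable_keepalive(sock, idle=10, interval=5, count=3):
    """Turn on TCP keepalive and tighten the timers where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer options are platform-specific; without them the
    # system-wide defaults apply.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


def worst_case_detection(idle, interval, count):
    """Seconds of silence before keepalive declares the peer dead."""
    return idle + interval * count
```

With the usual Linux defaults (7200 s idle, 75 s between probes, 9 probes), `worst_case_detection(7200, 75, 9)` comes to 7875 seconds, i.e. over two hours before a dead peer is noticed unless the timers are tuned per socket.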
>
> As a passing note, I suspect you do not care for async LDAP queries?
On the contrary, I really like async LDAP queries, and async/event
driven architectures in general. My preferred network programming
framework is Python Twisted, which is completely non-blocking and
event/callback driven.
Problem is, async LDAP queries are a little (!) more work to implement
in a threadpool-based server like FreeRADIUS:
* put LDAP query params into a struct
* call the query function; this allocates a semaphore, locks a queue, puts
  the query & semaphore into the queue, unlocks, then waits on the semaphore
* a separate worker thread/threadpool continually locks the queue, pulls
  requests off and unlocks it, issues them & stores the LDAP msgid, then polls
  in a loop over ldap_result(); as each message comes back, it finds the
  corresponding query, copies a pointer to the result and flags the semaphore
Basically you have to run another thread/pool AFAICT. And I was hoping to
avoid that, as it seems to me likely to risk stomping all over
FreeRADIUS's carefully crafted and tested internals.
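The steps above can be sketched roughly as follows, in Python for brevity. Everything here is illustrative: `Query`, `AsyncQueryWorker` and `FakeDirectory` are made-up names, and `FakeDirectory` merely stands in for an async client whose `send()` returns a message id and whose `result()` polls for the next reply (the role `ldap_result()` plays in the description). Python's `queue.Queue` does its own locking, so the explicit lock/unlock steps are hidden inside it.

```python
import itertools
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class Query:
    """Query parameters plus the semaphore the caller sleeps on."""
    params: dict
    done: threading.Semaphore = field(default_factory=lambda: threading.Semaphore(0))
    result: object = None


class FakeDirectory:
    """Stand-in backend: send() returns a msgid, result() polls for a reply."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._replies = queue.Queue()

    def send(self, params):
        msgid = next(self._ids)
        self._replies.put((msgid, {"echo": params}))
        return msgid

    def result(self, timeout):
        try:
            return self._replies.get(timeout=timeout)
        except queue.Empty:
            return None


class AsyncQueryWorker:
    def __init__(self, backend):
        self._backend = backend
        self._requests = queue.Queue()
        self._pending = {}               # msgid -> Query, touched only by the worker
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def query(self, params, timeout=5.0):
        # Called from a request thread: enqueue, then block on the semaphore.
        q = Query(params)
        self._requests.put(q)
        if not q.done.acquire(timeout=timeout):
            raise TimeoutError("no reply from worker")
        return q.result

    def close(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            # Issue everything that is waiting and remember each msgid.
            while True:
                try:
                    q = self._requests.get_nowait()
                except queue.Empty:
                    break
                self._pending[self._backend.send(q.params)] = q
            # Poll for one reply, then wake whichever caller it belongs to.
            reply = self._backend.result(timeout=0.05)
            if reply is not None:
                msgid, data = reply
                q = self._pending.pop(msgid, None)
                if q is not None:
                    q.result = data
                    q.done.release()
```

A request thread would simply call `worker.query({...})` and block until its own reply arrives. Note that the short poll timeout in `_run` adds a little latency when the worker is idle, which is one of the costs of bolting this onto a blocking-style API.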
It's a real shame that libldap doesn't offer a better way to integrate
the open LDAP TCP socket into a select()-based loop.
Some projects go the whole hog and fork() a child process to do their
LDAP queries e.g. sssd, but this is just tedious; you have to marshal
the LDAP query and results across process boundaries.
> It's probably the only database FreeRADIUS supports that supports this
> anyway so probably not worth thinking about.
Postgres, at least, can be used in async mode. And it can cooperate with
select().
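As a rough illustration of what "cooperating with select()" buys you, here is a sketch of one pass of a select()-driven loop. `poll_once` and the handler mapping are hypothetical, and a plain socket stands in for the database connection. With libpq the descriptor would come from `PQsocket()`, and the reply would be drained with `PQconsumeInput()` and `PQisBusy()` rather than a bare `recv()`.

```python
import select
import socket


def poll_once(handlers, timeout=0.0):
    """Run the callback of every socket with data ready; return how many fired.

    handlers maps a socket to a callable taking the received bytes.
    """
    if not handlers:
        return 0
    readable, _, _ = select.select(list(handlers), [], [], timeout)
    for sock in readable:
        handlers[sock](sock.recv(4096))
    return len(readable)
```

The point is that the same loop can watch RADIUS sockets and database sockets together, so no extra thread is needed just to wait for a reply.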