3.0.2: rlm_sql_null duplicating its statements

Arran Cudbard-Bell a.cudbardb at freeradius.org
Fri Apr 11 16:42:34 CEST 2014


On 11 Apr 2014, at 07:24, Alan DeKok <aland at deployingradius.com> wrote:

> Stefan Winter wrote:
>> Whoa. Why would that be a configuration error?
> 
>  Because we can't guarantee that the writes occur properly.
> 
>> I have multiple instances
>> of rlm_sql_null, all writing different SQL statements, and all writing
>> into the same file for forwarding to the SQL server by radsqlrelay.
> 
>  Which doesn't work with POSIX locks, as you pointed out.  It might
> work accidentally, but different threads writing to the same file will
> cause problems.  One thread closing a file descriptor will cause another
> thread to magically lose its lock.

See, this is why I wasn't quite sure why you did this on a per-module basis.
My suggestion would be to use a global hash of filenames -> FDs, which is
shared between all things that write to log files on the server.

That's sort of the point of a separate FD API, to add a layer on top of the
broken POSIX locking to reflect lock states back into things within the 
same process space. 
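The lock-on-close behaviour is easy to demonstrate. A minimal sketch (the helper names are mine, not FreeRADIUS code): the process takes a whole-file write lock through one descriptor, then closes a *second*, never-locked descriptor on the same file, and a forked child checks whether the lock is still held.

```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child which tries to take a non-blocking write lock on path.
 * Returns 1 if the child got the lock, 0 if it was blocked by us. */
static int child_can_lock(const char *path)
{
	pid_t pid = fork();

	if (pid == 0) {
		int fd = open(path, O_RDWR);
		struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

		_exit(fcntl(fd, F_SETLK, &fl) == 0 ? 0 : 1);
	}

	int status;
	waitpid(pid, &status, 0);
	return WEXITSTATUS(status) == 0;
}

/* Returns 1 if closing a second, never-locked descriptor on the same
 * file silently dropped the lock we took through the first one. */
int lock_lost_on_close(const char *path)
{
	int fd1 = open(path, O_RDWR | O_CREAT, 0600);
	int fd2 = open(path, O_RDWR);	/* second FD, never used for locking */

	struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
	fcntl(fd1, F_SETLK, &fl);	/* whole-file write lock via fd1 */

	int before = child_can_lock(path);	/* expect 0: we hold the lock */
	close(fd2);	/* POSIX drops ALL our locks on the file here */
	int after = child_can_lock(path);	/* expect 1: our lock is gone */

	close(fd1);
	unlink(path);
	return !before && after;
}
```

On a POSIX system `lock_lost_on_close()` returns 1: the close of `fd2`, which was never used for locking, releases the lock taken through `fd1`.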

My other suggestion, to use multiple FDs, would also help alleviate some of
the contention.

Assuming the hash table is already initialised:

typedef struct fr_pool_t fr_pool_t;

typedef struct fr_fd_t {
	fr_pool_t	*parent;	//!< Pool this FD belongs to
	int		fd;
	bool		in_use;		//!< Whether this FD is currently handed out
	struct fr_fd_t	*next;
} fr_fd_t;

struct fr_pool_t {
	int	in_use;		//!< How many file descriptors are in use
	int	in_pool;	//!< How many file descriptors are open for this file
	off_t	offset;		//!< Next file offset to hand out for range locking

	mutex	mutex;		//!< Protect access to the pool.
	fr_fd_t	*fd;		//!< Head of FD list
};

get_fd(filename, bytes_to_lock)
{
	lock(hash)
	pool = find(filename)
	if (!pool) {
		/* create a new pool and insert it into the hash */
	}
	lock(pool)
	unlock(hash)			/* keep the global critical region short */

	pool->in_use++			/* We don't want the pool removed whilst we're using it */

	/* Find the first FD that's not in use */
	for (fd = pool->fd; fd->next && fd->in_use; fd = fd->next);

	if (fd->in_use) fd = new_fd(pool)	/* add a new FD to the pool for this file (maybe limit this to n FDs) */

	lseek(fd, pool->offset, SEEK_SET)	/* seek to the start of the region we're about to lock */
	fcntl(...)			/* add range lock */
	fd->in_use = true

	pool->offset += bytes_to_lock	/* Set new offset for the next writer */

	unlock(pool)

	return fd
}

release_fd(fd)
{
	pool = fd->parent
	lock(pool)
	fcntl(...)			/* release range lock */
	fd->in_use = false
	pool->in_use--
	unlock(pool)
}
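In compilable form, the pool bookkeeping part of that pseudocode might look something like this. It's only a sketch: the real `open()`, `lseek()` and `fcntl()` range-locking calls are elided, and `pool_create()` and the struct names are my own, not actual FreeRADIUS APIs.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct fr_pool fr_pool_t;

typedef struct fr_fd {
	fr_pool_t	*parent;	/* pool this FD belongs to */
	int		fd;		/* real descriptor; unused (-1) in this sketch */
	bool		in_use;
	struct fr_fd	*next;
} fr_fd_t;

struct fr_pool {
	pthread_mutex_t	mutex;		/* protects everything below */
	fr_fd_t		*head;		/* head of the FD list */
	off_t		offset;		/* next offset to hand out for range locking */
	int		in_use;		/* how many FDs are currently handed out */
};

fr_pool_t *pool_create(void)
{
	fr_pool_t *pool = calloc(1, sizeof(*pool));
	pthread_mutex_init(&pool->mutex, NULL);
	return pool;
}

/* Grab a free FD from the pool, growing the pool if every FD is busy,
 * and reserve bytes_to_lock bytes starting at the current offset. */
fr_fd_t *get_fd(fr_pool_t *pool, size_t bytes_to_lock)
{
	fr_fd_t *fd;

	pthread_mutex_lock(&pool->mutex);

	for (fd = pool->head; fd && fd->in_use; fd = fd->next);

	if (!fd) {			/* every FD busy: add another */
		fd = calloc(1, sizeof(*fd));
		fd->parent = pool;
		fd->fd = -1;		/* a real version would open() here */
		fd->next = pool->head;
		pool->head = fd;
	}

	fd->in_use = true;
	pool->in_use++;

	/* A real version would lseek(fd->fd, pool->offset, SEEK_SET) and
	 * fcntl() a range lock over bytes_to_lock bytes here. */
	pool->offset += bytes_to_lock;

	pthread_mutex_unlock(&pool->mutex);
	return fd;
}

void release_fd(fr_fd_t *fd)
{
	fr_pool_t *pool = fd->parent;

	pthread_mutex_lock(&pool->mutex);
	/* A real version would fcntl() release the range lock first. */
	fd->in_use = false;
	pool->in_use--;
	pthread_mutex_unlock(&pool->mutex);
}
```

A released FD gets reused by the next `get_fd()` call rather than a new one being allocated, so the pool only grows under genuine contention.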

You can just use one mutex, but two would probably be better. You really
want to limit the critical region for the one global mutex, and locking
another mutex is probably quicker than doing a bunch of system calls.

There's also the issue of what you do if there are no more FDs available.
You probably don't want to drop the message... Maybe the max number of FDs
should equal the max number of threads + 1.
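If the pool is capped, one way to avoid dropping messages is to block the caller on a condition variable until an FD is released. A hypothetical sketch (the `FD_MAX` value and all the names here are assumptions, not anything in the tree):

```c
#include <pthread.h>

/* Hypothetical cap on pool size: when every FD is handed out, block the
 * caller until one is released instead of dropping the message. */
#define FD_MAX 8

typedef struct {
	pthread_mutex_t	mutex;
	pthread_cond_t	fd_released;
	int		in_use;		/* FDs currently handed out */
} capped_pool_t;

int capped_acquire(capped_pool_t *pool)
{
	pthread_mutex_lock(&pool->mutex);
	while (pool->in_use >= FD_MAX) {	/* all busy: wait, don't drop */
		pthread_cond_wait(&pool->fd_released, &pool->mutex);
	}
	int slot = pool->in_use++;
	pthread_mutex_unlock(&pool->mutex);
	return slot;
}

void capped_release(capped_pool_t *pool)
{
	pthread_mutex_lock(&pool->mutex);
	pool->in_use--;
	pthread_cond_signal(&pool->fd_released);	/* wake one waiter */
	pthread_mutex_unlock(&pool->mutex);
}
```

With the cap set to threads + 1 as suggested, the wait can never deadlock: each thread holds at most one FD, so a waiter is always woken eventually.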

You can't close them whilst something is using any of the FDs in the pool.
So this has to be managed carefully.

Arran Cudbard-Bell <a.cudbardb at freeradius.org>
FreeRADIUS Development Team

FD31 3077 42EC 7FCD 32FE 5EE2 56CF 27F9 30A8 CAA2
