1.1.7 rlm_detail locking is broken

Nicolas Baradakis nbk at sitadelle.com
Tue Dec 4 01:40:56 CET 2007


Brian De Wolf wrote:

> Here's the patch, as promised.

Many thanks.

> Now, instead of radrelay's previous behavior, radrelay will attempt to
> open and lock <detail>.work.  If it can, it will then read/transmit all
> of the entries and delete <detail>.work.  If it can't lock, it will
> simply try locking later.

The new algorith doesn't require a non-blocking lock in a loop. After the
file is renamed rlm_detail will not try to lock it anymore.

> If it doesn't exist, it will check if <detail> has entries and if so
> it will move <detail> to <detail>.work (and make a new <detail>).

rlm_detail will create the new detail file. I think the function
detail_move() is not needed anymore. Just use a normal rename().

> It seems to move faster, but that's because I removed some of the sleeps to 
> make sure writing had completed since that should be handled by the locks 
> now. Luckily, my relay destinations aren't production yet, so I can keep 
> testing with a decent load.

That's good news.

> --- src/main/radrelay.c.orig	2007-03-16 06:22:03.000000000 -0700
> +++ src/main/radrelay.c	2007-12-03 12:24:46.000000000 -0800
>
> [...]
>
> @@ -537,11 +514,13 @@
>  	if (rename(from, to) < 0)
>  		return -1;
>  
> +
>  	oldmask = umask(0);
>  	if ((n = open(from, O_CREAT|O_RDWR, st.st_mode)) >= 0)
>  		close(n);
>  	umask(oldmask);
>  
> +
>  	return 0;
>  }

This is only cosmetics.

> @@ -607,19 +586,47 @@
>  		if (got_sigterm) state = STATE_SHUTDOWN;
>  
>  		/*
> -		 *	Open detail file - if needed, and if we can.
> +		 *	Try to open our work file first.  If not, check if the
> +		 *	actual detail file has stuff in it.  If so, open it,
> +		 *	move it to the work file, and replace it with an empty
> +		 *	file.  Then keep the file handle for locking/reading.
>  		 */

Maybe re-use the comments from Alan in src/main/detail.c:

	/*
	 *	Open detail.work first, so we don't lose
	 *	accounting packets.  It's probably better to
	 *	duplicate them than to lose them.
	 *
	 *	Note that we're not writing to the file, but
	 *	we've got to open it for writing in order to
	 *	establish the lock, to prevent rlm_detail from
	 *	writing to it.
	 */
> +			if ((fp = fopen(work, "r+")) == NULL) {

		/*
		 *	Try reading the detail file.  If it
		 *	doesn't exist, we can't do anything.
		 *
		 *	Doing the stat will tell us if the file
		 *	exists, even if we don't have permissions
		 *	to read it.
		 */
> +				if(!stat(r_args->detail, &st)) {
> +					if(st.st_size > 0) {
> +						detail_move(r_args->detail, work);

As said above, I think you could just use rename() and then you don't
need to check st_size > 0.

> +						continue;

I have a question about this "continue": please check that radrelay will
not use 100% cpu when there is no detail file to read. (as many of the
sleep() are gone)

> +					}
> +				}
>  			}
>  
> +			/*
> +			 * Try to lock the detail-file.
> +			 * If lockf is used we want to lock the _whole_ file, hence the

Please no line of more than 80 characters in comments.

> +			 * fseek to the start of the file.
> +			 * 
> +			 * If we can't lock the detail file, we close it and
> +			 * will try it again on the next iteration.  Note that
> +			 * we will never unlock the file, on purpose, to stop
> +			 * rlm_detail from reading it.
> +			 */

The last part is confusing. rlm_detail must not try to acquire the lock
anymore after the file is renamed to detail.work.

> +			
> +			if(fp) {
> +				fseek(fp, 0L, SEEK_SET);
> +				do {
> +					x = rad_lockfd_nonblock(fileno(fp), 0);
> +					if (x == -1)
> +						ms_sleep(100);
> +				} while (x == -1 && i++ < 20);
> +
> +				if (x == -1) {
> +					fclose(fp);
> +					fp = NULL;
> +				} else {
> +					state = STATE_BACKLOG;
> +				}

No need to lock in a loop, so you can simplify that part. The lock is
only there to catch a corner case where the rename() happened while
the current rlm_detail was still writing in the file. (but new rlm_detail
execution will create and use a new file)

-- 
Nicolas Baradakis




More information about the Freeradius-Devel mailing list