request_dequeue problems (recent 3.0, when home-server stalls)

Brian Julin BJulin at clarku.edu
Thu Sep 20 16:19:47 CEST 2012


I had some more time to play with this; it seems to be related to retiring
old threads, not actual problem on the home server.  Some new observations below.

> Alan DeKok wrote on Aug28, 2012:

> Brian Julin wrote:
> > I'm currently hunting a problem that causes a recent checkout of FR3.0
> > to abort but which does not seem to be affecting an older revision
> > (April 8th or so)
> > of FR3.0 on another box.  I do have a couple small in-house patches applied
> > but they should probably not be relevant.
>  It's always worth checking.

OK, I've tested with a recent clean clone and it still has the problem.

> > request_dequeue found request 0x13d5220
> > ASSERT FAILED threads.c[468]: request->magic == REQUEST_MAGIC
>  Which means that the request has been deleted, but is still in the
> queue.  That's not supposed to happen...

After collecting some more debug logs, I noticed that this problem was happening too often
on requests numbered around 260 to be a coincidence.  It turns out this happens after
a thread is marked for recycling due to having handled over 128 requests.  I'm no longer positive
that the home server is not responding, though it could have similar limits set up on the
RADSEC connection so we cannot completely eliminate that as a possibility.  It seems
a lot more likely, though, that something is getting wedged up locally and the request to
the home server never actually hits the wire.

The server will ride over the thread recycling if it is handling requests that do not need
to be proxied at the moment when the threads are marked; it only has trouble in the middle
of conversations.

Below is a debug log with some extra radlogs thrown in by hand.  The "Reaping" lines
happen in the loop that tests whether threads have handled so many connections that
they should be retired.  Is it normal for a thread to grab and handle requests after it has
been marked for recycling?

We start in the middle of an ongoing EAP conversation.

Sending Access-Challenge of id 96 to 127.0.0.1 port 34912
[request dump omitted]
(259) Finished request 259.
Thread 2 waiting to be assigned a request
Waking up in 0.3 seconds.
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=242, length=273
Deleting thread 3
Deleting thread 4
Reaping...
Reaping thread 1 due to number of requests handled (130/128 handled) 
Reaping...
Reaping thread 2 due to number of requests handled (131/128 handled) 
Waking up in 0.2 seconds.
Thread 2 got semaphore
Thread 2 handling request 260, (132 handled so far)
[request dump omitted]
Thread 1 got semaphore
Thread 1 exiting...
[unlang trace omitted]
Sending Access-Request of id 238 to XXXXXXXXXXX port 2083
[request dump omitted]
Thread 2 exiting...
Waking up in 0.3 seconds.
Waking up in 0.4 seconds.
Waking up in 0.7 seconds.
Waking up in 1.1 seconds.
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=242, length=273
Discarding duplicate request from client EDUROAM_OUT port 34912 - ID: 242 due to unfinished request 260
Waking up in 0.9 seconds.
Waking up in 1.1 seconds.
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=242, length=273
Discarding duplicate request from client EDUROAM_OUT port 34912 - ID: 242 due to unfinished request 260
(256) Cleaning up request packet ID 213 with timestamp +11084
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=242, length=273
Discarding duplicate request from client EDUROAM_OUT port 34912 - ID: 242 due to unfinished request 260
(256) Cleaning up request packet ID 213 with timestamp +11084
Waking up in 0.2 seconds.
(257) Cleaning up request packet ID 10 with timestamp +11085
Waking up in 0.2 seconds.
(258) Cleaning up request packet ID 151 with timestamp +11085
Waking up in 0.2 seconds.
(259) Cleaning up request packet ID 96 with timestamp +11085
Waking up in 2.2 seconds.
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=242, length=273
Discarding duplicate request from client EDUROAM_OUT port 34912 - ID: 242 due to unfinished request 260
Waking up in 1.1 seconds.
Waking up in 3.7 seconds.
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=242, length=273
Discarding duplicate request from client EDUROAM_OUT port 34912 - ID: 242 due to unfinished request 260
Waking up in 2.9 seconds.
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=242, length=273
Discarding duplicate request from client EDUROAM_OUT port 34912 - ID: 242 due to unfinished request 260
Waking up in 0.9 seconds.
Waking up in 5.6 seconds.
Waking up in 8.5 seconds.
Waking up in 12.8 seconds.
Waking up in 19.2 seconds.
Waking up in 28.8 seconds.
(260) <queue> : Cleaning up request packet ID 242 with timestamp +11086
Ready to process requests.
rad_recv: Access-Request packet from host 127.0.0.1 port 34912, id=104, length=272
Threads: total/active/spare threads = 2/0/2
Threads: Spawning 2 spares
Thread spawned new child 5. Total threads in pool: 3
Thread spawned new child 6. Total threads in pool: 4
Waking up in 0.3 seconds.
Thread 5 waiting to be assigned a request
Thread 5 got semaphore
ASSERT FAILED threads.c[461]: request->magic == REQUEST_MAGIC
Thread 6 waiting to be assigned a request
Thread 6 got semaphore



More information about the Freeradius-Users mailing list