NAS re-uses the same port and ID

Fajar A. Nugraha list at fajar.net
Wed Jul 13 11:34:37 CEST 2011


On Wed, Jul 13, 2011 at 4:14 PM, Konstantin Chekushin <koch2 at inbox.lv> wrote:
> Yes, but we just have got the problem, so, the source of the problem may be
> in other place...

There's an analogy I sometime use to explain things like this to my coworkers:

"You're driving a new car. Not far from the dealer, the car broke
down. You get out, and say 'hey, this car must be broken. Let's get
another car'.

... when in fact the problem is simply an empty gas tank"


> I've look through the sniffer file and found, this strange sequence:
>

Let's look at these one by one, shall we. I assume you got the log by
using grep or some capture filter to packet sniffer?

> ...
> 31:05 access-request (port 65025, id 229) (Authenticator1)
> 31:10 access-accept (port 65025, id 229)

As Alan said,  It's OK to re-use the same port & ID, *if* it's
received a response
from the RADIUS server. So based on that ...

> 31:10 access-request (port 65025, id 229) (Authenticator2)

... this request is valid, since the NAS should already got response
from radius. However ...

> 31:14 access-request (port 65025, id 229) (Authenticator2)
> 31:20 access-request (port 65025, id 229) (Authenticator2)

... those two are duplicate packets. The NAS resent those because the
radius server took too long to response. This is the main problem.

Now let's get to radius log

> Jul 12 14:31:10 radius1 radiusd[8647]: Discarding duplicate request from
> client fl2 port 65025 - ID: 229 due to unfinished request 6545
> Jul 12 14:31:16 radius1 radiusd[8647]: Discarding duplicate request from
> client fl2 port 65025 - ID: 229 due to unfinished request 6545
> Jul 12 14:31:21 radius1 radiusd[8647]: Received conflicting packet from
> client fl2 port 65025 - ID: 229 due to unfinished request 6545.  Giving up
> on old request.

The times match the duplicate request from your packet sniffer. So
bottom line, you need to find out what's causing radius to respond
slowly.

> Jul 12 14:31:22 radius1 radiusd[8647]: Dropping request (2049 is too many):
> from client fl1 port 65025 - ID: 229

this one is also an indication that something's wrong with your radius
server. It takes too long to respond to a request, thus it's unable to
accept new request.

Now let's take some time to read a snippet from radiusd.conf:

thread pool {
...

        #  You may find that the server is regularly reaching the
        #  'max_servers' number of threads, and that increasing
        #  'max_servers' doesn't seem to make much difference.
        #
        #  If this is the case, then the problem is MOST LIKELY that
        #  your back-end databases are taking too long to respond, and
        #  are preventing the server from responding in a timely manner.
        #
        #  The solution is NOT do keep increasing the 'max_servers'
        #  value, but instead to fix the underlying cause of the
        #  problem: slow database, or 'hostname_lookups=yes'.
        #
        #  For more information, see 'max_request_time', above.



That should be clear enough.

If you're using database backend, then most likely the database is
non-optimal (e.g. too many rows, non-optimum queries/index, slow disk,
and so on). Often the solution to this problem is to get a dba, cause
fixing it would require some skills and knowledge that only a dba has.

-- 
Fajar




More information about the Freeradius-Users mailing list