Robust proxy accounting

Ivan Kalik tnt at kalik.net
Thu Jun 11 12:55:59 CEST 2009


> When both RADIUS servers are operational robust proxying works. When one
> of the servers is
> unreachable the other server (that's attempting to proxy an accounting
> request) will delete
> the detail.work file from the listener's sub-directory after failing to
> get a response from
> the other server.

Let us see.

> rad_recv: Accounting-Request packet from host 10.12.80.109 port 32769,
> id=123, length=157
...
> [detail]        expand:
> /usr/local/var/log/radius/radacct/%Y-%m-%d/accounting-detail-%H:00 ->
> /usr/local/var/log/radius/radacct/2009-06-11/accounting-detail-08:00
> [detail]
> /usr/local/var/log/radius/radacct/%Y-%m-%d/accounting-detail-%H:00 expands
> to /usr/local/var/log/radius/radacct/2009-06-11/accounting-detail-08:00
> [detail]        expand: %{Packet-Src-IP-Address} - %t -> 10.12.80.109 -
> Thu Jun 11 08:39:20 2009
> ++[detail] returns ok

... writes to accounting detail file ...

> [attr_filter.accounting_response]       expand: %{User-Name} -> isschrpg
>  attr_filter: Matched entry DEFAULT at line 1
> ++[attr_filter.accounting_response] returns updated
>  server home.example.com {
>  }
> Sending Accounting-Request of id 66 to 129.11.162.17 port 1813
...
> Proxying request 10 to home server 129.11.162.17 port 1813 Sending
> Accounting-Request of id 66 to 129.11.162.17 port 1813

... and sends proxy.

> Polling for detail file
> /usr/local/var/log/radius/radacct/detail.example.com/detail-*:*
> Waking up in 0.4 seconds.
> Polling
> for detail file
> /usr/local/var/log/radius/radacct/detail.example.com/detail-*:*

Looks in backup detail file a few times.

> rad_recv: Accounting-Request packet from host 10.12.80.109 port 32769,
> id=123, length=157 Discarding duplicate request from client wism07-1 port
> 32769 - ID: 123 due to unfinished request 10 Waking up in 0.1 seconds.

This is NAS repeating request. But we are still waiting for home server to
respond.

> Polling for detail file
> /usr/local/var/log/radius/radacct/detail.example.com/detail-*:*
> Waking up in 0.9 seconds.
> Polling for detail file
> /usr/local/var/log/radius/radacct/detail.example.com/detail-*:*
> Waking up in 1.1 seconds.
> rad_recv: Accounting-Request packet from host 10.12.80.109 port 32769,
> id=123, length=157 Discarding duplicate request from client wism07-1 port
> 32769 - ID: 123 due to unfinished request 10 Waking up in 0.3 seconds.
> Polling for detail file
> /usr/local/var/log/radius/radacct/detail.example.com/detail-*:*
> Waking up in 0.9 seconds.
> Polling for detail file
> /usr/local/var/log/radius/radacct/detail.example.com/detail-*:*
> Waking up in 0.6 seconds.
> Rejecting request 10 due to lack of any response from home server
> 129.11.162.17 port 1813

Waited enough. Go to failover.

>   Found Post-Proxy-Type
>  server home.example.com {
> +- entering group Fail {...}
> [detail.example.com]    expand:
> /usr/local/var/log/radius/radacct/detail.example.com/detail-%Y%m%d:%H ->
> /usr/local/var/log/radius/radacct/detail.example.com/detail-20090611:08
> [detail.example.com]
> /usr/local/var/log/radius/radacct/detail.example.com/detail-%Y%m%d:%H
> expands to
> /usr/local/var/log/radius/radacct/detail.example.com/detail-20090611:08
> [detail.example.com]    expand: %t -> Thu Jun 11 08:39:20 2009
> [detail.example.com] Freeradius-Proxied-To = 129.11.162.17
> ++[detail.example.com] returns ok
>  }

Writing packet to failover detail ...

> Sending Accounting-Response of id 123 to 10.12.80.109 port 32769

... and sending response to the NAS ...

> Finished
> request 10.
> Cleaning up request 10 ID 123 with timestamp +9

... removing request from the list (that might be a problem) ...

> Going to the next request
> PROXY: Marking home server 129.11.162.17 port 1813 as zombie (it looks
> like it is dead).

... marking home server as dead ...

> Sending Status-Server of id 19 to 129.11.162.17 port 1813
>         Message-Authenticator := 0x00000000000000000000000000000000
>         NAS-Identifier := "Status Check. Are you alive?"

... and checking if it works.

> Waking up in 0.3 seconds.

Done, but ...

> rad_recv: Accounting-Request packet from host 10.12.80.109 port 32769,
> id=123, length=157

... your NAS didn't get the response. Since packet (id 123) was removed
from the list server doesn't know it is the same one. So off we go again:

> [detail]        expand:
> /usr/local/var/log/radius/radacct/%Y-%m-%d/accounting-detail-%H:00 ->
> /usr/local/var/log/radius/radacct/2009-06-11/accounting-detail-08:00
> [detail]
> /usr/local/var/log/radius/radacct/%Y-%m-%d/accounting-detail-%H:00 expands
> to /usr/local/var/log/radius/radacct/2009-06-11/accounting-detail-08:00
> [detail]        expand: %{Packet-Src-IP-Address} - %t -> 10.12.80.109 -
> Thu Jun 11 08:39:26 2009
> ++[detail] returns ok

Write to accounting detail ...

> Sending Accounting-Request of id 23 to 129.11.162.17 port 1813
...
> Proxying request 12 to home server 129.11.162.17 port 1813 Sending
> Accounting-Request of id 23 to 129.11.162.17 port 1813

... try to proxy (I am not sure why did this happen - server is suposed to
be marked dead; but it is still treated as alive) ...

> Polling for detail file
> /usr/local/var/log/radius/radacct/detail.example.com/detail-*:*
> detail_recv: Renaming
> /usr/local/var/log/radius/radacct/detail.example.com/detail-20090611:08 ->
> /usr/local/var/log/radius/radacct/detail.example.com/detail.work

... and run into trouble. This is not suposed to be polled again until
status of home server is changed to active.

> detail_recv: Read packet from
> /usr/local/var/log/radius/radacct/detail.example.com/detail.work
...
> server home.example.com {
> +- entering group accounting {...}
> ++[control] returns noop
> } # server home.example.com
>  server home.example.com {
>  }
> Sending Accounting-Request of id 243 to 129.11.162.17 port 1813
...
> Proxying request 13 to home server 129.11.162.17 port 1813 Sending
> Accounting-Request of id 243 to 129.11.162.17 port 1813

Failover now tries to proxy (home server is still not treated as dead)...

> Going to the next request
> Waking up in 0.5 seconds.
> Waking up in 0.4 seconds.
> Waking up in 0.1 seconds.
> Waking up in 1.0 seconds.
> Waking up in 1.1 seconds.
> Waking up in 0.2 seconds.
> Waking up in 1.0 seconds.
> Waking up in 0.9 seconds.
> Rejecting request 12 due to lack of any response from home server
> 129.11.162.17 port 1813

... and decides to reject ...

>   Found Post-Proxy-Type
>  server home.example.com {
> +- entering group Fail {...}

... and gets into the loop. After a few loops home server is treated as
dead, packet has nowhere to go and is lost.

Try increasing cleanup delay in radiusd.conf. See if that fixes the issue.
If the request is kept longer on the list NAS request recieved after
accounting response was sent should be caught and ignored.

Ivan Kalik
Kalik Informatika ISP




More information about the Freeradius-Users mailing list