info: WARNING: Child is hung for request XXXX in component <core> module (2.2.5)

Jonathan huffelduffel at gmail.com
Fri Nov 21 17:12:10 CET 2014


Hi list,

I've just upgraded our server to 2.2.6 in the hope that our hung-child
problem would be fixed.

The problem we are having is that whenever a RADIUS proxy server
(outside of our control: the eduroam national home server) doesn't respond
in time, we get the following messages, which block FreeRADIUS:

Fri Nov 21 17:02:07 2014 : Error: Internal sanity check failed for
child state <= NEW MESSAGE since 2.2.6
Fri Nov 21 17:02:07 2014 : Error: Reply from home server A.B.C.D port
1812  - ID: 84 arrived too late for request 11986. Try increasing
'retry_delay' or 'max_request_time'
Fri Nov 21 17:02:18 2014 : Error: WARNING: Unresponsive thread 11986
for request 0, in component <core> module
Fri Nov 21 17:02:19 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:02:20 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:02:23 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:02:26 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:02:31 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:02:39 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:02:50 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:03:07 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
Fri Nov 21 17:03:33 2014 : Info: WARNING: Child is hung for request
11986 in component <core> module .
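
To make the timing of these warnings easier to compare between incidents, here is a minimal Python sketch that measures the seconds between consecutive "Child is hung" entries. It is not part of FreeRADIUS: the function name hung_gaps and the regular expression are assumptions based only on the log format shown above, and the timestamp parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# Matches the first physical line of a (possibly wrapped) "Child is hung" entry.
# The pattern is an assumption derived from the log excerpt, not an official format.
HUNG = re.compile(
    r"^(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) : Info: WARNING: Child is hung"
)

def hung_gaps(log_text):
    """Return the seconds between consecutive 'Child is hung' warnings."""
    stamps = []
    for line in log_text.splitlines():
        m = HUNG.match(line)
        if m:
            # Example timestamp: "Fri Nov 21 17:02:19 2014"
            stamps.append(datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
```

Run against the excerpt above, the gaps come out as 1, 3, 3, 5, 8, 11, 17 and 26 seconds, so the warning for request 11986 keeps repeating for more than a minute.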


Config:

#
#  This configuration file handles the requests to the EDUROAM community
#

home_server roaming1-v4 {
#       type                    = auth+acct
        type                    = auth
        ipaddr                  = A.B.C.D1
        port                    = 1812
        secret                  = secretABCD1
        require_message_authenticator = yes

        status_check            = status-server
        check_interval          = 30
        response_window         = 20
        zombie_period           = 40
        num_answers_to_alive    = 3
        max_request_time        = 5
}

home_server roaming2-v4 {
#        type                    = auth+acct
        type                    = auth
        ipaddr                  = A.B.C.D2
        port                    = 1812
        secret                  = secretABCD2
        require_message_authenticator = yes

        status_check            = status-server
        check_interval          = 30
        response_window         = 20
        zombie_period           = 40
        num_answers_to_alive    = 3
        max_request_time        = 5
}

home_server_pool EDUROAM {
        type                    = client-port-balance
        home_server             = roaming1-v4
        home_server             = roaming2-v4
        virtual_server          = eduroam
}

realm LOCAL {
        #  If we do not specify a server pool, the realm is LOCAL, and
        #  requests are not proxied.
}

realm NULL {
        #  If we do not specify a server pool, the realm is LOCAL, and
        #  requests are not proxied.
}


# Match-any/previous DEFAULT realm statement
realm "~.+$" {
#        pool                    = EDUROAM
        auth_pool               = EDUROAM
        nostrip
}

On Wed, Oct 22, 2014 at 4:38 PM, Jonathan <huffelduffel at gmail.com> wrote:
> Hi, we have regularly been getting these messages since we upgraded from
> 2.2.3 to 2.2.5.
>
> I have attached the GDB log, as I'm not enough of a developer to find out
> what's going on...
>
> Hopefully someone can help us out with this.
>
> <SOME INFO>
>
> We are running Debian 7.7 x64
>
> Linux 3.2.0-4-amd64 (radius1)   10/22/2014      _x86_64_        (8 CPU)
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.24    0.00    0.11    0.10    0.00   99.56
>
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> cciss/c0d0      163.25         9.31      1491.34    3263611  522909360
>
> <RADIUS LOGS>
> Wed Oct 22 16:35:35 2014 : Error: Discarding duplicate request from
> client telenet-81.82.0.0/15 port 3072 - ID: 226 due to unfinished
> request 2664227
> Wed Oct 22 16:35:37 2014 : Error: WARNING: Unresponsive thread 2664227
> for request 0, in component <core> module
> Wed Oct 22 16:35:38 2014 : Info: WARNING: Child is hung for request
> 2664227 in component <core> module .
>
>
> Attached is a GDB backtrace produced with this command:
>
> echo 'thread apply all bt full' > /tmp/gdb_cmd; gdb --pid `cat /var/run/freeradius/freeradius.pid` -batch -x /tmp/gdb_cmd > gdb_bt; gzip gdb_bt
>
> Hopefully this shows something useful, as the previous debug output sent
> to the devel list didn't show the reason.

