bin/164526: kill(1) can not kill process despite on -KILL
Коньков Евгений
kes-kes at yandex.ru
Wed Feb 1 23:16:39 CET 2012
Hello, Jilles.
You wrote on 28 January 2012, 20:24:07:
>> [stuck process cannot be killed, system hangs when reboot is
>> attempted]
JT> A signal cannot forcibly kill a process that is stuck in the kernel.
JT> Allowing this would put the integrity of the kernel data structures at
JT> risk and likely cause hangs, data corruption or panics later on.
JT> If a process is stuck in the kernel for a long time, this can be things
JT> like broken hardware, a non-responsive NFS server or a bug.
JT> A state 'T' (stopped) probably means the process is multi-threaded and
JT> is trying to suspend but one or more threads will not cooperate
JT> (non-interruptible sleep or running in the kernel).
JT> Useful commands to obtain more information (supposing pid is 45471):
JT> ps Hl45471
JT> procstat -k 45471
JT> Of course, this does not help if you already rebooted.
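Since the information is lost after a reboot, the two commands quoted above can be captured into one report first. This is only a minimal sketch: the helper names (diagnostic_commands, collect) and the 10-second timeout are my own assumptions, not part of any FreeBSD tool.

```python
import subprocess


def diagnostic_commands(pid):
    """Build the two commands quoted above for the given pid."""
    return [["ps", f"Hl{pid}"], ["procstat", "-k", str(pid)]]


def collect(pid, run=subprocess.run):
    """Run each command and return all output as one text report.

    `run` is injectable so the function can be exercised without
    a real stuck process (hypothetical convenience, not required).
    """
    report = []
    for cmd in diagnostic_commands(pid):
        report.append("# " + " ".join(cmd))
        try:
            result = run(cmd, capture_output=True, text=True, timeout=10)
            report.append(result.stdout + result.stderr)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # For example, the command is missing or did not return in time.
            report.append(f"failed: {exc}")
    return "\n".join(report)
```

Calling print(collect(45471)) before rebooting, or writing the returned text to a file, keeps the thread states and kernel stacks available for a later report.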
It happened again.
The bug is reproducible:
1. Run radiusd + mod_perl + example.pl (which connects to Firebird) +
Firebird
2. Restart Firebird
3. Try to restart radiusd
4. The process falls into the STOP state
# ps awx | grep radi
9438 ?? TLs 5:10.12 /usr/local/sbin/radiusd
27603 2 S+ 0:00.00 grep radi
# procstat -k 9438
PID TID COMM TDNAME KSTACK
9438 100080 radiusd - mi_switch sleepq_switch sleepq_wait _sx_xlock_hard _sx_xlock _vm_map_lock_upgrade vm_map_lookup vm_fault_hold vm_fault trap_pfault trap calltrap
9438 100195 radiusd - mi_switch sleepq_switch sleepq_wait __lockmgr_args ffs_lock VOP_LOCK1_APV _vn_lock vm_object_deallocate unlock_and_deallocate vm_fault_hold vm_fault trap_pfault trap calltrap
9438 101144 radiusd - mi_switch thread_suspend_switch thread_single exit1 sigexit postsig ast doreti_ast
# ps wHl9438
UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
133 9438 1 0 20 0 351124 322000 user m TLs ?? 0:03.65 /usr/local/sbin/radiusd
133 9438 1 0 20 0 351124 322000 ufs TLs ?? 0:00.00 /usr/local/sbin/radiusd
133 9438 1 0 20 0 351124 322000 - TLs ?? 0:05.28 /usr/local/sbin/radiusd
# top
last pid: 28497; load averages: 0.56, 2.34, 9.37 up 0+10:23:14 00:12:5
162 processes: 1 running, 158 sleeping, 3 stopped
CPU: 1.9% user, 0.0% nice, 1.9% system, 5.3% interrupt, 90.8% idle
Mem: 525M Active, 1259M Inact, 182M Wired, 41M Cache, 112M Buf, 1890M Free
Swap: 4096M Total, 4096M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
6893 root 1 26 0 15392K 5580K select 0 21:17 6.10% snmpd
75797 bind 7 20 0 100M 77280K kqread 2 4:27 0.00% named
5553 root 7 20 0 53544K 39832K select 1 0:19 0.00% mpd5
77411 dhcpd 1 20 0 15032K 5360K select 3 0:18 0.00% dhcpd
3605 root 1 20 0 10460K 4004K select 3 0:11 0.00% zebra
5316 root 1 20 0 9616K 1244K select 1 0:10 0.00% syslogd
9438 freeradius 3 20 0 343M 314M STOP 0 0:09 0.00% radiusd
80843 mysql 26 20 0 402M 333M sbwait 0 0:05 0.00% mysqld
3611 root 1 20 0 14660K 5348K select 2 0:05 0.00% bgpd
80396 www 1 20 0 37908K 22876K lockf 1 0:01 0.00% httpd
26278 root 1 20 0 33812K 15608K select 2 0:01 0.00% httpd
10559 www 1 20 0 42004K 26768K lockf 1 0:01 0.00% httpd
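To make the procstat -k output above easier to read, here is a small sketch that prints, for each thread, the first kernel frame that is not a generic scheduler or sleep-queue frame. The function names and the list of "generic" frames are my own assumptions for illustration, and the parser assumes that COMM and TDNAME contain no spaces, as in this output.

```python
# Frames that only say "the thread is sleeping" (assumed list, not official).
GENERIC_FRAMES = {"mi_switch", "sleepq_switch", "sleepq_wait"}


def parse_kstacks(text):
    """Return a list of (tid, frames) parsed from `procstat -k` output."""
    threads = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line that does not start with PID and TID.
        if len(fields) < 5 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Columns: PID TID COMM TDNAME KSTACK...
        threads.append((int(fields[1]), fields[4:]))
    return threads


def first_interesting_frame(frames):
    """Return the first frame that is not a generic sleep/scheduler frame."""
    for frame in frames:
        if frame not in GENERIC_FRAMES:
            return frame
    return None


SAMPLE = """\
PID TID COMM TDNAME KSTACK
9438 100080 radiusd - mi_switch sleepq_switch sleepq_wait _sx_xlock_hard _sx_xlock _vm_map_lock_upgrade vm_map_lookup vm_fault_hold vm_fault trap_pfault trap calltrap
9438 100195 radiusd - mi_switch sleepq_switch sleepq_wait __lockmgr_args ffs_lock VOP_LOCK1_APV _vn_lock vm_object_deallocate unlock_and_deallocate vm_fault_hold vm_fault trap_pfault trap calltrap
9438 101144 radiusd - mi_switch thread_suspend_switch thread_single exit1 sigexit postsig ast doreti_ast
"""

if __name__ == "__main__":
    for tid, frames in parse_kstacks(SAMPLE):
        print(tid, first_interesting_frame(frames))
```

For the three threads above this prints _sx_xlock_hard, __lockmgr_args and thread_suspend_switch. As far as I can read the stacks, two threads are sleeping on locks inside a page fault, and the third is in exit1 waiting in thread_single for the others to suspend.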
If I can supply any other useful debug info, please reply as soon as you
can; I cannot wait too long. Thank you.
--
Best regards,
Коньков mailto:kes-kes at yandex.ru
More information about the Freeradius-Users
mailing list