| Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
| Notice: | Welcome to the Digital UNIX Conference |
| Moderator: | SMURF::DENHAM |
| Created: | Thu Mar 16 1995 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 10068 |
| Total number of notes: | 35879 |
Hello all,
today one of our customers experienced a process hang caused by threads that
were blocked in an uninterruptible sleep. My findings indicate the following
scenario:
1. Thread "A" faulted on virtual address (VA) 0x1407bc000.
"u_anon_fault" acquired the anon cluster lock of array entry 58,
i.e. ao_acla[58].acl_klock. Then thread "A" was blocked within
"u_anon_faultpage" since the associated page was busy.
2. Thread "B" faulted on virtual address (VA) 0x1407b8000.
"u_anon_fault" attempted to acquire the anon cluster lock of array entry 58,
BUT THAT LOCK WAS STILL HELD BY THREAD "A". Hence, thread "B" was blocked in
an uninterruptible sleep.
3. Thread "C" attempted to terminate the process. It had to wait for thread "B",
but that thread was blocked indefinitely.
There is one remarkable fact: the page that thread "A" had been waiting for
is no longer busy, i.e. there is no I/O outstanding!
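Just to illustrate the blocking topology (and only that), here is a small
user-space toy I put together. It is NOT the customer's application and NOT
kernel code; the two mutexes merely stand in for the anon cluster lock and the
"busy" page, and in the toy the page simply stays busy forever, which, per the
remarkable fact above, is not what really happened on the system:

/*
 * Toy analogy only -- invented names, not Digital UNIX source.  Link against
 * the POSIX thread library; the program hangs forever, by design.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t cluster_lock = PTHREAD_MUTEX_INITIALIZER; /* ~ acl_klock */
static pthread_mutex_t io_pending   = PTHREAD_MUTEX_INITIALIZER; /* ~ busy page */

static void *thread_A(void *arg)          /* faults on VA 0x1407bc000          */
{
    (void)arg;
    pthread_mutex_lock(&cluster_lock);    /* takes the anon cluster lock       */
    printf("A: lock held, waiting for the busy page\n");
    pthread_mutex_lock(&io_pending);      /* blocks; cluster_lock stays held   */
    pthread_mutex_unlock(&io_pending);
    pthread_mutex_unlock(&cluster_lock);
    return NULL;
}

static void *thread_B(void *arg)          /* faults on VA 0x1407b8000          */
{
    (void)arg;
    sleep(1);                             /* let "A" win the race for the lock */
    printf("B: waiting for the lock held by A\n");
    pthread_mutex_lock(&cluster_lock);    /* blocks for as long as "A" sleeps  */
    pthread_mutex_unlock(&cluster_lock);
    return NULL;
}

int main(void)                            /* plays the role of thread "C"      */
{
    pthread_t a, b;

    pthread_mutex_lock(&io_pending);      /* "the page is busy", never cleared */
    pthread_create(&a, NULL, thread_A, NULL);
    pthread_create(&b, NULL, thread_B, NULL);
    pthread_join(a, NULL);                /* like exit(): wait for the others  */
    pthread_join(b, NULL);                /* -> waits forever                  */
    return 0;
}

The dependency chain is the same as in the dump: "C" waits for "B", "B" waits
for the cluster lock, and "A" holds that lock while it sleeps.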
The relevant data structures are appended below.
The O/S is V3.2G with patch OSF375-050 installed. I recommended replacing
OSF375-050 with OSF375-056, because the only difference between the two patch
versions is the kernel object "u_mape_anon.o". The revision of "u_mape_anon.c"
changed from 1.1.143.2 in OSF375-050 to 1.1.143.3 in OSF375-056.
Now my question: does "u_mape_anon.c" revision 1.1.143.3 contain the fix
for the type of scenario described above?
This is urgent since the problem occurred in a mission-critical environment.
I must make sure that OSF375-056 solves the problem. If there are any doubts,
I'll have to open an IPMT case.
I would really appreciate a quick reply!
Many thanks in advance and Best Regards,
Uli Obergfell
Offsite Services Open Systems / CSC Munich
DTN : 775-8137
E-Mail : [email protected]
--------------------------------------------------------------------------------
# ps -m -p 16046
PID TTY S TIME COMMAND
16046 ?? U 26:54.97 /usr/users/inss7/bin/gwcs1 /usr/users/inss7/scr
T 0:08.71
T 1:25.64
TW 0:00.00
TW 0:00.00
TW 0:00.39
T 0:00.40
T 0:11.90
T 0:42.35
T 0:00.01
TW 0:00.01
TW 0:00.00
TW 0:00.03
TW 0:00.00
TW 0:33.11
T 0:41.72
H 0:00.00
H 0:00.00
T 0:00.01
H 0:00.00
H 0:00.00
H 0:00.00
TW 0:00.01
TW 0:00.00
TW 0:21.88
TW 0:00.00
T 0:31.79
TW 0:00.00
TW 0:00.00
TW 0:00.00
TW 0:00.01
T 14:12.94
T 3:21.42
U 0:41.56 <--- Thread "B"
T 0:08.33
TW 0:00.09
U 0:00.07 <--- Thread "C"
T 0:00.11
T 0:00.48
H 0:01.72
T 0:00.50
T 0:00.45
T 1:49.16
T 0:39.21
T 0:41.46
T 0:39.48
H 0:00.00
# dbx -k /vmunix
(dbx) set $pid=16046
(dbx) tstack
:
Thread 0xfffffc001e5ed000: ... Thread "B"
> 0 thread_block
1 u_anon_fault(0xfffffc0009a43860, 0x1407b8000, ...)
2 u_map_fault ^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^
3 vm_fault vm_map_entry faulting VA
4 trap
5 _XentMM
Thread 0xfffffc001e5ed800: ... Thread "A"
> 0 thread_block
1 u_anon_faultpage
2 u_anon_fault(0xfffffc0009a43860, 0x1407bc000, 0x1, ...)
3 u_map_fault ^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^
4 vm_fault vm_map_entry faulting VA
5 trap
6 _XentMM
Thread 0xfffffc001e5ec000: ... Thread "C"
> 0 thread_block
1 thread_dowait
2 task_dowait
3 thread_ex_check
4 exit
5 rexit
6 syscall
7 _Xsyscall
# kdbx -k /vmunix
(kdbx) px *(struct vm_map_entry *)0xfffffc0009a43860
struct {
vme_links = struct {
prev = 0xfffffc0009a42960
next = 0xfffffc0019cc2f60
start = 0x14005e000
end = 0x142656000
}
vme_map = 0xfffffc00182189c0
vme_uobject = union {
vm_object = 0xfffffc0016f66c80
sub_map = 0xfffffc0016f66c80
}
vmet = union {
tvme_offset = 0x0
tvme_seg = (nil)
}
vme_ops = 0xfffffc0000643650
vme_vpage = struct {
_uvpage = union {
_uvp = struct {
_uvp_prot = 0x7
_uvp_plock = 0x0
}
_kvp = struct {
_kvp_prot = 0x7
_kvp_kwire = 0x0
}
}
}
vme_faultlock = struct {
sl_data = 0x389628
sl_info = 0x0
sl_cpuid = 0x0
sl_lifms = 0x0
}
vme_faults = 0x3
vmeu = union {
uvme = struct {
uvme_faultwait = 0x0
uvme_keep_on_exec = 0x0
uvme_inheritance = 0x1
uvme_maxprot = 0x7
}
kvme = struct {
kvme_faultwait = 0x0
kvme_is_submap = 0x0
kvme_copymap = 0x1
}
}
vme_private = 0x0
}
(kdbx) px *(struct vm_anon_object *)0xfffffc0016f66c80
struct {
ao_object = struct {
ob_memq = (nil)
ob_lock = struct {
sl_data = 0x3812c0
sl_info = 0x0
sl_cpuid = 0x0
sl_lifms = 0x0
}
ob_ops = 0xfffffc0000643888
ob_aux_obj = (nil)
ob_ref_count = 0x1
ob_res_count = 0x1
ob_size = 0x2600000
ob_resident_pages = 0x0
ob_flags = 0x1
ob_type = 0x2
}
ao_flags = 0x0
ao_rbase = 0x0
ao_crefcnt = 0x1
ao_rswanon = 0x0
ao_swanon = (nil)
ao_ranon = 0x12fc
ao_bobject = (nil)
ao_boffset = 0x0
ao_acla = 0xffffffff807a0000
}
(kdbx) px ((struct vm_anon_object *)0xfffffc0016f66c80).ao_acla[58]
struct {
acl_klock = struct {
akl_slock = struct {
sl_data = 0x36ce30
sl_info = 0x0
sl_cpuid = 0x0
sl_lifms = 0x0
}
akl_want = 0x10600001
akl_lock = 0x3
akl_mlock = 0x1
akl_plock = 0x0
akl_rpages = 0x10
akl_anon = 0x10
akl_pagelist = 0xfffffc0000e21f40
}
acl_anon = {
[0] 0xffffffffa0dd17d0
[1] 0xffffffffa0dd17e0
[2] 0xffffffffa1014da0
[3] 0xffffffffa0dd17f0
[4] 0xffffffffa1014db0
[5] 0xffffffffa0dd1800
[6] 0xffffffffa1014dc0
[7] 0xffffffffa0dd1810
[8] 0xffffffffa1014dd0
[9] 0xffffffffa0dd1820
[10] 0xffffffffa1014de0
[11] 0xffffffffa0dd1830
[12] 0xffffffffa1014df0
[13] 0xffffffffa0dd1840 <--- array entry for VA 0x1407b8000
[14] 0xffffffffa1014e00
[15] 0xffffffffa0dd1850 <--- array entry for VA 0x1407bc000
}
}
(kdbx)px *((struct vm_anon_object *)0xfffffc0016f66c80).ao_acla[58].acl_anon[13]
struct {
_uanonx = union {
_an_page = 0xfffffc0000c43020
_an_next = 0xfffffc0000c43020
}
_uanony = union {
_an_bits0 = struct {
_an_refcnt = 0x1
_an_cowfaults = 0x0
_an_hasswap = 0x0
_an_type = 0x0
}
_an_bits1 = struct {
_an_anon = 0x1
_an_type1 = 0x0
}
}
}
(kdbx) px *(struct vm_page *)0xfffffc0000c43020
struct {
pg_pnext = 0xfffffc0000ceaf00
pg_pprev = 0xfffffc0000c12f60
pg_onext = 0xfffffc0000e77380
pg_oprev = 0xfffffc0000b82ea0
pg_hnext = 0xfffffc0000a5a1a0
pg_hprev = 0xfffffc0000bfc640
pg_object = 0xfffffc001fe04bc0
pg_offset = 0xbb08000
pg_wire_count = 0x0
pg_iocnt = 0x0
pg_free = 0x0
pg_busy = 0x0
pg_wait = 0x0
pg_error = 0x0
pg_dirty = 0x0
pg_zeroed = 0x0
pg_reserved = 0x1
pg_hold = 0x0
pg_phys_addr = 0xd1b4000
_upg = union {
_apg = struct {
ap_owner = 0xfffffc0016f66c80
ap_roffset = 0x75a000
}
_vppg = struct {
vp_addr = 0xfffffc0016f66c80
vp_pfs = 0x75a000
}
_pkva = 0xfffffc0016f66c80
_pg_private = {
[0] 0xfffffc0016f66c80
[1] 0x75a000
}
}
}
(kdbx)px *((struct vm_anon_object *)0xfffffc0016f66c80).ao_acla[58].acl_anon[15]
struct {
_uanonx = union {
_an_page = 0xfffffc0000a0f4a0
_an_next = 0xfffffc0000a0f4a0
}
_uanony = union {
_an_bits0 = struct {
_an_refcnt = 0x1
_an_cowfaults = 0x0
_an_hasswap = 0x0
_an_type = 0x0
}
_an_bits1 = struct {
_an_anon = 0x1
_an_type1 = 0x0
}
}
}
(kdbx) px *(struct vm_page *)0xfffffc0000a0f4a0
struct {
pg_pnext = 0xfffffc0000b30dc0
pg_pprev = 0xfffffc0000fa3440
pg_onext = 0xfffffc0000e21f40
pg_oprev = 0xfffffc0000e77380
pg_hnext = 0xfffffc0000c05a00
pg_hprev = 0xfffffc0000d28700
pg_object = 0xfffffc001fe04bc0
pg_offset = 0xbb0a000
pg_wire_count = 0x0
pg_iocnt = 0x0
pg_free = 0x0
pg_busy = 0x0
pg_wait = 0x0
pg_error = 0x0
pg_dirty = 0x0
pg_zeroed = 0x0
pg_reserved = 0x1
pg_hold = 0x1
pg_phys_addr = 0x15cc000
_upg = union {
_apg = struct {
ap_owner = 0xfffffc0016f66c80
ap_roffset = 0x75e000
}
_vppg = struct {
vp_addr = 0xfffffc0016f66c80
vp_pfs = 0x75e000
}
_pkva = 0xfffffc0016f66c80
_pg_private = {
[0] 0xfffffc0016f66c80
[1] 0x75e000
}
}
}
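As a cross-check on the dump above (my own arithmetic, assuming the 8 KB Alpha
page size and the 16-slot anon clusters visible in acl_anon[0..15]), the two
faulting VAs really do map to ao_acla[58], slots 13 and 15, and the offsets
match the ap_roffset values of the two vm_page structures:

#include <stdio.h>

int main(void)
{
    unsigned long start = 0x14005e000UL;     /* vme start from the dump above  */
    unsigned long vas[] = { 0x1407b8000UL, 0x1407bc000UL };
    int i;

    for (i = 0; i < 2; i++) {
        unsigned long off  = vas[i] - start; /* 0x75a000 resp. 0x75e000        */
        unsigned long page = off / 0x2000;   /* 8 KB pages                     */
        printf("VA 0x%lx -> offset 0x%lx -> ao_acla[%lu], acl_anon[%lu]\n",
               vas[i], off, page / 16, page % 16);
    }
    return 0;
}

This prints acl_anon[13] for 0x1407b8000 and acl_anon[15] for 0x1407bc000,
i.e. exactly the two annotated slots.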
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 9894.1 | V3.2G / Thread Hangs in "u_anon_faultpage" | MGOF01::UOBERGFELL | | Thu May 22 1997 09:54 | 55 |
Hello,
I've spent some more time analyzing the problem and have finally found the
cause (I was even able to reproduce it with a small test program on a V4.0a
machine). For those who are interested in the details, here is what happens:
1. Thread "A" faults on a page which is currently swapped out. The routine
"u_anon_fault" acquires the corresponding anon cluster lock and calls
"a_anon_getpage". "a_anon_getpage" allocates a new (free) page and
initiates the I/O (-> read the page from the swap partition). The
page is marked "busy" while the I/O is in progress. "u_anon_fault"
then calls "u_anon_faultpage". The routine "u_anon_faultpage" blocks
while the page is "busy" (-> wait for I/O completion).
THE ESSENTIAL PROBLEM HERE IS THAT THIS "SLEEP" IS INTERRUPTIBLE
(although it should not be)!
2. Thread "B" faults on a page within the same anon cluster. The routine
"u_anon_fault" attempts to acquire the lock which is currently held by
thread "A". "u_anon_fault" blocks until the lock is released by
thread "A".
THIS "SLEEP" IS NON-INTERRUPTIBLE!
3. Thread "C" attempts to terminate the process (-> exit system call) while
threads "A" and "B" are still blocked. In a multi-threaded process,
the exiting thread has to make sure that the other threads enter a
"safe" (suspended) state. It may wake up threads which are blocked
in an interruptible sleep, but it must wait for non-interruptible
threads (-> routine "thread_dowait"). Hence, it wakes up thread "A"
but must wait for thread "B".
THREAD "A" IMMEDIATELY ENTERS THE SUSPENDED STATE AND NEVER RELEASES
THE ANON CLUSTER LOCK. THAT'S WHY THREAD "B" SLEEPS INDEFINITELY AND WHY
THREAD "C" WAITS FOREVER.
The solution is straightforward: "u_anon_faultpage" must block in a NON-
interruptible "sleep" while it waits for the "busy" page. By disassembling
"u_anon_faultpage" from patch OSF375-056, I found that this is fixed there
(see the small illustration below).
I would appreciate feedback if somebody (from engineering) happens to
read this and does not agree.
Best Regards,
Uli Obergfell