Title: | DECthreads Conference |
|
Moderator: | PTHRED::MARYS TE ON |
|
Created: | Mon May 14 1990 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1553 |
Total number of notes: | 9541 |
1513.0. "Locking a mutex" by MUFFIT::helen (Helen Pratt) Wed Mar 26 1997 13:42
I am working with a 3rd party who are currently testing their product
on Digital UNIX V4.0A with the majority of the patches available.
They are currently at the stage of running stress tests and yesterday
after running for just over 1.5 hours on a 2 cpu system, one
of the major components of the system hung. They managed to attach
with ladebug and gathered thread stack, mutex and condition variable
information. I have been through this today and I'm a bit confused
by a mutex which doesn't appear to be locked, but has two threads which
are blocked attempting to lock it - can anyone help me in my confusion?
The following is, I believe, the pertinent information from ladebug (note
that I have removed information about the other threads, but if anyone
wants to take a look, let me know):
Thread State Substate Policy Priority Name
------ ---------- --------------- ---------- -------- -------------
51 blocked mutex wait throughput 11 <anonymous>
72 blocked mutex wait throughput 11 <anonymous>
Mutex 706 (normal) "mutex at 0x140201680" (0x140201680, block 0x1401ffcb0) is
not locked, 2 threads waiting; ref count is 5; waiters: 51, 72
Stack trace for thread 51
#0 0x3ff8057bbb4 in /usr/shlib/libpthread.so
#1 0x3ff80566820 in /usr/shlib/libpthread.so
#2 0x3ff80564254 in dspDispatch(0x1401ffcb0, 0x141090268, 0x1410900f0, 0x0, 0x1
40201680, 0x100000000) DebugInformationStrippedFromFile89:???
#3 0x3ff80567c50 in pthread_mutex_block(0x0, 0x30040186a30, 0x475, 0x0, 0x14020
1680, 0x0) DebugInformationStrippedFromFile96:???
#4 0x3ff8057b9b0 in __pthread_mutex_lock(0x30040186a30, 0x475, 0x0, 0x140201680
, 0x0, 0x3ff805afc00) DebugInformationStrippedFromFile112:???
#5 0x3ff805afbfc in pthread_mutex_lock(0x0, 0x140201680, 0x0, 0x3ff805afc00, 0x
30000113018, 0x1410d11b8) DebugInformationStrippedFromFile7:???
#6 0x30000113014 in tc_watchDog_AddWatch(0x100000001, 0x12004b500, 0x0, 0x0, 0x
0, 0x100000000) DebugInformationStrippedFromFile93:???
#7 0x1200c2db4 in UnknownProcedure6FromFile23(0x14173e938, 0x1410d0e80, 0x2, 0x
1410cfdb0, 0x100000000, 0xa02f3) DebugInformationStrippedFromFile23:???
#8 0x1200c9e64 in UnknownProcedure45FromFile23(0x1410d11b8, 0x1, 0xa02f3, 0x3,
0x100000000, 0x0) DebugInformationStrippedFromFile23:???
#9 0x1200ca984 in svr_Read(0x14112f160, 0x140f6ee20, 0x141613ce0, 0x1416fa6e0,
0xabcdef02, 0x30000110088) DebugInformationStrippedFromFile23:???
#10 0x12008f43c in svr_FltTFRead(0x100000132, 0x100000002, 0x1410d15e8, 0x0, 0x1
00000000, 0x1410d1108) DebugInformationStrippedFromFile18:???
#11 0x12009e4cc in UnknownProcedure16FromFile21(0x1200b63b0, 0x1410d36a8, 0x1410
d36b0, 0xe8, 0x14113fb58, 0x1401a1180) DebugInformationStrippedFromFile21:???
#12 0x1200b6834 in /opt/encina/bin/sfs
#13 0x3000014bd3c in UnknownProcedure17FromFile112(0x1410900f0, 0x1410a4958, 0x1
00000001, 0x45586732, 0x3, 0x0) DebugInformationStrippedFromFile112:???
#14 0x300001e0ec4 in UnknownProcedure12FromFile179(0x3, 0x0, 0x3ff80573c88, 0x30
0001e0e00, 0x1410900f0, 0x0) DebugInformationStrippedFromFile179:???
#15 0x3ff80573c84 in thdBase(0x0, 0x0, 0x0, 0x1, 0x45586732, 0x3) DebugInformati
onStrippedFromFile102:???
Stack trace for thread 72
#0 0x3ff8057bbb4 in /usr/shlib/libpthread.so
#1 0x3ff80566820 in /usr/shlib/libpthread.so
#2 0x3ff80564254 in dspDispatch(0x1401ffcb0, 0x141535aa8, 0x141535930, 0x0, 0x1
40201680, 0x100000000) DebugInformationStrippedFromFile89:???
#3 0x3ff80567c50 in pthread_mutex_block(0x0, 0x30040186a30, 0x475, 0x0, 0x14020
1680, 0x0) DebugInformationStrippedFromFile96:???
#4 0x3ff8057b9b0 in __pthread_mutex_lock(0x30040186a30, 0x475, 0x0, 0x140201680
, 0x0, 0x3ff805afc00) DebugInformationStrippedFromFile112:???
#5 0x3ff805afbfc in pthread_mutex_lock(0x0, 0x140201680, 0x0, 0x3ff805afc00, 0x
30000113018, 0x1415cd3a8) DebugInformationStrippedFromFile7:???
#6 0x30000113014 in tc_watchDog_AddWatch(0x100000001, 0x12004b500, 0x0, 0x0, 0x
0, 0x100000000) DebugInformationStrippedFromFile93:???
#7 0x1200c2db4 in UnknownProcedure6FromFile23(0x14168d2d8, 0x1415ccf70, 0x2, 0x
1415cbec0, 0x100000000, 0x2d00fe) DebugInformationStrippedFromFile23:???
#8 0x1200c9e64 in UnknownProcedure45FromFile23(0x1415cd3a8, 0x0, 0x2d00fe, 0x3,
0x100000000, 0x2130456) DebugInformationStrippedFromFile23:???
#9 0x1200ca984 in svr_Read(0x1200924e0, 0x6b95064b, 0x2, 0x0, 0xb400000000, 0x4
fa8abed) DebugInformationStrippedFromFile23:???
#10 0x120092504 in svr_FltTRead(0x6b95064b, 0x100000152, 0x1415cd2c0, 0x0, 0x100
000000, 0x1415ebf98) DebugInformationStrippedFromFile18:???
#11 0x1200b177c in UnknownProcedure32FromFile21(0x1200b63b0, 0x1415cf6a8, 0x1415
cf6b0, 0x218, 0x14125b5d8, 0x141127460) DebugInformationStrippedFromFile21:???
#12 0x1200b6a74 in /opt/encina/bin/sfs
Thank you in advance for any words of wisdom - I've been looking at this
too long!
Helen.
T.R | Title | User | Personal Name | Date | Lines |
---|
1513.1 | | SMURF::DENHAM | Digital UNIX Kernel | Wed Mar 26 1997 14:50 | 1 |
| Were these threads truly hung or were they burning up cpu cycles?
|
1513.2 | Not sure on the threads but process was ! | MUFFIT::helen | Helen Pratt | Wed Mar 26 1997 15:25 | 15 |
|
>> Were these threads truly hung or were they burning up cpu cycles?
Unfortunately I don't know the answer to that for these particular
threads. The process as a whole was burning up 182% of the 2 CPUs.
However, I suspect that may have been the running threads, of which
there were a couple.
What puzzles me more is the fact that the threads in .0 are both blocked.
Thanks for the quick response,
Helen.
|
1513.3 | | DCETHD::BUTENHOF | Dave Butenhof, DECthreads | Thu Mar 27 1997 06:46 | 9 |
| We have found a few race conditions that can lead to stranded threads on a
mutex. They've all been patched, but I don't know the patch numbers or even
whether they've all gotten out to the field. Pete's been doing the patch
submissions, so perhaps he'll have a better idea. It's impossible to tell for
sure whether your case is related to any of these problems, though, based on
the information you've given. (And might be hard to tell even with all the
information -- they're fairly subtle races.)
/dave
|
1513.4 | Wait -- all of this looks OK... | WTFN::SCALES | Despair is appropriate and inevitable. | Thu Mar 27 1997 09:28 | 28 |
| Hang on everybody, things may not be quite what you think!
.0> They managed to attach with ladebug and gathered thread stack, mutex and
.0> condition variable information.
So, then, what we're looking at is a single snapshot of a busy process in
action.
.0> Mutex 706 (normal) "mutex at 0x140201680" (0x140201680, block 0x1401ffcb0) is
.0> not locked, 2 threads waiting; ref count is 5; waiters: 51, 72
There is nothing wrong here, per se. Suppose just a moment ago the mutex
were locked with three waiters: when the owner of the mutex unlocks it, it
would wake up one of the waiters, and the result would be exactly what we
have here -- an unlocked mutex with two threads waiting on it. (You'll notice
that the "ref count" is five, so there's a lot going on with this mutex right
at the moment...)
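To make that window concrete, here is a minimal sketch in C. It is a toy model
of the bookkeeping only, not the DECthreads implementation: the struct, its
fields, and every model_* function are invented for illustration, and it
assumes the woken waiter is taken off the wait list at unlock time and has to
retry the lock once it is scheduled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model of a mutex's bookkeeping (hypothetical names, not DECthreads). */
struct mutex_model {
    bool locked;   /* true while some thread owns the mutex */
    int  waiters;  /* threads blocked waiting for the mutex */
};

/* A thread tries to take the mutex: it becomes the owner or it blocks. */
static void model_lock(struct mutex_model *m)
{
    if (m->locked)
        m->waiters++;       /* contended: caller joins the wait list */
    else
        m->locked = true;   /* uncontended: caller becomes the owner */
}

/* The owner releases the mutex and wakes one waiter. The woken thread has
 * left the wait list but has not run yet, so nobody owns the mutex. */
static void model_unlock(struct mutex_model *m)
{
    assert(m->locked);      /* only an owner may unlock */
    m->locked = false;
    if (m->waiters > 0)
        m->waiters--;       /* one waiter is made runnable */
}

/* The woken thread finally gets a CPU and retries the lock. */
static void model_woken_thread_runs(struct mutex_model *m)
{
    model_lock(m);
}

static void model_print(const char *when, const struct mutex_model *m)
{
    printf("%-26s %s, %d threads waiting\n", when,
           m->locked ? "locked" : "not locked", m->waiters);
}

/* Replays the scenario: one owner, three waiters, then an unlock.
 * Returns the state a debugger would see right after the unlock. */
static struct mutex_model model_snapshot_after_unlock(void)
{
    struct mutex_model m = { false, 0 };
    struct mutex_model snapshot;

    model_lock(&m);                 /* the owner */
    model_lock(&m);                 /* three threads pile up behind it */
    model_lock(&m);
    model_lock(&m);
    model_print("before unlock:", &m);

    model_unlock(&m);               /* owner releases, one waiter is woken */
    model_print("after unlock:", &m);
    snapshot = m;                   /* the moment ladebug attached */

    model_woken_thread_runs(&m);    /* the window closes again */
    model_print("after woken thread runs:", &m);
    return snapshot;
}
```

In this model, the state returned by model_snapshot_after_unlock() is "not
locked" with two waiters, the same shape as the ladebug line for mutex 706.
That state only lasts until the woken thread runs, which is why a single
snapshot cannot tell us whether anything is actually stuck.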
We would need to know what the other threads in the process are doing (and
some general idea of what they are supposed to be doing) before we could
comment much more on what you're seeing. Unfortunately, the stack traces you
posted are probably the least interesting ones in the process -- we already
know that these two are waiting for the mutex (which the stack trace
confirms). :-) The most interesting ones are probably the two that are
currently running...
Webb
|
1513.5 | Just the thing | CICS03::helen | Helen Pratt | Tue Apr 01 1997 10:57 | 16 |
|
Webb,
Thanks for the information, it was just what I was looking for!
>>There is nothing wrong here, per se. Suppose just a moment ago the mutex
>>were locked with three waiters: when the owner of the mutex unlocks it, it
>>would wake up one of the waiters, and the result would be exactly what we
>>have here -- an unlocked mutex with two threads waiting on it. (You'll notice
>>that the "ref count" is five, so there's a lot going on with this mutex right
>>at the moment...)
Regards,
Helen.
|