Conference clt::cma

Title: DECthreads Conference
Moderator: PTHRED::MARYSTEON
Created: Mon May 14 1990
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1553
Total number of notes: 9541

1489.0. "deadlock and metering options" by TUXEDO::FARRELL (Jacqueline Proulx Farrell) Thu Feb 20 1997 12:09

    
    I am running a DECthreads application on UNIX V4.0A (rev 464) with
    PTHREAD_CONFIG set to "meter=all" to get a mutex locking history.
    It is deadlocking in fork() (see ladebug output below).
    
    Is there another value for meter besides "all" that will give me what
    I need without deadlocking?  Please say yes.
    
    Jackie
    
    ps. I'm using the DCE pthreads interface, if that matters.
    
    Stack trace for thread 1
    #0  0x3ff8057ca24 in /usr/shlib/libpthread.so
    #1  0x3ff8057a450 in hstTransferContext(0x1, 0x14001ea78,
    0x3ffc018a010, 0x4, 0x3ffc0188a50, 0x0)
    DebugInformationStrippedFromFile109:???
    #2  0x3ff8056429c in dspDispatch(0x14001d610, 0x3ffc0183640,
    0x3ffc01834c8, 0x1400270b0, 0x3ffc01834c8, 0x0)
    DebugInformationStrippedFromFile89:???
    #3  0x3ff80567b98 in pthread_mutex_block(0x0, 0x3ffc018a398,
    0x3ffc0184db8, 0x3ffc01834c8, 0x3ffc0185680, 0x1400270b0)
    DebugInformationStrippedFromFile95:???
    #4  0x3ff8057c820 in __pthread_mutex_lock(0x3ffc018a398, 0x3ffc0184db8,
    0x3ffc01834c8, 0x3ffc0185680, 0x1400270b0, 0x3ff805790bc)
    DebugInformationStrippedFromFile111:???
    #5  0x3ff805790b8 in vmAlloc(0x3ff80569e2c, 0x3ffc018a3a8, 0x0, 0x4,
    0x0, 0x0) DebugInformationStrippedFromFile108:???
    #6  0x3ff80569e28 in muGetBlock(0x3ffc01834c8, 0x0, 0x11ffffeba,
    0x1840, 0x3ffc0184d88, 0x1) DebugInformationStrippedFromFile95:???
    #7  0x3ff805675f4 in pthread_mutex_block(0x3ffc008ae28, 0x3ffc01876f8,
    0x11ffffeba, 0x1840, 0x3ffc0184d88, 0x0)
    DebugInformationStrippedFromFile95:???
    #8  0x3ff8057c820 in __pthread_mutex_lock(0x3ffc01876f8, 0x11ffffeba,
    0x1840, 0x3ffc0184d88, 0x0, 0x3ff80577d40)
    DebugInformationStrippedFromFile111:???
    #9  0x3ff80577d3c in tsdReinit(0x1840, 0x3ffc0184d88, 0x0,
    0x3ff80577d40, 0x3ff80578e98, 0x3ffc0081de0)
    DebugInformationStrippedFromFile105:???
    #10 0x3ff80578e94 in /usr/shlib/libpthread.so
    #11 0x3ff800d36c0 in __fork(0x12002e85c, 0x14000ff20, 0x0, 0x1,
    0x14001af00, 0x0) DebugInformationStrippedFromFile335:???
    #12 0x12002e858 in cds_server_fork(program_name=0x11ffffeba="cdsadv",
    mode=0, status=0x11ffffc78)
    /project/dce/build/dce2.0aSSB/src/directory/cds/library/cds_rpcserver.c:199
    #13 0x12000c1e0 in InitializeRPC(myname_p=0x11ffffeba="cdsadv",
    debug_mode=0)
    /project/dce/build/dce2.0aSSB/src/directory/cds/adver/adver_cds_event.c:497
    #14 0x1200095b4 in main(argc=1, argv=0x11ffffde8)
    /project/dce/build/dce2.0aSSB/src/directory/cds/adver/adver_main.c:325
    
    (ladebug) pthread "t -f 1"
    main thread 1 (blocked, mutex wait) "default thread" (0x3ffc01834c8), created
        by pthread
      Waiting to lock mutex 9
      Scheduling: throughput policy at priority 11
      Masked signals: none
      Pending signals: none
      Object flags: none; self flags: none; sched flags: none; mutex flags: none;
        atomic flags: none
      Thread specific data: 0=0x3ffc0183920
      Stack: 0x11ffff7c0; base is 0x120000000, guard area at 0x4000000
      General cancelability enabled, asynch cancelability disabled
      Current vp is 0, synch port is 0, vp ID is 0
      Join uses mutex 14 and condition variable 1; wait uses mutex 15 and
        condition variable 2
      The thread's start function and argument are unknown
      The thread's latest errno is 2
      The thread has mutexes locked: 1, 2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16,
        21, 22, 23, 24, 25, 28, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
        43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
        62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79
    (ladebug) pthread "m -f 9"
    Mutex 9 (normal) "VM 4 lookaside" (0x3ffc0185680, block 0x14001d610) is locked
      by 1, 1 thread waiting; ref count is 1
    
    
1489.1. "DECthreads bug; should be fixed by now..." by WTFN::SCALES (Despair is appropriate and inevitable.) Thu Feb 20 1997 14:31

That looks like a self-deadlock.  It looks like the thread is locking a mutex
involved with thread-specific data, and that mutex (for some reason) doesn't
have a blocking structure, so the thread tries to allocate one.  Allocating
it involves locking a memory management mutex that the thread already has
locked, which results in the deadlock and subsequent hang.
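
The pattern can be reproduced in a few lines of C. This is only a sketch, assuming standard POSIX threads rather than DECthreads internals; vm_lock, alloc_blocking_struct and fork_prepare_sketch are made-up names. The lock is created as an error-checking mutex so that the relock reports EDEADLK instead of hanging the way the normal mutex in the trace does:

```c
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the library's memory-management lock
 * (the "VM 4 lookaside" mutex in the trace). Not DECthreads code. */
static pthread_mutex_t vm_lock;

/* Create vm_lock as an error-checking mutex, so a relock by the
 * owning thread returns EDEADLK instead of blocking forever. */
int vm_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&vm_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Stand-in for allocating a blocking structure: it needs vm_lock. */
int alloc_blocking_struct(void)
{
    int rc = pthread_mutex_lock(&vm_lock);

    if (rc != 0)
        return rc;      /* EDEADLK if the caller already owns vm_lock */
    /* ... the allocation itself would happen here ... */
    return pthread_mutex_unlock(&vm_lock);
}

/* Mimics the fork path: take vm_lock first, then reach a mutex that
 * still needs a blocking structure. Returns the nested call's result. */
int fork_prepare_sketch(void)
{
    int rc = pthread_mutex_lock(&vm_lock);

    if (rc != 0)
        return rc;
    rc = alloc_blocking_struct();       /* tries to relock vm_lock */
    /* Releasing a mutex we own should not fail. */
    assert(pthread_mutex_unlock(&vm_lock) == 0);
    return rc;
}
```

After vm_lock_init(), a call to fork_prepare_sketch() returns EDEADLK.  With a normal mutex the same sequence would simply block, which is the hang in the trace.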

It looks like this problem should have been fixed (in our sources) more than
a year ago, but I don't know which release the fix would have made it into... 

Are you sure that you have all the relevant patches installed?


				Webb

1489.2. "looking for patches in all the wrong places..." by TUXEDO::FARRELL (Jacqueline Proulx Farrell) Thu Feb 20 1997 15:09
    
    RE: relevant patches.  That I'm not sure of.  There have since been
    patches issued for 4.0B that don't seem to be available on 4.0 & 4.0A.
    I'm working on getting those.
    
1489.3. by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Thu Feb 20 1997 15:10

I'm not entirely sure that it has been fixed. Metering and fork are currently
very uneasy allies, at best. Metering requires a tremendous amount of data
to be allocated for each mutex (and condition variable). In some cases, we
only find out that a mutex needs the data during fork -- at which point we
can't allocate it. We've gone through a number of iterations on this, but the
real fix is going to be to rip out the current metering and substitute
something much more robust. That's not going to happen soon.
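
One way to avoid discovering the need during fork is to allocate the metering record when the mutex is created. A rough sketch follows, assuming plain POSIX threads. The struct layout and the metered_mutex_* names are hypothetical, and this is not how DECthreads does it. The cost is the memory mentioned above, paid for every mutex whether or not anyone ever looks at its history:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-mutex metering record; the fields are assumptions,
 * not the DECthreads layout. */
struct meter_rec {
    unsigned long locks;      /* successful lock operations */
    unsigned long contended;  /* locks that had to wait */
};

struct metered_mutex {
    pthread_mutex_t   lock;
    struct meter_rec *rec;    /* allocated once, in init */
};

/* Allocate the metering record up front so that no later path
 * (including fork preparation) ever has to call the allocator. */
int metered_mutex_init(struct metered_mutex *m)
{
    int rc;

    m->rec = calloc(1, sizeof *m->rec);
    if (m->rec == NULL)
        return ENOMEM;
    rc = pthread_mutex_init(&m->lock, NULL);
    if (rc != 0) {
        free(m->rec);
        m->rec = NULL;
    }
    return rc;
}

/* Lock and update the counters. The record is only touched while the
 * mutex is held, so the counters need no extra synchronization. */
int metered_mutex_lock(struct metered_mutex *m)
{
    int waited = 0;
    int rc;

    assert(m->rec != NULL);             /* init must have succeeded */
    rc = pthread_mutex_trylock(&m->lock);
    if (rc == EBUSY) {
        waited = 1;
        rc = pthread_mutex_lock(&m->lock);
    }
    if (rc == 0) {
        m->rec->locks++;
        if (waited)
            m->rec->contended++;
    }
    return rc;
}

int metered_mutex_unlock(struct metered_mutex *m)
{
    return pthread_mutex_unlock(&m->lock);
}

/* Destroy the mutex and release its metering record. */
int metered_mutex_destroy(struct metered_mutex *m)
{
    int rc = pthread_mutex_destroy(&m->lock);

    free(m->rec);
    m->rec = NULL;
    return rc;
}
```

Nothing on the lock path allocates, so a fork that holds the allocator's own lock never needs that lock a second time on behalf of one of these mutexes.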

I've already started to take a look at this particular problem, but it's a
background task (that's now gotten even more background). And, when it's
done, it'll probably be just another point fix patching up one more hole in a
leaky dam. You'll be able to stroll past it, at least, into the next gusher.
Don't forget your umbrella!

The current metering, like pthread_debug(), is a big hack, almost completely
unsupportable, that we released and documented only because it was there (it
has been, mostly, good enough for OUR debugging needs), and because there was
a need for it. We don't intend to (and, realistically, can't) support these
debugging hacks at the same level as we support the real threading
interfaces. We'll make them work when we can, as well as we can, and when you
hit the limits, that's really too bad, and we're sorry, and... well, that's
about it.

	/dave ("I'm a thread library, Jim, not a debugger!")

1489.4. by TUXEDO::FARRELL (Jacqueline Proulx Farrell) Mon Feb 24 1997 09:21
    
    It may be a hack, but it's a good one!  It's saved my skin many 
    times.
    
    Thanks for the explanation.         
    
    J

1489.5. by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Mon Feb 24 1997 10:18

Well, "good" is an ambiguous term in this case! Yeah, the information
provided by metering is valuable, and we definitely intend to always try to
provide that information.

The current IMPLEMENTATION of metering, however, is a hack, and it's very
difficult to support. I wrote it because, like you, I needed it, and, unlike
you, I had access to the DECthreads sources (I'm also really good at hacking
it). I made it available because everyone else needed it, too. And someday
I'll have time to re-implement metering from the ground up in a way that's
actually supportable.

	/dave