Conference clt::cma

Title:DECthreads Conference
Moderator:PTHRED::MARYSTEON
Created:Mon May 14 1990
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1553
Total number of notes:9541

1473.0. "app takes ~90%CPU, swtch_pri running ?" by MPGS::ENRIGHT () Mon Jan 27 1997 12:38

    I'm trying to figure out what's causing an application
    that I recently modified and moved from DU 3.2c to 
    DU 4.0b to behave "badly", maybe someone here can 
    help?
    
    The problem is that occasionally the program gets into
    a state where it takes almost all the cpu (as seen with
    "ps aux") and stops doing what it's supposed to do - it 
    does not die/core dump, it just sits there taking up CPU
    cycles.
    
    I've run "memory advisor", found a couple of problems, and fixed
    them, with no effect on this problem. I've tried "atom -tool third",
    which was OK, but it cores before the program gets going (it did
    give a useful output once before it core'd, so it did help...). Of
    course I've also studied my code, used dbx and ladebug, etc., but
    I'm at a loss to explain what's happening.
    
    Below is ladebug output from an instance of the program in this
    state. One consistent thing about all the debug sessions I've
    attached to is that the running thread is in "swtch_pri"...
    
    I'd sure appreciate any help/thoughts/ideas that may get me closer
    to resolving this problem...
    
    Thanks,
    
    Michael Enright
    New Media Solutions
    DTN: 237-2165
    [email protected]
    
    ===========
    
    sparky># ladebug PBserver -pid 31226
    Welcome to the Ladebug Debugger Version 4.0-25
    ------------------
    object file name: PBserver
    Reading symbolic information ...done
    Attached to process id 31226  ....
    Thread received signal INT
    stopped at [swtch_pri: ??? 0x3ff8053eb1c]
    (ladebug) t
    >0  0x3ff8053eb1c in swtch_pri(0x1401573f8, 0x3ffc01859c0, 0x140135878,
    0x7afb0000, 0x3ff8057ba30, 0x3ffc01859c0) DebugInformationStri
    ppedFromFile19:???
    #1  0x3ff8057b824 in UnknownProcedure25FromFile109(0x0, 0x1401573f8,
    0x14025cc30, 0x140135b00, 0x1, 0x0) DebugInformationStrippedFromF
    ile109:???
    #2  0x3ff8057b0d8 in UnknownProcedure20FromFile109(0xb, 0x0,
    0x1401d3638, 0x3ffc01859c0, 0x3ffc01877e0, 0x0)
    DebugInformationStrippedF
    romFile109:???
    #3  0x3ff80579b30 in hstTransferContext()
    DebugInformationStrippedFromFile109:???
    (ladebug) show thread
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    >*  -2 ready                      idle        0       null thread for
    VP 0x0
         1 blocked    kernel          throughput 11       default thread
        -1 ready      kernel          fifo       32       manager thread
         2 blocked    cond wait       throughput 11       <anonymous>
      9180 running                    throughput 11       <anonymous>
      9181 blocked    cond wait       throughput 11       <anonymous>
      9182 blocked    timed cond wait throughput 11       <anonymous>
    (ladebug) thread -2
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    >*  -2 ready                      idle        0       null thread for
    VP 0x0
    
    (ladebug) t
    >0  0x3ff8053eb1c in swtch_pri(0x1401573f8, 0x3ffc01859c0, 0x140135878,
    0x7afb0000, 0x3ff8057ba30, 0x3ffc01859c0) DebugInformationStri
    ppedFromFile19:???
    #1  0x3ff8057b824 in UnknownProcedure25FromFile109(0x0, 0x1401573f8,
    0x14025cc30, 0x140135b00, 0x1, 0x0) DebugInformationStrippedFromF
    ile109:???
    #2  0x3ff8057b0d8 in UnknownProcedure20FromFile109(0xb, 0x0,
    0x1401d3638, 0x3ffc01859c0, 0x3ffc01877e0, 0x0)
    DebugInformationStrippedF
    romFile109:???
    #3  0x3ff80579b30 in hstTransferContext()
    DebugInformationStrippedFromFile109:???
    (ladebug) thread 1
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    >    1 blocked    kernel          throughput 11       default thread
    
    (ladebug) t
    >0  0x3ff8053eb1c in swtch_pri(0x3ff805643e8, 0x0, 0x3ff8057bbc4,
    0x3ffc0183508, 0x3ffc0183508, 0x0) DebugInformationStrippedFromFile1
    9:???
    #1  0x3ff8057bafc in UnknownProcedure26FromFile109(0x0, 0x11ffff7f8,
    0x3ffc0183508, 0x3ffc0183508, 0x1, 0x11ffff7f8) DebugInformationS
    trippedFromFile109:???
    #2  0x3ff8057b0d8 in UnknownProcedure20FromFile109(0x1400ceab0,
    0x3ffc0096490, 0x11ffffc28, 0x1, 0x60, 0x0) DebugInformationStrippedFr
    omFile109:???
    #3  0x1200880e8 in svc_run(0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    DebugInformationStrippedFromFile270:???
    #4  0xfffffffffffffffd in ???
    (ladebug) thread -1
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    >   -1 ready      kernel          fifo       32       manager thread
    
    (ladebug) t
    >0  0x3ff80568b20 in pthread_mutex_unblock(0x3ffc00802a0, 0x0, 0x0,
    0x3ffc0181580, 0x3ffc018a968, 0x0) DebugInformationStrippedFromFil
    e95:???
    #1  0x3ff8057c1f8 in __pthread_mutex_unlock(0x0, 0x0, 0x3ffc0181580,
    0x3ffc018a968, 0x0, 0x3ff80575618) DebugInformationStrippedFromFi
    le111:???
    #2  0x3ff80575614 in UnknownProcedure5FromFile103(0x3ffc0181580,
    0x3ffc018a968, 0x0, 0x3ff80575618, 0x3ff800d6904, 0x3ffc0187760) Debu
    gInformationStrippedFromFile103:???
    #3  0x3ff800d6900 in fflush(0x3ff800d6904, 0x3ffc0187760,
    0x3ff80565cfc, 0x3ffc018a968, 0x3ffc0187768, 0x0)
    DebugInformationStrippedFr
    omFile84:???
    #4  0x3ff80565cf8 in errBugcheck(0x3ffc0185da0, 0x3ffc01700d8, 0x4,
    0x26e, 0x0, 0x3ffc018aae0) DebugInformationStrippedFromFile90:???
    #5  0x3ff8057bcd4 in UnknownProcedure26FromFile109(0x0, 0x14014b438,
    0x3ffc018a968, 0x3ffc018a968, 0x1, 0x14014b438) DebugInformationS
    trippedFromFile109:???
    #6  0x3ff8057b0d8 in UnknownProcedure20FromFile109(0x500, 0x1f4, 0x1,
    0xffffffffdfbbee16, 0x1f4, 0x14014b478) DebugInformationStripped
    FromFile109:???
    #7  0x3ff80535200 in msg_receive(0x0, 0x3ffc018abf0, 0x0, 0x0,
    0x3ffc018a968, 0x0) DebugInformationStrippedFromFile6:???
    #8  0x3fefffffffd in ???
    (ladebug) thread 2
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    >    2 blocked    cond wait       throughput 11       <anonymous>
    
    (ladebug) t
    >0  0x3ff8057c324 in /usr/shlib/libpthread.so
    #1  0x3ff80579b30 in hstTransferContext(0x1, 0x140135878, 0x14013ef20,
    0x140138e90, 0x0, 0xffffffffffffffff) DebugInformationStrippedF
    romFile109:???
    #2  0x3ff805643c4 in dspDispatch(0x1401415c0, 0x100000, 0x14013ef20,
    0x0, 0x1401415c0, 0x3ff8055fa18) DebugInformationStrippedFromFile
    89:???
    #3  0x3ff80560020 in cvWait(0x3ffc018a100, 0x0, 0x14014c8c0,
    0x14014c030, 0x14013eef0, 0x14013ef20)
    DebugInformationStrippedFromFile1:
    ???
    #4  0x3ff8055d4f4 in __pthread_cond_wait(0x14014c8c0, 0x14014c030,
    0x14013eef0, 0x14013ef20, 0x3ff805aef5c, 0x2) DebugInformationStrip
    pedFromFile1:???
    #5  0x3ff805aef58 in pthread_cond_wait(0x14013eef0, 0x14013ef20,
    0x3ff805aef5c, 0x2, 0x120081e90, 0x14014c030) DebugInformationStrippe
    dFromFile7:???
    #6  0x120081e8c in WaitOnLocalSemaphore(0x14014c030, 0x1,
    0x6f4c6e4f74696157, 0x70616d65536c6163, 0x65726f68, 0x0)
    DebugInformationStr
    ippedFromFile257:???
    #7  0x12002cd08 in serviceCompleteThread()
    /xswe/vod/CR_V02-003A-JAN24/pb/src/pb_init.c:765
    #8  0x3ff80574274 in thdBase(0x0, 0x0, 0x0, 0x1, 0x45586732, 0x3)
    DebugInformationStrippedFromFile101:???
    (ladebug) thread 9180
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    > 9180 running                    throughput 11       <anonymous>
    
    (ladebug) t
    >0  0x3ff8053eb1c in swtch_pri(0x14025cc30, 0x140136020, 0x3ff8057bbc4,
    0x14025cc30, 0x14025cc30, 0x0) DebugInformationStrippedFromFil
    e19:???
    #1  0x3ff8057bafc in UnknownProcedure26FromFile109(0x0, 0x1401d3638,
    0x14025cc30, 0x14025cc30, 0x1, 0x3ff80579b34) DebugInformationStr
    ippedFromFile109:???
    #2  0x3ff8057b0d8 in UnknownProcedure20FromFile109(0x14024c000,
    0x140163a70, 0x14019e030, 0xc000, 0x3ffc018a100, 0x3ff80567778) DebugI
    nformationStrippedFromFile109:???
    #3  0x3ffbffa5a20 in tcp_unpended_send(ctlp=0x14019e030)
    /xswe/vod/CR_V02-003A-JAN24/stm/src/tcpstlib.c:198
    #4  0x3fefffffffd in ???
    (ladebug) thread 9181
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    > 9181 blocked    cond wait       throughput 11       <anonymous>
    
    (ladebug) t
    >0  0x3ff8057c324 in /usr/shlib/libpthread.so
    #1  0x3ff80579b30 in hstTransferContext(0x1, 0x140135878, 0x14013f420,
    0x140139210, 0x0, 0xffffffffffffffff) DebugInformationStrippedF
    romFile109:???
    #2  0x3ff805643c4 in dspDispatch(0x140163b00, 0x100000, 0x14013f420,
    0x1401bbaa0, 0x140163b00, 0x3ff8055fa18) DebugInformationStripped
    FromFile89:???
    #3  0x3ff80560020 in cvWait(0x14018a7a0, 0x1401bbaa0, 0x14014ca40,
    0x14025d190, 0x14013f3f0, 0x14013f420) DebugInformationStrippedFrom
    File1:???
    #4  0x3ff8055d4f4 in __pthread_cond_wait(0x14014ca40, 0x14025d190,
    0x14013f3f0, 0x14013f420, 0x3ff805aef5c, 0x2) DebugInformationStrip
    pedFromFile1:???
    #5  0x3ff805aef58 in pthread_cond_wait(0x14013f3f0, 0x14013f420,
    0x3ff805aef5c, 0x2, 0x120081e90, 0x0) DebugInformationStrippedFromFil
    e7:???
    #6  0x120081e8c in WaitOnLocalSemaphore(0x14025d190, 0x2,
    0x6f4c6e4f74696157, 0x70616d65536c6163, 0x65726f68, 0x14018a7a0)
    DebugInform
    ationStrippedFromFile257:???
    #7  0x12004efec in pb_CFSService(arg_index=2)
    /xswe/vod/CR_V02-003A-JAN24/pb/src/pb_cfs.c:746
    #8  0x3ff80574274 in thdBase(0x0, 0x1403ab948, 0x3ff80197074, 0x1,
    0x45586732, 0x3) DebugInformationStrippedFromFile101:???
    (ladebug) thread 9182
    Thread State      Substate        Policy     Priority Name
    ------ ---------- --------------- ---------- -------- -------------
    > 9182 blocked    timed cond wait throughput 11       <anonymous>
    
    (ladebug) t
    >0  0x3ff8057c324 in /usr/shlib/libpthread.so
    #1  0x3ff80579b30 in hstTransferContext(0x1, 0x140135878, 0x14025d6f0,
    0x14025d868, 0x14013c310, 0x2801000069) DebugInformationStrippe
    dFromFile109:???
    #2  0x3ff805643c4 in dspDispatch(0x1402f16c0, 0x0, 0x14025d6f0, 0x0,
    0x1402f16c0, 0x1403d3830) DebugInformationStrippedFromFile89:???
    #3  0x3ff8055f210 in cvTimedWait(0x0, 0x0, 0x140259dc0, 0x14025d6f0,
    0x1832000000000, 0x14019e340) DebugInformationStrippedFromFile1:?
    ??
    #4  0x3ff8055d4ac in __pthread_cond_timedwait(0x140259dc0, 0x14025d6f0,
    0x1832000000000, 0x14019e340, 0x3ff805aeec0, 0x2) DebugInforma
    tionStrippedFromFile1:???
    #5  0x3ff805aeebc in pthread_cond_timedwait(0x3ff805aeec0, 0x2,
    0x120082908, 0x132e96768, 0x23b59a00, 0x2) DebugInformationStrippedFro
    mFile7:???
    #6  0x120082904 in PBserver
    #7  0x1200244b8 in vds_deliver_loop(vdb=0x1401c0820)
    /xswe/vod/CR_V02-003A-JAN24/pb/src/deliver.c:794
    #8  0x1200212e4 in vds_play_video(arg_index=3)
    /xswe/vod/CR_V02-003A-JAN24/pb/src/deliver.c:127
    #9  0x3ff80574274 in thdBase(0x0, 0x1403d3978, 0x3ff80197074, 0x1,
    0x45586732, 0x3) DebugInformationStrippedFromFile101:???
    
    
    Sorry if the format's hard to read!
    
    - Michael
    
1473.1. "DECthreads is bugchecking..." by PTHRED::PORTANTE (Peter Portante, DTN 381-2261, (603)881-2261, MS ZKO2-3/Q18) Mon Jan 27 1997 16:04 (27 lines)

Michael,

Please enter a QAR for this problem.  If you will, use your first note in the
QAR and add this note to it as well.

The problem here is that the "manager thread" is trying to bugcheck in an
"unblocked" upcall routine.  The unblock upcall routine is running in a kernel
thread which does not have a complete "VP" setup.  Bugcheck tries to write a
message to stdout explaining why it is bugchecking, and does this by calling fflush().
This routine locks and unlocks mutexes associated with the stdout stream.  Since
we don't have a complete "VP" setup, those routines are hanging.

The DECthreads library needs to be modified to avoid this mess.
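
For illustration only, one general way for a failure-reporting path to stay
clear of stdio stream locks is to write the message with write(2) directly.
The sketch below assumes a POSIX system; raw_report is a hypothetical helper
name, not a DECthreads routine, and this is not a description of the actual
library change.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of DECthreads): emit a message using only
 * write(2), so no stdio stream mutex is locked or unlocked on this path. */
static ssize_t raw_report(int fd, const char *msg)
{
    size_t len = strlen(msg);
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, msg + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte was written: retry */
            return -1;      /* any other error: give up and report failure */
        }
        done += (size_t)n;  /* short write: loop to send the remainder */
    }
    return (ssize_t)done;
}
```

A caller on a fatal-error path might use raw_report(STDERR_FILENO, "...")
followed by abort(), rather than printf() plus fflush().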

Now, the bugcheck is occurring because of a race condition with the kernel.
During the "unblocked" upcall, we try to find a VP to run the thread coming out
of the kernel.  The routine thought it found a VP running another thread at a
lower priority.  But when it tried to boot the thread off of that VP, it found
that the VP had blocked in the kernel.  The routine used to do the "boot"
returned an error and so we issued a bugcheck.
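
The check-then-act shape of that race can be sketched in a few lines of C11.
Everything here is hypothetical (struct vp, the state names, and both
functions are made up for illustration and do not reflect DECthreads
internals or the actual patch). The point is only that a VP seen as running
may have changed state by the time the preempt is attempted, so a failed
preempt can be treated as "look again" rather than as a fatal error.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical VP model for illustration only. */
enum vp_state { VP_RUNNING, VP_BLOCKED, VP_PREEMPTED };

struct vp {
    _Atomic int state;  /* may be changed concurrently (e.g. VP blocks) */
    int priority;       /* priority of the thread currently on this VP */
};

/* Preempt succeeds only if the VP is still RUNNING at the instant of the
 * compare-and-swap; returns 1 on success, 0 if the state had changed. */
static int vp_try_preempt(struct vp *v)
{
    int expected = VP_RUNNING;
    return atomic_compare_exchange_strong(&v->state, &expected, VP_PREEMPTED);
}

/* Find a running VP whose thread has priority below prio and preempt it.
 * Returns the VP index, or -1 if none could be preempted. */
static int preempt_lower_priority(struct vp *vps, size_t n, int prio)
{
    for (size_t i = 0; i < n; i++) {
        if (atomic_load(&vps[i].state) != VP_RUNNING || vps[i].priority >= prio)
            continue;
        if (vp_try_preempt(&vps[i]))
            return (int)i;
        /* Lost the race: the VP changed state between the check and the
         * preempt. Keep scanning instead of treating this as fatal. */
    }
    return -1;
}
```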

I am developing a patch for the cause of the bugcheck.  I do not know of a
work-around for this problem, though the fix is already in the next release stream.

Thanks,

-Peter

1473.2. "Any news on the patch?" by CICS03::helen (Helen Pratt) Tue Apr 01 1997 14:06 (11 lines)

Am I correct in thinking that this problem could occur equally easily
on Digital UNIX V4.0A?

If so, what is the state of any patches?  I believe I am working with
a partner who is running into this problem during stress runs.

Thanks,

Helen.

1473.3. "" by SMURF::DENHAM (Digital UNIX Kernel) Tue Apr 01 1997 18:09 (7 lines)

    Yes, it can happen on any V4.0 family release.
    
    The patches start going into the support pools this week.
    Then it's only a matter of weeks and weeks before they become
    officially available. So -- if you've got a customer really
    stuck dead in the water, let Peter know and he'll get
    you some test patches, I'm sure.