
Conference turris::decladebug

Title:Digital Ladebug debugger
Moderator:TLE::LUCIA
Created:Fri Feb 28 1992
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:969
Total number of notes:3959

916.0. "Detach kicks it off again" by CICS03::helen (Helen Pratt) Tue Apr 08 1997 12:08


I am working with a partner who is working on a transaction processing
product on Digital UNIX V4.0A.  They are currently stress testing
the product.

During the stress tests, they get processes which "hang", not
consuming CPU or apparently doing anything.  They then attach to
the hung processes and get stack output before detaching.  Detaching
from the process appears to start it going again.  The question is:
why?

The suspicion here is that this is associated with signal handling.
However, any thoughts or ideas would be appreciated.

Thanks,

Helen.


T.R  Title  User  Personal Name  Date  Lines
916.1  TLE::LUCIA  http://asaab.zko.dec.com/~lucia/biography.html  Tue Apr 08 1997 13:14  9 lines
This is the intended behavior of the debugger.  Ladebug does not kill any
process it did not create.  Since the process in question was already running
when ladebug interrupted it (by attaching), ladebug's disposition towards the
process is to continue it (when detaching).

There should not be any suspicious signal handling things going on as a result.

Regards,
Tim
916.2  maybe...  QUARRY::petert  rigidly defined areas of doubt and uncertainty  Tue Apr 08 1997 14:43  34 lines
Hmmm, I think you missed the point of the question, Tim.  As I understand
it, the process was basically hung before they attached to it.  They attached
only for the purpose of getting a stack trace and then detached, expecting
not that the process would die, but that it would still be hung.  Yet,
instead the process continued normally, so that the 'attach - detach'
sequence had the effect of 'unhanging' the process.

As to why that would happen... two possibilities.  One is your assumption,
that the process was not handling a signal correctly and that the detach
(which in dbx would clear the pending signals before letting it run
by itself - something I imagine ladebug does also, but I'm not sure
of the internals there) cleared that signal.  I'm not too sure how likely
that scenario is, as I would expect the process to die outright if it 
didn't handle the signal correctly.  But it's a possibility.

The other possibility goes to one of the reasons a process can hang without
consuming CPU or doing anything detectable.  That is often the case when
the process is waiting on some resource that is not currently
available: I/O on some disk that is NFS mounted but might be down at the
moment, or on some serial device that is being used exclusively by
another process.  Here the attach-detach scenario may be one of
lucky timing, so that by the time the detach was done, the resource
was available again.  Since you said this was during stress testing,
it seems like a good possibility, especially if you have more than one
process fighting for the same resource.  But that's just a guess based 
on your brief explanation without a detailed view of the test process.

If this thing is consistently repeatable, I might opt for the signal handling
idea.  If it only crops up now and then, and there are a number of 
processes running, I'd vote for the second option.

Of course, with my usual luck, it would be a third option I haven't considered.

PeterT
916.3  Does ladebug clear all pending signals???  CICS03::helen  Helen Pratt  Wed Apr 09 1997 06:44  33 lines
Peter,

Thanks for the reply.

>>Hmmm, I think you missed the point of the question, Tim.  As I understand
>>it, the process was basically hung before they attached to it.  They attached
>>only for the purpose of getting a stack trace and then detached, expecting
>>not that the process would die, but that it would still be hung.  Yet,
>>instead the process continued normally, so that the 'attach - detach'
>>sequence had the effect of 'unhanging' the process.

This is indeed the scenario we're talking about.

>>As to why that would happen... two possibilities.  One is your assumption,
>>that the process was not handling a signal correctly and that the detach
>>(which in dbx would clear the pending signals before letting it run
>>by itself - something I imagine ladebug does also, but I'm not sure
>>of the internals there) cleared that signal. 

Can someone give some detail about whether ladebug clears pending signals
here or not?  Given that the scenario was seen with 5 pairs of processes
on one run over a couple of hours, signal handling is the suspected cause
of our problem.

Any other suggestions are welcome!

Thanks,

Helen.



916.4  TLE::SHAMIM  Wed Apr 09 1997 10:00  4 lines
Yes, ladebug does clear pending signals and faults when it detaches
from a process.

shamim
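
For readers unsure what a "pending" signal is in this discussion, here is a
minimal sketch.  It is not from the thread and says nothing about ladebug's
internals.  It assumes a POSIX system and Python 3.8 or later; the function
name pending_after_blocked_raise is made up for illustration.  A signal that
is raised while blocked stays pending until something consumes or clears it.

```python
import signal


def pending_after_blocked_raise(sig=signal.SIGUSR1):
    """Block `sig`, raise it in this thread, and report its pending state.

    Returns (was_pending, still_pending): whether the signal was pending
    right after being raised, and whether it is still pending after it
    has been consumed with sigwait().
    """
    # Block the signal so that it cannot be delivered to this thread.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        # Raise the signal; because it is blocked, it becomes pending.
        signal.raise_signal(sig)
        was_pending = sig in signal.sigpending()

        # Consume the pending signal without running any handler.
        signal.sigwait({sig})
        still_pending = sig in signal.sigpending()
    finally:
        # Restore the original signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return was_pending, still_pending


if __name__ == "__main__":
    print(pending_after_blocked_raise())
```

Here sigwait() is only a stand-in for "something removed the pending signal".
Whether a cleared signal explains the hang in 916.0 would still have to be
confirmed against the partner's own signal handling code.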