T.R | Title | User | Personal Name | Date | Lines |
---|
386.1 | this question was cross posted in DECWINDOWS | COL01::VSEMUSCHIN | Duck and Recover ! | Wed Mar 26 1997 14:19 | 10 |
| I asked this question in the DECWINDOWS notes conference (note 5813.*).
The only answer was from Steve Hoffman, who supposes that the QUADWORD
may be an IOSB (0x2c is SS$_ABORT). To be closer to Steve I decided to
move this thread here.
Btw, the shareable image contains only Xt library calls and no
direct calls to system services. This code is written in C++
and the main program in Fortran.
=Seva
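A minimal sketch of how the corrupted quadword could be checked against the IOSB guess, assuming the conventional IOSB layout (condition value in the low word, transfer count in the second word, device-dependent longword on top) and a little-endian quadword. Only the value 0x2c for SS$_ABORT comes from this thread; the names DecodedIosb, decode_iosb and looks_like_aborted_io are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical field names; assumes the conventional IOSB layout.
struct DecodedIosb {
    std::uint16_t status;       // condition value (low word)
    std::uint16_t count;        // transfer count (second word), by convention
    std::uint32_t device_info;  // device-dependent longword
};

// SS$_ABORT as cited in the thread (0x2c).
constexpr std::uint16_t kSsAbort = 0x2c;

// Split a quadword, as read from the corrupted location, into IOSB fields.
inline DecodedIosb decode_iosb(std::uint64_t quadword) {
    DecodedIosb d;
    d.status = static_cast<std::uint16_t>(quadword & 0xFFFFu);
    d.count = static_cast<std::uint16_t>((quadword >> 16) & 0xFFFFu);
    d.device_info = static_cast<std::uint32_t>(quadword >> 32);
    return d;
}

// True when the low word matches SS$_ABORT, i.e. the footprint
// is consistent with a cancelled or aborted I/O request.
inline bool looks_like_aborted_io(std::uint64_t quadword) {
    return decode_iosb(quadword).status == kSsAbort;
}
```

If the other fields also look plausible (for example a small transfer count), that would strengthen the IOSB guess, but it would not prove it.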
|
386.2 | These Stack Corrupters Are Fun To Find... | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Wed Mar 26 1997 14:33 | 22 |
| : To be closer to Steve I decided to move this thread here.
I'm not sure I want to think about that. :-)
--
I'll assume the current set of Motif patches have been applied,
and this problem is (still) being seen.
I'd be suspicious of any subroutine calls -- it does not have to be
a call to the X toolkit or to a system service...
Does this error always involve a corruption to the same memory
location? (Is it possible to use a watchpoint or to program the
debugger to watch the location(s), possibly enabling the watchpoint
or breakpoint only when the appropriate application callframe(s) are
active? The debugger can literally be programmed to "lurk", looking
for the footprint of the error before it takes action.)
Without looking at the code -- and possibly only by running it under
the debugger -- it's only really possible to guess...
|
386.3 | fairly obvious, but have you checked for... | CUJO::SAMPSON | | Wed Mar 26 1997 22:33 | 6 |
| If a routine were to declare an IOSB as an automatic
(stack) variable, queue an asynchronous I/O request, then
return without waiting for completion, this kind of corruption
could happen to some later user of the same stack address, when
the I/O completes and fills in the IOSB quadword. I've seen
this happen before. It's often very timing-dependent.
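The pattern described above can be sketched in portable C++. This is only a simulation under stated assumptions: FakeIoQueue stands in for the system completing an asynchronous request, and Iosb, start_read and complete_all are hypothetical names, not OpenVMS APIs. The broken variant appears only as a comment, because actually running it would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an I/O status block (one quadword).
struct Iosb {
    std::uint16_t status = 0;
    std::uint16_t count = 0;
    std::uint32_t device_info = 0;
};

// Hypothetical stand-in for the system: it remembers where each
// pending request wants its completion status written.
class FakeIoQueue {
public:
    void queue(std::shared_ptr<Iosb> iosb) { pending_.push_back(std::move(iosb)); }

    // "I/O completion": fill in every pending IOSB, then forget it.
    void complete_all(std::uint16_t status) {
        for (auto& p : pending_) p->status = status;
        pending_.clear();
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<std::shared_ptr<Iosb>> pending_;
};

// BROKEN pattern (comment only; a raw-pointer queue is assumed here):
//   void start_read_broken(RawQueue& q) {
//       Iosb iosb;        // automatic (stack) variable
//       q.queue(&iosb);   // request still pending...
//   }                     // ...but the stack slot is released on return
//
// SAFER pattern: the IOSB is shared with the queue, so it stays valid
// until the completion has been written, whoever returns first.
inline std::shared_ptr<Iosb> start_read(FakeIoQueue& q) {
    auto iosb = std::make_shared<Iosb>();
    q.queue(iosb);
    return iosb;
}
```

The other usual fix is to keep the IOSB on the stack but wait for completion before returning; either way the rule is that the IOSB must outlive the request.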
|
386.4 | catching the Heffalump | COL01::VSEMUSCHIN | Duck and Recover ! | Thu Mar 27 1997 04:03 | 56 |
| "We will charm it with soap and smile,
We will scare it with railway fare ..."
Lewis Carroll, The Hunting of the Snark
>> Does this error always involve a corruption to the same memory
>> location?
Yes ...
>> (Is it possible to use a watchpoint or to program the
>> debugger to watch the location(s), possibly enabling the watchpoint
>> or breakpoint only when the appropriate application callframe(s) are
>> active? The debugger can literally be programmed to "lurk", looking
>> for the footprint of the error before it takes action.)
No
The debugger doesn't allow watchpoints on the stack, 'cause that would make
this quiz too easy to solve ;-) Anyway, to be sure I will RTFM one more
time to check whether I can find anything else useful there.
And as you may recall from .0, I tried to build my own Very Cunning
Trap to catch the Heffalump. To my great sorrow, SYS$IMGSTA_C was caught
instead of this animal. And I hoped that one of the readers
would have an idea WHY?
>> Without looking at the code -- and possibly only by running it under
>> the debugger -- it's only really possible to guess...
I wrote some comments about the source code in the DECWINDOWS notes conference
and I don't want to repeat them. The only thing I want to say is that
investigating this code is about as hopeful as guessing ...
>> -< fairly obvious, but have you checked for... >-
>>--------------------------------------------------------------------------------
>> If a routine were to declare an IOSB as an automatic
>>(stack) variable, queue an asynchronous I/O request, then
>>return without waiting for completion, this kind of corruption
>>could happen to some later user of the same stack address, when
>>the I/O completes and fills in the IOSB quadword. I've seen
>>this happen before. It's often very timing-dependent.
Yes, and now please compare it with:
from .0
>> It is possible that another program from the same shareable image
>> initiated an asynchronous action (AST or I/O with AST notification)
>> and gave it the address of a local variable as a parameter. Then this
>> routine returned and the asynchronous action shot the stack of our
>> program.
from .1
>> I asked this question in the DECWINDOWS notes conference (note 5813.*).
>> The only answer was from Steve Hoffman, who supposes that the QUADWORD
>> may be an IOSB (0x2c is SS$_ABORT). To be closer to Steve I decided to
>> move this thread here.
Yes, this is the suggestion we all agree on. What is still
unclear is WHO did it (see also the title of this thread).
=Seva
|
386.5 | | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Thu Mar 27 1997 08:33 | 2 |
| Well, if it is SS$_ABORT, perhaps you could look for places that issue a
$CANCEL, and try to figure out who issued the operation that's being CANCELed?
|
386.6 | Debugger can be *Programmed* | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Thu Mar 27 1997 09:55 | 30 |
| :>> Does this error always involve a corruption to the same memory
:>> location?
: Yes ...
Then *program* the debugger to look at that offset in any
routine at the same "stack depth"...
Also look at the previous subroutines that would have been active
at this same "stack depth", and see what variables are at the
specified offset, and how these variables are used. If nothing
interesting is found at the first "stack depth", work deeper into
the call stack.
:>> (Is it possible to use a watchpoint or to program the
:>> debugger to watch the location(s), possibly enabling the watchpoint
:>> or breakpoint only when the appropriate application callframe(s) are
:>> active? The debugger can literally be programmed to "lurk", looking
:>> for the footprint of the error before it takes action.)
: No
: The debugger doesn't allow watchpoints on the stack, 'cause that would make
: this quiz too easy to solve ;-)
I've had good luck with programming the debugger -- the debugger
can be far more useful than "just" watchpoints or breakpoints.
The debugger can be programmed to take an action as a result of a
breakpoint, and the actions can be to start evaluating the contents
of the stack, or to examine routine-local variables, and then to decide
whether to continue or call attention to a problem, etc.
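The "lurking" idea can be illustrated with a small C++ stand-in. This is not debugger syntax and not anything from the original code; Lurker and its members are hypothetical names. Each call to checkpoint() plays the role of a breakpoint action: it examines the watched location, continues silently while the location is clean, and records where the footprint (here, a low word of 0x2c) was first seen.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch of a programmed "breakpoint action".
class Lurker {
public:
    Lurker(const volatile std::uint64_t* location, std::uint16_t footprint)
        : location_(location), footprint_(footprint) {}

    // Call at each point where a breakpoint would fire.
    // Returns true to "continue", false once the footprint has been seen.
    bool checkpoint(const std::string& where) {
        if (tripped_) return false;  // already reported; keep the first hit
        const std::uint16_t low_word =
            static_cast<std::uint16_t>(*location_ & 0xFFFFu);
        if (low_word == footprint_) {
            tripped_ = true;
            first_dirty_ = where;    // first checkpoint that saw the damage
            return false;
        }
        last_clean_ = where;         // damage happened after this point
        return true;
    }

    const std::string& first_dirty() const { return first_dirty_; }
    const std::string& last_clean() const { return last_clean_; }

private:
    const volatile std::uint64_t* location_;
    std::uint16_t footprint_;
    bool tripped_ = false;
    std::string first_dirty_;
    std::string last_clean_;
};
```

The culprit then lies between last_clean() and first_dirty(); adding checkpoints inside that interval narrows it further, which is the same bisection one would do with breakpoints at routine entry and exit.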
|