Title: Digital Ladebug debugger
Moderator: TLE::LUCIA
Created: Fri Feb 28 1992
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 969
Total number of notes: 3959
881.0. "4.0-31 watch; run; causes C++ program to not progress. Store_conditional problem" by WIDTH::MDAVIS (Mark Davis - compiler maniac) Wed Mar 12 1997 14:19
Perhaps my enthusiasm in 869.2 was premature :-)
For this C++ program, if I set a watchpoint on a location and then
"run", nothing happens for the 10 minutes I waited. Most of the time
is spent in ladebug (according to ps:
USER    %CPU %MEM   VSZ  RSS S    STARTED     TIME COMMAND
mdavis  94.8  9.1 22.0M  11M R +  13:40:04 2:04.56 ladebug a.out
mdavis   3.9  0.4 1.80M 520K T +  13:40:14 0:04.71 a.out
)
If I set a breakpoint on main, and don't set the watchpoint until I hit
that breakpoint, everything runs fine.
Aha! Sometimes two ^C's will stop the program during initialization:
#0 0x3ff81d37c00 in streambuf(
#1 0x3ff81d448f0 in filebuf(
#2 0x3ff81d43d40 in initialize(
#3 0x3ff81d438f8 in Iostream_init(
and the instruction in streambuf is:
(ladebug) 0x3ff81d37c00/i
[<opaque> streambuf(void), 0x3ff81d37c00] stq_c r3, 0(r2)
and $r2 happens to point into the same 8k page as the address I'm
watching. Every time this stq_c executes, it faults since it's writing
to a protected page. Ladebug then unprotects the page, single steps,
and reprotects the page. HOWEVER, this conditional store FAILS, since
lots of branching happened since the preceding, matching ldq_l. So
the program loops back to the ldq_l and tries (and fails) again! (and
again and again ....)
In other words, if there's a lock on the same page we're trying to watch,
then the program will always loop infinitely.
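The loop described above can be illustrated with a toy model. This is only a sketch, not ladebug or Alpha code: the `ToyCpu` struct and the `attempts_until_store_succeeds` function are hypothetical names, and the model simply assumes that every trip through the debugger clears the lock flag set by ld_l, which is the behavior reported in this note.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (hypothetical, not ladebug code) of one load-locked /
// store-conditional retry loop running under a page-protection watchpoint.
struct ToyCpu {
    bool lock_flag = false;    // set by ld_l; cleared when the sequence is disturbed
    std::uint64_t memory = 0;  // the quadword the program is trying to update
};

// Returns how many attempts the ld_l ... st_c loop needs to succeed,
// or -1 if it still has not succeeded after max_attempts tries.
int attempts_until_store_succeeds(bool page_is_watched, int max_attempts) {
    assert(max_attempts > 0);
    ToyCpu cpu;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        // ld_l: read the current value and arm the lock flag.
        std::uint64_t value = cpu.memory;
        cpu.lock_flag = true;

        // A st_c to a watched page faults into the debugger, which unprotects,
        // single-steps, and reprotects. Assumption: that detour always clears
        // the lock flag, as observed in this note.
        if (page_is_watched) {
            cpu.lock_flag = false;
        }

        // st_c: the store happens only if the lock flag survived.
        if (cpu.lock_flag) {
            cpu.memory = value + 1;
            return attempt;
        }
        // Otherwise the program branches back to ld_l and tries again.
    }
    return -1;
}
```

With the page unwatched the model succeeds on the first attempt; with the page watched it never succeeds, however large `max_attempts` is.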
WHAT CAN LADEBUG DO??
1. Diagnose the problem and break: if the faulting instruction is
   stq_c or stl_c, stop, warn the user, and ask if they want to stop
   watching for a while.
2. Try to work around the problem:
   if you fault on st?_c then
     a. unprotect, single step TWICE
     b. put a breakpoint on the instruction following the st?_c
     c. do NOT protect the page!
     d. continue; the st?_c will fail, and the program should
        loop back and do the ld?_l; ...; st?_c successfully and then
     e. hit the breakpoint from b.
     f. PROTECT the page, and remove the breakpoint
     g. continue
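Steps a through g can be sketched as a small simulation. This is a hedged illustration only: `WatchResult` and `run_with_workaround` are made-up names, nothing here is ladebug's implementation, and the model assumes that the two single steps carry the program past the failed st?_c and its retry branch, so the temporary breakpoint is not hit until the store has really succeeded.

```cpp
#include <cassert>

// Hypothetical record of what the simulated debugger and program ended up doing.
struct WatchResult {
    int store_attempts;           // how many times st_c was executed
    bool store_succeeded;         // did the ld_l ... st_c sequence complete?
    bool page_protected_at_end;   // is the watchpoint armed again?
    bool breakpoint_left_behind;  // was the temporary breakpoint removed?
};

WatchResult run_with_workaround(int max_attempts) {
    assert(max_attempts > 0);
    bool page_protected = true;    // watchpoint armed via page protection
    bool temp_breakpoint = false;  // breakpoint on the instruction after st_c
    WatchResult result{0, false, true, false};

    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        bool lock_flag = true;  // ld_l arms the lock flag
        result.store_attempts = attempt;

        if (page_protected) {
            // st_c faults on the protected page.
            page_protected = false;  // a. unprotect (and single step twice)
            lock_flag = false;       //    the detour makes this st_c fail
            temp_breakpoint = true;  // b. breakpoint after the st_c
                                     // c. page is deliberately left unprotected
        }
        if (!lock_flag) {
            continue;                // d. st_c failed; program loops back
        }

        result.store_succeeded = true;  // st_c succeeded on the unprotected page
        if (temp_breakpoint) {          // e. breakpoint from b is hit
            page_protected = true;      // f. protect the page again
            temp_breakpoint = false;    //    and remove the breakpoint
        }
        break;                          // g. continue
    }
    result.page_protected_at_end = page_protected;
    result.breakpoint_left_behind = temp_breakpoint;
    return result;
}
```

In this model the store goes through on the second attempt and the watchpoint is armed again afterwards. The model always reaches the st?_c on the retry, which is an assumption of the sketch rather than a property of real code.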
Source program and ladebug output:
cat watch_bug.cxx
extern "C" void *putchar(char);
struct A{
virtual void *r(){char c;return((c=*p++)?(r(),putchar(c)):(p=*--q));}
static char *p,**q;
A(){r();}
~A(){r();}};
struct B:A{};
struct C:B,A{};
struct D:C,B,A{};
char*m[]={"\nkraM\n\n.uoy ","gniteem deyojne I","\n.yadnoM no lae",
"m lufrednow eht ","rof uoy knahT ","\n,ekiM dna ",",nhoJ ",",evetS"};
main(){B::p=*(B::q=&m[7]);D d;}
char *A::p,**A::q;
tagged 313% cxx -g watch_bug.cxx
tagged 314% ladebug a.out
Welcome to the Ladebug Debugger Version 4.0-31
------------------
object file name: a.out
Reading symbolic information ...done
line: 9 Unable to parse input as legal command or C++ expression.
(ladebug) watch (&A::q)
[#1: watch 0x140000698 to 0x14000069f ]
(ladebug) run <<<*** I have to hit ^C twice to get its attention
(ladebug) q
390.76u 180.94s 10:00 95% 46+138k 0+3io 7pf+7w 184stk+32416mem
***It was busy running for 10 minutes doing nothing useful.
tagged 315% ladebug a.out
Welcome to the Ladebug Debugger Version 4.0-31
------------------
object file name: a.out
Reading symbolic information ...done
line: 9 Unable to parse input as legal command or C++ expression.
(ladebug) sti main
[#1: stop in int main(void) ]
(ladebug) run
[1] stopped at [int main(void):12 0x1200022cc]
12 main(){B::p=*(B::q=&m[7]);D d;}
(ladebug) watch (&A::q) <<<<*** Doing "watch" after program
<<<*** has started is much more successful!
[#2: watch 0x140000698 to 0x14000069f ]
(ladebug) c
[2] The contents at address range 0x140000698 to 0x14000069f
was accessed by instruction 0x1200022d8
Old value = 0
New value = 5368709192
[2] stopped at [int main(void):12 0x1200022dc]
12 main(){B::p=*(B::q=&m[7]);D d;}
(ladebug) c
[2] The contents at address range 0x140000698 to 0x14000069f
was accessed by instruction 0x120001e34
Old value = 5368709192
New value = 5368709184
[2] stopped at [void* A::r(void):3 0x120001e38]
3 virtual void *r(){char c;return((c=*p++)?(r(),putchar(c)):(p=*--q));}
(ladebug) q
T.R | Title | User | Personal Name | Date | Lines |
---|
881.1 | | TLE::LUCIA | http://asaab.zko.dec.com/~lucia/biography.html | Wed Mar 12 1997 14:35 | 4 |
| I like this trend of posting solutions with the bug reports. Keep up the good
work, all!
Tim
|
881.2 | Not always so simple | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Wed Mar 12 1997 15:50 | 33 |
| Can we post bug reports on the solutions, too?
Mark's strategy assumes that the next time through the LDx_L + STx_C
sequence, you will actually reach the STx_C instruction, rather than
branching out early. If you branch out early, the program will "free-run"
without its watchpoint turned on. For example, the recommended
sequence to set a mutex looks roughly like:
LOOP:    LDQ_L   R0, (R1)
         BLBS    R0, WAIT     ; Don't try store if already set
         BIS     R0, #1, R2
         STQ_C   R2, (R1)     ; Try to set the lock
         BEQ     R2, LOOP     ; Repeat if failed
SUCCESS:
         :
WAIT:    <do something entirely different>
It's quite likely that the first time through this code the STQ_C faults,
but the second time through it the BLBS is taken.
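The failure mode can be made concrete with a toy model of the mutex loop above running under the proposed workaround. Everything here is illustrative: `Exit`, `Outcome`, and `acquire_under_workaround` are hypothetical names, and the parameter stands in for "something else set the lock between the failed STQ_C and the retry".

```cpp
#include <cassert>

enum class Exit { Success, Wait };

struct Outcome {
    Exit exit_path;         // which label the program ended up at
    bool watchpoint_armed;  // is the page protected again at that point?
};

// Simulates the LDQ_L/BLBS/BIS/STQ_C/BEQ loop with the workaround applied on
// the first STQ_C fault. If other_thread_takes_lock_before_retry is true, the
// lock bit is set by someone else before the program loops back.
Outcome acquire_under_workaround(bool other_thread_takes_lock_before_retry) {
    unsigned long lock_word = 0;   // low bit set means "held"
    bool page_protected = true;    // watchpoint armed
    bool temp_breakpoint = false;  // breakpoint after STQ_C

    for (int pass = 1; pass <= 8; ++pass) {
        unsigned long r0 = lock_word;  // LDQ_L R0, (R1)
        bool lock_flag = true;

        if (r0 & 1UL) {                // BLBS R0, WAIT
            // The sequence is left without ever reaching the breakpoint,
            // so nothing reprotects the page.
            return {Exit::Wait, page_protected};
        }

        if (page_protected) {          // STQ_C faults: apply the workaround
            page_protected = false;
            temp_breakpoint = true;
            lock_flag = false;         // this STQ_C fails
        }
        if (lock_flag) {               // STQ_C succeeds
            lock_word = r0 | 1UL;
            if (temp_breakpoint) {     // breakpoint hit: reprotect, remove it
                page_protected = true;
                temp_breakpoint = false;
            }
            return {Exit::Success, page_protected};
        }

        // BEQ R2, LOOP: retry, possibly after someone else took the lock.
        if (other_thread_takes_lock_before_retry) {
            lock_word |= 1UL;
        }
    }
    return {Exit::Wait, page_protected};
}
```

When the retry reaches the STQ_C, the watchpoint comes back. When the retry takes the BLBS instead, the model leaves the loop with the page still unprotected, which is the "free-run" case.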
What does ladebug do when it's executing 'step' or 'stepi' and comes to
a LDx_L instruction? I seem to recall that VAX DEBUG parses the following
instructions, and sets a breakpoint on every potential branch target
and also after the STx_C instruction, then executes the whole sequence
without interference. The Alpha architecture says STx_C can fail if
there's a taken branch since the matching LDx_L, or if there are too
many instructions since the matching LDx_L, so this is a relatively bounded
problem. The only trouble is if there's a computed jump in there...
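A rough sketch of that breakpoint-placement idea, over a made-up instruction representation, might look like the following. The `Kind`, `Insn`, `ExitPoints` types and the `exit_breakpoints` function are all hypothetical; this is not how any real debugger decodes Alpha instructions, and it only looks at direct branches between the LDx_L and its STx_C.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

enum class Kind { LoadLocked, StoreConditional, Branch, ComputedJump, Other };

struct Insn {
    Kind kind;
    std::size_t target = 0;  // index of the branch target; used only for Kind::Branch
};

struct ExitPoints {
    bool ok;                      // false if the sequence could not be analyzed
    std::set<std::size_t> exits;  // instruction indices that need a breakpoint
};

// Scans forward from the LDx_L at ldl_index and collects every place control
// can leave the sequence: each direct branch target, plus the instruction
// right after the STx_C. Gives up on a computed jump or a missing STx_C.
ExitPoints exit_breakpoints(const std::vector<Insn>& code, std::size_t ldl_index) {
    assert(ldl_index < code.size());
    assert(code[ldl_index].kind == Kind::LoadLocked);

    ExitPoints result{false, {}};
    for (std::size_t i = ldl_index + 1; i < code.size(); ++i) {
        switch (code[i].kind) {
        case Kind::Branch:
            result.exits.insert(code[i].target);
            break;
        case Kind::ComputedJump:
            result.exits.clear();  // target unknown: the troublesome case
            return result;
        case Kind::StoreConditional:
            result.exits.insert(i + 1);
            result.ok = true;
            return result;
        default:
            break;
        }
    }
    result.exits.clear();  // ran off the end without finding an STx_C
    return result;
}
```

For the mutex loop shown earlier, numbering LDQ_L as instruction 0 and WAIT as instruction 6, this yields breakpoints at instruction 4 (the BEQ after the STQ_C) and instruction 6 (WAIT).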
Perhaps what should happen when the watchpoint triggers on STx_C is
that you step over the STx_C (which will produce 0 in its output register),
then reprotect the page to no-access, and try to step until you reach
a LDx_L instruction. At that point, unprotect the page and use an
appropriate strategy (such as the one in the preceding paragraph) to
step over the entire LDx_L + STx_C sequence.
|
881.3 | | TLE::CHIU | | Thu Mar 13 1997 09:59 | 5 |
|
Thank you for reporting this problem and providing a reproducer along
with suggested solutions :-) I'm looking into this now.
Caroline
|