T.R | Title | User | Personal Name | Date | Lines |
---|
2142.1 | | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Tue Apr 08 1997 10:13 | 3 |
| What does the "immediate crash" look like?
I bet you have an uninitialized local variable somewhere.
|
2142.2 | The crash | JAMSIE::CORBETT | | Tue Apr 08 1997 10:33 | 48 |
| %SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=000000CC20202020, PC=000000CC20202020, PS=0000001B
Improperly handled condition, image exit forced.
Signal arguments: Number = 0000000000000005
Name = 000000000000000C
0000000000010000
000000CC20202020
000000CC20202020
000000000000001B
Register dump:
R0 = 0000000000000001 R1 = 0000000000000000 R2 = 0000000000010250
R3 = 000000007B0BD869 R4 = 000000007FFCF818 R5 = 000000007FFCF938
R6 = 0000000000000000 R7 = 0000000000000001 R8 = 000000007FFAC208
R9 = 000000007FFAC410 R10 = 000000007FFAD238 R11 = 000000007FFCE3E0
R12 = 0000000000770000 R13 = 000000007B0681E0 R14 = FFFFFFFF80C2FE40
R15 = 0000000000000001 R16 = 0000000000000000 R17 = 0000000000000070
R18 = 0000000000000000 R19 = 0000000000040A80 R20 = 0000000000069B74
R21 = 0000000000000064 R22 = 0000000000000064 R23 = 0000000000000071
R24 = 0000000000069B18 R25 = 0000000000000070 R26 = 000000CC20202020
R27 = 0000000000000000 R28 = 0000000000000000 R29 = 000000007AFB9A00
SP = 000000007AFB9A00 PC = 000000CC20202020 PS = 000000000000001B
That is the crash.
Below are the first few lines of main (containing the printf statement
that was removed to produce the crash).
static main ()
{
FILENAME filename, tmp_string;
ASCIITIME file_creation_date;
int status, in_range, context = 0,remainder;
in_file *in_file_ptr;
printf("start\n");
init_global_data();
Thanks for the quick response.
Cheers
Daniel
|
2142.3 | | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Tue Apr 08 1997 11:23 | 17 |
| Well, somewhere in your application you have code
that is storing blanks (0x20202020) and other garbage
(0xCC) into memory that it doesn't own. This data is
corrupting the return address for some routine. When
you return through this corrupted address, you get this
failure.
I'm not sure how to advise you to debug this. Perhaps you
can try moving the printf down one statement at a time until
the failure returns. That would tend to implicate the line
right above the printf -- then look inside whatever functions
it calls, and repeat.
What has changed since the last time this code worked? If
it's just the move from VAX to Alpha, does the exact same
code work on VAX?
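
One way to narrow down a stray blank-fill is to surround a suspect buffer with guard bytes and check them after each call. The following is only a minimal portable C sketch of that idea, not code from the application: `guarded_buf`, `guard_init`, `guard_intact`, and `blank_fill` are hypothetical names, the 16-byte buffer size is an assumption, and `blank_fill` stands in for whatever routine writes the blanks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_BYTE 0xA5   /* arbitrary pattern, unlikely to be written by accident */
#define GUARD_LEN  8

/* Hypothetical wrapper: a buffer with known guard bytes on both sides. */
struct guarded_buf {
    unsigned char before[GUARD_LEN];
    char          data[16];
    unsigned char after[GUARD_LEN];
};

/* Set both guard regions to the pattern and clear the buffer. */
static void guard_init(struct guarded_buf *g)
{
    assert(g != NULL);
    memset(g->before, GUARD_BYTE, sizeof g->before);
    memset(g->data, 0, sizeof g->data);
    memset(g->after, GUARD_BYTE, sizeof g->after);
}

/* Returns 1 if both guard regions still hold the pattern, 0 otherwise. */
static int guard_intact(const struct guarded_buf *g)
{
    size_t i;
    assert(g != NULL);
    for (i = 0; i < GUARD_LEN; i++)
        if (g->before[i] != GUARD_BYTE || g->after[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Stand-in for a routine that blank-fills `len` bytes starting at data.
   The write goes through a pointer to the whole struct and is clamped to
   the struct, so an oversized `len` lands in the guard, not in stray memory. */
static void blank_fill(struct guarded_buf *g, size_t len)
{
    size_t start = offsetof(struct guarded_buf, data);
    size_t room;
    assert(g != NULL);
    room = sizeof *g - start;
    if (len > room)
        len = room;
    memset((unsigned char *)g + start, ' ', len);
}
```

Calling `guard_intact` after each suspect call shows which call wrote past the end of `data`. Note that adding guards changes the stack layout, so the symptom may move, just as it does with the printf.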
|
2142.4 | Yes | JAMSIE::CORBETT | | Tue Apr 08 1997 11:35 | 9 |
| The exact same code worked correctly on the VAX.
The code with the printf works correctly without
crashing on the Alpha and runs to successful
completion. When the printf is removed it crashes,
but if invoked via the debugger it runs correctly
to completion.
Daniel
|
2142.5 | some hints (collision with .3) | WIDTH::MDAVIS | Mark Davis - compiler maniac | Tue Apr 08 1997 11:41 | 8 |
| virtual address=000000CC20202020, PC=000000CC20202020
R26=000000CC20202020
1. Looks like the program was trying to JSR (R26), JMP (R26), or
RET (R26), where R26 doesn't have a valid code address.
2. Note that the hex 0x20202020 represents 4 space characters, so
perhaps this value came from some string literal?
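
To make point 2 concrete, here is a small portable C sketch that splits the faulting value into bytes in the order a little-endian machine such as Alpha stores them. It is an illustration only: `split_le` and `count_blanks` are hypothetical helper names, and the value is simply the one from the register dump in .2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a 64-bit value into bytes, lowest address
   first (little-endian byte order). */
static void split_le(uint64_t value, unsigned char out[8])
{
    size_t i;
    assert(out != NULL);
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)(value >> (8 * i));
}

/* Count how many of the eight bytes are ASCII blanks (0x20). */
static int count_blanks(uint64_t value)
{
    unsigned char b[8];
    size_t i;
    int n = 0;
    split_le(value, b);
    for (i = 0; i < 8; i++)
        if (b[i] == 0x20)
            n++;
    return n;
}
```

For 0x000000CC20202020 the first four bytes in memory are blanks, followed by 0xCC and three zero bytes, which is what a blank-filled string running into the low half of a saved 64-bit value would look like.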
|
2142.6 | Some more attempts | JAMSIE::CORBETT | | Tue Apr 08 1997 11:44 | 13 |
| Hi Mark
You can see from the code that the first call is to a function
that initialises a global data area.
I have taken the advice in the earlier responses and moved the
printf statement first to just after the function call (execution OK),
then into the code of the called function, and again execution was OK.
But if I remove the printf statement, I get the dump.
It's a mystery to me.
Daniel
|
2142.7 | What I meant | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Tue Apr 08 1997 11:58 | 10 |
| Can you show us all of main()? If the first function call works,
keep moving the printf so it follows the second function call, or
the third function call, and so on:
main() { first(); second(); third(); } /* fails */
main() { printf(); first(); second(); third(); } /* works */
main() { first(); printf(); second(); third(); } /* works */
main() { first(); second(); printf(); third(); } /* suspect second */
main() { first(); second(); third(); printf(); } /* suspect third */
|
2142.8 | | SPECXN::DERAMO | Dan D'Eramo | Tue Apr 08 1997 12:10 | 17 |
| re .2
>static main ()
>{
> ...
Don't do that. :-)
main() should not be declared as 'static'. Simply
int main(void)
{
...
is sufficient.
Dan
|
2142.9 | Reply to .7 | JAMSIE::CORBETT | | Tue Apr 08 1997 12:15 | 5 |
| I have tried what you suggest.
The crash ONLY occurs when the printf statement is NOT there.
Daniel
|
2142.10 | Reply to .8 | JAMSIE::CORBETT | | Tue Apr 08 1997 12:20 | 4 |
| That has no effect.
Daniel
|
2142.11 | Might lib$signal(&SS$_DEBUG) help? | HYDRA::NEWMAN | Chuck Newman, 508/467-5499 (DTN 297), MRO1-3/F26 | Tue Apr 08 1997 12:37 | 6 |
| Try calling lib$signal(&SS$_DEBUG) in init_global_data.
Perhaps starting the debugger after the image is under way
will allow you to poke around and find the problem.
-- Chuck Newman
|
2142.12 | | SPECXN::DERAMO | Dan D'Eramo | Tue Apr 08 1997 12:39 | 5 |
| I'd also check if linking /TRACEBACK without the printf also
crashes (and hopefully the traceback tells you where the
accvio occurred).
Dan
|
2142.13 | Reply to .12 | JAMSIE::CORBETT | | Wed Apr 09 1997 07:02 | 12 |
| I tried and the traceback seems to imply it is the
first instruction.
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=000000CC20202020, PC=000000CC20202020, PS=0000001B
%TRACE-F-TRACEBACK, symbolic stack dump follows
image module routine line rel PC abs PC
0 0000000000000000 000000CC20202020
LMPH ? ?
LMPH 0 0000000000020064 0000000000030064
0 FFFFFFFF8D6D10D8 FFFFFFFF8D6D10D8
Daniel
|
2142.14 | Re: .13 For more help, post a full reproducer.. | COMEUP::SIMMONDS | loose canon | Wed Apr 09 1997 22:26 | 0 |
2142.15 | Re:.14 Can you explain what is needed and how to get it, please? | JAMSIE::CORBETT | | Thu Apr 10 1997 07:09 | 0 |
2142.16 | More detail about the problem | JAMSIE::CORBETT | | Thu Apr 10 1997 12:35 | 68 |
| The main looks like this
int main (void)
{
FILENAME filename, tmp_string;
ASCIITIME file_creation_date;
int status, in_range, context = 0,remainder;
in_file *in_file_ptr;
init_global_data();
printf("\nlocalMPH Availability / Reliability Report\n");
printf("------------------------------------------\n");
status = process_command();
-----------------------------------------------------
If I move the printf calls to after the next function call,
the crash occurs. So the problem is in that function,
which looks like this:
static int process_command (void)
{
char *cp, interval_val[6], separate_val[4];
OS_TYPE os_val;
FILENAME fspec_val, output_val, spread_val;
ASCIITIME date_val;
char redirect[256];
int status, len = 0;
char cmd_line[LD_CM_K_COMMANDLINE_LEN];
char tmp_cmd_line[LD_CM_K_COMMANDLINE_LEN];
$DESCRIPTOR(cmd_dsc, cmd_line);
$DESCRIPTOR(help_dsc, "HELP");
$DESCRIPTOR(fspec_dsc, "FSPEC");
$DESCRIPTOR(fspec_val_dsc, fspec_val);
$DESCRIPTOR(output_dsc, "OUTPUT");
$DESCRIPTOR(output_val_dsc, output_val);
$DESCRIPTOR(since_dsc, "SINCE");
$DESCRIPTOR(since_val_dsc, date_val);
$DESCRIPTOR(before_dsc, "BEFORE");
$DESCRIPTOR(before_val_dsc, date_val);
$DESCRIPTOR(spread_dsc, "SPREAD");
$DESCRIPTOR(spread_val_dsc, spread_val);
$DESCRIPTOR(interval_dsc, "INTERVAL");
$DESCRIPTOR(interval_val_dsc, interval_val);
$DESCRIPTOR(separate_dsc, "SEPARATE");
$DESCRIPTOR(separate_val_dsc, separate_val);
If I place
the printf here
the crash occurs
so I assume that
the culprit is
the code above.
/*
** Get the foreign command.
*/
status = lib$get_foreign(&cmd_dsc, NULL, &len);
This might identify the area of the problem but
why does the crash NOT occur when the printf
statements are included before this code??
Daniel
|
2142.17 | | COMEUP::SIMMONDS | loose canon | Fri Apr 11 1997 00:12 | 12 |
| Re: .15
Daniel, a 'reproducer' is simply the smallest complete source program that
will demonstrate the problem (to you).. if you post it here, please ensure
that _all_ components that your example program requires are present
(excluding the standard DEC C header files) so that anyone in the audience
can extract your note(s) and successfully build on any system 'similar'
to yours.
Also tell us what platform, O/S, compiler versions you are using.
John.
|
2142.18 | | DECCXL::OUELLETTE | temerity time | Fri Apr 11 1997 14:35 | 12 |
| Often the simplest way to make a reproducer is to compile the file
with "-E > newtest.c" or /preprocess. Check that the new file still
exhibits the error symptom. Then if you're feeling ambitious, cut
out as many lines as you can and yet still exhibit the symptom.
Then send in your example.
I've had lots of practice and can usually reduce C failures to 10 or
20 lines. C++ usually requires more... 20 to 40. But we'd very much
rather have a 5000000 line test case (and cut it down ourselves) than
to have no test case at all.
Roland.
|
2142.19 | re: 16 why might printf make a difference? | DECC::MDAVIS | Mark Davis - compiler maniac | Mon Apr 21 1997 12:21 | 18 |
| When the "bad" routine (whatever it is) runs, it probably uses some
uninitialized value from the stack.
When you call printf, it calls some other routines, and they all
scribble into the top of the stack. Then when the "bad" routine runs,
it gets different uninitialized values and behaves differently.
You might add a "fflush(stdout);" after the printf, which will force the
message to be output to the terminal (instead of just being buffered
internally) - that way you can verify that the crash occurs before or
after the printf. (I.e., the crash could be happening AFTER the printf -
you just haven't seen the output on the terminal....)
Making a reproducer as described in .17 and .18 is the only way we can
duplicate your problem and give you better help.
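
The fflush suggestion above can be wrapped in a small helper so that every marker reaches the terminal before the next statement runs. This is a minimal sketch in portable C, not something from the application: `checkpoint` is a hypothetical name and the "checkpoint: " prefix is an arbitrary choice.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print a labelled marker and flush it immediately.
   Returns the number of characters written, or -1 if the write or the
   flush failed. */
static int checkpoint(FILE *out, const char *label)
{
    int n;
    assert(out != NULL && label != NULL);
    n = fprintf(out, "checkpoint: %s\n", label);
    if (n < 0 || fflush(out) != 0)
        return -1;
    return n;
}
```

A call such as `checkpoint(stdout, "before process_command");` placed between statements then shows the last point reached. Because the helper itself calls into the C run-time library, it scribbles on the stack just as printf does and may hide the crash in the same way.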
Mark
|