
Conference clt::cma

Title:DECthreads Conference
Moderator:PTHRED::MARYSTEON
Created:Mon May 14 1990
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1553
Total number of notes:9541

1484.0. "stack-related question" by KCBBQ::PRESTON (big enough never is) Thu Feb 13 1997 15:20

A VAR I support has an x-windows program that fires off some threads. 
The program works well on Alpha/OVMS v7.1, but consistently dies when 
the thread is created and left idle, as well as during "normal" use.  

More info including the error msg is given below.  We understand that this is
extremely limited information (we can supply more if needed), but the VAR 
asks, "are there any patches available that encompasses the changes
made to VMS 7.1 but are available on 6.1 (or 6.2)"?

Thanks in advance,

- Taylor

---------------------------------------------------------------------------

OpenVMS ALPHA 6.1

anal/image/inter sys$library:CMA$LIB_SHR.EXE

        Image Identification Information

                image name: "CMA$LIB_SHR"
                image file identification: "CMA V2.11-439"
                image file build identification: "X5SC-SSB-0000"
                link date/time:  2-APR-1994 07:25:33.88
                linker identification: "T11-11"


 anal/image/inter sys$library:CMA$OPEN_LIB_SHR.EXE

        Image Identification Information

                image name: "CMA$OPEN_LIB_SHR"
                image file identification: "CMA V2.11-439"
                image file build identification: "X5SC-SSB-0000"
                link date/time:  2-APR-1994 07:25:45.05
                linker identification: "T11-11"


anal/image/inter sys$library:CMA$OPEN_RTL.EXE    

        Image Identification Information

                image name: "CMA$OPEN_RTL"
                image file identification: "CMA V2.11-439"
                image file build identification: "X5SC-SSB-0000"
                link date/time:  2-APR-1994 07:20:05.36
                linker identification: "T11-11"


anal/image/inter sys$library:CMA$RTL.EXE

        Image Identification Information

                image name: "CMA$RTL"
                image file identification: "CMA V2.11-441"
                image file build identification: "X5SC-SSB-0000"
                link date/time:  2-APR-1994 07:19:37.51
                linker identification: "T11-11"


 anal/image/inter sys$library:CMA$TIS_SHR.EXE

        Image Identification Information

                image name: "CMA$TIS_SHR"
                image file identification: "CMA V2.11-439"
                image file build identification: ""
                link date/time:  2-APR-1994 06:54:47.55
                linker identification: "T11-11"




%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=7FF91FC0, PC=0134BD74, PS=0000001B
%DEBUG-E-LASTCHANCE, stack exception handlers lost, re-initializing
stack

%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual
address=011BFE90, PC=8042479C, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=7FF91FC0, PC=0134BD74, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=7FF91FC0, PC=0134BD74, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=7FF91FC0, PC=0134BD74, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=7FF91FC0, PC=0134BD74, PS=0000001B
%DEBUG-I-ERRINSDEC, error occurred while decoding instruction at current PC
%DEBUG-F-ACCVIO, access violation, reason mask=00, virtual
address=7FF91FC0, PC=0134BD74, PS=0000001B
-MAKE-F-NOMSG, Message number 80442754
break on unhandled exception at  in %TASK 2
error: %TASK 2 has overflowed its stack
  SP: 00000034  Stack top at: 011C2A00  Remaining bytes: -18622924


%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=7FF91FC0, PC=0134BD74, PS=0000001B
%DEBUG-W-BADSTACK, stack corrupted - no further data available
DBG>

1484.1. "Confusion reigns -- we need clarification and information." by WTFN::SCALES (Despair is appropriate and inevitable.) Thu Feb 13 1997 16:50 (36 lines)

Taylor, your note is really confusing.  

Are you saying that your VAR has an existing program which works well on V7.1
but which breaks after they have added threads to it?  I.e., please go reread
your base note and interpret the following sentence for us:

.0> The program works well on Alpha/OVMS v7.1, but consistently dies [...] 
.0> during "normal" use.

Also, if the VAR is doing their work on V7.1, why did you post image analyses
from V6.1?  I presume that the debugger output is from V6.1 as well...  

It might be more useful to post the output from the program run without the
debugger.  Also, in order to interpret the ACCVIO messages, it would be useful
to have the output from a Debug SHOW IMAGE command, so we know which images they
are in and at what offset.

I wouldn't pay too much attention to the "%TASK 2 has overflowed its stack"
message -- by the time that comes out, the process is in such dire straits
(i.e., after an access violation and an extreme debugger intervention) that
there isn't much you can trust.
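
One way to see why that message deserves little weight is to redo its arithmetic. The sketch below assumes (nothing in this thread confirms it) that the debugger computes "Remaining bytes" as the reported SP minus the reported stack address. Under that assumption the figure in .0 is reproduced exactly, and it is negative only because SP reads as 00000034, which is not a plausible stack pointer.

```python
# Values copied from the debugger output in .0.
sp = 0x00000034          # "SP: 00000034"
stack_top = 0x011C2A00   # "Stack top at: 011C2A00"

# Assumption: "Remaining bytes" = SP - reported stack address.
remaining = sp - stack_top
print(remaining)  # -18622924, the same figure the debugger printed
```

In other words, the "overflow" may say more about a trashed SP value than about how much stack the thread really used.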


.0> are there any patches available that encompasses the changes made to VMS 7.1 
.0> but are available on 6.1 (or 6.2)

So the customer is saying that they want the enhancements that prompted us to
increase the operating system's major version number, packaged as a patch?  
(E.g., "We would rather not upgrade our system, but we're willing to completely
replace the kernel and several of the fundamental run-time libraries...") 
Suppose we offered them a patch which made their V7.1 system lie and call itself
V6.2-9?  8-)



				Webb

1484.2. "haste makes waste, sorry for the confusion" by KCBBQ::PRESTON (big enough never is) Fri Feb 14 1997 09:39 (23 lines)

Sorry for the confusion.

.0> The program works well on Alpha/OVMS v7.1, but consistently dies when 
.0> the thread is created and left idle, as well as during "normal" use.  

They presently have an app that works fine under OVMS v7.1; however,
when recompiled for OVMS v6.1 it dies.  (I believe the VAR was implying
it died both in debug mode ('created and left idle') as well as during
a normal non-debug mode run).

.0>                                                              the VAR 
.0> asks, "are there any patches available that encompasses the changes
.0> made to VMS 7.1 but are available on 6.1 (or 6.2)"?

They have a client who is presently running OVMS v6.1 and is reluctant
to move to v7.1 (although they would consider moving to v6.2).

We'll gather some more information & post as a reply.

Thanks for the quick response.

- Taylor

1484.3. "WAG: memory corruptor" by WTFN::SCALES (Despair is appropriate and inevitable.) Fri Feb 14 1997 10:10 (12 lines)

.2> They presently have an app that works fine under OVMS v7.1; however,
.2> when recompiled for OVMS v6.1 it dies.

Ah!  (That's an unusual scenario, but now things make sense... :-)

In the nearly total absence of evidence, I'd wager that there's a memory
corruptor somewhere, and, on V7.1 it's benign (or doesn't happen because of
timing) while on V6.1 (where the execution timing and data layout are
completely different from V7) it's crippling.


				Webb

1484.4. "additional info" by KCBBQ::PRESTON (big enough never is) Fri Feb 14 1997 11:13 (160 lines)

The .exe was compiled and linked on VMS 6.1 and the debugger info was 
from 6.1 (since that is where the program dies).

This exe runs just fine when moved to a 7.1 box.

It appears there is some run time issue that has been fixed in 7.1


Below is a run of the non-debug version on vms 6.1

- Taylor


%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=0118DE90,
PC=8042479C, PS=0000001B

  Improperly handled condition, image exit forced.
    Signal arguments:   Number = 00000005
                        Name   = 0000000C
                                 00000004
                                 0118DE90
                                 8042479C
                                 0000001B

    Register dump:
    R0  = 0000000000000003  R1  = 0000000000000003  R2  = 000000007FE25FD8
    R3  = 0000000001192038  R4  = 0000000001190BD0  R5  = 0000000001192018
    R6  = 0000000000000066  R7  = 000000000118E448  R8  = 0000000000000003
    R9  = 0000000000000000  R10 = 0000000001193458  R11 = 00000066332E3225
    R12 = 0000000000000000  R13 = FFFFFFFF894425A8  R14 = 0000000000000000
    R15 = FFFFFFFFFFFFFFFF  R16 = 0000000000000001  R17 = 0000000000000024
    R18 = 0000000000000003  R19 = 0000000000000002  R20 = 000000000118F7D1
    R21 = 0000000000000000  R22 = FFFFFFFFFFFFEC9B  R23 = 000000000000000A
    R24 = 0000000000010000  R25 = 000000000000020A  R26 = FFFFFFFF805F0194
    R27 = 000000007FB211D0  R28 = FFFFFFFF805F10C0  R29 = 0000000000000003
    SP  = 000000007F912000  PC  = FFFFFFFF8042479C  PS  = 000000000000001B
%CMA-F-EXCCOP, exception raised; VMS condition code follows
-SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=D2D5E09C,
PC=00F1F2CC, PS=0000001B


run of debug version on vms 6.1

run tg:das
%DEBUG-W-DWNOT1PROC, the 1 process debugger cannot be run in DECwindows mode

         OpenVMS Alpha AXP DEBUG Version V6.1-000

%DEBUG-I-INITIAL, language is C_PLUS_PLUS, module set to DAS
%DEBUG-I-NOTATMAIN, type GO to get to start of main program

DBG> go
break at routine DAS\main
 18638:         dbg.debugging=debugging;
DBG> go
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=7FF91FC0,
PC=0134BD74, PS=0000001B
%DEBUG-E-LASTCHANCE, stack exception handlers lost, re-initializing stack
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=011BFE90,
PC=8042479C, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=7FF91FC0,
PC=0134BD74, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=7FF91FC0,
PC=0134BD74, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=7FF91FC0,
PC=0134BD74, PS=0000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=7FF91FC0,
PC=0134BD74, PS=0000001B
%DEBUG-I-ERRINSDEC, error occurred while decoding instruction at current PC
%DEBUG-F-ACCVIO, access violation, reason mask=00, virtual address=7FF91FC0,
PC=0134BD74, PS=0000001B
-MAKE-F-NOMSG, Message number 80442754
break on unhandled exception at  in %TASK 2
error: %TASK 2 has overflowed its stack
  SP: 00000034  Stack top at: 011C2A00  Remaining bytes: -18622924


%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=7FF91FC0,
PC=0134BD74, PS=0000001B
%DEBUG-W-BADSTACK, stack corrupted - no further data available

DBG> show image
 image name                      set    base address    end address

 CDA$ACCESS                      no     01090000        011305FF
 CDA$ACCESSMSG                   no     01452000        014621FF
 CMA$OPEN_RTL                    no     00EA6000        00ED79FF
 CMA$RTL                         no     00ED8000        00F597FF
 CMA$TIS_SHR                     no     7FBB6000        7FBE7FFF
    CODE0                               8049A000        8049ABFF
    DATA1                               7FBB6000        7FBB69FF
    DATA2                               7FBC6000        7FBC61FF
    DATA3                               7FBE6000        7FBE61FF
*DAS                             yes    00010000        000501FF
 DBGTBKMSG                       no     01464000        01470BFF
 DCXSHR                          no     01718000        017483FF
 DEBUG                           no     012DE000        013EE9FF
 DEBUGSHR                        no     01494000        017163FF
 DEC$COBRTL                      no     0033E000        003EF3FF
 DECC$MSG                        no     013F0000        013F13FF
 DECC$SHR                        no     7FE16000        7FE77FFF
    CODE0                               80554000        8060D1FF
    DATA1                               7FE16000        7FE2F5FF
    DATA2                               7FE36000        7FE3C9FF
    DATA3                               7FE46000        7FE481FF
    DATA4                               7FE56000        7FE561FF
    DATA5                               7FE66000        7FE6A3FF
    DATA6                               7FE76000        7FE77BFF
 DECW$DWTLIBSHR                  no     00F5A000        0108E1FF
 DECW$DWTMSG                     no     01440000        014501FF
 DECW$DXMLIBSHR                  no     00D60000        00E639FF
 DECW$TERMINALMSG                no     0141E000        0143E1FF
 DECW$TERMINALSHR                no     00A94000        00B757FF
 DECW$TRANSPORTMSG               no     0140C000        0141C1FF
 DECW$TRANSPORT_COMMON           no     00A30000        00A92410
 DECW$TRANSPORT_TCPIP            no     0187A000        018BA3FF
 DECW$XEXTLIBSHR                 no     00B76000        00BD69FF
 DECW$XLIBMSG                    no     013FA000        0140A1FF
 DECW$XLIBSHR                    no     0092E000        00A2F5FF
 DECW$XMLIBSHR                   no     00BD8000        00D5FDFF
 DECW$XTSHR                      no     0089C000        0092D7FF
 DPML$SHR                        no     7FBF6000        7FD47FFF
    CODE0                               8049C000        805539FF
    DATA1                               7FBF6000        7FC239FF
    DATA2                               7FC26000        7FC3A3FF
    DATA3                               7FC46000        7FC463FF
    DATA4                               7FC56000        7FC805FF
    DATA5                               7FD46000        7FD46FFF
 DY026_306                       no     01F6A000        021B2DFF
 FDLSHR                          no     002DC000        0033C3FF
 LBRSHR                          no     00E64000        00EA43FF
 LIBOTS                          no     7FB76000        7FBA7FFF
    CODE0                               8048C000        80499BFF
    DATA1                               7FB76000        7FB785FF
    DATA2                               7FB86000        7FB87BFF
    DATA3                               7FBA6000        7FBA61FF
 LIBOTS2                         no     0029A000        002DA5FF
 LIBRTL                          no     7FB16000        7FB67FFF
    CODE0                               80400000        8048B3FF
    DATA1                               7FB16000        7FB25FFF
    DATA2                               7FB26000        7FB26FFF
    DATA3                               7FB36000        7FB3F5FF
    DATA4                               7FB46000        7FB461FF
    DATA5                               7FB56000        7FB56FFF
    DATA6                               7FB66000        7FB673FF
 PTD$SERVICES_SHR                no     00868000        0089A2D8
 SECURESHRP                      no     00246000        00298750
 SHRCWS                          no     00482000        008669FF
 SHRIMGMSG                       no     013F2000        013F89FF
 SHRPCRS                         no     018EE000        01E753FF
 SHRSHL                          no     00052000        00245BFF
 SMGSHR                          no     003F0000        004807FF
 SYS$SSISHR                      no     01176000        011A63FF
 USS                             no     01132000        01174340

 total images: 42                bytes allocated: 647152


1484.5. "Who made LIBRTL do that?" by WTFN::SCALES (Despair is appropriate and inevitable.) Fri Feb 14 1997 13:52 (46 lines)

.4> It appears there is some run time issue that has been fixed in 7.1

That is certainly one possible explanation, but I can't think of any recent
fixes which would account for the reported symptoms.  Besides, there are several
other possible explanations.


.4> %SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual 
.4> address=0118DE90, PC=8042479C, PS=0000001B

When run without the Debugger's influence, it would seem that something inside
LIBRTL tries to modify memory at an address in the SYS$SSISHR code, which
results in an access violation.

.4> %SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual 
.4> address=7FF91FC0, PC=0134BD74, PS=0000001B

When run under the Debugger, the Debugger itself incurs an access violation
trying to read from somewhere near the top of P1 space, presumably in the course
of trying to deal with the application's ACCVIO.  This time the application's
ACCVIO appears to occur at the same place in LIBRTL, but it is writing to a
slightly different address (which looks like it might be in part of the heap,
perhaps a thread stack guard page):

.4> %SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual 
.4> address=011BFE90, PC=8042479C, PS=0000001B


I don't know what is prompting LIBRTL to try to do this write.  It would be
interesting to know what routine 8042479C is in, if someone had the time to
chase that down, but it's probably something to do with raising a condition.
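
For anyone who wants to repeat the address matching above, here is a minimal sketch. The image ranges and the fault triples are copied from .4; the helper names (`locate`, `access_kind`) are made up for illustration, and reading bit 2 of the reason mask as "write" is an assumption that simply matches the interpretation given above (mask=04 modifies, mask=00 reads).

```python
# Image ranges copied from the SHOW IMAGE output in .4 (subset only).
IMAGES = [
    ("SYS$SSISHR",   0x01176000, 0x011A63FF),
    ("DEBUG",        0x012DE000, 0x013EE9FF),
    ("LIBRTL CODE0", 0x80400000, 0x8048B3FF),
]

def locate(addr):
    """Return (image name, offset from image base), or None if no range matches."""
    for name, base, end in IMAGES:
        if base <= addr <= end:
            return name, addr - base
    return None

def access_kind(reason_mask):
    # Assumption: bit 2 of the reason mask marks a write (modify) attempt.
    return "write" if reason_mask & 0x04 else "read"

# (reason mask, virtual address, PC) copied from the ACCVIO messages in .4.
FAULTS = [
    (0x04, 0x0118DE90, 0x8042479C),  # non-debug run
    (0x04, 0x011BFE90, 0x8042479C),  # debug run, application's fault
    (0x00, 0x7FF91FC0, 0x0134BD74),  # debug run, Debugger's own fault
]

for mask, va, pc in FAULTS:
    print(access_kind(mask), "of", format(va, "08X"), "->", locate(va),
          "| PC", format(pc, "08X"), "->", locate(pc))
```

A result of None only means the address falls outside the ranges listed in this subset. The offset reported for 8042479C (0002479C from the start of LIBRTL's CODE0 section) is the number someone would need in order to chase down the routine.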

In the absence of anything more incriminating, I still suggest that a memory
corruptor (e.g., an uninitialized automatic pointer variable or an instance of
writing outside array bounds) is causing this problem.  (Or, possibly, a call
to DECthreads with an invalid or non-existent object handle.)


.4> %DEBUG-W-DWNOT1PROC, the 1 process debugger cannot be run in DECwindows mode

BTW, is there some reason why the customer is running the 1-process debugger? 
They might have better (different) luck running the multiprocess debugger...



				Webb