
Conference clusta::acms

Title:ACMS comments and questions
Notice:This is not an official software support channel. Kits 5.*
Moderator:CLUSTA::HALLAN
Created:Mon Feb 17 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:4179
Total number of notes:15091

4160.0. "cp accvio - acms v4.0-2" by CSC32::J_HENSON (Don't get even, get ahead!) Thu May 01 1997 12:50

acms v4.0-2, openvms v6.1, vax, decforms v1.4-13, tdms v1.9a,  distributed

A customer, Mass Mutual, is reporting frequent CP crashes.  This is
occurring on systems configured as noted above which are serving
as front ends in a distributed environment.

This customer has several front end systems, and it's happening
about once a day on each fe node.  The basic footprint of the 
crash is that the cp process is dying with an accvio.  Analyzing
the process dump only reveals that the stack has become corrupted.
Also, the failing pc reported in the swlup log is 000008, which
seems to substantiate that there is a corrupted stack.

I am attaching an analysis of the cp process dump below, but will hold
off posting the swlup log.  In a 2 minute interval, 22k blocks of
swlup log file were generated, and I'd really rather not post it
here or try to figure out which parts are relevant and which are
not.  I also have the atr log for the same time period, but didn't
see anything significant.  I can provide both if you need them,
or get additional information.

Should I spr this, or should I recommend an upgrade for acms/decforms/openvms?

Thanks,

Jerry

====================================================================


BRK001$::SET DEF SYS$ERRORLOG
BRK001$::ANALYZE/PROCESS/FULL ACMSCP.DMP

 R0 = 00000001	R1 = 000054A4	R2 = 7FEC71E8	R3 = 03C00000
 R4 = 8000033C	R5 = 0000000C	R6 = 7FEC71EC	R7 = 7FF0DE48
 R8 = 00000000	R9 = 86A58680	R10 = 00000000	R11 = 0005002F
 SP = 7FEC6FB0	AP = 7FEC6FB0	FP  = 7FEC7194

 FREE_P0_VA  0082A800		FREE_P1_VA  7FEC4400
 Active ASTs  08		Enabled ASTs 07
 Current Privileges  00000000  90108085
 Event Flags  D8000000  E0000080
 Buffered I/O count/limit 197/200
 Direct I/O count/limit   200/200
 File count/limit         89/100
 Process count/limit      0/0
 Timer queue count/limit  20/20
 AST count/limit          1991/2000
 Enqueue count/limit      498/500
 Buffered I/O total 1479    	Direct I/O total 13

 Link Date  24-MAR-1995 13:59:59.42

 Kernel stack 00000008 pages at 7FFE6A00 moved to 0082A800
 Exec stack 00000011 pages at 7FFE7800 moved to 0082B800
 Vector page 00000001 page at 7FFEFE00 moved to 0082DA00
 PIO (RMS) area 00000007 pages at 7FFDFE00 moved to 0082DC00
 Image activator context 00000001 page at 7FFE2600 moved to 0082EA00
 User writeable context 0000000A pages at 7FFE0C00 moved to 0082EC00

Creating a subprocess
Condition signalled to take dump:
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=006D9BF8, PC=8000033C, PSL=03C00000
%DEBUG-I-CANTCREATEMAIN, could not create the debugger subprocess
-LIB-F-NOCLI, no CLI present to perform function
%DEBUG-I-SHRPRC, debugger will share user process
         OpenVMS VAX DEBUG Version V6.1-000

%DEBUG-I-NOLOCALS, image does not contain local symbols
%DEBUG-W-BADSTACK, stack corrupted - no further data available


  Improperly handled condition, bad stack or no handler specified.

	Signal arguments	      Stack contents

	Number = 00000005		 00000000
	Name   = 0000000C		 00000000
		 00000004		 00000000
		 006D9BF8		 00000000
		 8000033C		 00000000
		 03C00000		 00000000
					 00000000
					 00000000
					 00000000
					 00000000


	Register dump

	R0 = 00000001  R1 = 000054A4  R2 = 00000001  R3 = 0000001F
	R4 = 86A58680  R5 = 00000000  R6 = 00000000  R7 = 00005558
	R8 = 00000000  R9 = 86A58680  R10= 00000000  R11= 0005002F
	AP = 006D9C68  FP = 006D9C2C  SP = 7FEC7200  PC = 8000033C
	PSL= 03C00000


%DEBUG-I-EXITSTATUS, is '%SYSTEM-F-BADSTACK, bad stack encountered during exception dispatch'
DBG> SET RADIX HEX
DBG> SET IMAGE/ALL
DBG> SET MODULE/ALL
DBG> SHOW IMAGE *
 image name                      set    base address    end address

 CMA$TIS_SHR                     yes    00A39400        00A39BFF
 DBGSSISHR                       yes    00891E00        008953FF
 DBGTBKMSG                       yes    00898E00        008A53FF
*DEBUG                           yes    00830000        00891DFF
 DEBUGSHR                        yes    008D6400        009B67FF
 LBRSHR                          yes    00A06600        00A0EFFF
 LIBRTL                          yes    009D3A00        009F3BFF
 LIBRTL2                         yes    009FE200        00A065FF
 MTHRTL                          yes    00A0F000        00A393FF
 PLIRTL                          yes    009F3C00        009FE1FF
 SMGSHR                          yes    009B6800        009D39FF
 UISSHR                          yes    00A51C00        00A52DFF
 VAXCRTL                         yes    00A39C00        00A51BFF

 total images: 13                bytes allocated: 661320
DBG> SHOW MODULE *
module name                     symbols    size

total UNKNOWN modules: 0.               bytes allocated: 661360.
DBG> SHOW CALLS
%DEBUG-E-NOCALLS, no active call frames
DBG> SHO STACK
%DEBUG-E-NOCALLS, no active call frames
DBG> EXIT
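
As a reading aid for the ACCVIO signal in the dump above, here is a minimal Python sketch that decodes the reason mask and classifies the faulting address and PC by VAX address-space region. The bit meanings in REASON_BITS and the region boundaries reflect the VAX architecture as I understand it and should be checked against the VMS documentation; the names decode_accvio, vax_region and REASON_BITS are made up for this illustration and are not part of any VMS or ACMS tool.

```python
import re

def vax_region(addr: int) -> str:
    """Classify a 32-bit VAX virtual address by region (assumed boundaries)."""
    if addr < 0x40000000:
        return "P0 (program region)"
    if addr < 0x80000000:
        return "P1 (control region)"
    if addr < 0xC0000000:
        return "S0 (system region)"
    return "reserved"

# Assumed meaning of the ACCVIO reason mask bits on VAX.
REASON_BITS = {
    0x01: "length violation",
    0x02: "page table entry not accessible",
    0x04: "write/modify access attempted",
}

ACCVIO_RE = re.compile(
    r"reason mask=(?P<mask>[0-9A-F]+), virtual address=(?P<va>[0-9A-F]+), "
    r"PC=(?P<pc>[0-9A-F]+), PSL=(?P<psl>[0-9A-F]+)"
)

def decode_accvio(message: str) -> dict:
    """Pull the hex fields out of an ACCVIO message and interpret them."""
    m = ACCVIO_RE.search(message)
    if m is None:
        raise ValueError("not an ACCVIO message")
    mask = int(m.group("mask"), 16)
    va = int(m.group("va"), 16)
    pc = int(m.group("pc"), 16)
    return {
        # An empty list would mean none of the three bits was set.
        "reasons": [text for bit, text in REASON_BITS.items() if mask & bit],
        "va_region": vax_region(va),
        "pc_region": vax_region(pc),
    }

# The message exactly as it appears in the process dump analysis.
msg = ("%SYSTEM-F-ACCVIO, access violation, reason mask=04, "
       "virtual address=006D9BF8, PC=8000033C, PSL=03C00000")
info = decode_accvio(msg)
print(info)
```

Under those assumptions, the signal is a write attempt to a P0 address with the PC in system space, while the SP in the register dump (7FEC7200) lies in P1. That only describes the fault; it says nothing about what corrupted the stack.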



You wrote:
***
 - A recent history of the problem (some of which you have already
   provided).  You stated that you had experienced the problem before
   an upgrade.  Since the upgrade, how often has this occurred?  Is
   the frequency of the occurrence different since the upgrade?  If
   so, how?
***

4) I cannot really tell you whether the upgrade affected the frequency because
too much time has passed since we did the upgrade.  We get dumps almost daily
on at least one remote node. Monday we had four dumps on four different 
remote nodes.  

You wrote:
***
 - Anything else you think might be useful or important.
***

5) We use a combination of DECforms and TDMS.  I am wondering if there is some
type of quota we are exceeding.  I have included ACMSGEN parameters in 
ACMSGEN.LIS  and the SYSUAF for the ACMS accounts in the file BRK001_SYSUAF.LIS.
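
On the quota question, the count/limit lines from the process dump analysis earlier in this note can be tabulated with a short Python sketch like the one below. It only parses and prints the numbers. Whether the first number is the amount in use or the amount still available is not stated in the dump, so that interpretation is left open and should be confirmed against the ANALYZE/PROCESS_DUMP documentation. The function name parse_quotas is hypothetical.

```python
import re

# Matches lines such as " AST count/limit          1991/2000".
LINE_RE = re.compile(
    r"^\s*(?P<name>.+?) count/limit\s+(?P<count>\d+)/(?P<limit>\d+)\s*$"
)

def parse_quotas(text):
    """Return (name, count, limit) tuples for every count/limit line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((m.group("name"), int(m.group("count")), int(m.group("limit"))))
    return rows

# Lines copied from the dump analysis in this note.
dump = """\
 Buffered I/O count/limit 197/200
 Direct I/O count/limit   200/200
 File count/limit         89/100
 Process count/limit      0/0
 Timer queue count/limit  20/20
 AST count/limit          1991/2000
 Enqueue count/limit      498/500
"""

rows = parse_quotas(dump)
for name, count, limit in rows:
    # The difference is reported without deciding which side means "used".
    print(f"{name:<14} {count:>5} / {limit:<5} difference {limit - count}")
```

Comparing these figures with the SYSUAF quotas for the ACMS accounts would show which limits the CP process actually inherited.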

4160.1. by OHMARY::HALL (Bill Hall - ACMS Engineering - ZKO2-2) Thu May 01 1997 22:25 (16 lines)
    
    	Not much to go on.  Send the pointer to the SWL file, maybe
    	there's a PC or two that might be interesting.
    
    	If the customer presses the issue, then log an IPMT, but
    	don't expect a quick answer.
    
    	One thing they might try is to reduce the number of users
    	per CP.
    
    	Are any escape routines in use in DECforms?  Have any of them
    	changed recently?  If there are ERs, what language are they
    	written in?  What 'upgrade' did they perform?
    
    	Bill
    
4160.2. "here's the pointer" by CSC32::J_HENSON (Don't get even, get ahead!) Fri May 02 1997 10:20 (37 lines)
>>   <<< Note 4160.1 by OHMARY::HALL "Bill Hall - ACMS Engineering - ZKO2-2" >>>

    
>>    	Not much to go on.  Send the pointer to the SWL file, maybe
>>    	there's a PC or two that might be interesting.
  
I have placed a compressed saveset at csc32::sys$public:acms_mm.a.  This
contains the swlup and atr log, as well as the acmsgen and sysuaf
parameters.  The customer thought they might come in handy.

To decompress this saveset, use the fstv decompress command.
  
>>    	If the customer presses the issue, then log an IPMT, but
>>    	don't expect a quick answer.

According to the customer, this is a problem that they have been living
with for a long time.  They had been planning an upgrade, and were
holding off reporting this until after the upgrade in hopes that
the upgrade would resolve their problem.  It didn't, and I don't
know what the upgrade was.
    
>>    	One thing they might try is to reduce the number of users
>>    	per CP.

It's already set to 5, which seems pretty low.  However,
their perm_cps value is 0, and cp_slots is 20, so they're not
getting any load balancing across CPs.
    
>>    	Are any escape routines in use in DECforms?  Have any of them
>>    	changed recently?  If there are ERs, what language are they
>>    	written in?  What 'upgrade' did they perform?

I'll try to find out and let you know.
    
Thanks,

Jerry