
Conference vaxaxp::vmsnotes

Title:VAX and Alpha VMS
Notice:This is a new VMSnotes, please read note 2.1
Moderator:VAXAXP::BERNARDO
Created:Wed Jan 22 1997
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:703
Total number of notes:3722

497.0. "Analyze/error bugcheck, process pids & epids??" by KERNEL::PULLEY (Come! while living waters flow) Mon Apr 21 1997 06:51

    Hi Folks,
    
    I've a customer with a cluster using VMS v6.2.
    On his VAX, he occasionally gets non-fatal bugchecks.
    I suspect that the processes that fall over in this way are
    leaving DECnet objects around. (I'm not sure whether they get the
    chance to run the exit handler needed to clear them.)
    
    So at the moment, I've two DECnet objects with associated IDs in one
    format, and a couple of ANALYZE/ERROR/INCLUDE=BUGCHECK entries showing
    IDs in another format.
    How can I check whether these processes were really the same?
    
    Thanks,
    Steve.
    
497.1. "PID (EPID, IPID) Format" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Mon Apr 21 1997 12:30 (32 lines)

   We need to obtain information on these (probably executive-mode)
   bugchecks -- what is in the error log?

   There is an RMS bugfix around that might be related: VAXRMS02_062.

   As for the DECnet problem, I'd log a QAR or IPMT, as it appears that
   there may be a bug in how DECnet recovers from non-fatal bugchecks.
   We'll need some examples of what you are looking at -- from that,
   we may be able to confirm the match for you.

   To convert an IPID (internal PID) to an EPID (external PID), you
   will need to take a look at the Internals and Data Structures manual.
   (If the low bits of the IPID match those of the EPID, the processes
   are in the same process slot.)

        31 | 30 29 | 28 .. 21 | 20 .. 13 | 12 .. 0
        31 | 30 29 | 28 .. 21 | 20 ... 5 |  4 .. 0 
         0   node    node       process    process
             seq #   index      seq #      index #

    To see the width of the process index field -- a value
    usually between 5 and 13 -- look in sch$lg_pixwidth.  The
    upper line is the EPID layout for larger values of MAXPROCESSCNT,
    while the lower line is the layout for smaller ones.

    The IPID format is:

        31 | 30  .. 16 | 15 .. 0
        0    process     process
             seq #       index #
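
    The two layouts above can be sketched as a small decoder.  This is
    only an illustration of the bit fields as drawn here, not code from
    VMS: the function names and the pix_width parameter (standing in for
    the value read from sch$lg_pixwidth) are made up for the example.

```python
def decode_epid(epid, pix_width):
    """Split an EPID into its fields, following the layout drawn above.

    pix_width is an assumed stand-in for the value in sch$lg_pixwidth.
    """
    if not 1 <= pix_width <= 20:
        raise ValueError("pix_width must leave room for both fields in bits 20..0")
    seq_width = 21 - pix_width  # process seq # fills the rest of bits 20..0
    return {
        "node_seq": (epid >> 29) & 0x3,      # bits 30..29
        "node_index": (epid >> 21) & 0xFF,   # bits 28..21
        "proc_seq": (epid >> pix_width) & ((1 << seq_width) - 1),
        "proc_index": epid & ((1 << pix_width) - 1),
    }


def decode_ipid(ipid):
    """Split an IPID into process seq # (bits 30..16) and index (bits 15..0)."""
    return {
        "proc_seq": (ipid >> 16) & 0x7FFF,
        "proc_index": ipid & 0xFFFF,
    }


if __name__ == "__main__":
    # Example values only; the width of 13 is a guess, not a measured value.
    print(decode_epid(0x292006F0, 13))
    print(decode_ipid(0x00100087))
```

    The same EPID decodes differently depending on the width, so the
    value from sch$lg_pixwidth on the system in question is needed
    before the fields mean anything.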

497.2. by KERNEL::PULLEY (Come! while living waters flow) Mon Apr 21 1997 13:31 (108 lines)
    This is what's in the error log.
    The processes I suspected were 292006f0 and 29202adc.
    Comparing those with the process IDs below, I don't think a match
    looks possible.
    
    
 V A X / V M S        SYSTEM ERROR REPORT         COMPILED 21-APR-1997 15:49:26
                                                                      PAGE   1.

 ******************************* ENTRY      33. *******************************
 ERROR SEQUENCE 4757.                            LOGGED ON:        SID 17000201
 DATE/TIME 16-APR-1997 15:56:08.90                            SYS_TYPE 01410201
 SYSTEM UPTIME: 0 DAYS 13:59:22
 SCS NODE: STAR8                                               VAX/VMS V6.2

 NON-FATAL BUGCHECK KA7AA-AA  CPU FW REV# 1.  CONSOLE FW REV# 4.1

 SSRVEXCEPT, Unexpected system service exception

       PROCESS NAME    ID_OCONNOR_S

       PROCESS ID      00100087

       ERROR PC        AC4279F7
       ERROR PSL       01400000
                                       INTERRUPT PRIORITY LEVEL = 00.
                                       PREVIOUS MODE = EXECUTIVE
                                       CURRENT MODE = EXECUTIVE
                                       FIRST PART DONE CLEAR

 STACK POINTERS
    
 KSP 7FFE77B4  ESP 7FFE9794  SSP 7FFEBC2C  USP 7FDD2DC4  ISP AC3D7E00

 GENERAL REGISTERS
    
 R0  01400009  R1  7FFE97D0  R2  00000002  R3  7FF208FC  R4  B530F3C0
 R5  00000004  R6  B530F3D0  R7  00000003  R8  7FFED930  R9  7FFEBE6C
 R10 7FFED7D4  R11 7FFE2BDC  AP  7FFE97AC  FP  7FFE9794  SP  7FFE77F8

 V A X / V M S        SYSTEM ERROR REPORT         COMPILED 21-APR-1997 15:49:27
                                                                      PAGE   2.

 ******************************* ENTRY     217. *******************************
 ERROR SEQUENCE 5040.                            LOGGED ON:        SID 17000201
 DATE/TIME 17-APR-1997 15:18:14.37                            SYS_TYPE 01410201
 SYSTEM UPTIME: 1 DAYS 13:21:28
 SCS NODE: STAR8                                               VAX/VMS V6.2

 NON-FATAL BUGCHECK KA7AA-AA  CPU FW REV# 1.  CONSOLE FW REV# 4.1

 SSRVEXCEPT, Unexpected system service exception

       PROCESS NAME    ID_RONAN_G

       PROCESS ID      002400A9

       ERROR PC        AC4279F7
       ERROR PSL       01400000
                                       INTERRUPT PRIORITY LEVEL = 00.
                                       PREVIOUS MODE = EXECUTIVE
                                       CURRENT MODE = EXECUTIVE
                                       FIRST PART DONE CLEAR

 STACK POINTERS
    
 KSP 7FFE77B4  ESP 7FFE9794  SSP 7FFEBC2C  USP 7FDD2DC4  ISP AC3D5E00

 GENERAL REGISTERS
    
 R0  01400009  R1  7FFE97D0  R2  00000002  R3  7FF208FC  R4  B4E2B200
 R5  00000004  R6  B4E2B210  R7  00000003  R8  7FFED930  R9  7FFEBE6C
 R10 7FFED7D4  R11 7FFE2BDC  AP  7FFE97AC  FP  7FFE9794  SP  7FFE77F8

 V A X / V M S        SYSTEM ERROR REPORT         COMPILED 21-APR-1997 15:49:27
                                                                      PAGE   3.

 ******************************* ENTRY     403. *******************************
 ERROR SEQUENCE 5315.                            LOGGED ON:        SID 17000201
 DATE/TIME 18-APR-1997 12:07:18.23                            SYS_TYPE 01410201
 SYSTEM UPTIME: 2 DAYS 10:10:32
 SCS NODE: STAR8                                               VAX/VMS V6.2

 NON-FATAL BUGCHECK KA7AA-AA  CPU FW REV# 1.  CONSOLE FW REV# 4.1

 SSRVEXCEPT, Unexpected system service exception

       PROCESS NAME    ID_SPRATT_K

       PROCESS ID      002000F9

       ERROR PC        AC4279F7
       ERROR PSL       01400000
                                       INTERRUPT PRIORITY LEVEL = 00.
                                       PREVIOUS MODE = EXECUTIVE
                                       CURRENT MODE = EXECUTIVE
                                       FIRST PART DONE CLEAR

 STACK POINTERS
    
 KSP 7FFE77B4  ESP 7FFE9794  SSP 7FFEBC2C  USP 7FDD2DC4  ISP AC3D5E00

 GENERAL REGISTERS
    
 R0  01400009  R1  7FFE97D0  R2  00000002  R3  7FF208FC  R4  B5411280
 R5  00000004  R6  B5411290  R7  00000003  R8  7FFED930  R9  7FFEBE6C
 R10 7FFED7D4  R11 7FFE2BDC  AP  7FFE97AC  FP  7FFE9794  SP  7FFE77F8
ANAL/ERR/INCL=BUG/SIN=15-APR-1997 00:00:00.00/OUT=P.P
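
    The slot comparison can be tried mechanically.  The sketch below
    assumes the PROCESS ID values in the bugcheck entries are IPIDs and
    that 292006f0 and 29202adc are EPIDs; since the process index width
    on STAR8 is not given here, it simply tries every width from 5 to
    13.  The helper names are made up for the example.

```python
# PROCESS ID values from the three bugcheck entries (assumed to be IPIDs).
BUGCHECK_PIDS = [0x00100087, 0x002400A9, 0x002000F9]
# IDs associated with the leftover DECnet objects (assumed to be EPIDs).
SUSPECT_EPIDS = [0x292006F0, 0x29202ADC]


def same_slot(ipid, epid, pix_width):
    """True if the IPID's process index equals the EPID's low pix_width bits."""
    index = ipid & 0xFFFF
    mask = (1 << pix_width) - 1
    if index > mask:
        return False  # this index could not exist with so narrow a field
    return index == (epid & mask)


def find_matches(ipids, epids, widths=range(5, 14)):
    """List every (IPID, EPID, width) combination that shares a process slot."""
    return [
        (hex(i), hex(e), w)
        for w in widths
        for i in ipids
        for e in epids
        if same_slot(i, e, w)
    ]


matches = find_matches(BUGCHECK_PIDS, SUSPECT_EPIDS)
print(matches if matches else "no shared process slot for any width 5..13")
```

    Under those assumptions no combination shares a slot.  A shared slot
    alone would not prove the same process anyway, as the sequence
    numbers would also have to agree.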
    
497.3. "What Code Is At Failing PC" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Mon Apr 21 1997 14:26 (12 lines)
   Those bugcheck entries are nearly identical. 

   What code is located at that AC4279F7 PC?

   What is common between ID_RONAN_G and ID_OCONNOR_S?

   The customer can choose to set BUGCHECKFATAL non-zero, which will
   cause a non-fatal bugcheck to become a fatal bugcheck -- while
   this isn't directly desirable, the fatal bugcheck writes a dump,
   which can help track these problems.