Conference iosg::all-in-1_v30

Title:	OLD ALL-IN-1 (tm) Support Conference
Notice:	Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:	IOSG::PYE

Created:	Thu Jan 30 1992
Last Modified:	Tue Jan 23 1996
Last Successful Update:	Fri Jun 06 1997
Number of topics:	4343
Total number of notes:	18308

2301.0. "INVEXCEPTN, Exception while above ASTDEL node$srv73" by BUSHIE::SETHI (Man from Downunder) Mon Feb 22 1993 06:20

    Hi,
    
    A customer has two nodes in a cluster and has just enabled the cluster
    alias; the system then crashed.  I logged on, examined the crash dump
    and got the following information (<node name>$SRV73 was active):
    
    VAX/VMS System dump analyzer
    
    Dump taken on 22-FEB-1993 12:49:46.22
    INVEXCEPTN, Exception while above ASTDEL or on interrupt stack
    
    SDA> sh crash
    
    
    System crash information
    ------------------------
    Time of system crash: 22-FEB-1993 12:49:46.22
    
    Version of system: VAX/VMS VERSION V5.5-1
    
    System Version Major ID/Minor ID: 1/0
    
    
    VAXcluster node: PER1, a VAX 6000-610
    
    Crash CPU ID/Primary CPU ID:  01/01
    
    Bitmask of CPUs active/available:  00000002/00000002
    
    
    CPU bugcheck codes:
            CPU 01 -- INVEXCEPTN, Exception while above ASTDEL or on interrupt stack
    
    
    CPU 01 Processor crash information
    ----------------------------------
    
    
    CPU 01 reason for Bugcheck: INVEXCEPTN, Exception while above ASTDEL or on interrupt stack
    
    
    Process currently executing on this CPU: PER1$SRV73
    
    
    Current IPL: 8  (decimal)
         
    
    CPU database address:  87FA2000
    
    
    MPB address:   00000000
    
    General registers:
    
            R0  = 00000008   R1  = 04080000   R2  = 00000003   R3  = 20545349
            R4  = 00000001   R5  = 85F35580   R6  = 2053554C   R7  = 85F355AC
            R8  = 85F35570   R9  = 80004FB0   R10 = 00000005   R11 = 85F35580
            AP  = 7FFE96A8   FP  = 7FFE966C   SP  = 87FA3D88   PC  = 80DA8842
            PSL = 04080009
    
    Processor registers:
    
    
            P0BR   = 8C88FE00     SBR    = 0F8C2800     ASTLVL = 00000001
            P0LR   = 00008ABD     SLR    = 001CB280     SISR   = 00000104
            P1BR   = 8C297C00     PCBB   = 0BEE1620     ICCS   = 00000041
            P1LR   = 001FF78A     SCBB   = 0F8B4800     SID    = 13000202
    
            XDEV   = 00048087     XBE    = 00000040     XBEER  = 00000000
            XFADR  = 61880008     NCSR   = 00000800     TODR   = 2B0DEF74
    
            TBSTS  = 800001D0     PCSTS  = FFFFF800     BCETSTS= 00000140
            NESTS  = 00000000     CEFSTS = 00019200     BCEDSTS= 00000400
    
            ICR    = FFFFD906     ICCS   = 00000041
    
            IPORT  = 000000C1     OPORT0 = 0000000D     OPORT1 = 000000C0
    
            RXCS   = 00000040     TXCS = 00000080
    
    
            ISP    = 87FA3D88
            KSP    = 7FFE7800
            ESP    = 7FFE966C
            SSP    = 7FFED800
            USP    = 002FD954
    
                    No spinlocks currently owned by CPU 01
    SDA>
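    
    As a cross-check on the dump above, the PSL value can be decoded by
    hand.  The following is only a minimal sketch and is not SDA output.
    It assumes the standard VAX PSL field layout (IPL in bits 20:16,
    previous mode in bits 23:22, current mode in bits 25:24, interrupt
    stack flag in bit 26); the helper name decode_psl is hypothetical.

```python
# Minimal sketch: decode a VAX PSL value such as the one shown by SDA.
# Field positions assume the standard VAX architecture PSL layout.
# decode_psl is a hypothetical helper name, not part of SDA or VMS.

MODES = ("kernel", "executive", "supervisor", "user")


def decode_psl(psl: int) -> dict:
    """Split a 32-bit PSL value into the fields relevant to a bugcheck."""
    return {
        "ipl": (psl >> 16) & 0x1F,                    # bits 20:16
        "previous_mode": MODES[(psl >> 22) & 0x3],    # bits 23:22
        "current_mode": MODES[(psl >> 24) & 0x3],     # bits 25:24
        "interrupt_stack": bool((psl >> 26) & 0x1),   # bit 26
        # Condition codes live in the low four bits: N=3, Z=2, V=1, C=0.
        "condition_codes": "".join(
            name
            for bit, name in ((3, "N"), (2, "Z"), (1, "V"), (0, "C"))
            if (psl >> bit) & 0x1
        ),
    }


if __name__ == "__main__":
    # PSL value taken from the general registers display in the dump.
    for field, value in decode_psl(0x04080009).items():
        print(f"{field:16} = {value}")
```

    For PSL = 04080009 this gives IPL 8, kernel mode and the interrupt
    stack flag set.  That agrees with "Current IPL: 8" and with SP and
    ISP both being 87FA3D88, and it is consistent with the "on interrupt
    stack" wording of the bugcheck.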
    
    The customer had started up the server on the nodes after the cluster
    alias was enabled.  The crash occurred two hours after the alias was
    enabled.
    
    The question is: is there anything the customer may have missed that
    could have led to the crash?  I have seen topics 1228 and 2102 but they
    do not match the problem.  The customer has just bought the VAX and
    will be getting a 3rd one soon.  I would be grateful for any pointers.
    
    Thanks,
    
    Sunil

T.R	Title	User	Personal Name	Date	Lines
2301.1	No real guess	CHRLIE::HUSTON		Mon Feb 22 1993 13:30	16
	I can't think of anything specific. One thought though: have the customer reboot the cluster. There are some funny internal games DECnet plays with aliasing. The reason I suspect this is that the server itself never raises its priority level or specifically does much AST work. That is, however, done by DASL, which the server uses. DASL also talks to DECnet and does load balancing if cluster aliasing is being used. Maybe something is not quite correct because of this.
	An alternative that is somewhat less painful than a reboot is to stop and restart DECnet. Note that this will also stop the server, so it will have to be restarted by the ALL-IN-1 manager.
	--Bob
2301.2	Will reboot after cluster alias enabled	BUSHIE::SETHI	Man from Downunder	Tue Feb 23 1993 00:13	15
	Hi Bob,
	I should have followed my feelings on this; I had a funny feeling that they may not have rebooted. I say this because I read something somewhere about it, and I just didn't want to jump in at the deep end without some confirmation. Were these words of wisdom spoken in this conference?
	I have made the suggestion to the customer and the systems are due for a reboot tonight. I will post a reply here to confirm whether or not it has been a success.
	Thanks and regards,
	Sunil
2301.3	Reboot cleared the problem	GIDDAY::SETHI	Man from Downunder	Mon Mar 15 1993 23:24	1