
Conference wrksys::alphastation

Title:Alpha Workstation Conference
Notice:See note 1.* for conference notices
Moderator:WRKSYS::HOUSE
Created:Wed Sep 07 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1996
Total number of notes:9122

1921.0. "AS200 4/100 boot crash with V7.1" by CSC32::D_BROWN (Dave Brown CSC-VSG/INTDRV) Thu Apr 10 1997 20:01

    
    
    	I'm working with a customer who has an AlphaStation 200 4/100 which
    works fine with V6.2 and V7.0, but with V7.1 has a rather nasty problem
    during boot. In fact, the system NEVER successfully boots V7.1; it
    crashes every time in the final V7.1 upgrade phase, after STARTUP
    performs the SYSMAN> IO AUTOCONFIGURE. Here's what the customer said 
    he gets: 
    	
    .
    .
    .
    SYSTEM-I-BOOTUPGRADE, security server not started
    SYSTEM-I-MOUNTVER, $3$DKA300: (SUSAN) is offline.  Mount verification in
    progress
    
    %%%%%  OPCOM......%%%%%%
    Device $3$DKA300: (SUSAN) is offline.
    Mount verification in progress.
    
    SYSTEM-I-MOUNTVER, $3$DKA300:  (SUSAN) has completed mount
    verification.
    
    %%%%%  OPCOM...... %%%%%
    Mount Verification has completed for device $3$dka300: (SUSAN)
    
    %EWA0, twisted-Pair(10baseT) mode set by console
    
    **** OpenVMS (TM) Alpha Operating System V7.1    - BUGCHECK ****
    ** Code=000001CC:  INVEXCEPTN,Exception while above ASTDEL
    ** Crash CPU: 00     Primary CPU: 00   Active CPUS: 00000001
    ** Current Process = STARTUP
    ** Image name = $3$DKA300:[sys0.syscommon.][SYSEXE]SYSMAN.EXE;1
    **** Starting selective memory dump at (date time)...
    .
    .
    .
    .
        
    I did obtain a new SYS$PKEDRIVER from Steve Skonetski (reference note
    1776.*) which resolved the mount verification problems and I had hoped 
    that it would have corrected the crash problem too. It did not. Other 
    than SYSMAN being the current image in the dump, we have no more 
    information at this time. We're FTPing the dump over so we can check it 
    out.
    
    The SCSI devices are an RZ26, RZ29, two RZ58S, and an RRD45. The
    Ethernet interface is a "DECchip 21040" according to >>>show config.
    
    Any ideas?
    
    Thanks,
    
    Dave
T.R  Title  User  Personal Name  Date  Lines
1921.1  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 09:37  6 lines
    
    You have a crash dump, so analyze it.  How can we guess what the
    problem is?  Odds are since it was in SYSMAN that some driver barfed in
    its initialization.
    
    
1921.2  "Facts, not guesswork"  CSC32::D_BROWN  Dave Brown CSC-VSG/INTDRV  Fri Apr 11 1997 11:34  8 lines
    
    	The problem is that I don't have the crash dump. I was hoping that
    the foregoing might have looked familiar to someone. Having the crash
    dump would be an added benefit; we're working on it. I take it
    therefore that this is not a known problem.
    
    	I was not intending anyone  to "guess" the problem. I was hoping for 
    more than that.
1921.3  "Question of Scale..."  XDELTA::HOFFMAN  Steve, OpenVMS Engineering  Fri Apr 11 1997 11:47  6 lines
:I take it therefore that this is not a known problem.

   We have probably thirty-five thousand "known" crashdump footprints
   on VAX and Alpha, and there are certainly a large number more that
   haven't been uploaded...

1921.4  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 11:52  12 lines
    
    >    **** OpenVMS (TM) Alpha Operating System V7.1    - BUGCHECK ****
    >    ** Code=000001CC:  INVEXCEPTN,Exception while above ASTDEL
    >    ** Crash CPU: 00     Primary CPU: 00   Active CPUS: 00000001
    >    ** Current Process = STARTUP
    >    ** Image name = $3$DKA300:[sys0.syscommon.][SYSEXE]SYSMAN.EXE;1
    >    **** Starting selective memory dump at (date time)...
    >
    
    I guess this is what misled me ;-)
    
    
1921.5  "SYS$GQDDRIVER's doing it"  CSC32::D_BROWN  Dave Brown CSC-VSG/INTDRV  Fri Apr 11 1997 13:31  13 lines
    
    
    	Still don't have the dump but we've determined that it's
    SYS$GQDDRIVER which crashes the system. Finally got the system booted
    far enough to have the customer look at the dump. Renaming
    SYS$GQDDRIVER permits the system to boot and the installation to
    complete. 
    
    	He's got two GQA devices, an ATI Mach64-CX and a ZLXp-E (TGA). The
    crash occurs at SYS$GQDDRIVER_NPRO+06CA4 for which we have no listings.
    
    	Is there a way I can get V7.1 SYS$GQDDRIVER listings?
                                                    
1921.6  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 13:59  13 lines
    
    Could you include the sho crash (or clue crash) output?  It looks like
    a crash in a common routine that allocates pages for special structures
    - like the device information block.
    
    This is so unlikely that either I am not looking at the right code, or
    you have stale images on the disk.  Please check your
    SYS$LOADABLE_IMAGES area to make sure there are no drivers that someone
    copied by hand into a sys$specific area and then did an upgrade to
    V7.1.  Also check SYS$LIBRARY while you are at it.
    
    _Fred
    
1921.7  "Right-shift problem?"  CSC32::D_BROWN  Dave Brown CSC-VSG/INTDRV  Fri Apr 11 1997 14:41  115 lines
	We have no spurious executables in either SYS$LOADABLE_IMAGES or 
SYS$LIBRARY. As the following information shows, we have some kind of a 4-bit
right shift problem afoot:


Crash Time:        11-APR-1997 10:10:47.25
Bugcheck Type:     INVEXCEPTN, Exception while above ASTDEL
Node:              SUSAN   (Standalone)
CPU Type:          AlphaStation 200 4/100
VMS Version:       V7.1    
Current Process:   SYSTEM_3
Current Image:     $3$DKA300:[SYS0.SYSCOMMON.][SYSEXE]SYSMAN.EXE;1
Failing PC:        FFFFFFFF.802E6CA4    SYS$GQDDRIVER_NPRO+06CA4
Failing PS:        10000000.00000604
Module:            SYS$GQDDRIVER                          
Offset:            0000ACA4

Boot Time:         11-APR-1997 10:08:09.00
System Uptime:               0 00:02:38.25
Crash/Primary CPU: 00/00
System/CPU Type:   0D02
Saved Processes:   7
Pagesize:          8 KByte (8192 bytes)
Physical Memory:   32 MByte (4096 PFNs, contiguous memory)
Dumpfile Pagelets: 57129 blocks
Dump Flags:        olddump,writecomp,errlogcomp,dump_style
Dump Type:         raw,selective
EXE$GL_FLAGS:      poolpging,init,bugdump
Paging Files:      1 Pagefile and 1 Swapfile installed

Stack Pointers:
KSP = 00000000.7FFA1B28   ESP = 00000000.7FFA5AC0   SSP = 00000000.7FFAC100
USP = 00000000.7AFF5390

General Registers:
R0  = 00000000.00000000   R1  = 00000000.0000000C   R2  = 00000000.7FFA1D10
R3  = FFFFFFFF.8103E5A0   R4  = 00000000.7FFA1B80   R5  = 00000000.7FFA1CF8
R6  = 00000000.7FFA1D40   R7  = 10000000.00000604   R8  = 00000000.00002011
R9  = FFFFFFFF.82000000   R10 = 00000000.00002001   R11 = 00000000.006A4610
R12 = 00000000.00000001   R13 = FFFFFFFF.810928C0   R14 = 00000000.00000000
R15 = 00000000.00000000   R16 = 00000000.000001CC   R17 = 00000000.7FFA1B80
R18 = 00000000.7FFA1D40   R19 = 00000000.00000003   R20 = FFFFFFFD.FFE0AAF8
R21 = 00000000.001C11FF   R22 = FFFFFFFF.826C0000   R23 = 00000000.0000000D
R24 = FFFFFFFF.81006C18   AI  = 00000000.00000003   RA  = FFFFFFFF.802E5110
PV  = FFFFFFFF.810944C8   R28 = FFFFFFFF.802E50E4   FP  = 00000000.7FFA1D90
PC  = FFFFFFFF.80069158   PS  = 28000000.00000604

Exception Frame:
R2  = FFFFFFFF.81094010   R3  = FFFFFFFF.8118BB80   R4  = FFFFFFFF.81147200
R5  = 00000000.00002031   R6  = FFFFFFFF.82000000   R7  = 00000000.003FFC00
PC  = FFFFFFFF.802E6CA4   PS  = 10000000.00000604

Signal Array:                            64-bit Signal Array:
Arg Count    = 00000005                  Arg Count      =          00000005
Condition    = 0000000C                  Condition      = 00000000.0000000C
Argument #2  = 00010000                  Argument #2    = 00000000.00010000
Argument #3  = 07FF9418                  Argument #3    = 00000000.07FF9418
Argument #4  = 802E6CA4                  Argument #4    = FFFFFFFF.802E6CA4
Argument #5  = 00000604                  Argument #5    = 10000000.00000604

Mechanism Array:
Arguments    = 0000002C                  Establisher FP = 00000000.7FFA1D90
Flags        = 00000000                  Exception FP   = 00000000.7FFA1D40
Depth        = FFFFFFFD                  Signal Array   = 00000000.7FFA1CF8
Handler Data = FFFFFFFF.82000000         Signal64 Array = 00000000.7FFA1D10
R0  = 00000000.00000001   R1  = 00000000.07FF9418   R16 = 00000000.00000000
R17 = 00000000.07FF9400   R18 = 00000000.003FFCA0   R19 = 00000000.00000003
R20 = FFFFFFFD.FFE0AAF8   R21 = 00000000.001C11FF   R22 = FFFFFFFF.826C0000
R23 = 00000000.0000000D   R24 = FFFFFFFF.81006C18   R25 = 00000000.00000003
R26 = FFFFFFFF.802E5110   R27 = FFFFFFFF.810944C8   R28 = FFFFFFFF.802E50E4

System Registers:
Page Table Base Register (PTBR)                           00000000.00000DB6
Processor Base Register (PRBR)                            FFFFFFFF.8110E000
Privileged Context Block Base (PCBB)                      00000000.01B6A080
System Control Block Base (SCBB)                          00000000.000001B8
Software Interrupt Summary Register (SISR)                00000000.00000000
Address Space Number (ASN)                                00000000.0000003A
AST Summary / AST Enable (ASTSR_ASTEN)                    00000000.0000000F
Floating-Point Enable (FEN)                               00000000.00000000
Interrupt Priority Level (IPL)                            00000000.00000006
Machine Check Error Summary (MCES)                        00000000.00000008
Virtual Page Table Base Register (VPTB)                   FFFFFFFC.00000000

Failing Instruction:
SYS$GQDDRIVER_NPRO+06CA4:  	LDL		R0,(R1)

Instruction Stream (last 20 instructions):
SYS$GQDDRIVER_NPRO+06C54:  	MB	
SYS$GQDDRIVER_NPRO+06C58:  	BIS		R31,#X01,R19
SYS$GQDDRIVER_NPRO+06C5C:  	SLL		R19,R1,R1
SYS$GQDDRIVER_NPRO+06C60:  	SLL		R18,R17,R17
SYS$GQDDRIVER_NPRO+06C64:  	AND		R18,#X03,R18
SYS$GQDDRIVER_NPRO+06C68:  	BIS		R17,R1,R1
SYS$GQDDRIVER_NPRO+06C6C:  	ADDL		R16,R1,R1
SYS$GQDDRIVER_NPRO+06C70:  	S8ADDL		R18,R31,R18
SYS$GQDDRIVER_NPRO+06C74:  	LDL		R1,(R1)
SYS$GQDDRIVER_NPRO+06C78:  	ZAPNOT		R1,#X0F,R28
SYS$GQDDRIVER_NPRO+06C7C:  	SRL		R28,R18,R1
SYS$GQDDRIVER_NPRO+06C80:  	ZAPNOT		R1,#X03,R0
SYS$GQDDRIVER_NPRO+06C84:  	RET		R31,(R26)
SYS$GQDDRIVER_NPRO+06C88:  	SUBL		R17,#X02,R1
SYS$GQDDRIVER_NPRO+06C8C:  	MB	
SYS$GQDDRIVER_NPRO+06C90:  	BIS		R31,#X03,R19
SYS$GQDDRIVER_NPRO+06C94:  	SLL		R19,R1,R1
SYS$GQDDRIVER_NPRO+06C98:  	SLL		R18,R17,R17
SYS$GQDDRIVER_NPRO+06C9C:  	BIS		R17,R1,R1
SYS$GQDDRIVER_NPRO+06CA0:  	ADDL		R16,R1,R1
SYS$GQDDRIVER_NPRO+06CA4:  	LDL		R0,(R1)
SYS$GQDDRIVER_NPRO+06CA8:  	RET		R31,(R26)
SYS$GQDDRIVER_NPRO+06CAC:  	BIS		R31,R31,R31
SYS$GQDDRIVER_NPRO+06CB0:  	LDQ		R1,#X0010(R27)
SYS$GQDDRIVER_NPRO+06CB4:  	LDL		R1,(R1)
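
As a cross-check on the instruction stream above, the six instructions
leading up to the failing LDL (+06C88 through +06CA0) can be replayed by
hand. The Python sketch below does that. R16 and R18 come straight from the
Mechanism Array; the starting value of R17 (5) is an assumption, inferred
because it is the only shift count consistent with the saved R17 and R1, and
it does not appear in the dump itself. The helper names sll, sext32 and
replay are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def sll(value, count):
    """Alpha SLL: shift left by the low 6 bits of count, truncated to 64 bits."""
    return (value << (count & 0x3F)) & MASK64


def sext32(value):
    """Sign-extend the low 32 bits to 64 bits, as ADDL/SUBL results are."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def replay(r16, r17, r18):
    """Replay SYS$GQDDRIVER_NPRO+06C88 through +06CA0; return (R1, R17)."""
    r1 = sext32(r17 - 2)        # SUBL   R17,#X02,R1
    r19 = 0x03                  # BIS    R31,#X03,R19
    r1 = sll(r19, r1)           # SLL    R19,R1,R1
    r17 = sll(r18, r17)         # SLL    R18,R17,R17
    r1 = r17 | r1               # BIS    R17,R1,R1
    r1 = sext32(r16 + r1)       # ADDL   R16,R1,R1
    return r1, r17


# R16 (base) and R18 (port) are the Mechanism Array values.
# R17 = 5 is ASSUMED (inferred, not shown in the dump).
r1, r17 = replay(r16=0x0, r17=5, r18=0x003FFCA0)
print(f"R1  = {r1:016X}")
print(f"R17 = {r17:016X}")
```

With those inputs the replay gives R1 = 07FF9418 and R17 = 07FF9400, which
are the values saved in the Mechanism Array, and 07FF9418 is also Argument #3
of the Signal Array. Read this way, the port value 003FFCA0 is shifted left
by 5 bits and OR'ed with 18 (hex) before a zero base in R16 is added.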

1921.8  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 14:56  8 lines
    
    Found it.  It is trying to do a 32-bit read from ISA space... and
    something is very broken.  What rev of the firmware are you running?
    
    Can you give me the call frames so I can see what it is attempting to
    read?
    
    
1921.9  "On its way"  CSC32::D_BROWN  Dave Brown CSC-VSG/INTDRV  Fri Apr 11 1997 15:21  15 lines
    
    Fred,
    
    The firmware is as follows:
    
    	SRM 6.3-4
    	PAL 5.56-2
    
    	I don't have the call frames but I have mailed to you everything I
    have received from the customer so far including the stack. I can have
    him E-mail me some more if needed. He's sending in the dump on a tape.
    
    Thanks,
    
    Dave
1921.10  "Try a released version V7.1!"  WRKSYS::BROOKS  Fri Apr 11 1997 15:33  6 lines
    Is the V7.1 you used an officially released version?  There was an
    internal version of V7.1 that would hang.  It is a problem we have seen
    from several customers.  Some prereleased versions were sent to some
    customers.  Please try an official version.  It should have a date of
    March or April; before this there were problems.  Good luck!
    							Art Brooks
1921.11  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 15:36  19 lines
    
    Just to save me time, do you know which firmware rev (i.e. 3.7, 3.8,
    3.9)...
    
    In any case.  It looks like an attempt to read a 32-bit register in the
    memory mapped space at the end of the frame buffer.  That is, the port
    address is 3FFCA0, and the ATI has its registers memory mapped at the
    4mb address - 1024... or offset A0 into it... which is register 28,
    which is BUS_CNTL, which is the first address read during init, after
    the frame buffer is mapped... BUT the base address being passed in is
    zero, which indicates that the driver *failed* to map the PCI address
    space in sparse mode.  Which typically means that the init of the
    card is garbage.
    
    Of course, we don't support the CX card *and* the TGA or TGA2 in the
    same box at the same time... so maybe that is the problem?
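
The address arithmetic in this reply ("the 4mb address - 1024... or offset
A0 into it... which is register 28") can be worked through in a few lines of
Python. This is only a sketch of that arithmetic: it assumes the registers
are 4 bytes wide and that "register 28" is a hexadecimal index, neither of
which is stated outright in the thread. The name locate_register is
hypothetical.

```python
FOUR_MB = 4 * 1024 * 1024
REGISTER_BLOCK_BYTES = 1024  # "the 4mb address - 1024"


def locate_register(port):
    """Return (block start, byte offset, longword index) for a port address."""
    block_start = FOUR_MB - REGISTER_BLOCK_BYTES
    offset = port - block_start
    # ASSUMPTION: registers are 4 bytes wide, so index = offset / 4.
    return block_start, offset, offset // 4


block_start, offset, index = locate_register(0x3FFCA0)
print(f"block start = {block_start:06X}")
print(f"offset      = {offset:02X}")
print(f"index       = {index:02X}")
```

Under those assumptions the block starts at 3FFC00, the port 3FFCA0 sits A0
bytes into it, and A0 / 4 = 28 (hex). The block start 3FFC00 is also the
value of R7 in the Exception Frame in .7, which may be the same quantity.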
    
    
    
1921.12  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 15:45  4 lines
    
    We'll take this offline and work the problem.  Thanks.
    
    
1921.13  "3.8 Firmware CD"  CSC32::D_BROWN  Dave Brown CSC-VSG/INTDRV  Fri Apr 11 1997 15:49  18 lines
    
    
    	Re: .10 - V7.1 SSB right off CD
    
    	Re: .11 - 3.8 CD
    
    >>Of course, we don't support the CX card *and* the TGA or TGA2 in the
    >>same box at the same time... so maybe that is the problem?
    
    	Customer's reaction to this was "*&*^#%, that's the way you guys
    shipped me the system!!"
    
    	Is indeed the combination of an ATI Mach64-CX and a ZLXp-E (TGA)
    on the same PCI (or in the same box) illegal under V7.1?
    
    	Thanks,
    
    	Dave
1921.14  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 15:55  14 lines
    It's an illegal configuration as far as I know... I know I don't
    remember it ever being qual'ed or tested.
    
    I will certainly take a look at it, and see if I can figure out what's
    going wrong (I'll try plugging in a TGA and a CX card to my mustang).
    If you get the dump up someplace I can set host to, I'll give you more
    detail on what is out of whack.
    
    VMS has always said that the commodity graphics cards would only work
    multi-head in specific configurations.  Usually the TGA combination is
    benign because it functions purely in PCI space and doesn't muck with
    VGA space.  But it's certainly possible that something is getting
    stepped on by one of the card inits.
    
1921.15  STAR::KLEINSORGE  Fred Kleinsorge, OpenVMS Engineering  Fri Apr 11 1997 16:46  6 lines
    
    Here's a question for the AlphaStation folks... what would a customer
    have to do to actually have such a system manufactured?  Was the
    ZLXp-E1 substituted for a Mach64-CX when the CX went to EOL?
    
    
1921.16  "It shipped that way"  CSC32::D_BROWN  Dave Brown CSC-VSG/INTDRV  Fri Apr 11 1997 17:47  8 lines
    
    
    	The system shipped from Digital as DEC#96018492S; shipping papers 
    dated 3-October-1995. The customer said he got it with some type of
    promotional deal. The system option was an 82-PB411-EB (containing I
    believe a Mach64-CX) with an optional ZLXp-E1; factory configured.
    
    	I've instructed him to remove the Mach64-CX.
1921.17  "product management has been notified"  WRKSYS::HOUSE  Kenny House, Workstations Engineering  Sun Apr 13 1997 09:06  19 lines
    re .15 - "what would a customer have to do to actually have such a
              system manufactured?" ...
    
    I've extracted this note string (through reply .16) and sent it along
    to Product Management with the following preface.
    
    
        A customer got an AlphaStation 200 4/100 with both an ATI
        Mach64-CX and a ZLXp-E (TGA).  This combination was apparently
        never tested, and indeed seems to have caused his system to
        crash.
    
        The question has been raised (by the customer and by the folks
        who had to deal with the problem):  How did this system get out
        the door?  Shouldn't something in the order process or in
        manufacturing have caught it?
    
    
    -- Kenny House