Conference wonder::turbolaser

Title: TurboLaser Notesfile - AlphaServer 8200 and 8400 systems
Notice: Welcome to WONDER::TURBOLASER in its new home shortly
Moderator: LANDO::DROBNER
Created: Tue Dec 20 1994
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1218
Total number of notes: 4645

1163.0. "EEprom dumps" by VYGER::BRYCEG () Sun Apr 06 1997 06:23

    
    Does anyone have information on how to decode the dumps which are
    logged in the CPU's EEPROM?
    
    I assume there is a method in which the address at the far right of the
    dump matches up with specific registers.
    
    Any info appreciated.
    
    Gordon Bryce
T.R     Title                                      User           Personal Name  Date                   Lines
1163.1  "This should be what you are looking for"  CSC32::KIRK                   Sun Apr 06 1997 22:06  324
This Blitz shows how to decode the eeprom halt frame.



[TD 2262] AlphaServer 8200/8400, System Hang/Checklist - BLITZ

 
********************   CAUTION:  FOR INTERNAL USE ONLY   *********************
*                                                                            *
*      THIS INFORMATION IS FOR USE BY DIGITAL EQUIPMENT CORP. AND ITS        *
*      EMPLOYEES ONLY.  PLEASE USE EXTREME CARE IF YOU MUST DISCUSS ANY      *
*      PART OF THIS INFORMATION WITH ANYONE WHO IS NOT A DIGITAL EMPLOYEE.   *
*                                                                            *
******************************************************************************

Copyright (c) Digital Equipment Corporation 1997. All rights reserved.

+---------------------------+TM
|    |   |   |   |   |   |   |
|  d | i | g | i | t | a | l |           TIME   DEPENDENT   BLITZ
|    |   |   |   |   |   |   |
+---------------------------+


   BLITZ TITLE:  AlphaServer 8200/8400, System Hang/Checklist

   PRIORITY LEVEL: 1

   DATE: 12 Mar 97
   TD #: 2262

   AUTHOR: Wayne Sylvia
   DTN:  223-6325
   EMAIL: Proxy::Sylvia
   DEPARTMENT: Revenue Systems Engineering

   =======================================================================

   PRODUCT NAME(S):  AlphaServer 8200/8400

   PRODUCT FAMILY(IES):

   Storage         ___
   Systems/OS      _x_
   Networks        ___
   PC/Peripherals  ___ 
   Software Apps.  ___


   BLITZ TYPE:

   Maintenance Tip           _x_
   Service Action Requested  ___


   IF SERVICE ACTION IS REQUESTED:

   Labor Support Required     ___ 
   Material Support Required  ___ 


   Estimated time to complete activity (in hours):
   Will this require a change in the field's inventory:  Yes ___  No _x_
   Will an FCO be associated with this advisory?  Yes ___  No _x_

   ***********************************************************************

   PROBLEM:
     AlphaServer 8200/8400 console version V4.8-6 incorporates a fix 
     for a console multiprocessor synchronization bug encountered 
     following a Machine Check while in PAL Mode Halt; this bug 
     surfaces as a system hang.  (This fix has also been included in 
     the interim V4.3 console release.)

   RESOLUTION/WORKAROUND:
     It is strongly recommended that, wherever possible, the AlphaServer 
     8200/8400 console firmware be upgraded to V4.8-6, or greater.  
     Version V4.8-6 is contained on the Firmware Update CD V3.9, order 
     number AI-ROUFC-BE.

   ADDITIONAL INFORMATION:
     The following checklist is intended as an aid, in the event of an 
     AlphaServer 8200/8400 system hang, in the collection of hardware 
     state information and isolation of the failure to its root cause.

     1.  Check the status of all LEDs, to include the system control 
         panel, cabinet control logic module, power subsystem(s), TLSB 
         modules, and all I/O busses, adapters and devices.  (For 
         information relating to the locations and descriptions of most 
         LEDs, please refer to the AlphaServer 8200/8400 Service Manual, 
         EK-T8030-SV.)

     2.  Type "Control-P" to enter console I/O mode.

         Please note that where no entry into the system can be gained, 
         it will not be possible to acquire the minimum hardware state 
         information required to effectively isolate the failure to its 
         root cause.  In this situation, it will be necessary to:

         a.  Type "Control-T" (ref: step 3 below).

         b.  Initialize the system; check system self-test display.

         c.  Perform a console "SHOW EEPROM HALT" and "SHOW EEPROM 
             SYMPTOM" command on each CPU module (ref: step 4 below).

         d.  Test MS7CC* to verify the functionality of the memory.

             Wherever possible, the console firmware should be upgraded 
             to the latest revision.  Recent firmware revisions are 
             more robust in identifying and handling memory failures.

         e.  Boot the operating system.

     3.  Type "Control-T" to display the status of all running console 
         processes.

     4.  Perform a console "SHOW EEPROM HALT" and "SHOW EEPROM SYMPTOM" 
         command; these commands should be performed on each CPU module 
         (specify or set CPU as appropriate).  For dual-CPU modules, 
         please note that the show eeprom halt/symptom commands will 
         display all frames logged on that CPU module and will identify 
         the frames as logged by CPU 0 or 1.

         The TurboLaser console provides a non-volatile area in each 
         processor's EEPROM (flash) for the storage of halt and symptom 
         frames.  Briefly, a halt frame is built upon the occurrence of 
         a CPU double-error halt or machine check while in PAL mode halt 
         and consists essentially of the machine check logout frame and 
         the TLSB node registers (ref: EEPROM Halt Frame Description).  
         Similarly, the EEPROM symptom area stores OS error log 
         information.  The contents of each entry are based upon the 
         event type: 620 System Correctable Error, 630 CPU Correctable 
         Error, 660 System Machine Check, or 670 CPU Machine Check.

     5.  Using the Console "INFO" Command, extract, at a minimum, the 
         following information.  (The console INFO command will list the 
         available options.)

           1.  Bitmap
           5.  TLSB Registers
           7.  LOGOUT Area
          16.  PCIA Registers

     6.  Force a system crash using the console "CRASH" command.

     7.  Analyze the system crash dump file and console and system error 
         logs.


     EEPROM Halt Frame Description:

     The following layout is provided as an aid in decoding the EEPROM 
     Halt Frame (ref: EEPROM Halt Frame Example) and is applicable to 
     console versions prior to V4.8-6.  Console versions V4.8-6 and 
     later provide the respective register/longword descriptions.

     Offset               Longword/Quadword Description
     ------     ------------------------------------------------
     Machine Check Logout Frame:
        0       Frame Size
        4       R,S,D,C
        8       CPU Area Offset
        C       System Area Offset
       10       Machine Check Reason Mask
       14       Machine Check Frame Revision
       18       PAL shadow Register 0
       20       PAL shadow Register 1
       28       PAL shadow Register 2
       30       PAL shadow Register 3
       38       PAL shadow Register 4
       40       PAL shadow Register 5
       48       PAL shadow Register 6
       50       PAL shadow Register 7
       58       PAL Temp Register 0
       60       PAL Temp Register 1
       68       PAL Temp Register 2
       70       PAL Temp Register 3
       78       PAL Temp Register 4
       80       PAL Temp Register 5
       88       PAL Temp Register 6
       90       PAL Temp Register 7
       98       PAL Temp Register 8
       A0       PAL Temp Register 9
       A8       PAL Temp Register 10
       B0       PAL Temp Register 11
       B8       PAL Temp Register 12
       C0       PAL Temp Register 13
       C8       PAL Temp Register 14
       D0       PAL Temp Register 15
       D8       PAL Temp Register 16
       E0       PAL Temp Register 17
       E8       PAL Temp Register 18
       F0       PAL Temp Register 19
       F8       PAL Temp Register 20
      100       PAL Temp Register 21
      108       PAL Temp Register 22
      110       PAL Temp Register 23
      118       EXC_Addr
      120       EXC_Sum
      128       EXC_Mask
      130       PAL_Base
      138       ISR
      140       ICSR
      148       IC_PERR_Stat
      150       DC_PERR_Stat
      158       VA
      160       MM_Stat
      168       SC_Addr
      170       SC_Stat
      178       BC_Tag_Addr
      180       EI_Addr
      188       Fil_Syn
      190       EI_Stat
      198       LD_Lock
      1A0       rsvd | MISCR | rsvd | Whami
      1A4       reserved
      1A8       TLDEV
      1AC       TLBER
      1B0       TLCNR
      1B4       TLVID
      1B8       TLESR0
      1BC       TLESR1
      1C0       TLESR2
      1C4       TLESR3
      1C8       TLEPAERR
      1CC       TLMODCONFIG
      1D0       TLEPMERR
      1D4       TLEPDERR
      1D8       TLINTRMASK
      1DC       TLINTRSUM
      1E0       TLEP_VMG
      1E4       spare
      1E8       spare
      1EC       TL56WERR0 (KN7CE only)
      1F0       TL56WERR1 (KN7CE only)
      1F4       TL56WERR2 (KN7CE only)
      1F8       TL56WERR3 (KN7CE only)
      1FC       spare
     TLSB Node Registers:
      200       TLDEV, TLSB Node 0
      204       TLBER, TLSB Node 0
      208       TLDEV, TLSB Node 1
      20C       TLBER, TLSB Node 1
      210       TLDEV, TLSB Node 2
      214       TLBER, TLSB Node 2
      218       TLDEV, TLSB Node 3
      21C       TLBER, TLSB Node 3
      220       TLDEV, TLSB Node 4
      224       TLBER, TLSB Node 4
      228       ICCNSE  or BB+2040, TLSB Node 4
      22C       ICCWTR  or BB+2100, TLSB Node 4
      230       IDPNSE0 or BB+2A40, TLSB Node 4
      234       IDPNSE1 or BB+2140, TLSB Node 4
      238       IDPNSE2 or BB+2240, TLSB Node 4
      23C       IDPNSE3 or BB+2340, TLSB Node 4
      240       TLDEV, TLSB Node 5
      244       TLBER, TLSB Node 5
      248       ICCNSE  or BB+2040, TLSB Node 5
      24C       ICCWTR  or BB+2100, TLSB Node 5
      250       IDPNSE0 or BB+2A40, TLSB Node 5
      254       IDPNSE1 or BB+2140, TLSB Node 5
      258       IDPNSE2 or BB+2240, TLSB Node 5
      25C       IDPNSE3 or BB+2340, TLSB Node 5
      260       TLDEV, TLSB Node 6
      264       TLBER, TLSB Node 6
      268       ICCNSE  or BB+2040, TLSB Node 6
      26C       ICCWTR  or BB+2100, TLSB Node 6
      270       IDPNSE0 or BB+2A40, TLSB Node 6
      274       IDPNSE1 or BB+2140, TLSB Node 6
      278       IDPNSE2 or BB+2240, TLSB Node 6
      27C       IDPNSE3 or BB+2340, TLSB Node 6
      280       TLDEV, TLSB Node 7
      284       TLBER, TLSB Node 7
      288       ICCNSE  or BB+2040, TLSB Node 7
      28C       ICCWTR  or BB+2100, TLSB Node 7
      290       IDPNSE0 or BB+2A40, TLSB Node 7
      294       IDPNSE1 or BB+2140, TLSB Node 7
      298       IDPNSE2 or BB+2240, TLSB Node 7
      29C       IDPNSE3 or BB+2340, TLSB Node 7
      2A0       TLDEV, TLSB Node 8
      2A4       TLBER, TLSB Node 8
      2A8       ICCNSE, TLSB Node 8
      2AC       ICCWTR, TLSB Node 8
      2B0       IDPNSE0, TLSB Node 8
      2B4       IDPNSE1, TLSB Node 8
      2B8       IDPNSE2, TLSB Node 8
      2BC       IDPNSE3, TLSB Node 8
     Timestamp:
      2C0       WATCH$: DD | HH | MM | SS
      2C4       WATCH$: YY | MM
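
     Much of the layout above is regularly spaced, so an offset can be 
     turned into a description by arithmetic rather than by lookup.  The 
     following Python sketch is illustrative only and is not part of the 
     console or of any Digital tool.  The names describe, _quad and 
     NODE_REGS are hypothetical.  It covers only the PAL shadow/temp 
     registers and the TLSB node registers; the remaining offsets would 
     need an explicit table copied from the layout.  The low/high 
     longword labels assume the usual little-endian Alpha layout, and the 
     "or BB+xxxx" alternatives for nodes 4-7 are omitted for brevity.

```python
# Register order within the 0x20-byte block kept for each of TLSB nodes 4-8.
NODE_REGS = ("TLDEV", "TLBER", "ICCNSE", "ICCWTR",
             "IDPNSE0", "IDPNSE1", "IDPNSE2", "IDPNSE3")


def _quad(name, offset, base):
    """Describe one longword half of a quadword register array entry."""
    index, half = divmod(offset - base, 8)
    # Assumption: little-endian, so the low longword sits at the lower offset.
    part = "low" if half == 0 else "high"
    return f"{name} {index} ({part} longword)"


def describe(offset):
    """Return a description for a longword-aligned halt-frame offset."""
    if offset % 4:
        raise ValueError("offsets are longword (4-byte) aligned")
    # PAL shadow registers 0-7: quadwords at 0x18 through 0x50.
    if 0x18 <= offset < 0x58:
        return _quad("PAL shadow Register", offset, 0x18)
    # PAL temp registers 0-23: quadwords at 0x58 through 0x110.
    if 0x58 <= offset < 0x118:
        return _quad("PAL Temp Register", offset, 0x58)
    # TLSB nodes 0-3: one TLDEV/TLBER pair (8 bytes) per node from 0x200.
    if 0x200 <= offset < 0x220:
        node, idx = divmod(offset - 0x200, 8)
        return f"{NODE_REGS[idx // 4]}, TLSB Node {node}"
    # TLSB nodes 4-8: eight longwords (0x20 bytes) per node from 0x220.
    if 0x220 <= offset < 0x2C0:
        node, idx = divmod(offset - 0x220, 0x20)
        return f"{NODE_REGS[idx // 4]}, TLSB Node {node + 4}"
    return "not covered by this sketch"


print(describe(0x110))
print(describe(0x248))
```

     The two sample calls correspond to the table rows at offsets 110 
     and 248.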


  EEPROM Halt Frame Example (as displayed by console):

  CPU 0 Fatal Error Halt 1: PALcode Machine Check
  00000000 00000000 00000001 0000fffa 000001a0 00000118 00000000 00000200     0
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000    20
  00000000 00000001 00000000 00000000 00000000 00000000 00000000 00000000    40
  fffffc00 005d3eb0 00000000 00005200 fffffc00 00466c10 fffffccf 8113a200    60
  1f1e1615 14020100 fffffc00 00466660 fffffc00 b8f0fa40 fffffc00 b8f0fa40    80
  fffffc00 00466b80 fffffc00 004667e0 00000000 0001c515 fffffc00 00466980    a0
  00000098 06700001 00000002 040585d9 00000000 00000000 00000055 55400000    c0
  00000000 00af6000 fffffffe 8fbb7508 00000000 00000000 00000000 00000000    e0
  00000000 0001c515 00000000 e9d6fa38 fffffc00 005c3eb0 fffffc00 00466bb0   100
  00000000 00d00000 00000000 00018000 00000000 00000000 00000000 00000000   120
  fffffffe 007c8018 00000000 00000000 00000000 00002000 00000061 60020000   140
  ffffff80 e84d6fff 00000000 00000000 ffffff00 0001d24f 00000000 00014910   160
  ffffff00 e76fc5cf fffffff0 01ffffff 00000000 00009000 ffffff00 0011d69f   180
  00400c0c 00800303 00000010 00000200 00800000 73008014 00000000 00550000   1a0
  000000fe 000001ff 00000000 00000000 00e08a84 00600800 00409090 00406060   1c0
  00000000 00041313 00000498 0004f811 0003a201 00000000 00000000 04000852   1e0
  00000000 00000000 00800000 73008014 00800000 73008014 00800000 73008014   200
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   220
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   240
  00000000 00000000 00000000 00000000 00000000 00000000 00100000 00005000   260
  00000000 00000000 00000000 00000000 00000000 00000000 00800000 00005000   280
  00000007 00000006 00000006 00000006 00000008 80000000 00000000 00002000   2a0
                                                        00002c0b 19152316   2c0

                     *** DIGITAL INTERNAL USE ONLY ***
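
To match the example dump against the offset table, it helps to turn each displayed row into offset/longword pairs. The Python sketch below is a reader's aid, not a Digital utility; parse_halt_dump, quadword and SAMPLE are hypothetical names. It assumes that the number at the far right of a row is the offset of the rightmost longword and that offsets increase from right to left. The Blitz does not state this, but it is consistent with the example: read that way, offset 0 (Frame Size) holds 00000200, offset 8 (CPU Area Offset) holds 00000118, where EXC_Addr begins, and offset C (System Area Offset) holds 000001a0. Quadwords are assembled assuming little-endian order.

```python
# Three rows copied from the example dump above (first, 0x100, and last).
SAMPLE = """\
CPU 0 Fatal Error Halt 1: PALcode Machine Check
00000000 00000000 00000001 0000fffa 000001a0 00000118 00000000 00000200     0
00000000 0001c515 00000000 e9d6fa38 fffffc00 005c3eb0 fffffc00 00466bb0   100
                                                      00002c0b 19152316   2c0
"""


def parse_halt_dump(text):
    """Map byte offset -> 32-bit longword for a console halt-frame dump."""
    frame = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or stray token
        try:
            *words, base = [int(field, 16) for field in fields]
        except ValueError:
            continue  # non-hex text such as the "CPU 0 Fatal Error" header
        # Assumption: the rightmost longword is at the row's base offset,
        # and each longword to its left is 4 bytes higher.
        for i, word in enumerate(reversed(words)):
            frame[base + 4 * i] = word
    return frame


def quadword(frame, offset):
    """Combine two longwords into a 64-bit value (little-endian assumed)."""
    return (frame[offset + 4] << 32) | frame[offset]


frame = parse_halt_dump(SAMPLE)
print(f"Frame Size      = {frame[0x0]:08x}")
print(f"CPU Area Offset = {frame[0x8]:08x}")
print(f"EXC_Addr        = {quadword(frame, 0x118):016x}")
```

The timestamp longwords at 2C0 and 2C4 are parsed like any others; how the WATCH$ bytes are encoded is not described in the Blitz, so the sketch does not try to interpret them.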


1163.2  "any more info ?"                          VYGER::BRYCEG                 Mon Apr 07 1997 12:38  14
    
    Thanks for the info.
    
    Is there a similar table showing the offset addresses for console
    versions from before the fix was added to the latest consoles?
    
    The reason I ask is that we are currently working on customer-returned
    CPU modules, which mostly have old console code, and the information
    stored in the EEPROM is all we have to go on.
    
    Thanks,
    
    Gordon