Title: | DEChub/HUBwatch/PROBEwatch CONFERENCE |
Notice: | Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7 |
Moderator: | NETCAD::COLELLA DT |
Created: | Wed Nov 13 1991 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 4455 |
Total number of notes: | 16761 |
2847.0. "Help with Errorlog from DEChub 900MS crash" by CSC32::R_BUCK (Have been assimilated) Mon Oct 09 1995 14:39
The following (edited) error log information was faxed to me by a
field service engineer. The DEChub 900MS lost all of its configuration
information (backplane connections, etc.), but it did retain its IP
configuration.
Engineer believes that since re-seating the DECswitch 900EF module the
customer has not had any other unexpected hub crashes. Does not know
of any event that could be related to the original crash. Engineer
states that all modules, and the DEChub 900MS itself, are running with
recent firmware, i.e., the 4.0 kit.
The basic questions are: What does this error log information point to?
How are the entries interpreted? Is there any reference and/or formal
documentation concerning the (infamous) crash codes on the DEChub? Does
this match any known problems?
I believe the bottom line for the Field Engineer is that he just needs
to know more about the crash so that he can give the customer a
reasonable explanation. Also he wants to be sure that he does not
start swapping hardware if the problem is specific to firmware.
Thanks
Randall Buck
--------------------------------------------------------------------
DEChub 900 MultiSwitch
DUMP ERROR LOG
Current Reset Count: 17
Entry = 14
TimeStamp =00
Reset Count =15
Stop Thrash; Cleared Nonvolatile Data.
Dump another entry [Y]/N? y
Entry =13
Time Stamp = 0 3500
Reset Count =14
Catch VO=07C SR=2004 PC=41B608
Dump another entry [Y]/N? y
Entry =12
Time Stamp =0 17700
Reset Count =13
Catch VO=07C SR=2004 PC=41B60E
Dump another entry [Y]/N? y
Entry =11
TimeStamp =0 7100
Reset Count =12
Catch VO=07C SR=2004 PC=41B2F0
Dump another entry [Y]/N? y
Entry =10
Time Stamp =0 136300
Reset Count =11
Catch VO=07C SR=2009 PC=41B2BC
Dump another entry [Y]/N? y
Entry =9
Time Stamp =0 10600
ResetCount =10
Catch VO=07C SR=2004 PC=41B2B8
Dump another entry [Y]/N? y
No more Error Log entries.
Press Return for Main Menu...
DECswitch 900EF - Slot 8
=================================================================
DUMP ERROR LOG
Current Reset Count: 9
=================================================================
Entry# =2
EntryStatus =0 [0=valid,1=write_error,2=Invalid,3=empty,4=crc_error]
Entry Id =10
Firmware Rev =1.5
Reset Count =6
Timestamp = 0 2B D131
Write Count =5
FRU Mask =0
Test ID = DEAD
Error Data = SR=2000 PC=0303516A Error Code=00002008 ProcCsr=556D
Registers = D0=05000220 D1=00000002 D2=00000002 D3=00000002
D4=00000000 D5=00000000 D6=00000000 D7=0000FFFF
A0=00002094 A1=030555DC A2=00000003 A3=05000012
A4=00058D5A A5=05000016 A6=0004B650 A7=0004B5CC
Dump another entry [Y]/N? Y
Entry# =1
EntryStatus =0 [0=valid,1=write_error,2=Invalid,3=empty,4=crc_error]
Entry Id =11
Firmware Rev =1.5
Reset Count =6
Timestamp = 0 1F 68BD
Write Count =5
FRU Mask =0
Test ID =0
Error Data =SR=00000000 PC=00000000 ErrorCode=00000000
Registers =Phy1Csr = 00000000 ElmBase =00000000 MacBase =00000000
CamCsr =00000000 CamData15_00=00000000 PmCsr =00000000
CamData31_16=00000000 CamData47_32=00000000 PortDataA =00000000
RtosTimer=00000000 RtosTimerVal=00000000 PortDataB =00000000
i68k68kInt =00000000 i68k68kMask =00000000 Dmaint =00000000
i68kForceInt=00000000 DmaMask =00000000 HostData =00000000
HostInt0Mask =00000000 HostInit0 =00000000 PortStatus=00000000
PortCtrlMask=00000000 HostDmaMask=00000000 PortCtrlInt=00000000
FmcControl=00000000 FmcStatus=00000000 FmcInt=00000000
Dump another entry [Y]/N? y
Entry# =0
EntryStatus =0 [0=valid,1=write_error,2=Invalid,3=empty,4=crc_error]
Entry Id =11
Firmware Rev =1.5
Reset Count =6
Timestamp = 0 14 9B41
Write Count =5
FRU Mask =0
Test ID =0
Error Data =SR=00000000 PC=00000000 ErrorCode=00000000
Registers =Phy1Csr = 00000000 ElmBase =00000000 MacBase =00000000
CamCsr =00000000 CamData15_00=00000000 PmCsr =00000000
CamData31_16=00000000 CamData47_32=00000000 PortDataA =00000000
RtosTimer=00000000 RtosTimerVal=00000000 PortDataB =00000000
i68k68kInt =00000000 i68k68kMask =00000000 Dmaint =00000000
i68kForceInt=00000000 DmaMask =00000000 HostData =00000000
HostInt0Mask =00000000 HostInit0 =00000000 PortStatus=00000000
PortCtrlMask=00000000 HostDmaMask=00000000 PortCtrlInt=00000000
FmcControl=00000000 FmcStatus=00000000 FmcInt=00000000
Dump another entry [Y]/N? y
No more Error Log entries
2847.1. "" by NETCAD::DOODY (Michael Doody) Mon Oct 09 1995 15:41
It looks to me like the hub's backplane configuration got corrupted
somehow, badly enough to cause the hub to crash several times until
it erased its configuration to stop itself from thrashing.
It does not look like a hardware problem; it's probably a firmware problem
related to the hub's backplane management. Many bugs related to backplane
configuration were fixed in the latest MAM V4.1. I suggest you have
them upgrade.
DUMP ERROR LOG
Current Reset Count: 17
This entry means the hub has crashed too many times in a row and it
has erased its configuration. This includes the IP address, etc. So
you were misinformed on this point (maybe someone set the IP address
again afterwards):
Entry = 14
TimeStamp =00
Reset Count =15
Stop Thrash; Cleared Nonvolatile Data.
Dump another entry [Y]/N? y
This entry means the hub crashed at 35 seconds uptime, in the routine
at PC=41B608. Which routine this is depends on the _exact_ MAM firmware
version. But the address is definitely backplane management related:
Entry =13
Time Stamp = 0 3500
Reset Count =14
Catch VO=07C SR=2004 PC=41B608
Dump another entry [Y]/N? y
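The reading above (a "Time Stamp" of "0 3500" meaning 35 seconds of uptime) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the 1/100-second tick is inferred from the 35-second reading rather than from any firmware documentation, and the meaning of the first timestamp field is not stated in this note, so the sketch declines to convert when that field is non-zero.

```python
import re

# Tolerates the spacing variants seen in the dump,
# e.g. "Time Stamp = 0 3500" and "TimeStamp =0 7100".
_TS = re.compile(r"Time\s*Stamp\s*=\s*(\d+)\s+(\d+)")
_CATCH = re.compile(r"Catch\s+VO=(\w+)\s+SR=(\w+)\s+PC=(\w+)")

# Assumption: inferred from "0 3500" being read as 35 seconds of uptime.
TICKS_PER_SECOND = 100


def uptime_seconds(line):
    """Return uptime in seconds for a Time Stamp line, or None if unknown."""
    m = _TS.search(line)
    if m is None:
        return None  # e.g. "TimeStamp =00" on the Stop Thrash entry
    first, ticks = int(m.group(1)), int(m.group(2))
    if first != 0:
        return None  # meaning of the first field is not known here
    return ticks / TICKS_PER_SECOND


def parse_catch(line):
    """Split a Catch line into its VO, SR and PC fields (kept as text)."""
    m = _CATCH.search(line)
    if m is None:
        return None
    return {"vo": m.group(1), "sr": m.group(2), "pc": m.group(3)}
```

The fields are kept as text rather than converted to integers because a faxed, retyped dump can contain mistyped digits. Mapping a PC value to a routine still needs the link map for the exact MAM firmware version, which this sketch does not attempt.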
The rest of the entries are similar; the hub is crashing due to the
corrupted configuration:
Entry =12
Time Stamp =0 17700
Reset Count =13
Catch VO=07C SR=2004 PC=41B60E
Dump another entry [Y]/N? y
Entry =11
TimeStamp =0 7100
Reset Count =12
Catch VO=07C SR=2004 PC=41B2F0
Dump another entry [Y]/N? y
Entry =10
Time Stamp =0 136300
Reset Count =11
Catch VO=07C SR=2009 PC=41B2BC
Dump another entry [Y]/N? y
Entry =9
Time Stamp =0 10600
ResetCount =10
Catch VO=07C SR=2004 PC=41B2B8
Dump another entry [Y]/N? y
No more Error Log entries.
Press Return for Main Menu...
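The pattern described in this reply (a run of crashes on consecutive resets, ended by a "Stop Thrash" entry) can be picked out of a pasted dump with a short Python sketch. The function names and the dictionary layout are hypothetical, and nothing here is the hub's own logic: the number of crashes that triggers the clear is not stated in this note, so the sketch only counts what the retained log entries show.

```python
import re

_ENTRY = re.compile(r"Entry\s*=\s*(\d+)")
_RESET = re.compile(r"Reset\s*Count\s*=\s*(\d+)")


def parse_entries(text):
    """Turn a pasted hub error log dump into a list of entry dicts."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = _ENTRY.match(line)
        if m:
            current = {"entry": int(m.group(1)), "reset_count": None, "event": None}
            entries.append(current)
            continue
        if current is None:
            continue  # header lines such as "Current Reset Count: 17"
        m = _RESET.match(line)
        if m:
            current["reset_count"] = int(m.group(1))
        elif line.startswith("Catch"):
            current["event"] = "crash"
        elif line.startswith("Stop Thrash"):
            current["event"] = "stop_thrash"
    return entries


def crashes_before_clear(entries):
    """Count crashes on consecutive resets immediately before a Stop Thrash."""
    by_reset = {e["reset_count"]: e for e in entries if e["reset_count"] is not None}
    stops = [e for e in entries
             if e["event"] == "stop_thrash" and e["reset_count"] is not None]
    if not stops:
        return 0
    reset = stops[0]["reset_count"] - 1
    run = 0
    while reset in by_reset and by_reset[reset]["event"] == "crash":
        run += 1
        reset -= 1
    return run
```

Fed the hub dump quoted in this note, `crashes_before_clear(parse_entries(dump))` reports five crashes (reset counts 10 through 14) ahead of the Stop Thrash entry at reset count 15. Older entries may already have been overwritten, so that figure is a lower bound rather than the firmware's threshold.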
2847.2. "Appreciate the explanation" by CSC32::R_BUCK (Have been assimilated) Mon Oct 09 1995 18:58
Michael,
Thanks for taking the time to explain the entries. Will pass the
information along to the field engineer along with the suggestion to
go ahead and upgrade to the latest firmware.
Randall