Title: DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice: Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator: NETCAD::COLELLA DT
Created: Wed Nov 13 1991
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 4455
Total number of notes: 16761
3516.0. "some more errorlog from DS900EF..." by BERFS4::NORD () Tue May 07 1996 15:41
Good morning, good evening, and anything in between
(do you remember this topic? Oh yeah, the one with the error log entries
in the DECswitch 900 is back!)
I have some new ones for you:
Berlin, a hospital: GIGAswitch, several DM900s with DS900EFs, stackables,
and a lot of copper and fiber!
I was sitting there with my laptop, giving one of the DM900s an IP address,
when I saw that the DS900EF in slot 8 of this backplane
was running selftest, grumble grumble. The selftest didn't finish, but port
2 (AUI) was yellow, hmmm. (I had nothing configured on the backplane or
the DS900EF.) Swapped it out and back in: the same! Swapped it out and
back in again: the selftest LED was blinking: non-fatal error! OK. DM900
menu: 9 (redirect...) to slot 8: No line-card..., hey! It was just
running the selftest (nobody knows why, but...). After the module came
up, I was able to "redirect" to slot 8 and I did a "dump error log".
Below you will see the output:
The first entries are from before the non-fatal selftest error; the second
set is from after it. What is amazing is that the module has firmware
version 1.5.2 installed, yet the second set of entries tells me it has
V2.1 installed. Should I be astonished, or is that normal?
Yes, I'll swap this module, no problem, but I need some information
about the entries, because this is the second DS900EF of 20 (they are not
configured yet).
Many thanks,
with regards
Wolfgang Nord
MCS Berlin, Germany
DECswitch 900EF - slot 8
==============================================================================
DUMP ERROR LOG
Current Reset Count: 9223
==============================================================================
Entry # = 3
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 10
Firmware Rev = 1.5
Reset Count = 9222
Timestamp = 0 0 0
Write Count = 2513
FRU Mask = 0
Test ID = DEAD
Error Data = SR=2008 PC=03033234 Error Code=000023C0 ProcCsr=376D
Registers = D0=00008344 D1=00000001 D2=00000006 D3=00000001
D4=00000001 D5=00000787 D6=00000000 D7=0000FFFF
A0=05033010 A1=0004BA04 A2=0004B8B4 A3=05010012
A4=030020D8 A5=03020000 A6=0004B830 A7=0004B7C8
Dump another entry [Y]/N?
Entry # = 2
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 10
Firmware Rev = 1.5
Reset Count = 9221
Timestamp = 0 0 0
Write Count = 2513
FRU Mask = 0
Test ID = DEAD
Error Data = SR=2008 PC=03033234 Error Code=000023C0 ProcCsr=3F6D
Registers = D0=00008344 D1=00000001 D2=00000006 D3=00000001
D4=00000001 D5=00000787 D6=00000000 D7=0000FFFF
A0=05033010 A1=0004BA04 A2=0004B8B4 A3=05010012
A4=030020D8 A5=03020000 A6=0004B830 A7=0004B7C8
Dump another entry [Y]/N?
Entry # = 1
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 10
Firmware Rev = 1.5
Reset Count = 9220
Timestamp = 0 0 0
Write Count = 2513
FRU Mask = 0
Test ID = DEAD
Error Data = SR=2008 PC=03033234 Error Code=000023C0 ProcCsr=376D
Registers = D0=0000C344 D1=00000001 D2=00000006 D3=00000001
D4=00000000 D5=00000000 D6=00000001 D7=00000000
A0=05033010 A1=0004BA04 A2=0004B8B4 A3=05010012
A4=030020D8 A5=03020000 A6=0004B860 A7=0004B7F8
Dump another entry [Y]/N?
Entry # = 0
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 10
Firmware Rev = 1.5
Reset Count = 9219
Timestamp = 0 0 F
Write Count = 2513
FRU Mask = 0
Test ID = DEAD
Error Data = SR=2008 PC=03044244 Error Code=000023C0 ProcCsr=376D
Registers = D0=0000C344 D1=00000001 D2=00000006 D3=00000001
D4=00000000 D5=00000000 D6=00000001 D7=00000000
A0=05033010 A1=0004BA04 A2=0004B8B4 A3=05010012
A4=030020D8 A5=03020000 A6=0004B860 A7=0004B834
Dump another entry [Y]/N?
DECswitch 900EF - slot 8
==============================================================================
DUMP ERROR LOG
Current Reset Count: 9225
==============================================================================
Entry # = 0
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 1
Firmware Rev = 2.1
Reset Count = 9224
Timestamp = 0 0 0
Write Count = 2515
FRU Mask = 2
Test ID = 962
Error Data = SR=0006 PC=00000000 Error Code=00000000 ProcCsr=0000
0:00000006 1:00000000 2:00000000 3:00000000
4:00000000 5:00000000 6:00000000 7:00000000
Dump another entry [Y]/N? y
Entry # = 3
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 1
Firmware Rev = 2.1
Reset Count = 9224
Timestamp = 0 0 0
Write Count = 2514
FRU Mask = 2
Test ID = 964
Error Data = SR=0002 PC=00000002 Error Code=00000004 ProcCsr=0000
0:00000002 1:00000002 2:00000004 3:00000000
4:00000000 5:00000000 6:00000000 7:00000000
Dump another entry [Y]/N?
Entry # = 2
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 1
Firmware Rev = 2.1
Reset Count = 9225
Timestamp = 0 0 0
Write Count = 2515
FRU Mask = 2
Test ID = 963
Error Data = SR=0002 PC=00000043 Error Code=00000000 ProcCsr=0000
0:00000002 1:00000043 2:00000000 3:00000000
4:00000000 5:00000000 6:00000000 7:00000000
Dump another entry [Y]/N?
Dump another entry [Y]/N? y
Entry # = 1
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 10
Firmware Rev = 1.5
Reset Count = 9225
Timestamp = 0 0 269
Write Count = 2515
FRU Mask = 0
Test ID = DEAD
Error Data = SR=2008 PC=03033234 Error Code=000023C0 ProcCsr=1F6D
Registers = D0=0000C344 D1=00000001 D2=00000006 D3=00000001
D4=00000000 D5=00000000 D6=00000001 D7=00000000
A0=05033010 A1=0004BA04 A2=0004B8B4 A3=05010012
A4=030020D8 A5=03020000 A6=0004B860 A7=0004B7F8
Dump another entry [Y]/N?
Dump another entry [Y]/N? y
Entry # = 0
Entry Status = 0 [0=valid, 1=write_error, 2=invalid, 3=empty, 4=crc_error
Entry Id = 1
Firmware Rev = 2.1
Reset Count = 9224
Timestamp = 0 0 0
Write Count = 2515
FRU Mask = 2
Test ID = 962
Error Data = SR=0006 PC=00000000 Error Code=00000000 ProcCsr=0000
0:00000006 1:00000000 2:00000000 3:00000000
4:00000000 5:00000000 6:00000000 7:00000000
Dump another entry [Y]/N?
3516.1. "Answers on your error log entries & other hints...." by NETCAD::BATTERSBY (Don't use time/words carelessly) Wed May 08 1996 11:18
Wolfgang, I'll try to answer this note.
First off, don't grumble about the selftest. Treat the selftest
diagnostics as a tool for finding internal hardware problems before
the user environment finds them, not as a hindrance. :-)
Now, what the yellow port 2 state LED is telling you is that there is
a non-fatal problem with one of the two internal modules within the
900EF box, namely the I/O module. The Test IDs indicating this failure
are in your "after" dump of the error log: 962, 964, and 963. These are
internal diagnostic tests specifically telling you that there is a
hardware problem related to Ethernet port #7. How you got more than 4
error log entries is beyond me, as there are only 4 error log table
entries; after the 4th error log entry is written, the first one is
overwritten, and so on.
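The overwrite behaviour described above can be pictured with a small sketch. This is only an illustration in Python, not the DECswitch firmware: the class name `ErrorLog`, its `write` method, and the entry strings are made up for the example. The only detail taken from this note is that there are four slots and that the write after the fourth lands on the first slot again.

```python
class ErrorLog:
    """Hypothetical fixed-size log: once full, new entries overwrite the oldest slot."""

    def __init__(self, slots=4):
        self.slots = [None] * slots   # four table entries, as described in this note
        self.write_count = 0          # total number of writes so far

    def write(self, entry):
        """Store an entry and return the slot number it was written to."""
        index = self.write_count % len(self.slots)
        self.slots[index] = entry
        self.write_count += 1
        return index


log = ErrorLog()
used = [log.write(f"error {n}") for n in range(5)]
print(used)        # slot numbers used by five consecutive writes
print(log.slots)   # slot 0 now holds the fifth entry
```

In the "after" dump in .0, Entry # 0 appears twice with identical fields. That would be consistent with the dump command wrapping around the same four slots rather than with a fifth entry, but this reading is an inference from the posted output, not something confirmed in this thread.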
The earlier error log entries you captured with the "DEAD" Test ID are
errors reported by the operational firmware. The error code reported in
each is 23C0. As I recall, this code is probably related to Packet Memory
parity errors, which in turn are probably related to the ultimate failure
of port 7. What may have happened earlier is that whatever was
intermittent (or partially failing) on port 7 corrupted packets being
received and stored in packet memory. BTW, when individual ports fail on
the switch products, the module is allowed to come up into operational
mode so that the failure can be diagnosed via access through one of the
other working ports.
Now for the "Firmware Rev" discrepancy. The error log entries with
"Test ID = DEAD" are error log entries from the operational firmware,
so the Firmware Rev = 1.5 field is the major rev of the firmware
rev 1.5.2. The error log entries with "Test ID = XXX" (XXX being
some alphanumeric) are error log entries from the internal diagnostics.
The Firmware Rev = 2.1 field is the rev of the diagnostic dispatcher
code used to report the failure.
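For anyone sorting through a captured dump, the rule above (Test ID = DEAD means operational firmware, anything else means internal diagnostics) can be applied mechanically. The following Python sketch is hypothetical: the function names `parse_entries` and `reported_by`, the `FIELDS` set, and the shortened `SAMPLE` text are assumptions made for illustration, with field names and values copied from the dump in .0.

```python
# Fields kept from each entry; names match the dump output in .0.
FIELDS = {"Entry #", "Firmware Rev", "Reset Count", "Test ID"}

def parse_entries(dump_text):
    """Split a pasted 'dump error log' capture into one dict per entry."""
    entries = []
    for line in dump_text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue                      # prompts and register rows without '='
        key, value = key.strip(), value.strip()
        if key == "Entry #":
            entries.append({})            # a new entry starts here
        if entries and key in FIELDS:
            entries[-1][key] = value
    return entries

def reported_by(entry):
    """Apply the rule from this note: DEAD = operational firmware."""
    if entry["Test ID"] == "DEAD":
        return "operational firmware"
    return "internal diagnostics"

# Shortened sample built from two entries of the "after" dump.
SAMPLE = """\
Entry # = 1
Firmware Rev = 1.5
Reset Count = 9225
Test ID = DEAD
Error Data = SR=2008 PC=03033234 Error Code=000023C0 ProcCsr=1F6D
Entry # = 0
Firmware Rev = 2.1
Reset Count = 9224
Test ID = 962
"""

entries = parse_entries(SAMPLE)
for e in entries:
    print(e["Entry #"], e["Firmware Rev"], reported_by(e))
```

Read this way, the Firmware Rev column stops looking contradictory: each value belongs to whichever piece of code wrote that entry.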
Notice too how the error log entries reported by the operational firmware
have a more detailed structure than those from the internal diagnostics.
This was done to make operational firmware error log entries easier to
debug.
So to summarize, you have a DECswitch which apparently had some sort of
partial failure on port 7 that likely caused the packet memory parity
errors. Subsequently, something in the port 7 circuitry has failed
completely enough to now be reported by the internal diagnostics.
Replacing the module is the prudent thing to do.
I hope this information is helpful to you.
-Bob