Title: AlphaServer 4100
Moderator: MOVMON::DAVIS S
Created: Tue Apr 16 1996
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 648
Total number of notes: 3158
543.0. "machine check code 202 ?" by LEMAN::AUBERT (Multivendor Customer Services at CERN...) Mon Mar 24 1997 10:17
Hello,
A customer of mine has an AlphaServer 4100 that crashed with
Machine Check Code = 0x2020000.
I found a tool that can decode machine check codes 660 and 670, but
not 202.
The configuration (uerf -r 300 abstract) and the crash-data file
are attached below.
Thanks in advance for any diagnosis.
Thierry Aubert/DEC at CERN
---------------------------------------------------------------------
EVENT CLASS OPERATIONAL EVENT
OS EVENT TYPE 300. SYSTEM STARTUP
SEQUENCE NUMBER 1.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Fri Mar 21 18:30:17 1997
OCCURRED ON SYSTEM shd50
SYSTEM ID x00050016
SYSTYPE x00000000
MESSAGE Alpha boot: available memory
from
_0xb12000 to 0xfffe000
Digital UNIX V4.0B (Rev. 564);
Tue
_Feb 25 16:05:17 MET 1997
physical memory = 256.00
megabytes.
available memory = 244.92
megabytes.
using 975 buffers containing
7.61
_megabytes of memory
Master cpu at slot 0.
Firmware revision: 3.0
PALcode: Digital-UNIX/OSF
version 1.21
AlphaServer 4100 5/300 0MB
pci1 at mcbus0 slot 5
psiop0 at pci1 slot 1
Loading SIOP: script c0000300,
reg
_4444000, data c000c250
scsi0 at psiop0 slot 0
rz5 at scsi0 target 5 lun 0
(LID=0)
_(DEC RRD45 (C) DEC
0436)
pza0 at pci1 slot 2
pza0 firmware version: DEC P01
A10
_
scsi1 at pza0 slot 0
tz8 at scsi1 target 0 lun 0
(LID=1)
_(STK SD-3
011E)
_(Wide16)
pza1 at pci1 slot 3
pza1 firmware version: DEC P01
A10
_
scsi2 at pza1 slot 0
tz16 at scsi2 target 0 lun 0
(LID=2)
_(STK SD-3
011E)
_(Wide16)
pza2 at pci1 slot 4
pza2 firmware version: DEC P01
A10
_
scsi3 at pza2 slot 0
tz24 at scsi3 target 0 lun 0
(LID=3)
_(STK SD-3
011E)
_(Wide16)
pza3 at pci1 slot 5
pza3 firmware version: DEC P01
A10
_
scsi4 at pza3 slot 0
tz32 at scsi4 target 0 lun 0
(LID=4)
_(STK SD-3
011E)
_(Wide16)
gpc0 at eisa0
pci0 at mcbus0 slot 4
eisa0 at pci0
ace0 at eisa0
ace1 at eisa0
lp0 at eisa0
fdi0 at eisa0
fd0 at fdi0 unit 0
pci2000 at pci0 slot 2
isp0 at pci2000 slot 0
isp0: QLOGIC ISP1020A
isp0: Firmware revision 2.10
(loaded
_by console)
scsi5 at isp0 slot 0
rz40 at scsi5 target 0 lun 0
(LID=5)
_(DEC RZ29B (C) DEC
0016)
_(Wide16)
rz41 at scsi5 target 1 lun 0
(LID=6)
_(DEC RZ29B (C) DEC
0016)
_(Wide16)
rz42 at scsi5 target 2 lun 0
(LID=7)
_(DEC RZ29B (C) DEC
0016)
_(Wide16)
rz43 at scsi5 target 3 lun 0
(LID=8)
_(DEC RZ29B (C) DEC
0016)
_(Wide16)
rz44 at scsi5 target 4 lun 0
(LID=9)
_(DEC RZ29B (C) DEC
0016)
_(Wide16)
rz45 at scsi5 target 5 lun 0
(LID=10)
_(DEC RZ29B (C) DEC
0016)
_(Wide16)
rz46 at scsi5 target 6 lun 0
(LID=11)
_(DEC RZ29B (C) DEC
0016)
_(Wide16)
tu0: DECchip 21140-AA:
Revision: 1.2
tu0 at pci0 slot 3
tu0: DEC Fast Ethernet
Interface,
_hardware address:
00-00-F8-31-10-CF
tu0: console mode: selecting
10BaseT
_(UTP) port: half duplex
hip0: Roadrunner version 2
(20000900)
hip0 at pci0 slot 4
hip0 slot 4: PCI/HIPPI
interface
_0-a0-88-1-0-72
fta0 DEC DEFPA FDDI Module,
Hardware
_Revision 1
fta0 at pci0 slot 5
fta0: DMA Available.
fta0: DEC DEFPA (PDQ) FDDI
Interface,
_Hardware address:
00-00-F8-4A-8C-BA
fta0: Firmware rev: 2.46
Created FRU table configuration
binary
_errorlog packet
kernel console: ace0
dli: configured
-----------------------------------------------------------------------
#
# Crash Data Collection (Version 1.4)
#
_crash_data_collection_time: Fri Mar 21 18:31:05 MET 1997
_current_directory: /
_crash_kernel: /var/adm/crash/vmunix.0
_crash_core: /var/adm/crash/vmcore.0
_crash_arch: alpha
_crash_os: Digital UNIX
_host_version: Digital UNIX V4.0B (Rev. 564); Tue Feb 25 16:05:17 MET
1997
_crash_version: Digital UNIX V4.0B (Rev. 564); Tue Feb 25 16:05:17 MET
1997
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
_crashtime: struct {
tv_sec = 858965172
tv_usec = 574525
}
_boottime: struct {
tv_sec = 857067239
tv_usec = 77469
}
_config: struct {
sysname = "OSF1"
nodename = "shd50"
release = "V4.0"
version = "564"
machine = "alpha"
}
_cpu: 49
_system_string: 0xffffffffff800f58 = "AlphaServer 4100 5/300 0MB"
_ncpus: 4
_avail_cpus: 4
_partial_dump: 1
_physmem(MBytes): 255
_panic_string: 0xfffffc00005c0dd8 = "System Uncorrectable Machine
Check "
_paniccpu: 0
_panic_thread: 0xfffffc000ff0e2c0
_preserved_message_buffer_begin:
struct {
msg_magic = 0x63061
msg_bufx = 0xf3a
msg_bufr = 0xe41
msg_bufc = "Alpha boot: available memory from 0xb12000 to 0xfffe000
Digital UNIX V4.0B (Rev. 564); Tue Feb 25 16:05:17 MET 1997
physical memory = 256.00 megabytes.
available memory = 244.92 megabytes.
using 975 buffers containing 7.61 megabytes of memory
Master cpu at slot 0.
Firmware revision: 3.0
PALcode: Digital-UNIX/OSF version 1.21
AlphaServer 4100 5/300 0MB
pci1 at mcbus0 slot 5
psiop0 at pci1 slot 1
Loading SIOP: script c0000300, reg 4444000, data c000c250
scsi0 at psiop0 slot 0
rz5 at scsi0 target 5 lun 0 (LID=0) (DEC RRD45 (C) DEC 0436)
pza0 at pci1 slot 2
pza0 firmware version: DEC P01 A10
scsi1 at pza0 slot 0
tz8 at scsi1 target 0 lun 0 (LID=1) (STK SD-3 011D)
(Wide16)
pza1 at pci1 slot 3
pza1 firmware version: DEC P01 A10
scsi2 at pza1 slot 0
tz16 at scsi2 target 0 lun 0 (LID=2) (STK SD-3 011D)
(Wide16)
pza2 at pci1 slot 4
pza2 firmware version: DEC P01 A10
scsi3 at pza2 slot 0
tz24 at scsi3 target 0 lun 0 (LID=3) (STK SD-3 011D)
(Wide16)
pza3 at pci1 slot 5
pza3 firmware version: DEC P01 A10
scsi4 at pza3 slot 0
tz32 at scsi4 target 0 lun 0 (LID=4) (STK SD-3 011D)
(Wide16)
gpc0 at eisa0
pci0 at mcbus0 slot 4
eisa0 at pci0
ace0 at eisa0
ace1 at eisa0
lp0 at eisa0
fdi0 at eisa0
fd0 at fdi0 unit 0
pci2000 at pci0 slot 2
isp0 at pci2000 slot 0
isp0: QLOGIC ISP1020A
isp0: Firmware revision 2.10 (loaded by console)
scsi5 at isp0 slot 0
rz40 at scsi5 target 0 lun 0 (LID=5) (DEC RZ29B (C) DEC 0016)
(Wide16)
rz41 at scsi5 target 1 lun 0 (LID=6) (DEC RZ29B (C) DEC 0016)
(Wide16)
rz42 at scsi5 target 2 lun 0 (LID=7) (DEC RZ29B (C) DEC 0016)
(Wide16)
rz43 at scsi5 target 3 lun 0 (LID=8) (DEC RZ29B (C) DEC 0016)
(Wide16)
rz44 at scsi5 target 4 lun 0 (LID=9) (DEC RZ29B (C) DEC 0016)
(Wide16)
rz45 at scsi5 target 5 lun 0 (LID=10) (DEC RZ29B (C) DEC 0016)
(Wide16)
rz46 at scsi5 target 6 lun 0 (LID=11) (DEC RZ29B (C) DEC 0016)
(Wide16)
tu0: DECchip 21140-AA: Revision: 1.2
tu0 at pci0 slot 3
tu0: DEC Fast Ethernet Interface, hardware address: 00-00-F8-31-10-CF
tu0: console mode: selecting 10BaseT (UTP) port: half duplex
hip0: Roadrunner version 2 (20000900)
hip0 at pci0 slot 4
hip0 slot 4: PCI/HIPPI interface 0-a0-88-1-0-72
fta0 DEC DEFPA FDDI Module, Hardware Revision 1
fta0 at pci0 slot 5
fta0: DMA Available.
fta0: DEC DEFPA (PDQ) FDDI Interface, Hardware address:
00-00-F8-4A-8C-BA
fta0: Firmware rev: 2.46
Created FRU table configuration binary errorlog packet
kernel console: ace0
dli: configured
Starting secondary cpu 1
Starting secondary cpu 2
Starting secondary cpu 3
ADVFS: using 2322 buffers containing 18.14 megabytes of memory
hip0: RunCode 1.0.50 is up
fta0: Link Unavailable.
fta0: Link Available.
hip0: Optical link OFF
hip0: Optical link ON
fta0: Link Unavailable.
fta0: Link Available.
NFS3 RFS3_GETATTR failed for server f-shiftnfs: RPC: Timed out
fta0: Link Unavailable.
fta0: Link Available.
NFS3 RFS3_WRITE failed for server f-shiftnfs: RPC: Timed out
NFS3 write error 60 on host f-shiftnfs
hip0: Optical link OFF
hip0: RunCode 1.0.50 is up
hip0: Optical link OFF
hip0: Optical link ON
hip0: Optical link OFF
hip0: RunCode 1.0.50 is up
hip0: Optical link OFF
hip0: Optical link ON
hip0: Optical link OFF
hip0: RunCode 1.0.50 is up
hip0: Optical link OFF
hip0: Optical link ON
hip0: Optical link OFF
hip0: RunCode 1.0.50 is up
hip0: Optical link OFF
hip0: Optical link ON
hip0: Optical link OFF
hip0: RunCode 1.0.50 is up
hip0: Optical link OFF
hip0: RunCode 1.0.50 is up
hip0: Optical link OFF
hip0: Optical link ON
HIPPI Adapter hip0: New logical address 0x150
panic (cpu 0): System Uncorrectable Machine Check
device string for dump = SCSI 0 2000 0 0 0 0 0.
DUMP.prom: dev SCSI 0 2000 0 0 0 0 0, block 262144
device string for dump = SCSI 0 2000 0 0 0 0 0.
DUMP.prom: dev SCSI 0 2000 0 0 0 0 0, block 262144
"
}
_preserved_message_buffer_end:
_kernel_process_status_begin:
PID COMM
00000 kernel idle
00001 init
25602 tcsh
00007 kloadsrv
00022 vold
23584 dktotpF
23597 rtcopyd
00050 update
05173 stgdaemon
27703 rlogind
27711 getty
25682 telnetd
24678 rtcopyd
00107 essd
00110 essdmpd
03186 telnetd
00147 syslogd
00149 binlogd
20632 cpdsktp
14521 rexx
24811 rexx
05364 ctpdaemon
03318 cpdsktp
11571 rlogind
00325 portmap
00333 ypbind
00340 nfsiod
21850 dumptape
23899 tpmnt
31091 tpdaemon
00398 sendmail
23986 sh
00439 inetd
00442 snmpd
00451 os_mibs
11721 csh
00475 cron
00478 volwatch
00484 volwatch
00485 volnotify
00499 lpd
00527 dtlogin
00557 rfiod
00560 tmsdaemon
23095 rtcopyd
00570 ssi
26203 dktotpF
26212 dktotpF
26221 csh
14967 csh
25246 rtcopyd
26340 tcsh
24325 cpdsktp
25354 rtcopyd
25386 cpdsktp
16220 vdumptape
14282 dktotpF
14310 tpdump
_kernel_process_status_end:
_current_pid: 0
_current_tid: 0xfffffc000ff0e2c0
_proc_thread_list_begin:
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
thread 0xfffffc000ff0e580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0e840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0eb00 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0edc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f340 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0fb80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba000 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba2c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bab00 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6badc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb340 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bbb80 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
thread 0xfffffc000e690000 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
thread 0xfffffc000e6902c0 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
thread 0xfffffc000e690580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e690840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e690dc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691340 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6918c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691b80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69c000 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69c2c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69cdc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69d080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69c580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69d8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69db80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc0002f7d600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc0002f7d8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc0002f7db80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e000 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e2c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc00026a02c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc00026a0580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
_proc_thread_list_end:
warning: Files compiled -g3: parameter values probably wrong
_dump_begin:
> 0 boot(0x400000000, 0xffffffff90150000, 0xfffffc00005c0948, 0x0,
0x100000000)
["../../../../src/kernel/arch/alpha/machdep.c":2466,
0xfffffc0000463818]
1 panic(s = 0xfffffc00005c0dd8 = "System Uncorrectable Machine Check
") ["../
../../../src/kernel/bsd/subr_prf.c":707, 0xfffffc00002993ac]
pcpu = 0xfffffc0000615b10
i = 6032856
mycpu = 0
spl = 4
2 machcheck(0xfffffc000ff50000, 0xfffffc0000004838,
0xfffffc0000004838, 0xfff
fffff90153638, 0xfffffc0000004838)
["../../../../src/kernel/arch/alpha/hal/kn300
.c":2092, 0xfffffc000048fd70]
3 mach_error(0xfffffc0000004838, 0xffffffff90153638,
0xfffffc0000004838, 0x66
0, 0xfffffc000045f4b0)
["../../../../src/kernel/arch/alpha/hal/cpusw.c":808, 0xf
ffffc00004746f8]
4 _XentInt(0x2, 0xfffffc0000468f94, 0xfffffc00005ddd90, 0x2, 0x0)
["../../../
../src/kernel/arch/alpha/locore.s":1112, 0xfffffc000045f4ac]
5 swap_ipl(0x2, 0xfffffc0000468f94, 0xfffffc00005ddd90, 0x2, 0x0)
["../../../
../src/kernel/arch/alpha/spl.s":135, 0xfffffc0000468f90]
6 boot(0x0, 0xfffffc000ff0e2c0, 0x2c0000003f, 0x3f,
0xfffffc0000000001) ["../
../../../src/kernel/arch/alpha/machdep.c":2381, 0xfffffc00004636a4]
7 panic(s = 0xfffffc00005c0dd8 = "System Uncorrectable Machine Check
") ["../
../../../src/kernel/bsd/subr_prf.c":791, 0xfffffc000029954c]
pcpu = 0xd
i = 6217104
mycpu = 0
spl = 4
8 machcheck(0xfffffc000ff50000, 0xfffffc0000004838,
0xfffffc0000004838, 0xfff
fffff901538a0, 0xfffffc0000004838)
["../../../../src/kernel/arch/alpha/hal/kn300
.c":2092, 0xfffffc000048fd70]
9 mach_error(0xfffffc0000004838, 0xffffffff901538a0,
0xfffffc0000004838, 0x66
0, 0xfffffc000045f4b0)
["../../../../src/kernel/arch/alpha/hal/cpusw.c":808, 0xf
ffffc00004746f8]
10 _XentInt(0x0, 0xfffffc000044ec3c, 0xfffffc00005ddd90, 0x3fff, 0x1)
["../../
../../src/kernel/arch/alpha/locore.s":1112, 0xfffffc000045f4ac]
11 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1935,
0xfffffc0
00044ec38]
p = 0xfffffc0000200400
12 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3327,
0xfffffc000
02c87ac]
threadp = 0xfffffc0000200838
gcount = 0xfffffc000061856c
lcount = 0xfffffc0000200414
myprocessor = 0xfffffc0000200400
th = 0x87c0f48d
mycpu = 0
pset = 0xfffffc000061856c
steal_depth = 1
lbolt_idle = 2277569677
skipped_cpu = 0
_dump_end:
_kernel_thread_list_begin:
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
thread 0xfffffc000ff0e580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0e840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0eb00 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0edc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f340 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0f8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000ff0fb80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba000 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba2c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6ba840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bab00 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6badc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb340 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bb8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6bbb80 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
thread 0xfffffc000e690000 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
thread 0xfffffc000e6902c0 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
thread 0xfffffc000e690580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e690840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e690dc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691340 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e6918c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e691b80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69c000 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69c2c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69cdc0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69d080 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69c580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69d8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000e69db80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc0002f7d600 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc0002f7d8c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc0002f7db80 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e000 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e2c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc000291e840 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc00026a02c0 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
thread 0xfffffc00026a0580 stopped at [thread_block:2097
,0xfffffc00002c6c50]
Source not available
_kernel_thread_list_end:
_savedefp: (nil)
_kernel_memory_fault_data_begin:
struct {
fault_va = 0x0
fault_pc = 0x0
fault_ra = 0x0
fault_sp = 0x0
access = 0x0
status = 0x0
cpunum = 0x0
count = 0x0
pcb = (nil)
thread = (nil)
task = (nil)
proc = (nil)
}
_kernel_memory_fault_data_end:
_uptime: 527.20 hours
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
paniccpu: 0x0
machine_slot[paniccpu]: struct {
is_cpu = 0x1
cpu_type = 0xf
cpu_subtype = 0x16
running = 0x1
cpu_ticks = {
[0] 0x8dea45
[1] 0x0
[2] 0x1ca1f16
[3] 0x7e28ce46
[4] 0x74038be
}
clock_freq = 0x4b0
error_restart = 0x0
cpu_panicstr = 0xfffffc00005c0dd8 = "System Uncorrectable Machine
Check "
cpu_panic_thread = 0xfffffc000ff0e2c0
}
tset machine_slot[paniccpu].cpu_panic_thread:
Begin Trace for machine_slot[paniccpu].cpu_panic_thread:
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
> 0 boot(0x400000000, 0xffffffff90150000, 0xfffffc00005c0948, 0x0,
0x100000000)
["../../../../src/kernel/arch/alpha/machdep.c":2466,
0xfffffc0000463818]
1 panic(s = 0xfffffc00005c0dd8 = "System Uncorrectable Machine Check
") ["../
../../../src/kernel/bsd/subr_prf.c":707, 0xfffffc00002993ac]
2 machcheck(0xfffffc000ff50000, 0xfffffc0000004838,
0xfffffc0000004838, 0xfff
fffff90153638, 0xfffffc0000004838)
["../../../../src/kernel/arch/alpha/hal/kn300
.c":2092, 0xfffffc000048fd70]
3 mach_error(0xfffffc0000004838, 0xffffffff90153638,
0xfffffc0000004838, 0x66
0, 0xfffffc000045f4b0)
["../../../../src/kernel/arch/alpha/hal/cpusw.c":808, 0xf
ffffc00004746f8]
4 _XentInt(0x2, 0xfffffc0000468f94, 0xfffffc00005ddd90, 0x2, 0x0)
["../../../
../src/kernel/arch/alpha/locore.s":1112, 0xfffffc000045f4ac]
5 swap_ipl(0x2, 0xfffffc0000468f94, 0xfffffc00005ddd90, 0x2, 0x0)
["../../../
../src/kernel/arch/alpha/spl.s":135, 0xfffffc0000468f90]
6 boot(0x0, 0xfffffc000ff0e2c0, 0x2c0000003f, 0x3f,
0xfffffc0000000001) ["../
../../../src/kernel/arch/alpha/machdep.c":2381, 0xfffffc00004636a4]
7 panic(s = 0xfffffc00005c0dd8 = "System Uncorrectable Machine Check
") ["../
../../../src/kernel/bsd/subr_prf.c":791, 0xfffffc000029954c]
8 machcheck(0xfffffc000ff50000, 0xfffffc0000004838,
0xfffffc0000004838, 0xfff
fffff901538a0, 0xfffffc0000004838)
["../../../../src/kernel/arch/alpha/hal/kn300
.c":2092, 0xfffffc000048fd70]
9 mach_error(0xfffffc0000004838, 0xffffffff901538a0,
0xfffffc0000004838, 0x66
0, 0xfffffc000045f4b0)
["../../../../src/kernel/arch/alpha/hal/cpusw.c":808, 0xf
ffffc00004746f8]
10 _XentInt(0x0, 0xfffffc000044ec3c, 0xfffffc00005ddd90, 0x3fff, 0x1)
["../../
../../src/kernel/arch/alpha/locore.s":1112, 0xfffffc000045f4ac]
11 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1935,
0xfffffc0
00044ec38]
12 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3327,
0xfffffc000
02c87ac]
End Trace for machine_slot[paniccpu].cpu_panic_thread:
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
"cpu_data" is not an array
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
_stack_trace[0]_begin:
> 0 boot(0x400000000, 0xffffffff90150000, 0xfffffc00005c0948, 0x0,
0x100000000)
["../../../../src/kernel/arch/alpha/machdep.c":2466,
0xfffffc0000463818]
1 panic(s = 0xfffffc00005c0dd8 = "System Uncorrectable Machine Check
") ["../
../../../src/kernel/bsd/subr_prf.c":707, 0xfffffc00002993ac]
2 machcheck(0xfffffc000ff50000, 0xfffffc0000004838,
0xfffffc0000004838, 0xfff
fffff90153638, 0xfffffc0000004838)
["../../../../src/kernel/arch/alpha/hal/kn300
.c":2092, 0xfffffc000048fd70]
3 mach_error(0xfffffc0000004838, 0xffffffff90153638,
0xfffffc0000004838, 0x66
0, 0xfffffc000045f4b0)
["../../../../src/kernel/arch/alpha/hal/cpusw.c":808, 0xf
ffffc00004746f8]
4 _XentInt(0x2, 0xfffffc0000468f94, 0xfffffc00005ddd90, 0x2, 0x0)
["../../../
../src/kernel/arch/alpha/locore.s":1112, 0xfffffc000045f4ac]
5 swap_ipl(0x2, 0xfffffc0000468f94, 0xfffffc00005ddd90, 0x2, 0x0)
["../../../
../src/kernel/arch/alpha/spl.s":135, 0xfffffc0000468f90]
6 boot(0x0, 0xfffffc000ff0e2c0, 0x2c0000003f, 0x3f,
0xfffffc0000000001) ["../
../../../src/kernel/arch/alpha/machdep.c":2381, 0xfffffc00004636a4]
7 panic(s = 0xfffffc00005c0dd8 = "System Uncorrectable Machine Check
") ["../
../../../src/kernel/bsd/subr_prf.c":791, 0xfffffc000029954c]
8 machcheck(0xfffffc000ff50000, 0xfffffc0000004838,
0xfffffc0000004838, 0xfff
fffff901538a0, 0xfffffc0000004838)
["../../../../src/kernel/arch/alpha/hal/kn300
.c":2092, 0xfffffc000048fd70]
9 mach_error(0xfffffc0000004838, 0xffffffff901538a0,
0xfffffc0000004838, 0x66
0, 0xfffffc000045f4b0)
["../../../../src/kernel/arch/alpha/hal/cpusw.c":808, 0xf
ffffc00004746f8]
10 _XentInt(0x0, 0xfffffc000044ec3c, 0xfffffc00005ddd90, 0x3fff, 0x1)
["../../
../../src/kernel/arch/alpha/locore.s":1112, 0xfffffc000045f4ac]
11 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1935,
0xfffffc0
00044ec38]
12 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3327,
0xfffffc000
02c87ac]
_stack_trace[0]_end:
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
"cpu_data" is not an array
thread 0xfffffc000e6bbb80 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
warning: Files compiled -g3: parameter values probably wrong
_stack_trace[1]_begin:
> 0 stop_secondary_cpu(do_lwc = 1)
["../../../../src/kernel/arch/alpha/cpu.c":5
07, 0xfffffc000045a0f4]
1 panic(s = 0xfffffc00005b5fc8 = "cpu_ip_intr: panic request")
["../../../../
src/kernel/bsd/subr_prf.c":761, 0xfffffc0000299518]
2 cpu_ip_intr() ["../../../../src/kernel/arch/alpha/cpu.c":647,
0xfffffc00004
5a480]
3 _XentInt(0x0, 0xfffffc00002c879c, 0xfffffc00005ddd90, 0x3fff, 0x1)
["../../
../../src/kernel/arch/alpha/locore.s":1076, 0xfffffc000045f45c]
4 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3322,
0xfffffc000
02c8798]
_stack_trace[1]_end:
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
"cpu_data" is not an array
thread 0xfffffc000e690000 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
warning: Files compiled -g3: parameter values probably wrong
_stack_trace[2]_begin:
> 0 stop_secondary_cpu(do_lwc = 1)
["../../../../src/kernel/arch/alpha/cpu.c":5
07, 0xfffffc000045a0f4]
1 panic(s = 0xfffffc00005b5fc8 = "cpu_ip_intr: panic request")
["../../../../
src/kernel/bsd/subr_prf.c":733, 0xfffffc000029944c]
2 cpu_ip_intr() ["../../../../src/kernel/arch/alpha/cpu.c":647,
0xfffffc00004
5a480]
3 _XentInt(0x0, 0xfffffc000044ec5c, 0xfffffc00005ddd90, 0x3fff, 0x1)
["../../
../../src/kernel/arch/alpha/locore.s":1076, 0xfffffc000045f45c]
4 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1944,
0xfffffc0
00044ec58]
5 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3326,
0xfffffc000
02c87a8]
_stack_trace[2]_end:
thread 0xfffffc000ff0e2c0 stopped at [boot:2466 ,0xfffffc0000463818]
Source
not available
"cpu_data" is not an array
thread 0xfffffc000e6902c0 stopped at [stop_secondary_cpu:513
,0xfffffc000045a0
f8] Source not available
warning: Files compiled -g3: parameter values probably wrong
_stack_trace[3]_begin:
> 0 stop_secondary_cpu(do_lwc = 1)
["../../../../src/kernel/arch/alpha/cpu.c":5
07, 0xfffffc000045a0f4]
1 panic(s = 0xfffffc00005b5fc8 = "cpu_ip_intr: panic request")
["../../../../
src/kernel/bsd/subr_prf.c":733, 0xfffffc000029944c]
2 cpu_ip_intr() ["../../../../src/kernel/arch/alpha/cpu.c":647,
0xfffffc00004
5a480]
3 _XentInt(0x0, 0xfffffc00002c8820, 0xfffffc00005ddd90,
0xfffffc0000201600, 0
x1) ["../../../../src/kernel/arch/alpha/locore.s":1076,
0xfffffc000045f45c]
4 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3342,
0xfffffc000
02c881c]
_stack_trace[3]_end:
/usr/bin/crashdc: /bin/kdbx: not found
#
_crash_data_collection_finished:
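As a quick sanity check on the crash-data file above, the _uptime value can be recomputed from the _boottime and _crashtime structs. This is only a minimal sketch: the function name is made up for illustration, and it assumes tv_sec/tv_usec are ordinary Unix timeval fields.

```python
def uptime_hours(boot_sec, boot_usec, crash_sec, crash_usec):
    """Return elapsed time between boot and crash, in hours.

    Hypothetical helper; assumes both timestamps are Unix timevals
    (seconds plus microseconds) as printed in the crash-data file.
    """
    elapsed = (crash_sec - boot_sec) + (crash_usec - boot_usec) / 1_000_000
    return elapsed / 3600.0


# Values copied from the _boottime and _crashtime structs in this note.
hours = uptime_hours(857067239, 77469, 858965172, 574525)
print(f"{hours:.2f} hours")
```

The result should agree with the _uptime line reported by crashdc (527.20 hours), which suggests the timestamps in the dump are consistent.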
T.R | Title | User | Personal Name | Date | Lines |
---|
543.1 | console output | LEMAN::AUBERT | Multivendor Customer Services at CERN... | Mon Mar 24 1997 10:25 | 129 |
| Below is also the console output for this machine check:
Machine Check SYSTEM Fatal Abort
Machine check code = 0x2020000
pal temp[0-1] = 0000000000000000 fffffc0000200838
pal temp[2-3] = fffffc000045f7d0 0000000000004400
pal temp[4-5] = 0000000000000000 0000000000000000
pal temp[6-7] = 0000000000000002 fffffc000045f210
pal temp[8-9] = 1f1e171515020100 fffffc000045f540
pal temp[10-11] = fffffc000044ec3c fffffc000045f3a0
pal temp[12-13] = fffffc000045f740 0000000000006e80
pal temp[14-15] = 0000000000000000 00000000000f0000
pal temp[16-17] = 0000020306600001 0000000000000000
pal temp[18-19] = 0000000000000000 ffffffff901539a8
pal temp[20-21] = 0000000000730000 fffffc000045f770
pal temp[22-23] = fffffc00005ddd90 000000000fe2fa38
shadow[0-1] = 0000000000000000 0000000000000000
shadow[2-3] = 0000000000000000 0000000000000000
shadow[4-5] = 00004b3600000000 0000000000000000
shadow[6-7] = 0000000000000000 0000000000000000
Addr of excepting instruction = fffffc000044ec3c
Summary of arithmetic traps = 0000000000000000
Exception mask = 0000000000000000
Base address for PALcode = 0000000000014000
Interrupt Status Reg = 0000000000200000
CURRENT SETUP OF EV5 IBOX = 000000c160000000
I-CACHE Reg Tag parity error = 0000000000000000
D-CACHE error Reg = 0000000000000000
Effective VA = fffffffdff7fc000
Reason for D-stream = 0000000000014850
EV5 SCache address = ffffff000001902f
EV5 SCache TAG/Data parity = 0000000000000000
EV5 BC_TAG_ADDR = ffffffffffffefff
EV5 EI_ADDR: Phys addr of Xfer = ffffff000800060f
Fill Syndrome = 0000000000000c0c
EI_STAT reg = MC_ERR1 reg. = 800fca00
CAP_ERR reg. = e0000000
PCI_ERR1 reg. = 00000000
MDPA_STAT reg. = 00000000
MDPA_SYN reg. = 00000000
MDPB_STAT reg. = 00000000
MDPB_SYN reg. = 00000000
panic (cpu 0): System Uncorrectable Machine Check
Machine Check SYSTEM Fatal Abort
Machine check code = 0x2020000
pal temp[0-1] = 0000000000000007 fffffc00005dc590
pal temp[2-3] = fffffc000045f7d0 0000000000004400
pal temp[4-5] = 00000000000f4013 0000f980000003f8
pal temp[6-7] = 0000000000000000 fffffc000045f210
pal temp[8-9] = 1f1e171515020100 fffffc000045f540
pal temp[10-11] = fffffc0000468f94 fffffc000045f3a0
pal temp[12-13] = fffffc000045f740 0000000000006e80
pal temp[14-15] = 0000000000000000 00000000000f0000
pal temp[16-17] = 0000020306600001 0000000000000000
pal temp[18-19] = 0000000000000000 ffffffff90153740
pal temp[20-21] = 0000000000730000 fffffc000045f770
pal temp[22-23] = fffffc00005ddd90 000000000fe2fa38
shadow[0-1] = 0000000000000000 0000000000000000
shadow[2-3] = 0000000000000000 0000000000
[console output lost here]
CAP_CTL reg. = 42460ff1
HAE_MEM reg. = 00000000
HAE_IO reg. = 00000000
INT_CTL reg. = 00000003
INT_REG reg. = 00811011
INT_MASK0 reg. = 00c51111
INT_MASK1 reg. = 00000000
MC_ERR0 reg. = 00006e80
MC_ERR1 reg. = 800e8a04
CAP_ERR reg. = 84000000
PCI_ERR1 reg. = 00000000
MDPA_STAT reg. = 00000000
MDPA_SYN reg. = 00000000
MDPB_STAT reg. = 00000000
MDPB_SYN reg. = 00000000
IOD 1 register dump:
Base Addr of PCI bridge = 000000fbe0000000
Whami reg. = 0000023a
Sys. Env. reg. = 00000000
PCI Rev. reg. = 06000032
CAP_CTL reg. = 42460ff1
HAE_MEM reg. = 00000000
HAE_IO reg. = 00000000
INT_CTL reg. = 00000003
INT_REG reg. = 00800100
INT_MASK0 reg. = 00c51111
INT_MASK1 reg. = 00000000
MC_ERR0 reg. = 00006e80
MC_ERR1 reg. = 800e8a04
CAP_ERR reg. = 84000000
PCI_ERR1 reg. = 00000000
MDPA_STAT reg. = 00000000
MDPA_SYN reg. = 00000000
MDPB_STAT reg. = 00000000
MDPB_SYN reg. = 00000000
DUMP: 1000000 blocks available for dumping.
DUMP: 57898 required for a partial dump.
DUMP: 0x814001 is the primary swap with 999999, start our last 57897
: of dump at 942102, going to end (real end is one more, for
header)
device string for dump = SCSI 0 2000 0 0 0 0 0.
DUMP.prom: dev SCSI 0 2000 0 0 0 0 0, block 262144
DUMP: Header to 0x814001 at 999999 (0xf423f)
device string for dump = SCSI 0 2000 0 0 0 0 0.
DUMP.prom: dev SCSI 0 2000 0 0 0 0 0, block 262144
DUMP: Dump to 0x814001: ............................: End 0x814001
device string for dump = SCSI 0 2000 0 0 0 0 0.
DUMP.prom: dev SCSI 0 2000 0 0 0 0 0, block 262144
DUMP: Header to 0x814001 at 999999 (0xf423f)
succeeded
halted CPU 1
halted CPU 2
halted CPU 3
CP - SAVE_TERM routine to be called
CP - SAVE_TERM exited with hlt_req = 1, r0 = 00000000.00000000
halted CPU 0
halt code = 5
HALT instruction executed
PC = fffffc000045fda0
CPU 0 booting
(boot dkf0.0.0.2000.0 -flags A)
------------------------------------------------------------------
Thierry Aubert at CERN
|
543.2 | | MAY30::CUMMINS | | Mon Mar 24 1997 11:59 | 38 |
| This is a 660 (IOD-detected/reported) MCHK. MCHK codes are used to
further differentiate MCHKs. CAP_ERR in both IODs is 0x84000000, which
means NXM (non-existent memory). The failing address is 0x0400006e80.
It looks like address bit 34 flipped, which we've seen before (twice),
both times after double-bit ECC (RDS) errors in memory.
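For anyone who wants to reproduce the address arithmetic, here is a
minimal Python sketch. It rebuilds the failing address from the MC_ERR0
and MC_ERR1 values in the IOD dumps in .0 and XORs it against 0x6e80 to
see which bit differs. Treat the register layout as an assumption: the
sketch supposes that the low byte of MC_ERR1 carries address bits
<39:32>, which is consistent with the 0x0400006e80 quoted here but is
not taken from the register specification. The names ADDR_HI_MASK,
failing_address and flipped_bits are made up for illustration.

```python
# Register values copied from the IOD register dumps in .0.
MC_ERR0 = 0x00006E80
MC_ERR1 = 0x800E8A04

# ASSUMPTION (not from the register spec): the low byte of MC_ERR1
# holds address bits <39:32>; MC_ERR0 holds address bits <31:0>.
ADDR_HI_MASK = 0xFF


def failing_address(mc_err0, mc_err1):
    """Combine the two error registers into one physical address."""
    return ((mc_err1 & ADDR_HI_MASK) << 32) | (mc_err0 & 0xFFFFFFFF)


def flipped_bits(observed, expected):
    """Return the bit positions at which the two addresses differ."""
    diff = observed ^ expected
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


addr = failing_address(MC_ERR0, MC_ERR1)
print(f"failing address = 0x{addr:010x}")
print("flipped bits    =", flipped_bits(addr, 0x00006E80))
```

With the values from .0 this reports a single differing bit, bit 34,
relative to 0x00006e80.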
We've not yet been able to figure out why an NXM is sometimes seen after
RDS errors. UNIX should not, and presumably does not, ever access
address 0x00006e80 (this is the base address of the AlphaServer
4100/4000's interrupt dispatch table in memory). PALcode stores this
address in a PALtemp and has no reason to ever modify the upper longword
of this address. Only quadword instructions are used when moving the
address around and accessing the dispatch table.
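As a quick cross-check against the dump in .0, the following Python
sketch scans "pal temp" lines for a value whose low longword is 0x6e80
and prints its upper longword. It is only an illustration: the two
input lines are copied from .0, the helper parse_pal_temps is a
hypothetical name, and the idea that the matching PALtemp is the one
holding the dispatch table base is an inference from the value, not
something the dump states. It also only shows the value at dump time.

```python
import re

# Two lines copied from the PALtemp dump in .0.
DUMP = """\
pal temp[12-13] = fffffc000045f740 0000000000006e80
pal temp[14-15] = 0000000000000000 00000000000f0000
"""

# Base address of the interrupt dispatch table, as given in this reply.
DISPATCH_BASE = 0x6E80

PAL_TEMP_LINE = re.compile(
    r"pal temp\[(\d+)-(\d+)\]\s*=\s*([0-9a-f]{16})\s+([0-9a-f]{16})"
)


def parse_pal_temps(text):
    """Map PALtemp index -> 64-bit value for every 'pal temp' line."""
    temps = {}
    for m in PAL_TEMP_LINE.finditer(text):
        temps[int(m.group(1))] = int(m.group(3), 16)
        temps[int(m.group(2))] = int(m.group(4), 16)
    return temps


temps = parse_pal_temps(DUMP)
# PALtemps whose low longword matches the dispatch table base.
holders = [n for n, v in temps.items() if (v & 0xFFFFFFFF) == DISPATCH_BASE]
for n in holders:
    print(f"pal temp[{n}] = 0x{temps[n]:016x}, "
          f"upper longword = 0x{temps[n] >> 32:08x}")
```

For the lines above, the only match is pal temp[13], and its upper
longword is zero in the dump.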
The fill error address is physical address 0x0044ec3c; it appears UNIX
was in its page testing code at the time of the MCHK. We've seen UNIX
panics whilst testing pages when SRM console's MEMORY_TEST environment
variable is set to less than FULL. Perhaps MEMORY_TEST < FULL was the
problem here? UNIX has a known but not yet understood VM bug and so
does not currently support partially tested memory on 4100/4000.
Two things to check/do:
1. Check SRM console's MEMORY_TEST EV setting; it must be set to
FULL. If not FULL, then this was presumably the problem. I don't
believe a UNIX QAR has been officially filed against this bug, so
I'm not sure how aggressively it's being investigated right now.
2. If MEMORY_TEST was set to FULL, then use DECevent V2.3 + KNL update
instead of UERF to reexamine the system error log. UERF is not
supported on the AlphaServer 4100/4000 platform. Look at all of
the error log entries. You'll likely find earlier MCHK entries
when you review/decipher the system error log.
Please let me know the outcome. If this was a MEMORY_TEST < FULL issue,
this may help us figure out why we see address bit 34 flip on a read of
6e80 on occasion.
BC
|
543.3 | KNL update ? | LEMAN::AUBERT | Multivendor Customer Services at CERN... | Thu Mar 27 1997 08:50 | 15 |
| First of all, thanks for your input.
I have checked the setting of MEMORY_TEST with my customer. It is set
to FULL. Unfortunately, I will have trouble investigating further, as
the error logs were corrupted at the time of the machine check!
But in the meantime I will ask the customer to install DECevent so the
next machine check will be easier to analyse.
In .2 you suggest that I use DECevent V2.3 + KNL update instead of UERF,
but what is '+ KNL update'?
Thanks for your answer.
Regards,
Thierry
|
543.4 | | MAY30::CUMMINS | | Thu Mar 27 1997 10:16 | 8 |
| Visit the DECEVENT Web page - I believe it or a sub-page talks about
KNL updates. My understanding is that there's DECevent V3.2 which
supports multiple platforms and then there are KNL updates for given
platforms, but this may be wrong. I do know for a fact that you need
the KNL updates if enabling FRU_TABLE at SRM console; and FRU table
info can be very useful in diagnosing faulty machines using DECevent.
BC
|