|
When you upgraded the PALcode, did you confirm that the correct disk
and system root were still set in the environment variables after
the PALcode upgrade? (Multiple VMScluster nodes booting from the
same system root on a SCSI system disk -- SYS0 is the default --
will tend to cause problems and can potentially corrupt the disk...)
I will assume that only one node was booted from the system disk
being upgraded -- the node that was performing the upgrade...
: When I rebooted one of the 2 nodes, the VMS banner indicated N6.2-1H3
: and the node crashed with a fatal bug check (process STACONFIG).
The upgrade had (obviously) not completed. What was the crash?
: Then I booted conversational and set NISCS_LOAD_PEA0 to 0 and the
: machine booted correctly.
I have found that when I set VAXCLUSTER to 0, it is often best
to also set NISCS_LOAD_PEA0 to 0.
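
For reference, a conversational boot that sets both parameters might
look roughly like the sketch below. This is an illustration only, not
a transcript from the affected system: the device name is borrowed from
the boot command quoted elsewhere in this topic, and the flags value
0,1 (system root 0, conversational boot) is an assumption about how
the node is normally booted.

    >>> boot dkb100 -flags 0,1
    SYSBOOT> SHOW VAXCLUSTER
    SYSBOOT> SET VAXCLUSTER 0
    SYSBOOT> SET NISCS_LOAD_PEA0 0
    SYSBOOT> CONTINUE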
|
| Well, Steve,
First: of course, when I did the upgrade, only one node was booted
in the cluster, and the upgrade did not show any error message.
Second: the crash always occurred with VAXCLUSTER set to 2. Of course,
since I restored the system disk, I have no system dump to analyze, but
below is some hard-copy information:
(boot dkb100.1.0.2001.0 -flags 0,0)
.....
jumping to bootstrap code
OpenVMS (TM) Alpha Operating System , Version N6.2-1H3
%CNXMAN, Using remote access method for quorum disk
waiting to form or join a VMScluster system
%VMScluster-I-LOADSECDB, loading the cluster security database
System Machine Check Through Vector 00000660
logout frame address 0x6048 code 0x208
Machine Check Code---> 0x208 Invalid page table lookup (scatter gather)
IPRs:
EXC_ADD:00000000001214A8 ICCSR: 001EC4F800000004 HIER: 00000001FFFFDCC0
HIRR: some other hex codes...
EDSR (Comanche): 00002000-->
DCSR ( Epic): 802a4058--> lost iPTL
SEAR ( Sysaddr): 00285800
PEAR ( Pciaddr): 00000280
**** OpenVMS (TM) Alpha Operating System N6.2-1H3 -BUGCHECK ****
* Code=00000215: MACHINECHECK, Machine check while in kernel mode
* Crash CPU: 00 Primary CPU: 00 Active CPUs: 00000001
* Current Process = STACONFIG
* Image Name = STACONFIG.EXE
**** Starting Memory Dump...
*******************************************************
Below is the firmware information:
SRM Console: V4.7-166
ARC Console: 4.49
PALcode: VMS PALcode V5.56-6, OSF PALcode X1.45-12
Serial Rom: V2.8
Processor
DECchip (tm) 21064A-2 266MHz
Memory
384 Meg Of System Memory
Regards,
Philippe RIVARD.
|
|
There's another similar discussion of a machine check in 475.*.
You might want to take a look through the following notes from
the Mikasa notes conference, as these cover the same machine
check code.
   83  BACHUS::VANHAVERE   18-JAN-1995   10  Machinecheck 0x208 with EISA mo
  140  CHOOKE::KLUMPES     15-MAR-1995    6  Machine Check 208
  372  CSC32::HUTMACHER    29-SEP-1995    5  208 machine checks as1000 V6.1-
  699  CSC32::BRISSETTE    15-AUG-1996    1  pal error type code 208 , repro
You will want to submit the CLUE output from the crashdump to the
CANASTA e-mail server -- see note 233.* for details on how to do
this -- as well. CANASTA can provide clues as to the cause of the
crash, if it has been seen before...
And, if the above notes and the CANASTA server do not find a pointer
to a solution to the problem, start an IPMT.
|