T.R | Title | User | Personal Name | Date | Lines |
---|
588.1 | | UTRTSC::utojvdbu1.uto.dec.com::JurVanDerBurg | Change mode to Panic! | Wed May 14 1997 07:11 | 7 |
| How does the hardware config look? What kind of Alphas? What kind of controllers?
Cluster parameters? Contents of SYS$SYSTEM:SYS$DEVICES.DAT?
We are currently looking at a similar problem when using port allocation class.
Jur.
|
588.2 | | EVMS::KUEHNEL | Andy Kühnel | Wed May 14 1997 10:07 | 12 |
| This sounds like a known problem, but we don't have a good solution at
this point.
The SCSI-naming code uses locks to ensure that a given port allocation
class is not assigned to more than 1 bus.
- You need quorum to use locks.
- You need the bus to get to the quorum disk.
This is a classic catch-22, also known as "deadly embrace". I'll ask
the resident I/O gurus to reply here for better advice.
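
The circular dependency described above can be sketched as a small dependency graph. This is only an illustration of the reasoning in this note, not of the actual boot code: the node names in `BOOT_DEPS` are made up for the example, and the edges simply restate "naming needs a lock, locks need quorum, quorum needs the quorum disk, the quorum disk needs the bus".

```python
def find_cycle(deps):
    """Return a list of nodes forming a dependency cycle, or None if there is none."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # We came back to a node that is still being resolved: a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for needed in deps.get(node, ()):
            cycle = visit(needed)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in deps:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical names; each entry reads "key cannot complete until values have".
BOOT_DEPS = {
    "scsi_bus_named": ["port_class_lock"],   # naming code takes a lock
    "port_class_lock": ["quorum"],           # locks need quorum
    "quorum": ["quorum_disk"],               # quorum needs the quorum disk's vote
    "quorum_disk": ["scsi_bus_named"],       # the disk sits on the bus being named
}

print(find_cycle(BOOT_DEPS))
```

Removing any one edge (for example, a node that has quorum without the quorum disk) makes `find_cycle` return `None`, which is why the hang would only show up in configurations that depend on the quorum disk to reach quorum.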
|
588.3 | configuration | NNTPD::"[email protected]" | Alain | Wed May 14 1997 10:55 | 134 |
| Hi Jur,
Here is the configuration received from the customer:
DSA1:                   Mounted              0  GTXAPL1        6979725     6   1
DSA2:                   Mounted              0  GTXAPL2        8349669     1   1
$1$DKA100:    (GT0)     ShadowSetMember      0  (member of DSA0:)
$1$DKA200:    (GT0)     ShadowSetMember      0  (member of DSA1:)
$1$DKA300:    (GT0)     ShadowSetMember      0  (member of DSA2:)
$1$DKA400:    (GT0)     Mounted              0  GTXQUORUM      8379639     1   1
$1$DKA500:    (GT0)     ShadowSetMember      0  (member of DSA0:)
$10$DVA0:     (GT0)     Online               0
DAD0:         (GT0)     Online               0
$11$DKA400:   (GT0)     Online wrtlck        0
gt0>ty sys$system:sys$devices.dat
;$$ SYS$DEVICES.DAT created by SYSINIT on 7-May-1997 11:07
[Port GT0$PKA]
Allocation Class = 3
$ MC SYSGEN SHO/CLUSTER
Parameters in use: Active
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
VAXCLUSTER 2 1 0 2 Coded-valu
EXPECTED_VOTES 1 1 1 127 Votes
VOTES 1 1 0 127 Votes
RECNXINTERVAL 20 20 1 32767 Seconds D
DISK_QUORUM "$1$DKA400 " " " " " "ZZZZ" Ascii
QDSKVOTES 1 1 0 127 Votes
QDSKINTERVAL 10 10 1 32767 Seconds
ALLOCLASS 10 0 0 255 Pure-numbe
LOCKDIRWT 0 0 0 255 Pure-numbe
CLUSTER_CREDITS 10 10 10 128 Credits
NISCS_CONV_BOOT 0 0 0 1 Boolean
NISCS_LOAD_PEA0 1 0 0 1 Boolean
NISCS_PORT_SERV 0 0 0 3 Bitmask
MSCP_LOAD 0 0 0 16384 Coded-valu
TMSCP_LOAD 0 0 0 3 Coded-valu
MSCP_SERVE_ALL 0 0 0 2 Coded-valu
TMSCP_SERVE_ALL 0 0 0 3 Coded-valu
MSCP_BUFFER 128 128 16 -1 Coded-valu
MSCP_CREDITS 8 8 2 128 Coded-valu
MSCP_CMD_TMO 600 600 0 2147483647 CNTLRTMOs D
TAPE_ALLOCLASS 0 0 0 255 Pure-numbe
NISCS_MAX_PKTSZ 1498 1498 1080 8192 Bytes
NISCS_LAN_OVRHD 18 18 0 256 Bytes
CWCREPRC_ENABLE 1 1 0 1 Bitmask D
gt0>mc sysgen sho device
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
DEVICE_NAMING 1 0 0 -1 Bitmask
SDA> CLUE CONFIG
OpenVMS (TM) Alpha Operating System, Version V7.1 -- System Dump Analysis
14-MAY-1997 13:54:38.60 Page 2
System Configuration:
System Information:
System Type AlphaServer 1000A 5/400 Primary CPU ID 00
Cycle Time 2.5 nsec (400 MHz) Pagesize 8192 Byte
Memory Configuration:
Cluster PFN Start PFN Count Range (MByte) Usage
#03 0 230 0.0 MB - 1.7 MB Console
#04 230 16145 1.7 MB - 127.9 MB System
#05 16375 9 127.9 MB - 128.0 MB Console
Per-CPU Slot Processor Information:
CPU ID 00 CPU State rc,pa,pp,cv,pv,pmv,pl
CPU Type EV56 Pass 2 (21164A) Halt PC 00000000.20000000
PAL Code 1.19-4 Halt PS 00000000.00001F00
CPU Revision .... Halt Code 00000000.00000000
Serial Number .......... "Bootstrap or Powerfail"
Console Vers V4.8-74
Adapter Configuration:
TR Adapter ADP Hose Bus BusArrayEntry Node Device Name / HW-Id
-- ----------- -------- ---- -------------------- ---- -------------------------
1 KA1B05 80D4EC00 0 BUSLESS_SYSTEM
2 PCI 80D4EE00 0 PCI
80D4F1A0 7 MERCURY
80D4F1D8 8 PBB
80D4F280 EWA: 11 DC21140 - 100 mbit NI (Tulip)
80D4F2B8 EWB: 12 DC21140 - 100 mbit NI (Tulip)
80D4F2F0 PKA: 13 NCR 53C810 SCSI
3 EISA 80D4F5C0 0 EISA
80D4F798 0 System Board
4 XBUS 80D4FB40 0 XBUS
80D4FD18 0 EISA_SYSTEM_BOARD
80D4FD50 DVA: 1 Floppy
80D4FD88 LRA: 2 Line Printer (parallel port)
80D4FDC0 TTA: 3 NS16450 Serial Port
5 PCI 80D50000 0 PCI
80D50218 PKB: 0 Qlogic ISP1020 SCSI-2
80D50250 PKC: 1 FWD SCSI (KZPSA)
Thanks for your help
Alain
[Posted by WWW Notes gateway]
|
588.4 | ?? | EVMS::KUEHNEL | Andy Kühnel | Wed May 14 1997 15:27 | 12 |
| I am thoroughly confused:
This system is using
10 as the node's disk allocation class
3 as the port allocation class for the bus on PKA
However, the SHOW DEV output shows devices $1$DKA500 and $11$DKA400
but no $3$DKAxxx.
Can you dump the DDBs for all PKx devices on GT0?
|
588.5 | | EEMELI::MOSER | Orienteers do it in the bush... | Thu May 15 1997 02:00 | 18 |
| I think part of SYS$DEVICES.DAT got lost in the cut-and-paste; I can't
see any entries for PKB and PKC.
The floppy has allocation class 10, which comes from the host
allocation class (i.e. SYSGEN's ALLOCLASS).
The local CD-ROM on node GT0 has 11, which means there must be
an entry in SYS$DEVICES.DAT for GT0$PKA to be 11, but we saw a
3, and we didn't see a 1 for all the $1$DKAxxx.
So my guess is that there are multiple entries for the same
port in SYS$DEVICES.DAT, but I was expecting that the system
would use the first one found...
Anyway, please type the full SYS$DEVICES.DAT and a SHOW DEV D
from both nodes...
/cmos
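
The reasoning above (port entry in SYS$DEVICES.DAT, then allocation class, then device name) can be sketched in a few lines. This is a rough illustration only: it uses the three entries that appear in full in .6, parses them with Python's `configparser` as if the file were an ordinary INI file, and assumes that a disk on a bus with a port allocation class is named `$<class>$DKA<id*100+lun>`, with the controller letter always A. The helper names `port_classes` and `disk_name` are made up for the example.

```python
import configparser

# File contents as posted for node GT0 in .6.
SYS_DEVICES_DAT = """\
;$$ SYS$DEVICES.DAT created by SYSINIT on 7-May-1997 11:07
[Port GT0$PKA]
Allocation Class = 3
[Port GT0$PKB]
Allocation Class = 11
[Port GT0$PKC]
Allocation Class = 1
"""


def port_classes(text):
    """Map port names such as 'GT0$PKB' to their port allocation class."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    classes = {}
    for section in parser.sections():
        kind, _, port = section.partition(" ")
        if kind.lower() == "port":
            classes[port] = parser.getint(section, "Allocation Class")
    return classes


def disk_name(port, scsi_id, classes, lun=0):
    """Assumed naming rule: $<PAC>$DKA<unit>, controller letter forced to A."""
    return f"${classes[port]}$DKA{scsi_id * 100 + lun}"


classes = port_classes(SYS_DEVICES_DAT)
print(disk_name("GT0$PKB", 4, classes))  # CD-ROM at SCSI ID 4
print(disk_name("GT0$PKC", 4, classes))  # quorum disk at SCSI ID 4
```

Under these assumptions the CD-ROM and the quorum disk differ only in the allocation class prefix, which is why a missing or wrong port entry is worth ruling out first.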
|
588.6 | Missing characteristics | NNTPD::"[email protected]" | Alain | Thu May 15 1997 04:18 | 149 |
| Hi,
Sorry, I botched my cut-and-paste. Here it is again, for node GT0 only; node GT1
has never been booted yet.
CONSOLE (>>>) :
PKA0 with DKA400 -> under VMS GT0$PKB
PKB0 with DKB100 DKB200 DKB300 DKB400 DKB500 -> under VMS GT0$PKC
PKC0 with MKC200 MKC600 -> under VMS GT0$PKA
gt0>sho dev d
Device                  Device           Error  Volume            Free Trans Mnt
 Name                   Status           Count  Label           Blocks Count Cnt
DSA0:                   Mounted              0  GTXSYS         1630344   572   1
DSA1:                   Mounted              0  GTXAPL1        6979725     6   1
DSA2:                   Mounted              0  GTXAPL2        8349669     1   1
$1$DKA100:    (GT0)     ShadowSetMember      0  (member of DSA0:)
$1$DKA200:    (GT0)     ShadowSetMember      0  (member of DSA1:)
$1$DKA300:    (GT0)     ShadowSetMember      0  (member of DSA2:)
$1$DKA400:    (GT0)     Mounted              0  GTXQUORUM      8379639     1   1
$1$DKA500:    (GT0)     ShadowSetMember      0  (member of DSA0:)
$10$DVA0:     (GT0)     Online               0
DAD0:         (GT0)     Online               0
$11$DKA400:   (GT0)     Online wrtlck        0
gt0>ty sys$system:sys$devices.dat
;$$ SYS$DEVICES.DAT created by SYSINIT on 7-May-1997 11:07
[Port GT0$PKA]
Allocation Class = 3
[Port GT0$PKB]
Allocation Class = 11
[Port GT0$PKC]
Allocation Class = 1
$ MC SYSGEN SHO/CLUSTER
Parameters in use: Active
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
VAXCLUSTER 2 1 0 2 Coded-valu
EXPECTED_VOTES 1 1 1 127 Votes
VOTES 1 1 0 127 Votes
RECNXINTERVAL 20 20 1 32767 Seconds D
DISK_QUORUM "$1$DKA400 " " " " " "ZZZZ" Ascii
QDSKVOTES 1 1 0 127 Votes
QDSKINTERVAL 10 10 1 32767 Seconds
ALLOCLASS 10 0 0 255 Pure-numbe
LOCKDIRWT 0 0 0 255 Pure-numbe
CLUSTER_CREDITS 10 10 10 128 Credits
NISCS_CONV_BOOT 0 0 0 1 Boolean
NISCS_LOAD_PEA0 1 0 0 1 Boolean
NISCS_PORT_SERV 0 0 0 3 Bitmask
MSCP_LOAD 0 0 0 16384 Coded-valu
TMSCP_LOAD 0 0 0 3 Coded-valu
MSCP_SERVE_ALL 0 0 0 2 Coded-valu
TMSCP_SERVE_ALL 0 0 0 3 Coded-valu
MSCP_BUFFER 128 128 16 -1 Coded-valu
MSCP_CREDITS 8 8 2 128 Coded-valu
MSCP_CMD_TMO 600 600 0 2147483647 CNTLRTMOs D
TAPE_ALLOCLASS 0 0 0 255 Pure-numbe
NISCS_MAX_PKTSZ 1498 1498 1080 8192 Bytes
NISCS_LAN_OVRHD 18 18 0 256 Bytes
CWCREPRC_ENABLE 1 1 0 1 Bitmask D
gt0>mc sysgen sho device
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
DEVICE_NAMING 1 0 0 -1 Bitmask
SDA> CLUE CONFIG
OpenVMS (TM) Alpha Operating System, Version V7.1 -- System Dump Analysis
14-MAY-1997 13:54:38.60 Page 2
System Configuration:
System Information:
System Type AlphaServer 1000A 5/400 Primary CPU ID 00
Cycle Time 2.5 nsec (400 MHz) Pagesize 8192 Byte
Memory Configuration:
Cluster PFN Start PFN Count Range (MByte) Usage
#03 0 230 0.0 MB - 1.7 MB Console
#04 230 16145 1.7 MB - 127.9 MB System
#05 16375 9 127.9 MB - 128.0 MB Console
Per-CPU Slot Processor Information:
CPU ID 00 CPU State rc,pa,pp,cv,pv,pmv,pl
CPU Type EV56 Pass 2 (21164A) Halt PC 00000000.20000000
PAL Code 1.19-4 Halt PS 00000000.00001F00
CPU Revision .... Halt Code 00000000.00000000
Serial Number .......... "Bootstrap or Powerfail"
Console Vers V4.8-74
Adapter Configuration:
TR Adapter ADP Hose Bus BusArrayEntry Node Device Name / HW-Id
-- ----------- -------- ---- -------------------- ---- -------------------------
1 KA1B05 80D4EC00 0 BUSLESS_SYSTEM
2 PCI 80D4EE00 0 PCI
80D4F1A0 7 MERCURY
80D4F1D8 8 PBB
80D4F280 EWA: 11 DC21140 - 100 mbit NI (Tulip)
80D4F2B8 EWB: 12 DC21140 - 100 mbit NI (Tulip)
80D4F2F0 PKA: 13 NCR 53C810 SCSI
3 EISA 80D4F5C0 0 EISA
80D4F798 0 System Board
4 XBUS 80D4FB40 0 XBUS
80D4FD18 0 EISA_SYSTEM_BOARD
80D4FD50 DVA: 1 Floppy
80D4FD88 LRA: 2 Line Printer (parallel port)
80D4FDC0 TTA: 3 NS16450 Serial Port
5 PCI 80D50000 0 PCI
80D50218 PKB: 0 Qlogic ISP1020 SCSI-2
80D50250 PKC: 1 FWD SCSI (KZPSA)
[Posted by WWW Notes gateway]
|
588.7 | | UTRTSC::utojvdbu1.uto.dec.com::JurVanDerBurg | Change mode to Panic! | Thu May 15 1997 08:22 | 15 |
| Re .-1
That looks ok to me.
Re .2
> The SCSI-naming code uses locks to ensure that a given port allocation
> class is not assigned to more than 1 bus.
Are you sure about that? I could not find this anywhere in the code. Sure,
STACONFIG creates a lock before we're in a cluster, but that is a sublock
of the sysid lock, and it is allowed to create a sublock at that time.
Jur (waiting for hardware to reproduce the problem for further troubleshooting).
|
588.8 | | EEMELI::MOSER | Orienteers do it in the bush... | Thu May 15 1997 08:55 | 10 |
| Your parameters look okay, so assuming there is indeed a valid quorum
file on $1$DKA400, then the system should boot up with expected votes
of 3.
If it doesn't, then it most likely tries to look for the quorum file
on your CD-ROM drive $11$DKA400. I don't think it'll help, but what
happens if you change the SCSI ID of your CD-ROM from 4 to something
else, like 3 or 5?
/cmos
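
For reference, the vote arithmetic behind this can be sketched as follows. It assumes the usual OpenVMS Cluster rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down, and takes the expected votes of 3 mentioned above together with VOTES = 1 and QDSKVOTES = 1 from the SYSGEN listing; the function names are made up for the example.

```python
def quorum(expected_votes):
    """Quorum under the assumed rule (EXPECTED_VOTES + 2) / 2, rounded down."""
    return (expected_votes + 2) // 2


def has_quorum(votes_present, expected_votes):
    """True if the votes currently contributed reach quorum."""
    return votes_present >= quorum(expected_votes)


NODE_VOTES = 1   # VOTES on GT0
QDSK_VOTES = 1   # QDSKVOTES for the quorum disk

# GT0 alone, with and without the quorum disk's vote, at expected votes of 3.
print(has_quorum(NODE_VOTES + QDSK_VOTES, expected_votes=3))
print(has_quorum(NODE_VOTES, expected_votes=3))
```

With expected votes of 3 the quorum is 2, so a single node only reaches quorum if it can also count the quorum disk. That is why looking for the quorum file on the wrong DKA400 would leave the node hanging.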
|
588.9 | | EEMELI::MOSER | Orienteers do it in the bush... | Thu May 15 1997 09:06 | 9 |
| re: .7
yup that's correct, a so-called 'port class lock' is taken out from
IOGEN to ensure that only controllers on the same bus have the same
PAC. In fact there is a known problem with this, which accounts for
the 72 second boot delay if you have PAC enabled and are booting from
a device on a PAC'ed bus. A fix is in the works.
/cmos
|
588.10 | no need to wait for hardware :-( | EVMS::KUEHNEL | Andy Kühnel | Thu May 15 1997 10:20 | 4 |
| re .7
Check out the use of $getlki in check_scsi_cpu() in
[cluster.lis]sccpuver.lis
|