
Conference vaxaxp::vmsnotes

Title:VAX and Alpha VMS
Notice:This is a new VMSnotes, please read note 2.1
Moderator:VAXAXP::BERNARDO
Created:Wed Jan 22 1997
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:703
Total number of notes:3722

588.0. "Quorum disk problem" by NNTPD::"[email protected]" (Alain Dunski) Wed May 14 1997 05:43

My customer has an Alpha SCSI cluster composed of 2 Alphas and 1 quorum disk,
running VMS 7.1.
He uses port allocation classes.
The system disk is shadowed on $1$dka100 --> dsa0.
The quorum disk is $1$dka400 on both systems.
Each system and the quorum disk have 1 vote.
Expected votes = 3
If he tries to boot 1 node with this configuration, startup hangs waiting to
join the cluster.
If he sets expected votes to 1, startup runs OK.
The file quorum.dat exists on the quorum disk.
Can somebody help me?

Many thanks
Alain Dunski
[Posted by WWW Notes gateway]
T.R  Title  User  Personal Name  Date  Lines
588.1  UTRTSC::utojvdbu1.uto.dec.com::JurVanDerBurg  Change mode to Panic!  Wed May 14 1997 07:11  7
How does the hardware config look? What kind of Alphas? What kind of controllers?
Cluster parameters? Contents of SYS$SYSTEM:SYS$DEVICES.DAT?

We are currently looking at a similar problem when using port allocation class.

Jur.

588.2  EVMS::KUEHNEL  Andy Kühnel  Wed May 14 1997 10:07  12
    This sounds like a known problem, but we don't have a good solution at
    this point.
    
    The SCSI-naming code uses locks to ensure that a given port allocation
    class is not assigned to more than 1 bus.
    
    	- You need quorum to use locks.
    
    	- You need the bus to get to the quorum disk.
    
    This is a classic catch-22, also known as "deadly embrace".  I'll ask
    the resident I/O gurus to reply here for better advice.
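
The circular dependency described above can be written down as a small graph and checked for a cycle. This is only an illustrative sketch: the resource names are informal labels taken from this note, not OpenVMS identifiers, and the model assumes a single booting node that needs the quorum disk's vote to reach quorum.

```python
# Hypothetical dependency model of the boot-time "deadly embrace".
# Each key needs every item in its list before it can be obtained.
# The labels are illustrative only, not OpenVMS data structures.
DEPENDS_ON = {
    "quorum": ["quorum disk"],                     # the disk's vote is needed
    "quorum disk": ["SCSI bus"],                   # disk is reached over the bus
    "SCSI bus": ["port allocation class lock"],    # naming code takes a lock
    "port allocation class lock": ["quorum"],      # locks need quorum
}


def find_cycle(graph, start):
    """Return the first dependency cycle reachable from start, or None."""
    path = []

    def visit(node):
        if node in path:
            # Found a node already on the current path: report the loop.
            return path[path.index(node):] + [node]
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        return None

    return visit(start)


cycle = find_cycle(DEPENDS_ON, "quorum")
print(" -> ".join(cycle))
```

Removing any one edge (for example, by lowering expected votes so that the disk's vote is not needed) breaks the loop, which is consistent with the workaround reported in .0.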
588.3  "configuration"  NNTPD::"[email protected]"  Alain  Wed May 14 1997 10:55  134
Hi Jur,

Here is the configuration received from the customer:

DSA1:                   Mounted              0  GTXAPL1        6979725     6   1
DSA2:                   Mounted              0  GTXAPL2        8349669     1   1
$1$DKA100:       (GT0)  ShadowSetMember      0  (member of DSA0:)
$1$DKA200:       (GT0)  ShadowSetMember      0  (member of DSA1:)
$1$DKA300:       (GT0)  ShadowSetMember      0  (member of DSA2:)
$1$DKA400:       (GT0)  Mounted              0  GTXQUORUM      8379639     1   1
$1$DKA500:       (GT0)  ShadowSetMember      0  (member of DSA0:)
$10$DVA0:        (GT0)  Online               0
DAD0:            (GT0)  Online               0
$11$DKA400:      (GT0)  Online wrtlck        0

gt0>ty sys$system:sys$devices.dat
;$$ SYS$DEVICES.DAT created by SYSINIT on  7-May-1997 11:07

[Port GT0$PKA]
Allocation Class = 3
 Buffer: REQUEST DESCRIPTION               | Read-only | Unmodifiable |
Forward


$ MC SYSGEN SHO/CLUSTER
  Parameters in use: Active
Parameter Name           Current    Default     Min.      Max.     Unit  Dynamic
--------------           -------    -------    -------   -------   ----  -------
VAXCLUSTER                      2          1         0          2 Coded-valu
EXPECTED_VOTES                  1          1         1        127 Votes
VOTES                           1          1         0        127 Votes
RECNXINTERVAL                  20         20         1      32767 Seconds    D
DISK_QUORUM     "$1$DKA400       "    "    "    "    "     "ZZZZ" Ascii
QDSKVOTES                       1          1         0        127 Votes
QDSKINTERVAL                   10         10         1      32767 Seconds
ALLOCLASS                      10          0         0        255 Pure-numbe
LOCKDIRWT                       0          0         0        255 Pure-numbe
CLUSTER_CREDITS                10         10        10        128 Credits
NISCS_CONV_BOOT                 0          0         0          1 Boolean
NISCS_LOAD_PEA0                 1          0         0          1 Boolean
NISCS_PORT_SERV                 0          0         0          3 Bitmask
MSCP_LOAD                       0          0         0      16384 Coded-valu
TMSCP_LOAD                      0          0         0          3 Coded-valu
MSCP_SERVE_ALL                  0          0         0          2 Coded-valu
TMSCP_SERVE_ALL                 0          0         0          3 Coded-valu
MSCP_BUFFER                   128        128        16         -1 Coded-valu
MSCP_CREDITS                    8          8         2        128 Coded-valu
MSCP_CMD_TMO                  600        600         0 2147483647 CNTLRTMOs  D
TAPE_ALLOCLASS                  0          0         0        255 Pure-numbe
NISCS_MAX_PKTSZ              1498       1498      1080       8192 Bytes
NISCS_LAN_OVRHD                18         18         0        256 Bytes
CWCREPRC_ENABLE                 1          1         0          1 Bitmask    D

gt0>mc sysgen sho device
Parameter Name           Current    Default     Min.      Max.     Unit  Dynamic
--------------           -------    -------    -------   -------   ----  -------
DEVICE_NAMING                   1          0         0         -1 Bitmask


SDA> CLUE CONFIG
OpenVMS (TM) Alpha Operating System, Version V7.1     -- System Dump Analysis
14-MAY-1997 13:54:38.60                   Page     2
System Configuration:



System Information:
System Type    AlphaServer 1000A 5/400                Primary CPU ID 00
Cycle Time     2.5 nsec (400 MHz)                     Pagesize       8192 Byte

Memory Configuration:
Cluster    PFN Start    PFN Count         Range (MByte)        Usage
 #03             0          230         0.0 MB -     1.7 MB    Console
 #04           230        16145         1.7 MB -   127.9 MB    System
 #05         16375            9       127.9 MB -   128.0 MB    Console

Per-CPU Slot Processor Information:
CPU ID         00                        CPU State    rc,pa,pp,cv,pv,pmv,pl
CPU Type       EV56  Pass 2 (21164A)     Halt PC      00000000.20000000
PAL Code       1.19-4                    Halt PS      00000000.00001F00
CPU Revision   ....                      Halt Code    00000000.00000000
Serial Number  ..........                "Bootstrap or Powerfail"
Console Vers   V4.8-74


OpenVMS (TM) Alpha Operating System, Version V7.1     -- System Dump Analysis
14-MAY-1997 13:54:38.60                   Page     3
Adapter Configuration:



TR Adapter     ADP      Hose Bus   BusArrayEntry  Node Device Name / HW-Id
-- ----------- -------- ---- -------------------- ---- -------------------------
 1 KA1B05      80D4EC00    0 BUSLESS_SYSTEM
 2 PCI         80D4EE00    0 PCI
                                   80D4F1A0          7 MERCURY
                                   80D4F1D8          8 PBB
                                   80D4F280  EWA:   11 DC21140 - 100 mbit NI (Tulip)
                                   80D4F2B8  EWB:   12 DC21140 - 100 mbit NI (Tulip)
                                   80D4F2F0  PKA:   13 NCR 53C810 SCSI
 3 EISA        80D4F5C0    0 EISA
                                   80D4F798          0 System Board
 4 XBUS        80D4FB40    0 XBUS
                                   80D4FD18          0 EISA_SYSTEM_BOARD
                                   80D4FD50  DVA:    1 Floppy
                                   80D4FD88  LRA:    2 Line Printer (parallel port)
                                   80D4FDC0  TTA:    3 NS16450 Serial Port
 5 PCI         80D50000    0 PCI
                                   80D50218  PKB:    0 Qlogic ISP1020 SCSI-2
                                   80D50250  PKC:    1 FWD SCSI (KZPSA)





Thanks for your help

Alain
[Posted by WWW Notes gateway]
588.4  "??"  EVMS::KUEHNEL  Andy Kühnel  Wed May 14 1997 15:27  12
    I am thoroughly confused:
    
    This system is using
    
    	10 as the node's disk allocation class
    
         3 as the port allocation class for the bus on PKA
    
    However, the SHOW DEV output shows devices  $1$DKA500  and  $11$DKA400  
    but no $3$DKAxxx.
    
    Can you dump the DDBs for all PKx devices on GT0?
588.5  EEMELI::MOSER  Orienteers do it in the bush...  Thu May 15 1997 02:00  18
    I think part of SYS$DEVICES.DAT got lost in the cut-and-paste.
    I can't see any entries for PKB and PKC.
    
    The floppy has allocation class 10, which comes from the host
    allocation class (i.e. SYSGEN's ALLOCLASS).
    
    The local CD-ROM on node GT0 has 11, which means there must be
    an entry in SYS$DEVICES.DAT for GT0$PKA to be 11, but we saw a
    3, and we didn't see a 1 for all the $1$DKAxxx.
    
    So my guess is, that there are multiple entries for the same
    port in SYS$DEVICES.DAT, but I was expecting that the system
    would use the first one found...
    
    Anyway, please type the full SYS$DEVICES.DAT and a SHOW DEV D
    from both nodes...
    
    /cmos
588.6  "Missing characteristics"  NNTPD::"[email protected]"  Alain  Thu May 15 1997 04:18  149
Hi,

Sorry, I botched my cut and paste. Here it is again, only for node GT0; node GT1
has never been booted yet.

 CONSOLE (>>>) :

PKA0  with DKA400                                   -> under VMS GT0$PKB
PKB0  with DKB100  DKB200  DKB300  DKB400  DKB500   -> under VMS GT0$PKC
PKC0  with MKC200  MKC600                           -> under VMS GT0$PKA

gt0>sho dev d

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA0:                   Mounted              0  GTXSYS         1630344   572   1
DSA1:                   Mounted              0  GTXAPL1        6979725     6   1
DSA2:                   Mounted              0  GTXAPL2        8349669     1   1
$1$DKA100:       (GT0)  ShadowSetMember      0  (member of DSA0:)
$1$DKA200:       (GT0)  ShadowSetMember      0  (member of DSA1:)
$1$DKA300:       (GT0)  ShadowSetMember      0  (member of DSA2:)
$1$DKA400:       (GT0)  Mounted              0  GTXQUORUM      8379639     1   1
$1$DKA500:       (GT0)  ShadowSetMember      0  (member of DSA0:)
$10$DVA0:        (GT0)  Online               0
DAD0:            (GT0)  Online               0
$11$DKA400:      (GT0)  Online wrtlck        0

gt0>ty sys$system:sys$devices.dat
;$$ SYS$DEVICES.DAT created by SYSINIT on  7-May-1997 11:07

[Port GT0$PKA]
Allocation Class = 3

[Port GT0$PKB]
Allocation Class = 11

[Port GT0$PKC]
Allocation Class = 1


$ MC SYSGEN SHO/CLUSTER
  Parameters in use: Active
Parameter Name           Current    Default     Min.      Max.     Unit  Dynamic
--------------           -------    -------    -------   -------   ----  -------
VAXCLUSTER                      2          1         0          2 Coded-valu
EXPECTED_VOTES                  1          1         1        127 Votes
VOTES                           1          1         0        127 Votes
RECNXINTERVAL                  20         20         1      32767 Seconds    D
DISK_QUORUM     "$1$DKA400       "    "    "    "    "     "ZZZZ" Ascii
QDSKVOTES                       1          1         0        127 Votes
QDSKINTERVAL                   10         10         1      32767 Seconds
ALLOCLASS                      10          0         0        255 Pure-numbe
LOCKDIRWT                       0          0         0        255 Pure-numbe
CLUSTER_CREDITS                10         10        10        128 Credits
NISCS_CONV_BOOT                 0          0         0          1 Boolean
NISCS_LOAD_PEA0                 1          0         0          1 Boolean
NISCS_PORT_SERV                 0          0         0          3 Bitmask
MSCP_LOAD                       0          0         0      16384 Coded-valu
TMSCP_LOAD                      0          0         0          3 Coded-valu
MSCP_SERVE_ALL                  0          0         0          2 Coded-valu
TMSCP_SERVE_ALL                 0          0         0          3 Coded-valu
MSCP_BUFFER                   128        128        16         -1 Coded-valu
MSCP_CREDITS                    8          8         2        128 Coded-valu
MSCP_CMD_TMO                  600        600         0 2147483647 CNTLRTMOs  D
TAPE_ALLOCLASS                  0          0         0        255 Pure-numbe
NISCS_MAX_PKTSZ              1498       1498      1080       8192 Bytes
NISCS_LAN_OVRHD                18         18         0        256 Bytes
CWCREPRC_ENABLE                 1          1         0          1 Bitmask    D

gt0>mc sysgen sho device
Parameter Name           Current    Default     Min.      Max.     Unit  Dynamic
--------------           -------    -------    -------   -------   ----  -------
DEVICE_NAMING                   1          0         0         -1 Bitmask


SDA> CLUE CONFIG
OpenVMS (TM) Alpha Operating System, Version V7.1     -- System Dump Analysis
14-MAY-1997 13:54:38.60                   Page     2
System Configuration:



System Information:
System Type    AlphaServer 1000A 5/400                Primary CPU ID 00
Cycle Time     2.5 nsec (400 MHz)                     Pagesize       8192 Byte

Memory Configuration:
Cluster    PFN Start    PFN Count         Range (MByte)        Usage
 #03             0          230         0.0 MB -     1.7 MB    Console
 #04           230        16145         1.7 MB -   127.9 MB    System
 #05         16375            9       127.9 MB -   128.0 MB    Console

Per-CPU Slot Processor Information:
CPU ID         00                        CPU State    rc,pa,pp,cv,pv,pmv,pl
CPU Type       EV56  Pass 2 (21164A)     Halt PC      00000000.20000000
PAL Code       1.19-4                    Halt PS      00000000.00001F00
CPU Revision   ....                      Halt Code    00000000.00000000
Serial Number  ..........                "Bootstrap or Powerfail"
Console Vers   V4.8-74


OpenVMS (TM) Alpha Operating System, Version V7.1     -- System Dump Analysis
14-MAY-1997 13:54:38.60                   Page     3
Adapter Configuration:


TR Adapter     ADP      Hose Bus   BusArrayEntry  Node Device Name / HW-Id
-- ----------- -------- ---- -------------------- ---- -------------------------
 1 KA1B05      80D4EC00    0 BUSLESS_SYSTEM
 2 PCI         80D4EE00    0 PCI
                                   80D4F1A0          7 MERCURY
                                   80D4F1D8          8 PBB
                                   80D4F280  EWA:   11 DC21140 - 100 mbit NI (Tulip)
                                   80D4F2B8  EWB:   12 DC21140 - 100 mbit NI (Tulip)
                                   80D4F2F0  PKA:   13 NCR 53C810 SCSI
 3 EISA        80D4F5C0    0 EISA
                                   80D4F798          0 System Board
 4 XBUS        80D4FB40    0 XBUS
                                   80D4FD18          0 EISA_SYSTEM_BOARD
                                   80D4FD50  DVA:    1 Floppy
                                   80D4FD88  LRA:    2 Line Printer (parallel port)
                                   80D4FDC0  TTA:    3 NS16450 Serial Port
 5 PCI         80D50000    0 PCI
                                   80D50218  PKB:    0 Qlogic ISP1020 SCSI-2
                                   80D50250  PKC:    1 FWD SCSI (KZPSA)





[Posted by WWW Notes gateway]
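
For readers following along, here is a rough sketch of how the device names in the SHOW DEV listing above appear to follow from SYS$DEVICES.DAT. The naming rule it encodes (a port allocation class replaces the node ALLOCLASS and forces controller letter A) is inferred from the listings in this note, and the function names are hypothetical helpers, not part of any OpenVMS tool.

```python
import re

# File contents as posted above for node GT0.
SYS_DEVICES_DAT = """\
;$$ SYS$DEVICES.DAT created by SYSINIT on  7-May-1997 11:07

[Port GT0$PKA]
Allocation Class = 3

[Port GT0$PKB]
Allocation Class = 11

[Port GT0$PKC]
Allocation Class = 1
"""

NODE_ALLOCLASS = 10  # SYSGEN ALLOCLASS from the listing above


def parse_port_classes(text):
    """Map a port name such as 'PKC' to its port allocation class."""
    classes = {}
    port = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        m = re.match(r"\[Port\s+\w+\$(\w+)\]", line)
        if m:
            port = m.group(1)
            continue
        m = re.match(r"Allocation Class\s*=\s*(\d+)", line, re.IGNORECASE)
        if m and port:
            classes[port] = int(m.group(1))
    return classes


def device_name(dev_type, unit, classes, port=None, controller="A"):
    """Build a device name under the rule assumed in the lead-in."""
    if port in classes:
        # Port allocation class applies: controller letter is always A.
        return f"${classes[port]}${dev_type}A{unit}"
    # No port allocation class: fall back to the node allocation class.
    return f"${NODE_ALLOCLASS}${dev_type}{controller}{unit}"


classes = parse_port_classes(SYS_DEVICES_DAT)
print(device_name("DK", 400, classes, port="PKC"))   # quorum disk (console DKB400)
print(device_name("DK", 400, classes, port="PKB"))   # CD-ROM (console DKA400)
print(device_name("DV", 0, classes, controller="A")) # floppy, no port class
```

Under these assumptions the three names come out as $1$DKA400, $11$DKA400 and $10$DVA0, matching the SHOW DEV output, which is why two different devices end up with the unit name DKA400.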
588.7  UTRTSC::utojvdbu1.uto.dec.com::JurVanDerBurg  Change mode to Panic!  Thu May 15 1997 08:22  15
Re .-1

That looks ok to me.

Re .2

>    The SCSI-naming code uses locks to ensure that a given port allocation
>    class is not assigned to more than 1 bus.

Are you sure about that? I could not find this anywhere in the code. Sure,
STACONFIG creates a lock before we're in a cluster, but that is a sublock
of the sysid lock, and it is allowed to create a sublock at that time.

Jur (waiting for hardware to reproduce the problem for further troubleshooting).

588.8  EEMELI::MOSER  Orienteers do it in the bush...  Thu May 15 1997 08:55  10
    your parameters look okay, so assuming there is indeed a valid quorum
    file on $1$DKA400, then the system should boot up with expected votes
    of 3.
    
    If it doesn't, then it is most likely looking for the quorum file
    on your CD-ROM drive, $11$DKA400. I don't think it'll help, but what
    happens if you change the SCSI ID of your CD-ROM from 4 to something
    else, like 3 or 5?
    
    /cmos
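
As a worked check of the vote arithmetic behind this, the sketch below uses the standard OpenVMS quorum formula, quorum = (EXPECTED_VOTES + 2) / 2 with integer division. The function names and the `qdisk_usable` flag are illustrative assumptions; the flag simply stands for "the node found and validated the quorum file on the quorum disk".

```python
def quorum(expected_votes):
    """Standard OpenVMS quorum formula, truncated to an integer."""
    return (expected_votes + 2) // 2


def has_quorum(node_votes, qdsk_votes, expected_votes, qdisk_usable):
    """True if the booted nodes (plus the quorum disk, if usable) reach quorum."""
    votes = sum(node_votes) + (qdsk_votes if qdisk_usable else 0)
    return votes >= quorum(expected_votes)


# One node booted, VOTES=1, QDSKVOTES=1, values from the SYSGEN listing.
print(quorum(3))                        # quorum needed with EXPECTED_VOTES=3
print(has_quorum([1], 1, 3, True))      # quorum disk reachable
print(has_quorum([1], 1, 3, False))     # quorum disk not reachable
print(has_quorum([1], 1, 1, False))     # workaround: EXPECTED_VOTES=1
```

With EXPECTED_VOTES=3 the single node has quorum only if the disk's vote counts, so a hang at "waiting to join cluster" points at the quorum disk not being usable during boot rather than at the vote settings themselves.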
588.9  EEMELI::MOSER  Orienteers do it in the bush...  Thu May 15 1997 09:06  9
    re: .7
    
    yup that's correct, a so-called 'port class lock' is taken out from
    IOGEN to ensure that only controllers on the same bus have the same
    PAC. In fact there is a known problem with this, which accounts for
    the 72 second boot delay if you have PAC enabled and are booting from
    a device on a PAC'ed bus. A fix is in the works.
    
    /cmos
588.10  "no need to wait for hardware :-("  EVMS::KUEHNEL  Andy Kühnel  Thu May 15 1997 10:20  4
    re .7
    
    Check out the use of $getlki in check_scsi_cpu() in 
    	[cluster.lis]sccpuver.lis