
Conference spezko::cluster

Title:	+ OpenVMS Clusters - The best clusters in the world! +
Notice:	This conference is COMPANY CONFIDENTIAL. See #1.3
Moderator:	PROXY::MOORE

Created:	Fri Aug 26 1988
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5320
Total number of notes:	23384

5221.0. "VAXCL=0, still access served disks (non HSx,RF)" by TIMABS::FREPPEL (Mosquito ergo summm...) Fri Jan 31 1997 09:18

    Hi,

    how can a node with VAXCLUSTER=0 "see" and mount disks that are served by a
    node of a cluster on the same NI?

    The situation: 

    ..Lobe1......::.......Lobe2................::.........Location3..
                 ::                            ::
    {A,B,C}	 ::	{D,E,F}  G----dssi---* ::		H
     | | |	 ::	 | | |	 |	       ::		|
    ++-+-+-----+ ::    +-+-+-+---++ 	       ::      +--------+---+
    |GIGAswitch| ::    |GIGAswitch|	       ::      |Concentrator|
    +---+------+ ::    +---+------+ 	       ::      +--------+---+
    	|	 ::	   |		       ::		|
        |        ::        |                   ::               |
    <===+==================+====================================+===> FDDI
                 ::                            ::

    {A,B,C} means: A,B,C are connected to the same CI 
    
    A: OpenVMS V6.2  (AlphaServer 8400 Model 5/300)	VOTES=1
    B: OpenVMS V6.2-1H2   (AlphaServer 2100 4/233)	VOTES=1
    C: OpenVMS V6.1  (VAX 10000-630)			VOTES=1
    D: OpenVMS V6.2  (AlphaServer 8400 Model 5/300)	VOTES=1
    E: OpenVMS V6.1  (VAX 6000-430)			VOTES=1
    F: OpenVMS V6.1  (VAX 10000-630)			VOTES=1
    G: OpenVMS V6.2  (VAX 4000-106A)			VOTES=0
    H: OpenVMS V6.1  (MicroVAX 3100)			VOTES=1


    The steps:
    - the entire cluster (A,B,C,D,E,F,G,H) is taken down
    - Nodes A,B,C,D,E,F,H are booted, the cluster is formed, CL_EXP=7
    - on G: conversational boot, set VAXCLUSTER=0 (the system should no longer
            be part of the cluster)
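
    The vote arithmetic behind CL_EXP=7 can be sketched as follows. This is
    only an illustration: it assumes CL_EXP is the cluster's expected votes
    and uses the documented OpenVMS quorum rule (EXPECTED_VOTES + 2) / 2,
    truncated. The names VOTES, quorum and booted are made up for this
    example.

```python
# Votes per node, copied from the node list in the base note.
VOTES = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1, "F": 1, "G": 0, "H": 1}


def quorum(expected_votes: int) -> int:
    """Quorum per the documented rule: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


# Nodes booted into the cluster in the second step (G is left out).
booted = ["A", "B", "C", "D", "E", "F", "H"]
total_votes = sum(VOTES[node] for node in booted)

print("total votes:", total_votes)
print("quorum:", quorum(total_votes))
```

    Since G has VOTES=0, leaving it out of the cluster changes neither the
    total nor the quorum.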

    The result:	
    - "sh cluster" on G reveals that G sees A,B,C,D,E,F,H and G but all status 
      fields are clear.
    - "sh dev d" on G lists all disks served by the nodes A,B,C,D,E,F (online)
    - G is able to mount a shadow set consisting of served member disks
    - A,B,C,D,E,F put the shadow set in MountVerify issuing the messages:

        $2$DUA1271: (A, B) is an incorrect shadow set member volume
        $2$DUA1271: (RBADA2, RBIZ07) has been removed from shadow set.
        Mount verification has aborted for device DSA1271:
        DSA1271: contains zero working members.
        $2$DUA1271: (RBADA2, RBIZ07) is an incorrect shadow set member volume.

    - After dismounting the shadow set from G, this particular shadow set could
      no longer be accessed in the cluster. And since it was the disk holding
      SYSUAF and friends, no more logins were possible, no more rightslist
      lookups and so on. Eventually we had to take down and reboot the entire
      cluster (sigh).

    Assumption/Guess:
    - G forms VCs on DSSI and builds connections to the RF controller's
      MSCP$DISK. 
    - Somehow G must have become aware of the ports on FDDI.(???)
    - G forms VCs to A,B,C,D,E,F on FDDI 
    - G forms connection to the MSCP$DISK on the serving nodes
    - G does not have VMS$VAXcluster (VAXCLUSTER=0) therefore all access to
      served disks is not synchronized with the ABCDEF-cluster.

    Questions:
    - What did we do wrong?
    - How could G be aware of the systems on FDDI?


    Thank you for helping think about this.
    Raymond.

T.R	Title	User	Personal Name	Date	Lines
5221.1	Also Clear NISCS_LOAD_PEA0; Log QAR/IPMT	XDELTA::HOFFMAN	Steve, OpenVMS Engineering	`Fri Jan 31 1997 10:19`	23
    Log a medium-priority QAR (or IPMT) against OpenVMS on TRIFID::.

    G:: will use VMScluster protocols to access the DSSI disks even
    when it is not configured in a VMScluster, so this and other DSSI
    systems will tend to load various VMScluster support modules during
    the bootstrap.

    This looks like a SYSBOOT/SYSGEN bug -- setting VAXCLUSTER to zero
    should likely implicitly disable loading of the NISCS drivers -- it
    should force-set NISCS_LOAD_PEA0 to zero. (It's not clear how or if
    this can be done, and this whole sequence of events is strictly
    conjecture.)

    This looks like the same reason why we tell folks that nodes on a
    CI need to have VAXCLUSTER set non-zero on all CI nodes... If an
    uncoordinated access is made -- such as an access to that shadow set
    when VAXCLUSTER was set to zero -- then corruptions can and will occur.

    In the interim, "don't do that", or -- if you do, also disable the
    loading of the NISCS driver.
5221.2		UTRTSC::jvdbu.uto.dec.com::JurVanDerBurg	Change mode to Panic!	`Fri Jan 31 1997 10:49`	7
	Curious. I just experimented a little bit with this, and on both Vax and Alpha when setting vaxcluster=0 and niscs_load_pea0=1 then pedriver does not get loaded. Are you sure there's no other interconnect besides the network? (This was with OpenVMS VAX V6.2 and OpenVMS Alpha V7.1). Jur.
5221.3	only NI	TIMABS::FREPPEL	Mosquito ergo summm...	`Fri Jan 31 1997 14:09`	61
    Thanks Steve and Jur for your answers.

    re .1:

    >Log a medium-priority QAR (or IPMT) against OpenVMS on TRIFID::.

    Done. QAR# 1390 in the V6 database (I couldn't set a "component" or
    "abstract", can this be done after entering the QAR?)

    >G:: will use VMScluster protocols to access the DSSI disks even
    >when it is not configured in a VMScluster, so this and other DSSI
    >systems will tend to load various VMScluster support modules during
    >the bootstrap.

    Yes. And the VMScluster support modules should only form and use VCs
    with DSSI, right? (btw: there are no other DSSI systems in this
    particular cluster)

    >This looks like a SYSBOOT/SYSGEN bug -- setting VAXCLUSTER to zero
    >should likely implicitly disable loading of the NISCS drivers -- it
    >should force-set NISCS_LOAD_PEA0 to zero.

    Agree. I did the same (boot conv..set VAXC=0..cont) with system H, with
    absolutely no problems. There is no DSSI on that system though.

    >This looks like the same reason why we tell folks that nodes on a
    >CI need to have VAXCLUSTER set non-zero on all CI nodes...

    Yes, but in this case the interconnect is NI. Imagine what happens when
    there is more than one VMScluster on the NI: will we see all disks from
    all VMSclusters on the NI?

    >In the interim, "don't do that", or -- if you do, also disable the
    >loading of the NISCS driver.

    Well, this lesson has been learned painfully ...

    re .2:

    >Curious. I just experimented a little bit with this, and on both Vax and Alpha
    >when setting vaxcluster=0 and niscs_load_pea0=1 then pedriver does not get
    >loaded.

    Did you use a system with a local DSSI? I managed to boot H with
    VAXCLUSTER=0, and it worked as expected.

    >Are you sure there's no other interconnect besides the network?

    Yes. All VCs to members are LAN based.

    We set VAXCLUSTER back to 1 and booted G, here's what can be seen:

    View of Cluster from system ID 1998  node: G
    +-------------------+---------+--------------------------+
    |      SYSTEMS      | MEMBERS |         CIRCUITS         |
    +--------+----------+---------+-------+--------+---------+
    |  NODE  | SOFTWARE | STATUS  | RPORT | RP_TYP | CIR_STA |
    +--------+----------+---------+-------+--------+---------+
    | G      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
    |        |          |         |     5 | SHAC   | OPEN    |
    |        |          |         |     7 | SHAC   | OPEN    |
    | HSD10  | HSD B259 |         |     0 | RF72   | OPEN    |
    | DISK1  | RFX V256 |         |     4 | RF31   | OPEN    |
    | D      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
    | H      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
    | F      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
    | E      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
    | A      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
    | B      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
    | C      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
    +--------+----------+---------+-------+--------+---------+
5221.4	Fooling around with SYSGEN parameters can be dangerous	COVERT::COVERT	John R. Covert	`Fri Jan 31 1997 17:53`	15
    If you ask for PEDRIVER to be loaded by explicitly setting NISCS_LOAD_PEA0
    to one, PEDRIVER will be loaded even if VAXCLUSTER is zero if there are
    other MSCP devices on the system.

    See the VMScluster Systems for OpenVMS manual, Appendix A, Table A-1,
    which contains the explicit warning:

        Caution: If the NISCS_LOAD_PEA0 parameter is set to 1, the
        VAXCLUSTER system parameter must be set to 2. This ensures
        coordinated access to shared resources in the VMScluster and
        prevents accidental data corruption.

    We warned you.

    /john
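
    The manual's caution amounts to a simple consistency rule between the
    two parameters. A minimal sketch of such a check follows. It is not a
    DEC tool: the function name check_cluster_params and the warning text
    are invented for illustration, and it does nothing beyond comparing two
    values you supply by hand.

```python
from typing import List


def check_cluster_params(vaxcluster: int, niscs_load_pea0: int) -> List[str]:
    """Return warnings for the combination the manual's caution forbids.

    Hypothetical helper: it only encodes "if NISCS_LOAD_PEA0 is 1, then
    VAXCLUSTER must be 2" and knows nothing else about the system.
    """
    warnings = []
    if niscs_load_pea0 == 1 and vaxcluster != 2:
        warnings.append(
            "NISCS_LOAD_PEA0=1 requires VAXCLUSTER=2 "
            f"(found VAXCLUSTER={vaxcluster})"
        )
    return warnings


# The combination node G was booted with in the base note.
for message in check_cluster_params(vaxcluster=0, niscs_load_pea0=1):
    print("WARNING:", message)
```

    Called with VAXCLUSTER=2 and NISCS_LOAD_PEA0=1, or with both set to 0,
    the function returns an empty list.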
5221.5		EVMS::MORONEY	UHF Computers	`Fri Jan 31 1997 18:03`	5
	Doesn't sound like a good idea. What would be used for cluster ID? Is CLUSTER_AUTHORIZE.DAT still used? -Mike
5221.6		COVERT::COVERT	John R. Covert	`Sat Feb 01 1997 02:11`	4
	Yes. PEDRIVER reads CLUSTER_AUTHORIZE.DAT on initialization and will form circuits with other PEDRIVERs that match. /john
5221.7		TIMABS::FREPPEL	Mosquito ergo summm...	`Sat Feb 01 1997 05:39`	10
	re .4: Thanks John, this explains what we saw. So, I'm afraid, the answer to the question in the base note (What did I do wrong?) is: You didn't RTFM... And from your text I assume that Jur's systems (in .2) had no MSCP devices. Thanks to all, I appreciate it. Raymond.
5221.8		UTRTSC::thecow.uto.dec.com::JurVanDerBurg	Change mode to Panic!	`Mon Feb 03 1997 01:35`	10
    > And from your text I assume that Jur's systems (in .2) had no MSCP
    > devices.

    Right. Looking through the code I can see that on Alpha, PEdriver's
    authorization code is not loaded if VAXCLUSTER=0. The same thing is not
    done on VAX; you can QAR this.

    Jur.