
Conference spezko::cluster

Title: + OpenVMS Clusters - The best clusters in the world! +
Notice: This conference is COMPANY CONFIDENTIAL. See #1.3
Moderator: PROXY::MOORE
Created: Fri Aug 26 1988
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 5320
Total number of notes: 23384

5221.0. "VAXCL=0, still access served disks (non HSx,RF)" by TIMABS::FREPPEL (Mosquito ergo summm...) Fri Jan 31 1997 09:18

    Hi,

    how can a node with VAXCLUSTER=0 "see" and mount disks that are served by a
    node of a cluster on the same NI?

    The situation: 

    ..Lobe1......::.......Lobe2................::.........Location3..
                 ::                            ::
    {A,B,C}      ::     {D,E,F}  G----dssi---* ::               H
     | | |       ::      | | |   |             ::               |
    ++-+-+-----+ ::    +-+-+-+---++            ::      +--------+---+
    |GIGAswitch| ::    |GIGAswitch|            ::      |Concentrator|
    +---+------+ ::    +---+------+            ::      +--------+---+
        |        ::        |                   ::               |
        |        ::        |                   ::               |
    <===+==================+====================================+===> FDDI
                 ::                            ::

    {A,B,C} means: A,B,C are connected to the same CI 
    
    A: OpenVMS V6.2      (AlphaServer 8400 Model 5/300)  VOTES=1
    B: OpenVMS V6.2-1H2  (AlphaServer 2100 4/233)        VOTES=1
    C: OpenVMS V6.1      (VAX 10000-630)                 VOTES=1
    D: OpenVMS V6.2      (AlphaServer 8400 Model 5/300)  VOTES=1
    E: OpenVMS V6.1      (VAX 6000-430)                  VOTES=1
    F: OpenVMS V6.1      (VAX 10000-630)                 VOTES=1
    G: OpenVMS V6.2      (VAX 4000-106A)                 VOTES=0
    H: OpenVMS V6.1      (MicroVAX 3100)                 VOTES=1


    The steps:
    - the entire cluster (A,B,C,D,E,F,G,H) is taken down
    - Nodes A,B,C,D,E,F,H are booted, the cluster is formed, CL_EXP=7
    - on G: conversational boot, set VAXCLUSTER=0 (the system should no longer
            be part of the cluster)
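
The CL_EXP=7 figure in the steps above follows from the VOTES column of the node list: seven voting nodes, with G contributing none. A minimal Python sketch of that arithmetic, assuming the usual OpenVMS quorum formula of (EXPECTED_VOTES + 2) / 2 with the fraction dropped. The dictionary and the function name are illustrative only and are not part of the original note.

```python
# VOTES per node, copied from the configuration list in the base note.
votes = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1, "F": 1, "G": 0, "H": 1}


def quorum(expected_votes):
    """Quorum derived as (EXPECTED_VOTES + 2) / 2, truncated to an integer."""
    return (expected_votes + 2) // 2


# G stays down while the other seven nodes form the cluster.
booted = ["A", "B", "C", "D", "E", "F", "H"]
cluster_votes = sum(votes[name] for name in booted)

print("cluster votes:", cluster_votes)
print("quorum:", quorum(cluster_votes))
```

Because G has VOTES=0, its presence or absence does not change these numbers. The trouble described in this note comes from unsynchronized disk access, not from vote counting.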

    The result:	
    - "sh cluster" on G reveals that G sees A,B,C,D,E,F,H and G but all status 
      fields are clear.
    - "sh dev d" on G lists all disks served by the nodes A,B,C,D,E,F (online)
    - G is able to mount a shadow set consisting of served member disks
    - A,B,C,D,E,F put the shadow set in MountVerify issuing the messages:

 	$2$DUA1271: (A, B) is an incorrect shadow set member volume
     	$2$DUA1271: (RBADA2, RBIZ07) has been removed from shadow set.
    	Mount verification has aborted for device DSA1271:
    	DSA1271: contains zero working members.
    	$2$DUA1271: (RBADA2, RBIZ07) is an incorrect shadow set member volume.

    - After dismounting the shadow set from G, this particular shadow set could
      no longer be accessed in the cluster. And since it was the disk holding
      SYSUAF and friends, no more logins were possible, no more rightslist
      lookups and so on. Eventually we had to take down and reboot the entire
      cluster (sigh).

    Assumption/Guess:
    - G forms VCs on DSSI and builds connections to the RF controller's
      MSCP$DISK. 
    - Somehow G must have become aware of the ports on FDDI.(???)
    - G forms VCs to A,B,C,D,E,F on FDDI 
    - G forms connection to the MSCP$DISK on the serving nodes
    - G does not have VMS$VAXcluster (VAXCLUSTER=0) therefore all access to
      served disks is not synchronized with the ABCDEF-cluster.

    Questions:
    - What did we do wrong?
    - How could G be aware of the systems on FDDI?


    Thank you for helping think about this.
    Raymond.
T.R  Title  User  Personal Name  Date  Lines
5221.1. "Also Clear NISCS_LOAD_PEA0; Log QAR/IPMT" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Fri Jan 31 1997 10:19 (23 lines)
   Log a medium-priority QAR (or IPMT) against OpenVMS on TRIFID::.

   G:: will use VMScluster protocols to access the DSSI disks even
   when it is not configured in a VMScluster, so this and other DSSI
   systems will tend to load various VMScluster support modules during
   the bootstrap.

   This looks like a SYSBOOT/SYSGEN bug -- setting VAXCLUSTER to zero
   should likely implicitly disable loading of the NISCS drivers -- it
   should force-set NISCS_LOAD_PEA0 to zero.  (It's not clear how or if
   this can be done, and this whole sequence of events is strictly
   conjecture.)

   This looks like the same reason we tell folks that all nodes on a
   CI need to have VAXCLUSTER set non-zero...
   If an uncoordinated access is made -- such as an access to that
   shadow set when VAXCLUSTER was set to zero -- then corruptions
   can and will occur.

   In the interim, "don't do that", or -- if you do, also disable the
   loading of the NISCS driver.

5221.2. by UTRTSC::jvdbu.uto.dec.com::JurVanDerBurg (Change mode to Panic!) Fri Jan 31 1997 10:49 (7 lines)
Curious. I just experimented a little bit with this, and on both VAX and Alpha
when setting vaxcluster=0 and niscs_load_pea0=1 then pedriver does not get 
loaded. Are you sure there's no other interconnect besides the network?
(This was with OpenVMS VAX V6.2 and OpenVMS Alpha V7.1).

Jur.

5221.3. "only NI" by TIMABS::FREPPEL (Mosquito ergo summm...) Fri Jan 31 1997 14:09 (61 lines)
    Thanks Steve and Jur for your answers.
    
    re .1:
>Log a medium-priority QAR (or IPMT) against OpenVMS on TRIFID::.
    Done. QAR# 1390 in the V6 database (I couldn't set a "component" or 
    "abstract", can this be done after entering the QAR?)

>G:: will use VMScluster protocols to access the DSSI disks even
>when it is not configured in a VMScluster, so this and other DSSI
>systems will tend to load various VMScluster support modules during
>the bootstrap.
    Yes. And the VMScluster support modules should only form and use VCs with
    DSSI, right?
    (btw: there are no other DSSI systems in this particular cluster)

>This looks like a SYSBOOT/SYSGEN bug -- setting VAXCLUSTER to zero
>should likely implicitly disable loading of the NISCS drivers -- it
>should force-set NISCS_LOAD_PEA0 to zero.  
    Agree. I did the same (boot conv..set VAXC=0..cont) with system H, with
    absolutely no problems. There is no DSSI on that system though.

>This looks like the same reason we tell folks that all nodes on a
>CI need to have VAXCLUSTER set non-zero...
    Yes but in this case the interconnect is NI. 
    Imagine what happens when there is more than one VMScluster on the NI:
    will we see *all* disks from *all* VMSclusters on the NI?

>In the interim, "don't do that", or -- if you do, also disable the
>loading of the NISCS driver.
    Well, this lesson has been learned painfully ...

re .2:
>Curious. I just experimented a little bit with this, and on both VAX and Alpha
>when setting vaxcluster=0 and niscs_load_pea0=1 then pedriver does not get 
>loaded. 
    Did you use a system with a local DSSI? I managed to boot H with 
    VAXCLUSTER=0, and it worked as expected.

>Are you sure there's no other interconnect besides the network?
    Yes. All VCs to members are LAN based.
    We set VAXCLUSTER back to 1 and booted G, here's what can be seen:

View of Cluster from system ID 1998  node: G
+-------------------+---------+--------------------------+
|      SYSTEMS      | MEMBERS |         CIRCUITS         |
+--------+----------+---------+-------+--------+---------+
|  NODE  | SOFTWARE |  STATUS | RPORT | RP_TYP | CIR_STA |
+--------+----------+---------+-------+--------+---------+
| G      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
|        |          |         |     5 | SHAC   | OPEN    |
|        |          |         |     7 | SHAC   | OPEN    |
| HSD10  | HSD B259 |         |     0 | RF72   | OPEN    |
| DISK1  | RFX V256 |         |     4 | RF31   | OPEN    |
| D      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
| H      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
| F      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
| E      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
| A      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
| B      | VMS V6.2 | MEMBER  |       | LAN    | OPEN    |
| C      | VMS V6.1 | MEMBER  |       | LAN    | OPEN    |
+--------+----------+---------+-------+--------+---------+
5221.4. "Fooling around with SYSGEN parameters can be dangerous" by COVERT::COVERT (John R. Covert) Fri Jan 31 1997 17:53 (15 lines)
If you ask for PEDRIVER to be loaded by explicitly setting NISCS_LOAD_PEA0
to one, PEDRIVER will be loaded even when VAXCLUSTER is zero, provided there
are other MSCP devices on the system.

See the VMScluster Systems for OpenVMS manual, Appendix A, Table A-1, which
contains the explicit warning:

	Caution: If the NISCS_LOAD_PEA0 parameter is set to 1, the
	VAXCLUSTER system parameter must be set to 2.  This ensures
	coordinated access to shared resources in the VMScluster and
	prevents accidental data corruption.

We warned you.

/john
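
The Caution quoted in this reply boils down to a single condition on two parameters. A minimal Python sketch of that condition follows. `check_cluster_params` is a hypothetical helper written for illustration, not an OpenVMS utility, and it looks only at the two parameters named in the Caution.

```python
def check_cluster_params(vaxcluster, niscs_load_pea0):
    """Return a list of warnings for the given SYSGEN parameter values.

    Encodes only the documented Caution: NISCS_LOAD_PEA0=1 requires
    VAXCLUSTER=2. Everything else about the node is out of scope.
    """
    warnings = []
    if niscs_load_pea0 == 1 and vaxcluster != 2:
        warnings.append(
            "NISCS_LOAD_PEA0=1 with VAXCLUSTER=%d: PEDRIVER may load "
            "without coordinated access to shared disks" % vaxcluster
        )
    return warnings


# Node G as booted in the base note. NISCS_LOAD_PEA0=1 is inferred from
# the thread; the base note does not state the value explicitly.
for line in check_cluster_params(vaxcluster=0, niscs_load_pea0=1):
    print(line)
```

A check of this kind only flags the combination. Per the thread, the actual remedy is to set NISCS_LOAD_PEA0 to zero as well whenever VAXCLUSTER is set to zero.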
5221.5. by EVMS::MORONEY (UHF Computers) Fri Jan 31 1997 18:03 (5 lines)
Doesn't sound like a good idea.

What would be used for cluster ID? Is CLUSTER_AUTHORIZE.DAT still used?

-Mike
5221.6. by COVERT::COVERT (John R. Covert) Sat Feb 01 1997 02:11 (4 lines)
Yes.  PEDRIVER reads CLUSTER_AUTHORIZE.DAT on initialization and will form
circuits with other PEDRIVERs that match.

/john
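
The matching described in this reply also bears on the question in .3 about several VMSclusters sharing one LAN. A rough Python model follows, assuming that "match" means an equal cluster group number and password, the two values CLUSTER_AUTHORIZE.DAT holds. The `Node` class, the group numbers, and the passwords are invented for illustration, and nothing here reflects PEDRIVER's actual handshake.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    group: int       # cluster group number (hypothetical value)
    password: str    # cluster password (hypothetical value)


def circuit_partners(local, others):
    """Names of nodes whose authorization data matches the local node's."""
    return [n.name for n in others
            if (n.group, n.password) == (local.group, local.password)]


g = Node("G", 1234, "secret")
lan = [
    Node("A", 1234, "secret"),
    Node("D", 1234, "secret"),
    Node("X", 4321, "other"),   # X stands for a node of a different cluster
]

print(circuit_partners(g, lan))
```

Under this assumption, G would see only the disks served by nodes that share its authorization data, not those of every VMScluster on the NI.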
5221.7. by TIMABS::FREPPEL (Mosquito ergo summm...) Sat Feb 01 1997 05:39 (10 lines)
    re .4:
    Thanks John, this explains what we saw. So, I'm afraid, the answer to
    the question in the base note (What did I do wrong?) is: You didn't RTFM...
    
    And from your text I assume that Jur's systems (in .2) had no MSCP
    devices. 
    
    Thanks to all,
    I appreciate it.
    Raymond.
5221.8. by UTRTSC::thecow.uto.dec.com::JurVanDerBurg (Change mode to Panic!) Mon Feb 03 1997 01:35 (10 lines)
>    And from your text I assume that Jur's systems (in .2) had no MSCP
>    devices. 

Right. Looking through the code I can see that on Alpha PEDRIVER's authorization
code is not loaded if VAXCLUSTER=0. The same thing is not done on VAX; you
can QAR this.

Jur.