Title: | + OpenVMS Clusters - The best clusters in the world! + |
Notice: | This conference is COMPANY CONFIDENTIAL. See #1.3 |
Moderator: | PROXY::MOORE |
|
Created: | Fri Aug 26 1988 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 5320 |
Total number of notes: | 23384 |
5221.0. "VAXCL=0, still access served disks (non HSx,RF)" by TIMABS::FREPPEL (Mosquito ergo summm...) Fri Jan 31 1997 09:18
Hi,
how can a node with VAXCLUSTER=0 "see" and mount disks that are served by a
node of a cluster on the same NI?
The situation:
..Lobe1......::.......Lobe2................::.........Location3..
             ::                            ::
{A,B,C}      ::     {D,E,F}  G----dssi---* ::               H
 | | |       ::      | | |   |             ::               |
++-+-+-----+ ::    +-+-+-+---++            ::      +--------+---+
|GIGAswitch| ::    |GIGAswitch|            ::      |Concentrator|
+---+------+ ::    +---+------+            ::      +--------+---+
    |        ::        |                   ::               |
    |        ::        |                   ::               |
<===+==================+====================================+===> FDDI
             ::                            ::
{A,B,C} means: A,B,C are connected to the same CI
A: OpenVMS V6.2 (AlphaServer 8400 Model 5/300) VOTES=1
B: OpenVMS V6.2-1H2 (AlphaServer 2100 4/233) VOTES=1
C: OpenVMS V6.1 (VAX 10000-630) VOTES=1
D: OpenVMS V6.2 (AlphaServer 8400 Model 5/300) VOTES=1
E: OpenVMS V6.1 (VAX 6000-430) VOTES=1
F: OpenVMS V6.1 (VAX 10000-630) VOTES=1
G: OpenVMS V6.2 (VAX 4000-106A) VOTES=0
H: OpenVMS V6.1 (MicroVAX 3100) VOTES=1
The steps:
- the entire cluster (A,B,C,D,E,F,G,H) is taken down
- Nodes A,B,C,D,E,F,H are booted, the cluster is formed, CL_EXP=7
- on G: conversational boot, set VAXCLUSTER=0 (the system should no longer
be part of the cluster)
The result:
- "sh cluster" on G reveals that G sees A,B,C,D,E,F,H and G but all status
fields are clear.
- "sh dev d" on G lists all disks served by the nodes A,B,C,D,E,F (online)
- G is able to mount a shadow set consisting of served member disks
- A,B,C,D,E,F put the shadow set in MountVerify issuing the messages:
$2$DUA1271: (A, B) is an incorrect shadow set member volume
$2$DUA1271: (RBADA2, RBIZ07) has been removed from shadow set.
Mount verification has aborted for device DSA1271:
DSA1271: contains zero working members.
$2$DUA1271: (RBADA2, RBIZ07) is an incorrect shadow set member volume.
- After dismounting the shadow set from G, this particular shadow set could
no longer be accessed in the cluster. And since it was the disk holding
SYSUAF and friends, no more logins were possible, no more rightslist
lookups and so on. Eventually we had to take down and reboot the entire
cluster (sigh).
Assumption/Guess:
- G forms VCs on DSSI and builds connections to the RF controller's
MSCP$DISK.
- Somehow G must have become aware of the ports on FDDI.(???)
- G forms VCs to A,B,C,D,E,F on FDDI
- G forms connection to the MSCP$DISK on the serving nodes
- G does not have VMS$VAXcluster (VAXCLUSTER=0) therefore all access to
served disks is not synchronized with the ABCDEF-cluster.
Questions:
- What did we do wrong?
- How could G be aware of the systems on FDDI?
Thank you for helping think about this.
Raymond.
T.R | Title | User | Personal Name | Date | Lines |
---|
5221.1 | Also Clear NISCS_LOAD_PEA0; Log QAR/IPMT | XDELTA::HOFFMAN | Steve, OpenVMS Engineering | Fri Jan 31 1997 10:19 | 23 |
|
Log a medium-priority QAR (or IPMT) against OpenVMS on TRIFID::.
G:: will use VMScluster protocols to access the DSSI disks even
when it is not configured in a VMScluster, so this and other DSSI
systems will tend to load various VMScluster support modules during
the bootstrap.
This looks like a SYSBOOT/SYSGEN bug -- setting VAXCLUSTER to zero
should likely implicitly disable loading of the NISCS drivers -- it
should force-set NISCS_LOAD_PEA0 to zero. (It's not clear how or if
this can be done, and this whole sequence of events is strictly
conjecture.)
This looks like the same reason why we tell folks that all nodes on a
CI need to have VAXCLUSTER set non-zero...
If an uncoordinated access is made -- such as an access to that
shadow set when VAXCLUSTER was set to zero -- then corruptions
can and will occur.
In the interim, "don't do that", or -- if you do, also disable the
loading of the NISCS driver.
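
As a rough illustration of the workaround above, the following Python sketch
formats the SYSBOOT input one might type during a conversational boot to take
a node out of the cluster while also keeping the NISCS driver from loading.
The helper name sysboot_commands is hypothetical and only formats text; the
parameter names VAXCLUSTER and NISCS_LOAD_PEA0 are the ones discussed in this
topic.

```python
def sysboot_commands(overrides):
    """Render parameter overrides as SYSBOOT input lines.

    `overrides` maps system parameter names to integer values.
    This helper is hypothetical: it only builds text and does not
    talk to any real system.
    """
    lines = [f"SET {name} {value}" for name, value in overrides.items()]
    lines.append("CONTINUE")  # resume the bootstrap with the new values
    return lines


# Standalone boot: leave the cluster AND keep PEDRIVER from loading.
standalone = {"VAXCLUSTER": 0, "NISCS_LOAD_PEA0": 0}

for line in sysboot_commands(standalone):
    print("SYSBOOT>", line)
```

The point of generating both SET lines together is that the two parameters
are changed as a pair, never VAXCLUSTER alone.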
|
5221.2 | | UTRTSC::jvdbu.uto.dec.com::JurVanDerBurg | Change mode to Panic! | Fri Jan 31 1997 10:49 | 7 |
| Curious. I just experimented a little bit with this, and on both Vax and Alpha
when setting vaxcluster=0 and niscs_load_pea0=1 then pedriver does not get
loaded. Are you sure there's no other interconnect besides the network?
(This was with OpenVMS VAX V6.2 and OpenVMS Alpha V7.1).
Jur.
|
5221.3 | only NI | TIMABS::FREPPEL | Mosquito ergo summm... | Fri Jan 31 1997 14:09 | 61 |
| Thanks Steve and Jur for your answers.
re .1:
>Log a medium-priority QAR (or IPMT) against OpenVMS on TRIFID::.
Done. QAR# 1390 in the V6 database (I couldn't set a "component" or
"abstract", can this be done after entering the QAR?)
>G:: will use VMScluster protocols to access the DSSI disks even
>when it is not configured in a VMScluster, so this and other DSSI
>systems will tend to load various VMScluster support modules during
>the bootstrap.
Yes. And the VMScluster support modules should only form and use VCs with
DSSI, right?
(btw: there are no other DSSI systems in this particular cluster)
>This looks like a SYSBOOT/SYSGEN bug -- setting VAXCLUSTER to zero
>should likely implicitly disable loading of the NISCS drivers -- it
>should force-set NISCS_LOAD_PEA0 to zero.
Agree. I did the same (boot conv..set VAXC=0..cont) with system H, with
absolutely no problems. There is no DSSI on that system though.
>This looks like the same reason why we tell folks that all nodes on a
>CI need to have VAXCLUSTER set non-zero...
Yes but in this case the interconnect is NI.
Imagine what happens when there is not only one VMScluster on the NI,
will we see *all* disks from *all* VMSclusters on the NI?
>In the interim, "don't do that", or -- if you do, also disable the
>loading of the NISCS driver.
Well, this lesson has been learned painfully ...
re .2:
>Curious. I just experimented a little bit with this, and on both Vax and Alpha
>when setting vaxcluster=0 and niscs_load_pea0=1 then pedriver does not get
>loaded.
Did you use a system with a local DSSI? I managed to boot H with
VAXCLUSTER=0, and it worked as expected.
>Are you sure there's no other interconnect besides the network?
Yes. All VCs to members are LAN based.
We set VAXCLUSTER back to 1 and booted G, here's what can be seen:
View of Cluster from system ID 1998 node: G
+-------------------+---------+--------------------------+
| SYSTEMS | MEMBERS | CIRCUITS |
+--------+----------+---------+-------+--------+---------+
| NODE | SOFTWARE | STATUS | RPORT | RP_TYP | CIR_STA |
+--------+----------+---------+-------+--------+---------+
| G | VMS V6.2 | MEMBER | | LAN | OPEN |
| | | | 5 | SHAC | OPEN |
| | | | 7 | SHAC | OPEN |
| HSD10 | HSD B259 | | 0 | RF72 | OPEN |
| DISK1 | RFX V256 | | 4 | RF31 | OPEN |
| D | VMS V6.2 | MEMBER | | LAN | OPEN |
| H | VMS V6.1 | MEMBER | | LAN | OPEN |
| F | VMS V6.1 | MEMBER | | LAN | OPEN |
| E | VMS V6.1 | MEMBER | | LAN | OPEN |
| A | VMS V6.2 | MEMBER | | LAN | OPEN |
| B | VMS V6.2 | MEMBER | | LAN | OPEN |
| C | VMS V6.1 | MEMBER | | LAN | OPEN |
+--------+----------+---------+-------+--------+---------+
|
5221.4 | Fooling around with SYSGEN parameters can be dangerous | COVERT::COVERT | John R. Covert | Fri Jan 31 1997 17:53 | 15 |
| If you ask for PEDRIVER to be loaded by explicitly setting NISCS_LOAD_PEA0
to one, PEDRIVER will be loaded even if VAXCLUSTER is zero if there are
other MSCP devices on the system.
See the VMScluster Systems for OpenVMS manual, Appendix A, Table A-1, which
contains the explicit warning:
Caution: If the NISCS_LOAD_PEA0 parameter is set to 1, the
VAXCLUSTER system parameter must be set to 2. This ensures
coordinated access to shared resources in the VMScluster and
prevents accidental data corruption.
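
The caution can be turned into a simple pre-boot sanity check. The Python
sketch below is only an illustration of that one rule; the function name
check_cluster_params is made up for this example, and it knows nothing about
any other parameter interactions.

```python
def check_cluster_params(vaxcluster, niscs_load_pea0):
    """Return a list of problems found in the two parameter values.

    Encodes only the rule quoted from the manual: NISCS_LOAD_PEA0=1
    requires VAXCLUSTER=2. An empty list means the rule is satisfied.
    """
    problems = []
    if niscs_load_pea0 == 1 and vaxcluster != 2:
        problems.append(
            "NISCS_LOAD_PEA0=1 requires VAXCLUSTER=2 "
            f"(found VAXCLUSTER={vaxcluster})"
        )
    return problems


# Node G as booted in the base note: VAXCLUSTER=0, NISCS_LOAD_PEA0 still 1.
for problem in check_cluster_params(vaxcluster=0, niscs_load_pea0=1):
    print("WARNING:", problem)
```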
We warned you.
/john
|
5221.5 | | EVMS::MORONEY | UHF Computers | Fri Jan 31 1997 18:03 | 5 |
| Doesn't sound like a good idea.
What would be used for cluster ID? Is CLUSTER_AUTHORIZE.DAT still used?
-Mike
|
5221.6 | | COVERT::COVERT | John R. Covert | Sat Feb 01 1997 02:11 | 4 |
| Yes. PEDRIVER reads CLUSTER_AUTHORIZE.DAT on initialization and will form
circuits with other PEDRIVERs that match.
/john
|
5221.7 | | TIMABS::FREPPEL | Mosquito ergo summm... | Sat Feb 01 1997 05:39 | 10 |
| re .4:
Thanks John, this explains what we saw. So, I'm afraid, the answer to
the question in the base note (What did I do wrong?) is: You didn't RTFM...
And from your text I assume that Jur's systems (in .2) had no MSCP
devices.
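
To reconcile .2 and .4, here is a toy model in Python of when PEDRIVER ends
up loaded, built only from the statements in this topic. It is not derived
from the actual boot code; the function name and the has_local_mscp_devices
flag are invented for the sketch, and the branch for a non-zero VAXCLUSTER is
an assumption.

```python
def pedriver_loaded(vaxcluster, niscs_load_pea0, has_local_mscp_devices):
    """Toy model of the PEDRIVER load decision described in this topic."""
    if niscs_load_pea0 != 1:
        # Loading was not requested at all.
        return False
    if vaxcluster == 0:
        # Per .4: still loaded when other MSCP devices are present.
        # Per .2: not loaded on a system without such devices.
        return has_local_mscp_devices
    # Assumption: a node configured as a cluster member loads it as requested.
    return True


# G in the base note: VAXCLUSTER=0, local DSSI (RF) disks present.
print("G:", pedriver_loaded(0, 1, True))
# H and the test systems in .2: VAXCLUSTER=0, no local MSCP devices.
print("H:", pedriver_loaded(0, 1, False))
```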
Thanks to all,
I appreciate it.
Raymond.
|
5221.8 | | UTRTSC::thecow.uto.dec.com::JurVanDerBurg | Change mode to Panic! | Mon Feb 03 1997 01:35 | 10 |
| > And from your text I assume that Jur's systems (in .2) had no MSCP
> devices.
Right. Looking through the code I can see that on Alpha PEdriver's authorization
code is not loaded if VAXCLUSTER=0. The same thing is not done on VAX; you
can QAR this.
Jur.
|