
Conference ssag::ask_ssag

Title:Ask the Storage Architecture Group
Notice:Check out our web page at http://www-starch.shr.dec.com
Moderator:SSAG::TERZAN
Created:Wed Oct 15 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6756
Total number of notes:25276

6652.0. "connected/unshared SCSI; non-cluster (VMS)" by SAYER::ELMORE (Steve [email protected] 4123645893) Mon May 05 1997 11:12

    I have a customer who neither needs nor wants to configure a VMS cluster
    between two AS1000s.  However, they want a type of "cold" failover
    without moving cables.  They want to physically connect the
    two systems using KZPAAs, SCSI, a BA356 (w two 4GB drives), and the
    "tri-link" "Y" connectors as shown below.  Logically, only one system 
    will ever see the disks in the BA356 (which will be host shadowed).
    
     VMS AS1000                                 VMS AS1000
    system "A"                                 system "B"
    +-----+---+                                +---+-----+
    |     | K |                                | K |     |
    |     | Z |                                | Z |     |
    |     | P |                                | P |     |
    |     | A |                                | A |     |
    |     | A |                                | A |     |
    +-----+-|-+        Tri-link                +-|-+-----+
            |==+==-------\ /----------------==+==|
               |          |                   |   
              term     +--|--+               term
                       |DWZZB|          
                       |RZxx |
                       |RZxx |
                       |     |       
                       +-----+             ==+== is a "Y" cable
                        BA356


    Each 1000 has its own system disk, not shared across the KZPAA-based
    SCSI.  No VMS cluster license.  System A is running host-based
    shadowing over the two BA356 disks.
    
    The customer wants to be able to power up both systems and be running
    VMS.  Only system A will access the two disks in the BA356, unless
    system A goes down.  In that case, the customer wants to access those
    disks from system B.  Rebooting system B to accomplish this is OK.
    Directing system B to run a different system startup (systartup) is OK;
    a sketch of what that might look like follows below.  If the disks
    become corrupted and need to be rebuilt with some data loss, that is
    also OK.  Time to switch over is not critical (many minutes to hours?).
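
    For illustration, the alternate startup on system B would be something
    like the following DCL sketch.  All of the names in it -- device names,
    volume label, logical name, and the application startup file -- are
    placeholders I've made up, not from the customer's spec:

        $! FAILOVER_STARTUP.COM -- run from SYSTARTUP on system B only when
        $! system A is known to be down.
        $ SET NOON
        $ MOUNT/SYSTEM DSA1: /SHADOW=(DKB100:, DKB200:) CUSTDATA CUSTDISK
        $ IF .NOT. $STATUS THEN GOTO MOUNT_FAILED
        $ @SYS$STARTUP:CUSTOMER_APP_STARTUP.COM  ! placeholder application startup
        $ EXIT
        $MOUNT_FAILED:
        $ WRITE SYS$OUTPUT "Failover mount failed -- verify system A is really down"
        $ EXIT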
    
    Can this work?  I understand it's probably not supported.  Our concerns
    are the SCSI traffic system B generates (at power-up and during startup)
    and whether that would crash system A or corrupt the disks.  Any other hardware or
    software issues?  The customer "insists" that system B will never mount
    the disks when system A has control.  We're concerned that VMS or the
    controller in system B may do that without specific direction to do so.

    This is only to eliminate the need to physically unplug disks from 
    system A and plug them into system B in the event of a system A 
    failure (customer requirement in a spec. - no sense in arguing).
 

    Thanks,
    Steve
    
                                 
    

6652.1. "commit only supported" by FRAIS::KHAN  Tue May 06 1997 06:52  (12 lines)
    I have had lots of discussions of this kind ... basically they need a
    few cluster functionalities but do not want to buy the license.
    Yes, one can run such a configuration, ignore this message, take
    care not to do this or that ... but that does not make it a supported
    one!!  In my opinion, the customer is trying to find someone who will
    commit himself and carry the responsibility.
    I do not argue with them.  I just tell them (like your note .0) all the
    technical issues why we do not recommend such a configuration, and that
    with a cluster they can move to the 'supported' one.
    I have seen such configurations running, even with VAXes.  The problem is
    that once the application is running 'hot' the customer reacts
    differently ("I asked DIGITAL and they said yes ...").

6652.2. "I think it's dangerous, but it's their choice" by SUBSYS::BROWN (SCSI and DSSI advice given cheerfully)  Tue May 06 1997 09:34  (31 lines)
    Although there is nothing to stop the customer from running an 
    unsupported configuration, and technically astute customers are
    welcome to take informed risks, I sense that this customer doesn't
    know what the risks are.
    
    I don't sense that the customer knows enough about the SCSI protocol
    and about host-based volume shadowing to appreciate all the issues,
    so it would be prudent to summarize some of the obvious problems.
    
    First, the customer shouldn't use DWZZBs with KZPAAs.  KZPAAs are
    already single-ended.  Perhaps the customer intends to use KZPSAs?
    
    Second, it would be prudent to use at least the VMS version that
    would have been required for a supported configuration.  Such a version
    might have driver fixes and volume-shadowing fixes that would help
    shared buses and failover work properly.
    
    Third, the customer should realize that host-based shadowing stores 
    some information about the shadow-set on the host, not on the disk.
    When you boot the second system, it won't know that the two disks
    in the BA356 are supposed to be a shadow-set.  If you mount them as
    a shadow-set, the new shadow-set may not be in the same state as
    the shadow set on the first system.  Writes may not have completed,
    master-slave relationships (rebuilds) may be lost, etc.
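
    To make that concrete: under host-based shadowing the membership of the
    shadow set is stated only in the MOUNT command issued on the host --
    roughly like this (device names and label are placeholders, not from .0):

        $!  On system A during normal operation.  The shadow-set membership is
        $!  remembered by system A; nothing tells system B about it.
        $ MOUNT/SYSTEM DSA1: /SHADOW=(DKB100:, DKB200:) CUSTDATA
        $!
        $!  If system A fails, system B only gets a shadow set by issuing its
        $!  own MOUNT of the same members.  The set it then creates starts
        $!  from whatever happens to be on the disks, so a copy or merge may
        $!  be needed, and writes that were in flight on system A may be lost.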
    
    In general, unsupported configurations work most of the time.  The
    two main reasons for not supporting some configurations are bugs
    (known or anticipated) and lack of testing time.  The problem with
    this configuration is potential data loss or data corruption, depending
    on the state of the shadow-set at the time of the failover.  If your
    customer is willing to take that risk, that's their call.

6652.3. "what to say?" by SAYER::ELMORE (Steve [email protected] 4123645893)  Tue May 06 1997 10:33  (17 lines)
    Based on .2, may I say that system "B" will NOT generate traffic on
    the SCSI that could cause the other system to crash, or its disks to
    become corrupted, when system A has "control"?  Also, may I say that
    system "A" can't crash system "B"?
    
    I understand the nature of "unsupported" configurations.  I understand
    the long term implication of that too.  We can tell the customer that
    we will not service it, we won't support it, we won't configure it.  We
    can write that in a contract.  But, will it RUN under the circumstances
    we've outlined?  
    
    I suppose what we are really asking is what events are happening on the
    SCSI bus, if any, and how do the systems (hardware and OS) handle those
    events?  Will those events damage hardware or crash the OS?
    
    Thanks,
    Steve

6652.4. "specific problems to watch for" by SUBSYS::BROWN (SCSI and DSSI advice given cheerfully)  Tue May 06 1997 13:28  (14 lines)
    In VMS V6.2, there were problems with one of the fast wide adapters 
    (KZPSA or QLogic, I forget which) being the target of an INQUIRY
    command from the other system.  With the KZPAA, that shouldn't be
    a problem.  The problem was fixed in 6.2-1H1 and 7.1.
    
    Also, booting the second system will cause SCSI bus resets.  We've seen
    crashes with the KZPAA with lots of resets in a short period, when it
    had a heavy I/O load, but the one or two resets generated during a boot
    are very unlikely to cause problems.  Still, since a reset can cause a
    command in progress to fail, you should probably check in VMSNOTES to see 
    if bus resets can cause problems with HBVS.
    
    Also, don't put tape drives on the shared bus.  We haven't proved it
    works, even in a cluster.

6652.5. "Partitioned VMScluster: Prelude to Data Corruption" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering)  Tue May 06 1997 16:58  (57 lines)
   This configuration is a partitioned VMScluster.

   Partitioned VMScluster configurations are bad configurations.

   If you want to tell the customer something, tell them that we do not
   recommend this configuration, and we have seen *massive* corruptions
   result, and that the supported configuration involves a VMScluster,
   or involves configuring disjoint (non-shared) SCSI busses.

:    Based on .2, may I say that system "B" will NOT generate traffic on
:    the SCSI that could cause the other system to crash, or its disks to
:    become corrupted, when system A has "control"?  Also, may I say that
:    system "A" can't crash system "B"?

   I would not say anything of the kind.

   I would expect each host would detect the other's SCSI controller on
   the shared SCSI, and I would expect I might have to alter a few console
   variables to keep the systems from squawking about the SCSI controllers.
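
   (For reference: the console settings I have in mind are the SCSI host-ID
   variables at the SRM console, so the two adapters don't both sit at the
   default ID of 7 on the shared bus.  The variable name pka0_host_id is an
   assumption -- it depends on which slot the KZPAA probes as:)

        >>> show pka0_host_id      ! SCSI ID this host's adapter uses
        >>> set pka0_host_id 6     ! on system B; leave system A at the default 7
        >>> init                   ! re-initialize so the new ID takes effect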
    
:    I understand the nature of "unsupported" configurations. I understand
:    the long term implication of that too.  We can tell the customer that
:    we will not service it, we won't support it, we won't configure it.  We
:    can write that in a contract.  But, will it RUN under the circumstances
:    we've outlined?  

   It might run.  It might randomly crash.  It might randomly corrupt
   the user and system data.  And the customer gets to find all this
   out -- whether this configuration works, and whether or not the
   customer can correctly (and safely) manage this configuration.

   I've already seen cases where the two nodes were incorrectly booted
   from the same system root -- which is *very* easy to do in this
   particular configuration -- and *massive* data corruptions resulted.

   I will admit to having configured and run partitioned VMScluster
   configurations -- which is what this is -- but I have also seen
   these lead to trashed disks and random system crashes.  Things
   get very interesting during upgrades, too.

   If a customer is asking the questions raised here, I'd recommend
   against this configuration.

:    I suppose what we are really asking is what events are happening on the
:    SCSI bus, if any, and how do the systems (hardware and OS) handle those
:    events?  Will those events damage hardware or crash the OS?

   You are headed off into dangerous territory, territory where the
   VMScluster connection manager and the distributed lock manager were
   explicitly designed to prevent just the sorts of problems your
   customer may/will see with this configuration.

   If the customer wants something like this, I'd look for a SCSI
   bus switch or similar widget -- hardware that can prevent two
   systems from being on the same bus at the same time...

6652.6. "Thanks" by SAYER::ELMORE (Steve [email protected] 4123645893)  Tue May 06 1997 21:10  (6 lines)
    Thank you all for the information.  We'll tell the customer no.  
    
    I'd sure like to find a SCSI switch though.  Anyone ever heard of one?
    
    Thanks again,
    Steve

6652.7. by JACEK::waldek.rpw.dec.com::agatka::calka (Waldek Calka)  Wed May 07 1997 08:46  (6 lines)
Have a look at ANCOT; my customer has 30 of them and they are
working.  Check the ANNECY::WDX notes conference to learn more.

Regards/Waldemar


6652.8. "we do something like this every day for 10 years now." by EPS::VANDENHEUVEL (Hein)  Thu May 08 1997 10:34  (35 lines)
    
    IMHO the tone of the replies in general, and Steve's in .5 in particular,
    is overly pessimistic and avoids the core technical questions, such as:
      - what are the expected results of bus resets after reboots?
      - does MOUNT reserve the target device on the SCSI bus?
      - is it shadowing that makes it impossible, because critical
        reconstruction data may be present on another disk on the other node?
        (it couldn't be in memory only, because shadowing has to cope with crashes)
    
    The way I read .0, this customer perfectly understands that the same disk
    should never, ever be mounted from two non-clustered systems at the same
    time, and they show that they understand the consequences of getting it wrong.
    It would seem to me that we can possibly simply tell them that
      - this is an unsupported configuration which we recommend against
      - electrically it will work (no smoke)
      - we would encourage them to buy a (minimal) cluster license, which
        will solve their problem and add more value to boot, but
      - we do expect this configuration to work just fine (don't we?!)
      - it is ultimately their choice; we can never condone the config
    
    FWIW, we have had a 'sneaker' disk here in the lab for the past 10+ years
    which is dual ported and mounted either on one cluster or on another
    cluster, never on both at the same time.  The HSCs + VMS actually make sure
    that you cannot mount it twice.  It never came close to any corruption.
    VMS MOUNTing on SCSI also 'reserves' the device, does it not?
    If it does not, we may want to encourage the customer to come up
    with a 'token' that the mount procedure checks for before continuing.
    I think that token _could_ be information on the very disk -- for
    example, changing the volume label to reflect who's got it.  It could
    perhaps be a network name, or a dongle behind a com port, or whatever.
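
    Purely as a sketch of that 'token' idea -- every name here is invented,
    and it is shown for a single disk; with shadowing the same MOUNT and
    relabel would target the DSAn: virtual unit -- the mount procedure on
    each node could look something like:

        $!  CLAIM_DISK.COM -- mount the shared disk only if its label says it
        $!  is free, then relabel it so the other node's copy of this procedure
        $!  will refuse it.  Free label = CUSTFREE, claimed label = CUSTBUSY.
        $ SET NOON
        $ MOUNT/SYSTEM DKB100: CUSTFREE CUSTDISK
        $ IF .NOT. $STATUS THEN GOTO NOT_FREE  ! label mismatch: other node owns it
        $ SET VOLUME/LABEL=CUSTBUSY CUSTDISK:
        $ EXIT
        $NOT_FREE:
        $ WRITE SYS$OUTPUT "Label is not CUSTFREE -- the other system appears to own the disk"
        $ EXIT

    An orderly shutdown would relabel back to CUSTFREE before dismounting.
    After a crash of the owning node the label would still say CUSTBUSY, so
    the failover path still needs a deliberate manual override -- which is
    arguably the point of having the token.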
    
    2¢,
    	Hein
                                              

6652.9. "VMS MOUNTing on SCSI ***DOES NOT*** RESERVE (i.e. SCSI RESERVE command) the DEVICE." by STAR::WCLOGHER  Thu May 08 1997 11:06  (0 lines)

6652.10. by LEFTY::CWILLIAMS (CD or not CD, that's the question)  Thu May 08 1997 11:24  (4 lines)
    It is so easy to screw this up that it is not worth the risk.
    Just the bus scanning and resets from a reboot could cause problems.
    
    Bad idea.

6652.11. "Removal of "Blade Guards"?" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering)  Mon May 12 1997 11:05  (22 lines)
:    FWIW, we have had a 'sneaker' disk here in the lab for the past 10+ years
:    which is dual ported and mounted either on one cluster or on another
:    cluster, never on both at the same time.  The HSCs + VMS actually make sure
:    that you cannot mount it twice.  It never came close to any corruption.

   Which `sneaker' disk are you referring to?  RA-series disks -- you say
   HSC, so I'm assuming RAs --  are quite a bit different from SCSI here,
   as they can be mounted only from one controller or the other.

   I removed one of the major sets of RA-series `sneaker' disks used here
   in OpenVMS engineering last year...

   There is nothing to prevent this configuration from working as expected.
   With HSJs in the same allocation class, one can operate fairly well.
   But there is every reason to assume a small user screwup will massively
   corrupt the user and system disks, and probably at some critical time
   in the user's operations environment.  (And with the multi-host SCSI
   configuration, bootstrapping multiple systems off the same SYSn root
   is trivially easy, and extremely dangerous...)

   We _have_ a solution to this problem -- the VMScluster.