
Conference cookie::raid_software_for_openvms

Title:RAID Software for OpenVMS
Notice:READ IMPORTANT NOTE IN 3.15, V2.4 SSB Kit in 3.176
Moderator:COOKIE::FROEHLIN
Created:Fri Dec 03 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:341
Total number of notes:1378

324.0. "RAID5 config with problems?" by TAGEIN::GRUENWALD () Fri Feb 28 1997 03:11

    Hi,
    
    I have a few questions about this configuration
    
     -------------                --------------               -----------
    |         VAX |   cluster-   |        ALPHA | local SCSI  |           |
    | vax 4500    |--------------| 8400         |-------------| RAID sets |
    | VMS v6.2    | interconnect | VMS v6.2-1H3 |  (KZMSA)    |           |
    | swRAID v2.3 | (DSSI, Eth)  | swRAID v2.3  |             |           |
     -------------                --------------               -----------
    
    . All firmware was updated to the latest version (FW update v3.8).
    . The RAID sets were bound from the ALPHA after booting both systems,
      and the sets were successfully used from both systems.
    . While the ALPHA was stopped and rebooted, the other node (VAX) remained
      blocked (this is normal - quorum lost) and resumed its activity
      afterwards.
    
    In the configuration above, if cluster member ALPHA fails (simulated with
    Ctrl/P and INIT) and comes back, the first members of the RAID sets are
    replaced by disks from the spareset. The reconstruct procedure completes
    without problems. There were NO writes to the sets during this operation!
    
    Without the cluster (ALPHA as a single VMS system with its SCSI storage) the
    same action (Ctrl/P / INIT / boot) doesn't cause such a failover in the RAID set.
           
    The customer doesn't accept this behaviour. (He states that on his other
    site - with the same configuration - the data were corrupted when he
    removed and brought back one of the nodes and this reconstruct took place.
    He had run AUTOGEN with the REBOOT option on the VAX when the data
    corruption occurred.)
    
    I asked for a copy of RAID$DIAGNOSTICS_*_NODE.LOG, RAID5_BIND.COM,
    RAID5_INIT.COM, ERRLOG.SYS and OPERATOR.LOG from both systems.
    
    Is the above configuration for raid software a supported one?
    
    What is the expected behaviour for a RAID set used from both nodes if one
    node crashes or just leaves the cluster? Does a special action have to be
    taken before bringing the failed node back?
    
    regards and thanx in advance
    
    Michael
324.1. by COOKIE::FROEHLIN (Let's RAID the Internet!) Fri Feb 28 1997 09:24
    Michael,
    
    are you sure the VAX is running V2.3 of RAID? Any cluster configuration
    is a legal RAID Software configuration (see the SPD for details). In this
    scenario none of the members should have been removed with V2.3 of RAID.
    
    I'll run a test here.
    
    Guenther
324.2. "RAID5 config with problems?" by TAGEIN::GRUENWALD () Mon Mar 03 1997 06:01
    Guenther,
    
    > are you sure the VAX is running V2.3 of RAID?
    
    Yes, we are. That customer site is located in Hungary. I just spoke to
    the colleague there. He will get the data I asked for.
    
    There weren't any ECOs installed!
    
    Regards
    
    Michael
324.3. by COOKIE::FROEHLIN (Let's RAID the Internet!) Mon Mar 03 1997 16:54
    Michael,
    
    I could, to my surprise, reproduce this behavior here. Have to find out
    where RAID lost its mind and does this.
    
    Guenther
324.4. by COOKIE::FROEHLIN (Let's RAID the Internet!) Tue Mar 04 1997 11:37
    There's a quick solution to that. Disable the timeout action on all
    RAID 5 arrays with:
    
    	$ RAID MODIFY/NOTIMEOUT array-ID
    
    /TIMEOUT is the default and allows the driver to make a decision to
    remove a member device if an I/O to the disk driver did not return
    within timeout seconds (default is 30 sec.).
    
    The disadvantage of using /NOTIMEOUT is that a RAID 5 array may hang
    unnecessarily. Assume member devices are connected to different
    controllers and just one member device becomes unavailable. With
    /NOTIMEOUT the RAID driver waits for the device to become accessible
    again, which may take minutes or hours. Until then the whole array is
    inaccessible. With /TIMEOUT=n in place the array would be reduced after
    n seconds and be accessible again. This is a tradeoff the users have
    to make.
    
    Instead of choosing /NOTIMEOUT they can use /TIMEOUT=n with n large
    enough so that a serving node can reboot.
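
    For illustration, something like this should do (array-ID is only a
    placeholder for the real array name, and 300 seconds is only an example
    value, not a recommendation):

    	$! Turn the timeout action off completely - no automatic member removal:
    	$ RAID MODIFY/NOTIMEOUT array-ID
    	$!
    	$! Or keep the timeout action but make it long enough for a serving
    	$! node to reboot, e.g. 300 seconds instead of the default 30:
    	$ RAID MODIFY/TIMEOUT=300 array-ID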
    
    Guenther
324.5. "SCSI patch solved the problem" by BPSTGA::TORONY (Gábor Tornyossy) Wed Mar 05 1997 07:18
Guenther,

I assume that the SCSI patch (ALPSCSI02_070) solved the problem. 

Installing the patches suggested by Michael step by step, when we arrived at that
patch this "member_0_reconstructing" behaviour disappeared (there were
writes to that RAID set while breaking the ALPHA, ...).

  Some additional things: at the office I set up a test environment where
  (without any patches) I couldn't reproduce the problem.

  My environment: a DEC 3000/300 with a TURBOchannel SCSI adapter (PMAZB) and a
  VAX 3100/78 in an NI cluster (the software versions were the same as at the
  customer). The differences between this and the customer site are the machines,
  the PCI/SCSI hardware (KZPSA), and the DSSI as an additional cluster interconnect.

It taught me a lesson: we should somehow be able to declare which patches belong
where before a problem arises (i.e. which are the MUP-like patches). We know (or
believe we know) that the patches exist, and there are summaries for the different
operating system versions, but there are too many to read through them all...
How do you solve this? Can you help?

Thanks,
Gábor
(the engineer involved in Hungary)
324.6. by nabsco::FROEHLIN (Let's RAID the Internet!) Wed Mar 05 1997 09:33
    Gábor,
    
    the problem mentioned by Michael in .0 has nothing to do with hardware
    or patches. Let me explain:
    
    RAID$DPRIVER does RAID member management. If an I/O to the underlying
    driver (DS/DU/DKdriver) returns with an error and the RAID 5 array is
    in normal state, the member will be removed. When a disk server
    disappears with outstanding I/Os, the I/Os are not returned while
    the disk has entered mount verification. 
    
    But RAID$DPRIVER can time out the I/Os if told to do so (RAID
    MODIFY/TIMEOUT=n). An error status returned for a member disk I/O starts
    a removal request for an array. The driver function goes through all
    members of an array with such a request and checks members 0 to n. If
    all disks of this array have been served by a disappearing disk server
    and there were active I/Os, it is likely that more than one member has
    a removal request.
    
    The driver function starts with the first member and, if the array is
    in normal state, removes it. Then the driver checks the next member.
    But since we are reduced now, no further member can be removed and
    the DPA device enters mount verification.
    
    Starting with V2.3 the driver now waits for 2 seconds before working on
    a removal request. If more than 1 member needs to be removed the driver
    skips any removal request. But if after 2 seconds there's exactly one
    member with a removal request it will be removed. If there have been no
    I/Os at all during the time the disk server reboots, no member will be
    removed.
    
    Hope this helps!
    
    Guenther
324.7. by BPSTGA::TORONY (Gábor Tornyossy) Thu Mar 06 1997 10:23
Guenther,


thank you for the info. It's like a "technical liberal education" on the topic
of software RAID. This is why one reads the notes. I'm serious.

But it holds the problem (and the customer) in a state of uncertainty:
  As of .1 - it should work without reconstructing,
 .2 - no (= it is a feature),
 .4 - use /NOTIMEOUT. ---> Changing it dynamically? Too strange to leave it to
      the customer without giving any suggestions.
 .6 - it's obvious that it removes the first members... ---> How is it to be
      used in a cluster environment then anyhow?

Still not clear: is it raining or the sun is shining? We smile, o.k. but in a
swimming dress or under an umbrella?

.What are the expectations in a cluster environment? Is it normal that when one
cluster member comes back (for example after an AUTOGEN/reboot) while the other
node keeps working, the RAID set will be reconstructing? Yes or no, and how to handle it?

.There were tests with and without that patch (or patches) causing changes in
the behaviour. Right - nothing to do with that, then why. And let's say to the
customer that although the problem seems to have disappeared nothing has been
solved - so don't use it? What's the answer to his next obvious question? 


The customer has decided to use it (after the successful test, as we said it's
okay). What will happen during the next reboot? 

This is all the more important since he is about to decide whether to leave or
keep Digital customer services and, what is more, the platform!


Regards,
Gábor
324.8. by COOKIE::FROEHLIN (Let's RAID the Internet!) Thu Mar 06 1997 11:58
    Gábor,
    
>Still not clear: is it raining or the sun is shining? We smile, o.k. but in a
>swimming dress or under an umbrella?
    
    If it's raining reconstructs, use the /NOTIMEOUT umbrella to bring out
    your smile ;-).

>.What are the expectations in a cluster environment? Is it normal that when one
>cluster member comes back (for example after an AUTOGEN/reboot) while the other
>node keeps working, the RAID set will be reconstructing? Yes or no, and how to handle it?
    
    The point is not a general cluster member but a disk server. Some
    cluster nodes might serve their local disks to other nodes and are
    therefore disk servers as well. Rebooting such a "disk server" with
    active I/Os to the disks will cause the disks (I'm talking physical disks
    like the RAID set members) to enter mount verification. RAID software
    has a special feature built-in for RAID 5 arrays which can tolerate the
    loss of one member. This feature is a timeout of disk I/Os which is by
    default turned on. The idea behind this is to remove a hindering member 
    quickly and continue with the remaining members instead of stalling
    the whole RAID set. Most customers might not need/like this feature
    and therefore can turn it off dynamically at any time for specific arrays
    using the RAID MODIFY command.
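
    As an illustration only (the array names ARRAY_A and ARRAY_B are made up,
    and 30 seconds is simply the default mentioned above), turning the feature
    off for one array while keeping it on for another could look like this:

    	$! Array whose members are served by another cluster node:
    	$! never remove a member on an I/O timeout
    	$ RAID MODIFY/NOTIMEOUT ARRAY_A
    	$! Array on local controllers only: keep the 30-second timeout action
    	$ RAID MODIFY/TIMEOUT=30 ARRAY_B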

>.There were tests with and without that patch (or patches) causing changes in
>the behaviour. Right - nothing to do with that, then why. And let's say to the
    
    Then why WHAT?
    
>customer that although the problem seems to have disappeared nothing has been
>solved - so don't use it? What's the answer to his next obvious question? 
    
    Patches are typically early point fixes to problems. If a system has 
    a severe impact caused by such a problem then the patch should be
    installed. Otherwise wait until the patch has been incorporated into the
    next release of the product (e.g. OpenVMS) and has hence passed a full 
    qualification test.

>The customer has decided to use it (after the successful test, as we said it's
>okay). What will happen during the next reboot? 
    
    What is expected?

>This is all the more important since he is about to decide whether to leave or
>keep Digital customer services and, what is more, the platform!
    
    Because of the RAID reconstruct issue?
    
    Or did they have so many little flames started in the past and now it's
    a wildfire the customer thinks we, Digital, cannot extinguish?
    
    Guenther
324.9. by BPSTGA::TORONY (Gábor Tornyossy) Fri Mar 07 1997 06:57
Guenther,

I enjoy that you joined in. It's unfortunate that because of geographical
reasons it cannot be deepened beside a mug of beer.

The customer wants this software raid not to recognise server shutdown/reboot
(not to rebuild the set) moreover not to make any failure in the filesystem on
it. We can either recommend the software RAID that was sold or advise against it.
To be more precise, we have to give them a procedure for using it safely while
meeting their expectations.

Therefore let me summarize, if I understood your reply correctly (a minimal
sketch of such a procedure in DCL follows below):
. If you have a cluster (like in .0) and you have to shutdown/reboot one
  member (the server of the disks in question) while the other should keep
  running (the process doing the I/O to that disk will hang - no problem):
  - if you don't want to allow the removal of the first member (even if you
    have a spare disk), then switch both drivers (on both involved nodes) not
    to react to such events (/NOTIMEOUT) and switch them back again once the
    disks are online again, or
  - you may live with the state change in your RAID set (use of a spare, ...).
. If there is a crash-like event, removal of the first member will take
  place (the actions are the same as in the line above).
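
A minimal DCL sketch of such a procedure, run on both involved nodes (the array
name USER_ARRAY and the 30-second value are placeholders only; the qualifiers
are the ones from .4):

	$! Step 1 - before shutting down / rebooting the disk-serving node:
	$ RAID MODIFY/NOTIMEOUT USER_ARRAY
	$!
	$! Step 2 - shutdown/reboot of the serving node takes place here
	$!
	$! Step 3 - once the member disks are online again, restore the default:
	$ RAID MODIFY/TIMEOUT=30 USER_ARRAY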

If this is the case, the behaviour with or without the patch is only of technical
interest: what changes in the SCSI-related driver code make the RAID software
react differently.

Thanks a lot,
Gábor

(True, the last sentence - about the customer's decision to leave Digital
entirely - turned out too theatrical. The story is grotesque, but
Digital's role (local and general) and conscience are quite clear... I just wanted to
underline the importance of the problem.)
324.10. by COOKIE::FROEHLIN (Let's RAID the Internet!) Fri Mar 07 1997 09:25
    Gábor,
    
>reasons it cannot be deepened beside a mug of beer.
    
    Czech mana...ah!

>To be more precise, we have to give them a procedure for using it safely while
>meeting their expectations.
    
    I assume the /NOTIMEOUT does it. Your summary is correct. Use
    /NOTIMEOUT and no reconstructs happen when a disk server reboots, whether
    caused by a crash or by a shutdown/reboot.

>(not to rebuild the set) moreover not to make any failure in the filesystem on
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^
    
    You didn't mention this before. Any more details?
    
    
>interest: what changes in the SCSI-related driver code make the RAID software
>react differently.
    
    Nothing I could think of. The RAID driver just fabricates one or more I/O
    request packets for the real disk driver (DU/DS/DKdriver) and queues them
    to its I/O queue. It's up to the real disk driver to perform the I/O. The
    RAID driver just orchestrates that.

>Digital's role (local and general) and conscience are quite clear... I just wanted to
>underline the importance of the problem.)
    
    I want to get this customer back on the road with his computing
    equipment...Digital Equipment.
    
    Guenther