Conference cookie::raid_software_for_openvms

Title:RAID Software for OpenVMS
Notice:READ IMPORTANT NOTE IN 3.15, V2.4 SSB Kit in 3.176
Moderator:COOKIE::FROEHLIN
Created:Fri Dec 03 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:341
Total number of notes:1378

323.0. "Mountverify Problem V2.3" by RULLE::LINDSTROM_S () Tue Feb 25 1997 09:04

	This weekend Ericsson Radio in Sweden reconfigured one of their
	HSZ40 controllers. First they went from HSOF V2.7 to V3.0.
	Then they spread the disks over the channels and set up the
	disks from scratch. The HSZ40 doesn't do any shadowing or striping.
	It's done by the host, with RAID software V2.3.
	They have some stripe sets containing 5 (2 member) shadowsets and
	one 6 (2 member) shadowsets. It's just the stripeset with 6
	members that doesn't work. It's called DPA2 and 3 times it went
	into 'Mount verification'. When this happened none of the copy
	operations were completed.
	There is no information in the errorlog except for the 'Mount 
	verification' message. The 5 member stripeset works without problem.
	The 3 failures with DPA2 were within 24 hours so what we did was to 
	reduce the DSA-devices to 1 member. Now the stripeset with 6 (1 member)
	shadowsets works fine but without any security.
	The system is an AS2100 5/250 with VMS V6.2-1H3, ALPF11X03_0070,
	ALPSCSI02_070, ALPSHAD05_062.

	Why do we get these problems?
	Is there a limitation with 12 disks in a raid 0+1 set?
	Could we have a quota problem with the Raid-server process?
	 
323.1. "" by COOKIE::FROEHLIN (Let's RAID the Internet!) Tue Feb 25 1997 09:25
    RAID Software uses a sys$mount call to mount the shadow sets. From that
    point on RAID software only knows about the DSA devices as members of a
    RAID set. All I/Os are sent to the shadowing driver by RAID$DPDRIVER.
    The RAID server is not involved in any shadowing member management. It
    is all done by shadowing.

>	one 6 (2 member) shadowsets. It's just the stripeset with 6
>	members that doesn't work. It's called DPA2 and 3 times it went
    
    Any differences to the other RAID set (e.g. disk device types)?
    
    What are the values for SHADOW_MAX_COPY and SHADOW_MBR_TMO on all
    nodes?
    
>	reduce the DSA-devices to 1 member. Now the stripeset with 6 (1 member)
>	shadowsets works fine but without any security.
    
    Did they try to add one member to the shadow sets at a time with
    RAID ADD/SHADOW? I mean adding a 2nd member to the first shadow set,
    waiting until the copy has completed, and so on.

> 	Why do we get these problems?
    
    What's in the OPERATOR.LOG and ERRLOG.SYS related to either the
    controllers, the shadow set members or the shadow sets?
    
>	Is there a limitation with 12 disks in a raid 0+1 set?
    
    No!
    
>	Could we have a quota problem with the Raid-server process?
    
    No!
	 
    Guenther

323.2. "More info" by RULLE::LINDSTROM_S () Wed Feb 26 1997 12:06
	The parameters you asked about are as follows:
	SHADOW_MBR_TMO  = 20
	SHADOW_MAX_COPY = 4

	The customer is trying tonight to add 1 disk at a time.
	There is nothing in the errorlog other than the entries telling
	us that DPA2 went offline into mount verification.
	According to operator.log all 6 DSA devices were added into DPA2.
	The first hour after the reboot went without errors and 2 of the
	shadow copy operations were completed. Then there were a couple of
	mount verifications that came back online again. From then on DPA2
	was logging about 200 errors/5 min. Then it got stuck in mount 
	verification until 'Mount verify timeout'.
	This raid set is made of RZ29B with 0014 and 0016 microcode only.
	We haven't seen any indication of problems with the HSZ40, disks or
	the DSA devices. The only symptom is that DPA2 is Offline and goes
	into mount verification.
	The 6 other raidsets with 5 dsa disk members are working without 
	any problem and this 6 dsa-disk raid set has worked all day
	with just 1 shadowmember.

	If there is any trouble with H/W or shadowing, why doesn't it show?
	Have we failed to look in the right places?
	
	Regards	Sten Lindstrom
		CSC Sweden

323.3. "" by AMCFAC::RABAHY (dtn 471-5160, outside 1-810-347-5160) Wed Feb 26 1997 12:27
Are the HSZ40's in dual redundant pairs?  Is the PREFER command being used?

323.4. "Single controller" by RULLE::LINDSTROM_S () Wed Feb 26 1997 14:13
	The HSZ40 is a single controller.

	By the way, how do I get the new shadowing code (note 3.15) used in V7.1?
	Did I get it by installing ALPSHAD05_062?

	Sten.

323.5. "" by COOKIE::FROEHLIN (Let's RAID the Internet!) Wed Feb 26 1997 14:44
    Sten,
    
    the TIMA patch kit for V7.1 has not been released yet. Is your customer
    running straight V6.2 or the V6.2 Compatibility Kit (which goes with
    V7.1)?
    
    DPA2 is going into mount verification and all DSA devices look ok from
    a SHOW DEVICE? Then I'm puzzled. If this happens (DPA device in mount
    verification) can they still do this:
    
    	$ DUMP DSAnnn:[000000]RAID$BC1.SYS/BLOCK=(COUNT:1)
    
    for each shadow set in this RAID set?
    
    What about a "RAID ANALYZE/ERROR/OUTPUT=..."? Any entries? Is this
    truly a standalone Alpha?
    
    Guenther

323.6. "" by COOKIE::FROEHLIN (Let's RAID the Internet!) Wed Feb 26 1997 14:49
    Still need to know the configuration. The RAID sets which work, are
    the disks connected to the same controller? Could it be that just one
    (maybe more) shadow set is having a problem? Keep in mind that if just
    one member in the RAID set has a problem the DPA device can enter mount
    verification whenever we hit the faulty RAID set member.
    
    If they can still "play" with these disk devices they should try to
    DCL INITIALIZE the disks and mount them from DCL in exactly the same
    pairs as they use them in the RAID set and do some copies to/from these
    shadow sets. Maybe this helps to isolate the faulty shadow set
    (assuming this is the case).
    
    Guenther

323.7. "More info again" by RULLE::LINDSTROM_S () Thu Feb 27 1997 04:59
	Last night the customer added 1 member at a time and now the
	raid set is complete. We will see what happens today but he feels
	a little bit more confident now. Here is the disk configuration.

	There are some errors logged since last boot. These errors came
	up when he did a HSZ>DELE UNIT and the controller restarted. But
	this is another problem I think. After that HSZ restart we still had
	the same problem with the DPA2 device.

	When he first came up after the reconfiguration of the HSZ all the
	disks had to do a shadow full copy. Could the heavy load be part
	of the problem? 

	Sten


*******************************************************************************

$ mc sysman para sho vaxc
%SYSMAN-I-USEACTNOD, a USE ACTIVE has been defaulted on node GAER13
 Node GAER13:   Parameters in use: ACTIVE
Parameter Name            Current    Default    Minimum    Maximum Unit  Dynamic
--------------            -------    -------    -------    ------- ----  -------
VAXCLUSTER                      0          1          0          2 Coded-value  
 
$ sho dev dp
 
Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DPA0:         (GAER13)  Online               0
DPA2:         (GAER13)  Mounted              0  DISK2          4650639   105   1
DPA3:         (GAER13)  Mounted              0  DISK3          8717625   101   1
DPA4:         (GAER13)  Mounted              0  DISK4          7861586    46   1
DPA8:         (GAER13)  Offline              0

******************************************************************************* 

$ raid sho disk2
 
 StorageWorks(TM) RAID Software V2.3     Display Time: 27-FEB-1997 10:36:23.85
 Copyright Digital Equipment Corporation 1993-1996. All Rights Reserved.
 
 RAID Array Parameters:
 
         Current RAID Array ID:    DISK2
         Permanent RAID Array ID:  DISK2
         RAID Level:               0+1
         Current State:            NORMAL
 
 RAID Array Configuration:
 
         Member                               ShadowSet      ShadowSet
     Index    Name               State         Members         State
     -----   ------              -----        ---------      ---------
       0     _DSA21:             NORMAL           2         SteadyState
       1     _DSA22:             NORMAL           2         SteadyState
       2     _DSA23:             NORMAL           2         SteadyState
       3     _DSA24:             NORMAL           2         SteadyState
       4     _DSA25:             NORMAL           2         SteadyState
       5     _DSA26:             NORMAL           2         SteadyState
 
  Virtual
    Unit        Size    Status         Reads       Writes       Errors
  -------      ------  --------        -----       ------       ------
  DPA0002:   50188312  ACCESS        4187604       713687            0

******************************************************************************* 
 
$ raid sho disk3
 
 StorageWorks(TM) RAID Software V2.3     Display Time: 27-FEB-1997 10:36:30.44
 Copyright Digital Equipment Corporation 1993-1996. All Rights Reserved.
 
 RAID Array Parameters:
 
         Current RAID Array ID:    DISK3
         Permanent RAID Array ID:  DISK3
         RAID Level:               0+1
         Current State:            NORMAL
 
 RAID Array Configuration:
 
         Member                               ShadowSet      ShadowSet
     Index    Name               State         Members         State
     -----   ------              -----        ---------      ---------
       0     _DSA31:             NORMAL           2         SteadyState
       1     _DSA32:             NORMAL           2         SteadyState
       2     _DSA33:             NORMAL           2         SteadyState
       3     _DSA34:             NORMAL           2         SteadyState
       4     _DSA35:             NORMAL           2         SteadyState
 
  Virtual
    Unit        Size    Status         Reads       Writes       Errors
  -------      ------  --------        -----       ------       ------
  DPA0003:   41824792  ACCESS        2097262      1079406            0

******************************************************************************* 
 
 $ raid sho disk4
 
 StorageWorks(TM) RAID Software V2.3     Display Time: 27-FEB-1997 10:36:35.16
 Copyright Digital Equipment Corporation 1993-1996. All Rights Reserved.
 
 RAID Array Parameters:
 
         Current RAID Array ID:    DISK4
         Permanent RAID Array ID:  DISK4
         RAID Level:               0+1
         Current State:            NORMAL
 
 RAID Array Configuration:
 
         Member                               ShadowSet      ShadowSet
     Index    Name               State         Members         State
     -----   ------              -----        ---------      ---------
       0     _DSA41:             NORMAL           2         SteadyState
       1     _DSA42:             NORMAL           2         SteadyState
       2     _DSA43:             NORMAL           2         SteadyState
       3     _DSA44:             NORMAL           2         SteadyState
       4     _DSA45:             NORMAL           2         SteadyState
 
  Virtual
    Unit        Size    Status         Reads       Writes       Errors
  -------      ------  --------        -----       ------       ------
  DPA0004:   41824792  ACCESS        1722124       550852            0

******************************************************************************* 

$ sho dev d
 
Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA0:                   Mounted              0  SYSDISK        3927375   453   1
DSA1:                   Mounted              0  DISK1          6619482    89   1
DSA5:                   Mounted              0  DISK5           845068   101   1
DSA6:                   Mounted              0  DISK6          3543084    60   1
DSA7:                   Mounted              0  DISK7          7740180     3   1
DSA21:                  Mounted              0  DISK20000000        32     2   1
DSA22:                  Mounted              0  DISK20000001        32     2   1
DSA23:                  Mounted              0  DISK20000002        32     2   1
DSA24:                  Mounted              0  DISK20000003        32     2   1
DSA25:                  Mounted              0  DISK20000004        32     2   1
DSA26:                  Mounted              0  DISK20000005        32     2   1
DSA31:                  Mounted              0  DISK30000000        32     2   1
DSA32:                  Mounted              0  DISK30000001        32     2   1
DSA33:                  Mounted              0  DISK30000002        32     2   1
DSA34:                  Mounted              0  DISK30000003        32     2   1
DSA35:                  Mounted              0  DISK30000004        32     2   1
DSA41:                  Mounted              0  DISK40000000        32     2   1
DSA42:                  Mounted              0  DISK40000001        32     2   1
DSA43:                  Mounted              0  DISK40000002        32     2   1
DSA44:                  Mounted              0  DISK40000003        32     2   1
DSA45:                  Mounted              0  DISK40000004        32     2   1
DPA0:         (GAER13)  Online               0
DPA2:         (GAER13)  Mounted              0  DISK2          4650541   105   1
DPA3:         (GAER13)  Mounted              0  DISK3          8717830   103   1
DPA4:         (GAER13)  Mounted              0  DISK4          7861586    46   1
DPA8:         (GAER13)  Offline              0
$1$DKA0:      (GAER13)  Online               0
$1$DKA100:    (GAER13)  ShadowSetMember      0  (member of DSA1:)
$1$DKA200:    (GAER13)  ShadowSetMember      0  (member of DSA6:)
$1$DKA300:    (GAER13)  ShadowSetMember      0  (member of DSA7:)
$1$DKA400:    (GAER13)  ShadowSetMember      0  (member of DSA5:)
$1$DKA600:    (GAER13)  Online               0
$1$DKB0:      (GAER13)  ShadowSetMember      0  (member of DSA0:)
$1$DKB100:    (GAER13)  ShadowSetMember      0  (member of DSA1:)
$1$DKB200:    (GAER13)  ShadowSetMember      0  (member of DSA6:)
$1$DKB300:    (GAER13)  ShadowSetMember      0  (member of DSA7:)
$1$DKB400:    (GAER13)  ShadowSetMember      0  (member of DSA5:)
$1$DKC0:      (GAER13)  ShadowSetMember      1  (member of DSA21:)
$1$DKC1:      (GAER13)  ShadowSetMember      2  (member of DSA26:)
$1$DKC2:      (GAER13)  ShadowSetMember      1  (member of DSA31:)
$1$DKC3:      (GAER13)  ShadowSetMember      4  (member of DSA41:)
$1$DKC4:      (GAER13)  ShadowSetMember      0  (member of DSA45:)
$1$DKC5:      (GAER13)  ShadowSetMember      1  (member of DSA22:)
$1$DKC6:      (GAER13)  ShadowSetMember      2  (member of DSA21:)
$1$DKC7:      (GAER13)  ShadowSetMember      1  (member of DSA32:)
$1$DKC100:    (GAER13)  ShadowSetMember      1  (member of DSA31:)
$1$DKC101:    (GAER13)  ShadowSetMember      2  (member of DSA42:)
$1$DKC102:    (GAER13)  ShadowSetMember      1  (member of DSA23:)
$1$DKC103:    (GAER13)  ShadowSetMember      2  (member of DSA22:)
$1$DKC104:    (GAER13)  ShadowSetMember      6  (member of DSA33:)
$1$DKC105:    (GAER13)  ShadowSetMember      1  (member of DSA32:)
$1$DKC106:    (GAER13)  ShadowSetMember      1  (member of DSA43:)
$1$DKC107:    (GAER13)  ShadowSetMember      1  (member of DSA24:)
$1$DKC200:    (GAER13)  ShadowSetMember      2  (member of DSA23:)
$1$DKC201:    (GAER13)  ShadowSetMember      2  (member of DSA34:)
$1$DKC202:    (GAER13)  ShadowSetMember      4  (member of DSA33:)
$1$DKC203:    (GAER13)  ShadowSetMember      1  (member of DSA44:)
$1$DKC204:    (GAER13)  ShadowSetMember      1  (member of DSA25:)
$1$DKC205:    (GAER13)  ShadowSetMember      2  (member of DSA24:)
$1$DKC206:    (GAER13)  ShadowSetMember     10  (member of DSA35:)
$1$DKC207:    (GAER13)  ShadowSetMember      2  (member of DSA34:)
$1$DKC300:    (GAER13)  ShadowSetMember      2  (member of DSA42:)
$1$DKC301:    (GAER13)  ShadowSetMember      1  (member of DSA44:)
$1$DKC302:    (GAER13)  ShadowSetMember      1  (member of DSA26:)
$1$DKC303:    (GAER13)  ShadowSetMember      2  (member of DSA25:)
$1$DKC304:    (GAER13)  ShadowSetMember      9  (member of DSA35:)
$1$DKC305:    (GAER13)  ShadowSetMember      7  (member of DSA41:)
$1$DKC306:    (GAER13)  ShadowSetMember      1  (member of DSA45:)
$1$DKC307:    (GAER13)  ShadowSetMember      1  (member of DSA43:)
$1$DVA0:      (GAER13)  Online               0
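
The per-member error counts in the listing above are easier to compare when totalled per shadow set. The following is only a sketch in Python and was not posted in the thread: the function name `errors_by_shadow_set`, the regular expression, and the four-line `SAMPLE` excerpt are illustrative assumptions about the `SHOW DEVICE` line layout shown here. As noted in 323.7, some of these errors were logged when the controller restarted, so the totals are not necessarily evidence of a faulty shadow set.

```python
import re
from collections import defaultdict

# Assumed layout of a shadow set member line, as it appears in the listing:
# $1$DKC0:      (GAER13)  ShadowSetMember      1  (member of DSA21:)
MEMBER_LINE = re.compile(
    r"^(?P<member>\S+):\s+\(\w+\)\s+ShadowSetMember\s+(?P<errors>\d+)\s+"
    r"\(member of (?P<shadow>DSA\d+):\)"
)


def errors_by_shadow_set(listing):
    """Sum the Error Count column of member lines, keyed by shadow set name."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        match = MEMBER_LINE.match(line.strip())
        if match:  # non-member lines (DSA, DPA, headers) are skipped
            totals[match.group("shadow")] += int(match.group("errors"))
    return dict(totals)


# Four member lines copied from the listing above, used as sample input.
SAMPLE = """\
$1$DKC0:      (GAER13)  ShadowSetMember      1  (member of DSA21:)
$1$DKC6:      (GAER13)  ShadowSetMember      2  (member of DSA21:)
$1$DKC206:    (GAER13)  ShadowSetMember     10  (member of DSA35:)
$1$DKC304:    (GAER13)  ShadowSetMember      9  (member of DSA35:)
"""

if __name__ == "__main__":
    for shadow, count in sorted(errors_by_shadow_set(SAMPLE).items()):
        print(shadow, count)
```

Feeding the whole `sho dev d` output to the function instead of `SAMPLE` would give one total per DSA device, which can then be compared across the members of DPA2, DPA3 and DPA4.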




323.8. "" by COOKIE::FROEHLIN (Let's RAID the Internet!) Thu Feb 27 1997 09:59
    So all disks are connected to the very same controller. How about which
    drives are in which shelf (SCSI bus)?
    
    What about entries from "RAID ANALYZE/ERRORLOG SYS$ERRORLOG/OUTPUT=..."?
    
    Guenther

323.9. "Is the CDDB reinit count going up?" by VMSSG::JENKINS (Kevin M Jenkins VMS Support Engineering) Thu Feb 27 1997 11:38
    Check the CDDB reinit count to see if the controller is breaking
    the connection. Use SDA: SHOW DEV DUAxx. The second screen should
    have the CDDB data; the reinit field is in the middle section.
    If the count is going up, then you may be seeing some sort of
    controller related load situation.
    
    Kevin
    
323.10. "No more problems" by RULLE::LINDSTROM_S () Thu Mar 06 1997 02:37
	Since we got the raid set established there haven't been any more
	problems. There has been no indication of H/W errors anywhere so
	I don't know what conclusions to draw. We are closing this for now.
	But thanks for your support anyway.

	Sten.

323.11. "Blame DUDRIVER" by ESSB::JNOLAN (John Nolan) Mon Mar 17 1997 14:58
    
     Having experienced similar problems (though with HSJs) I'd put the
    blame on DUDRIVER. VAX(ALP)DRIV01_070 contains a fix for shadow sets
    going into mount verification and not coming out of it. Needless to say
    DRIV04 is an even better kit to have than any of the earlier ones
    (read the release notes for it and be thankful that you have HSZs
     and not HSJs, as I've experienced both the shadow set mount verify
     and the other problem mentioned)! 

323.12. "DRIV04 good idea, but not likely a fix" by VMSSPT::JENKINS (Kevin M Jenkins VMS Support Engineering) Tue Mar 18 1997 05:59
    
    The DRIV04 kit would be a good thing to have, however the MountVerify
    problem that was fixed in DRIV01 happened only on "idle" connections
    and would go away when any load was put on the devices. This problem
    seems to happen only under load. This makes me wonder about something
    in the controller or drives that doesn't like a heavy load.
    
    Kevin