T.R | Title | User | Personal Name | Date | Lines |
---|
590.1 | disk not found | DECWET::LEES | Will, NTSG DECwest, Seattle | Thu Jan 30 1997 16:29 | 30 |
| Did you remove the old disk objects from the group before doing the upgrade?
Please verify in the Cluster Administrator that the old disk objects are not
still referenced in the group.
Do you still have the problem after the reboot, or has it gone away?
Will
<<< Note 590.0 by IJSAPL::ONDERWATER "Cor Onderwater @UTO" >>>
-< Problems with RAIDArray 310 >-
Hi,
After upgrading an AlphaServer 4100 cluster from two simple shared
disks to a RAIDArray 310 with two logical disks (a two-disk mirrorset,
a 4-disk RAID 3/5 set and a spare disk), manual failover sometimes
fails. The error message we get is:
Unknown error code 3254846469 (sev=3, fac=0x201, id=0x405)
CliFmMan TransferGroup=-1040120827
After this error the failover group (the RAID 3/5 set) cannot be brought
on-line again (same error message as above) from either system. After
rebooting the systems the cluster looks all right.
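The two numbers in the error message above appear to be the same 32-bit value printed once as unsigned and once as signed. A minimal sketch of the decoding, assuming the code follows the usual Windows NT status layout (2-bit severity, one customer bit, one reserved bit, 12-bit facility, 16-bit message id); the function name `decode_status` is illustrative and not part of the cluster software:

```python
def decode_status(code):
    """Split a 32-bit NT-style status value into its bit fields.

    Assumed layout: bits 31-30 severity, bit 29 customer flag,
    bit 28 reserved, bits 27-16 facility, bits 15-0 message id.
    """
    code &= 0xFFFFFFFF  # fold a signed (negative) rendering back to unsigned
    return {
        "hex": f"0x{code:08X}",
        "sev": code >> 30,
        "customer": (code >> 29) & 0x1,
        "fac": (code >> 16) & 0xFFF,
        "id": code & 0xFFFF,
    }


# The value from "Unknown error code ..." and the one from "TransferGroup=..."
unsigned_form = decode_status(3254846469)
signed_form = decode_status(-1040120827)

print(unsigned_form)
print(unsigned_form == signed_form)
```

Under that assumption both values decode to the sev, fac and id fields that the message itself prints, so the TransferGroup number carries no additional information.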
NT version: 3.51 with service pack 5
Digital Clusters for Windows NT 1.0 with service pack 1
Upgraded SCSI driver (from StorageWorks Nijmegen)
Is this a known problem? Is a solution available?
Cor
|
590.3 | interesting! | MSE1::PCOTE | Rebuilt NT: 163, Rebuilt VMS:1 | Fri Jan 31 1997 10:41 | 11 |
|
Hmmm, I have the exact same symptoms on an HSZ40, and the STEAM service
was running as well.
Question: Do you have HSZDISK 2.51 installed? Also, set
the FMlog verbosity to 6 to acquire more info in the fmlog
when the problem occurs (see the admin guide for info on
how to do this).
|
590.4 | RAIDArray 310 still fails at manual failover | IJSAPL::ONDERWATER | Cor Onderwater @UTO | Fri Jan 31 1997 11:28 | 13 |
| After stopping the STEAM services, the error mentioned in .0 occurred
again, i.e. the first manual failover succeeds but the second manual
failover (failback) fails. When no client is accessing a shared disk the
problem has not occurred (so far).
We reinstalled the cluster software, but the error is still there.
The RAIDArray disks show the following information at boot time:
"HSZ20 V30Z" (firmware revision?). Is this supported by the cluster
software?
Cor
|
590.5 | Yes HSZDISK 2.51 installed | IJSAPL::ONDERWATER | Cor Onderwater @UTO | Fri Jan 31 1997 12:42 | 3 |
| Reply to .3
Yes, hszdisk v2.51 is installed. AlphaBios version is 5.21
|
590.6 | | MSE1::PCOTE | Rebuilt NT: 163, Rebuilt VMS:1 | Fri Jan 31 1997 12:44 | 5 |
|
Yeah, V3.0 is supported but you should upgrade to patch level 2.
|
590.7 | please tell me where I can get V3.0 patch level 2 | IJSAPL::ONDERWATER | Cor Onderwater @UTO | Fri Jan 31 1997 12:56 | 5 |
|
Can you please tell me where I can get V3.0 patch level 2?
Cor
|
590.8 | note 495.6 | MSE1::PCOTE | Rebuilt NT: 163, Rebuilt VMS:1 | Fri Jan 31 1997 13:14 | 0 |
590.9 | >>set this id=(0,..) | COPCLU::JTHOMSEN | | Sat Feb 01 1997 09:23 | 13 |
| Hi!
I had a similar problem with an HSZ40 controller: a failover worked,
but failing back did not, and we had to reboot.
No IDs were set in the HSZ40; after setting them with
HSZ40>set this id=(0,1,2,3)
the problem seems to be solved. Maybe you have to do the same on your
controller?
Regards
Jan Thomsen
MCS Denmark
|
590.10 | Where comes that 3rd disk from???? | IJSAPL::ONDERWATER | Cor Onderwater @UTO | Sun Feb 02 1997 16:34 | 38 |
| Hi,
See last part of cluster trace file:
-----Start
16:49:04.627 tid=170 Step II: Identifying shared devices by
probing Dos physical drives
16:49:04.665 tid=170 Defined Dos Device:
PhysicalDrive3 ==> \Device\Harddisk3\Partition0
16:49:04.710 tid=170 The Cluster Disk Driver is not
attached to device PhysicalDrive3.
This could be because the cluster disk driver, CluDisk,
is not installed,
or because some other driver has already attached to
this device.
Please verify that any disk filter drivers start after
the CluDisk driver.
File: E:\CLUBUILD.351\src\fm\fmdisk\device.c Line: 321
16:49:04.800 tid=170 The Failover Manager encountered an
error or exception
while invoking a method function.
The Online operation on object FMDisk\_disk_0eb1e35b failed.
File: E:\CLUBUILD.351\src\fm\fmcore\fmgroup.c Line: 1446
16:49:04.882 tid=170 No such disk.
File: E:\CLUBUILD.351\src\fm\fmcore\fmgroup.c Line: 1447
16:49:04.965 tid=170 Putting group "Groep2" Offline
16:49:06.765 tid=170 The cluster manager has put
group Groep2 OFF LINE
on this system. Reason: Administrator request.
File: E:\CLUBUILD.351\src\fm\fmcore\fmgroup.c Line: 1701
------ end trace
It looks as if a third disk is discovered. The RAIDArray 310 only
offers 2. This happens when a manually failed-over group is manually
failed back.
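For longer trace files it can help to regroup the wrapped lines into one record per entry before reading them. A minimal sketch, assuming each entry starts with a `HH:MM:SS.mmm tid=<n>` header as in the excerpt above and that every other line continues the previous entry; `parse_trace` and the regular expressions are illustrative, not a tool shipped with the cluster software:

```python
import re

# Assumed entry header: "HH:MM:SS.mmm tid=<n> <message>".
ENTRY_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s+tid=(\d+)\s+(.*)$")
# Assumed source-location line: "File: <path> Line: <n>".
SOURCE_RE = re.compile(r"File:\s*(\S+)\s+Line:\s*(\d+)")


def parse_trace(text):
    """Group trace lines into entries with time, tid, message and source."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = ENTRY_RE.match(line)
        if header:
            entries.append({
                "time": header.group(1),
                "tid": int(header.group(2)),
                "message": header.group(3),
                "source": None,
            })
        elif entries:
            src = SOURCE_RE.search(line)
            if src:
                # Remember which source file and line reported the entry.
                entries[-1]["source"] = (src.group(1), int(src.group(2)))
            else:
                # Plain continuation line: append to the current message.
                entries[-1]["message"] += " " + line
    return entries


# Two entries copied from the trace above.
SAMPLE = r"""16:49:04.800 tid=170 The Failover Manager encountered an
error or exception
while invoking a method function.
The Online operation on object FMDisk\_disk_0eb1e35b failed.
File: E:\CLUBUILD.351\src\fm\fmcore\fmgroup.c Line: 1446
16:49:04.882 tid=170 No such disk.
File: E:\CLUBUILD.351\src\fm\fmcore\fmgroup.c Line: 1447"""

for entry in parse_trace(SAMPLE):
    print(entry["time"], entry["source"], entry["message"])
```

Sorting or filtering the resulting records by `source` makes it easier to see that the failed Online operation and the "No such disk." message come from adjacent lines of the same source file.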
Another thing: the clocks on the two systems differ by one hour. Can
this lead to problems?
Cor
|
590.11 | | MSE1::PCOTE | Rebuilt NT: 163, Rebuilt VMS:1 | Mon Feb 03 1997 14:23 | 13 |
|
<<< Note 590.10 by IJSAPL::ONDERWATER "Cor Onderwater @UTO" >>>
-< Where comes that 3rd disk from???? >-
Read note 390.3 and the release notes for the cause of the
'phantom disk'.
Also, the problem you're seeing (and I'm seeing) with a manual
failover and open files seems to be the root of the error message
that you noted in .0
More later,
|
590.12 | | MPOS01::naiad.mpo.dec.com::mpos01::cerling | I'[email protected] | Tue Feb 04 1997 10:31 | 17 |
|
I would guess our position to be that of waiting to see what
Microsoft does. Since our NT Clusters product only supports
StorageWorks, and no other vendor, I cannot see a monetary reason for
Digital to support EMC for Digital's clustering. Microsoft might feel
differently for follow-on Wolfpack. I doubt that it will be there
for V1.0 of Wolfpack, either.
Get your StorageWorks guy in there. Maybe if they really want
clusters, they will bite the bullet and give StorageWorks a toe
in the door in order for them to run clusters. Then they might
realize they are paying too much for EMC when they can get what
they want from StorageWorks. Pitch the benefits of clusters, but
allow the StorageWorks guy to counter any perceived shortcomings of
StorageWorks when compared to EMC.
tgc
|
590.13 | | MSE1::PCOTE | Rebuilt NT: 163, Rebuilt VMS:1 | Tue Feb 04 1997 10:42 | 14 |
|
I have logged the problem referenced by the base note. btw, this
has nothing to do with the SW310.
wrt EMC storage, Microsoft will provide a hardware qualification
suite via the HCT. Hardware vendors such as EMC will need to pass
this qualification suite to get the NT cluster (Wolfpack) stamp of
approval.
I'm sure EMC will pursue this. I'm sure Digital will not bother
to qual EMC storage for our (short lived) NT cluster product.
Paul
|