| Title: | + OpenVMS Clusters - The best clusters in the world! + |
| Notice: | This conference is COMPANY CONFIDENTIAL. See #1.3 |
| Moderator: | PROXY::MOORE |
| Created: | Fri Aug 26 1988 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 5320 |
| Total number of notes: | 23384 |
This note has been cross-posted in the OPTICAL notes conference.
There is a 3-node CI/NI cluster: an Alpha 4000, a DEC 7730, and a
DEC 7750 with an RW534 directly connected. After a node leaves and
rejoins the cluster, many jukebox devices go into the mount
verification timeout state. The customer had to reboot the whole
cluster to regain access to those jukebox devices.
According to the information from the customer, one event happened
in the following sequence:
- The customer shut down and rebooted the Alpha 4000 system.
- Many disks went into mount verification.
- Some time later, many of the mounted jukebox devices (13 out of the
  20 mounted devices) were found in mount verification timeout on the
  Alpha 4000.
- The same thing happened on the DEC 7750, which has the RW534
  directly connected.
- The DEC 7730 system was different: all mounted jukebox devices went
  into the mount verification timeout state. This is the busiest
  system in the cluster.
- The customer tried to dismount those MVTIMEOUT devices, but the
  process hung.
- The customer had to reboot the whole cluster.
VMS 6.2, OSMS 3.3-1
SYSGEN parameters: MVTIMEOUT: 3600
                   MSCP_LOAD: 1
                   MSCP_SERVE_ALL: 1
All the jukebox devices are served cluster-wide. There are 88
cartridges (176 logical units) in the optical disk library, and
usually only 20 are mounted.
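For reference, an interactive SYSGEN session along these lines can be
used to inspect and raise MVTIMEOUT; the value 7200 below is purely an
illustration, not a recommendation, and a permanent change should also
be placed in MODPARAMS.DAT so AUTOGEN preserves it:

```dcl
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW MVTIMEOUT          ! display the current value
SYSGEN> SET MVTIMEOUT 7200      ! illustrative value only
SYSGEN> WRITE ACTIVE            ! MVTIMEOUT is a dynamic parameter
SYSGEN> EXIT
```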
Questions:
1. Is this normal behavior for a cluster with jukebox devices?
2. How can we prevent this problem from happening?
   I plan to increase the MVTIMEOUT value, but I am not sure it will
   fix the problem, or whether there are any side effects.
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 5223.1 | | TAPE::SENEKER | Head banging causes brain mush | Tue Feb 04 1997 10:12 | 2 |
From the OSMS standpoint I have replied in OPTICAL, note 770.
Rob
| 5223.2 | mount/cluster without DNS/DFS? | HGOVC::CSCHAN | | Wed Feb 05 1997 02:53 | 53 |
I found an article in TIMA that quotes the "[RW5XX] Document...Use of
Clusters with Optical Software". I think it is a good reference,
but I have some queries:
5.1.D VMS MOUNT and DISMOUNT operations for each optical volume
will consume time equivalent to the traditional magnetic disk
volume VMS MOUNT and DISMOUNT, plus a per-volume swap time of
fifteen seconds plus the specified MINSWAP delay.
> Based on this information, should I set the SYSGEN parameter:
>
> MVTIMEOUT > (15 + MINSWAP) * number of mounted jukebox devices ?
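If that reading of 5.1.D is right, the bound can be worked out
directly. A small DCL sketch, assuming MINSWAP = 5 (the value from the
document's own example) and the 20 volumes usually mounted here:

```dcl
$ ! Assumed values: MINSWAP = 5 and 20 mounted jukebox devices.
$ MINSWAP = 5
$ MOUNTED = 20
$ BOUND = (15 + MINSWAP) * MOUNTED
$ SHOW SYMBOL BOUND             ! BOUND = 400
$ ! The current MVTIMEOUT of 3600 is already well above this bound,
$ ! which suggests the formula alone does not explain the timeouts.
```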
>
5.2 Use of OSMS and OSDS without decDFS and decDNS
5.2.B. Platters must not be mounted /SERVED nor /CLUSTER nor
/SYSTEM (which implies /SERVED). Served platters trigger
OpenVMS MSCP services which currently are not compatible with
removable storage.
> The cluster does not use decDFS and decDNS. Could those platters
> be mounted /CLUSTER or /SYSTEM?
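If section 5.2.B applies here, a platter would be mounted privately on
the node that needs it, with none of the qualifiers that imply MSCP
serving. A sketch; the device, label, and logical names are
placeholders, not the customer's actual names:

```dcl
$ ! Hypothetical node-local mount: no /CLUSTER, /SYSTEM, or /SHARE,
$ ! so the platter is not served to other cluster members.
$ MOUNT/NOWRITE JB$DKA1: PLATTER042 OPTVOL
```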
5.2.C. MOUNT and DISMOUNT operations may consume several hours
on larger autochangers with as many as 288 volumes. This time
may be reduced by mounting only as many volumes as are
absolutely required, by mounting volumes as /NOWRITE (readonly)
volumes where possible, by using MCR JBUTIL SET PARAMETER
/MINSWAP=5 to set the platter hold time to a minimum, and by
keeping as few files open as is necessary.
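The mitigations in 5.2.C translate to commands like the following;
only the JBUTIL line is taken verbatim from the quoted document, and
the MOUNT line uses placeholder device and volume names:

```dcl
$ ! Set the platter hold time to the minimum (per the document):
$ MCR JBUTIL SET PARAMETER /MINSWAP=5
$ ! Mount read-mostly platters read-only where possible:
$ MOUNT/NOWRITE JB$DKA2: ARCHIVE01 ARCVOL   ! placeholder names
```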
5.2.D. Cluster transitions must be avoided. Processors joining
an active cluster do not respect the "special" nature of optical
disks. This limitation is particularly true when the node
rejoining the cluster contains the interface adapter to the
autochanger and drives. Any cluster transition will cause all
disks with outstanding I/O (optical as well as magnetic)
clusterwide to begin a mount verification, which may not
complete before an operation timeout occurs, which causes another
mount verification to begin, ad infinitum.
> The customer has to shut down and reboot one of the systems often.
> In this case, how can we prevent the "operation timeout" situation
> from happening?