Title: + OpenVMS Clusters - The best clusters in the world! +
Notice: This conference is COMPANY CONFIDENTIAL. See #1.3
Moderator: PROXY::MOORE
Created: Fri Aug 26 1988
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 5320
Total number of notes: 23384
This note has been cross-posted in the OPTICAL notes conference.

There is a 3-node CI/NI cluster: an Alpha 4000, a DEC 7730, and a DEC 7750 with a directly connected RW534 jukebox. After a node leaves and rejoins the cluster, many jukebox devices go into mount verification timeout. The customer had to reboot the whole cluster to regain access to those jukebox devices.

According to the customer, one such event happened in the following sequence:

- The customer shut down and rebooted the Alpha 4000 system.
- Many disks went into mount verification.
- Some time later, many of the mounted jukebox devices (13 of the 20 mounted) went into mount verification timeout on the Alpha 4000.
- The same thing happened on the DEC 7750, which has the RW534 directly connected.
- The DEC 7730 behaved differently: all of its mounted jukebox devices went into mount verification timeout. This is the busiest system in the cluster.
- The customer tried to dismount the timed-out devices, but the process hung.
- The customer had to reboot the whole cluster.

VMS 6.2, OSMS 3.3-1

SYSGEN parameters:
- MVTIMEOUT: 3600
- MSCP_LOAD: 1
- MSCP_SERVE_ALL: 1

All the jukebox devices are served clusterwide. There are 88 cartridges (176 logical units) in the optical disk library, and usually only 20 are mounted.

Questions:
1. Is this normal behavior for a cluster with jukebox devices?
2. How can we prevent this problem from happening? I plan to increase the MVTIMEOUT value, but I am not sure it will fix the problem or whether there are any side effects.
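[Editorial sketch: if the plan is to experiment with MVTIMEOUT, the DCL below shows one way to inspect and raise it. MVTIMEOUT is a dynamic SYSGEN parameter, so it can be changed on a running system; the value 7200 here is purely illustrative, not a recommendation for this cluster.]

```
$ ! Sketch only -- 7200 is an illustrative value, not a recommendation.
$ ! MVTIMEOUT is dynamic, so WRITE ACTIVE applies it without a reboot.
$ MCR SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW MVTIMEOUT
SYSGEN> SET MVTIMEOUT 7200
SYSGEN> WRITE ACTIVE
SYSGEN> EXIT
$ !
$ ! To make the change survive reboots and future AUTOGEN runs, put it
$ ! in SYS$SYSTEM:MODPARAMS.DAT instead:
$ !     MVTIMEOUT = 7200
$ ! and then run:
$ !     @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS
```

A value set only with WRITE ACTIVE is lost at the next reboot, which is why the MODPARAMS.DAT route is the persistent one.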
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
5223.1 | | TAPE::SENEKER | Head banging causes brain mush | Tue Feb 04 1997 10:12 | 2 |
From the OSMS standpoint I have replied in OPTICAL, note 770. Rob
5223.2 | mount/cluster without DNS/DFS? | HGOVC::CSCHAN | | Wed Feb 05 1997 02:53 | 53 |
I found an article in TIMA that quotes the "[RW5XX] Document...Use of Clusters with Optical Software". I think it is a good reference, but I have some queries:

5.1.D VMS MOUNT and DISMOUNT operations for each optical volume will consume time equivalent to the traditional magnetic disk volume VMS MOUNT and DISMOUNT, plus a per-volume swap time of fifteen seconds plus the specified MINSWAP delay.

> Based on this information, should I set the SYSGEN parameter:
>
> MVTIMEOUT > (15 + MINSWAP) * number of mounted jukebox devices ?

5.2 Use of OSMS and OSDS without decDFS and decDNS

5.2.B Platters must not be mounted /SERVED nor /CLUSTER nor /SYSTEM (which implies /SERVED). Served platters trigger OpenVMS MSCP services which currently are not compatible with removable storage.

> The cluster does not use decDFS and decDNS. Could those platters
> be mounted /CLUSTER or /SYSTEM?

5.2.C MOUNT and DISMOUNT operations may consume several hours on larger autochangers with as many as 288 volumes. This time may be reduced by mounting only as many volumes as are absolutely required, by mounting volumes /NOWRITE (read-only) where possible, by using MCR JBUTIL SET PARAMETER /MINSWAP=5 to set the platter hold time to a minimum, and by keeping as few files open as necessary.

5.2.D Cluster transitions must be avoided. Processors joining an active cluster do not respect the "special" nature of optical disks. This limitation is particularly true when the node rejoining the cluster contains the interface adapter to the autochanger and drives. Any cluster transition will cause all disks with outstanding I/O (optical as well as magnetic) clusterwide to begin mount verification, which may not complete before an operation timeout occurs, which causes another mount verification to begin, ad infinitum.

> The customer has to shut down/reboot one of the systems often. In this
> case, how can we prevent the "operation timeout" situation from happening?
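[Editorial sketch: the inequality asked about in 5.1.D can be worked through with this topic's own numbers (MINSWAP=5 as suggested in 5.2.C, 20 mounted volumes). The DCL below is only an illustration of the arithmetic, assuming the worst case where every mounted platter needs a swap during mount-verification recovery.]

```
$ ! Sketch: lower bound on MVTIMEOUT suggested by 5.1.D.
$ ! MINSWAP=5 (per 5.2.C) and 20 mounted volumes (per this topic)
$ ! are this cluster's figures; substitute your own.
$ MINSWAP = 5
$ MOUNTED = 20
$ FLOOR = (15 + MINSWAP) * MOUNTED
$ WRITE SYS$OUTPUT "MVTIMEOUT should exceed ''FLOOR' seconds"
$ !
$ ! With these numbers the floor is 400 seconds, so the current
$ ! MVTIMEOUT of 3600 already satisfies the inequality -- which
$ ! suggests the timeouts here are not simply an MVTIMEOUT problem.
$ !
$ ! The platter hold time itself is set with the command quoted
$ ! in 5.2.C:
$ !     MCR JBUTIL SET PARAMETER /MINSWAP=5
```

Note that this bound only covers serialized swap time; it does not account for MSCP-served access, which 5.2.B says is not supported for these platters at all.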