
Conference spezko::cluster

Title:	+ OpenVMS Clusters - The best clusters in the world! +
Notice:	This conference is COMPANY CONFIDENTIAL. See #1.3
Moderator:	PROXY::MOORE

Created:	Fri Aug 26 1988
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5320
Total number of notes:	23384

5244.0. "Dual scsi tape naming on CI cluster ?" by PRSPSU::BOUGEANT (MCS Antony ...) Thu Mar 06 1997 10:32

	Hi,


	I have a strange problem on site. We have a cluster of four AS8400s,
	each with one TZ877 connected to a KZMSA. Each node has its tape SYSGEN
	parameters set as below:

			node A	   node B    node C    node D


	TAPE_ALLOCLASS    12         13          14          15
	TMSCP_LOAD	   1	      1           1           1
	TMSCP_SERVE_ALL	   1	      1		  1	      1

	magtape		 MKA0	     MKA100	 MKA200	     MKA300

	Sometimes the magtape connected to node A (or node B ...) is seen as
	both $12$MKA0: and NODEA$MKA0:. SLS cannot run in this case, with
	two different names. We need to reboot one member or the whole cluster.
	A new UCB (MKA1:) is also created, with an unknown device.

	Does anybody have an idea?
	Is it because we have the same controller letter (A)?

	OVMS V6.2-1H3

	Regards.
	Luc Bougeant.

T.R	Title	User	Personal Name	Date	Lines
5244.1	Any Shared Interconnects?	XDELTA::HOFFMAN	Steve, OpenVMS Engineering	Thu Mar 06 1997 10:56	17
	Your TAPE_ALLOCLASS values imply no shared interconnects (DSSI, CI,
	SCSI) are in use for accessing tape devices. Check for any unit number
	collisions among any tape devices present, entirely ignoring any
	differences among the device controller letters in use. If there are
	no shared tape interconnects and no tape unit collisions, consider
	using allocation class zero. (Which will serve the tape device to the
	other VMScluster members using the host node's SCSNODE name...)

	The creation of an MKA1: unit sounds like there might be a problem
	with the TZ877 drive, with the SCSI controller, with a SCSI driver, or
	with something on the SCSI bus. Check the controller firmware, the
	drive firmware, check for any SCSI driver patches, and -- if you have
	current revisions -- get a system crashdump and elevate this through
	channels...
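
As a rough illustration of the naming rule described in 5244.1, here is a minimal Python sketch. The helper name `cluster_device_name` and its arguments are invented for this example and are not any OpenVMS interface. It assumes only what the reply states: a nonzero allocation class yields a `$class$ddcu:` name, while class zero yields a name built from the host node's SCSNODE name.

```python
def cluster_device_name(scsnode: str, alloclass: int, device: str) -> str:
    """Return the cluster-visible name of a served device.

    Hypothetical helper for illustration only; not an OpenVMS API.
    """
    if alloclass != 0:
        # Nonzero allocation class: name is $<class>$<device>:
        return f"${alloclass}${device}:"
    # Allocation class zero: name is <SCSNODE>$<device>:
    return f"{scsnode}${device}:"


print(cluster_device_name("NODEA", 12, "MKA0"))  # $12$MKA0:
print(cluster_device_name("NODEA", 0, "MKA0"))   # NODEA$MKA0:
```

The two names printed are the two forms reported in 5244.0. Under this rule a single drive should appear under only one of them at a time.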
5244.2	Rectifications and more informations	PRSPSU::BOUGEANT	MCS Antony ...	Fri Mar 07 1997 11:58	117
	Hi Steve,

	Sorry, I made a lot of mistakes.

	1) Each TZ877 is connected to its own KFTIA and is not shared.

	2) My original configuration was as below:

	                   node A    node B    node C    node D

	TAPE_ALLOCLASS        0         0         0         0
	TMSCP_LOAD            1         1         1         1
	TMSCP_SERVE_ALL       1         1         1         1

	magtape             MKA0      MKA0      MKA0      MKA0

	Sometimes the magtape connected to node A (or node B ...) was seen as
	both $12$MKA0: (with ALLOCLASS) and NODEA$MKA0:. SLS was not able to
	run in this case, with two different names.

	Is this configuration supported without TAPE_ALLOCLASS?

	3) I then tried the configuration below:

	                   node A    node B    node C    node D

	TAPE_ALLOCLASS        0         0         0         0
	TMSCP_LOAD            1         1         1         1
	TMSCP_SERVE_ALL       1         1         1         1

	magtape             MKA0      MKA100    MKA200    MKA300

	I rebooted the whole cluster and had the same problem
	(i.e. $12$MKA0: (with ALLOCLASS) and NODEA$MKA0:).

	4) Today I have the configuration below:

	                   node A    node B    node C    node D

	TAPE_ALLOCLASS       12        13        14        15
	TMSCP_LOAD            1         1         1         1
	TMSCP_SERVE_ALL       1         1         1         1

	magtape             MKA0      MKA100    MKA200    MKA300

	I rebooted the whole cluster (with power off/on), and the
	configuration is now correct:

	Device                  Device           Error    Volume         Free  Trans Mnt
	 Name                   Status           Count     Label        Blocks Count Cnt
	$12$MKA0:    (HEINE)    Online               0
	$13$MKA100:  (WILDE)    Online               0
	$14$MKA200:  (FEVAL)    Online               0
	$15$MKA300:  (ZEVACO)   Online               0

	Device                  Device           Error
	 Name                   Status           Count
	MKA1:                   Online               0

	But a new UCB (MKA1:) is also created, with an unknown device (see the
	SDA report below).

	Is TAPE_ALLOCLASS required in this case?

	SCSI Summary Configuration:
	---------------------------
	SPDT      Port  STDT      SCSI-Id  SCDT      SCSI-Lun  Device  UCB       Type    Rev
	--------  ----  --------  -------  --------  --------  ------  --------  ------  ----
	                                   8362F280         0  DKE400  8362EC80  RZ29B   0016
	83617E80  PKD0  83573F00        0  83598B40         0  DKD0    836238C0  GENERI  0436
	83610300  PKC0
	83605880  PKB0
	835F4200  PKA0  836E7E00        0  8368FDC0         0  MKA0    836F64C0  TZ87    9B3C
	                                   849B2100         1  MKA1    84A03C00  TU78    ....

	SCSI Port Descriptor (SPDT):
	----------------------------
	PKA0: Driver SYS$PKQDRIVER

	SPDT Address        835F4200     Port Type               SCSI QLogic ISP1020
	ADP Address         83543940     Adapter                 PCI
	UCB Address         835E6900     Device                  QLOGIC
	Busarray Address    83543B00     Port Host SCSI Id       7

	Port Flags                  synch,asynch,mapping_reg,dir_dma,luns,cmdq,port_aut
	Port Device Status          online
	Port Dev Status at DIPL     - Target inited

	Bus Resets                 0     Number of Events              0
	Retry Attempts             0     Curr I/Os on all Ports        0
	Stray Interrupts           0     Curr I/Os on all Devices      0
	Unexpected Interrupts      0     Total Outstanding I/Os        0
	Reselections               0     CRAB Address                  835441C0
	Port Wait Queue        empty     Port CRAM Address             00000000
	Nonpg Pool FKB Que     empty     Port IDB Address              835F3100
	Bus Reset Waiters      empty     Queue Manager's KPB           83588000
	                                 Queue Manager's SCDRP         00000000

	Regards.
	Luc Bougeant.
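
The SDA summary in 5244.2 lists MKA1 under the same SCSI target as MKA0 but at SCSI-Lun 1. The short Python sketch below shows how such a device name can be decoded. It assumes the usual OpenVMS SCSI convention that the unit number is SCSI ID x 100 + LUN. The function `decode_scsi_device` is a hypothetical helper written for this note, not part of any OpenVMS tool.

```python
import re


def decode_scsi_device(name: str) -> dict:
    """Split a name such as 'MKA100:' into its parts.

    Assumes unit = SCSI ID * 100 + LUN (illustrative helper only).
    """
    m = re.fullmatch(r"([A-Z]{2})([A-Z])(\d+):?", name.upper())
    if m is None:
        raise ValueError(f"not a ddcu-style device name: {name!r}")
    unit = int(m.group(3))
    return {
        "type": m.group(1),        # e.g. MK = SCSI tape, DK = SCSI disk
        "controller": m.group(2),  # controller letter
        "scsi_id": unit // 100,
        "lun": unit % 100,
    }


print(decode_scsi_device("MKA1:"))   # SCSI ID 0, LUN 1
print(decode_scsi_device("MKA300"))  # SCSI ID 3, LUN 0
```

Read this way, MKA1: is a second logical unit behind the same SCSI ID as the TZ877 at MKA0:, which matches the SCSI-Lun column in the listing.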
5244.3	Two Bugs Here...	XDELTA::HOFFMAN	Steve, OpenVMS Engineering	Fri Mar 07 1997 12:36	28
	Set TAPE_ALLOCLASS back to 0 on all nodes and turn off tape serving,
	reboot the whole VMScluster, and try to figure out where this $12$
	stuff was coming from. That is one bug.

	And as I previously indicated and will restate here, check the
	firmware revisions and controller revisions, check the SCSI
	configuration and cabling, and check for any patches for whatever
	version of OpenVMS is in use here. (If you do not know how to do these
	steps, please call in field service for a look at the hardware, and to
	check the revisions and the SCSI.) If you cannot resolve this, then
	log an IPMT.

	As for the MKA1 bug, that sounds like one of the SCSI tape drives is
	not responding appropriately to the host -- if you need/want to serve
	tapes, your different-allocation-classes configuration is correct, but
	you do not need to alter the unit numbers. (The tape unit numbers must
	be unique only within a tape allocation class.) Set the unit numbers
	to zero, and reboot the whole VMScluster. See if MKA1: reappears.
	(This MKA1: bug looks like -- with no more real evidence -- a SCSI
	problem with the TZ877 or with the SCSI device driver on the local
	node(s).) If you cannot resolve this, then log an IPMT.
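
The uniqueness rule quoted above (unit numbers must be unique only within a tape allocation class, ignoring controller letters) can be checked mechanically. The following Python sketch is one possible way to do that. The function name and the input tuples are assumptions made for this example. Treating class zero as never colliding, because such devices are named after their host node, is also an assumption of the sketch.

```python
import re
from collections import defaultdict


def find_unit_collisions(tapes):
    """Return {(alloclass, unit): [(node, device), ...]} for clashing units.

    tapes: iterable of (node, tape_alloclass, device_name) tuples.
    Controller letters are ignored, as 5244.1 suggests.
    """
    seen = defaultdict(list)
    for node, alloclass, device in tapes:
        if alloclass == 0:
            # Assumption: class-zero devices are node-qualified, so skip them.
            continue
        unit = int(re.search(r"(\d+)$", device.rstrip(":")).group(1))
        seen[(alloclass, unit)].append((node, device))
    return {key: devs for key, devs in seen.items() if len(devs) > 1}


# Configuration 4 from 5244.2, but with all unit numbers set to zero:
config = [
    ("HEINE", 12, "MKA0"),
    ("WILDE", 13, "MKA0"),
    ("FEVAL", 14, "MKA0"),
    ("ZEVACO", 15, "MKA0"),
]
print(find_unit_collisions(config))  # {} (no collisions)
```

Because each node has its own allocation class, the check reports no clash even with every drive at unit 0, which is the configuration 5244.3 recommends trying.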