
Conference spezko::cluster

Title:	+ OpenVMS Clusters - The best clusters in the world! +
Notice:	This conference is COMPANY CONFIDENTIAL. See #1.3
Moderator:	PROXY::MOORE

Created:	Fri Aug 26 1988
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5320
Total number of notes:	23384

5202.0. "cluster transition .vs. optical jukebox" by TAPE::SENEKER (Head banging causes brain mush) Thu Jan 09 1997 10:15

T.R	Title	User	Personal Name	Date	Lines
5202.1	MVerify strikes again	helena.zko.dec.com::EVERHART		`Thu Jan 09 1997 13:56`	8
5202.2	can you have your cake and eat it too...	TAPE::SENEKER	Head banging causes brain mush	`Thu Jan 09 1997 17:49`	12
5202.3		UTRTSC::jvdbu.uto.dec.com::JurVanDerBurg	Change mode to Panic!	`Fri Jan 10 1997 01:36`	14
5202.4	Here's another hint....	STAR::CROLL		`Fri Jan 10 1997 10:27`	14
5202.5	looking for "less obvious" stuff	TAPE::SENEKER	Head banging causes brain mush	`Fri Jan 10 1997 11:45`	37
5202.6		UTRTSC::thecat.uto.dec.com::JurVanDerBurg	Change mode to Panic!	`Mon Jan 13 1997 02:17`	33
5202.7		STAR::CROLL		`Mon Jan 13 1997 11:08`	17
5202.8	possible code changes	TAPE::SENEKER	Head banging causes brain mush	`Tue Feb 04 1997 10:54`	45
        tstl    g^scs$gl_mscp                   ; MSCP server running?
        MOVL    G^SCS$GL_MSCP_NEWDEV,R2         ; Get address of MSCP routine

Will both of the above SCS addresses always be valid, or will both be
zero? Will there be a case where one has a value and the other doesn't?

-------------------------------------------------------------------------------

My pseudo-driver sets the _CLU bit via a DPT_STORE and two code segments.
Based on the previous replies, I think the following code segment should be
modified. Please check and comment.

-------------------------------------------------------------------------------
        MOVB    #DT$_GENERIC_DK, -              ; Set device type to GENERIC_DK
                UCB$B_DEVTYPE(R5)               ; to allow serving
        BISL2   #DEV$M_CLU,-                    ; Set device shareable
                UCB$L_DEVCHAR2(R5)              ; cluster-wide
        MOVL    G^SCS$GL_MSCP_NEWDEV,R2         ; Get address of MSCP routine
        BGEQ    60$                             ; Branch if not available
        JSB     (R2)                            ; Make this unit available to
                                                ; be served
60$:    movl    ucb$l_vcb(r5),r0                ; Get VCB address
-------------------------------------------------------------------------------
to:
-------------------------------------------------------------------------------
        MOVL    G^SCS$GL_MSCP_NEWDEV,R2         ; Get address of MSCP routine
        BGEQ    60$                             ; Branch if not available
        MOVB    #DT$_GENERIC_DK, -              ; Set device type to GENERIC_DK
                UCB$B_DEVTYPE(R5)               ; to allow serving
        BISL2   #DEV$M_CLU,-                    ; Set device shareable
                UCB$L_DEVCHAR2(R5)              ; cluster-wide
        JSB     (R2)                            ; Make this unit available to
                                                ; be served
60$:    movl    ucb$l_vcb(r5),r0
-------------------------------------------------------------------------------

Also, the setting of the _CLU bit in the DPT_STORE should be removed.
Do you agree?

-------------------------------------------------------------------------------

Thanks for taking the time to investigate information related to this
problem.

Rob
5202.9		UTRTSC::jvdbu.uto.dec.com::JurVanDerBurg	Change mode to Panic!	`Wed Feb 05 1997 02:54`	40
>       tstl    g^scs$gl_mscp                   ; MSCP server running?
>       MOVL    G^SCS$GL_MSCP_NEWDEV,R2         ; Get address of MSCP routine
>
> Will both of the above SCS addresses always be valid or will
> both be zero? Will there be a case were one has a value and
> the other doesn't?

In the normal case they are either both zero or both non-zero.

>       MOVL    G^SCS$GL_MSCP_NEWDEV,R2         ; Get address of MSCP routine
>       BGEQ    60$                             ; Branch if not available
>       MOVB    #DT$_GENERIC_DK, -              ; Set device type to GENERIC_DK
>               UCB$B_DEVTYPE(R5)               ; to allow serving
>       BISL2   #DEV$M_CLU,-                    ; Set device shareable
>               UCB$L_DEVCHAR2(R5)              ; cluster-wide
>       JSB     (R2)                            ; Make this unit available to
>                                               ; be served
>60$:   movl    ucb$l_vcb(r5),r0

If the only object is to enable MSCP serving of your devices, then this
should do it:

        TSTL    G^SCS$GL_MSCP                   ; MSCP server loaded?
        BGEQ    60$                             ; Branch if not available
        BISL2   #DEV$M_CLU,-                    ; Set device shareable
                UCB$L_DEVCHAR2(R5)              ; cluster-wide
60$:

That's all. The routine pointed at by SCS$GL_MSCP_NEWDEV only checks whether
a device is MSCP-served. It returns a status and does nothing else. The real
enabling of MSCP serving is done via the SCS process poller, which notifies
the CONFIGURE process, which then enables serving of your device.

> Also, the setting of the _CLU bit in the DPT_STORE should be removed.
> Do you agree?

Yes, get rid of it.

Jur.
5202.10	let configure do it....	STAR::CROLL		`Wed Feb 05 1997 09:53`	41
SCS$GL_MSCP is the global cell used by everybody to determine if the MSCP
server is running. It's either zero, or the base address of the base server
data structure. It's explicitly initialized to zero by system
initialization, and set to non-zero by successful completion of the MSCP
server initialization.

There are a few cases where SCS$GL_MSCP_NEWDEV is used to determine if the
server is running (PEDRIVER uses it). This cell is also explicitly
initialized to zero when the system is initialized, and set to the address
of the "is this device served" routine during MSCP server initialization.
This routine returns a "yes, I'm serving it" or a "no, never heard of it"
indication. It's used by PEDRIVER to determine if it should respond to a
"Solicit Service" message from a satellite system attempting to do MOP
booting.

I would do what Jur recommended in .9, except I'd make the branch a BEQL
(and the BGEQ should be BGEQU, anyway...). The CONFIGURE process
periodically scans the I/O database looking for new gizmos to serve, and
when one is found, it notifies the MSCP server and sets everything up
properly. I recommend this 'cause it'll be a bit less work for you when
QIOserver comes along.

The other alternative you have is to use the SCS$DISK_MSCP_NEWDEV routine.
This is a routine in the MSCP server that's called by DUDRIVER (see
[DRIVER]DUTUSUBS\DUTU$LINK_NEW_UCB) when a new device is found and it
should be served. (You'd think that SCS$GL_MSCP_NEWDEV and
SCS$DISK_MSCP_NEWDEV did pretty much the same thing, wouldn't you? It's
clear they were added at different times by different people: they do two
totally different things and, incidentally, use two completely different
mechanisms for how they're initialized and set up. Standards are only any
good when people know they exist....)

Note that DKDRIVER doesn't call SCS$DISK_MSCP_NEWDEV. And it's not
documented, and therefore isn't part of the standard driver interface we're
all sworn to uphold.

The SCS$DISK_MSCP_NEWDEV routine does essentially what CONFIGURE does,
except there's no wait for CONFIGURE's polling loop. However, I'm going to
take this call out of DUDRIVER when QIOserver comes along, forcing all
served device discovery to take place in CONFIGURE, so I can apply
cluster-wide configuration rules to served devices.

John
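The BGEQ versus BEQL point in .10 can be tried outside MACRO-32. What follows is only a minimal Python sketch, assuming the cell holds either zero or a system-space address with the high bit set (as S0 addresses on VAX have). The helper names and the sample addresses are hypothetical and do not come from any listing.

```python
def as_signed32(value):
    """Interpret a 32-bit longword as a signed integer (the view TSTL gives BGEQ)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def bgeq_taken(cell):
    """BGEQ after TSTL: branch when the longword is >= 0 as a signed value."""
    return as_signed32(cell) >= 0


def beql_taken(cell):
    """BEQL after TSTL: branch only when the longword is exactly zero."""
    return (cell & 0xFFFFFFFF) == 0


# 0x80123456 stands in for a system-space address (assumption);
# 0x00001000 is a nonzero value without the high bit, not expected in this cell.
for cell in (0x00000000, 0x80123456, 0x00001000):
    print(hex(cell), "BGEQ branches:", bgeq_taken(cell), "BEQL branches:", beql_taken(cell))
```

For the two values the cell is expected to hold, both branches behave the same. They differ only for a nonzero value without the high bit set, which is why BEQL states the "is it zero?" intent more directly.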
5202.11	Where do IO$_NOP's get issued?	TAPE::SENEKER	Head banging causes brain mush	`Mon Feb 24 1997 19:01`	31
Based on this note stream's input, and since most drivers didn't do any of
this, I just took all the code segments, including the _CLU setting and the
JSB SCS$GL_MSCP_NEWDEV, out altogether. Testing shows that the customer's
problem is corrected but....

(What is the Alpha DRDRIVER doing? It has some _CLU and MSCP_NEWDEV stuff.)

and

It appears that there is an "undesirable feature" in VMS when MSCP_LOAD is
set to 1. It appears that "available to cluster via MSCP Server" disk
devices (on Alphas) are sent IO$_NOP requests when a node joins the cluster.
While these extra I/O operations have very little impact on overall
performance for a magnetic disk, they have a large impact on the performance
of an OSMS JB-type device. That is, they force every mounted volume to be
placed in a physical drive just to service the IO$_NOP request. This means
no useful work gets done, and some large jukeboxes swap platters for hours.

I can code around this, but is there a way to prevent these I/O requests
from being generated? Does an IO$_NOP ring a bell with anybody in this type
of situation?

Also remember, the customer said the jukebox did not have this problem when
it was connected to the VAX. Could some piece of software be ohhhh soooooo
different on Alpha and be queuing these IO$_NOPs?

Rob
5202.12	DRDRIVER sets _CLU and calls SCS$DISK_MSCP_NEWDEV	STAR::CROLL		`Tue Feb 25 1997 10:24`	10
I just checked the V7.1 listings for DRDRIVER, and it sets the _CLU bit and
calls SCS$DISK_MSCP_NEWDEV in two different places, both in situations where
a new unit is being set up.

I haven't any idea about IO$_NOP, but I'll poke around some. Are you sure
it's IO$_NOP and not IO$_PACKACK? I'm pretty sure PACKACKs are sent by
DUDRIVER when it discovers a new unit, regardless of the state of that unit;
it's part of getting the unit set up. I don't know about IO$_NOP. But I'll
poke around...

John
5202.13		STAR::CROLL		`Tue Feb 25 1997 10:49`	23
Here's a bit more info about IO$_NOP.

DUDRIVER turns IO$_NOP requests into MSCP Set Unit Characteristics commands.
This probably isn't relevant to your situation.

This part probably is. If the MSCP server has a unit ONLINE to a host, and
some other host tries to put the unit online (as DUDRIVER in a new node
booting into the cluster and discovering units will attempt to do), the MSCP
server issues an IO$_NOP to the unit. (See the code starting at label
ONLINE: in MSCP.LIS.) The comments say that the IO$_NOP is issued to set
device characteristics in the local UCB. DUDRIVER turns it into a Set Unit
Characteristics, which is what accomplishes this. This is skipped for
non-MSCP devices.

So, what's going on appears to be this. The unit is ONLINE to some client.
Another client boots, discovers the unit, and issues an ONLINE. The MSCP
server sees that the unit is ONLINE already, and issues an IO$_NOP to the
unit to ensure the unit characteristics are up to date in the server's UCB.
The MSCP server then copies the characteristics into the MSCP end message
and returns this to the new client. This apparently saves a PACKACK and some
related processing. In your case, however, it appears to force real I/O to
the device, thrashing the jukebox as a result.

John
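The sequence described in .13 can be modelled in a few lines. This is a toy Python sketch, not the MSCP server's code: the class names, method names, device names and node names are all hypothetical. It assumes a driver that loads a platter for every request it receives, which is how JKDRIVER's behaviour is described elsewhere in this thread, and it does not model what happens on the first ONLINE.

```python
class JukeboxUnit:
    """Hypothetical served unit whose driver loads the platter for every request."""

    def __init__(self, name):
        self.name = name
        self.platter_loads = 0

    def qio(self, function):
        # Unpatched behaviour (assumption): any request, even a no-op,
        # forces the volume into a physical drive before it completes.
        self.platter_loads += 1
        return "SS$_NORMAL"


class ToyMscpServer:
    """Toy stand-in for the ONLINE handling described in .13."""

    def __init__(self, units):
        self.units = {u.name: u for u in units}
        self.online_to = {name: set() for name in self.units}

    def online(self, client, unit_name):
        unit = self.units[unit_name]
        hosts = self.online_to[unit_name]
        if hosts:
            # Unit is already ONLINE to some host: issue IO$_NOP so the
            # characteristics in the server's UCB are up to date.
            unit.qio("IO$_NOP")
        hosts.add(client)
        return "end message carrying unit characteristics"


units = [JukeboxUnit(f"JKA{n}") for n in range(200)]
server = ToyMscpServer(units)
for u in units:
    server.online("NODE_A", u.name)  # first host: no IO$_NOP in this model
for u in units:
    server.online("NODE_B", u.name)  # booting node: one IO$_NOP per unit
total_loads = sum(u.platter_loads for u in units)
print("platter loads caused by NODE_B booting:", total_loads)
```

In this model every unit that is already online costs one platter load when a second node boots, which is harmless for a magnetic disk but serializes badly on a jukebox with only a few physical drives.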
5202.14	more thanks but why is vax different?	TAPE::SENEKER	Head banging causes brain mush	`Tue Feb 25 1997 12:01`	27
Thanks for your investigation, John. I'll look at MSCP.LIS.

Since my last reply, I have added code to stop the jukebox thrashing and to
return success for the IO$_NOP. Initially, all appears to be well, but I
have only tested with two volumes so far. I have a command file mounting 200
volumes as I enter this, so I can test with a large number of volumes. I'll
put a reply in later with the result.

If I understand this, the IO$_NOP has value because the MSCP server is able
to validate the unit characteristics based on the fact that the IO$_NOP
completed successfully.

FYI: The reason that this I/O was forcing real I/O (work) to the device is
that the JKDRIVER is designed to ensure the proper volume is in a drive
before the I/O is processed. In this case, no real I/O exists, so the
jukebox thrashes. I have added some more code to deal with this special case
along with the MVIRP IO$_PACKACK case.

Any clues on why the VAX does not do the same thing? Is the MSCP code the
same for VAX in the V6.2 timeframe? The reason I ask is that I will have to
generate a patch for this fix, and I need to gather as many facts as
possible to determine if it will be only for the Alpha code stream.

Rob
5202.15		EVMS::MORONEY	UHF Computers	`Tue Feb 25 1997 13:38`	4
	The VAX and Alpha MSCP server code are similar but not the same. The handling of IO$_NOP doesn't seem to be one of the differences, however. -Mike
5202.16	corruption avoidance	STAR::EVERHART		`Tue Feb 25 1997 17:18`	19
IO$_NOP gets sent to devices generally to force through any operations
sequentially. It provides a guarantee that all other I/O has already taken
place, so that nothing is left in the device itself. This is by way of
ensuring that when other nodes start accessing the device, no old writes are
"left lying around" to clobber later writes that may take place. If you have
a way to be sure this is not going to happen, you can of course just ignore
them.

I am referring to writes that may be in a device queue here, or somewhere
along the way, not writes that are "at" a source node. The idea is that a
new node sends this out to ensure that anything else from a failed node has
already happened before any new activity starts. This is necessary to
prevent data corruption in the odd rare case where a write was in progress
just as a node died and the new node tries to write the same block, and has
the block clobbered by the old write because something (e.g. the drive)
reordered them.

If you send a packack down the wire when you stick something new in the
drive, this might not be an issue. (MY code did!)
5202.17	current solution status	TAPE::SENEKER	Head banging causes brain mush	`Wed Feb 26 1997 16:01`	20
The new code has handled 218 volumes correctly when a node was added to the
cluster while the jukebox was idle. No thrashing, and the new node entered
the cluster normally.

Glenn, thanks for the info in .16. I don't think it will matter for the
JKDRIVER, because it is a low-performance simple FIFO queue. But to be sure,
I moved the IO$_NOP check to a new place to ensure that all the pending I/O
has completed.

Short summary: The removal of incorrect usage of DEV$V_CLU and the addition
of special processing for IO$_NOP requests in the JKDRIVER prevent two
undesirable situations. In both situations, each jukebox pseudo volume was
being placed in a jukebox drive to service I/O requests, causing serious
performance degradation for jukeboxes with as few as four mounted volumes.

I am working on creating an ECO, my first one. After this is created, I plan
to add another reply with a pointer to the ECO.

Rob
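The shape of the driver-side change summarized in .17 can be sketched as a toy FIFO queue. This Python model is an illustration under stated assumptions, not JKDRIVER code: the class, its fields and the volume names are hypothetical. It assumes strict FIFO processing, so by the time an IO$_NOP reaches the head of the queue every earlier request has already completed.

```python
from collections import deque


class ToyJukeboxDriver:
    """Hypothetical FIFO driver that special-cases IO$_NOP."""

    def __init__(self):
        self.queue = deque()
        self.loaded_volume = None
        self.platter_swaps = 0
        self.completed = []

    def submit(self, volume, function):
        self.queue.append((volume, function))

    def run(self):
        while self.queue:
            volume, function = self.queue.popleft()
            if function == "IO$_NOP":
                # FIFO order means all earlier I/O is already done, so the
                # no-op can complete successfully without loading the volume.
                self.completed.append((volume, function, "SS$_NORMAL"))
                continue
            if self.loaded_volume != volume:
                # Real I/O: make sure the right platter is in the drive first.
                self.loaded_volume = volume
                self.platter_swaps += 1
            self.completed.append((volume, function, "SS$_NORMAL"))


driver = ToyJukeboxDriver()
driver.submit("VOL001", "IO$_WRITEPBLK")
for n in range(1, 219):
    driver.submit(f"VOL{n:03d}", "IO$_NOP")
driver.run()
print("platter swaps:", driver.platter_swaps, "requests completed:", len(driver.completed))
```

In this model the one real write costs a single platter swap, and the 218 no-ops complete behind it without touching the drive, which is the behaviour the test in .17 reports.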