T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
331.1 | SCS services, maybe? | FROST::HARRIMAN | DEC 41-BLANK-03, Harriman,Paul J., qty 1 | Thu Oct 09 1986 09:53 | 1 |
|
|
331.2 | Read Chapter 12 of System Services Ref. Man. | QUILL::NELSON | JENelson | Thu Oct 09 1986 13:51 | 28 |
|
Yes, the lock manager will do exactly what you want.
Here's what you do:
Your scheduler process gets started (on just one node of your
cluster!) and $ENQs a lock for EXclusive access, then converts
the lock to PW, specifying a value block (value blocks are discussed
later) and a blocking AST routine called DOORBELL. The scheduler
then $HIBERnates.
When the DOORBELL routine runs, it converts the lock to NL mode,
then $ENQWs a request to get the lock back in EX mode. Once
the request is satisfied, you copy the value block to local
storage, re-initialize the value block, and convert the lock
back to PW, specifying DOORBELL as your blocking AST routine.
The scheduler can then act on the contents of the value block.
Value blocks are used to pass information (up to 16 bytes) between
processes.
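For reference, the value block is just 16 extra bytes tacked onto the
end of the lock status block you pass to $ENQ. In C it might be
declared like this (the type name is one I made up):

    #include <lckdef.h>             /* LCK$K_..MODE, LCK$M_.. flags    */

    /* Lock status block with the optional 16-byte value block on the
       end.  The lock manager copies the block IN to your LKSB when a
       request is granted, and OUT to the lock database when you
       convert down from PW or EX -- in both cases only if you set
       LCK$M_VALBLK in the $ENQ flags.                                 */
    typedef struct {
        unsigned short status;      /* SS$_... completion status       */
        unsigned short reserved;
        unsigned int   lkid;        /* lock id, filled in by $ENQ      */
        char           valblk[16];  /* the value block itself          */
    } LKSB;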
The program that wants to communicate with the scheduler $ENQWs
a request for the lock in EX mode. Once granted, it fills in
the value block, and converts the lock to NL mode.
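In C, the writer's side might look something like this sketch (the
resource name SCHED_DOORBELL and the helper are made up for
illustration; error checking is mostly omitted, and the final $DEQ is
just tidying up):

    #include <descrip.h>
    #include <lckdef.h>
    #include <starlet.h>
    #include <string.h>

    typedef struct {                /* lock status block + value block */
        unsigned short status, reserved;
        unsigned int   lkid;
        char           valblk[16];
    } LKSB;

    static $DESCRIPTOR(resnam, "SCHED_DOORBELL");  /* made-up name     */

    int send_to_scheduler(const char msg[16])
    {
        LKSB w;
        /* $ENQW args: efn, lkmode, lksb, flags, resnam, parid,
           astadr, astprm, blkast, acmode, nullarg.  Queue a NEW
           EX-mode lock and wait; the grant is what fires the
           scheduler's DOORBELL blocking AST.                          */
        int st = sys$enqw(0, LCK$K_EXMODE, &w, LCK$M_VALBLK, &resnam,
                          0, 0, 0, 0, 0, 0);
        if (!(st & 1)) return st;
        memcpy(w.valblk, msg, 16);  /* fill in the message             */
        /* Converting down to NL with LCK$M_VALBLK writes the value
           block back into the lock database for the scheduler.        */
        sys$enqw(0, LCK$K_NLMODE, &w, LCK$M_CONVERT | LCK$M_VALBLK,
                 0, 0, 0, 0, 0, 0, 0);
        return sys$deq(w.lkid, 0, 0, 0);
    }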
Hope this has been clear.
JENelson
|
331.3 | will try it, thanks - | RUMOR::FALEK | The TU58 King | Thu Oct 09 1986 14:27 | 4 |
| Thanks, I'll write a little test program to try this.
(I also entered this as note 1731 in VMSnotes and got some response
there; it says essentially the same thing as .2)
|
331.4 | | CLT::GILBERT | eager like a child | Thu Oct 09 1986 19:37 | 31 |
| Yes, but what if...
What if two processes simultaneously want to communicate with
the scheduler? Processes A and B $ENQW EX mode lock requests.
The scheduler's DOORBELL routine runs, converts the lock to
NL mode, and $ENQs a request to get the lock back in EX mode
(your note had this last one as '$ENQW', which is incorrect).
Now process A is granted the EX lock, fills the value block,
and converts to NL mode. Process B gets the lock, fills the
value block (thereby trashing what process A wrote there), and
converts to NL mode. Then the scheduler is granted its EX lock.
Note that the scheduler receives only one of the two messages.
Instead, a process that wants to communicate with the scheduler
should $ENQ (it's okay to wait here, if you like) an EX mode request.
When it's granted, check whether the value block already contains a
message. While there's a message in the value block, convert to NL
mode, and $ENQ another EX mode conversion request. When there isn't
a message in the value block, fill the value block, and convert the
lock to NL mode.
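Here's a rough C sketch of that writer loop (the resource name, the
helper, and the all-zeros-means-no-message convention are placeholders
of mine; a real program would check every status):

    #include <descrip.h>
    #include <lckdef.h>
    #include <starlet.h>
    #include <string.h>

    typedef struct {                /* lock status block + value block */
        unsigned short status, reserved;
        unsigned int   lkid;
        char           valblk[16];
    } LKSB;

    static $DESCRIPTOR(resnam, "SCHED_DOORBELL");  /* placeholder name */

    static int block_empty(LKSB *l) /* no message pending?             */
    {
        int i;
        for (i = 0; i < 16; i++)
            if (l->valblk[i] != 0) return 0;
        return 1;
    }

    int send_to_scheduler(const char msg[16])
    {
        static LKSB w;
        /* NEW EX-mode lock; it's okay to wait here.                   */
        sys$enqw(0, LCK$K_EXMODE, &w, LCK$M_VALBLK, &resnam,
                 0, 0, 0, 0, 0, 0);
        /* While an earlier message is still unconsumed, step aside
           and requeue behind the scheduler's pending conversion.      */
        while (!block_empty(&w)) {
            sys$enqw(0, LCK$K_NLMODE, &w, LCK$M_CONVERT,
                     0, 0, 0, 0, 0, 0, 0);
            sys$enqw(0, LCK$K_EXMODE, &w, LCK$M_CONVERT | LCK$M_VALBLK,
                     0, 0, 0, 0, 0, 0, 0);
        }
        memcpy(w.valblk, msg, 16);
        /* The down-conversion to NL stores the message.               */
        return sys$enqw(0, LCK$K_NLMODE, &w,
                        LCK$M_CONVERT | LCK$M_VALBLK,
                        0, 0, 0, 0, 0, 0, 0);
    }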
And before the scheduler's DOORBELL routine converts the lock
to NL mode, it should clear the value block to indicate that
it contains no message.
Instead of clearing the value block, you could use the flags (to
the $ENQ service) and a lock-status-block status of SS$_VALNOTVALID
to indicate whether the value block contains a message.
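And a matching rough sketch of the scheduler side with both fixes
folded in; note the $ENQ (not $ENQW) when re-requesting EX inside the
blocking AST (again, the names are placeholders and status checks are
omitted):

    #include <descrip.h>
    #include <lckdef.h>
    #include <starlet.h>
    #include <string.h>

    typedef struct {
        unsigned short status, reserved;
        unsigned int   lkid;
        char           valblk[16];
    } LKSB;

    static LKSB lksb;
    static char message[16];        /* local copy of the value block   */
    static $DESCRIPTOR(resnam, "SCHED_DOORBELL");

    static void doorbell(int astprm);

    /* Completion AST: our queued EX conversion was granted, so a
       writer has come and gone and the value block holds its message. */
    static void got_msg(int astprm)
    {
        memcpy(message, lksb.valblk, 16);
        /* Down to PW (granted immediately), re-arming DOORBELL.       */
        sys$enqw(0, LCK$K_PWMODE, &lksb, LCK$M_CONVERT,
                 0, 0, 0, 0, doorbell, 0, 0);
        sys$wake(0, 0);             /* let the main loop act on it     */
    }

    /* Blocking AST: a writer is asking for the lock in EX mode.       */
    static void doorbell(int astprm)
    {
        memset(lksb.valblk, 0, 16); /* clear: "no message"             */
        /* Down-convert to NL; LCK$M_VALBLK stores the cleared block.
           Down-conversions are granted at once, so waiting here is
           safe even at AST level.                                     */
        sys$enqw(0, LCK$K_NLMODE, &lksb, LCK$M_CONVERT | LCK$M_VALBLK,
                 0, 0, 0, 0, 0, 0, 0);
        /* $ENQ, not $ENQW: queue the EX conversion and return.        */
        sys$enq(0, LCK$K_EXMODE, &lksb, LCK$M_CONVERT | LCK$M_VALBLK,
                0, 0, got_msg, 0, 0, 0, 0);
    }

    int main(void)
    {
        /* New lock in EX, then down to PW with the blocking AST.      */
        sys$enqw(0, LCK$K_EXMODE, &lksb, 0, &resnam,
                 0, 0, 0, 0, 0, 0);
        sys$enqw(0, LCK$K_PWMODE, &lksb, LCK$M_CONVERT,
                 0, 0, 0, 0, doorbell, 0, 0);
        for (;;) {
            sys$hiber();            /* DOORBELL/GOT_MSG wake us        */
            /* ... act on message[] ... */
        }
    }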
|
331.5 | DECnet is not so bad | TAV02::NITSAN | Nitsan Duvdevani, Digital Israel | Thu Oct 16 1986 05:29 | 8 |
| re .0
> I don't want the overhead of setting up DECnet links to the node the
> scheduler is running on.
In a small "benchmark" we ran about a year ago (on a small cluster),
DECnet communication (over the CI) was more efficient than the
distributed lock manager.
|
331.6 | DECnet links don't HAVE to be slow | CRATE::COBB | Danny Cobb, DSS Eng, LKG | Mon Oct 20 1986 13:18 | 6 |
| Lou, instead of creating/deleting processes for logical links,
write your own program that declares itself a DECnet object and
handles the incoming links. It's fast, the multithreading isn't
too tough to handle, and your connects are practically instantaneous.
Danny
|
331.7 | Lock manager mechanism works great! | RUMOR::FALEK | ex-TU58 King | Wed Oct 22 1986 14:51 | 14 |
| Re: .6 (Having the server declare itself as a network object and
handle incoming DECnet connects) - that would certainly work, and has
the advantage (not relevant for this particular application) of also
working on a wide-area network. But... I've coded the mechanism
described in .2 and .4 using $ENQ and it works great!! Connections
appear to the user as being nearly instantaneous. I don't think
DECnet connects to a server can work as quickly.
The distributed lock manager is neat stuff! The first time I read
the documentation it seemed confusing, but it is simple to use once
you get the concepts down. Our group is putting all our workstations
on a LAVC, so maybe I can use some of my new-found knowledge to make
some useful cluster utilities, like f'rinstance something to cause
execution of a VMS command on all nodes of a cluster at once.
|
331.8 | How can I pass a message to all nodes of cluster? | FALEK::FALEK | ex-TU58 King | Sat Nov 15 1986 23:57 | 72 |
| I now have my job-creating scheduler program's user-interface working
cluster-wide, using mechanisms discussed earlier in this note.
Users can talk to the scheduler from any node, but jobs get created
only on the node the scheduler is running on. There can be only
one copy of the scheduler running per cluster.
It would be useful to generalize things further. I'd like to add
a "NODE" field to the (common diskfile) database and run a scheduler
on EACH node of the cluster. If the node field in the database
is blank, the job may be run on any CPU in the cluster, otherwise
just on the specified CPU.
I came up with a lock-based communications mechanism to help implement
this, but when I tried it out, I found that my design was wrong. I'm
going to describe it here (and why it doesn't work) in hopes that
readers of this note can offer hints to help me get around the
problem.
My design has one of the schedulers be "master", and all others
slaves. The master is the one that holds the "KO" lock in exclusive
mode. Since a scheduler never gives this up once it gets it, the
master is the first scheduler started in the cluster. If the node
crashes or someone kills the master process, one of the slaves
will get the KO_LOCK and the KO_AST that goes along with the lock
will make it the new master. The KO_AST sets a bit in the scheduler
that tells it that it is the master, and also sets up the
user-interface lock mechanism described in earlier responses to
this note. The user-interface always talks to the master scheduler.
(This stuff, the passing of master-ship and the user-interface always
talking to the master from any node, works correctly)
Now let's say that the master wants to send a message to one of the
slaves. Messages consist of a flag-byte, a destination node, and the
message text. My original (doesn't work) design had each scheduler, at
startup time, enqueue a request for the "Round Robin" (RR) lock
in exclusive mode, specifying the "RR_AST" to be delivered when
the EX-mode request is granted.
When the scheduler acting as master gets the EX-mode RR-lock, it keeps
it until it has a message to circulate to all the slaves. To circulate
a message, it puts the message in the value block, downgrades the lock
to NL-mode, and $ENQs another request to get the lock back in EX-mode.
When a scheduler acting as slave gets the lock, it reads the message
in the value block, sets the flag-byte to "H" if the message was
for it, and then downgrades the lock to NL, thereby letting the next
guy get it. It then $ENQs a request to get the lock again in EX-mode.
Eventually the master should get the EX-mode lock again, look at the
flag byte to see if the message was accepted by anybody, and then keep
the lock in EX-mode until it has another message to circulate. I
reasoned that when the master downgrades the lock and then requests
the upgrade, the conversion request goes at the end of the queue and
so all slaves should get a crack at it before the master gets it again.
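In C, the moving parts of this design look roughly like the following
(the lock name, the value-block layout, and the stubbed node test are
illustrative choices of mine; error checking omitted):

    #include <descrip.h>
    #include <lckdef.h>
    #include <starlet.h>
    #include <string.h>

    typedef struct {
        unsigned short status, reserved;
        unsigned int   lkid;
        char           valblk[16];  /* [0] = flag byte, rest = node + message */
    } LKSB;

    static LKSB rr;
    static $DESCRIPTOR(rr_resnam, "RR_LOCK");      /* illustrative name */

    /* Hypothetical test: does the message name this node?  A real
       version would compare against SYI$_NODENAME from $GETSYIW.      */
    static int for_this_node(const char *valblk)
    {
        return 0;                   /* stub                            */
    }

    /* --- slave side ------------------------------------------------ */

    static void rr_ast(int astprm)  /* completion AST: we now hold EX  */
    {
        if (for_this_node(rr.valblk)) {
            /* ... act on the message ... */
            rr.valblk[0] = 'H';     /* mark it accepted                */
        }
        /* Pass the lock on: down to NL (storing the flag byte), then
           queue another EX conversion with this same AST.             */
        sys$enqw(0, LCK$K_NLMODE, &rr, LCK$M_CONVERT | LCK$M_VALBLK,
                 0, 0, 0, 0, 0, 0, 0);
        sys$enq(0, LCK$K_EXMODE, &rr, LCK$M_CONVERT | LCK$M_VALBLK,
                0, 0, rr_ast, 0, 0, 0, 0);
    }

    static void slave_startup(void)
    {
        /* NEW EX-mode request; RR_AST fires when (if!) it's granted.  */
        sys$enq(0, LCK$K_EXMODE, &rr, LCK$M_VALBLK, &rr_resnam,
                0, rr_ast, 0, 0, 0, 0);
    }

    /* --- master side (already holds the RR lock at EX) ------------- */

    static int circulate(const char *msg)
    {
        rr.valblk[0] = 0;           /* flag byte: not yet accepted     */
        strncpy(rr.valblk + 1, msg, 15);
        /* Publish: the down-conversion to NL stores the value block.  */
        sys$enqw(0, LCK$K_NLMODE, &rr, LCK$M_CONVERT | LCK$M_VALBLK,
                 0, 0, 0, 0, 0, 0, 0);
        /* Requeue for EX; the conversion goes to the end of the
           queue, so the slaves should get their turns first.          */
        sys$enqw(0, LCK$K_EXMODE, &rr, LCK$M_CONVERT | LCK$M_VALBLK,
                 0, 0, 0, 0, 0, 0, 0);
        return rr.valblk[0] == 'H'; /* did any slave take it?          */
    }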
Unfortunately, my mechanism only works for up to 1 slave.
The problem is that new slaves can never get the lock for the first
time (even if they first request it in NL: mode and try to upgrade)
because apparently the lock CONVERSION queue must be empty before any
NEW locks can ever be granted. Since in my scheme either the first slave
or the master will always have an outstanding request for a conversion
of the RR lock to EX-mode, no new slaves can ever acquire the lock for the
first time.
Is this analysis correct? I've gotta believe VMS must do this sort of
thing all the time, so there must be some way to do it!
Can anybody think of a scheme whereby the master can pass a message to
the servers on ALL nodes of the cluster?
lou falek
|
331.9 | see also CSSE32::CLUSTER | FALEK::FALEK | ex-TU58 King | Thu Nov 20 1986 23:09 | 6 |
| I posted my question about using the lock manager to send a message
to all nodes of a cluster to CSSE32::CLUSTER (note 302) and got many
useful suggestions. There are pitfalls, because the rules about which
conversions can and can't be blocked by what are complex, and the
documentation doesn't make them all that clear.
But what I want can be made to work.
|