
Conference:noted::hackers_v1

Title:-={ H A C K E R S }=-
Notice:Write locked - see NOTED::HACKERS
Moderator:DIEHRD::MORRIS
Created:Thu Feb 20 1986
Last Modified:Mon Aug 03 1992
Last Successful Update:Fri Jun 06 1997
Number of topics:680
Total number of notes:5456

635.0. "Cluster monitor/watchdog" by FROST::HARRIMAN (How do I work this?) Fri Dec 18 1987 13:31

    
    Does anybody here have any good ideas on how to tell if another
    cluster member has died? I have a major application which does not
    do journaling, and I cannot touch the file I/O system. However, I
    need to shut it down if a member of the (homogeneous) cluster goes
    away. 
    
    Are there any services available? Could the lock manager tell me?
    
    /pjh
    
635.1. by HIBOB::KRANTZ (Next window please.) Fri Dec 18 1987 13:54 (6 lines)
    You can tell from DCL, so from a program it should be easy!
    
    f$getsyi("cluster_member","nodename") yields TRUE/FALSE...
    presumably there is a matching library routine...
    
    		Joe
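
A minimal sketch of that library-routine route in VAX C - calling $GETSYIW
with the SYI$_CLUSTER_MEMBER item code.  The routine name and error handling
below are invented for illustration; this is an untested sketch, not code
from the conference.

#include <string.h>
#include <descrip.h>
#include <starlet.h>
#include <syidef.h>

/* Return the CLUSTER_MEMBER boolean for a named node, or -1 if the
   $GETSYIW call itself fails. */
int node_is_member(char *nodename)
{
    unsigned char member = 0;            /* receives the TRUE/FALSE byte */
    unsigned short retlen = 0;
    struct dsc$descriptor_s node_dsc;
    struct {                             /* one item-list entry...       */
        unsigned short buflen, itmcod;
        void *bufadr;
        unsigned short *retadr;
        unsigned int terminator;         /* ...plus the longword 0 that
                                            ends the list                */
    } itmlst;
    int status;

    node_dsc.dsc$w_length  = strlen(nodename);
    node_dsc.dsc$b_dtype   = DSC$K_DTYPE_T;
    node_dsc.dsc$b_class   = DSC$K_CLASS_S;
    node_dsc.dsc$a_pointer = nodename;

    itmlst.buflen     = sizeof member;
    itmlst.itmcod     = SYI$_CLUSTER_MEMBER;
    itmlst.bufadr     = &member;
    itmlst.retadr     = &retlen;
    itmlst.terminator = 0;

    status = sys$getsyiw(0, 0, &node_dsc, &itmlst, 0, 0, 0);
    return (status & 1) ? member : -1;
}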
635.2. by STAR::DICKINSON (Peter) Fri Dec 18 1987 15:35 (13 lines)


Is it necessary for you to know if a specific node has gone down, or just
that _a_ node in the cluster has gone down?
How much time is needed from that event until you must take action?
It seems that $GETSYI (or the LIB$ version) will do; the question is how you
are going to detect the event - by polling every delta time, or by having
the event be asynchronous and notifying you.

The interesting, and probably most desirable case, is the asynchronous method.


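For the polling variant, a sketch reusing a node_is_member() check like the
one sketched under .1.  The node list and 30-second interval are arbitrary
examples, and LIB$WAIT is declared by hand here rather than pulled from a
header - untested, assumptions only.

extern int node_is_member(char *nodename);
extern int lib$wait(float *seconds);     /* LIB$ RTL; F-float by reference */

int main(void)
{
    static char *nodes[] = { "NODEA", "NODEB", "NODEC", 0 };
    static float interval = 30.0;        /* the "delta time"               */
    int i;

    for (;;) {
        for (i = 0; nodes[i]; i++)
            if (node_is_member(nodes[i]) == 0) {
                /* member gone: shut the application down here */
            }
        lib$wait(&interval);             /* sleep, then poll again         */
    }
}
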
635.3. "Asynchronously would be much better, thank you" by FROST::HARRIMAN (How do I work this?) Fri Dec 18 1987 16:50 (26 lines)
    
    It would be nice to know what node, although that would be pretty
    moot (I'm sure Operations will figure it out pretty quickly)
    
    As for how quickly to respond, I'd like to detect it as quickly
    as possible to avoid further database corruption (I can guarantee
    that some corruption _will_ be evident; there would be 80+ people
    guaranteed to be in the database at any one time, someone is _Always_
    trying to do some kind of update, and it's all RMS files)
    
    I was thinking about $GETSYI... it's easy, but I don't want to have
    my monitors doing polling - they're busy enough recording audit
    information and updating their "snapshot database" (who's on at
    what time in what process mode). Besides, $GETSYI implies that I'd
    have to have each monitor (or, worse yet, just one of them) poll the
    others, and that isn't what I want - what if the "master" was on
    the machine that crashed?
    
    I'd much rather get notification from some system entity telling me
    that the cluster just changed state or that someone went away - that
    way I could tell HSC crashes too (which are just as bad sometimes).
    Each surviving monitor would have to behave the same, on each VMS node,
    so asynchronous notification would be much more desirable to me. 
    
    /pjh
    
635.4. "How about this" by MDVAX3::COAR (My hero? Vax Headroom, of course!) Fri Dec 18 1987 17:45 (24 lines)
    Use $GETSYI to find out all the members of a VAXcluster, or use
    some sort of stored list of names.  Build a list of resource names
    from this, and write a program which runs on each node.  The program
    will lock the resource specifying its own node with mode=EX, and
    queue (NOT with wait!) a PR-mode lock to all the other resources,
    specifying an AST routine.
    
    The AST will get called when the lock gets granted; if all nodes
    were up and properly synchronised, this means that the node
    corresponding to the resource you just locked has gone down (or
    the program has stopped running).  Once you get it, flag the value
    block to indicate that the node drop has been observed and
    acknowledged, so that any other nodes can know and treat it as a
    no-op when they get it.  Then release the lock.  (Any other nodes
    in the VAXcluster will now get it in turn, see that it has already
    been handled, and release it without doing anything.)
    
    When the victim node comes back up, he'll get his lock in EX mode
    again.  I'll leave it as an exercise for the reader how he can
    tell the other nodes to re-queue their locks on his resource.
    
    Good enough?
    
    #ken	:-)}
635.5. "Oops! I forgot this.." by MDVAX3::COAR (My hero? Vax Headroom, of course!) Fri Dec 18 1987 17:49 (11 lines)
    Oops!  I forgot to mention that your AST should check with $GETSYI
    to see if the specified node is actually available, and pause for
    some interval if so.  This means that, while one node is struggling
    to come up for the first time (i.e., the other programs didn't note
    his going down - possible if the VAXcluster is just booting), the
    others will go into an almost-deadly embrace, passing the lock around
    until the real owner gets to the point of running the program. 
    The pause is to prevent them from sucking up lots of cycles while
    passing the lock around.
    
    #ken	:-)}
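
A rough sketch of the scheme in .4/.5 in VAX C.  The WATCH_ resource-name
prefix, the routine names, and the single shared peer lock block are all
invented, and the code is untested - it just shows the $ENQ shape of the
idea: hold an EX lock on your own resource, and queue (not wait!) a PR lock
with a completion AST on each peer's resource.

#include <stdio.h>
#include <string.h>
#include <descrip.h>
#include <lckdef.h>
#include <starlet.h>

/* Lock status block plus the 16-byte value block used in .4 to flag
   "this drop has already been handled". */
struct lksb {
    unsigned short status, reserved;
    unsigned int lkid;
    unsigned char valblk[16];
};

static struct lksb my_lksb;
static struct lksb peer_lksb;       /* real code: one lksb per peer     */

static void fill_dsc(struct dsc$descriptor_s *d, char *s)
{
    d->dsc$w_length  = strlen(s);
    d->dsc$b_dtype   = DSC$K_DTYPE_T;
    d->dsc$b_class   = DSC$K_CLASS_S;
    d->dsc$a_pointer = s;
}

/* Completion AST: our PR lock was granted, so the peer's EX lock is
   gone - that node (or its watchdog program) has died. */
static void peer_gone_ast(struct lksb *l)
{
    if ((l->status & 1) && l->valblk[0] == 0) {
        /* Flag the drop as handled, per .4.  NB: $DEQ writes the value
           block back only from a PW or EX mode lock, so real code would
           convert this lock to EX before flagging. */
        l->valblk[0] = 1;
        /* per .5: confirm with $GETSYI, then shut the application down
           and pause before re-queueing */
    }
    sys$deq(l->lkid, l->valblk, 0, LCK$M_VALBLK);   /* pass it along    */
}

int hold_own(char *mynode)          /* run at startup on each member    */
{
    static char resnam[32];
    struct dsc$descriptor_s res;

    sprintf(resnam, "WATCH_%s", mynode);
    fill_dsc(&res, resnam);
    return sys$enqw(0, LCK$K_EXMODE, &my_lksb, LCK$M_VALBLK,
                    &res, 0, 0, 0, 0, 0, 0);
}

int watch_peer(char *peer)          /* queue, do NOT wait (per .4)      */
{
    static char resnam[32];
    struct dsc$descriptor_s res;

    sprintf(resnam, "WATCH_%s", peer);
    fill_dsc(&res, resnam);
    return sys$enq(0, LCK$K_PRMODE, &peer_lksb, LCK$M_VALBLK,
                   &res, 0, peer_gone_ast, &peer_lksb, 0, 0, 0);
}

Making the resource names visible across UIC groups would take the
LCK$M_SYSTEM flag on the $ENQ calls, which needs SYSLCK privilege.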
635.6. "Dennis' Command Files" by BLITZN::ROBERTS (Peace .XOR. Freedom ?) Fri Dec 18 1987 18:46 (164 lines)
    The following two command files are used at CXO.  One monitors via
    the network (and thus is impervious to the monitored cluster's demise)
    and the other runs on the cluster.  The command files are separated
    with form feeds.  For questions, comments, etc about these two command
    files, please address their author via mail to KNEE::ATKINSON.
    
    						/Dwayne Roberts
    

$!
$!	Title      : CAPS_MONITOR_NODES.COM
$!	Version    : V1.0
$!	Date       : 14-Apr-1987
$!	Programmer : Dennis Atkinson
$!
$!	Description:
$!        This command file is used to monitor the nodes in the CAPS cluster.
$!        It will inform the OPERATOR and SYSTEM accounts if a node drops from
$!        the cluster.  The program is continuously running on all nodes in 
$!        the CAPS Production cluster.
$!
$!      Maintenance History:
$!      Version   Date        Initials  Description             
$!      V1.0    14 Apr 87       DMA     Created
$!      V1.1    30 Apr 87       DMA     Added notify by remote submit to dasher
$!      V1.1a   12 Jun 87       DMA     Added notify of LTA1000 terminal on CAPS
$!
$!
$Start: 
$       vers = f$extract(18,2,f$time())
$       on error then goto finish_up
$       on control_y then goto finish_up
$       this_node := ""
$       node := ""
$       cnt = 0
$       a_member := ""
$       all_here := ""
$       repeat = 0
$       log_count = 0
$       define /nolog system1 KNEE
$       users := (ATKINSON,SYSTEM,OPERATOR)
$!
$Get_nodes:
$       show cluster/out=sys$manager:cluster.tmp
$       open/err=finish_up cluster_list sys$manager:cluster.tmp
$read_header:
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       cnt = 0
$show_nodes:
$       read/end=close_this_one/err=finish_up cluster_list node
$       node = f$edit(f$extract(2,6,node),"trim,compress,upcase")
$       if node .eqs. "" then goto close_this_one
$       a_member = f$getsyi("cluster_member","''node'")
$       if .not. a_member then goto show_nodes
$       all_here := "''all_here'" "''node'"
$       !
$       goto show_nodes
$close_this_one:
$       close cluster_list /nolog
$       delete /nolog sys$manager:cluster.tmp;*
$Check_members:
$       cnt=cnt+1
$       this_node = f$logical("system''cnt'")
$       if this_node .eqs. "" then goto do_it_again
$       on error then continue
$       if f$locate("''this_node'","''all_here'") .eq. f$length("''all_here'") -
        then  reply/bell/urgent/term=LTA1000: -
        "''this_node' IS NOT A MEMBER OF THE CAPS DEVELOPMENT CLUSTER - PLEASE CHECK ''this_node'"
$       on error then goto finish_up
$       goto check_members
$do_it_again:
$       wait 00:00:30.00
$Finish_up:
$       gosub clear_logicals
$!
$!      This will restart the process NODE_CHK_x on CAPS
$!
$       run /detached /uic=[1,4] -
          /input=sys$manager:caps_monitor_nodes.com  -
          /output=nl: -
          /error=sys$manager:caps_monitor_nodes.err -
          /prio=3 -
          /process="MON_NODES_''vers'" -
          sys$system:loginout.exe
$!
$       exit
$       !       S U B R O U T I N E S 
$!
$Clear_logicals:
$       log_count=log_count+1
$       if f$logical("system''log_count'") .eqs. "" then return
$       deassign system'log_count'
$       goto clear_logicals


$!
$!	Title      : CAPS_MONITOR_NET_NODES.COM
$!	Version    : V1.0
$!	Date       : 23-Oct-1987
$!	Programmer : Dennis Atkinson
$!
$!	Description:
$!        This command file is used to monitor the nodes in the CAPS
$!        production and test clusters.  It will inform LTA1000, a remote
$!        console, if a node drops from the network.  The program is 
$!        continuously running on all nodes in the CAPS Production cluster.
$!
$!      Maintenance History:
$!      Version   Date        Initials  Description             
$!      V1.0    23 Oct 87       DMA     Created
$!
$Start: 
$       set noverify
$       log_count=0
$       cnt=0
$       vers = f$extract(18,2,f$time())
$       on control_y then exit
$!
$       define /nolog system1 TOPCAP
$       define /nolog system2 BOTTLE
$       define /nolog system3 NIGHT
$!
$Check:
$       cnt=cnt+1
$       this_node = f$logical("system''cnt'")
$       if this_node .eqs. "" then gosub finish_up
$       on error then goto no_network
$       open/write network_up  'this_node'::nl:
$       close network_up
$       goto check
$!
$No_network:
$!
$       reply/URGENT/bell/term=lta1000 -
"''this_node' is not available from the CAPS cluster - please investigate."
$       return
$Finish_up:
$       gosub clear_logicals
$!
$!      This will restart the process mon_net_nodes_xxx on CAPS
$!
$       on error then exit
$!
$       wait 00:00:30.00
$       run /detached /uic=[1,4] -
          /input=sys$manager:caps_monitor_net_nodes.com  -
          /output=nl: -
          /error=sys$manager:caps_monitor_net_nodes.err  -
          /prio=3 -
          /process="MON_NET_''vers'" -
          sys$system:loginout.exe
$!
$       exit
$Clear_logicals:
$       log_count=log_count+1
$       if f$logical("system''log_count'") .eqs. "" then return
$       deassign system'log_count'
$       goto clear_logicals

635.7. "Locks are re-granted later on..." by CHOVAX::YOUNG (Back from the Shadows Again,) Sat Dec 19 1987 23:05 (7 lines)
    Re .4, .5:
    
    I think that this method will only inform you AFTER the lock database
    has been rebuilt.  I believe that $GETSYI will reveal the transition
    as soon as the connection manager figures it out.
    
    --  Barry
635.8. "Thanks, time to start investigating..." by FROST::HARRIMAN (How do I work this?) Mon Dec 21 1987 09:14 (19 lines)
    
    re: .6
    
      Interesting. I need other things to happen with the monitors though.
    
    re: .4, .5
    
      Fascinating. I'll go hack on it. I had a feeling that the Distributed
    Lock Manager was something I could use. 
    
    It looks like $GETSYI and some lock detection are needed. I wish I
    could get some kind of notification from the system when the
    cluster is transitioning. How does VMS do it? OPCOM surely
    hears about it... Is this from other OPCOMs or is it something else
    altogether? 
    
    Now we're just getting academic....
    
    /pjh
635.9. by VINO::RASPUZZI (Michael Raspuzzi) Mon Dec 21 1987 09:57 (18 lines)
    I am not a VMS person but have a question that pertains to this
    and maybe someone in VMS could answer it. Is there another system
    service provided by VMS that uses the SCA facility? SCA is Digital's
    way of connecting homogeneous (and heterogeneous!) machines in a
    cluster. If there is a way for you to interface into SCA through
    a system service, then you could mimic a SYSAP (system application).
    When a node goes down (HSC or not) SCA will give you a "port broke
    connection" callback and you can then react to it as you will.
    
    I ask this because in TOPS-20, we have the SCS% JSYS. In fact, I
    have a fork that always runs and it uses SCS% to notify me if a
    node goes down. No polling. SCA simply gives my process a software
    interrupt (PSI - like an AST) and away it goes.
    
    One word of caution though. SCS% on TOPS-20 needs privs. If there
    is an equivalent on VMS, it may also need privs.
    
    Mike
635.10. "Well.." by MDVAX3::COAR (My hero? Vax Headroom, of course!) Mon Dec 21 1987 12:13 (13 lines)
    Re .7:
    
    The entire VAXcluster is blocked until the state transition completes,
    and the lock DB rebuild is part of the transition.  You can avoid
    this (that is, continue running through the transition) by running at a
    high IPL, but that may interfere with the connexion manager, and it
    would take a lot of privileges and development time.
    
    Re .9:
    
    SCA?  Are you sure you don't mean SCS (System Communication Services)?
    
    #ken	:-)}
635.11. by VINO::RASPUZZI (Michael Raspuzzi) Mon Dec 21 1987 13:34 (8 lines)
    SCA/SCS are really the same thing (SCA stands for System Communication
    Architecture and SCS stands for System Communication Service).
    
    Our module is called System Communication Architecture for
    Multiprocessor Interconnect or SCAMPI for short (not to be confused
    with an Italian shrimp dish).
    
    Mike
635.12. "SCS is a possibility.." by FROST::HARRIMAN (How do I work this?) Mon Dec 21 1987 15:51 (23 lines)
    
    Re: .-2
    
    Privs are not a problem. The processes all run detached and
    communicate via mailboxes. Elevated IPLs may or may not be a problem.
    I'm not afraid of writing kernel-mode code, but only for a good
    reason.
    SCS sounds interesting; I was thinking about it since CUDRIVER (which
    is a decent example) uses it and even has a device hanging around
    (CUA0:). 
    
    I'm checking Digital Technical Journal #5, Sept. '87, to see what
    else I can find - how does the connection manager signal to people
    like OPCOM? Can I get it to tell me a connection went away? It
    would help; I have found that occasionally an HSC will drop,
    failover won't work right, and I then need to recover (after 20-30
    minutes of hung processes, myriad telephone calls to the computer
    center, aggravation, etc.). I can build smarts into the monitor *if*
    I can find out the information...
    
    Thanks
    
    /pjh
635.13. "Is all this really necessary?" by SQM::HALLYB (Khan or bust!) Tue Dec 22 1987 16:49 (12 lines)
.3>    As for how quickly to respond, I'd like to detect it as quickly
.3>    as possible to avoid further database corruption (I can guarantee
.3>    that some corruption _will_ be evident; there would be 80+ people
.3>    guaranteed to be in the database at any one time, someone is _Always_
.3>    trying to do some kind of update, and it's all RMS files)
						  ^^^^^^^

  If you can "guarantee" corruption I'm sure the RMS developers would be most 
  interested.  RMS corruption should be independent of cluster membership.
    
  Perhaps you could sketch out a scenario or two where you figure "corruption" 
  will occur.  Perhaps the problem is not as bad as you contemplate. 
635.14. "Observation from the peanut gallery" by DPDMAI::BEATTIE (But, Is BLISS ignorance?) Tue Dec 22 1987 19:14 (12 lines)
    re: .13
    
    	All you have to do is update more than one file in a
    transaction and have the node fail in the middle.  Thank goodness
    for START TRANSACTION and COMMIT in Rdb/VMS.
    
    	A FORTRAN application I helped to convert uses 50 interrelated
    RMS files, and the cleanup effort after a node failure was torture
    (but it has nothing whatever to do with RMS, just the [*ahem*] screwballs
    that thunk up the application in the first place).
    
    				-- Brian
635.15. "No peanut butter, thank you" by SQM::HALLYB (Khan or bust!) Tue Dec 22 1987 21:48 (7 lines)
>    All you have to do is update more than one file in a
>    transaction and have the node fail in the middle.  Thank goodness
>    for START TRANSACTION and COMMIT in Rdb/VMS.
    
    In RMS, this is known as $START_RU and $END_RU
    
    I was hoping for some examples with meat in them.
635.16. "more detail, anyway." by FROST::HARRIMAN (Here we come a'waffling) Wed Dec 23 1987 08:55 (45 lines)
    
    re: .13, .14, .15
    
      This particular application uses approximately 179 RMS files,
    mostly indexed, and much of the time does not do its transaction
    updates in a timely fashion. It happens to do all of the workorder,
    inventory control, planning, and financial accounting for this site.
    
      $START_RU and $END_RU are V5isms. As I said earlier, I have no
    access to the routines that provide me with this. As the image (yes,
    they made it a single image) is approx. 19000 blocks, contains
    about 700 subroutines, and people can jump between them without
    leaving the image, I can't really try BIJ/AIJ. Run-unit journaling
    in RMS? Is it there? Do you have documentation? 
    
      Back to the subject: Corruption is in the eye of the beholder.
    Very rarely we get corrupted files during system crashes (some files
    are over 200K blocks, notably stock status, manuf. cost detail
    transactions). Mostly the corruption I am talking about is transaction
    corruption. Long ago I installed hooks into the application code
    (I do have the sources for those) to allow the monitors to record
    what was being entered and by whom (the vanilla auditing package
    is lousy too). This was for the same reason - if a cluster member
    goes away I need to know what people were doing so we can (manually)
    recover. Now we are upgrading our cluster (8*8800s) and the application
    will run on all members (along with office automation, an RDB database,
    a DBMS-32 database, and yet another file system by a third party).
    If a node crashes, the RDB/DBMS monitors are intelligent enough
    to recover the databases. This application is not. Therefore, the
    monitors need to:
    
    1) record what people are doing, and when, and how (they do this
       now)
    
    2) Detect a system failure and shut down access to the application
       (half way there, I can shut down the application)
    
    3) provide a snapshot of the username/function combinations to us
       so that we can (manually) track down transactions. This happens
       now, also. If there was a better way without hooking into the
       I/O system I'd be glad to listen.
    
    Anybody guess the application yet?
    
    /pjh
635.17. "MAXCIM?" by CIMNET::NIKOPOULOS (Steve Nikopoulos) Wed Dec 23 1987 12:59 (10 lines)
Sounds like MAXCIM. If it is MAXCIM, NCA should be convinced that it's
important enough for them to develop and maintain better audit
trail/recovery mechanisms. In doing so, all of NCA's customers would
benefit, including the X number of Digital sites running the application.
(At last count it was about 30 sites.)

In any case, your efforts are admirable.

Steve
635.18. "Now THAT'S meaty!" by SQM::HALLYB (Khan or bust!) Wed Dec 23 1987 14:21 (6 lines)
>    $START_RU and $END_RU are V5isms. As I said earlier, I have no
>    access to the routines that provide me with this. 
    
    What do you mean "V5isms"?  Digital has it now!
    
    Gander ye BULOVA::VAX-RMS (KP7) for details.
635.19. "Did I mention the prize?" by FROST::HARRIMAN (Here we come a'waffling) Wed Dec 23 1987 16:46 (15 lines)
    
    re: .18
    
       I'll do just that - once I return from vacation.
    
    re: .17
    
       You are correct - however, NCA no longer exists, having been bought
    by their major competitor, ASK. MAXCIM is no longer getting development
    time outside of DEC; we are all stuck with it, and I still have the
    same problem, except that we'll eventually be rewriting it for the VAX
    architecture. Still the same old garbage code.
    
    
    /Paul_who_is_sick_of_pdp11_application_code_on_vaxen