
Conference:noted::hackers_v1

Title:-={ H A C K E R S }=-
Notice:Write locked - see NOTED::HACKERS
Moderator:DIEHRD::MORRIS
Created:Thu Feb 20 1986
Last Modified:Mon Aug 03 1992
Last Successful Update:Fri Jun 06 1997
Number of topics:680
Total number of notes:5456

635.0. "Cluster monitor/watchdog" by FROST::HARRIMAN (How do I work this?) Fri Dec 18 1987 13:31

    
    Does anybody here have any good ideas on how to tell if another
    cluster member has died? I have a major application which does not
    do journaling, and I cannot touch the file I/O system. However, I
    need to shut it down if a member of the (homogeneous) cluster goes
    away. 
    
    Are there any services available? Could the lock manager tell me?
    
    /pjh
    
635.1. by HIBOB::KRANTZ (Next window please.) Fri Dec 18 1987 13:54 (6 lines)
    You can tell from DCL, so from a program it should be easy!
    
    f$getsyi("cluster_member","nodename") yields TRUE/FALSE...
    presumably there is a matching library routine...
    
    		Joe
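
A minimal sketch of that library-routine route in VAX C - calling $GETSYIW
with the SYI$_CLUSTER_MEMBER item code.  The routine name and error handling
below are invented for illustration; this is an untested sketch, not code
from the conference.

#include <string.h>
#include <descrip.h>
#include <starlet.h>
#include <syidef.h>

/* Return the CLUSTER_MEMBER boolean for a named node, or -1 if the
   $GETSYIW call itself fails. */
int node_is_member(char *nodename)
{
    unsigned char member = 0;            /* receives the TRUE/FALSE byte */
    unsigned short retlen = 0;
    struct dsc$descriptor_s node_dsc;
    struct {                             /* one item-list entry...       */
        unsigned short buflen, itmcod;
        void *bufadr;
        unsigned short *retadr;
        unsigned int terminator;         /* ...plus the longword 0 that
                                            ends the list                */
    } itmlst;
    int status;

    node_dsc.dsc$w_length  = strlen(nodename);
    node_dsc.dsc$b_dtype   = DSC$K_DTYPE_T;
    node_dsc.dsc$b_class   = DSC$K_CLASS_S;
    node_dsc.dsc$a_pointer = nodename;

    itmlst.buflen     = sizeof member;
    itmlst.itmcod     = SYI$_CLUSTER_MEMBER;
    itmlst.bufadr     = &member;
    itmlst.retadr     = &retlen;
    itmlst.terminator = 0;

    status = sys$getsyiw(0, 0, &node_dsc, &itmlst, 0, 0, 0);
    return (status & 1) ? member : -1;
}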
635.2. by STAR::DICKINSON (Peter) Fri Dec 18 1987 15:35 (13 lines)


Is it necessary for you to know if a specific node has gone down, or just
that _a_ node in the cluster has gone down?
How much time is needed from that event until you must take action?
It seems that $GETSYI (or the LIB$ version) will do; the question is how you
are going to detect the event - by polling every delta time, or by having
the event be asynchronous and notifying you.

The interesting, and probably most desirable case, is the asynchronous method.


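For the polling variant, a sketch reusing a node_is_member() check like the
one sketched under .1.  The node list and 30-second interval are arbitrary
examples, and LIB$WAIT is declared by hand here rather than pulled from a
header - untested, assumptions only.

extern int node_is_member(char *nodename);
extern int lib$wait(float *seconds);     /* LIB$ RTL; F-float by reference */

int main(void)
{
    static char *nodes[] = { "NODEA", "NODEB", "NODEC", 0 };
    static float interval = 30.0;        /* the "delta time"               */
    int i;

    for (;;) {
        for (i = 0; nodes[i]; i++)
            if (node_is_member(nodes[i]) == 0) {
                /* member gone: shut the application down here */
            }
        lib$wait(&interval);             /* sleep, then poll again         */
    }
}
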
635.3. "Asynchronously would be much better, thank you" by FROST::HARRIMAN (How do I work this?) Fri Dec 18 1987 16:50 (26 lines)
    
    It would be nice to know what node, although that would be pretty
    moot (I'm sure Operations will figure it out pretty quickly)
    
    As for how quickly to respond, I'd like to detect it as quickly
    as possible to avoid further database corruption (I can guarantee
    that some corruption _will_ be evident; there would be 80+ people
    guaranteed to be in the database at any one time, someone is _Always_
    trying to do some kind of update, and it's all RMS files)
    
    I was thinking about $GETSYI... it's easy, but I don't want to have
    my monitors doing polling - they're busy enough recording audit
    information and updating their "snapshot database" (who's on at
    what time in what process mode). Besides, $GETSYI implies that I'd
    have to have each monitor (or, worse yet, just one of them) poll the
    others, and that isn't what I want - what if the "master" was on
    the machine that crashed?
    
    I'd much rather get notification from some system entity telling me
    that the cluster just changed state or that someone went away - that
    way I could tell HSC crashes too (which are just as bad sometimes).
    Each surviving monitor would have to behave the same, on each VMS node,
    so asynchronous notification would be much more desirable to me. 
    
    /pjh
    
635.4. "How about this" by MDVAX3::COAR (My hero? Vax Headroom, of course!) Fri Dec 18 1987 17:45 (24 lines)
    Use $GETSYI to find out all the members of a VAXcluster, or use
    some sort of stored list of names.  Build a list of resource names
    from this, and write a program which runs on each node.  The program
    will lock the resource specifying its own node with mode=EX, and
    queue (NOT with wait!) a PR-mode lock to all the other resources,
    specifying an AST routine.
    
    The AST will get called when the lock gets granted; if all nodes
    were up and properly synchronised, this means that the node
    corresponding to the resource you just locked has gone down (or
    the program has stopped running).  Once you get it, flag the value
    block to indicate that the node drop has been observed and
    acknowledged, so that any other nodes can know and treat it as a
    no-op when they get it.  Then release the lock.  (Any other nodes
    in the VAXcluster will now get it in turn, see that it has already
    been handled, and release it without doing anything.)
    
    When the victim node comes back up, he'll get his lock in EX mode
    again.  I'll leave it as an exercise for the reader how he can
    tell the other nodes to re-queue their locks on his resource.
    
    Good enough?
    
    #ken	:-)}
635.5. "Oops! I forgot this.." by MDVAX3::COAR (My hero? Vax Headroom, of course!) Fri Dec 18 1987 17:49 (11 lines)
    Oops!  I forgot to mention that your AST should check with $GETSYI
    to see if the specified node is actually available, and pause for
    some interval if so.  This means that, while one node is struggling
    to come up for the first time (i.e., the other programs didn't note
    his going down - possible if the VAXcluster is just booting), the
    others will go into an almost-deadly embrace, passing the lock around
    until the real owner gets to the point of running the program. 
    The pause is to prevent them from sucking up lots of cycles while
    passing the lock around.
    
    #ken	:-)}
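
A rough sketch of the scheme in .4/.5 in VAX C.  The WATCH_ resource-name
prefix, the routine names, and the single shared peer lock block are all
invented, and the code is untested - it just shows the $ENQ shape of the
idea: hold an EX lock on your own resource, and queue (not wait!) a PR lock
with a completion AST on each peer's resource.

#include <stdio.h>
#include <string.h>
#include <descrip.h>
#include <lckdef.h>
#include <starlet.h>

/* Lock status block plus the 16-byte value block used in .4 to flag
   "this drop has already been handled". */
struct lksb {
    unsigned short status, reserved;
    unsigned int lkid;
    unsigned char valblk[16];
};

static struct lksb my_lksb;
static struct lksb peer_lksb;       /* real code: one lksb per peer     */

static void fill_dsc(struct dsc$descriptor_s *d, char *s)
{
    d->dsc$w_length  = strlen(s);
    d->dsc$b_dtype   = DSC$K_DTYPE_T;
    d->dsc$b_class   = DSC$K_CLASS_S;
    d->dsc$a_pointer = s;
}

/* Completion AST: our PR lock was granted, so the peer's EX lock is
   gone - that node (or its watchdog program) has died. */
static void peer_gone_ast(struct lksb *l)
{
    if ((l->status & 1) && l->valblk[0] == 0) {
        /* Flag the drop as handled, per .4.  NB: $DEQ writes the value
           block back only from a PW or EX mode lock, so real code would
           convert this lock to EX before flagging. */
        l->valblk[0] = 1;
        /* per .5: confirm with $GETSYI, then shut the application down
           and pause before re-queueing */
    }
    sys$deq(l->lkid, l->valblk, 0, LCK$M_VALBLK);   /* pass it along    */
}

int hold_own(char *mynode)          /* run at startup on each member    */
{
    static char resnam[32];
    struct dsc$descriptor_s res;

    sprintf(resnam, "WATCH_%s", mynode);
    fill_dsc(&res, resnam);
    return sys$enqw(0, LCK$K_EXMODE, &my_lksb, LCK$M_VALBLK,
                    &res, 0, 0, 0, 0, 0, 0);
}

int watch_peer(char *peer)          /* queue, do NOT wait (per .4)      */
{
    static char resnam[32];
    struct dsc$descriptor_s res;

    sprintf(resnam, "WATCH_%s", peer);
    fill_dsc(&res, resnam);
    return sys$enq(0, LCK$K_PRMODE, &peer_lksb, LCK$M_VALBLK,
                   &res, 0, peer_gone_ast, &peer_lksb, 0, 0, 0);
}

Making the resource names visible across UIC groups would take the
LCK$M_SYSTEM flag on the $ENQ calls, which needs SYSLCK privilege.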
635.6. "Dennis' Command Files" by BLITZN::ROBERTS (Peace .XOR. Freedom ?) Fri Dec 18 1987 18:46 (164 lines)
    The following two command files are used at CXO.  One monitors via
    the network (and thus is impervious to the monitored cluster's demise)
    and the other runs on the cluster.  The command files are separated
    with form feeds.  For questions, comments, etc about these two command
    files, please address their author via mail to KNEE::ATKINSON.
    
    						/Dwayne Roberts
    

$!
$!	Title      : CAPS_MONITOR_NODES.COM
$!	Version    : V1.0
$!	Date       : 14-Apr-1987
$!	Programmer : Dennis Atkinson
$!
$!	Description:
$!        This command file is used to monitor the nodes in the CAPS cluster.
$!        It will inform the OPERATOR and SYSTEM accounts if a node drops from
$!        the cluster.  The program is continuously running on all nodes in 
$!        the CAPS Production cluster.
$!
$!      Maintenance History:
$!      Version   Date        Initials  Description             
$!      V1.0    14 Apr 87       DMA     Created
$!      V1.1    30 Apr 87       DMA     Added notify by remote submit to dasher
$!      V1.1a   12 Jun 87       DMA     Added notify of LTA1000 terminal on CAPS
$!
$!
$Start: 
$       vers = f$extract(18,2,f$time())
$       on error then goto finish_up
$       on control_y then goto finish_up
$       this_node := ""
$       node := ""
$       cnt = 0
$       a_member := ""
$       all_here := ""
$       repeat = 0
$       log_count = 0
$       define /nolog system1 KNEE
$       users := (ATKINSON,SYSTEM,OPERATOR)
$!
$Get_nodes:
$       show cluster/out=sys$manager:cluster.tmp
$       open/err=finish_up cluster_list sys$manager:cluster.tmp
$read_header:
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       read cluster_list node
$       cnt = 0
$show_nodes:
$       read/end=close_this_one/err=finish_up cluster_list node
$       node = f$edit(f$extract(2,6,node),"trim,compress,upcase")
$       if node .eqs. "" then goto close_this_one
$       a_member = f$getsyi("cluster_member","''node'")
$       if .not. a_member then goto show_nodes
$       all_here := "''all_here'" "''node'"
$       !
$       goto show_nodes
$close_this_one:
$       close cluster_list /nolog
$       delete /nolog sys$manager:cluster.tmp;*
$Check_members:
$       cnt=cnt+1
$       this_node = f$logical("system''cnt'")
$       if this_node .eqs. "" then goto do_it_again
$       on error then continue
$       if f$locate("''this_node'","''all_here'") .eq. f$length("''all_here'") -
        then  reply/bell/urgent/term=LTA1000: -
        "''this_node' IS NOT A MEMBER OF THE CAPS DEVELOPMENT CLUSTER - PLEASE CHECK ''this_node'"
$       on error then goto finish_up
$       goto check_members
$do_it_again:
$       wait 00:00:30.00
$Finish_up:
$       gosub clear_logicals
$!
$!      This will restart the process NODE_CHK_x on CAPS
$!
$       run /detached /uic=[1,4] -
          /input=sys$manager:caps_monitor_nodes.com  -
          /output=nl: -
          /error=sys$manager:caps_monitor_nodes.err -
          /prio=3 -
          /process="MON_NODES_''vers'" -
          sys$system:loginout.exe
$!
$       exit
$       !       S U B R O U T I N E S 
$!
$Clear_logicals:
$       log_count=log_count+1
$       if f$logical("system''log_count'") .eqs. "" then return
$       deassign system'log_count'
$       goto clear_logicals


$!
$!	Title      : CAPS_MONITOR_NET_NODES.COM
$!	Version    : V1.0
$!	Date       : 23-Oct-1987
$!	Programmer : Dennis Atkinson
$!
$!	Description:
$!        This command file is used to monitor the nodes in the CAPS
$!        production and test clusters.  It will inform LTA1000, a remote
$!        console, if a node drops from the network.  The program is 
$!        continuously running on all nodes in the CAPS Production cluster.
$!
$!      Maintenance History:
$!      Version   Date        Initials  Description             
$!      V1.0    23 Oct 87       DMA     Created
$!
$Start: 
$       set noverify
$       log_count=0
$       cnt=0
$       vers = f$extract(18,2,f$time())
$       on control_y then exit
$!
$       define /nolog system1 TOPCAP
$       define /nolog system2 BOTTLE
$       define /nolog system3 NIGHT
$!
$Check:
$       cnt=cnt+1
$       this_node = f$logical("system''cnt'")
$       if this_node .eqs. "" then gosub finish_up
$       on error then goto no_network
$       open/write network_up  'this_node'::nl:
$       close network_up
$       goto check
$!
$No_network:
$!
$       reply/URGENT/bell/term=lta1000 -
"''this_node' is not available from the CAPS cluster - please investigate."
$       return
$Finish_up:
$       gosub clear_logicals
$!
$!      This will restart the process mon_net_nodes_xxx on CAPS
$!
$       on error then exit
$!
$       wait 00:00:30.00
$       run /detached /uic=[1,4] -
          /input=sys$manager:caps_monitor_net_nodes.com  -
          /output=nl: -
          /error=sys$manager:caps_monitor_net_nodes.err  -
          /prio=3 -
          /process="MON_NET_''vers'" -
          sys$system:loginout.exe
$!
$       exit
$Clear_logicals:
$       log_count=log_count+1
$       if f$logical("system''log_count'") .eqs. "" then return
$       deassign system'log_count'
$       goto clear_logicals

635.7. "Locks are re-granted later on..." by CHOVAX::YOUNG (Back from the Shadows Again,) Sat Dec 19 1987 23:05 (7 lines)
    Re .4, .5:
    
    I think that this method will only inform you AFTER the lock database
    has been rebuilt.  I believe that $GETSYI will reveal the transition
    as soon as the connection manager figures it out.
    
    --  Barry
635.8. "Thanks, time to start investigating..." by FROST::HARRIMAN (How do I work this?) Mon Dec 21 1987 09:14 (19 lines)
    
    re: .6
    
      Interesting. I need other things to happen with the monitors though.
    
    re: .4, .5
    
      Fascinating. I'll go hack on it. I had a feeling that the Distributed
    Lock Manager was something I could use. 
    
    It looks like $GETSYI and some lock detection are needed. I wish I
    could get some kind of notification from the system when the
    cluster is transitioning. How does VMS do it? OPCOM surely
    hears about it... Is this from other OPCOMs or is it something else
    altogether? 
    
    Now we're just getting academic....
    
    /pjh
635.9. by VINO::RASPUZZI (Michael Raspuzzi) Mon Dec 21 1987 09:57 (18 lines)
    I am not a VMS person but have a question that pertains to this
    and maybe someone in VMS could answer it. Is there another system
    service provided by VMS that uses the SCA facility? SCA is Digital's
    way of connecting homogeneous (and heterogeneous!) machines in a
    cluster. If there is a way for you to interface into SCA through
    a system service, then you could mimic a SYSAP (system application).
    When a node goes down (HSC or not) SCA will give you a "port broke
    connection" callback and you can then react to it as you will.
    
    I ask this because in TOPS-20, we have the SCS% JSYS. In fact, I
    have a fork that always runs and it uses SCS% to notify me if a
    node goes down. No polling. SCA simply gives my process a software
    interrupt (PSI - like an AST) and away it goes.
    
    One word of caution though. SCS% on TOPS-20 needs privs. If there
    is an equivalent on VMS, it may also need privs.
    
    Mike
635.10. "Well.." by MDVAX3::COAR (My hero? Vax Headroom, of course!) Mon Dec 21 1987 12:13 (13 lines)
    Re .7:
    
    The entire VAXcluster is blocked until the state transition completes,
    and the lock DB rebuild is part of the transition.  You can avoid
    this (that is, continue running through the transition) by running at a
    high IPL, but that may interfere with the connexion manager, and it
    would take a lot of privileges and development time.
    
    Re .9:
    
    SCA?  Are you sure you don't mean SCS (System Communication Services)?
    
    #ken	:-)}
635.11. by VINO::RASPUZZI (Michael Raspuzzi) Mon Dec 21 1987 13:34 (8 lines)
    SCA/SCS are really the same thing (SCA stands for System Communication
    Architecture and SCS stands for System Communication Service).
    
    Our module is called System Communication Architecture for
    Multiprocessor Interconnect or SCAMPI for short (not to be confused
    with an Italian shrimp dish).
    
    Mike
635.12. "SCS is a possibility.." by FROST::HARRIMAN (How do I work this?) Mon Dec 21 1987 15:51 (23 lines)
    
    Re: .-2
    
    Privs are not a problem. The processes all run detached and
    communicate via mailboxes. Elevated IPLs may or may not be a problem.
    I'm not afraid of writing kernel-mode code, but only for a good
    reason.
    SCS sounds interesting; I was thinking about it since CUDRIVER (which
    is a decent example) uses it and even has a device hanging around
    (CUA0:). 
    
    I'm checking Digital Technical Journal #5, Sept. '87, to see what
    else I can find - how does the connection manager signal to people
    like OPCOM? Can I get it to tell me a connection went away? It
    would help; I have found that occasionally an HSC will drop,
    failover won't work right, and I then need to recover (after 20-30
    minutes of hung processes, myriad telephone calls to the computer
    center, aggravation, etc.). I can build smarts into the monitor *if*
    I can find out the information...
    
    Thanks
    
    /pjh
635.13. "Is all this really necessary?" by SQM::HALLYB (Khan or bust!) Tue Dec 22 1987 16:49 (12 lines)
.3>    As for how quickly to respond, I'd like to detect it as quickly
.3>    as possible to avoid further database corruption (I can guarantee
.3>    that some corruption _will_ be evident; there would be 80+ people
.3>    guaranteed to be in the database at any one time, someone is _Always_
.3>    trying to do some kind of update, and it's all RMS files)
						  ^^^^^^^

  If you can "guarantee" corruption I'm sure the RMS developers would be most 
  interested.  RMS corruption should be independent of cluster membership.
    
  Perhaps you could sketch out a scenario or two where you figure "corruption" 
  will occur.  Perhaps the problem is not as bad as you contemplate. 
635.14. "Observation from the peanut gallery" by DPDMAI::BEATTIE (But, Is BLISS ignorance?) Tue Dec 22 1987 19:14 (12 lines)
    re: .13
    
    	All you have to do is update more than one file in a
    transaction and have the node fail in the middle.  Thank goodness
    for START TRANSACTION and COMMIT in Rdb/VMS.
    
    	A FORTRAN application I helped to convert uses 50 interrelated
    RMS files, and the cleanup effort after a node failure was torture
    (but it has nothing whatever to do with RMS, just the [*ahem*] screwballs
    that thunk up the application in the first place).
    
    				-- Brian
635.15. "No peanut butter, thank you" by SQM::HALLYB (Khan or bust!) Tue Dec 22 1987 21:48 (7 lines)
>    All you have to do is update more than one file in a
>    transaction and have the node fail in the middle.  Thank goodness
>    for START TRANSACTION and COMMIT in Rdb/VMS.
    
    In RMS, this is known as $START_RU and $END_RU
    
    I was hoping for some examples with meat in them.
635.16. "more detail, anyway." by FROST::HARRIMAN (Here we come a'waffling) Wed Dec 23 1987 08:55 (45 lines)
    
    re: .13, .14, .15
    
      This particular application uses approximately 179 RMS files,
    mostly indexed, and much of the time does not do its transaction
    updates in a timely fashion. It happens to do all of the workorder,
    inventory control, planning, and financial accounting for this site.
    
      $START_RU and $END_RU are V5isms. As I said earlier, I have no
    access to the routines that provide me with this. As the image (yes,
    they made it a single image) is approx. 19000 blocks, contains
    about 700 subroutines, and people can jump between them without
    leaving the image, I can't really try BIJ/AIJ. Run-unit journaling
    in RMS? Is it there? Do you have documentation? 
    
      Back to the subject: Corruption is in the eye of the beholder.
    Very rarely we get corrupted files during system crashes (some files
    are over 200K blocks, notably stock status, manuf. cost detail
    transactions). Mostly the corruption I am talking about is transaction
    corruption. Long ago I installed hooks into the application code
    (I do have the sources for those) to allow the monitors to record
    what was being entered and by whom (the vanilla auditing package
    is lousy too). This was for the same reason - if a cluster member
    goes away I need to know what people were doing so we can (manually)
    recover. Now we are upgrading our cluster (8*8800s) and the application
    will run on all members (along with office automation, an RDB database,
    a DBMS-32 database, and yet another file system by a third party).
    If a node crashes, the RDB/DBMS monitors are intelligent enough
    to recover the databases. This application is not. Therefore, the
    monitors need to:
    
    1) record what people are doing, and when, and how (they do this
       now)
    
    2) Detect a system failure and shut down access to the application
       (half way there, I can shut down the application)
    
    3) provide a snapshot of the username/function combinations to us
       so that we can (manually) track down transactions. This happens
       now, also. If there was a better way without hooking into the
       I/O system I'd be glad to listen.
    
    Anybody guess the application yet?
    
    /pjh
635.17. "MAXCIM?" by CIMNET::NIKOPOULOS (Steve Nikopoulos) Wed Dec 23 1987 12:59 (10 lines)
Sounds like MAXCIM. If it is MAXCIM, NCA should be convinced that it's
important enough for them to develop and maintain better audit
trail/recovery mechanisms. In doing so, all of NCA's customers would
benefit, including the X number of Digital sites running the application.
(At last count it was about 30 sites.)

In any case, your efforts are admirable.

Steve
635.18. "Now THAT'S meaty!" by SQM::HALLYB (Khan or bust!) Wed Dec 23 1987 14:21 (6 lines)
>    $START_RU and $END_RU are V5isms. As I said earlier, I have no
>    access to the routines that provide me with this. 
    
    What do you mean "V5isms"?  Digital has it now!
    
    Gander ye BULOVA::VAX-RMS (KP7) for details.
635.19. "Did I mention the prize?" by FROST::HARRIMAN (Here we come a'waffling) Wed Dec 23 1987 16:46 (15 lines)
    
    re: .18
    
       I'll do just that - once I return from vacation.
    
    re: .17
    
       You are correct - however, NCA no longer exists, having been bought
    by their major competitor, ASK. MAXCIM is no longer getting development
    time outside of DEC; we are all stuck with it, and I still have the
    same problem, except that we'll eventually be rewriting it for the VAX
    architecture. Still the same old garbage code.
    
    
    /Paul_who_is_sick_of_pdp11_application_code_on_vaxen