T.R | Title | User | Personal Name | Date | Lines |
635.1 | | HIBOB::KRANTZ | Next window please. | Fri Dec 18 1987 13:54 | 6 |
| You can tell from DCL, so from a program it should be easy!
f$getsyi("cluster_member","nodename") yields TRUE/FALSE...
presumably there is a matching library routine...
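From a program, the matching call is the $GETSYI system service itself (or LIB$GETSYI). A minimal C sketch, VMS-specific and untested here; the node name NODEXX is a placeholder:

```c
#include <descrip.h>      /* $DESCRIPTOR */
#include <ssdef.h>        /* SS$_NORMAL */
#include <starlet.h>      /* sys$getsyiw */
#include <syidef.h>       /* SYI$_CLUSTER_MEMBER */
#include <stdio.h>

int main(void)
{
    unsigned char member = 0;               /* 1 if the node is a member */
    unsigned short retlen = 0;
    $DESCRIPTOR(node_dsc, "NODEXX");        /* placeholder node name */

    /* One-entry item list plus the terminating longword of zero. */
    struct {
        unsigned short buflen, itmcod;
        void *bufadr, *retlenadr;
        unsigned int terminator;
    } itmlst = { sizeof member, SYI$_CLUSTER_MEMBER,
                 &member, &retlen, 0 };

    unsigned int status = sys$getsyiw(0, 0, &node_dsc, &itmlst, 0, 0, 0);
    if (!(status & 1))                      /* low bit clear = failure */
        return status;
    printf("NODEXX is%s a cluster member\n", member ? "" : " not");
    return SS$_NORMAL;
}
```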
Joe
|
635.2 | | STAR::DICKINSON | Peter | Fri Dec 18 1987 15:35 | 13 |
|
Is it necessary for you to know if a specific node has gone down, or just
that _a_ node in the cluster has gone down?
How much time do you have from that event until you must take action?
It seems that $getsyi (or the LIB$ version) will do; the question is how
you are going to detect the event: by polling every delta time, or by
having the event be asynchronous and notify you.
The interesting, and probably most desirable, case is the asynchronous method.
|
635.3 | Asynchronously would be much better, thank you | FROST::HARRIMAN | How do I work this? | Fri Dec 18 1987 16:50 | 26 |
|
It would be nice to know which node, although that would be pretty
moot (I'm sure Operations will figure it out quickly enough).
As for how quickly to respond, I'd like to detect it as quickly
as possible to avoid further database corruption (I can guarantee
that some corruption _will_ be evident; there would be 80+ people
guaranteed to be in the database at any one time, someone is _always_
trying to do some kind of update, and it's all RMS files).
I was thinking about $getsyi... it's easy, but I don't want to have
my monitors doing polling - they're busy enough recording audit
information and updating their "snapshot database" (who's on at
what time in what process mode)... but $getsyi implies that I'd
have to have each monitor (or, worse yet, one of them) poll the others.
That wouldn't be what I want - besides, what if the "master"
was on the machine that crashed?
I'd much rather get notification from some system entity telling me
that the cluster just changed state or that someone went away - that
way I could catch HSC crashes too (which are just as bad sometimes).
Each surviving monitor would have to behave the same, on each VMS node,
so asynchronous notification would be much more desirable to me.
/pjh
|
635.4 | How about this | MDVAX3::COAR | My hero? Vax Headroom, of course! | Fri Dec 18 1987 17:45 | 24 |
| Use $GETSYI to find out all the members of a VAXcluster, or use
some sort of stored list of names. Build a list of resource names
from this, and write a program which runs on each node. The program
will lock the resource specifying its own node with mode=EX, and
queue (NOT with wait!) a PR-mode lock to all the other resources,
specifying an AST routine.
The AST will get called when the lock gets granted; if all nodes
were up and properly synchronised, this means that the node
corresponding to the resource you just locked has gone down (or
the program has stopped running). Once you get it, flag the value
block to indicate that the node drop has been observed and
acknowledged, so that any other nodes can know and treat it as a
no-op when they get it. Then release the lock. (Any other nodes
in the VAXcluster will now get it in turn, see that it has already
been handled, and release it without doing anything.)
When the victim node comes back up, he'll get his lock in EX mode
again. I'll leave it as an exercise for the reader how he can
tell the other nodes to re-queue their locks on his resource.
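A skeletal C version of this scheme (VMS-specific, untested, and the CLU_WATCH_* resource names plus the single hard-wired peer are placeholders; a real version would loop over the $GETSYI membership list). One deliberate change from the description above: the waiting lock is requested in PW rather than PR mode, because $DEQ writes the value block back only from PW or EX mode. Note also that LCK$M_SYSTEM requires SYSLCK privilege.

```c
#include <descrip.h>      /* $DESCRIPTOR */
#include <lckdef.h>       /* LCK$K_*, LCK$M_* */
#include <ssdef.h>
#include <starlet.h>      /* sys$enq, sys$enqw, sys$deq */

/* Lock status block, with a value block for the "already handled" flag. */
struct lksb {
    unsigned short cond, reserved;
    unsigned int   lkid;
    char           valblk[16];
};

static struct lksb my_lksb, peer_lksb;

/* Granted-lock AST: fires when the peer's EX lock disappears,
   i.e. the peer node (or its watcher program) has gone away.   */
static void peer_gone_ast(void *astprm)
{
    struct lksb *l = astprm;
    if ((l->cond & 1) && l->valblk[0] == 0) {
        l->valblk[0] = 1;     /* tell the other survivors it's handled */
        /* ... react to the node loss here ... */
    }
    /* Release the lock, writing the value block back, so the
       remaining nodes each get their turn and see the flag.    */
    sys$deq(l->lkid, l->valblk, 0, LCK$M_VALBLK);
    /* ... re-queue the request once the victim is back up ... */
}

int start_watching(void)
{
    /* Placeholder resource names; derive them from node names in practice. */
    $DESCRIPTOR(my_res,   "CLU_WATCH_MYNODE");
    $DESCRIPTOR(peer_res, "CLU_WATCH_PEER1");
    unsigned int status;

    /* Hold our own resource in EX mode for as long as we run. */
    status = sys$enqw(0, LCK$K_EXMODE, &my_lksb, LCK$M_SYSTEM,
                      &my_res, 0, 0, 0, 0, 0, 0, 0);
    if (!(status & 1) || !(my_lksb.cond & 1))
        return status;

    /* Queue (NOT with wait!) a request on the peer's resource;
       it is granted, and the AST fires, only when the EX lock drops. */
    return sys$enq(0, LCK$K_PWMODE, &peer_lksb,
                   LCK$M_SYSTEM | LCK$M_VALBLK, &peer_res,
                   0, peer_gone_ast, &peer_lksb, 0, 0, 0, 0);
}
```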
Good enough?
#ken :-)}
|
635.5 | Oops! I forgot this.. | MDVAX3::COAR | My hero? Vax Headroom, of course! | Fri Dec 18 1987 17:49 | 11 |
| Oops! I forgot to mention that your AST should check with $GETSYI
to see if the specified node is actually available, and pause for
some interval if so. This means that, while one node is struggling
to come up for the first time (i.e., the other programs didn't note
his going down - possible if the VAXcluster is just booting), the
others will go into an almost-deadly embrace, passing the lock around
until the real owner gets to the point of running the program.
The pause is to prevent them from sucking up lots of cycles while
passing the lock around.
#ken :-)}
|
635.6 | Dennis' Command Files | BLITZN::ROBERTS | Peace .XOR. Freedom ? | Fri Dec 18 1987 18:46 | 164 |
| The following two command files are used at CXO. One monitors via
the network (and thus is impervious to the monitored cluster's demise)
and the other runs on the cluster. The command files are separated
with form feeds. For questions, comments, etc about these two command
files, please address their author via mail to KNEE::ATKINSON.
/Dwayne Roberts
$!
$! Title : CAPS_MONITOR_NODES.COM
$! Version : V1.0
$! Date : 14-Apr-1987
$! Programmer : Dennis Atkinson
$!
$! Description:
$! This command file is used to monitor the nodes in the CAPS cluster.
$! It will inform the OPERATOR and SYSTEM accounts if a node drops from
$! the cluster. The program is continuously running on all nodes in
$! the CAPS Production cluster.
$!
$! Maintenance History:
$! Version Date Initials Description
$! V1.0 14 Apr 87 DMA Created
$! V1.1 30 Apr 87 DMA Added notify by remote submit to dasher
$! V1.1a 12 Jun 87 DMA Added notify of LTA1000 terminal on CAPS
$!
$!
$Start:
$ vers = f$extract(18,2,f$time())
$ on error then goto finish_up
$ on control_y then goto finish_up
$ this_node := ""
$ node := ""
$ cnt = 0
$ a_member := ""
$ all_here := ""
$ repeat = 0
$ log_count = 0
$ define /nolog system1 KNEE
$ users := (ATKINSON,SYSTEM,OPERATOR)
$!
$Get_nodes:
$ show cluster/out=sys$manager:cluster.tmp
$ open/err=finish_up cluster_list sys$manager:cluster.tmp
$read_header:
$! Skip the six header lines of the SHOW CLUSTER output
$ read cluster_list node
$ read cluster_list node
$ read cluster_list node
$ read cluster_list node
$ read cluster_list node
$ read cluster_list node
$ cnt = 0
$show_nodes:
$ read/end=close_this_one/err=finish_up cluster_list node
$ node = f$edit(f$extract(2,6,node),"trim,compress,upcase")
$ if node .eqs. "" then goto close_this_one
$ a_member = f$getsyi("cluster_member","''node'")
$ if .not. a_member then goto show_nodes
$ all_here := "''all_here'" "''node'"
$ !
$ goto show_nodes
$close_this_one:
$ close cluster_list /nolog
$ delete /nolog sys$manager:cluster.tmp;*
$Check_members:
$ cnt=cnt+1
$ this_node = f$logical("system''cnt'")
$ if this_node .eqs. "" then goto do_it_again
$ on error then continue
$ if f$locate("''this_node'","''all_here'") .eq. f$length("''all_here'") -
then reply/bell/urgent/term=LTA1000: -
"''this_node' IS NOT A MEMBER OF THE CAPS DEVELOPMENT CLUSTER - PLEASE CHECK ''this_node'"
$ on error then goto finish_up
$ goto check_members
$do_it_again:
$ wait 00:00:30.00
$Finish_up:
$ gosub clear_logicals
$!
$! This will restart the process NODE_CHK_x on CAPS
$!
$ run /detached /uic=[1,4] -
/input=sys$manager:caps_monitor_nodes.com -
/output=nl: -
/error=sys$manager:caps_monitor_nodes.err -
/prio=3 -
/process="MON_NODES_''vers'" -
sys$system:loginout.exe
$!
$ exit
$ ! S U B R O U T I N E S
$!
$Clear_logicals:
$ log_count=log_count+1
$ if f$logical("system''log_count'") .eqs. "" then return
$ deassign system'log_count'
$ goto clear_logicals
$!
$! Title : CAPS_MONITOR_NET_NODES.COM
$! Version : V1.0
$! Date : 23-Oct-1987
$! Programmer : Dennis Atkinson
$!
$! Description:
$! This command file is used to monitor the nodes in the CAPS
$! production and test clusters. It will inform LTA1000, a remote
$! console, if a node drops from the network. The program is
$! continuously running on all nodes in the CAPS Production cluster.
$!
$! Maintenance History:
$! Version Date Initials Description
$! V1.0 23 Oct 87 DMA Created
$!
$Start:
$ set noverify
$ log_count=0
$ cnt=0
$ vers = f$extract(18,2,f$time())
$ on control_y then exit
$!
$ define /nolog system1 TOPCAP
$ define /nolog system2 BOTTLE
$ define /nolog system3 NIGHT
$!
$Check:
$ cnt=cnt+1
$ this_node = f$logical("system''cnt'")
$ if this_node .eqs. "" then gosub finish_up
$ on error then goto no_network
$ open/write network_up 'this_node'::nl:
$ close network_up
$ goto check
$!
$No_network:
$!
$ reply/URGENT/bell/term=lta1000 -
"''this_node' is not available from the CAPS cluster - please investigate."
$ return
$Finish_up:
$ gosub clear_logicals
$!
$! This will restart the process mon_net_nodes_xxx on CAPS
$!
$ on error then exit
$!
$ wait 00:00:30.00
$ run /detached /uic=[1,4] -
/input=sys$manager:caps_monitor_net_nodes.com -
/output=nl: -
/error=sys$manager:caps_monitor_net_nodes.err -
/prio=3 -
/process="MON_NET_''vers'" -
sys$system:loginout.exe
$!
$ exit
$Clear_logicals:
$ log_count=log_count+1
$ if f$logical("system''log_count'") .eqs. "" then return
$ deassign system'log_count'
$ goto clear_logicals
|
635.7 | Locks are re-granted later on... | CHOVAX::YOUNG | Back from the Shadows Again, | Sat Dec 19 1987 23:05 | 7 |
| Re .4, .5:
I think that this method will only inform you AFTER the lock database
has been rebuilt. I believe that $GETSYI will reveal the transition
as soon as the connection manager figures it out.
-- Barry
|
635.8 | Thanks, time to start investigating... | FROST::HARRIMAN | How do I work this? | Mon Dec 21 1987 09:14 | 19 |
|
re: .6
Interesting. I need other things to happen with the monitors though.
re: .4, .5
Fascinating. I'll go hack on it. I had a feeling that the Distributed
Lock Manager was something I could use.
It looks like $GETSYI and some lock detection is needed. I wish I
could get some kind of notification from the system detecting when
the cluster was transitioning. How does VMS do it? OPCOM surely
hears about it... Is this from other OPCOMs or is it something else
altogether?
Now we're just getting academic....
/pjh
|
635.9 | | VINO::RASPUZZI | Michael Raspuzzi | Mon Dec 21 1987 09:57 | 18 |
| I am not a VMS person but have a question that pertains to this
and maybe someone in VMS could answer it. Is there another system
service provided by VMS that uses the SCA facility? SCA is Digital's
way of connecting homogeneous (and heterogeneous!) machines in a
cluster. If there is a way for you to interface into SCA through
a system service, then you could mimic a SYSAP (system application).
When a node goes down (HSC or not) SCA will give you a "port broke
connection" callback and you can then react to it as you will.
I ask this because in TOPS-20, we have the SCS% JSYS. In fact, I
have a fork that always runs and it uses SCS% to notify me if a
node goes down. No polling. SCA simply gives my process a software
interrupt (PSI - like an AST) and away it goes.
One word of caution though. SCS% on TOPS-20 needs privs. If there
is an equivalent on VMS, it may also need privs.
Mike
|
635.10 | Well.. | MDVAX3::COAR | My hero? Vax Headroom, of course! | Mon Dec 21 1987 12:13 | 13 |
| Re .7:
The entire VAXcluster is blocked until the state transition completes,
and the lock DB rebuild is part of the transition. You can avoid
this (that is, continue running through the transition) by running at a
high IPL, but that may interfere with the connexion manager, and would
need a lot of privileges and development time.
Re .9:
SCA? Are you sure you don't mean SCS (System Communication Services)?
#ken :-)}
|
635.11 | | VINO::RASPUZZI | Michael Raspuzzi | Mon Dec 21 1987 13:34 | 8 |
| SCA/SCS are really the same thing (SCA stands for System Communication
Architecture and SCS stands for System Communication Service).
Our module is called System Communication Architecture for
Multiprocessor Interconnect or SCAMPI for short (not to be confused
with an Italian shrimp dish).
Mike
|
635.12 | SCS is a possibility.. | FROST::HARRIMAN | How do I work this? | Mon Dec 21 1987 15:51 | 23 |
|
Re: .-2
Privs are not a problem. Processes all run detached, communicate
via mailboxes. Elevated IPLs may or may not be a problem. I'm not
afraid of writing kernel-mode code, but only for a good reason.
SCS sounds interesting, I was thinking about it since CUDRIVER (which
is a decent example) uses it and even has a device hanging around
(CUA0:).
I'm checking Digital Technical Journal #5, Sept. '87, to see what
else I can find - how does the connection manager signal to clients
like OPCOM? Can I get it to tell me a connection went away? It
would help; I have found that occasionally an HSC will drop, and
failover won't work right, and I then need to recover (after 20-30
minutes of hung processes, myriad telephone calls to the computer
center, aggravation, etc). I can build smarts into the monitor *if*
I can find out the information...
Thanks
/pjh
|
635.13 | Is all this really necessary? | SQM::HALLYB | Khan or bust! | Tue Dec 22 1987 16:49 | 12 |
| .3> As for how quickly to respond, I'd like to detect it as quickly
.3> as possible to avoid further database corruption (I can guarantee
.3> that some corruption _will_ be evident; there would be 80+ people
.3> guaranteed to be in the database at any one time, someone is _Always_
.3> trying to do some kind of update, and it's all RMS files)
^^^^^^^
If you can "guarantee" corruption I'm sure the RMS developers would be most
interested. RMS corruption should be independent of cluster membership.
Perhaps you could sketch out a scenario or two where you figure "corruption"
will occur. Perhaps the problem is not as bad as you contemplate.
|
635.14 | Observation from the peanut gallery | DPDMAI::BEATTIE | But, Is BLISS ignorance? | Tue Dec 22 1987 19:14 | 12 |
| re: .13
All you have to do is update more than one file in a
transaction and have the node fail in the middle. Thank goodness
for START TRANSACTION and COMMIT in Rdb\VMS.
A FORTRAN application I helped to convert uses 50 interrelated
RMS files, and the cleanup effort after a node failure was torture
(but it has nothing whatever to do with RMS, just the [*ahem*] screwballs
that thunk up the application in the first place).
-- Brian
|
635.15 | No peanut butter, thank you | SQM::HALLYB | Khan or bust! | Tue Dec 22 1987 21:48 | 7 |
| > All you to have to do is need to update more than one file in a
> transaction, and have the node fail in the middle. Thank goodness
> for START TRANSACTION and COMMIT in Rdb\VMS.
In RMS, these are known as $START_RU and $END_RU.
I was hoping for some examples with meat in them.
|
635.16 | more detail, anyway. | FROST::HARRIMAN | Here we come a'waffling | Wed Dec 23 1987 08:55 | 45 |
|
re: .13, .14, .15
This particular application uses approximately 179 RMS files,
mostly indexed, and much of the time does not do its transaction
updates in a timely fashion. It happens to do all of the workorder,
inventory control, planning, and financial accounting for this site.
$START_RU and $END_RU are V5isms. As I said earlier, I have no
access to the routines that provide me with this. As the image (yes,
they made it a single image) is approx. 19000 blocks and contains
about 700 subroutines, and people can jump between them without
leaving the image, I can't really try BIJ/AIJ. Run-unit journaling
in RMS? Is it there? Do you have documentation?
Back to the subject: corruption is in the eye of the beholder.
Very rarely we get corrupted files during system crashes (some files
are over 200K blocks, notably stock status, manuf. cost detail
transactions). Mostly the corruption I am talking about is transaction
corruption. Long ago I installed hooks into the application code
(I do have the sources for those) to allow the monitors to record
what was being entered and by whom (the vanilla auditing package
is lousy too). This was for the same reason - if a cluster member
goes away I need to know what people were doing so we can (manually)
recover. Now we are upgrading our cluster (8*8800s) and the application
will run on all members (along with office automation, an RDB database,
a DBMS-32 database, and yet another file system by a third party).
If a node crashes, the RDB/DBMS monitors are intelligent enough
to recover the databases. This application is not. Therefore, the
monitors need to:
1) record what people are doing, and when, and how (they do this
now)
2) Detect a system failure and shut down access to the application
(half way there, I can shut down the application)
3) provide a snapshot of the username/function combinations to us
so that we can (manually) track down transactions. This happens
now, also. If there was a better way without hooking into the
I/O system I'd be glad to listen.
Anybody guess the application yet?
/pjh
|
635.17 | MAXCIM? | CIMNET::NIKOPOULOS | Steve Nikopoulos | Wed Dec 23 1987 12:59 | 10 |
|
Sounds like MAXCIM. If it is MAXCIM, NCA should be convinced that it's
important enough for them to develop and maintain better audit
trail/recovery mechanisms. In doing so, all of NCA's customers benefit
including the X number of Digital sites running the application. (Last
count it was about 30 sites).
In any case, your efforts are admirable.
Steve
|
635.18 | Now THAT'S meaty! | SQM::HALLYB | Khan or bust! | Wed Dec 23 1987 14:21 | 6 |
| > $START_RU and $END_RU are V5isms. As I said earlier, I have no
> access to the routines that provide me with this.
What do you mean "V5isms"? Digital has it now!
Gander ye BULOVA::VAX-RMS (KP7) for details.
|
635.19 | Did I mention the prize? | FROST::HARRIMAN | Here we come a'waffling | Wed Dec 23 1987 16:46 | 15 |
|
re: .18
I'll do just that - once I return from vacation.
re: .17
You are correct - however NCA no longer exists, having been bought
by their major competitor, ASK. MAXCIM is no longer getting development
time outside of DEC, we are all stuck with it, and I still have
the same problem, except we'll be rewriting it into the VAX
architecture eventually. Still the same old garbage code.
/Paul_who_is_sick_of_pdp11_application_code_on_vaxen
|