| Title: | POLYCENTER System Watchdog for VMS OSF/1 ULTRIX HP-UX AIX SunOS |
| Notice: | Wishes:406,FAQ:845,Kits-VMS:1000,UNIX:694 VMS ECO01 FT kit: 521 |
| Moderator: | AZUR::HUREZ Z |
| Created: | Fri May 15 1992 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 1033 |
| Total number of notes: | 4584 |
Hello,
just a short question.
Is it possible to do a clusterwide watch for a process?
E.g. the QUEUE_MANAGER runs on only one node.
If I add a check for this process on that node I will
get notified (correctly) that the process is gone when
it fails over to another node in the cluster.
I tried to enter a 'watch' via the cluster alias node name,
but this doesn't work.
It seems that the consolidator 'translates' the cluster
alias to one of the cluster nodes and polls that specific
node for the process, which may not be running there.
However, looking at the output from SENS WATCH SHOW EVENTS
I see several events (such as DISK free space below xxx)
reported with the cluster alias node.
Why can't I do this with a process?
Thanks in advance,
Ton Dorland
(tested with SNS 2.2 ECO3)
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 1003.1 | Not yet implemented | AZUR::HUREZ | Connectivity & Computing Services @VBE. DTN 828-5159 | Wed Feb 19 1997 10:36 | 41 |
The feature you're describing is an interesting one, but it is not yet
implemented in System Watchdog.
If you enter the cluster alias in your profile, the cluster load
balancing algorithm decides which cluster member is actually connected
to. So you may get PROcess missing events more or less at random,
depending on whether the process is present on the target node, which
is selected independently of the Consolidator. This obviously doesn't
work as expected.
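To make the effect concrete, here is a minimal Python sketch. It is not
System Watchdog code: the node names, the process tables and the
`poll_via_alias` helper are all made up for illustration. The only thing
taken from the description above is that the alias ends up connected to
one member chosen outside the Consolidator's control.

```python
# Hypothetical three-node cluster; QUEUE_MANAGER runs on one member only.
CLUSTER_MEMBERS = {
    "NODEA": {"QUEUE_MANAGER", "SWAPPER"},
    "NODEB": {"SWAPPER"},
    "NODEC": {"SWAPPER"},
}


def poll_via_alias(resolved_member, process_name):
    """Return an event string if the process is absent on the member the
    alias happened to resolve to, otherwise None."""
    if process_name in CLUSTER_MEMBERS[resolved_member]:
        return None
    return f"PROcess missing: {process_name} on {resolved_member}"


# Enumerate every member the alias could resolve to.
outcomes = {m: poll_via_alias(m, "QUEUE_MANAGER") for m in CLUSTER_MEMBERS}
false_alarms = [m for m, event in outcomes.items() if event is not None]
```

With these assumed tables, every resolution except the one that lands on
the node actually running the process raises an event, even though the
process is alive somewhere in the cluster.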
Besides, the Consolidator currently has no means of knowing - a priori -
the list of cluster members behind a cluster alias, or even whether a
given name is a cluster alias, a cluster member name or a standalone
node name.
Consolidation of cluster-wide events is done a posteriori, once events
are reported to the Consolidator, as each event packet has a cluster
field in it. It consists of merging, for a sublist of event codes,
identical event messages coming from distinct cluster members with the
same cluster alias into a single event. PROcess missing is not
considered a cluster-wide event...
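The a posteriori merge described above can be sketched roughly as
follows. This is an illustration only, not the Consolidator's actual
code: the event layout, the event code names and the contents of
`CLUSTER_WIDE_CODES` are assumptions.

```python
# Assumption: the sublist of event codes treated as cluster-wide.
CLUSTER_WIDE_CODES = {"DISK_FREE_SPACE"}


def consolidate(events):
    """Merge identical cluster-wide events from distinct members of the
    same cluster into one event reported against the cluster alias.
    All other events stay attached to the node that reported them."""
    merged = {}
    for ev in events:
        if ev["code"] in CLUSTER_WIDE_CODES and ev["cluster"]:
            reporter = ev["cluster"]   # report under the cluster alias
        else:
            reporter = ev["node"]      # keep the individual node name
        key = (reporter, ev["code"], ev["message"])
        entry = merged.setdefault(
            key,
            {"node": reporter, "code": ev["code"],
             "message": ev["message"], "sources": []},
        )
        entry["sources"].append(ev["node"])
    return list(merged.values())


# Hypothetical event packets, each carrying a cluster field.
events = [
    {"node": "NODEA", "cluster": "CLUALS", "code": "DISK_FREE_SPACE",
     "message": "DISK$DATA free space below threshold"},
    {"node": "NODEB", "cluster": "CLUALS", "code": "DISK_FREE_SPACE",
     "message": "DISK$DATA free space below threshold"},
    {"node": "NODEB", "cluster": "CLUALS", "code": "PROCESS_MISSING",
     "message": "QUEUE_MANAGER missing"},
]
result = consolidate(events)
```

In this sketch the two disk events collapse into a single event shown
under the alias, while the process event keeps the name of the node
that reported it, because its code is not in the cluster-wide sublist.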
I think the most straightforward way to implement the requested feature
would be simply to add a parameter to the PROcess missing data
specification, say using a /CLUSTER_WIDE qualifier, e.g.
SNS$EDIT> ADD NODE trusted_cluster_member PROCESS proc_name proc_uic -
/CLUSTER_WIDE /INTERVAL=...
so that, for processes marked as cluster-wide, the Agent node trusted to
detect the presence of the process would scan the cluster process table
instead of only the local node process table.
Of course, this implies a profile structure change, conversion utilities,
etc., which cannot be included in an ECO kit and would have to go into a
point release.
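The intended behaviour of such a qualifier could look like the Python
sketch below. It only models the proposal: `ProcessWatch`, its
`cluster_wide` field (standing in for the proposed /CLUSTER_WIDE) and
the process tables are hypothetical, and nothing here reflects an
existing profile format.

```python
from dataclasses import dataclass

# Hypothetical per-node process tables for one cluster.
PROCESS_TABLES = {
    "NODEA": {"QUEUE_MANAGER", "SWAPPER"},
    "NODEB": {"SWAPPER"},
}


@dataclass
class ProcessWatch:
    agent_node: str             # the trusted cluster member
    process_name: str
    cluster_wide: bool = False  # stands in for the proposed /CLUSTER_WIDE


def process_present(watch):
    """Scan every member's table for cluster-wide watches, otherwise
    only the table of the Agent node itself."""
    if watch.cluster_wide:
        tables = PROCESS_TABLES.values()
    else:
        tables = [PROCESS_TABLES[watch.agent_node]]
    return any(watch.process_name in table for table in tables)


local = ProcessWatch("NODEB", "QUEUE_MANAGER")
clusterwide = ProcessWatch("NODEB", "QUEUE_MANAGER", cluster_wide=True)
```

Here the local watch on NODEB reports the process as missing, while the
cluster-wide watch entered on the same node finds it on NODEA and stays
quiet after a failover.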
What do you think?
Regards,
-- Olivier.
| 1003.2 | Sounds good. | UTRTSC::DORLAND | The Wizard of Odz2 | Thu Feb 20 1997 02:53 | 7 |
Sounds good, despite the fact that a CONVERT of the database
is necessary. Also, I think it can be implemented fairly easily,
because for the last few VMS versions (since V6.2, if I remember
correctly) it has been much easier to do clusterwide checks via
GETJPI.
Thanks, Ton