Title:                   DECmcc user notes file. Does not replace IPMT.
Notice:                  Use IPMT for problems. Newsletter location in note 6187
Moderator:               TAEC::BEROUD
Created:                 Mon Aug 21 1989
Last Modified:           Wed Jun 04 1997
Last Successful Update:  Fri Jun 06 1997
Number of topics:        6497
Total number of notes:   27359
3894.0. "Reachability: MCC_REACH" by CTHQ::WOODCOCK () Mon Oct 12 1992 16:50
Greetings,
If you are interested in using MCC V1.2 as a monitor of devices for
REACHABILITY status, the following may be of interest to you (especially
for VMS). Thanks go to Peter Flack for his help on a customer site with
procedures and proof of concept help.
best regards,
brad...
............................................................................
Note 3385 began a discussion regarding MCC V1.2's use as a reachability
monitor and identified some missing functionality in V1.2 for this use. As
always, there are methods to improve the monitoring environment, and MCC_REACH
was written for this purpose.
MCC_REACH is simply a name given to a change in monitoring METHODOLOGY and a
few DCL procedures to accomplish the task. Before describing what this method
does, let's go over some V1.2 characteristics and which functions are lacking:
- MCC alarms consume lots of memory. Ever run out of memory by the time you're
done implementing alarms, leaving you short for other tasks?
- Implementing globally wildcarded alarms is convenient, but it forces you to
assemble your maps/domains in specific ways. If you don't, you double-poll some
entities and potentially poll others which you do not require (end systems).
It has been mentioned that the ability to 'mark' devices as not-to-be-polled
would be helpful, enabling management of a device without monitoring it.
- If the maps/domains are not changed, then global wildcards may not be
feasible. This forces you to implement individual alarms, which consume extra
memory, which in turn causes scaling and performance issues. The effort to
maintain non-wildcarded alarms is also undesirable (adds/deletes).
- The map colors do not correlate properly to show the actual status of the
devices because MCC uses non-intelligent EXCEPTION HANDLING and flags the
device as INDETERMINATE. This is true for all devices except SNMP, which
works properly. Therefore, unless the map is tended continually, the status
of all devices is unknown. Manual clearing of alarms on devices is required.
- In order to achieve proper map colors, users have implemented alarms which
fire EVERY poll. This forces the devices on the map to a CLEAR severity but
uses unreasonable queue resources. It can also clutter up the NOTIFICATION
window with CLEAR messages rather than actual outages if not filtered
correctly.
- There is no damping of notifications during a continuous outage. That is, if
a device goes down, a user could potentially receive mail each polling period
until the device returns to service. This is an undesirable effect.
..............................................................................
How does MCC_REACH work and what can it do for you? MCC_REACH is a SEPARATE
polling environment which then uses data collector functions to update your
currently viewed map.
- A separate domain is created and all devices TO BE POLLED are placed within
it. This resolves the issue where you would like to have devices on the
map, or marked for non-polling, without actually polling them.
- A single globally wildcarded alarm rule is created for each class of entities
for the devices within the new domain. This immediately reduces the number
of alarms by at least a factor of the number of domains you use today. This
can be severalfold and should result in much less memory consumed for
alarming needs.
- The only maintenance is to either add or delete entities from the new
domain as polling requirements change.
- Two other alarm rules are created which monitor the internal events of all
the rules used for polling. These two alarms then fire a procedure which
sends an event into the viewed map with the proper severity, updates a log
file, and sends mail (but ONLY when the device first goes down and when it
comes back up). This keeps the map and log files updated continually but
does not burden the user with multiple mail messages for each outage. The map
now indicates the actual status of all devices with no manual intervention.
- There are options to allow devices to be present in multiple viewed domains
but only be updated in those which the user requires.
- Setup of this environment is estimated to take somewhere from 20 minutes to
an hour or two depending on the implementation size.
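The transition-only mail damping described above (mail on the first down and
the first up, silence in between) can be sketched in DCL with a per-entity flag
file. This is a minimal illustration only; the parameter interface (P1/P2), the
flag-file name, and the mail recipient are hypothetical, not the actual
MCC_REACH procedure:

```
$ ! P1 = entity name, P2 = "DOWN" or "UP"  (hypothetical interface)
$ flag = "MCC_REACH_LOG:" + P1 + ".DOWN_FLAG"
$ if P2 .eqs. "DOWN"
$ then
$     if f$search(flag) .nes. "" then exit  ! outage already reported
$     create 'flag'                         ! remember the outage
$     mail/subject="''P1' unreachable" nl: operator
$ else
$     if f$search(flag) .eqs. "" then exit  ! device was never reported down
$     delete 'flag';*                       ! clear the outage marker
$     mail/subject="''P1' reachable again" nl: operator
$ endif
```

The flag file is the only state needed: repeated "down" polls find it already
present and exit quietly, which is exactly the behavior the bullet above
describes.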
ANCHOR::NET$LIBRARY:MCC_REACH.BCK contains the working files and also
procedures to take the user from their current environment to the new. Simply
follow the set of instructions below.
RISKS/LIMITATIONS:
There is no official support for these procedures; use them at your own risk.
Testing of these procedures has been done only on a limited basis with limited
DNS directories. Device classes tested successfully to date are NODE4, NODE,
SNMP, STATION, BRIDGE, CONCENTRATOR, and TERMINAL SERVER. The alarm-fired
procedures were built to be as generic as possible.
Although MCC_REACH reduces the number of alarms required, there are now some
practical limitations to MCC_REACH. For small to medium implementations all
should work well. Because a single alarm (and thread) is used to poll each
device class, the following applies: MCC must be able to poll all devices
within a given class within the chosen poll period; if it cannot, it skips a
poll period. So if you have several hundred terminal servers, you may not
successfully poll them every 2 minutes using a 3100 with a single thread!
After MCC_REACH is built you can do a simple test to get a relative feel for
polling capabilities. Issue a similar type command from FCL for each class and
determine how long it takes to fulfill:
mcc> show node4 * state, in domain mcc_reach
If you find a particular class is too large and the practical poll period does
not meet your business needs, simply delete the MCC_REACH alarm used to poll
that class. You will have to use standard techniques and alarms for that
class instead.
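One way to put a rough number on that test is to bracket the FCL command with
timestamps from DCL. This sketch assumes FCL reads its commands from SYS$INPUT
when redirected; the script name TEST_POLL.COM is hypothetical:

```
$ ! TEST_POLL.COM contains the two FCL lines:
$ !     show node4 * state, in domain mcc_reach
$ !     exit
$ write sys$output "Start: ''f$time()'"
$ define/user_mode sys$input TEST_POLL.COM
$ manage/enter
$ write sys$output "End:   ''f$time()'"
```

If the elapsed time approaches the poll period you intend to use for that
class, the class is a candidate for standard (non-MCC_REACH) alarming.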
==============================================================================
MCC_REACH setup:
- Create two directories for use with MCC_REACH procedures, related files,
and log files if needed.
$ CREATE/DIR SYS$SYSDEVICE:[DECMCC.MCC_REACH]
$ CREATE/DIR SYS$SYSDEVICE:[DECMCC.MCC_REACH.LOG]
- Edit SYS$STARTUP:MCC_LOGICAL_BMS.COM; and add lines to define the directories
for MCC_REACH procedures (from above). Examples below:
$ DEFINE/SYSTEM/EXEC MCC_REACH_DIR "SYS$SYSDEVICE:[DECMCC.MCC_REACH]"
$ DEFINE/SYSTEM/EXEC MCC_REACH_LOG "SYS$SYSDEVICE:[DECMCC.MCC_REACH.LOG]"
Execute @SYS$STARTUP:MCC_LOGICAL_BMS.COM; to set these logicals now.
- Copy MCC_REACH.BCK into MCC_REACH_DIR and unpack it using BACKUP.
$ set def mcc_reach_dir
$ backup mcc_reach.bck/save/log *.*
- Create two domains; MCC_REACH and OSI_RULES. Examples below:
$ manage/enter
MCC> create domain .MCC_REACH
MCC> create domain .OSI_RULES
MCC> exit
- Add a COLLECTOR entity to ALL domains within the top_domain hierarchy using
the iconic map toolbox. Collector names MUST BE EXACTLY named to
<domain_name>_COLLECTOR
Pull-down EDIT -> TOOLBOX
Select COLLECTOR entity
Enter collector name -> <domain_name>_COLLECTOR
Apply
Enter reference information if desired
OK
Place on map
Repeat for all domains
- Customize the map and make the ICON color the same as the CLEAR severity
color.
- Start up the event sink for Collector events. Edit MCC_STARTUP_BMS.COM in
SYS$STARTUP and uncomment the following line:
$! $ @SYS$STARTUP:MCC_STARTUP_EVC_SINK.COM
Execute @SYS$STARTUP:MCC_STARTUP_EVC_SINK.COM to start sink now.
- Set up NOTIFY request to receive data collector events:
Pull-down APPLICATIONS -> DECmcc NOTIFICATION SERVICES
Select NOTIFY REQUESTS from NOTIFICATION window
Select CREATE
Enter DOMAIN -> <top_domain>
Enter ENTITY -> COLLECTOR *
Enter EVENTS -> ANY EVENT
Select OK
- Save the NOTIFY request to enable with each map startup:
Select Collector Notify line from NOTIFY REQUESTs window
Pull-down FILE -> SAVE AS from NOTIFY REQUESTs window
Enter filename for new startup file or append
OK
Select CLOSE for NOTIFY REQUESTs window
If new startup file pull-down CUSTOMISE -> GENERAL
from DECmcc NOTIFICATIONS window
Enter same filename for Notify Requests Startup File
OK
From DECmcc NOTIFICATIONS window CUSTOMISE -> SAVE CURRENT SETTINGS
- Edit MCC_REACH_SYMBOLS.COM to reflect your system. Description of symbols
reside within the file. This file is executed by other procedures.
- Execute @CREATE_COPY.COM;. This procedure produces a file called
MCC_REACH:COPY_DOMAINS.COM;. This file is used to copy all entities in any
domain within the top_domain hierarchy into the MCC_REACH domain. If there
are any WHOLE domains which require NO polling for ANY ENTITY edit
MCC_REACH:COPY_DOMAINS.COM; and remove the copy command for those domains.
Any domain with no entities residing within it is not of concern and can be
left. Once done, execute @MCC_REACH:COPY_DOMAINS.COM;.
- View the MCC_REACH domain by MCC> show domain mcc_reach member *
If there are still any members within the MCC_REACH domain which do not
require polling then remove them at this point:
MCC> delete domain mcc_reach member <member_name>
- Execute CREATE_MCC_REACH_RULES.COM;. Two procedures will then be created:
CREATE_RULES.COM; and START_RULES.COM;. CREATE_RULES.COM; contains commands
to build all rules needed for the monitoring environment. The only edit
which may be needed is if different entity classes require different
polling periods. Execute @CREATE_RULES.COM; when all polling periods meet
your requirements. START_RULES.COM; is a daily resubmitting procedure which
enables all the MCC_REACH rules. Execute @SUBMIT_START_RULES.COM; to start
the process the first time, or any time manual startup is required.
- You are now monitoring all entities within the MCC_REACH domain. Mail will
be sent each time an entity either becomes unreachable or reachable similar
to a change_of function. If the entity is unreachable a log file named
MCC_REACH_LOG:MCC_REACH_LOG.<date> will be updated for each poll during the
unreachable timeframe. An event will also be generated and sent to the
viewed map, updating the color to critical/clear. If the entity resides in
multiple domains, then multiple events will be sent to update the entity in
all the domains in which it resides, unless a NONOTIFY is defined on a domain
or on the entity.
- It is generally recommended to locate an entity within a single domain
within the viewed domain hierarchy. If your requirements demand placing
entities in multiple domains, then an unreachable entity will have its color
updated in every domain it resides in within the hierarchy unless one of the
following two methods is deployed. If there are specific domain(s) within the
viewed structure for which you wish NO updates to ANY entity, this can be
accomplished by defining the DOMAIN_NONOTIFY symbol within the
MCC_REACH_SYMBOLS.COM file. If there are individual entities which reside in
multiple domains and you require a color change in only one, then the REMARKS
attribute of the global entity can be changed to reflect those domains not
requiring color updates. Simply set the REMARKS attribute to
NONOTIFY:<domain_name>. This can also be a comma-separated list to define
multiple nonotify domains for that entity (example below).
mcc> set node4 foo remarks NONOTIFY:DOMAIN_Q
No collector events from MCC_REACH will be sent to node4 foo in domain
DOMAIN_Q.
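The per-poll log update to MCC_REACH_LOG:MCC_REACH_LOG.<date> mentioned above
could be sketched in DCL roughly as follows. This is a sketch only; the real
procedure's layout may differ, and P1 is a hypothetical entity-name parameter:

```
$ ! Append one line for each poll during which the entity is unreachable
$ day = f$cvtime(,"ABSOLUTE","DATE")             ! e.g. 12-OCT-1992
$ logfile = "MCC_REACH_LOG:MCC_REACH_LOG." + day
$ if f$search(logfile) .eqs. "" then create 'logfile'
$ open/append log 'logfile'
$ write log f$time() + "  " + P1 + "  unreachable"
$ close log
```

Using the date as the file type gives one log file per day, which keeps the
outage history easy to scan and to purge.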
3894.1. "Good tool" by RACER::dave (Ahh, but fortunately, I have the key to escape reality.) Mon Oct 12 1992 17:52
Brad,
Could you repost this in the MCC-TOOLS conference,
where it really belongs?
Thanks