
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

3894.0. "Reachability: MCC_REACH" by CTHQ::WOODCOCK () Mon Oct 12 1992 16:50

Greetings,

If you are interested in using MCC V1.2 as a monitor of devices for 
REACHABILITY status, the following may be of interest to you (especially
for VMS). Thanks go to Peter Flack for his help on a customer site with
procedures and proof of concept help.

best regards,
brad...

............................................................................

Note 3385 began a discussion regarding MCC V1.2's use as a reachability
monitor and identified some missing functionality in V1.2 for this use. As
always, there are ways to improve the monitoring environment, and MCC_REACH
was written for this purpose.

MCC_REACH is simply a name given to a change in monitoring METHODOLOGY plus a
few DCL procedures to accomplish the task. Before describing what this method
does, let's go over some V1.2 characteristics and the functions it lacks:

- MCC alarms consume lots of memory. Have you ever run out of memory by the
  time you finished implementing alarms, leaving you short for other tasks?

- Implementing globally wildcarded alarms is great, but it forces you to
  assemble your maps/domains in specific ways. If you don't, you double-poll
  some entities and potentially poll others which you do not require (end
  systems). It has been mentioned that the ability to 'mark' devices as
  not-to-be-polled would be helpful, enabling management of a device without
  monitoring it.

- If the maps/domains cannot be changed, global wildcards may not be feasible.
  This results in implementing individual alarms, which results in extra
  memory consumption, which in turn results in scaling and performance issues.
  The effort to maintain non-wildcarded alarms (adds/deletes) is also
  undesirable.

- The map colors do not correlate properly with the actual status of the
  devices because MCC uses non-intelligent EXCEPTION HANDLING and flags the
  device as INDETERMINATE. This is true for all devices except SNMP, which
  works properly. Therefore, unless the map is tended continually, the status
  of all devices is unknown. Manual clearing of alarms on devices is required.

- In order to achieve proper map colors users have implemented alarms which
  fire EVERY poll. This forces the devices on the map to a CLEAR severity but
  uses unreasonable queue resources. This can also clutter up the NOTIFICATION
  window with CLEAR messages rather than actual outages if not filtered
  correctly.

- There is no control function for contiguous outages. That is, if a device
  goes down, a user could receive mail each polling period until the device
  returns to service. This is an undesirable effect.

..............................................................................

How does MCC_REACH work and what can it do for you? MCC_REACH is a SEPARATE
polling environment which then uses data collector functions to update your
currently viewed map.

- A separate domain is created and all devices TO BE POLLED are placed within
  it. This resolves the issue of wanting devices on the map without polling
  them, i.e. the request to mark devices for non-polling.

- A single globally wildcarded alarm rule is created for each class of entity
  within the new domain. This immediately reduces the number of alarms by at
  least a factor equal to the number of domains you use today. This can be
  severalfold and should result in much less memory consumed for alarming
  needs.

- The only maintenance is to either add or delete entities from the new
  domain as polling requirements change. 

- Two other alarm rules are created which monitor the internal events of all
  the rules used for polling. These two alarms then fire a procedure which
  sends an event into the viewed map with the proper severity, updates a log
  file, and sends mail (but ONLY when the device first goes down and when it
  comes back up; a minimal sketch of this transition test appears after this
  list). This keeps the map and log files continually updated but does not
  burden the user with multiple mail messages for each outage. The map now
  indicates the actual status of all devices with no manual intervention.

- There are options to allow devices to be present in multiple viewed domains
  but only be updated in those which the user requires.

- Setting up this environment is estimated to take somewhere from 20 minutes
  to an hour or two, depending on the implementation size.
  ANCHOR::NET$LIBRARY:MCC_REACH.BCK contains the working files and also
  procedures to take you from your current environment to the new one. Simply
  follow the set of instructions below.
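
As a minimal sketch of the "mail only on transitions" behavior described
above (the file names, procedure name, and mail recipient are hypothetical,
not the kit's actual code), a per-entity marker file is one conventional way
to do it in DCL:

  $! MAIL_ON_TRANSITION.COM -- illustrative sketch only
  $! P1 = entity name, P2 = "DOWN" or "UP"
  $ flag = "MCC_REACH_LOG:" + P1 + ".DOWN"
  $ IF P2 .EQS. "DOWN"
  $ THEN
  $     IF F$SEARCH(flag) .EQS. ""        ! first failed poll only
  $     THEN
  $         COPY NL: 'flag'               ! remember the outage
  $         MAIL/SUBJECT="''P1' is unreachable" NL: SYSTEM
  $     ENDIF
  $ ELSE
  $     IF F$SEARCH(flag) .NES. ""        ! only if it was marked down
  $     THEN
  $         DELETE 'flag';*               ! forget the outage
  $         MAIL/SUBJECT="''P1' is reachable again" NL: SYSTEM
  $     ENDIF
  $ ENDIF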

RISKS/LIMITATIONS:

There is no official support for these procedures; use them at your own risk.
Testing of these procedures has been done only on a limited basis with limited
DNS directories. Entity classes tested successfully to date are NODE4, NODE,
SNMP, STATION, BRIDGE, CONCENTRATOR, and TERMINAL SERVER. The alarm-fired
procedures were built to be as generic as possible.

Although MCC_REACH reduces the number of alarms required, it does have some
practical limitations. For small to medium implementations all should work
well. Because a single alarm (and thread) is used to poll each class of
device, the following applies: MCC must be able to poll all devices within a
given class within the chosen poll period; if it cannot, it skips a poll
period. So if you have several hundred terminal servers, you may not
successfully poll them every 2 minutes using a 3100 with a single thread!
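
As an illustration (numbers assumed, not measured): if one poll of a terminal
server takes roughly one second on a single thread, a 2 minute period covers
only about 120 servers; 300 servers would need roughly 5 minutes per pass,
so poll periods would be skipped.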

After MCC_REACH is built you can do a simple test to get a relative feel for
polling capacity. Issue a command of the following type from FCL for each
class and see how long it takes to complete:

	mcc> show node4 * state, in domain mcc_reach
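
One rough way to time this from DCL is shown below (a sketch; it assumes
MANAGE/ENTER reads the FCL commands from the procedure stream, as interactive
utilities normally do inside a command file):

	$! hypothetical timing wrapper
	$ WRITE SYS$OUTPUT "Started:  ''F$TIME()'"
	$ MANAGE/ENTER
	SHOW NODE4 * STATE, IN DOMAIN MCC_REACH
	EXIT
	$ WRITE SYS$OUTPUT "Finished: ''F$TIME()'"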

If the practical poll period for a particular class is too large to meet your
business needs, simply delete the MCC_REACH alarm used to poll that class.
You will have to use standard techniques and alarms for that class instead.

==============================================================================

MCC_REACH setup:

- Create two directories for use with MCC_REACH procedures, related files, 
  and log files if needed.

  $ CREATE/DIR SYS$SYSDEVICE:[DECMCC.MCC_REACH]
  $ CREATE/DIR SYS$SYSDEVICE:[DECMCC.MCC_REACH.LOG]

- Edit SYS$STARTUP:MCC_LOGICAL_BMS.COM; and add lines to define the directories 
  for MCC_REACH procedures (from above). Examples below:

  $ DEFINE/SYSTEM/EXEC  MCC_REACH_DIR "SYS$SYSDEVICE:[DECMCC.MCC_REACH]"
  $ DEFINE/SYSTEM/EXEC  MCC_REACH_LOG "SYS$SYSDEVICE:[DECMCC.MCC_REACH.LOG]"

  Execute @SYS$STARTUP:MCC_LOGICAL_BMS.COM; to set these logicals now.
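
  To confirm the logicals are now defined:

  $ SHOW LOGICAL MCC_REACH*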

- Copy MCC_REACH.BCK into MCC_REACH_DIR and restore the save set using BACKUP.

  $ set def mcc_reach_dir
  $ backup mcc_reach.bck/save/log *.*
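
  To confirm the restore completed (the default is still mcc_reach_dir):

  $ directory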

- Create two domains: MCC_REACH and OSI_RULES. Examples below:

  $ manage/enter
  MCC> create domain .MCC_REACH
  MCC> create domain .OSI_RULES
  MCC> exit

- Add a COLLECTOR entity to ALL domains within the top_domain hierarchy using
  the iconic map toolbox. Collector names MUST be exactly of the form
  <domain_name>_COLLECTOR

  Pull-down EDIT -> TOOLBOX
            Select COLLECTOR entity
	    Enter collector name ->  <domain_name>_COLLECTOR      
            Apply
	    Enter reference information if desired
	    OK
	    Place on map
	    Repeat for all domains

- Customize the map and make the ICON color the same as the CLEAR severity
  color.

- Start up the event sink for Collector events. Edit MCC_STARTUP_BMS.COM in
  SYS$STARTUP and uncomment the following line:

  $!      $ @SYS$STARTUP:MCC_STARTUP_EVC_SINK.COM

  Execute @SYS$STARTUP:MCC_STARTUP_EVC_SINK.COM to start sink now.

- Set up NOTIFY request to receive data collector events:

  Pull-down APPLICATIONS -> DECmcc NOTIFICATION SERVICES
	    Select NOTIFY REQUESTS from NOTIFICATION window
	    Select CREATE
	    Enter DOMAIN -> <top_domain>
	    Enter ENTITY -> COLLECTOR *
	    Enter EVENTS -> ANY EVENT
	    Select OK

- Save the NOTIFY request to enable with each map startup:

	    Select Collector Notify line from NOTIFY REQUESTs window
  Pull-down FILE -> SAVE AS from NOTIFY REQUESTs window
	    Enter filename for new startup file or append
	    OK
	    Select CLOSE for NOTIFY REQUESTs window
	    If new startup file pull-down CUSTOMISE -> GENERAL
		from DECmcc NOTIFICATIONs window
	    Enter same filename for Notify Requests Startup File
	    OK
	    From DECmcc NOTIFICATIONs window CUSTOMISE -> SAVE CURRENT SETTINGS

- Edit MCC_REACH_SYMBOLS.COM to reflect your system. Descriptions of the
  symbols reside within the file. This file is executed by other procedures.

- Execute @CREATE_COPY.COM;. This procedure produces a file called
  MCC_REACH:COPY_DOMAINS.COM;, which is used to copy all entities in every
  domain within the top_domain hierarchy into the MCC_REACH domain. If there
  are any WHOLE domains which require NO polling for ANY ENTITY, edit
  MCC_REACH:COPY_DOMAINS.COM; and remove the copy commands for those domains.
  Any domain with no entities residing within it is not a concern and can be
  left alone. Once done, execute @MCC_REACH:COPY_DOMAINS.COM;.
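
  The generated copy commands presumably take the form below (inferred from
  the delete syntax shown later in these instructions, so treat it as an
  assumption):

  MCC> create domain mcc_reach member <member_name>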

- View the MCC_REACH domain with    MCC> show domain mcc_reach member *
  If any members remain within the MCC_REACH domain which do not require
  polling, remove them at this point:

  MCC> delete domain mcc_reach member <member_name>

- Execute CREATE_MCC_REACH_RULES.COM;. Two procedures will then be created:
  CREATE_RULES.COM; and START_RULES.COM;. CREATE_RULES.COM; contains the
  commands to build all rules needed for the monitoring environment. The only
  edits that may be needed are for entity classes that require different
  polling periods. Execute @CREATE_RULES.COM; when all polling periods meet
  your requirements. START_RULES.COM; is a daily resubmitting procedure which
  enables all the MCC_REACH rules. Execute @SUBMIT_START_RULES.COM; to start
  the process the first time, or any time manual startup is required.
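
  A daily resubmitting procedure conventionally re-queues itself before doing
  its work; a minimal sketch of that fragment is below (queue name and timing
  illustrative, not the kit's actual contents):

  $! fragment of a self-resubmitting daily job
  $ SUBMIT/AFTER="TOMORROW"/QUEUE=SYS$BATCH MCC_REACH_DIR:START_RULES.COM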

- You are now monitoring all entities within the MCC_REACH domain. Mail will
  be sent each time an entity becomes unreachable or reachable again, similar
  to a change_of function. While an entity is unreachable, a log file named
  MCC_REACH_LOG:MCC_REACH_LOG.<date> will be updated at each poll. An event
  will also be generated and sent to the viewed map to update the color to
  critical/clear. If the entity resides in multiple domains, then multiple
  events will be sent to update the entity in all the domains it resides in,
  unless a NONOTIFY is defined on a domain or on the entity.

- It is generally recommended to have an entity located within a single domain
  of the viewed domain hierarchy. If your requirements place entities in
  multiple domains, then an unreachable entity will be color-updated in all
  domains it resides in within the hierarchy unless one of the following two
  methods is deployed. If there are specific domain(s) within the viewed
  structure for which you want NO updates to ANY entity, this can be
  accomplished by defining the DOMAIN_NONOTIFY symbol within the
  MCC_REACH_SYMBOLS.COM file. If there are individual entities which reside in
  multiple domains and you require a color change in only one, then the REMARKS
  attribute of the global entity can be changed to reflect those domains not
  requiring color updates. Simply set the REMARKS attribute to
  NONOTIFY:<domain_name>. This can also be a comma-separated list to define
  multiple nonotify domains for that entity (examples below).

  mcc> set node4 foo remarks NONOTIFY:DOMAIN_Q

  No collector events from MCC_REACH will be sent to node4 foo in domain 
  DOMAIN_Q.
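
  To exclude more than one domain, supply a comma-separated list (the second
  domain name here is hypothetical; the value is quoted on the assumption
  that FCL would otherwise treat the comma as a separator):

  mcc> set node4 foo remarks "NONOTIFY:DOMAIN_Q,DOMAIN_R"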
3894.1. "Good tool" by RACER::dave (Ahh, but fortunately, I have the key to escape reality.) Mon Oct 12 1992 17:52
Brad,

	Could you repost this in the MCC-TOOLS conference,
	where it really belongs?

Thanks