
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

4662.0. "alarms/historia/exporter interactions" by ANOVAX::COMFORT (Cold iron shackle, ball & chain) Wed Mar 10 1993 09:25

    
    Hi,
    
    I wish to pose the following scenario and see 
    
    	1) if it is possible
    	2) if anyone has done similar things
    
    
    I wish to reduce the amount of polling that my particular type of alarm
    rules require.  The basic idea is to set up recording of partition
    counters and partition status via the historian, let's just say every
    00:01:00, of a circuit on a router.  Then instead of polling the router
    to check for circuit status changes, I would like to poll the historian
    for the alarm, SLIGHTLY after the fact, i.e.
    
    	expression=(change_of(node4 foo circ phred substate,none,*), -
    	for start -00:01:00 , at every 00:01:00)
    
    Given that the syntax is probably not correct and I have not tried this
    yet, I'm looking for feedback on whether this is a viable idea.
    
    Additionally, I would be looking to export the counter information that
    would be recorded.  Is this possible?  When I return to a facility
    where I have the ability to try this, I will.  In the meantime, I'm
    looking for other thoughts on the matter.
    
    Thanks and regards,
    
    Dave Comfort
    
4662.1. "it is possible" by TOOK::SHMUYLOVICH Wed Mar 10 1993 16:44 (157 lines)

    Dave,

    This is possible. We are writing a paper on how to use historical data.
    The following are some examples from this paper (all of them are tested).

  1.1 How to set up the Historian

    The following are the steps to create a configuration of recording requests:

       a. You should decide which MCC components will be using historical data
          (that is, which entities and attribute partitions need to be
          recorded). Please remember that the Historian works on a per-entity,
          per-partition basis.

       b. You should decide how often you want to record an entity's data.
          For example, if you want to export the entity's data every hour, set
          up an alarm rule on one of the counters every 15 minutes, and
          calculate statistics over half-hour intervals, then you should
          record the counter partition every 15 minutes.

       c. You should choose a domain to store the historical data. Please keep
          in mind that the domain is used by the Historian FM ONLY as a place
          to store recorded data. This means that using one Historian
          Background process you can record data from members of different
          domains.

       d. You should start the Historian Background process and set up all
          necessary recording requests. Recording requests can be
          created/enabled manually or from a command procedure. In both cases
          wildcards can be used. The Historian FM supports a wildcard in the
          "partition" attribute (meaning all attribute partitions for a given
          entity) as well as a wildcard at any level of the entity
          specification. A wildcard at the global entity level is expanded by
          the FCL and means "all entities of the specified class in the
          specified domain".

          An example of a Record command:

          Record node4 rudone partition = counters, -
          polling period = 00:15:00, -
          in domain history

          In this Record command the "begin time" and "end time" arguments
          are not specified, which means that the recording operation starts
          immediately and continues forever.
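
          As an illustration of the wildcarding described in step d, two more
          Record commands (a sketch only, untested here; the "*" forms follow
          the description above, so check them against your FCL):

          Record node4 rudone partition = *, -
          polling period = 00:15:00, -
          in domain history

          Record node4 * partition = counters, -
          polling period = 00:15:00, -
          in domain history

          The first request records all attribute partitions of node4 rudone;
          the second records the counters partition of every node4 entity in
          the specified domain.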

     1.2 Calculating statistics on historical data

         Calculating statistics on historical data is a well-working function,
         and I assume that MCC users use it. The only problem is that you have
         to record not only the counters but all the attribute partitions
         needed for this calculation. Note that only the counters must be
         recorded at an interval no larger than the statistics duration.
         Other partitions (such as the line characteristics, which are used
         in the case of node4) can be recorded less often.
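
         For example, a pair of recording requests along these lines (a
         sketch only, untested here; which partitions you need depends on
         the statistics, and the "characteristics" partition name is an
         assumption):

         Record node4 rudone partition = counters, -
         polling period = 00:15:00, -
         in domain history

         Record node4 rudone partition = characteristics, -
         polling period = 01:00:00, -
         in domain history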

         The following is an example of a request for statistics for a 
	 specified duration ending slightly before "now":

         Show node4 rudone all statistics, -
	 for start (-00:15:10) duration 00:15:00,-
         in domain history
         
      
         
     1.3 Alarming on historical data

	To force alarm rules to work on historical data you should use a
        scope of interest time in the alarm rule expression, for example:

        Expression = (node4 rudone User Bytes Receive > 1000,
                      for start (-00:00:10),
                      at every 00:15:00,
                      in domain history)

        In this example the alarm rule will get the latest recorded data
        from the historical repository stored in domain "history".

	You can also use any wildcard combinations in the same way you use
        them in regular alarm rules.

        One may be concerned that, due to the asynchronous behavior between
        the Historian and Alarms FMs, the alarm rule created in the above
        example could fire with a delay of up to the polling interval.
        If this is a real problem (meaning that you want to see notification
        of an alarm condition immediately) we can recommend the following
        scenario:
               - in the Record command you should specify a "begin time"
                 argument;
               - in the alarm rule expression, instead of specifying
                 "at every..." you should specify "at start ... every..."
                 where the start time is slightly later than the "begin time"
                 argument in the Record command.

        An example of synchronizing a record operation and an alarm rule:

        Record node4 rudone partition = counters, -
        polling period = 00:15:00, -
	begin time = 17:00:00, - 
	in domain history

        Expression = (node4 rudone User Bytes Receive > 1000, 
                      for start  (-00:00:10), 
                      at start 17:00:30 every 00:15:00, 
                      in domain history)

        An alarm rule on historical statistics requires slightly different
        time semantics, for example:

        Expression = (node4 rudone Packet Rate > 100, 
	              for start (-00:15:10) duration 00:15:00,-
                      at every 00:15:00, 
                      in domain history)
        
	The same alarm rule could be synchronized with the recording process,
	for example:
        
        Expression = (node4 rudone Packet Rate > 100, 
	              for start (-00:15:10) duration 00:15:00,-
                      at start 17:00:30 every 00:15:00, 
                      in domain history)


     1.4 Graph on historical data

         To force the Graph utility to work on historical data you have to
         specify a scope of interest time similar to the following:

                      for start = (-0:00:10)

         To do this (in addition to the schedule part of the time
         specification, which is obvious for the Graph utility) you should
         choose "relative scope of interest time" from the "operation time"
         menu and set the above value.

         When the Graph utility is working on statistical attributes the
         scope of interest time should be as follows:

	              for start (-00:15:10) duration 00:15:00

         To set these values the same "relative scope of interest time" menu
         is used.

     1.5 Exporting historical data

	 Exporting historical data is well documented in the "Historical Data
	 Services Use" manual.
	 If in the Export command you specify the "begin time" and "end time"
	 arguments in the past, the historical data will be exported from the
	 repository stored in the specified domain.
	 Please remember that the Exporter works on a per-entity basis. This
	 means that ALL partitions (including statistics) will be exported
	 for a target entity. So in order to use the past-time Exporter you
	 should record all the partitions you are interested in for the
	 target entity, plus all the partitions needed for statistics
	 calculation.
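
	 A sketch of what such a past-time Export command might look like
	 (untested here; the argument names and placement are assumptions
	 modeled on the Record command in 1.1, so please check them against
	 the manual):

	 Export node4 rudone, -
	 begin time = 09:00:00, -
	 end time = 17:00:00, -
	 in domain history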
    
4662.2. "sync with batch??" by CTHQ::WOODCOCK Wed Mar 10 1993 18:23 (45 lines)
Sam,

Good write-up on integrating the two FMs, but I have a couple of
questions/comments.

I am skeptical of the practicality of SYNCHRONIZING the historian and alarms.
Most operations will be 7x24, so both historical and alarm polling will be
placed in the background in batch. Your scenario would only seem to work for
the FIRST running of the jobs in batch. Whenever the queues are stopped or the
system restarts you will lose sync, because you have no control over this.
The Historian also 'buffers' its startup, to my knowledge, so as not to
overload the system, which also causes loss of sync.

Typically the user only needs *real time* alarms for STATUS attributes. These
are actually very few and describe "IS IT UP or DOWN". The rest of the alarms
can typically be off by a few minutes because they only describe
non-business-critical conditions such as ERRORs or UTILIZATION, etc.
So there are a couple of views on how to implement this. One technique would
be to use DIRECT ALARMs for UP/DOWN attributes, with the rest coming from
historical data (see the sketch below). The polling is doubled, but if you
monitor a WHOLE BUNCH of attributes from the HISTORIAN it may be worth it.
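
To sketch that out using the expression syntax shown earlier in this topic
(untested; the entity and attribute names are just placeholders carried over
from those examples):

        Expression = (node4 rudone circuit syn-0 substate <> none,
                      at every 00:05:00)

        Expression = (node4 rudone User Bytes Receive > 1000,
                      for start (-00:00:10),
                      at every 00:15:00,
                      in domain history)

The first rule is a DIRECT alarm polling the entity itself for UP/DOWN; the
second evaluates a counter against the data recorded in the history domain.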

My contention would then be that if the attribute evaluation isn't required
right now, then the interval probably doesn't need to be the same either. One
problem the user faces is that UP/DOWN attributes need short polling while
everything else needs longer polling, because the other attributes are often
used for long-term trend analysis (maybe hourly). So the polling doesn't
really have to double at all, and the current tool works pretty well. Even
most error counters probably don't need to be evaluated like STATUS, and the
longer poll period will do. Try to explain that to the user base :).

But (of course there's a but) if you really want MCC to be BEST IN CLASS then
we need a change in tactics. Today you have a module polling for information
which can then be polled by OTHER modules for evaluation. BEST IN CLASS would
have a generic module which polls for information and **DELIVERS** it to
other modules (read: multiple), making all aspects closer to real time. This
method would also be MUCH EASIER for a user to set up, because the system
would have the knowledge and polling control instead of the user having to
figure it out on a per-attribute or per-partition basis.

Can V1.3 alarm on PAST statistical attributes with wildcards??

best regards,
brad...
4662.3. "short historian polls" by CTHQ::WOODCOCK Wed Mar 10 1993 18:45 (22 lines)
Dave,

I think (not sure) you might actually be able to determine the status of the
circuit more quickly than originally thought. Using Sam's how-to outline,
polling the historian could mean checking the circuit status at a different
interval than the historian's own polling interval.

Example: Polling hourly into historian

        Expression = (node4 rudone circuit syn-0 substate <> none,
                      for start (-00:00:10),
                      at every 00:05:00,
                      in domain history)
 
The above expression simply looks at the last poll, whenever it was. So if
you poll the historian every 5 minutes, regardless of how often the historian
polls the entity, you should know the substate within 5 minutes of the
historian poll. This reduces polling but I'm not sure what it will do to your
system performance if you wildcard or do it multiple times.

good luck,
brad...
4662.4. "re -.*" by ANOVAX::COMFORT (Cold iron shackle, ball & chain) Thu Mar 11 1993 10:30 (10 lines)
    
    Brad, Sam -
    
    Thanks for the input.  This info should clinch MCC as the platform of
    choice for MCI.  Similar functionality for the exporter would be a
    *BIG* plus for MCC.
    
    Regards,
    
    Dave
4662.5. by TOOK::SHMUYLOVICH Thu Mar 11 1993 11:15 (38 lines)
	RE: .2



1. I agree that synchronization will be lost after the queues or the system
   restart. It's possible to synchronize the historian and alarms again by
   using the Suspend Recording and Resume Recording (with a "begin time"
   argument) commands and by deleting and re-creating the alarm rules with a
   new value of "start time" in the alarm rule expression.
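
   For example, to re-synchronize (a sketch only, untested; the argument
   placement on Suspend Recording and Resume Recording is an assumption
   modeled on the Record command in .1):

   Suspend Recording node4 rudone partition = counters, -
   in domain history

   Resume Recording node4 rudone partition = counters, -
   begin time = 17:00:00, -
   in domain history
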
   But after reading .2 it looks like this synchronization has only theoretical
   value.

2. The idea of a generic "poller" has been on our list for some time, but
   resources for implementation....

   I disagree that "today you have a module polling for information which can
   then be polled by OTHER modules for evaluation". The Historian ONLY gets
   information and stores it in a place known to the system (the historical
   repository). When the system needs this information it gets it without any
   help from the Historian. The only (but important) missing part is the
   ability of the system to set up recordings automatically when several
   periodic requests for the same data exist.

3. About wildcards in the alarm rule expression for historical statistics:
   I think it will work only if the global entity is wildcarded. Also note
   that a global entity wildcard in conjunction with the "in domain"
   qualifier is expanded as "all entities of a given class which are members
   of the specified domain".


	Re: .3

   This will work, but an alarm rule in the described situation will fire up
   to 12 times for the same data (the hourly recording is evaluated at every
   5-minute alarm poll).
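
   If the duplicate firings are a problem, the same rule could instead be
   aligned with the hourly recording using the synchronization technique
   from .1 (a sketch only; the start time must be slightly later than the
   "begin time" of the Record command):

   Expression = (node4 rudone circuit syn-0 substate <> none,
                 for start (-00:00:10),
                 at start 17:00:30 every 01:00:00,
                 in domain history)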


	Sam
4662.6. "more detail" by CTHQ::WOODCOCK Thu Mar 11 1993 12:23 (65 lines)
A good subject on which to continue the discussion:

>2. The idea of a generic "poller" has been on our list for some time but 
>   recourses for implementation....

>   I disagree that "today you have a module polling for information can then be 
>   polled by OTHER modules for evaluation". The Historian ONLY gets information 
>   and stores it in the some place known to the system (historical repository).
>   When the system needs this information it gets it without any help from the
>   Historian. The only( but important) missing part is an ability of the system 
>   to set up automatically recordings when several periodical requests for the 
>   same data exist.

Correct. Let me restate. One module collects and stores the data while other
modules poll for the stored data. This causes the delay factor discussed, but
also has other subtle (sometimes not so subtle) effects. Let's change the
word 'poller' to 'info_collecter_deliverer' and put it into a scenario. The
'info_collecter_deliverer' polls for info; once the info is gathered, it
checks a service delivery table, then delivers. The service delivery table
would include all modules requiring use of the data: a module which might
store the data, another which might alarm on the data, etc. This means
immediate results for all modules looking for common data, without those
modules being required to poll for the info either from the entity or from
wherever it is stored.

Now let's expand on what an 'info_collecter_deliverer' would be. It would be
both a poller and any 'event sinks'. I could be wrong about this, but here is
my current observation of how sinks work today. The event is sent to the
sink, the sink then places the event 'somewhere', and the modules poll
'somewhere' to retrieve the event. The modules looking for the data today are
primarily ALARMs and NOTIFICATION. It's the poll cycle which kills the system
and is not scalable. If the event were delivered rather than polled for, then
life would appear to be better.

I cite the MCC_DNA5_EVL sink as an example. MCC can never be used as an
OPCOM-type device today, as I have seen several people recommend: "just turn
on all events then pick the ones you think are important". Here's what
happened when I tried this approach. Start the sink, set up the remote entity
to send events, then NOTIFY DOMAIN X ENTITY = NODE y ANY EVENT. The getevent
from the NOTIFY request begins polling 'somewhere' for each event. The
problem is that there are hundreds of events in NODE, so the NOTIFY process
continually polls 'somewhere' to the point where it looks like a CPU loop.
We're talking 40-60% of an 8810 to process a single request (probably more if
I had it)!! The CPU spins to 100% and I'm out of business. But if the sink
were an 'info_collecter_deliverer', nothing would happen until an event
actually arrived, followed by a service table lookup, then the deliveries.

I'm hopin' somebody tells me I'm wrong about the above or something has changed
for v1.3 or it was just a bug.

>3. About wildcard in the alarms rule expression for historical statistics.
>   I think it will work only if a global entity is wildcarded. Also note
>   that global entity wildcard in conjunction with "in domain" qualifier is
>   expanded as "all entities of a given class which are members of a specified 
>   domain".

This is a shame. A lot of the need is with child-entity stats.


>   This will work but an alarms rule in described situation will fire up to
>   12 times for the same data.

True; at minimum a buffering scheme is required for mail/page notification,
and if it's bothersome enough there may even be a need to use collector
events for notification instead of the normal alarm notify.