
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

123.0. "alarms -circuit adjacent node :datatype unsupported" by GDJUNK::HOULE (Steve, NM is the future!) Mon May 07 1990 13:18

Hi,

I'm trying out a different strategy to detect "bad" circuits (see related note
105) by examining a circuit's adjacent node  on a point-to-point line.
However, I can't create the alarm rule I need for this; I get an error.
(I think I know why but want confirmation)

Here's one variation of the rule (they all get the same error):

 create mcc alarms rule  GSFDR1_SYN3A  -
 expression = (change_of ( node4 GSFDR1 circuit syn-3 adja  -
    node 45.986 name, GSFDR1,*  ) , -
 at every=00:05:00), -
 procedure = MCC$COMMON:ACT_MAIL_ALARM.COM, -
 parameter = "@mcc$common:act_mail.dis", -
 category  = "ROUTER CIRCUIT", -
 description = "MCC GSFDR1 circuit SYN-3 ", -
 QUEUE = "mcc$batch"

The error I get is:
 Datatype of the attribute in the alarm expression is currently unsupported.

I've also tried these expression variations:
  (change_of ( node4 GSFDR1 circuit syn-3 adja node 45.986 address, *,*  ) ,
  (change_of ( node4 GSFDR1 circuit syn-3 adja node 45.986 name, *,*  ) 

Please confirm that what's unsupported are attributes like these that don't have 
a fixed set of  string values.
When WILL this be supported?


On another point:
Note that with the P4 entity model and the current alarm AM functionality
I had to KNOW the adjacent node to write my alarm rule even though  
point-to-point circuits can ONLY have ONE adjacent node (versus an 
ethernet circuit which has N adjacencies).  --Ugly!!
  
===Steve

123.1. "Note a reply, another similar problem" by CLAUDI::PETERS Wed May 09 1990 14:56 (123 lines)

OK -- I give up. I simply do not understand why what I describe here
does not work.  Please extract the command file below and try it yourself.
I created the following command file to play with setting up some alarms
on node counters.   I set up 6 alarm rules which are basically identical
except for the counter each one monitors (why? because that's the way I did it).
See attached command file for CREATE commands.

Behavior:

Alarms rules 1 and 2 kick in after the appointed 1 min wait, do their thing 
and remain enabled.  

Rules 3-6 also kick in after the 1 min wait and immediately become disabled.

My MCC$Alarms_9-may-1990_errors.log indicates that alarms 3-6 experienced
an '%MCC-S-COMMON_EXPECTIO, Success with common exception reply' and 
'Software logic error detected
                      MCC Routine Error = %MCC-E-ILVSYNTAXERROR,  the ASN.1 valu
                                          e's data type conflicts with supplied
                                          type'

(ugly spacing BTW but that's not the issue) 

Question:

Why do the first 2 alarms work and the next 4 fail?  Initially, I had them
all running through the same notify command file -- then I broke them up, which
didn't matter.  I tried enabling them in different orders once I
created the rules; that didn't matter either.   My conclusion is that there
is something wrong with rules 3-6, but I do not know what.  What doesn't 
MCC like about rules 3-6 (I thought these were pretty straightforward,
but sometimes you get too close and need other eyes to see)???  

Is this a dictionary problem?   

/Claudia

!==================================================
! Alarms testing command file
!
! Delete the old rules and add the new rules.
!
DELETE  MCC ALARMS RULE NODE_STATUS_ALARM1
DELETE  MCC ALARMS RULE NODE_STATUS_ALARM2
DELETE  MCC ALARMS RULE NODE_STATUS_ALARM3
DELETE  MCC ALARMS RULE NODE_STATUS_ALARM4
DELETE  MCC ALARMS RULE NODE_STATUS_ALARM5
DELETE  MCC ALARMS RULE NODE_STATUS_ALARM6
!
!
CREATE  MCC ALARMS RULE NODE_STATUS_ALARM1 -
EXPRESSION  = (NODE4 CLAUDI RECEIVED CONNECT RESOURCE ERROR > 0, AT -
START =(+00:01:00) EVERY=01:00:00 ), -
PROCEDURE   = MCC$COMMON:MCC$ALARMS_NOTIFY1.COM, -
PARAMETER   = "ResourceError", -
CATEGORY    = "COUNTERS CHECK", -
DESCRIPTION = "Test node counters", -
QUEUE       = "SYS$BATCH"
ENABLE MCC ALARMS RULE NODE_STATUS_ALARM1
SHOW MCC ALARMS RULE NODE_STATUS_ALARM1 ALL STATUS
!
!
CREATE  MCC ALARMS RULE NODE_STATUS_ALARM2 -
EXPRESSION  = (NODE4 CLAUDI RESPONSE TIMEOUTS > 0, AT -
START =(+00:01:00) EVERY 01:00:00 ), -
PROCEDURE   = MCC$COMMON:MCC$ALARMS_NOTIFY2.COM, -
PARAMETER   = "ResponseTimeouts", -
CATEGORY    = "COUNTERS CHECK", -
DESCRIPTION = "Test node counters", -
QUEUE       = "SYS$BATCH"
ENABLE MCC ALARMS RULE NODE_STATUS_ALARM2
SHOW MCC ALARMS RULE NODE_STATUS_ALARM2 ALL STATUS
!
!
CREATE  MCC ALARMS RULE NODE_STATUS_ALARM3 -
EXPRESSION  = (NODE4 CLAUDI AGED PACKET LOSS > 0, AT -
START =(+00:01:00) EVERY=01:00:00 ), -
PROCEDURE   = MCC$COMMON:MCC$ALARMS_NOTIFY3.COM, -
PARAMETER   = "AgedPacketLoss", -
CATEGORY    = "COUNTERS CHECK", -
DESCRIPTION = "Test node counters", -
QUEUE       = "SYS$BATCH"
ENABLE MCC ALARMS RULE NODE_STATUS_ALARM3
SHOW MCC ALARMS RULE NODE_STATUS_ALARM3 ALL STATUS
!
!
CREATE  MCC ALARMS RULE NODE_STATUS_ALARM4 -
EXPRESSION  = (NODE4 CLAUDI OVERSIZED PACKET LOSS > 0, AT -
START =(+00:01:00) EVERY=01:00:00 ), -
PROCEDURE   = MCC$COMMON:MCC$ALARMS_NOTIFY4.COM, -
PARAMETER   = "OversizedPacketLoss", -
CATEGORY    = "COUNTERS CHECK", -
DESCRIPTION = "Test node counters", -
QUEUE       = "SYS$BATCH"
ENABLE MCC ALARMS RULE NODE_STATUS_ALARM4
SHOW MCC ALARMS RULE NODE_STATUS_ALARM4 ALL STATUS
!
!
CREATE  MCC ALARMS RULE NODE_STATUS_ALARM5 -
EXPRESSION  = (NODE4 CLAUDI PACKET FORMAT ERROR > 0, AT -
START =(+00:01:00) EVERY=01:00:00 ), -
PROCEDURE   = MCC$COMMON:MCC$ALARMS_NOTIFY5.COM, -
PARAMETER   = "PacketFormatError", -
CATEGORY    = "COUNTERS CHECK", -
DESCRIPTION = "Test node counters", -
QUEUE       = "SYS$BATCH"
ENABLE MCC ALARMS RULE NODE_STATUS_ALARM5
SHOW MCC ALARMS RULE NODE_STATUS_ALARM5 ALL STATUS
!
!
CREATE  MCC ALARMS RULE NODE_STATUS_ALARM6 -
EXPRESSION  = (NODE4 CLAUDI VERIFICATION REJECT > 0, AT -
START =(+00:01:00) EVERY=01:00:00 ), -
PROCEDURE   = MCC$COMMON:MCC$ALARMS_NOTIFY6.COM, -
PARAMETER   = "VerificationReject", -
CATEGORY    = "COUNTERS CHECK", -
DESCRIPTION = "Test node counters", -
QUEUE       = "SYS$BATCH"
ENABLE MCC ALARMS RULE NODE_STATUS_ALARM6
SHOW MCC ALARMS RULE NODE_STATUS_ALARM6 ALL STATUS
SHOW MCC ALARMS RULE * ALL STATUS
!
!

123.2. "Sorry, not supported..." by TOOK::PLOUFFE (Jerry) Wed May 09 1990 18:06 (27 lines)

RE: .0 


  The error message "Datatype of the attribute in the alarm expression is 
  currently unsupported." tells the story.  The attributes NAME and ADDRESS
  that you are using in the alarm expression are of datatype: PhaseIVname and
  PhaseIVaddress.  The alarm module doesn't support these datatypes.  The list
  of supported datatypes can be found in the Alarms HELP text (Topic: ENTITY
  MCC ALARMS RULE CREATE Expression Restrictions).

> When WILL this be supported?

  Again, I will have to refer you to product management (contact Daniel 
  Holland) for timeframes for future functionality.  But, I will bump up the 
  priority on support for these datatypes.


> On another point:
> Note that with the P4 entity model and the current alarm AM functionality
> I had to KNOW the adjacent node to write my alarm rule even though  
> point-to-point circuits can ONLY have ONE adjacent node (versus an 
> ethernet circuit which has N adjacencies).  --Ugly!!
  
  This sounds like a question concerning the Phase IV entity model.  Jim
  Carey would be better able to field this one.  Alarms just uses whatever
  entity model is defined...

123.3. "Why .1 fails" by CLAUDI::PETERS Thu May 10 1990 11:26 (27 lines)

From:	TOOK::SCHLENER     "STOP the ASPHALT plant in Templeton, MA!" 10-MAY-1990 10:02:47.93
To:	CLAUDI::PETERS
CC:	
Subj:	reason for your alarm rule problems

Hello. Jill Callander informed me about the problems that you're having with some
of your alarm rules. I'm Cindy Schlener and am a member of the DECnet Phase IV
team. 

The problem that you are encountering is due to a discrepancy in the MSL 
compiler. All the counters that you were dealing with were of the counter8 datatype.

Before the MSL data can be placed into the data dictionary, it must first be
compiled using the MSL compiler. Unfortunately, even though the MSL compiler 
allows users to use the counter8 datatype, MCC doesn't have such a datatype.
The MSL compiler translated the counter8 datatype into a counter16 datatype, 
hence changing the expected size of the counter.

The DECnet Phase IV AM encoded these counter8 attributes as a 1 byte counter
when Alarms expected them to be a 2 byte counter.

We discovered this discrepancy a couple of months ago and the dictionary that
will be going out with the EFT update has the necessary changes to the counter
datatypes. I'm sorry for the inconvenience that this problem may have caused
you.
				Sincerely,
				Cindy Schlener
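
The size mismatch described in this reply can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: it uses plain
fixed-width integers rather than MCC's actual ASN.1/ILV encoding, and the
function names encode_counter8, decode_counter16 and try_decode are
hypothetical, not part of MCC.

```python
import struct


def encode_counter8(value):
    """Encode the way the access module did: one unsigned byte."""
    return struct.pack("<B", value)


def decode_counter16(payload):
    """Decode the way the dictionary told Alarms to: two unsigned bytes."""
    if len(payload) != 2:
        raise ValueError(f"expected 2 bytes, got {len(payload)}")
    return struct.unpack("<H", payload)[0]


def try_decode(payload):
    """Return the decoded value, or a description of the type conflict."""
    try:
        return decode_counter16(payload)
    except ValueError as err:
        return f"type conflict: {err}"
```

The value itself is fine; the decoder rejects it only because the two sides
disagree about its width, which is roughly what the "data type conflicts with
supplied type" error in .1 reports.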

123.4. "adj node alarms & topology" by JETSAM::WOODCOCK Thu May 10 1990 15:06 (61 lines)

Hi there, 

This is a continuation of this discussion and it also references note
105 regarding circuit status.

Originally there were two options discussed thru the QAR system and
offline regarding alarms detecting whether a circuit was up or down. 
The first was using the circuit substate and the second was using
the adjacent node of a circuit. The problem with circuit substate
was that the substate isn't returned if the circuit is running
properly causing an error in the alarm. The problem with the adj 
node option was that alarms do not fire on error (which is the case 
when the circuit is down and the adj node attribute is not there). 
Also datatypes for 'name' and 'adj node' are not yet supported as 
brought up in this note.

The solution for circuit status alarms has been worked out by MCC
supplying a substate if none is there (Note 105). Thanks for all
your efforts, this will certainly help MCC fit into our environment
at NETops and other support centers.

But I would like to point out an added feature the adj node option
would have given MCC net mngrs. It occurred to me this week while
using the ICON PM. I am presently creating maps with entities and
drawing lines between everything which are connected via DECnet.
These lines are only pictorial and MCC has no way of verifying this
pictorial topology. Hence, if someone beyond our realm of control
moves a circuit or changes a node, there is no way of telling other
than manually checking the adj node periodically. Worse yet, if a
change is made without us realizing, then a circuit goes down, we
are misinformed when looking at the map and end up wasting valuable
bandwidth/connectivity time searching for topology answers in order
to work the issue properly.

NETops has been using a DT homegrown tool for years now which has
been the mainstay of monitoring our nodes/circuits (even with NMCC). 
It works by polling and keying off the ADJ NODES of circuits. If a  
circuit goes down OR if the adj node changes, a line is graphically
changed red, notifying analysts to 'take a look'. The key is that it
informs you of circuit problems and/or forces the topology information
to be up to date. Knowledge of the topology is critical in network
management and this tool helps maintain that knowledge especially
in a large network.
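
The polling check described above might look something like the following
Python sketch. It is not the NETops tool itself, whose internals are not given
here; the function name check_circuits, the status strings, and the node and
circuit names in the usage note are all hypothetical.

```python
def check_circuits(expected, polled):
    """Compare the adjacent node drawn on the map with the one polled.

    expected: circuit name -> adjacent node name recorded in the topology
    polled:   circuit name -> adjacent node name currently reported, or
              missing/None when the attribute is absent (circuit down)
    Returns circuit name -> "ok", "down" or "changed".
    """
    status = {}
    for circuit, want in expected.items():
        seen = polled.get(circuit)
        if seen is None:
            status[circuit] = "down"      # no adjacency reported at all
        elif seen != want:
            status[circuit] = "changed"   # topology differs from the map
        else:
            status[circuit] = "ok"
    return status
```

For example, with expected = {"SYN-1": "NODEA", "SYN-2": "NODEB"} and
polled = {"SYN-1": "NODEX"}, SYN-1 comes back as "changed" and SYN-2 as
"down"; either result would turn the line on the map red.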

Using the following expression creates the same environment: it informs us
when (subtle) changes in the topology info are needed and forces us to make them.

expression=(node4 <node> circuit syn-n adj node <adj_node> name=adj_node)

This idea becomes VERY important within distributed network management
environments (ie. EASYnet, ACTnet, DSNnet, & ext vendors also) where
the boundaries of responsibility are not so well defined and political!!

In order to successfully implement this, 'firing' alarms on error is
needed. Is this going to be supported (or somehow made an option)? This
also resolves node reachability alarm problems which I know you are
also looking into resolving as a separate issue.


...brad
    
123.5. "We have it now!" by WAKEME::ANIL Thu May 10 1990 21:10 (39 lines)

RE: .4


>> In order to successfully implement this, 'firing' alarms on error are
>> needed. Is this going to be supported (or somehow made an option)? This
>> also resolves node reachability alarm problems which I know you are
>> also looking into resolving as a separate issue.
>> ..brad

Brad,

Alarms has it now! I was really happy to see you "require" this functionality!!

We are adding the finishing touches to the functionality you mentioned, i.e.
alarming on error. An example rule showing the syntax is as follows:
!

CREATE MCC ALARMS RULE Sample_Rule__102 -
    Expression = (NODE4 foo Connects Sent > 3, at every 00:00:45),-
    Procedure = SYS$COMMON:[MCC]MCC$ALARMS_MAIL_ALARM, -
    Parameter = "wakeme::ANIL", -
    exception handler = SYS$COMMON:[MCC]MCC$ALARMS_MAIL_ALARM,-
    Description = " => Demo of MCC ALARMS functioning"

If there is any error in accessing the entity (or, for that matter, any error
in processing the data returned by the entity), Alarms will queue the command
procedure indicated by the argument "Exception Handler".

If the error is "found" to be of a permanent nature then the rule is Disabled;
otherwise we bump the error counter, save the error msg in memory, and wait for
the next "scheduled" time. There was a lot of effort put into this functionality,
but I think it is well worth it!

EFT update kit will have this functionality so stay tuned!

- Anil Navkal

Alarms Team Member

123.6. "job well done" by JETSAM::WOODCOCK Fri May 11 1990 10:31 (22 lines)

re: .5
    
EXCELLENT! This has been a major snag for us to set up MCC for monitoring
nodes/circuits parallel to our present production methods. We'll give MCC
a serious workout with the next update and see how it holds up compared
to our present tool(s).

> If the error is "found" to be of permanent nature then the rule is Disabled
> else we bump the error counter and save the error msg in memory and wait for
> next "scheduled" time. There was a lot of effort put into this functionality,
> but I think it is well worth it!

> EFT update kit will have this functionality so stay tuned!

My only question is regarding this statement. What is 'found to be of
permanent nature' defined as? Does MCC or the USER have control of the
Disabling and/or Error Counting activity? I'm sure there will be situations
where the alarm should NEVER be disabled on error while using the exception
handler to do the actual alarm notification. Is this covered?

...brad
    
123.7. "We wanted to be a little smart..." by WAKEME::ANIL Fri May 11 1990 12:14 (51 lines)

RE: .6

>>>    My only question is regarding this statement. What is 'found to be of
>>>    permanent nature' defined as? Does MCC or the USER have control of the
>>>    Disabling and/or Error Counting activity? I'm sure there will be
>>>    situations  where the alarm should NEVER be disabled on error while
>>>    using the exception  handler to do the actual alarm notification. Is
>>>    this covered?

   I knew it! You had to ask that question! You are almost right when you
   said Alarms should NEVER be disabled. The actual algorithm used to disable
   Alarms is quite complicated. I will try to describe it in a nutshell.

   Rules will be disabled under the following circumstances:

   1. A CVR other than Response or Exception (either common or specialized)
      is returned by the MCC_CALL. This includes, for example
         < dispatch entry not found >
         < The Image file activation error >
         < : >
      There are 8-10 such errors where retry does not make sense.

   2. If either Exception has an argument that indicates a Permanent
      problem then again we do not retry. The theory is that the
      information provider has encountered a "fatal" problem and
      hence it wouldn't do any good to call it back. Problems of the
      nature "ran out of Virtual memory, Acc vio etc" fall into this
      category.

   3. If the common exception has certain arguments like BUGCHECK
      (Trans. "Unknown problem: I cannot handle it! I give up" )
      we disable the rule. I am not going into the internals
      because it's too involved.
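
   The three conditions above, together with the retry behaviour from .5,
   can be sketched roughly as follows. This is a simplification under
   assumptions and not the Alarms implementation, which is described as
   considerably more involved. The CVR and argument labels, the Rule class
   and both function names are hypothetical stand-ins.

```python
from dataclasses import dataclass

# Hypothetical labels; the real CVR and argument codes are not listed here.
EXCEPTION_CVRS = {"COMMON_EXCEPTION", "SPECIALIZED_EXCEPTION"}
RETRYABLE_CVRS = EXCEPTION_CVRS | {"RESPONSE"}


@dataclass
class Rule:
    name: str
    enabled: bool = True
    error_count: int = 0
    last_error: str = ""


def is_permanent(cvr, exception_args=()):
    """Return True when retrying makes no sense (conditions 1 to 3)."""
    if cvr not in RETRYABLE_CVRS:
        return True                                   # condition 1
    if cvr in EXCEPTION_CVRS and "PERMANENT" in exception_args:
        return True                                   # condition 2
    if cvr == "COMMON_EXCEPTION" and "BUGCHECK" in exception_args:
        return True                                   # condition 3
    return False                                      # when in doubt, keep going


def record_error(rule, cvr, exception_args=(), message=""):
    """Disable the rule on a permanent error, otherwise count and retry later."""
    if is_permanent(cvr, exception_args):
        rule.enabled = False
    else:
        rule.error_count += 1
        rule.last_error = message
    return rule
```

   The default branch returns False on purpose, so that any case not
   explicitly recognised as permanent leaves the rule enabled.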

   This is the first cut at making the decision of Disabling the rule.
   I am sure it is not the end of it. But our emphasis has been that

   >>>     when in doubt do not disable the rule! <<<<
   >>>  but when you know for sure please try     <<<<
   >>>  to be a little smart!                     <<<<

   If there are cases where we did not do a good job we certainly would
   appreciate hearing from you.

   If you need the exact algorithm send me mail and I will mail you
   the design paper.

   Does this help?



123.8. "Making it easy..." by TOOK::PLOUFFE (Jerry) Fri May 11 1990 12:26 (9 lines)

RE: .7


  Actually, the user doesn't need to be aware of this algorithm.  If the
  rule gets automatically disabled, the string "The rule has been disabled."
  will be passed to the exception handler command procedure in parameter P6.
  The alarms error log will also contain this same information.

                                                                     - Jerry
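
A handler could test for that string along these lines. The real exception
handler is a DCL command procedure, so this Python sketch only illustrates the
check. It assumes the parameters arrive as a list in P1, P2, ... order and that
P6 contains the quoted sentence; the contents of the other parameters are not
described here, and the function name rule_was_auto_disabled is hypothetical.

```python
DISABLED_NOTICE = "The rule has been disabled."


def rule_was_auto_disabled(params):
    """Return True if P6 (the sixth parameter) carries the disable notice."""
    p6 = params[5] if len(params) > 5 else ""
    return DISABLED_NOTICE in p6
```

A handler built this way can send a louder notification when the rule will not
run again until someone re-enables it.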

123.9. "disabled rule -at least we'll know" by GDJUNK::HOULE (Steve, NM is the future!) Tue May 15 1990 12:31 (4 lines)

RE: .8

As long as a "disabled" rule gets to the exception handler, that's great!
===Steve

123.10. "Clarification..." by TOOK::PLOUFFE (Jerry) Tue May 15 1990 14:31 (15 lines)

RE: .9

  To be clear...

  If a rule gets automatically disabled because of a "fatal" error that occurs
  during the processing of an alarm expression, the exception handler command
  procedure will get executed.

  If a rule is disabled by user action (i.e., the user issues a DISABLE MCC
  ALARMS RULE foo command), the exception handler command procedure will *not*
  get executed.

  OK?

                                                                     - Jerry