
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

1372.0. "alarm rules setup" by HOO78C::TIMMERMANS () Wed Aug 21 1991 17:05

I am just about to start writing alarm rules in DECmcc. Is there a template
or setup available that covers the most useful DECnet events for a Phase IV
node?

To set up the right structure I need a list of DECnet events with their
likelihood of occurrence and the impact each event has, so that I can give
each one the right severity level. The same applies to DECnet counters: is one
error a problem, or can we say that X number of errors within X time is a real
problem? I am looking for margins and thresholds based on experience, in the
same way errors are handled in VAXsimPlus.
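
The "X number of errors within X time" idea can be sketched outside DECmcc as
a sliding-window counter. This is only an illustration in Python, not DECmcc
alarm rule syntax; the class name WindowThreshold and the values for limit and
window are assumptions, and real thresholds would have to come from experience
with the network in question.

```python
from collections import deque


class WindowThreshold:
    """Hypothetical helper: report a problem once `limit` errors
    have been seen within the last `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit      # assumed threshold, e.g. 3 errors
        self.window = window    # assumed window length in seconds
        self.times = deque()    # timestamps of recent errors

    def record(self, timestamp):
        """Record one error; return True if the threshold is reached."""
        self.times.append(timestamp)
        # Drop errors that have aged out of the window.
        while self.times and timestamp - self.times[0] >= self.window:
            self.times.popleft()
        return len(self.times) >= self.limit
```

With limit=3 and window=60, three errors at t=0, 10 and 20 seconds reach the
threshold, while two errors 100 seconds apart never do.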

With that in place we would have the right basis for writing procedures that
gather all the information needed to evaluate or locate the real problem and
raise an alarm on the right entity.

Judging from this notes file, everyone is creating their own set of alarm
rules and trying to put some intelligence into those rules based on their own
experience. So everyone is trying to do the right thing in their own way and
deciding for themselves what is important and what is not.

Any input or suggestions on this would be helpful.

Adrie Timmermans
1372.1. "Examples Given" by MAYDAY::ANDRADE (The sentinel (.)(.)) Fri Aug 30 1991 06:02 (3 lines)
    
    Have you looked at MCC_ALARMS_SAMPLE_RULES.COM in the MCC directory?
    
1372.2. "mcc_system:...." by TOOK::CALLANDER (Jill Callander DTN 226-5316) Tue Sep 03 1991 15:45 (7 lines)
Just to clarify, mcc_alarms_sample_rules.com is in the mcc_system directory.
There is also some information and additional examples in the alarms manual.
For another source of information on what would be good to alarm on, try the
network troubleshooting guide.

jill
1372.3. by HOO78C::TIMMERMANS () Tue Sep 03 1991 19:39 (39 lines)
RE: .1 and .2

Sorry for the misunderstanding in .0. I am not interested in the alarm rules
themselves but in how to get a balanced set of rules that covers 95% of the
"real" network or entity problems.
Another question is what kind of AI we put behind those rules so that they
alarm only on the "real" problems, instead of overloading the operator with
resetting alarms for so-called problems.
To make the job a little easier we can group events, status changes and
counters. After that I can make one single flow diagram of what to do after
any event, status change or counter change in that group occurs.

First of all, I can group errors and events:

1)   No DECnet or Ethernet access
2)   Errors caused by a "real" problem (bad H4000, cable problem, defective
     Ethernet controller, etc.)
3)   Errors caused by network overload, resource problems or wrong setup.

Second, I can group entities by functionality and importance:

1)   DECnet area routers and DECnet routers
2)   Major DECnet end nodes
3)   Minor DECnet end nodes
4)   Ethernet stations
5)   Ethernet bridges
6)   Terminal servers
7)   TCP/IP nodes
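
As a rough illustration of how the two groupings could be combined, the sketch
below looks up a severity from an error group and an entity class. It is
Python, not DECmcc syntax. The severity names, the numeric ranks and the
simple "add the ranks" scoring are all assumptions made for the example; the
entity ranking just follows the order of the list above.

```python
# Error groups from the first list; a lower rank means more serious (assumed).
ERROR_GROUP_RANK = {
    "no_access": 0,
    "real_fault": 1,
    "overload_or_setup": 2,
}

# Entity classes from the second list, ranked in list order (assumed).
ENTITY_CLASS_RANK = {
    "router": 0,
    "major_end_node": 1,
    "minor_end_node": 2,
    "ethernet_station": 3,
    "ethernet_bridge": 4,
    "terminal_server": 5,
    "tcpip_node": 6,
}

# Hypothetical severity names, most serious first.
SEVERITIES = ["critical", "major", "minor", "warning"]


def severity(error_group, entity_class):
    """Combine both ranks into one severity, capped at the mildest level."""
    score = ERROR_GROUP_RANK[error_group] + ENTITY_CLASS_RANK[entity_class]
    return SEVERITIES[min(score, len(SEVERITIES) - 1)]
```

In this scheme loss of access to a router comes out as "critical", while an
overload error on a TCP/IP node comes out as "warning". A hand-tuned table
would probably replace the additive scoring in practice.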

Third, I have to put extra information into DECmcc, such as which Ethernet
segment each entity is connected to. Otherwise it is difficult to figure out
the relationships between entities. This way we should be able to alarm on,
e.g., a DELNI instead of on all the entities behind it.
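
To show what that extra topology information buys, here is a minimal Python
sketch, not DECmcc syntax. It assumes a hypothetical mapping attached_to from
each entity to the device it sits behind, and handles one level of hierarchy
only. When every entity behind a device is unreachable, the individual alarms
are replaced by a single alarm on that device.

```python
from collections import defaultdict


def collapse_alarms(attached_to, unreachable):
    """Return the set of names to alarm on.

    attached_to: dict mapping entity name -> device it is connected behind
                 (hypothetical topology data, one level deep).
    unreachable: iterable of entity names currently unreachable.
    """
    # Invert the mapping: device -> set of entities behind it.
    behind = defaultdict(set)
    for entity, device in attached_to.items():
        behind[device].add(entity)

    alarms = set(unreachable)
    for device, entities in behind.items():
        # Only collapse when several entities share the device and all are down.
        if len(entities) > 1 and entities <= alarms:
            alarms -= entities
            alarms.add(device)
    return alarms
```

For example, if NODE_A and NODE_B both sit behind DELNI_1 and both are
unreachable, the result contains DELNI_1 instead of the two nodes. A node
behind another device whose neighbours are still reachable keeps its own
alarm.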

Any comments on this would be helpful.

Adrie Timmermans