
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

711.0. "Circuit Down & How long ?" by ULYSSE::ZITTA (Batchman) Tue Feb 12 1991 02:49

    
    Is it (or will it be) possible with NMCC to automatically give an
    alarm only when a CIRCUIT was DOWN for more than a given time 
    (settable by user) ?
    
    Thanks,
    
    	Gerard
T.R  Title  User  Personal Name  Date  Lines
711.1. "NMCC or MCC" by TOOK::W_MCGRATH (DTN 226-5075) Wed Feb 13 1991 08:46 (11 lines)
    Are you referring to NMCC (NMCC DECnet Monitor) or DECmcc?  If you are
    referring to the point product tool, NMCC, then try the NAC::NMCC notes
    conference.  NMCC cannot update the map reachability status based on
    the amount of time a node is down.  
    
    If you are referring to DECmcc (the EMA product), you might want to ask
    again here.  My understanding is that as long as the attribute (in this
    case, seconds the circuit has been down) is recognized by the AM, the
    Alarms FM can alarm on the attribute (with >, <, =, etc...)
    
    -Will-
711.2. "DECmcc" by ULYSSE::ZITTA (Batchman) Thu Feb 14 1991 03:43 (15 lines)
    
    RE-1:
    
    Sorry for the lack of precision.
    I meant DECmcc.
    
    Can somebody confirm this and explain precisely how to achieve it
    with DECmcc?
    
    A related question: if a circuit is down, I would like to be sure
    that ACCOUNTING/BILLING is stopped during that time.
    Is that possible?
    
    	Thanks,
    
    		Gerard
711.3. "Here is a hack!" by WAKEME::ANIL Tue Feb 19 1991 19:46 (46 lines)
Re: .0

>
>
>    Is it (or will it be) possible with NMCC to automatically give an
>    alarm only when a CIRCUIT was DOWN for more than a given time
>    (settable by user) ?
>
>    Thanks,
>
>    Gerard

       Let me admit up front that the functionality you have requested does
       not exist in V1.1 Alarms, nor is it planned for V1.2. Based on how
       important you (and others) feel this functionality is, we could use
       it as an input to V2.0.

       Now having said the above, here is a hack. Jim Carry may want to
       correct me if I am wrong, but from my limited understanding of DECnet
       Phase IV it should work.

       There are a number of ways you can accomplish what you are trying to
       do. I am suggesting just one way of doing it!

       You can make one rule that watches for a Circuit Down Event from
       that Circuit. Rule fires when the circuit goes down.

       Let this trigger another session of MCC. (You can do this by invoking a
       command procedure). 

       In this other session of MCC, ZERO all the counters for this
       circuit. 

       Then enable a rule that will look for the attribute "Byte sent" to
       be = 0!  If you want to make sure the circuit comes up within 10
       minutes, start watching after 10 minutes. If the rule fires after
       those 10 minutes, you know for sure that the circuit was down
       for 10 minutes. The rule expression for such a rule will be along
       the following lines:

	expression = (node4 <node-name> circuit <cir-name> byte sent = 0,
			at start +00:10)

	It is a hack, but then you don't have to wait till V2.0!

       	- Anil Who-is-not-very-happy-with-his-own-answer!
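
The logic of the hack in .3 can be sketched in Python. This is only a simulation of the idea, not DECmcc syntax: the `Circuit` class, its `bytes_sent` field, and the `traffic_during_wait` parameter are hypothetical stand-ins for the real circuit counters and for whatever happens during the waiting period.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    """Hypothetical stand-in for a circuit and its 'byte sent' counter."""
    name: str
    bytes_sent: int = 0

    def zero_counters(self) -> None:
        self.bytes_sent = 0


def down_for_full_window(circuit: Circuit, traffic_during_wait: int) -> bool:
    """Apply the hack: zero the counters at 'circuit down', test after the wait.

    traffic_during_wait simulates the bytes sent while waiting
    (0 if the circuit stayed down for the whole period).
    """
    circuit.zero_counters()                     # done by the triggered command procedure
    circuit.bytes_sent += traffic_during_wait   # whatever happens during the wait
    return circuit.bytes_sent == 0              # the rule "byte sent = 0, at start +00:10"


print(down_for_full_window(Circuit("qna-0", 500), 0))      # circuit stayed down
print(down_for_full_window(Circuit("qna-0", 500), 1200))   # circuit came back and sent data
```

One property of this check is visible in the sketch: a circuit that came back up but sent nothing during the wait would look the same as one that stayed down.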
711.4. "too bad!" by ULYSSE::ZITTA (Batchman) Wed Feb 20 1991 08:44 (19 lines)
    
    Thanks, Anil, for the information.
    Glad to hear such functionality was already on the wish list.
    It would be *very* interesting for any network management center.
    Right now the existing tools usually send so many "unuseful" alarms
    that you can miss the real problems.
    It would be nice to have a tool that can acknowledge the unuseful
    events "by itself" and flag only the real problems on the map.
    
    (An example of what I call "unuseful" is CIRCUIT DOWN (and all the
    related events) followed closely enough by CIRCUIT UP.)
    
    Anyway, thanks for your reply. We'll try to put a workaround in place
    in the meantime.
    
    	Gerard
    
          						
711.5. "Want it? I crave it!" by SICKO::FRAZER (Vision operates on many levels...) Wed Feb 20 1991 18:49 (6 lines)
>>	Based on how important you (and others) feel this functionality 
>>	is, we could use it as an input to V2.0.

    In a word, VERY_IMPORTANT!

    timf
711.6. "How might you express this type of Rule?" by WAKEME::ROBERTS (Keith Roberts - DECmcc Alarms Team) Thu Feb 21 1991 09:14 (28 lines)
We agree that this type of functionality would be *very* important for
managing a network.  How would it be expressed in an Alarm Rule?

Maybe:

  ------------------------------------------------------------------------

   SEQUENCE( Entity-Event-1, Entity-Event-2, Duration )

   Ex: (SEQUENCE(node4 A circuit qna-0 circuit down, -
                 node4 A circuit qna-0 circuit up, +00:10:00))

   "If Event-1 occurs, followed by Event-2 within the Duration, tell me"

  ------------------------------------------------------------------------
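
A minimal Python sketch of how the proposed SEQUENCE function might be evaluated over a stream of timestamped events. The function name, the event tuples, and the matching policy (the most recent first event is the one that counts) are assumptions made for illustration; the proposal above does not specify them.

```python
from datetime import datetime, timedelta


def sequence_fires(events, first, second, duration):
    """Return True if `first` is followed by `second` within `duration`.

    events: iterable of (timestamp, name) tuples sorted by timestamp.
    """
    pending = None  # time of the most recent unmatched `first` event
    for when, name in events:
        if name == first:
            pending = when
        elif name == second and pending is not None:
            if when - pending <= duration:
                return True
            pending = None  # too late: wait for the next `first` event
    return False


t0 = datetime(1991, 2, 21, 9, 0)
short_outage = [(t0, "circuit down"), (t0 + timedelta(minutes=4), "circuit up")]
long_outage = [(t0, "circuit down"), (t0 + timedelta(minutes=15), "circuit up")]

print(sequence_fires(short_outage, "circuit down", "circuit up", timedelta(minutes=10)))
print(sequence_fires(long_outage, "circuit down", "circuit up", timedelta(minutes=10)))
```

As sketched, the rule fires for the short outage and stays silent for the long one, so a rule of this shape reports quick recoveries rather than circuits that stay down.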

  o  Anyone have any other ideas?

  o  What should we be adding to the wish list for Alarm Rules?
      - Please try and give suggestions how this could be written
        in terms of an Alarm Rule Expression.


Thanks

Keith Roberts
DECmcc Alarms Team

711.7. "catch 22" by JETSAM::WOODCOCK Thu Feb 21 1991 12:26 (36 lines)
Hi,

What you're attempting is probably more difficult than meets the eye. Circuits
don't typically go down cleanly and then come up cleanly; they very often
bounce many times. Will your sequence example below be able to handle
multiple outages within that ten minutes? If not, you would have to reduce
the ten minutes: the smaller, the more accurate.

Here's the catch 22. What happens if the circuit goes down cleanly for
longer than 10 minutes? Your sequence won't be satisfied.
Therefore you would have to increase the ten minutes: the larger, the better.

Also, from a net management standpoint, a steadily bouncing circuit has a far
worse impact on the net than a circuit which simply goes down and stays down
for the outage. Every time there is an event (up or down) a routing update
occurs which traverses and propagates through the net. This impact on routing
can't be measured to keep our vendors and our rebates honest. Although you
appear to be on the right track, there are practical pitfalls which I'm not
certain how to resolve.

regards,
brad...

>Maybe:

>  ------------------------------------------------------------------------

>   SEQUENCE( Entity-Event-1, Entity-Event-2, Duration )

>   Ex: (SEQUENCE(node4 A circuit qna-0 circuit down, -
>                 node4 A circuit qna-0 circuit up, +00:10:00))

>   "If Event-1 occurs, followed by Event-2 within the Duration, tell me"



711.8. "One shoe for all feet?" by TOOK::KOHLS (Ruth Kohls) Thu Feb 21 1991 13:14 (16 lines)
Re: .7
It seems to me that you are trying to make too general a tool from one,
very specific, rule.  You will still have to set up alarms and notifications 
for each situation you want to cover--and a 10 minute period isn't going to fit
a lot of situations!  

"Just" listing various things that circuits do when they aren't behaving well
is a starting point, and I suspect the Alarms team will find your description
helpful.  It sounds like they may need an expert system or two, among other 
things (;-). 

I thought the question in .6 was more along the lines of: would an 
expression *like* this make mcc alarms nicer for you?  

Ruth Kohls
(Mcc Kernel Team)
711.9. "It sounds like AI to me 8)" by WAKEME::ROBERTS (Keith Roberts - DECmcc Alarms Team) Thu Feb 21 1991 13:59 (17 lines)
There is an extension to the OCCURS function we've been planning:

	OCCURS( Event, Count, Duration )

If 'Event' occurs 'Count' times within this 'Duration' ... let me know.
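
The planned OCCURS extension amounts to a sliding-window count. Here is a rough Python sketch, assuming the events have already been reduced to a list of timestamps; the function name and signature are illustrative only and are not DECmcc rule syntax.

```python
from collections import deque
from datetime import datetime, timedelta


def occurs(timestamps, count, duration):
    """True if at least `count` timestamps fall within any window of `duration`."""
    window = deque()
    for when in sorted(timestamps):
        window.append(when)
        # Drop events that are too old to share a window with the newest one.
        while when - window[0] > duration:
            window.popleft()
        if len(window) >= count:
            return True
    return False


t0 = datetime(1991, 2, 21, 13, 0)
bounces = [t0 + timedelta(minutes=m) for m in (0, 2, 3, 30)]

print(occurs(bounces, 3, timedelta(minutes=10)))   # three events within ten minutes
print(occurs(bounces, 4, timedelta(minutes=10)))   # never four within ten minutes
```

A rule built this way detects a bouncing circuit (many down events in a short period) but says nothing about how long any single outage lasted.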


But - I think Ruth hit the nail ... What describes a happy (or not so happy)
circuit?  It can't be expressed as one simple statement.

We *want* to add AND's and OR's to the Rule Expression so you can combine
'thoughts' (not for v1.1 .. sorry)

Let's keep this discussion going -- from what I see DECmcc already has
one of the most powerful Alarming capabilities going.

/Keith
711.10. "Can an AM help us here?" by WAKEME::ANIL Thu Feb 21 1991 16:44 (21 lines)
Generic solutions, if powerful enough, will solve many problems with
surprising elegance. However, I am afraid CIRCUIT DOWN is more
like a technology-specific problem. We can solve the detection
of circuit bouncing with the OCCURS function and the extension
indicated in .-1 by Keith.

I was wondering if the AMs can provide a counter such as 

	entity down timer = xyz seconds

The above timer could be a generic timer, and any Entity AM could support
it in a generic fashion at any global/child entity level. Availability
of such a counter would make the Alarms job ridiculously easy! It would be
zeroed every time the entity (circuit) reaches a stable Enabled/On state!
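
A small Python sketch of what such an "entity down timer" could look like from the Alarms side. The class, its method names, and the use of plain seconds for time are all hypothetical. For simplicity the sketch treats every up event as a stable Enabled/On state, which a real AM would have to decide more carefully.

```python
class EntityDownTimer:
    """Hypothetical 'entity down timer' attribute, reported in seconds."""

    def __init__(self):
        self._down_since = None  # time of the first unrecovered down event

    def on_down(self, now: float) -> None:
        if self._down_since is None:   # keep the earliest down time
            self._down_since = now

    def on_up(self) -> None:
        self._down_since = None        # zeroed when the entity is up again

    def value(self, now: float) -> float:
        return 0.0 if self._down_since is None else now - self._down_since


def should_alarm(timer: EntityDownTimer, now: float, threshold_seconds: float) -> bool:
    """Equivalent of a rule such as 'entity down timer > threshold'."""
    return timer.value(now) > threshold_seconds


timer = EntityDownTimer()
timer.on_down(100.0)
print(should_alarm(timer, 400.0, 600))   # down for 300 s, below the 600 s threshold
print(should_alarm(timer, 701.0, 600))   # down for 601 s, above the threshold
```

With an attribute like this, the base note's request reduces to an ordinary comparison on one attribute, with the threshold set by the user.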

Alarms could certainly use some extensions like "trigger rule foo
if rule foobar fires". This would make it much easier to start detecting
a problem/condition only after an event (i.e. circuit down) reaches MCC.


- Anil
711.11. "clock it" by JETSAM::WOODCOCK Thu Feb 21 1991 17:59 (13 lines)
The use of a generic timer seems to be an excellent approach. The start 
clock can be driven by an event/alarm and then stopped by another event/alarm,
then report that time. The expression should be able to handle different
events for start and stop. In this case it's circuit down and circuit up, and
it should allow for continuous monitoring of the events and not rely on the
Alarms FM polling for info at intervals (the use of getevent). The basenote wants
to measure the outage times, my comments in .7 were to simply point out that
determining absolute time of multiple events cannot easily be done on an
interval basis. It's not a working solution unless you gather the events and
work one time stamp against the other.
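
Working one time stamp against the other can be sketched as follows in Python. The event representation is an assumption (a sorted list of timestamped "down" and "up" strings), and repeated down events before an up are simply ignored.

```python
from datetime import datetime, timedelta


def outage_durations(events):
    """Pair each 'down' with the next 'up' and return the outage lengths.

    events: list of (timestamp, kind) tuples sorted by time, kind is "down" or "up".
    """
    outages = []
    down_at = None
    for when, kind in events:
        if kind == "down" and down_at is None:
            down_at = when                    # start the clock
        elif kind == "up" and down_at is not None:
            outages.append(when - down_at)    # stop the clock and record
            down_at = None
    return outages


t0 = datetime(1991, 2, 21, 17, 0)
events = [
    (t0, "down"),
    (t0 + timedelta(minutes=1), "up"),
    (t0 + timedelta(minutes=2), "down"),
    (t0 + timedelta(minutes=7), "up"),
]
durations = outage_durations(events)
print(durations)
print(sum(durations, timedelta()))   # total time down across both outages
```

Because every outage is measured from its own pair of events, a bouncing circuit yields several short durations instead of one ambiguous interval.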

brad...

711.12NSSG::R_SPENCENets don&#039;t fail me now...Fri Feb 22 1991 14:186
    In fact, when I read over the first proposal (several replies back)
    of having (sequence(event-1, event-2, duration)), what popped into
    my mind before I read the explanation was that the rule would fire
    when the sequence occurred, and PASS the duration to the procedure.
    
    s/rob
711.13. "from a user only to the experts" by ULYSSE::ZITTA (Batchman) Mon Feb 25 1991 05:34 (89 lines)

	Maybe at this point in the discussion we can broaden the debate.
	
	In our group we remotely monitor, manage, operate and support internal
	and external networks 24 hours a day, 365 days a year. An important
	area to improve (I think) is transmission and physical layer
	management, and my base note was just a small (but important) part
	of it.
	Right now there is no easy and direct DEC tool to remotely monitor or
	manage a line or a circuit. So here are some inputs that, I hope, will
	help to find the best solution.
 
	For "store and forward" networks it is not (very) important to react
	quickly, but for "real-time" networks (where each bit lost is worth
	$$$$$$$$ (-: ) we may need to detect a problem in, let's say, 3 seconds
	max, and troubleshoot (localize) it in 15 minutes max, for instance.
	As a minimum, for the monitoring part, we want to react as soon as a
	circuit is down for more than x seconds/minutes, and for
	troubleshooting we want to be able to retrieve more details if
	needed. The point is to flag only critical events at the monitoring
	center.
	
	There are also different needs (basically for use NOW, DAILY or
	MONTHLY):

	- Monitoring of all the events if needed.
	- Automatic storage and easy retrieval of the events.
	- Detection of Priority #1 problems that need immediate attention or
	  escalation to support.
	- Preventive maintenance: monitor some quality parameters over a
	  longer period (e.g. one month) and detect any degradation.
	  This can be useful for dealing with carriers and planning upgrades.
	- Error performance (ES, SES, DM).
	- Testing: if a link fails, break down the DECmcc results/counters
	  to allow a detailed analysis that will localize the problem.
	  We need results at the end of a test period, daily or monthly, as
	  well as an indication of error distribution over time.

	The above needs can vary, depending on the "maturity" of the service:
		o Development: functional conformance test
		o Installation/commissioning/out-of-service repair:
		  conformance and full quality validation
		o In-service: quality of service monitoring

	It can also depend on the regulations and the country. But a good
	reference is the set of CCITT recommendations G.821, G.921 and M.550.
	The following can give some ideas on what we need to manage digital
	connections:
	- G.821 defines a set of performance objectives to test 64 kbps lines.
	  The parameters are:
		o ES (errored seconds): any second during which an error occurs
		  (unavailable time is excluded).
		o SES (severely errored seconds): any 1-second interval during
		  which the bit error ratio exceeds 10^-3 (unavailable time is
		  also excluded).
		o DM (degraded minutes): any 1-minute interval during which the
		  bit error ratio exceeds 10^-6 (SES and unavailable time are
		  excluded).

	  Unavailable time begins when the bit error ratio in each second
	  exceeds 10^-3 for 10 consecutive seconds. These 10 seconds count as
	  unavailable time.

	- G.921 is for 2048 kbps.
	- M.550 sets limits for ES, SES and DM according to the application
	  and defines the times over which they should be measured.
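
To make the G.821 definitions above concrete, here is a simplified Python sketch that labels each second of a measurement from its bit error ratio. It is an illustration under stated assumptions, not a conforming implementation. The input is assumed to be one BER value per second. Degraded minutes are left out. The rule used to leave unavailable time (10 consecutive seconds that are each no worse than 10^-3) is not stated in the note and is taken from the recommendation's definition of available time.

```python
SES_THRESHOLD = 1e-3   # BER above which a second is severely errored
RUN = 10               # consecutive seconds that start or end unavailable time


def classify_seconds(ber_per_second):
    """Label each second 'OK', 'ES', 'SES' or 'UNAVAILABLE'."""
    bad = [ber > SES_THRESHOLD for ber in ber_per_second]
    labels = []
    unavailable = False
    for i, ber in enumerate(ber_per_second):
        window = bad[i:i + RUN]          # this second and the nine after it
        full = len(window) == RUN
        if not unavailable and full and all(window):
            unavailable = True           # these 10 seconds count as unavailable
        elif unavailable and full and not any(window):
            unavailable = False          # 10 good seconds end the unavailable period
        if unavailable:
            labels.append("UNAVAILABLE")
        elif bad[i]:
            labels.append("SES")
        elif ber > 0:
            labels.append("ES")
        else:
            labels.append("OK")
    return labels


sample = [0.0] * 5 + [1e-5] + [5e-3] * 2 + [0.0] * 3 + [1e-2] * 10 + [0.0] * 10
labels = classify_seconds(sample)
for name in ("OK", "ES", "SES", "UNAVAILABLE"):
    print(name, labels.count(name))
```

The labels here are mutually exclusive. Under the definition quoted above, a severely errored second also contains errors, so a total ES count would add the SES seconds to the ES ones.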

	The best would be a system that lets network managers see results
	graphically on a daily basis, so they can quickly spot which results
	exceed the daily limits.
	Those limits should be programmable (entered manually, or calculated
	by DECmcc from other data such as link grade, length or circuit
	classification). They should satisfy the CCITT recommendations or
	user-defined limits.

	Summary:
	As a minimum, I would like DECmcc to tell me when there is a "real"
	problem.
	As a complement, I would like DECmcc to perform at least some basic
	functions of a data communications analyzer.

	I guess another question is: is it possible in DECmcc for the
	transmission line to be a manageable object?

		gerard