
Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

711.0. "Circuit Down & How long ?" by ULYSSE::ZITTA (Batchman) Tue Feb 12 1991 02:49

    
    Is it (or will it be) possible with NMCC to automatically give an
    alarm only when a CIRCUIT was DOWN for more than a given time 
    (settable by user) ?
    
    Thanks,
    
    	Gerard

T.R	Title	User	Personal Name	Date	Lines
711.1	NMCC or MCC	TOOK::W_MCGRATH	DTN 226-5075	`Wed Feb 13 1991 08:46`	11
	Are you referring to NMCC (NMCC DECnet Monitor) or DECmcc?

	If you are referring to the point product tool, NMCC, then try the
	NAC::NMCC notes conference. NMCC cannot update the map reachability
	status based on the amount of time a node is down.

	If you are referring to DECmcc (the EMA product), you might want to
	ask again here. My understanding is that as long as the attribute
	(in this case, seconds the circuit has been down) is recognized by
	the AM, the Alarms FM can alarm on the attribute (with >, <, =,
	etc.).

	-Will-
711.2	DECmcc	ULYSSE::ZITTA	Batchman	`Thu Feb 14 1991 03:43`	15
	Re: .-1

	Sorry for the lack of precision. I meant DECmcc.

	Can somebody confirm this and explain precisely how to achieve it
	with DECmcc?

	Another related question: if a circuit is down, I would like to be
	sure that ACCOUNTING/BILLING is stopped during that time. Is that
	possible?

	Thanks,
	Gerard
711.3	Here is a hack!	WAKEME::ANIL		`Tue Feb 19 1991 19:46`	46
	Re: .0

	> Is it (or will it be) possible with NMCC to automatically give an
	> alarm only when a CIRCUIT was DOWN for more than a given time
	> (settable by user) ?
	>
	> Thanks,
	>
	> Gerard

	Let me accept up front that the functionality you have requested
	does not exist in V1.1 Alarms, nor is it planned for V1.2. Based on
	how important you (and others) feel this functionality is, we could
	use it as an input to V2.0.

	Having said that, here is a hack. Jim Carry may want to correct me
	if I am wrong, but from my limited understanding of DECnet Phase IV
	it should work. There are a number of ways you can accomplish what
	you are trying to do. I am suggesting just one way of doing it!

	You can make one rule that watches for a Circuit Down event from
	that circuit. The rule fires when the circuit goes down. Let this
	trigger another session of MCC (you can do this by invoking a
	command procedure). In this other session of MCC, ZERO all the
	counters for this circuit. Then enable a rule that looks for the
	attribute "Byte sent" to be = 0! If you want to make sure the
	circuit comes up within 10 minutes, start watching after 10
	minutes. If the rule fires after 10 minutes, then you know for sure
	that the circuit was down for 10 minutes.

	The rule expression for such a rule will be along the following
	lines:

	    expression = (node4 <node-name> circuit <cir-name> byte sent = 0, at start +00:10)

	It is a hack, but then you don't have to wait till V2.0!

	- Anil
	  Who-is-not-very-happy-with-his-own-answer!
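A minimal sketch of the delayed check described in this reply, written in Python rather than as DECmcc rules. `CircuitStub`, `down_for_whole_delay` and the `advance` callback are hypothetical names used only for illustration; they are not part of DECmcc, and the "wait" is simulated by a callback instead of a real 10-minute delay.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CircuitStub:
    """Hypothetical stand-in for a circuit's counters (not a DECmcc API)."""
    bytes_sent: int = 0

    def zero_counters(self) -> None:
        # Corresponds to the ZERO step in the second MCC session.
        self.bytes_sent = 0


def down_for_whole_delay(circuit: CircuitStub, advance: Callable[[], None]) -> bool:
    """Zero the counters at the down event, let the delay elapse, then
    report whether 'bytes sent' is still 0 (the rule would fire)."""
    circuit.zero_counters()
    advance()  # stands in for waiting the delay, e.g. 10 minutes
    return circuit.bytes_sent == 0


# Example: no traffic during the delay, so the rule would fire.
quiet = CircuitStub(bytes_sent=500)
fired = down_for_whole_delay(quiet, lambda: None)
```

Note that the check only shows that no bytes were counted during the delay; it infers the outage from the counter, as the hack itself does.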
711.4	too bad!	ULYSSE::ZITTA	Batchman	`Wed Feb 20 1991 08:44`	19
	Thanks, Anil, for the information.

	Glad to hear such functionality was already on the wishlist. It
	would be very interesting for any network management center. Right
	now the existing tools usually send so many "unuseful" alarms that
	you can miss the real problems. It would be nice to have a tool
	that can acknowledge the unuseful events "by itself" and flag only
	the real problems on the map. (An example of what I call "unuseful"
	is CIRCUIT DOWN, and all the related events, followed closely by
	CIRCUIT UP.)

	Anyway, thanks for your reply. We'll try to put a workaround in
	place in the meantime.

	Gerard
711.5	Want it? I crave it!	SICKO::FRAZER	Vision operates on many levels...	`Wed Feb 20 1991 18:49`	6
	>> Based on how important you (and others) feel this functionality
	>> is, we could use it as an input to V2.0.

	In a word, VERY_IMPORTANT!

	timf
711.6	How might you express this type of Rule ?	WAKEME::ROBERTS	Keith Roberts - DECmcc Alarms Team	`Thu Feb 21 1991 09:14`	28
	We agree that this type of functionality would be very important
	for managing a network. How would it be expressed in an Alarm Rule?

	Maybe:

	------------------------------------------------------------------------
	SEQUENCE( Entity-Event-1, Entity-Event-2, Duration )

	Ex: (SEQUENCE(node4 A circuit qna-0 circuit down, -
	              node4 A circuit qna-0 circuit up, +00:10:00))

	"If Event-1 occurs, followed by Event-2 within the Duration, tell me"
	------------------------------------------------------------------------

	o Anyone have any other ideas?

	o What should we be adding to the wish list for Alarm Rules?

	  - Please try to give suggestions for how this could be written in
	    terms of an Alarm Rule Expression.

	Thanks,
	Keith Roberts
	DECmcc Alarms Team
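To make the proposed semantics concrete, here is a rough Python sketch of "Event-1 followed by Event-2 within the Duration". It is an illustration only: `sequence_matches` is a hypothetical helper, events are assumed to be `(timestamp, name)` pairs, and when Event-1 repeats before Event-2 the sketch measures from the most recent Event-1, which is one possible reading of the proposal.

```python
from datetime import datetime, timedelta


def sequence_matches(events, first, second, within):
    """Return (t1, t2) pairs where `first` is followed by `second`
    no later than `within` after it. `events` holds (timestamp, name)."""
    matches = []
    pending = None  # timestamp of the most recent unmatched `first`
    for ts, name in sorted(events):
        if name == first:
            pending = ts
        elif name == second and pending is not None:
            if ts - pending <= within:
                matches.append((pending, ts))
            pending = None
    return matches


# Example: one short outage (4 minutes) and one long outage (30 minutes).
t0 = datetime(1991, 2, 21, 10, 0)
log = [
    (t0, "circuit down"),
    (t0 + timedelta(minutes=4), "circuit up"),
    (t0 + timedelta(hours=1), "circuit down"),
    (t0 + timedelta(hours=1, minutes=30), "circuit up"),
]
short_outages = sequence_matches(log, "circuit down", "circuit up",
                                 timedelta(minutes=10))
```

Only the first outage satisfies the 10-minute window here; the 30-minute outage never matches.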
711.7	catch 22	JETSAM::WOODCOCK		`Thu Feb 21 1991 12:26`	36
	Hi,

	What you're attempting is probably more difficult than meets the
	eye. Circuits don't typically go down cleanly and then come up
	cleanly; they very often bounce many times. Will your sequence
	example below be able to handle multiple outages within that ten
	minutes? If not, you would have to reduce the ten minutes: the
	smaller the interval, the more accurate.

	Here's the catch 22. What happens if the circuit goes down cleanly
	for longer than 10 minutes? This means your sequence won't be
	satisfied. Therefore you would have to increase the ten minutes:
	the larger the better.

	Also, from a net management standpoint, a steadily bouncing circuit
	has a far worse impact on the net than a circuit which simply goes
	down and stays down for the outage. Every time there is an event
	(up or down), a routing update occurs which traverses and
	propagates through the net. This impact on routing can't be
	measured to keep our vendors and our rebates honest.

	Although you appear to be on the right track, there are practical
	pitfalls which I'm not certain how to resolve.

	regards,
	brad...

	>Maybe:
	> ------------------------------------------------------------------------
	> SEQUENCE( Entity-Event-1, Entity-Event-2, Duration )
	> Ex: (SEQUENCE(node4 A circuit qna-0 circuit down, -
	>               node4 A circuit qna-0 circuit up, +00:10:00))
	> "If Event-1 occurs, followed by Event-2 within the Duration, tell me"
711.8	One shoe for all feet?	TOOK::KOHLS	Ruth Kohls	`Thu Feb 21 1991 13:14`	16
	Re: .7

	It seems to me that you are trying to make too general a tool from
	one, very specific, rule. You will still have to set up alarms and
	notifications for each situation you want to cover--and a 10 minute
	period isn't going to fit a lot of situations!

	"Just" listing various things that circuits do when they aren't
	behaving well is a starting point, and I suspect the Alarms team
	will find your description helpful. It sounds like they may need an
	expert system or two, among other things (;-).

	I thought the question in .6 was more along the lines of: would an
	expression like this make mcc alarms nicer for you?

	Ruth Kohls (Mcc Kernel Team)
711.9	It sounds like AI to me 8)	WAKEME::ROBERTS	Keith Roberts - DECmcc Alarms Team	`Thu Feb 21 1991 13:59`	17
	There is an extension to the OCCURS function we've been planning:

	    OCCURS( Event, Count, Duration )

	If 'Event' occurs 'Count' times within this 'Duration' ... let me
	know.

	But - I think Ruth hit the nail ... What describes a happy (or not
	so happy) circuit? It can't be expressed as one simple statement.

	We want to add AND's and OR's to the Rule Expression so you can
	combine 'thoughts' (not for v1.1 .. sorry)

	Let's keep this discussion going -- from what I see DECmcc already
	has one of the most powerful Alarming capabilities going.

	/Keith
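One way to read the planned OCCURS( Event, Count, Duration ) is as a sliding-window count. The Python sketch below illustrates that reading only; `occurs` is a hypothetical helper, not the Alarms FM implementation, and it assumes the event timestamps have already been collected.

```python
from collections import deque
from datetime import datetime, timedelta


def occurs(timestamps, count, duration):
    """True if at least `count` events fall inside some window of
    length `duration` (sliding-window reading of OCCURS)."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that are too old to share a window with `ts`.
        while ts - window[0] > duration:
            window.popleft()
        if len(window) >= count:
            return True
    return False


# Example: a circuit bouncing three times in five minutes.
t0 = datetime(1991, 2, 21, 13, 0)
bounces = [t0, t0 + timedelta(minutes=2), t0 + timedelta(minutes=5)]
is_bouncing = occurs(bounces, count=3, duration=timedelta(minutes=10))
```

A check of this kind addresses the bouncing-circuit case raised in .7, which a single down/up sequence does not capture.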
711.10	Can an AM help us here?	WAKEME::ANIL		`Thu Feb 21 1991 16:44`	21
	Generic solutions, if powerful enough, will solve many problems
	with surprising elegance. However, I am afraid CIRCUIT DOWN is more
	of a technology-specific problem. We can solve the detection of the
	circuit bouncing problem with the OCCURS function and the extension
	indicated in .-1 by Keith.

	I was wondering if the AMs can provide a counter such as

	    entity down timer = xyz seconds

	The above timer could be a generic timer, and any Entity AM could
	support it in a generic fashion at any global/child entity level.
	Availability of such a counter would make the Alarms job
	ridiculously easy! It would be zeroed every time the entity
	(circuit) reaches a stable Enabled/On state!

	Alarms could certainly use some extensions, like

	    trigger rule foo if rule foobar fires.

	This would make it much easier to start detecting a
	problem/condition only after an event (i.e. circuit down) reaches
	MCC.

	- Anil
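As a rough illustration of the suggested "entity down timer", the Python sketch below keeps a running count of seconds down and resets it when the entity comes back up. `DownTimer` and its methods are hypothetical names, time is passed in as plain seconds, and the sketch treats any up event as the "stable Enabled/On state", which is a simplification of what the reply asks for.

```python
class DownTimer:
    """Hypothetical 'entity down timer' kept per entity (not a DECmcc API)."""

    def __init__(self):
        self.down_since = None  # time of the first down event, or None if up

    def on_event(self, name, now):
        if name == "down" and self.down_since is None:
            self.down_since = now
        elif name == "up":
            # Simplification: any up event counts as a stable On state.
            self.down_since = None

    def seconds_down(self, now):
        return 0 if self.down_since is None else now - self.down_since

    def should_alarm(self, now, threshold):
        """The base note's request: alarm only after `threshold` seconds down."""
        return self.seconds_down(now) > threshold


# Example: circuit goes down at t=0; check again at t=601 with a 600 s limit.
timer = DownTimer()
timer.on_event("down", now=0)
alarm = timer.should_alarm(now=601, threshold=600)
```

With such an attribute available, the user-settable limit from the base note becomes an ordinary comparison on `seconds_down`.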
711.11	clock it	JETSAM::WOODCOCK		`Thu Feb 21 1991 17:59`	13
	The use of a generic timer seems to be an excellent approach. The
	clock can be started by one event/alarm and stopped by another
	event/alarm, and that time then reported. The expression should be
	able to handle different events for start and stop. In this case
	they are circuit down and circuit up, and it should allow for
	continuous monitoring for the events and not rely on the Alarms FM
	to poll for info at intervals (the use of getevent).

	The basenote wants to measure the outage times; my comments in .7
	were simply to point out that determining the absolute time of
	multiple events cannot easily be done on an interval basis. It's
	not a working solution unless you gather the events and work one
	time stamp against the other.

	brad...
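Working "one time stamp against the other" can be sketched as pairing each down event with the next up event in a gathered event log. The Python below is an illustration under assumptions: `outage_durations` is a hypothetical helper, timestamps are plain seconds, and a repeated down event while already down is ignored so the outage is measured from the first one.

```python
def outage_durations(events):
    """Pair each 'down' with the next 'up' and return the outage lengths.
    `events` is an iterable of (timestamp_seconds, "down" | "up")."""
    durations = []
    down_at = None
    for ts, name in sorted(events):
        if name == "down":
            if down_at is None:  # ignore repeated downs during one outage
                down_at = ts
        elif name == "up" and down_at is not None:
            durations.append(ts - down_at)
            down_at = None
    return durations


# Example: a 30 s bounce followed by a longer outage with a repeated down.
log = [(0, "down"), (30, "up"), (40, "down"), (45, "down"), (700, "up")]
outages = outage_durations(log)
total_down = sum(outages)
```

Because it works from event timestamps rather than polled samples, the result does not depend on a polling interval.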
711.12		NSSG::R_SPENCE	Nets don't fail me now...	`Fri Feb 22 1991 14:18`	6
	In fact, when I read over the first proposal (several replies back)
	of having (sequence(event-1, event-2, duration)), what popped into
	my mind before I read the explanation was that the rule would fire
	when the sequence occurred, and PASS the duration to the procedure.

	s/rob
711.13	from a user only to the experts	ULYSSE::ZITTA	Batchman	`Mon Feb 25 1991 05:34`	89
	Maybe at this point in the discussion we can broaden the debate.

	In our group we remotely monitor, manage, operate and support
	internal and external networks 24 hours a day, 365 days a year. An
	important area to improve (I think) is transmission and physical
	layer management, and my base note was just a small (but important)
	part of it. Right now there is no easy and direct DEC tool to
	remotely monitor or manage a line or a circuit. So here are some
	inputs that, I hope, will help to find the best solution.

	For "store and forward" types of networks, it is not (very)
	important to react quickly, but for "real-time" networks (where
	each bit lost is worth $$$$$$$$ (-: ), we may need to detect a
	problem in, let's say, 3 seconds max, and troubleshoot it
	(localize) in 15' max, for instance.

	As a minimum, for the monitoring part, we want to react as soon as
	a circuit is down for more than x seconds/minutes, and for
	troubleshooting we want to be able to retrieve more details if
	needed. The point is to flag only critical events at the monitoring
	center.

	There are also different needs (basically for use NOW, DAILY or
	MONTHLY):

	- Monitoring of all the events if needed.
	- Automatic storage and easy retrieval of the events.
	- Detection of Priority #1 problems that need immediate attention
	  or escalation to support.
	- Preventive maintenance: monitor some quality parameters over a
	  longer period (e.g. one month) and detect any degradation. This
	  can be useful for dealing with carriers and planning upgrades.
	- Error performance (ES, SES, DM).
	- Testing: if a link fails, break down the DECmcc results/counters
	  to ensure a detailed analysis that will localize the problem.

	We need results at the end of a test period, daily or monthly, as
	well as an indication of error distribution over time.

	The above needs can vary, depending on the "maturity" of the
	service:

	  o Development: functional conformance test
	  o Installation/commissioning/out-of-service repair: conformance
	    and full quality validation
	  o In-service: quality of service monitoring

	They can also depend on the regulations and the country, but a good
	reference is the set of CCITT recommendations G.821, G.921 and
	M.550. The following can give some ideas on what we need to manage
	digital connections:

	- G.821 defines a set of performance objectives to test 64 kbps
	  lines. The parameters are:

	  o ES (Errored seconds): any second during which an error occurs
	    (unavailable time is excluded)
	  o SES (Severely errored seconds): any 1-second interval during
	    which the bit error ratio exceeds 10^-3 (unavailable time is
	    excluded too)
	  o DM (Degraded minutes): any 1-minute interval during which the
	    bit error ratio exceeds 10^-6 (SES and unavailable time are
	    excluded)

	  Unavailable time begins when the bit error ratio in each second
	  exceeds 10^-3 for 10 consecutive seconds. These 10 seconds are
	  unavailable time.

	- G.921 is for 2048 kbps.
	- M.550 sets limits for ES, SES and DM according to the application
	  and defines the times over which they should be measured.

	The best would be a system that lets network managers see results
	graphically on a daily basis, to quickly see which results exceed
	the daily limits. Those limits should be programmable (entered
	manually, or calculated by DECmcc depending on other data like link
	grade, length or circuit classification). They should satisfy the
	CCITT recommendations or user-defined limits.

	Summary: as a minimum, I would like DECmcc to tell me when there is
	a "real" problem. As a complement, I would like DECmcc to perform
	at least some basic functions of a data comm. analyzer.

	I guess another question is: is it possible in DECmcc for the
	transmission line to be a manageable object?

	gerard
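The ES/SES/unavailable-time definitions quoted above can be illustrated with a small Python sketch that classifies per-second error counts on a 64 kbps line. This is a simplified reading, not a conformant G.821 implementation: `g821_counts` is a hypothetical helper, degraded minutes are not computed, and every run of 10 or more consecutive severely errored seconds is treated as unavailable time because the note does not state how unavailable time ends.

```python
def g821_counts(errors_per_second, bit_rate=64_000, ses_ber=1e-3, run=10):
    """Count ES, SES and unavailable seconds from per-second bit errors.
    Simplified: DM is not computed, and unavailable time is taken to be
    each run of >= `run` consecutive seconds with BER above `ses_ber`."""
    severe = [e / bit_rate > ses_ber for e in errors_per_second]
    unavailable = [False] * len(severe)

    i = 0
    while i < len(severe):
        if severe[i]:
            j = i
            while j < len(severe) and severe[j]:
                j += 1
            if j - i >= run:  # long enough to count as unavailable time
                for k in range(i, j):
                    unavailable[k] = True
            i = j
        else:
            i += 1

    # ES and SES exclude unavailable seconds; an SES is also an ES.
    es = sum(1 for e, u in zip(errors_per_second, unavailable) if e > 0 and not u)
    ses = sum(1 for s, u in zip(severe, unavailable) if s and not u)
    return {"ES": es, "SES": ses, "unavailable": sum(unavailable)}


# Example: isolated errors, one severe second, then a 10-second outage.
samples = [0, 1, 100, 0] + [100] * 10 + [0, 2]
counts = g821_counts(samples)
```

Daily or monthly totals of this kind are what would be compared against the M.550 or user-defined limits mentioned in the note.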