Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

1194.0. "alarms on 'recorded' data??" by JETSAM::WOODCOCK () Thu Jun 27 1991 15:28

Hi,

Is there any way I can write an ALARM for *past* time which has been
RECORDED?? I can't find it in the manuals specifically and if it is
supported I have failed at syntax! Actually MCC let me create an alarm
using the FOR qualifier in the SOT but when it was enabled it immediately
disabled with "scheduled time passed" error. I'm looking for alarms on
stats.

any help most appreciated,
brad...

T.R	Title	User	Personal Name	Date	Lines
1194.1	Try SHOW first. If that works we may have a bug in Alarms!	WAKEME::ANIL		`Thu Jun 27 1991 17:54`	17
	Hi Brad,

	As usual you were the first to try Alarms on past data. I don't see why Alarms should not be able to handle the case. I did not document it just because I felt it would be a very difficult concept for a user to grasp and its usefulness was questionable.

	Now that you have tried it with not so happy results, can I ask you to try it on just pure data and not stats? Another question: were you able to do a "show" on the data that you were trying to Alarm on?

	Let us know your findings.

	- Anil Navkal
1194.2	avoids polling	JETSAM::WOODCOCK		`Fri Jun 28 1991 13:57`	50
	Hi Anil,

	Now that I know it should work I'll dig in and see what I find. As far as its usefulness goes, I've got some very real needs for it. One complaint about MCC that I've heard is its complexity. I'm not sure if this should be a separate topic or not, but I'm hoping MCC managers are listening. Please bear with me as I get a little long-winded and throw real numbers out on the table.

	There are a couple of issues at hand which MCC should try to improve: the number of rules (i.e. complexity) and the amount of WAN polling necessary to manage the WAN. I know there have been some procedures to help write the rules, but this should probably be taken a step further to some sort of definable default services for the different entity classes.

	The numbers game... I've got 50 routers and 80 circuits (reality) to manage. If I compare a similar setup to what is used today to manage the net, I come up with the following amounts of rules and polling. This is very conservative and nothing fancy.

	o A rule for each router (50) dealing with circuit outages (events, no polling).
	o A rule for each site (12) dealing with node outages (events, no polling).
	o A rule for each circuit (80) for off-hours circuit monitoring every 15 min. (320 polls/hr)
	o Export or Record, mainly just counters, for each circuit (80) and each router (50) only once an hour. (130 polls/hr)
	o A rule for each circuit for utilization and error threshold as a warning. One each for inbound, outbound, and errors: 80 circuits, 240 rules polling at hourly intervals. PA polls twice per interval, therefore 480 polls/hr.
	o A rule for each circuit for utilization and error threshold as a problem. One each for inbound, outbound, and errors: 80 circuits, 240 rules polling at hourly intervals. PA polls twice per interval, therefore 480 polls/hr.
	o A rule for each router (50) for packet throughput each hour. (100 polls/hr)

	Grand totals are in the neighborhood of 670 rules and 1510 polls/hr...conservatively!!! No can do.

	I may not be able to get the number of rules down, but if I use past time for non-real-time needs (i.e. all stats) I can reduce the polling by more than 1000 polls/hr. While I love the versatility of this product, it does come at a price. Other large companies will also have to grapple with these numbers and look to cut back on certain non-essentials to make the management manageable.

	In any event, I'll let you know of my successes with alarms for past times.

	best regards,
	brad...
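The tally in the note above can be re-added in a few lines. This is only a sketch of the arithmetic: the variable names are made up for illustration, and the 100 polls/hr for router throughput is assumed to come from PA polling twice per interval, as with the circuit rules.

```python
# Re-adding the rule and poll counts quoted in note 1194.2.
# All names are illustrative; the figures are the ones given in the note.
ROUTERS, CIRCUITS, SITES = 50, 80, 12
PA_POLLS_PER_INTERVAL = 2  # PA polls at the start and end of each interval

# (description, number of rules, polls per hour, counts as "stats")
workload = [
    ("circuit outage rule per router (events)", ROUTERS, 0, False),
    ("node outage rule per site (events)", SITES, 0, False),
    ("off-hours circuit monitoring every 15 min", CIRCUITS, CIRCUITS * 4, False),
    ("export/record counters once an hour", 0, CIRCUITS + ROUTERS, False),
    ("warning thresholds (in, out, errors)", CIRCUITS * 3,
     CIRCUITS * 3 * PA_POLLS_PER_INTERVAL, True),
    ("problem thresholds (in, out, errors)", CIRCUITS * 3,
     CIRCUITS * 3 * PA_POLLS_PER_INTERVAL, True),
    # Assumption: 100 polls/hr = 50 hourly rules x 2 PA polls each.
    ("router packet throughput each hour", ROUTERS,
     ROUTERS * PA_POLLS_PER_INTERVAL, True),
]

total_rules = sum(rules for _, rules, _, _ in workload)
total_polls = sum(polls for _, _, polls, _ in workload)
stats_polls = sum(polls for _, _, polls, is_stats in workload if is_stats)

print(f"rules: {total_rules}, polls/hr: {total_polls}, stats polls/hr: {stats_polls}")
```

The stats-only polls are the ones that could in principle be served from recorded data instead of fresh polls, which is where the "more than 1000 polls/hr" saving comes from.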
1194.3	prob parsing 'in domain ...'	LUVBOT::MCC		`Mon Jul 01 1991 12:01`	24
	In doing a SHOW command the domain needs to be specified. But if I put the "in domain" qualifier into the expression ALARMS doesn't appear to parse it as needed. Bug or unsupported???

	thanks,
	brad...

	create mcc 0 alarms rule past_count -
	expression=(node4 bbpk01 cir syn-0 circuit down>0,for start 16:30 -
	,in domain .pko-24),-
	procedure=mcc_common:mcc_alarms_mail_alarm.com,parameter=mcc,-
	in domain .pko-24
	!
	!MCC 0 ALARMS RULE past_count
	!AT 28-JUN-1991 16:50:43
	!
	!Missing right parenthesis in alarm expression.
	!
	exit
	!
1194.4	Both - bug and unsupported	TOOK::ORENSTEIN		`Mon Jul 01 1991 15:22`	22
	>>> In doing a SHOW command the domain needs to be specified. But if I put the
	>>> "in domain" qualifier into the expression ALARMS doesn't appear to parse it
	>>> as needed. Bug or unsupported???

	Both.

	In investigating this, I found a bug in the parse routine for prepositions. A QAR has been filed.

	Also, I discovered that ALARMS does not support the IN DOMAIN qualifier in expressions. Currently there is no support for examining data on a domain basis. A QAR has been filed.

	I saw your math that states that polling historical data could save you 1000 polls on your network, but I did have some trouble understanding that. How important is this to you? I will see what can get done for V1.2, but I make NO promises.

	aud...
1194.5	It IS important!	NSSG::R_SPENCE	Nets don't fail me now...	`Mon Jul 01 1991 16:16`	32
	The savings in polling is very important. 1000 polls per hour translates to an average of more than 16 per minute. That will take a very big system to support while still leaving some resources to deal with any alarms testing true or exception handling, not to mention any management actions initiated by people.

	Where is the savings? Well, for example...

	To export data on a node4 (a router, for example), the Ethernet line and circuit plus the 4 sync lines and circuits, I have to poll the router 15 times (maybe more?).

	The same sort of number comes up for the Historical Recording. I don't know what happens if you specify several partitions (Brad, you might want to make sure you record and then export characteristics too in case you want to do any external reporting that needs line speeds).

	Then, if we want to have alarm rules for % utilization inbound and outbound plus errors, we add another 15 polls.

	That adds up to 45 polls per router for each time we want all this stuff. If the Historian could record it all with a minimum number of polls, and then Export and Alarms used the recorded data, the polling could perhaps be reduced from 45 to 10 or less.

	All the RFIs and RFPs I am seeing these days on Network Management are actually asking us how much traffic the management system will add to the network. We need to be able to minimize that traffic.

	Hope this clears it up.

	s/rob
1194.6	suggestions	JETSAM::WOODCOCK		`Mon Jul 01 1991 17:08`	38
	Hi,

	> The same sort of number comes up for the Historical Recording. I don't
	> know what happens if you specify several partitions (Brad, you might
	> want to make sure you record and then export characteristics too in
	> case you want to do any external reporting that needs line speeds).

	Actually, I was planning on getting LINE characteristics only once a day for each circuit to handle reports. Like I said, the numbers were conservative.

	As far as what's needed, I'm going to look for different approaches to get the job done with V1.1. I'll probably end up scaling back info (one threshold for errors rather than two) and hack something together that partially uses MCC. But I'm willing to bet big bucks that if our customers understood the mechanics they wouldn't be happy.

	Suggestions, I've got three:

	1. Bring in the support for ALARMS handling historical (in specific domains) data. And fix the parsing bug. VERY IMPORTANT.

	2. Change the way PA operates today. This was a previous suggestion but worth mentioning several times :-). Rather than having PA poll at the beginning and end of each interval, have PA poll once each interval and subtract last_counters from present_counters for calculations (this also holds true for reports). This solves two problems. It effectively reduces the number of polls by half. Also, as the polling interval decreases, MCC's accuracy becomes more dependent on both system and network performance with today's method, because the polls must be timed accurately. If you use the one-poll method the polls could be off but the stats are always on the money. A must in my opinion.

	3. As a bonus, set up a utility which handles default (user definable) services for different entity classes (i.e. alarms, stats). This hides some of the complexity of the management environment.

	best regards,
	brad...
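Suggestion 2 above (one poll per interval, subtracting last_counters from present_counters) could look roughly like the sketch below. It is not how PA is implemented; `Sample`, `DeltaStats`, and the `bytes_sent` counter are hypothetical names, and counter wrap-around is not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    timestamp: float  # seconds; when the poll actually completed
    bytes_sent: int   # cumulative counter read from the entity (hypothetical)


class DeltaStats:
    """Derive per-interval rates from a single poll per interval."""

    def __init__(self) -> None:
        self._last: Optional[Sample] = None

    def update(self, sample: Sample) -> Optional[float]:
        """Return bytes/second since the previous poll, or None on the first poll."""
        previous, self._last = self._last, sample
        if previous is None:
            return None
        elapsed = sample.timestamp - previous.timestamp
        if elapsed <= 0:
            return None  # clock went backwards or duplicate poll; skip it
        # Dividing by the measured elapsed time keeps the rate correct
        # even when a poll arrives late.
        return (sample.bytes_sent - previous.bytes_sent) / elapsed


stats = DeltaStats()
for sample in (Sample(0.0, 1000), Sample(3600.0, 37000), Sample(7215.0, 73150)):
    print(stats.update(sample))
```

Each poll closes one interval and opens the next, so the poll count is halved, and because the rate is computed over the measured elapsed time, a poll that arrives 15 seconds late (the third sample) does not skew the result.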
1194.7	I have a dream...	WAKEME::ANIL		`Tue Jul 02 1991 13:40`	47
	Hi Rob and Brad,

	Thanks for the valuable data about the number of rules needed to manage a reasonably sized network. We will look very seriously at providing domain support in rule expressions, but we also have to face the reality of available (or lack thereof!) people power.

	Talking along the lines of suggestions, from the user's point of view the following thought makes a hell of a lot of sense:

	Record the following attributes for entity foo every 1 hour, and by the way let me know if the attributes cross the thresholds indicated for them.

	List of attribute partitions to record
	  o Characteristics
	  o Counters
	  o Status

	List of attributes    Threshold values              Change
	                      upper bound   lower bound     from      to
	aaa                   10            20
	bbb                   30            40
	ccc                                                 Enable    Disable
	ddd                                                 router    non router
	eee                   15.5          20.5

	Yes, I know I am dreaming for now. But I do want to make two points.

	1. There is no reason why we cannot evaluate the data as it is being collected. Yes, that does mean Alarms and Historian have to communicate a lot! But look at the advantage. We need not poll twice for the same data, nor do we have to wait for the data to be in the MIR.

	2. A very simplified user interface that does not need 100 rules to monitor 100 attributes! Thus saving on resources.

	I know all this is hindsight. I only hope it becomes foresight for the future!!

	- Anil Navkal
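Point 1 above (evaluate the data as it is being collected, so nothing is polled twice) might be sketched as follows. Everything here is an assumption made for illustration: `collect_once`, the in-memory `history` list standing in for the Historian, and the reading of the "aaa 10 20" style entries as a pair of numeric bounds.

```python
from typing import Callable, Dict, List, Tuple

Reading = Dict[str, float]


def collect_once(
    poll: Callable[[], Reading],
    history: List[Reading],
    thresholds: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Poll once, record the reading, and check thresholds on the same data."""
    reading = poll()         # the only network request
    history.append(reading)  # recording step (stand-in for the Historian)
    fired = []
    for name, (low, high) in thresholds.items():
        value = reading.get(name)
        if value is not None and not (low <= value <= high):
            fired.append(f"{name}={value} outside [{low}, {high}]")
    return fired


history: List[Reading] = []
thresholds = {"aaa": (10.0, 20.0), "bbb": (30.0, 40.0)}
alarms = collect_once(lambda: {"aaa": 25.0, "bbb": 35.0}, history, thresholds)
print(alarms)
```

One table of thresholds per entity replaces one rule per attribute, which is the second point in the note.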
1194.8	Ain't that the truth	TOOK::ORENSTEIN		`Tue Jul 02 1991 14:46`	6
	Now that sounds like true integration of Network Management Products! aud...
1194.9	Yup	NSSG::R_SPENCE	Nets don't fail me now...	`Tue Jul 02 1991 14:55`	9
	Anil, exactly...

	And add to it integration of Export as well. Seems like there should be a "data gatherer FM" that gets called for entity data and by using fuzzy logic it could reduce the network traffic needed for management by consolidating requests for data.

	s/rob
1194.10	Deja vu!	DFLAT::PLOUFFE	Jerry	`Tue Jul 02 1991 16:07`	18
	> Seems like there should be a "data gatherer FM" that gets called for
	> entity data and by using fuzzy logic it could reduce the network
	> traffic needed for management by consolidating requests for data.

	This is exactly what is needed. We used to call this a "subscription service" and it was talked about many moons ago. I'm glad to see it brought back to light. Hopefully Brad's numbers will provide the necessary justification.

	We did not call it an FM, we called it a "service" since we thought of it as being part of the IM. After all, the IM handles all scheduling of operations (including SHOWs) so it possibly could implement the "fuzzy logic" that you mentioned.

	Whatever the design, it certainly seems to be necessary...

	- Jerry
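As a rough illustration of what consolidating requests could buy, the sketch below has consumers subscribe to (entity, partition) pairs and compares the poll load with and without sharing. It uses plain grouping rather than any "fuzzy logic", it is not an MCC interface, and the class, method, and consumer names are all hypothetical. Sharing assumes every consumer can accept data sampled at the shortest requested interval.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Key = Tuple[str, str]  # (entity, attribute partition)


class SubscriptionService:
    """Group data requests so each (entity, partition) is polled only once."""

    def __init__(self) -> None:
        self._subs: DefaultDict[Key, List[Tuple[str, int]]] = defaultdict(list)

    def subscribe(self, consumer: str, entity: str, partition: str,
                  interval_min: int) -> None:
        self._subs[(entity, partition)].append((consumer, interval_min))

    def polls_per_hour_unshared(self) -> float:
        # Every consumer polls on its own schedule.
        return sum(60 / interval
                   for subs in self._subs.values()
                   for _, interval in subs)

    def polls_per_hour_shared(self) -> float:
        # One poll stream per key, at the shortest interval anyone asked for.
        return sum(60 / min(interval for _, interval in subs)
                   for subs in self._subs.values())


service = SubscriptionService()
for consumer in ("alarms", "historian", "export"):
    service.subscribe(consumer, "node4 bbpk01 circuit syn-0", "counters", 60)

print(service.polls_per_hour_unshared(), service.polls_per_hour_shared())
```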
1194.11	A couple of solutions	TOOK::ORENSTEIN		`Tue Jul 09 1991 14:25`	28
	Back to the original topic: Can ALARMS do rules on historical data?

	re .3

	There are two possibilities for allowing this:

	1. Let the user decide:

	As you did in your example, we could allow the IN DOMAIN qualifier in the expression (and the bug you found could be fixed). But this may be confusing because you could be in domain A and have rules on data recorded from domain B.

	2. Make it transparent:

	The domain in which the rules are ENABLED could be used for determining the domain of the recorded data. In this case, the IN DOMAIN qualifier will not be allowed in a rule expression; but, when using the MAP everything will be transparent since the domain is implicit on every command. This means that you can be in Domain A and any rules that you Enable will only watch entities in Domain A.

	I prefer possibility 2. Feedback?

	aud...
1194.12	either method ok	JETSAM::WOODCOCK		`Tue Jul 09 1991 15:21`	6
	I think method two would be sufficient for our needs. Although someone down the road might find uses for the first method depending on the domain structure and how they intend to use it. regards, brad...
1194.13	Clarification on .-2	WAKEME::ANIL		`Wed Jul 10 1991 09:29`	26
	Before anyone jumps at us I would like to clarify the following point:

	> 2. Make it transparent:
	>
	> The domain in which the rules are ENABLED could be used for
	> determining the domain of the recorded data. In this case, the IN
	> DOMAIN qualifier will not be allowed in a rule expression; but, when
	> using the MAP everything will be transparent since the domain
	> is implicit on every command. This means that you can be in Domain A
	> and any rules that you Enable will only watch entities in Domain A.
	                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

	What Alarms will do is issue a SHOW/GETEVENT directive with IN_Q filled in. The MM can then choose to ignore the IN_Q qualifier and provide the information for the entity. What this means is that if the node foo is not a member of Domain A, Alarms will still get the data to evaluate the rule as long as past time has not been specified. If you do specify past time, the Historian will have to have collected the data for the node foo, which in turn will have to be a member of Domain A! (Boy, is it complicated!!)

	Hope this helps. ;)

	- Anil
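The two cases in the clarification above can be written down as a small decision function. This is only a restatement of the described behaviour, not Alarms code: `data_available`, `domain_members`, and `recorded` are hypothetical names, and the live case assumes the MM does ignore IN_Q.

```python
from typing import Dict, Set, Tuple


def data_available(
    entity: str,
    rule_domain: str,
    past_time: bool,
    domain_members: Dict[str, Set[str]],
    recorded: Set[Tuple[str, str]],
) -> bool:
    """Can a rule enabled in rule_domain get data for entity?"""
    if not past_time:
        # Live request: assumed here that the MM ignores IN_Q,
        # so domain membership does not matter.
        return True
    # Past time: the Historian must have recorded the entity in this
    # domain, which in turn requires the entity to be a member of it.
    is_member = entity in domain_members.get(rule_domain, set())
    return is_member and (rule_domain, entity) in recorded


members = {"A": {"foo"}}
recordings = {("A", "foo")}
print(data_available("bar", "A", False, members, recordings))
print(data_available("bar", "A", True, members, recordings))
print(data_available("foo", "A", True, members, recordings))
```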