
Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

2405.0. "Alarms/Notifi probs: x1.2.15" by ICS::WOODCOCK () Mon Feb 24 1992 16:13

X1.2.15 3520 w/32M V5.4-3

Hi there,

My first five minutes of testing x1.2.15 has been an adventure.

I have a top level domain (bb) and a domain below it (pko-24) and both
are dynamic. Within the lower domain I have a NODE4 entity which I added
a line to the map as the NODE4's circuit subentity. I look-up and then back
down and the first thing I notice is the domain is not in cache due to the
time it takes to repaint. I guess I'm the first to say I already miss the
quick screen refresh. Secondly when I got back down into the lower domain
my map edits were gone. There needs to be a pop-up window if edits were made
(like when exiting) when navigating for saving the map.

I then created the most basic alarm for domain pko-24 node4 bbpk04 circuit
syn-0 substate <> none polling each minute (from fcl). My IMPM session is
viewing the bb domain and the alarm is enabled via fcl. The circuit is down
for testing purposes. The pko-24 domain turns red then goes back to black
and the circuit is still down. This happens each minute. Looking down into
domain pko-24 shows the CIRCUIT line black and the NODE4 RED. The next time
the alarm fires while viewing this domain the CIRCUIT line turns RED and the
NODE4 turns RED then back to BLACK (this happens each minute).

When the circuit is brought up via ncp there is no change in the map (the
CIRCUIT subentity line stays RED). 

Finally the alarm rule counters look pretty confused probably because I
have drawn in the extra circuit subentity as a line. Each time a 'real'
evaluation=false the EVALUATION FALSE counter increments by 3. Each time
a 'real' evaluation=true the EVALUATION TRUE counter increments by one and
the EVALUATION FALSE counter increments by 2. 

Some of these more serious problems were brought out in the last kit. I was
hoping they would be fixed but it is obvious forward steps have not been
made. Considering the sickness of ALARMS and NOTIFICATIONS pointed out is
there any time frame for fixes??? I wouldn't dream of putting this kit into
production level use and unfortunately that is the type of use needed to bring
out more of the 'abstract' types of problems external users may encounter, and
so my hands are tied until these fixes are available.

kind regards,
brad...

T.R	Title	User	Personal Name	Date	Lines
2405.1	Lets get this straight ...	NANOVX::ROBERTS	Keith Roberts - DECmcc Toolkit Team	`Mon Feb 24 1992 17:02`	21
RE: .0

Let me see if I can understand what's going on here. You have created a
Rule via the FCL. I guess your syntax was something like:

    create domain pko-24 rule <rule-name> -
        expression = (node4 bbpk04 circuit syn-0 substate <> none, at every 00:01:00)

When this Rule is enabled you see the 'evaluation' counters increment by
more than 1 per evaluation (i.e., per minute) ... Right? Too strange.

Could you try the same test without the IMPM, just by using the FCL and
Notification FM? Type:

    notify domain pko-24

before enabling the rule. Let it run for a bit and Show the Rule Counters
every few evaluations. Then post the results here.

thanks, Keith
2405.2	results of test	ICS::WOODCOCK		`Tue Feb 25 1992 09:30`	186
Hi Keith,

Thanks for your quick reply. Here are the results you asked for. When the
rule was enabled the circuit was up so it did not fire for the first couple
of minutes. Right at the point when I spawned to show the time I made the
circuit go 'synchronizing' causing it to fire. You'll note that 17 seconds
after the rule is enabled I did the first SHO COUNT and there are 3
Evaluation False already. I then showed the char. of the rule for your
scrutiny. Two notifies came each firing (true) and this also does not match
the counters.

best regards,
brad...

ps. anything on the color update problems (my real concern)??

notify domain .pko-24
!%MCC-S-NOTIFSTART, Notify request 2 started
!
enable domain .pko-24 rule bbpk04_0
!
!Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!AT 25-FEB-1992 08:59:56
!
!Normal operation has begun.
!
show domain .pko-24 rule bbpk04_0 all count
!
!Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!AT 25-FEB-1992 09:00:16 Counters
!
!Examination of attributes shows:
! Creation Timestamp = 25-FEB-1992 08:59:56.64
! Evaluation True = 0
! Evaluation False = 3
! Evaluation Error = 0
!
show domain .pko-24 rule bbpk04_0 all char
!
!Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!AT 25-FEB-1992 09:02:09 Characteristics
!
!Examination of attributes shows:
! Expression = (node4 bbpk04 circuit syn-* substate
! <> none,at every 0:1:0)
! Severity = Critical
! Probable Cause = Unknown
!
!
show domain .pko-24 rule bbpk04_0 all count
!
!Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!AT 25-FEB-1992 09:02:43 Counters
!
!Examination of attributes shows:
! Creation Timestamp = 25-FEB-1992 08:59:56.64
! Evaluation True = 0
! Evaluation False = 9
! Evaluation Error = 0
!
sp sho time
! the circuit was brought down here and the rule began to fire
! each minute.
!
!!!!!!!!!!!!!!! Alarm, 25-FEB-1992 09:03:59 !!!!!!!!!!!!!! [2]
!Domain: NOCMAN_NS:.pko-24 Severity: Critical
!Notification Entity: Node4 NOCMAN_NS:.BBPK04 Circuit SYN-0
!Event Source: Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!Event: OSI Rule Fired
!
! Event Type = QualityofServiceAlarm
! Event Time = 25-FEB-1992 09:03:57.25
! Probable Cause = Unknown
! Additional Info = { (
! significance = True,
! information = "Rule fired: Node4 24.545 Circuit
! SYN-0 Substate = Synchronizing
! 25-FEB-1992 09:03:57.18" ),
! (
! significance = True,
! information = "(node4 bbpk04 circuit syn-*
! substate <> none,at every 0:1:0)
! " ) }
! Managed Object = Node4 24.545 Circuit SYN-0
! Perceived Severity = Critical
!
!
!
!!!!!!!!!!!!!!! Alarm, 25-FEB-1992 09:04:03 !!!!!!!!!!!!!! [2]
!Domain: NOCMAN_NS:.pko-24 Severity: Clear
!Notification Entity: Node4 NOCMAN_NS:.BBPK04 Circuit SYN-1
!Event Source: Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!Event: OSI Rule Fired
!
! Event Type = QualityofServiceAlarm
! Event Time = 25-FEB-1992 09:03:58.22
! Probable Cause = Unknown
! Additional Info = { (
! significance = True,
! information = "Rule cleared: Node4 24.545
! Circuit SYN-1 Substate = None
! 25-FEB-1992 09:03:57.20" ),
! (
! significance = True,
! information = "(node4 bbpk04 circuit syn-*
! substate <> none,at every 0:1:0)
! " ) }
! Managed Object = Node4 24.545 Circuit SYN-1
! Perceived Severity = Clear
!
!
!
show domain .pko-24 rule bbpk04_0 all count
!
!Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!AT 25-FEB-1992 09:04:35 Counters
!
!Examination of attributes shows:
! Creation Timestamp = 25-FEB-1992 08:59:56.64
! Evaluation True = 1
! Evaluation False = 14
! Evaluation Error = 0
!
!!!!!!!!!!!!!!! Alarm, 25-FEB-1992 09:04:58 !!!!!!!!!!!!!! [2]
!Domain: NOCMAN_NS:.pko-24 Severity: Critical
!Notification Entity: Node4 NOCMAN_NS:.BBPK04 Circuit SYN-0
!Event Source: Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!Event: OSI Rule Fired
!
! Event Type = QualityofServiceAlarm
! Event Time = 25-FEB-1992 09:04:57.16
! Probable Cause = Unknown
! Additional Info = { (
! significance = True,
! information = "Rule fired: Node4 24.545 Circuit
! SYN-0 Substate = Synchronizing
! 25-FEB-1992 09:04:57.09" ),
! (
! significance = True,
! information = "(node4 bbpk04 circuit syn-*
! substate <> none,at every 0:1:0)
! " ) }
! Managed Object = Node4 24.545 Circuit SYN-0
! Perceived Severity = Critical
!
!
!
!!!!!!!!!!!!!!! Alarm, 25-FEB-1992 09:05:01 !!!!!!!!!!!!!! [2]
!Domain: NOCMAN_NS:.pko-24 Severity: Clear
!Notification Entity: Node4 NOCMAN_NS:.BBPK04 Circuit SYN-1
!Event Source: Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!Event: OSI Rule Fired
!
! Event Type = QualityofServiceAlarm
! Event Time = 25-FEB-1992 09:04:57.61
! Probable Cause = Unknown
! Additional Info = { (
! significance = True,
! information = "Rule cleared: Node4 24.545
! Circuit SYN-1 Substate = None
! 25-FEB-1992 09:04:57.11" ),
! (
! significance = True,
! information = "(node4 bbpk04 circuit syn-*
! substate <> none,at every 0:1:0)
! " ) }
! Managed Object = Node4 24.545 Circuit SYN-1
! Perceived Severity = Clear
!
!
!
show domain .pko-24 rule bbpk04_0 all count
!
!Domain NOCMAN_NS:.pko-24 Rule bbpk04_0
!AT 25-FEB-1992 09:05:11 Counters
!
!Examination of attributes shows:
! Creation Timestamp = 25-FEB-1992 08:59:56.64
! Evaluation True = 2
! Evaluation False = 16
! Evaluation Error = 0
!
use log off
!
2405.3	You are using Child Wildcards in your Rule Expression	NANOVX::ROBERTS	Keith Roberts - DECmcc Toolkit Team	`Tue Feb 25 1992 14:17`	30
RE: .2

You are using Child Wildcards in your Rule Expression. I bet you
have 3 SYN circuits on node BBPK04 -- which explains why the counter
increments by 3.

I am currently implementing Global Wildcarding for Alarms. One of the
sub-tasks is to fix the Rule Transition logic; that is ...

    Previous    Current    Event type
    --------    -------    --------------
    False       False      n/a
    False       True       Rule Fired
    False       Error      Rule Exception
    True        False      Rule Cleared
    True        True       Rule Fired
    True        Error      Rule Exception
    Error       False      Rule Cleared
    Error       True       Rule Fired
    Error       Error      Rule Exception

This existing logic worked fine if there were no wildcards. The new logic
adds another column: the rule target entity.

Could the Icon color problems be due to rules which contained child
wildcards ?

/keith
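The transition table in .3, extended with the "rule target entity" column Keith mentions, can be sketched in Python. This is only an illustration of the idea, not DECmcc code: the names `TRANSITIONS`, `RuleState`, and `poll` are hypothetical, and the assumption that a newly seen entity starts in the False state is mine, not the thread's.

```python
# Illustrative sketch only; not DECmcc source. All names are hypothetical.

# (previous, current) -> event emitted, as tabulated in reply .3
TRANSITIONS = {
    ("False", "False"): None,
    ("False", "True"): "Rule Fired",
    ("False", "Error"): "Rule Exception",
    ("True", "False"): "Rule Cleared",
    ("True", "True"): "Rule Fired",
    ("True", "Error"): "Rule Exception",
    ("Error", "False"): "Rule Cleared",
    ("Error", "True"): "Rule Fired",
    ("Error", "Error"): "Rule Exception",
}


class RuleState:
    """Keeps the previous evaluation result separately for each target entity."""

    def __init__(self):
        self.previous = {}  # entity name -> "True" / "False" / "Error"
        self.counters = {"True": 0, "False": 0, "Error": 0}

    def evaluate(self, entity, result):
        """Count one evaluation and return the event to emit, or None."""
        self.counters[result] += 1
        # Assumption: an entity that has never been evaluated starts as False.
        prev = self.previous.get(entity, "False")
        self.previous[entity] = result
        return TRANSITIONS[(prev, result)]


def poll(state, substates):
    """One polling pass over every circuit matched by the wildcard.

    The rule expression is 'substate <> none', so it is True for any
    circuit whose substate is something other than 'none'.
    """
    events = []
    for circuit, substate in substates.items():
        result = "True" if substate != "none" else "False"
        event = state.evaluate(circuit, result)
        if event is not None:
            events.append((circuit, event))
    return events
```

With three circuits matching `syn-*` and only syn-0 down, one pass adds one to the True counter and two to the False counter, which matches the counter behaviour reported in .0, and only syn-0 produces an event because syn-1 and syn-2 never leave the False state.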
2405.4	that makes sense to me	ICS::WOODCOCK		`Tue Feb 25 1992 15:33`	20
	> You are using Child Wildcards in your Rule Expression. I bet you > have 3 SYN circuits on node BBPK04 -- which explains why the counter > increments by 3. You are most correct and that definitely makes sense for the counters. > Could the Icon color problems be due to rules which contained child > wildcards ? > /keith I suspect you are also right about this but something doesn't feel quite right yet. I'll have to play a little more now that I have a better idea of what is going on and get back to you. thanks, brad...
2405.5	Architecture .nes. Solution	ICS::WOODCOCK		`Wed Feb 26 1992 11:35`	63
Keith,

Again, thanks for pointing out my misinterpretation of what is happening.
I wish I understood what was going on when I first saw this a few weeks back.
There is a "M A J O R" implementation issue at hand and I can only hope your
indication that you are still working in this area will fix this. There seem
to be a couple of topics to talk about. I have gone back and retested and here
is what I've got.

Viewing the top domain, I enable the wildcarded rule. Counters increment by
three appropriately, with TRUE or FALSE increments depending on the circuit
states, as they should. Circuit syn-0 is down and circuits syn-1 & 2 are up.
The rule fires twice each poll period of the alarm. First it goes critical
because syn-0 is down, it then polls syn-1 and turns clear because the
circuit is up, it then polls syn-2 and does nothing because the state of the
rule hasn't changed and the circuit is up. If I view the lower domain while
this is happening the node4 global entity reacts the same; it goes critical
then clear. The line drawn as syn-0 goes critical and stays that way as
expected. All appears to be operating as architected with one glitch which
I'll go into in a paragraph or two. The problem is that the "architecture"
doesn't provide the solution to the problem.

PROBLEM: Build a REAL TIME MONITOR with multiple levels (domain hierarchy)
         which shows the current status of the network at any given level.

We already know that viewing most severe alarms doesn't produce this monitor.
We now know that viewing the last alarm fired doesn't work either because, as
indicated above, if I have a problem in a domain I don't see it at the top
level. There appear to be a couple of options:

1. Allow flexibility for individual objects to be set up for LAST or MOST
   SEVERE (this was just recently suggested in another note). To work this
   has to be defined all the way thru the entity structure with children
   having the ability to be defined separately. I don't believe I support
   this because it is simply too much work to set up. Unless you could set
   it up globally. Example: Domain =Most Severe, Node4 =Most Severe,
   Node4 * Circuit *=LAST.

2. A direct hybrid solution may be more appropriate. Dynamic domains and
   global entities are MOST SEVERE if alarms are detected 'beneath' within
   the children. Children of the global entities are LAST. I suppose there
   could be problems with this if a global entity=MOST for the monitoring of
   its children and another alarm set is used to monitor something at the
   global level which needs to be LAST. There is a trade off and a decision
   for the user.

The second topic appears to be a bug. While viewing the lower domain the
rule executes and the line goes critical and the node4 toggles from critical
back to clear. Before the next alarm fires, I look into the node4 and see
the circuit critical, I then look back up and the line has gone clear and
the node4 is now critical (???). The same thing happens if I look up. In the
original example the top level domain toggles from critical to clear. If
I look up from the lower domain the top level now shows the lower level
critical instead. Another quick look down again indicates the line as clear
and the node4 as critical (???).

best regards,
brad...
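Option 1 in .5 (a propagation policy set globally per entity class) can be sketched as a small lookup plus an update rule. This is a hypothetical Python illustration, not an existing DECmcc setting: `POLICY_BY_CLASS`, `policy_for`, and `next_severity` are invented names, and the severity ranking is an assumption limited to the severities named in this thread.

```python
# Hypothetical sketch of a per-class propagation policy; not a DECmcc feature.

# Assumed ordering, limited to the severities mentioned in this thread.
RANK = {"clear": 0, "warning": 1, "major": 2, "critical": 3}

# Brad's example: Domain = Most Severe, Node4 = Most Severe,
# Node4 * Circuit * = LAST. Keys are class paths with instances wildcarded.
POLICY_BY_CLASS = {
    ("domain",): "most severe",
    ("node4",): "most severe",
    ("node4", "circuit"): "last",
}


def policy_for(entity_path, default="last"):
    """entity_path is a list of (class, instance) pairs,
    e.g. [("node4", "bbpk04"), ("circuit", "syn-0")]."""
    classes = tuple(cls for cls, _instance in entity_path)
    return POLICY_BY_CLASS.get(classes, default)


def next_severity(policy, shown, incoming):
    """Severity an icon shows after one more notification arrives."""
    if policy == "most severe":
        # Sticky: the icon never drops below the worst severity seen so far.
        return max(shown, incoming, key=RANK.__getitem__)
    return incoming  # "last": simply follow the newest notification
```

Feeding the sequence critical, clear through `next_severity` shows both complaints from .5: under "last" the icon ends up clear even though syn-0 is still down, and under "most severe" it stays critical even after a genuine recovery.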
2405.6	A partial answer	NANOVX::ROBERTS	Keith Roberts - DECmcc Toolkit Team	`Wed Feb 26 1992 16:14`	24
	RE: .5 Brad, I certainly see your point about the Real-time-monitoring capabilities of DECmcc. There needs to be some more work in this area. As far as: >> The second topic appears to be a bug. While viewing the lower domain the >> rule executes and the line goes critical and the node4 toggles from critical >> back to clear. Before the next alarm fires, I look into the node4 and see >> the circuit critical, I then look back up and the line has gone clear and >> the node4 is now critical (???). The same thing happens if I look up. In the >> original example the top level domain toggles from critical to clear. If >> I look up from the lower domain the top level now shows the lower level >> critical instead. Another quick look down again indicates the line as clear >> and the node4 as critical (???). I hope that the new code which implements the Rule Transition logic for multiple entities fixes your problem. That is, fixes Rule Fired/Cleared events when Global or Child Wildcards are present in the Rule Expression. /keith
2405.7	another possibility	ICS::WOODCOCK		`Wed Feb 26 1992 18:02`	48
> I certainly see your point about the Real-time-monitoring capabilities of
> DECmcc. There needs to be some more work in this area.

My question now becomes: will this work be done for V1.2 (MCC managers are
encouraged to answer)? The use or NON-USE of this product as our monitor may
very well depend on this functionality being available. Politically for
marketing reasons we may use it, but my technical recommendation to my current
management would be "seek alternative methods until MCC supplies this
functionality, at least for our monitoring needs". The alternative is already
running. If V1.2 cannot accomplish this task (which my dollar says EVERY
customer will demand anyway) then the last two years and three months spent
helping to test this product has been a poor investment. The ESC has two basic
needs, a real time monitor as described (without manual clearing of alarms),
and network performance reporting for analysis. V1.2 is outstanding in the
area of performance mngmt for each of the protocols PA supports, and the
monitoring capabilities appear to be equally impressive except for this last
issue in the graphics area which is a cliff hanger. It's got to be done to be
the market leader.

Hi Keith,

> I hope that the new code which implements the Rule Transition logic for
> multiple entities fixes your problem. That is, fixes Rule Fired/Cleared
> events when Global or Child Wildcards are present in the Rule Expression.

When you think the fix is done reply here and I'll make a note to test it in
the next version. Also to follow up on what I suggested a couple of notes back
for the monitor, I don't think that will work. If domains are set to MOST
SEVERE we are back to what V1.1 offers. I guess what is really needed is to
have MCC keep track of all alarms fired within each dynamic domain and display
the "highest current severity" it knows of. Also note it has to handle
wildcards and therefore a current severity for each instance matching the
wildcard. The global entities could do the same, or potentially display a
different severity (warning maybe) if there are only alarms on its children.
It would then be up to the user to draw the child as a separate line entity to
view the actual status of the child. This would allow for global entity alarms
to be displayed if encountered or needed.

sincerely,
brad...
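The "highest current severity" idea in .7 can be sketched as a domain tree in which each domain remembers the latest severity reported for every entity instance and shows the worst value found anywhere beneath it. This is a minimal Python illustration under stated assumptions, not how the IMPM is implemented: the `Domain` class and its methods are hypothetical, and the severity ordering is assumed.

```python
# Hypothetical sketch of a "highest current severity" rollup; not IMPM code.

# Assumed ordering, limited to the severities mentioned in this thread.
RANK = {"clear": 0, "warning": 1, "major": 2, "critical": 3}


class Domain:
    def __init__(self, name):
        self.name = name
        self.subdomains = []
        # One entry per entity instance, so a wildcarded rule that matches
        # three circuits produces three independent entries.
        self.current = {}

    def add(self, subdomain):
        self.subdomains.append(subdomain)
        return subdomain

    def report(self, instance, severity):
        """Record the newest severity for one instance (replaces the old one)."""
        self.current[instance] = severity

    def severity(self):
        """Worst severity currently held by any instance in this subtree."""
        candidates = list(self.current.values())
        candidates += [sub.severity() for sub in self.subdomains]
        return max(candidates, key=RANK.__getitem__, default="clear")
```

In this sketch a clear on syn-1 cannot mask a critical on syn-0, because each instance keeps its own entry, and the top-level domain drops back to clear only when every instance beneath it has cleared, with no manual clearing involved.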
2405.8	X1.2.15 issues: Alarms clear & Domains not in cache	CUJO::HILL	Dan Hill-Net.Mgt.-Customer Resident	`Wed Apr 01 1992 01:15`	21
	I'm concerned by what is described in the previous few notes. I am in desperate need of the alarms clear and domain caching capabilities. I am currently running X1.2.15 on VMS V5.5, Motif T1.1. Once alarms fire and change icon colors, those colors remain even when the alarms clear in the notification window. This is unacceptable. For my customer, this is a MAJOR product weakness. The same is true of domain caching. DECmcc domain/map navigation under VMS (on VAXstation 3100s) is already slower than grandma with a hernia. To force another map file read is frustrating. I can manually get what I need via NCP, UCX, Remote Console, or UNIX prompt command lines BEFORE MCC can even complete a "Look Into". Can we please have the old way back? Please don't think I'm insensitive. I understand you're all under the gun and that you're also short-handed, but if the above are not fixed, for V1.2, DECmcc will not be looked upon favorably by my customer. Thanks, Dan
2405.9	domain caching will be in v1.2	POLE::LEMMON		`Wed Apr 01 1992 13:44`	11
> The same is true of domain caching. DECmcc domain/map navigation under
> VMS (on VAXstation 3100s) is already slower than grandma with a hernia.
> To force another map file read is frustrating. I can manually get what
> I need via NCP, UCX, Remote Console, or UNIX prompt command lines
> BEFORE MCC can even complete a "Look Into". Can we please have the old
> way back?

Domain caching was put back in x1.2.16 and will be in the product. It was
accidentally pulled when the caching for subentities was pulled.
2405.10		POLE::LEMMON		`Wed Apr 01 1992 13:47`	11
	> I am currently running X1.2.15 on VMS V5.5, Motif T1.1. Once alarms > fire and change icon colors, those colors remain even when the alarms > clear in the notification window. This is unacceptable. For my > customer, this is a MAJOR product weakness. When you say clear, do you mean the alarm fires with clear severity? If so, are you running with highest or latest propagation? /Jim
2405.11	re: .5, hang on a minute notification wasn't completed in X1.2.15!!!!!	DADA::DITMARS	Pete	`Thu Apr 02 1992 13:47`	39
	Brad, >hasn't changed and the circuit is up. If I view the lower domain while this >is happening the node4 global entity reacts the same; it goes critical then >clear. The line drawn as syn-0 goes critical and stays that way as expected. >All appears to be operating as architected with one glitch which I'll go into >in a paragraph or two. The problem is that the "architecture" doesn't provide >the solution to the problem. > >PROBLEM: Build a REAL TIME MONITOR with multiple levels (domain hierarchy) > which shows the current status of the network at any given level. > >We already know that viewing most severe alarms doesn't produce this monitor. You are working with X1.2.15 of the IMPM. X1.2.15 >>>>does not<<<< handle rule-cleared conditions correctly. Rule-cleared conditions were not taken into account in the V1.1 design. The work was not straightforward, and it was not completed until late in the development process. To be precise, the CLEAR severity that was coming in on the later circuit INCORRECTLY caused the parent NODE4 entity to go CLEAR. Using HIGHEST policy, the NODE4 entity should (and does, in version later than X1.2.16) remain the highest severity of any of its children. Also, the IMPM is able to distinguish among multiple conditions (e.g. different rules firing) on the same entity, thus if rule 1 is a WARNING and rule 2 is a MAJOR, the entity will have the MAJOR color. If rule 2 then CLEARS, the entity will go to WARNING. If rule 1 then CLEARS the entity will go to CLEAR. Clear? :^) Latest/highest on a per-entity basis will be a pain to implement and to use, but if it's a real requirement it should go onto the wish-list (and into Phase 0 for the next version) and be prioritized along with all the other future work. PLEASE, before you go bashing the product in any wider circles, let's understand that you haven't really tested the final V1.2 functionality in this area!!!! Thanks for your past and future feedback!
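Pete's description in .11 of several conditions on one entity (rule 1 WARNING, rule 2 MAJOR, each clearing independently) can be sketched as follows. This is an illustrative Python model only: `EntityStatus` and its methods are hypothetical names, the severity ordering is assumed, and nothing here is taken from the IMPM source.

```python
# Hypothetical model of per-rule conditions on one entity; not IMPM code.

# Assumed ordering, limited to the severities mentioned in this thread.
RANK = {"clear": 0, "warning": 1, "major": 2, "critical": 3}


class EntityStatus:
    def __init__(self):
        self.conditions = {}  # rule name -> severity still outstanding

    def rule_event(self, rule, severity):
        """Apply a fired or cleared event from one rule."""
        if severity == "clear":
            # A clear removes only the condition raised by that same rule.
            self.conditions.pop(rule, None)
        else:
            self.conditions[rule] = severity

    def colour(self):
        """Highest severity among the conditions that are still outstanding."""
        return max(self.conditions.values(), key=RANK.__getitem__, default="clear")
```

Replaying the sequence from .11 (rule 1 fires WARNING, rule 2 fires MAJOR, rule 2 clears, rule 1 clears) gives major, then warning, then clear, because a clear from one rule never touches the condition raised by another.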
2405.12	re: .8 more details...	DADA::DITMARS	Pete	`Thu Apr 02 1992 14:00`	15
	>Once alarms > fire and change icon colors, those colors remain even when the alarms > clear in the notification window. This is unacceptable. For my > customer, this is a MAJOR product weakness. See reply .11. Explain what you mean by "clear in the notification window". Do you mean that an alarm rule fires with severity CLEAR, or that the user DELETES the notification, or merely that the notification scrolls out of the viewport in the notification window? If you're talking about rules on a global entity, and are using x1.2.15 or earlier, then you're not going to see correct coloring behavior in the IMPM when a rule fires and then clears.
2405.13	great news	ICS::WOODCOCK		`Thu Apr 02 1992 16:45`	49
	Pete, >You are working with X1.2.15 of the IMPM. X1.2.15 >>>>does not<<<< handle >rule-cleared conditions correctly. Rule-cleared conditions were not taken into >account in the V1.1 design. The work was not straightforward, and it was not >completed until late in the development process. Glad to hear I wasn't seeing finished code. >To be precise, the CLEAR severity that was coming in on the later circuit >INCORRECTLY caused the parent NODE4 entity to go CLEAR. Using HIGHEST >policy, the NODE4 entity should (and does, in version later than X1.2.16) >remain the highest severity of any of its children. You just made my day (3 months worth)... Is this also going to be true for domains? >Also, the IMPM is able to distinguish among multiple conditions (e.g. different >rules firing) on the same entity, thus if rule 1 is a WARNING and rule 2 is >a MAJOR, the entity will have the MAJOR color. If rule 2 then CLEARS, the >entity will go to WARNING. If rule 1 then CLEARS the entity will go to CLEAR. >Clear? :^) Clear as the sky is blue. >Latest/highest on a per-entity basis will be a pain to implement and to >use, but if it's a real requirement it should go onto the wish-list (and into >Phase 0 for the next version) and be prioritized along with all the other future >work. From your description above it seems that HIGHEST is what the doctor ordered. It may even be worth renaming to CURRENT HIGHEST as this looks quite different than what users of V1.1 see today for HIGHEST. >PLEASE, before you go bashing the product in any wider circles, let's understand >that you haven't really tested the final V1.2 functionality in this area!!!! Fair enough. The previous notes were written before the new schedule was out with an understanding the release date was about to be upon us. It looked as if some serious attention needed to be afforded in this critical area in a very short period or the boat would be missed for another year. 
The next bash should be held in a local establishment when V1.2 ships :-). >Thanks for your past and future feedback! Bring on the next kit. best regards, brad...
2405.14	DW did it.	DADA::DITMARS	Pete	`Thu Apr 02 1992 19:11`	10
	>You just made my day (3 months worth)... Actually David Wong made your day. I just told you it was made. :^) >Is this also going to be true for domains? Yes. Again, thanks for your testing efforts, your feedback, your concern, and your patience. We really appreciate it!
2405.15	still not right	ICS::WOODCOCK		`Tue Apr 21 1992 15:19`	34
Pete/all,

I just got my EVL sink going and did some testing and this is what I found.

>To be precise, the CLEAR severity that was coming in on the later circuit
>INCORRECTLY caused the parent NODE4 entity to go CLEAR. Using HIGHEST
>policy, the NODE4 entity should (and does, in version later than X1.2.16)
>remain the highest severity of any of its children.

The problem of a later circuit causing a 'clear' is no longer there, as you
state above.

>Also, the IMPM is able to distinguish among multiple conditions (e.g. different
>rules firing) on the same entity, thus if rule 1 is a WARNING and rule 2 is
>a MAJOR, the entity will have the MAJOR color. If rule 2 then CLEARS, the
>entity will go to WARNING. If rule 1 then CLEARS the entity will go to CLEAR.
>Clear? :^)

With multiple children I don't seem to be able to get this to work as
described. When set to HIGHEST I create a 'critical' event (node goes red),
I then create a 'clear' event. The 'clear' notify comes in but the node stays
red (events were on the same child). This implies MCC is showing HIGHEST of all
events rather than CURRENT HIGHEST OF ALL CHILDREN, making this method not
suitable for a real time hands off monitor.

If I set notifies to LATEST and create a critical event for two of the
children, the node goes red. If I then create a single 'clear' event for only
one of the children, the node goes 'clear' while one of the children is still
'critical'. This also happens at the domain level.

Question: Is development of this functionality still under way??

best regards,
brad...
2405.16	using t1.2.7	ICS::WOODCOCK		`Tue Apr 21 1992 15:20`	1
	ps. previous notes' testing was under t1.2.7.
2405.17	event correlation: a modest proposal	DADA::DITMARS	Pete	`Thu Apr 23 1992 10:41`	40
OK, the problem is that when correlation on events is done, it's done by
default on the event ID. So event A (circuit down) is never going to correlate
to event B (circuit up) because their IDs are different. Therefore, given the
current set of notification services there's no way to make an icon go "red"
and then "green" automatically based on events.

Presently, there's a way to override the default event correlation behavior so
that correlation can be done on the text of an event instead of the ID of the
event (via the mcc_ns.replyTextMatchEnts resource in the
mcc_notification_resource.dat file). This extends us in one direction to allow
the status of multiple conditions to be tracked that are reported via the same
event (a la data collector).

It would appear that failing some more sophisticated mechanism for informing
notification services of how events should be correlated (e.g. a table with
circuit up and circuit down identified as being related), a method similar to
the replyTextMatchEnts resource would provide behavior that is much more useful
than the present implementation.

I'm proposing the following change to event correlation for the V1.2 product:

    1) event correlation BY DEFAULT changes to lump all events together,
       thus circuit up and circuit down would correlate to one another
    2) a resource is added to the mcc_notification_resource.dat file
       mcc_ns.eventIdMatchEnts, which is a list of global entity classes
       for which events should be correlated based on ID
       (i.e. the present behavior)

What this will give us is the ability to use events, notify requests and
targetting (to assign severities to events) to more correctly indicate the
status of phase4 circuits, etc..

Correlation of alarms would not be affected in any way by this change.

I know the above isn't a perfect solution, but we're talking about an acceptable
risk and impact to the existing product schedule that gets us a few steps
farther in the right direction of solving real customer problems and producing
a more saleable product.

Feedback is welcome.
2405.18	proposal comments	ICS::WOODCOCK		`Thu Apr 23 1992 11:48`	74
Hi Pete,

As a follow on to our conversation I'd like to get some thoughts in writing.
First is the importance of this functionality. Anyone using polling to manage
their circuits should be ok generally speaking. But, anyone with their nose to
the grindstone doing network mngmt will want to use events and have proper
colors showing status of the net. I don't have to look far for a clear example.
We have both MCC and MSU running. There have been many situations (recent)
where the network has had critical problems where the backbone has 10-15
circuits bouncing at once. MCC X1.2.15 is right there in the middle of it with
bells, mail, log files, notify window, and color even to the point where we saw
our other polling alarms fail across the net with exceptions (an ugly sight but
a key indicator how bad things were). And MSU, barely a blink!! MCC and events
hands down [do success stories help lobbying efforts :-)].

>It would appear that failing some more sophisticated mechanism for informing
>notification services of how events should be correlated (e.g. a table with
>circuit up and circuit down identified as being related), a method similar to
>the replyTextMatchEnts resource would provide behavior that is much more useful
>than the present implementation.

When you do get to correlation of events here are some examples:

Correlate:
    Circuit Down Circuit Fault (4.7)   to   Circuit Up (4.10)
    Adjacency Down (4.18)              to   Adjacency Up (4.15)
    Node Reachability (needs correlation via the text in the event)
    Area Reachability (needs correlation via the text in the event)

>I'm proposing the following change to event correlation for the V1.2 product:
>    1) event correlation BY DEFAULT changes to lump all events together,
>       thus circuit up and circuit down would correlate to one another
>    2) a resource is added to the mcc_notification_resource.dat file
>       mcc_ns.eventIdMatchEnts, which is a list of global entity classes
>       for which events should be correlated based on ID
>       (i.e. the present behavior)
>What this will give us is the ability to use events, notify requests and
>targetting (to assign severities to events) to more correctly indicate the
>status of phase4 circuits, etc..

This sounds as if it is workable for the v1.2 time frame as a solution (I'll
take anything that comes close at this point). There is one limitation which
must be brought to light. This will work well if the user is only doing very
basic mngmt tasks like circuit monitoring. If the user does this task AND a
notify of some other event, something may get missed at some point because the
colors will strictly show the LAST severity. I would contend that for most
implementations job #1 is to track circuit and node outages with the monitor,
therefore it should be ok. Adding the above functionality can only help one
way or the other.

>Correlation of alarms would not be affected in any way by this change.

This should be a future consideration because some users won't use NOTIFY for
the events but instead an alarm on the event to trigger other activity. They
will have to do both with the above solution: a NOTIFY command to handle color
changes in the domain, and an ALARM with no associated domain to handle other
activity while maintaining the proper colors.

>I know the above isn't a perfect solution, but we're talking about an acceptable
>risk and impact to the existing product schedule that gets us a few steps
>farther in the right direction of solving real customer problems and producing
>a more saleable product.

Admittedly the solution isn't perfect, but at this late stage in the v1.2 game
any changes moving forward are appreciated.

any other comments out there...

best regards,
brad...
2405.19	refinement to proposal	DADA::DITMARS	Pete	`Thu Apr 23 1992 13:22`	18
A suggested refinement is that instead of

	1) event correlation BY DEFAULT changes to lump all events together,
	   thus circuit up and circuit down would correlate to one another

we do

	1) event correlation BY DEFAULT changes to lump together all events
	   that arrive in response to the SAME NOTIFY REQUEST. thus you
	   could have one notify request looking for circuit up and down
	   and another notify request looking for other events that shouldn't
	   interfere with the circuit up/circuit down event correlation
	   (or another pair of events that correlate to one another like
	   circuit up/down).

This is a better step toward the "real" solution of having a table of events
that correlate to one another. You instead have a table of notify requests,
each of which is looking for a list of events that correlate to one another.
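
A minimal sketch of the refinement, again as a toy Python model rather than anything from DECmcc: events are grouped by the notify request that delivered them, and only events within the same group replace one another. The request identifiers, the tuple layout, and the rule that an entity shows the worst severity across its requests are assumptions for illustration.

```python
# Illustrative severity ordering; an assumption, not the DECmcc severity list.
RANK = {"clear": 0, "warning": 1, "critical": 2}


def color_per_request(notifications):
    """notifications: (request_id, entity, severity) tuples in arrival order.

    Events correlate only with events from the SAME notify request; the
    entity is assumed to show the worst severity across all its requests.
    """
    latest = {}
    for request_id, entity, severity in notifications:
        latest[(request_id, entity)] = severity
    color = {}
    for (_request_id, entity), severity in latest.items():
        if entity not in color or RANK[severity] > RANK[color[entity]]:
            color[entity] = severity
    return color


arrivals = [
    ("circuit-updown", "syn-0", "critical"),  # circuit down
    ("other-events", "syn-0", "warning"),     # unrelated event, other request
]
print(color_per_request(arrivals))            # outage still visible

arrivals.append(("circuit-updown", "syn-0", "clear"))  # circuit up
print(color_per_request(arrivals))            # only the warning remains
```

Here the unrelated event no longer hides the circuit outage, and the circuit up event clears only the circuit down event that arrived through the same request.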
2405.20	agreed, it's better	ICS::WOODCOCK		`Thu Apr 23 1992 21:02`	26
	1) event correlation BY DEFAULT changes to lump together all events
	   that arrive in response to the SAME NOTIFY REQUEST. thus you
	   could have one notify request looking for circuit up and down
	   and another notify request looking for other events that shouldn't
	   interfere with the circuit up/circuit down event correlation
	   (or another pair of events that correlate to one another like
	   circuit up/down).

This is a better step toward the "real" solution of having a table of events
that correlate to one another. You instead have a table of notify requests,
each of which is looking for a list of events that correlate to one another.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I like it, for the most part. The only complexity comes from having to set up
TARGETTING for every dynamic domain. The documentation will have to be precise
on how to pull this off or you'll lose the users in the smoke. This is
definitely a better approach, with the 'real' solution providing a mechanism
for severity of each event so TARGETTING isn't so intense.

This brings up a question: Why doesn't TARGETTING propagate down thru domains
like NOTIFYs??? It seems to be an inconsistency from the user's view.

Again, good idea...[is the code ready for testing yet :-)]

brad...
2405.21	copying targetting from domain to domain to ...	DADA::DITMARS	Pete	`Mon Apr 27 1992 12:36`	51
Howdy,

>This brings up a question: Why doesn't TARGETTING propagate down thru domains
>like NOTIFYs??? It seems to be an inconsistency from the user's view.

The "Expand" argument of the Notify directive controls the "propagate down"
behavior. Assign Target doesn't have such an argument. Sounds like a good
suggestion.

In the meantime, once you plan your use of events and targetting, you could
propagate them to another set of domains in one of three ways:

1) use FCL to assign the targets to all domains, e.g.

	assign target domain * -
		event source = "node4 * circuit ", -
		event name = "...", -
		(etc.)

2) use your favorite editor to create an FCL script that specifies the
   domain as a symbol, e.g.

	assign target domain domain_instance -
		event source = "node4 ", -
		event name = "..." -
		(etc.)

   then create a master script that invokes that script, defining the domain
   instance symbol appropriately, e.g.

	define domain_instance FOO
	@setup_targets
	define domain_instance FOO2
	@setup_targets
	(etc.)

3) using the IMPM, you can copy targets from one domain to another domain
   using the targetting clipboard:

	a. get targets the way you want them in source domain
	b. from the Notification Window, bring up two target directory
	   windows by clicking "Targeting->Directory of Targets.." and
	   "Targeting->Directory of Targets in new window..."
	c. put source domain's name in one target directory window's
	   domain field and press "Update Display"
	d. press Edit->Select All in that window, then Edit->Copy
	e. put destination domain's name in other target directory window,
	   and press Edit->Paste in that window
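
For a long list of domains, the master script in option 2 could be generated instead of typed by hand. The following Python sketch only emits the `define domain_instance` and `@setup_targets` lines shown in the reply; the function name `master_script` and the idea of generating the file this way are assumptions, and the output has not been run through FCL.

```python
def master_script(domains, script="setup_targets"):
    """Return the text of a master FCL script that defines the
    domain_instance symbol and invokes the per-domain target script
    once for each domain name given."""
    lines = []
    for domain in domains:
        lines.append(f"define domain_instance {domain}")
        lines.append(f"@{script}")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # FOO and FOO2 are the placeholder domain names used in the reply.
    print(master_script(["FOO", "FOO2"]), end="")
```

The generated text would be saved to a file and invoked from FCL in the same way as a hand-written master script.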
2405.22	better alarm correlation will probably have to wait	DADA::DITMARS	Pete	`Mon Apr 27 1992 12:56`	17
re: .18

>>Correlation of alarms would not be affected in any way by this change.
>
>This should be a future consideration because some users won't use NOTIFY for
>the events but instead an alarm on the event to trigger other activity. They
>will have to do both with the above solution. A NOTIFY command to handle color
>changes in the domain and an ALARM with no associated domain to handle other
>activity while maintaining the proper colors.

Good point (that notify requests can't presently associate an activity with
an event and alarms can). Good work-around too (use Notify to get the color
right and a null-domain rule to take action).

Are there cases where this really primitive correlation approach would be
useful for alarms? (it'd be more work and more risk and we probably can't
consider it for V1.2, unless it's an even bigger win than this proposal).
2405.23	one happy camper	ICS::WOODCOCK		`Thu Apr 30 1992 12:19`	40
Hi Pete et al,

I think this note has served its purpose and looks to be winding down. I wanted
to make sure it was left on a HIGH note. The exe you provided has worked very
well for our needs. So well, actually, that I put the 30k block debug beast into
production services while we wait for the next kit [and of course we are
looking for bugs :-)]. The functionality added here is a MAJOR win for DECmcc,
and those involved should know the effort should be worth it and is appreciated
greatly internally.

>Are there cases where this real primitive correlation approach would be useful
>for alarms? (it'd be more work and more risk and we probably can't consider
>it for V1.2, unless it's an even bigger win than this proposal).

Actually I think I PREFER this solution over setting up a stiff correlation
table for specific events. I think this should be extended to alarms and also
to a MIX of events/rules within a NOTIFY command. For example, if a user has
two methods of monitoring an entity (events and polling) they won't correlate
today.

Scenario:
Notify for events today, and poll every half hour as a backup and to fire
procedures (this is in use now). A "DOWN" event comes in at 12:00 and the
entity turns RED. The polling rule comes in at 12:05 and also goes critical.
The circuit comes back "UP" at 12:10 and the event clears the down event. But
the entity stays RED until 12:35 rolls around and the polling rule clears. If
both the rule and the events were correlated together the entity would have
turned 'clear' at 12:10. Would this be an even BIGGER win??

So what you have begun today would be "BETTER" than what you were going to
provide as a solution later. Just extend this method into RULES and a MIX of
rules/events and you will have more flexibility than you were going to provide
with your future plans!! All this and you could probably pull it off within
V1.2 ;-). My only warning is that the DOCUMENTATION must be PRECISE and
THOROUGH on how to use these services.
kind regards, brad...
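
The scenario above can be replayed with a small Python model to show when the entity would turn clear under each scheme. This is a sketch under stated assumptions, not DECmcc behavior: the source labels `"events"` and `"rule"`, the two-level severity scale, and the function `first_clear_time` are illustrative, and the entity is assumed to show the worst severity across correlation groups.

```python
# Two severities are enough for this scenario; the ordering is an assumption.
RANK = {"clear": 0, "critical": 1}


def first_clear_time(notifications, mixed):
    """Return the time at which the entity first shows 'clear' after having
    shown a problem, or None if it never does.

    notifications: (time, source, severity) tuples in arrival order.
    mixed=False: events and the polling rule correlate separately.
    mixed=True:  both feed one correlation group, so the last severity wins.
    """
    latest = {}
    seen_problem = False
    for time, source, severity in notifications:
        key = "all" if mixed else source
        latest[key] = severity
        worst = max(latest.values(), key=RANK.__getitem__)
        if worst != "clear":
            seen_problem = True
        elif seen_problem:
            return time
    return None


scenario = [
    ("12:00", "events", "critical"),  # DOWN event, entity turns RED
    ("12:05", "rule", "critical"),    # half-hourly poll also goes critical
    ("12:10", "events", "clear"),     # UP event clears the DOWN event
    ("12:35", "rule", "clear"),       # next poll clears the rule
]

print(first_clear_time(scenario, mixed=False))  # separate correlation
print(first_clear_time(scenario, mixed=True))   # rules and events together
```

With separate correlation the model stays red until the 12:35 poll; with rules and events correlated together it clears at 12:10, matching the times given in the scenario.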