T.R | Title | User | Personal Name | Date | Lines |
---|
1027.1 | You must have ghosts ... | TOOK::ORENSTEIN | | Mon May 20 1991 17:09 | 27 |
| Hi Brad,
My only guess is that THIS is not the rule that fired.
Domain NOCMAN_NS:.PKO-24 Rule TEST
AT 17-MAY-1991 15:33:14 Characteristics
Examination of attributes shows:
Alarm Fired Procedure = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_ALARM.COM;7
Alarm Fired Parameters = "DECMCC"
Expression = (NODE4 LHPK01 MAXIMUM ADDRESS>1,AT EVERY 00:02:00)
Severity = Indeterminate
Is there any chance that other rules were running? Perhaps you forgot
to turn one off?
You mention DATA -- was this in a mail message or a log? In these two
cases, the rule name should be right there too.
It really doesn't sound like a software bug. SEVERITY cannot change.
I am interested in anything else strange you find.
aud...
|
1027.2 | exception procedure different than icon notification | TOOK::CALLANDER | | Mon May 20 1991 17:15 | 7 |
| BTW even if you don't define an action (exception handler procedure) to
do something when the rule exception case is found, the
notification services WILL still pick up the event and cause an
icon color change; but it should be using the severity associated
with the rule (like Audrey said).
jill
|
1027.3 | gone now | JETSAM::WOODCOCK | | Mon May 20 1991 18:06 | 25 |
| Hi there,
I definitely have had spirits in this system as of late!!! It was the
same rule, the one I showed in .0 and the one which alarmed. The notification
window is where I got the DATA from. Also the name of the rule was
identical in both the notify window and the defined rule.
Actually, I had *several* rules fire with the same symptom. This led me to
create the TEST rule to check myself. I suspect something happened when the
patch was installed. BTW, power was shut off over the weekend and therefore the
system has been rebooted. I just double checked and the problem has
disappeared (the correct severity now shows both in color and notify text).
> BTW even if you don't define an action (exception handler procedure) to
> do something when the rule exception case is found, the
> notification services WILL still pick up the event and cause an
> icon color change; but it should be using the severity associated
> with the rule (like Audrey said).
Mail was also sent (and still is). Is this the correct action? I would
think if exception handling isn't defined the user probably doesn't want
his procedure to fire.
regards,
brad...
|
1027.4 | problem persists | JETSAM::WOODCOCK | | Tue May 21 1991 10:33 | 18 |
| Today's view...
In double checking my testing I have found a couple of things which I
incorrectly stated in the last note. This problem STILL exists.
Clarification on the exact reactions is as follows.
1. Enable rule TEST (with node up and alarm is assumed to fire)
- Rule fires with proper severity (color) and sends mail. All
seems ok.
2. Disconnect node and enable rule TEST
- Rule fires color change but does NOT send mail (this differs from the
behavior before the MCC node was rebooted, when mail was also sent)
- The severity is CRITICAL instead of INDETERMINATE (color matches
critical definition).
Any ideas???
|
1027.5 | The ghosts are gone ... | TOOK::ORENSTEIN | | Tue May 21 1991 11:13 | 14 |
|
Yes, That's just right!
If an exception occurs, an EXCEPTION event is generated by ALARMS
(regardless of any exception handler). The module that lights up
the color is aware that this is an exception and thus uses the
color associated with CRITICAL.
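A minimal sketch of the behavior described above, written in Python and not taken from anything MCC actually runs. The names `Rule`, `evaluate`, and `icon_severity` are hypothetical, and the use of `ConnectionError` to stand for an unreachable entity is an assumption. The logic covers only what this note states: a failed evaluation produces an exception event whether or not a handler is defined, and the notification side colors the icon as CRITICAL instead of using the rule's own severity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    severity: str  # severity chosen when the rule was created
    exception_handler: Optional[Callable[[str], None]] = None


def evaluate(rule: Rule, poll: Callable[[], bool]) -> Optional[dict]:
    """Return the event produced by one evaluation, or None if nothing fired."""
    try:
        fired = poll()
    except ConnectionError as err:
        # The handler is optional; the exception event is produced either way.
        if rule.exception_handler is not None:
            rule.exception_handler(str(err))
        return {"rule": rule.name, "kind": "exception"}
    if fired:
        return {"rule": rule.name, "kind": "fired", "severity": rule.severity}
    return None


def icon_severity(event: dict) -> str:
    """Severity a (hypothetical) notification module would color the icon with."""
    if event["kind"] == "exception":
        return "CRITICAL"  # behavior described in this note
    return event["severity"]
```

With a rule created at INDETERMINATE severity and no handler, a poll that raises still yields an exception event, and `icon_severity` reports CRITICAL for it, which matches the symptom reported in .4.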
It looks like things are back to normal for you.
As to why you even received mail the other day, I still think it
is a ghost :)
aud...
|
1027.6 | | TOOK::GUERTIN | I do this for a living -- really | Tue May 21 1991 11:55 | 8 |
| RE.-1
You mean if we had a "Nuclear Reactor" icon, you would change it to the
color of CRITICAL because your "SHOW" call failed? I wonder if this
is the right model. Food for thought: How about an "EXCEPTION" state?
-Matt.
|
1027.7 | is it the right model | TOOK::CALLANDER | | Tue May 21 1991 12:21 | 13 |
| I don't know if it is the right model, but I do understand some of the
reasoning that went into it.
In most cases, when a rule can no longer be evaluated (and they do
a number of retries before giving up), the entity is not
accessible for some reason, and this seemed in most cases to be of critical
importance. As to whether all users would agree with that, well, who knows...
someone made an executive decision to try it that way. If you have
feedback (like "I like it most of the time but would prefer it to be
customizable so that I can have it come out at the severity I picked...")
then enter it in this note. Your feedback (and any customer comments you want
to enter as well) will help us make 1.2 more user friendly.
|
1027.8 | exception<>critical | JETSAM::WOODCOCK | | Tue May 21 1991 17:28 | 15 |
| I also don't believe the SEVERITY should be established automatically by
MCC as critical. Adding an EXCEPTION severity seems to be a
good idea if there is time for v1.2. This way the user can set this level
to a different color. When v1.2 comes out (and icon colors can
toggle up or down in severity level, i.e. critical -> clear) a lot of users
will want to use the graphics to know the state of the network in real time
as they see it, without intervention. With this very common scenario it
would be wiser not to make the user confuse critical with exception, and
to keep their colors separate for clarity.
thanks for clearing/confirming,
brad...
|
1027.9 | It may just boil down to a matter of opinion | TOOK::GUERTIN | I do this for a living -- really | Wed May 22 1991 10:48 | 21 |
| RE:.7
Jill,
I understand the justification for going with this model now, thanks.
As I understand it, we (MCC) will be listening for end-user input to
determine if we should change the way we determine severity/color.
To me it's like buying a smoke detector that sets off the alarm
whenever the battery gets low. The first time the alarm goes off,
everyone runs out of the house. The second time you run around and
check if there really is a fire. After a while, you just don't put a
battery in. I think that is why smoke detectors tell you in a
*different* way that the battery is low. I have two smoke detectors
at home. When the battery is low, one gives off a fast (but not too
loud) beep-beep-beep, and the other has a flashing light that just
stops flashing. Personally, I'm very happy that the engineers who
designed them decided not to have them just set off the alarm.
Just one man's opinion :-)
-Matt.
|
1027.10 | Bug? What bug.... | WAKEME::ANIL | | Wed May 22 1991 13:34 | 59 |
| As usual, interesting topics always pop up when I am not around!! Sorry
for the delay folks, I was off on a course for the last two days.
First to you, Brad. As Audrey and Jill pointed out, the color change
behavior you saw is exactly the way we designed it. Hearing that
both Matt and you are not exactly happy with the model
will force us to rethink the strategy. But before I open this
can, let me give some reasoning behind the way the color changes.
Two types of alarms are of interest to us: one where the equipment you
are monitoring fails, and the other where the monitoring equipment fails.
Matt, let us take your example, which should help us understand
where we want to go.
Say the fire alarm monitor fails at 10:30 PM. The alarm is
supposed to stop blinking (i.e. only a visual indication), giving
a clear indication that
if there is a fire now, I am not watching! Most of the time
you get up the next day and fix the problem, i.e. change
the battery. What if you are on vacation? Well, then the monitor
has to wait till you return. Now what if there is a real
fire? Got the point?
Of course one can go wild and say all fire alarm monitors should
be hooked up to the town's fire department. But that's another story.
In V1.2 we plan on doing the following. Now, here is your chance, folks.
Bear in mind that resources are limited. If you do have a good idea,
let's hear it.
In V1.2 the exception severity will be changed to indeterminate.
Under OSI, following values are associated with different severities.
Indeterminate = 0,
Critical = 1,
Major = 2,
Minor = 3,
Warning = 4,
Clear = 5
The standard is silent on which severity is higher and which is lower,
and for a good reason. I think using indeterminate does solve the
problem of assigning a meaning to a severity that the user may not want
associated with it.
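The value table above can be written down as an enumeration. This is only a sketch in Python: the class name `Severity` and the helper `exception_severity` are made up for illustration, and the numeric values are the ones listed in this note. A plain `Enum` is used instead of an ordered type to reflect the point that the numbers are labels, not a ranking.

```python
from enum import Enum


class Severity(Enum):
    INDETERMINATE = 0
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4
    CLEAR = 5


def exception_severity(v1_2_plan: bool) -> Severity:
    """Severity attached to a rule exception.

    Assumption: the flag stands for "the V1.2 plan in this note is in
    effect"; otherwise the current CRITICAL behavior applies.
    """
    if v1_2_plan:
        return Severity.INDETERMINATE
    return Severity.CRITICAL
```

Because `Severity` is a plain `Enum`, comparing two members with `<` raises `TypeError`, so any code that wants a "higher/lower" notion has to define that ordering explicitly.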
Also, in V1.2 we will be generating a Rule Clear event that will have
the severity "Clear" associated with it.
My problem with adding one more argument to Create rule (rule
exception severity) is that we already have ~15 fields to fill in
due to OSI compatibility requirements. Let's not add one more
unless it's absolutely necessary!
Let us know if we have been off the wall!
Thanks,
- Anil Navkal
|
1027.11 | indeterminate=exceptable | JETSAM::WOODCOCK | | Wed May 22 1991 14:15 | 9 |
| Exception=Indeterminate will work from my point of view. As far as I see
it, an exception problem IS indeterminate until an intelligent decision
can be made given the error condition. It is up to the user to decide the
severity based on the error and react accordingly. A flag to the user is
needed but it must be unbiased as to severity, and indeterminate would solve
this the same as creating an exception severity. V1.2 plans look ok from
here.
brad...
|
1027.12 | Thanks Brad | WAKEME::ANIL | | Wed May 22 1991 15:30 | 3 |
| Thanks Brad.
- Anil
|
1027.13 | Working UI issues is a long tough job | ENUF::GASSMAN | | Thu May 23 1991 08:24 | 17 |
| Here is a case where experience with MSU can be drawn from without
looking like we are stealing patented techniques from other vendors.
When a remote polling daemon goes down, all the devices that were being
polled by that daemon are colored 'indeterminate' - now, in the MSU
circles, discussion is going on about leaving the color the same as it
last was seen - but perhaps making the icon dotted. The reality is
that often the problem is somewhere between the management system and
the agent - sometimes it's even the agent that is broken. The concept
of an "I tried to get information but something stopped me" condition is
needed. A level deeper would try to determine what stopped the poll.
Was it a timeout, did the network give a specific error message, or did
the remote device itself give you an indication that, while it was
somewhat alive, it wasn't going to service your management request?
Here's a case where experience will tell us how to make it work, and
customers will tell us if they like it that way.
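One way to picture the "something stopped me" condition is sketched below in Python. Nothing here comes from MSU or MCC: `PollFailure`, `IconState`, and `apply_poll` are hypothetical names, and the three failure reasons are just the ones asked about above. The sketch keeps the last known color, marks the icon as stale (the "dotted" idea), and records why the poll failed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PollFailure(Enum):
    TIMEOUT = "timeout"
    NETWORK_ERROR = "network error"
    AGENT_REFUSED = "agent refused request"


@dataclass
class IconState:
    color: str                             # color from the last successful poll
    stale: bool = False                    # True -> draw the icon dotted
    reason: Optional[PollFailure] = None   # why the latest poll failed, if it did


def apply_poll(icon: IconState, color: Optional[str],
               failure: Optional[PollFailure] = None) -> IconState:
    """Return the icon state after one poll attempt."""
    if color is not None:
        # Successful poll: fresh color, no stale marker.
        return IconState(color=color)
    # Failed poll: keep the last color, mark it stale, remember the reason.
    return IconState(color=icon.color, stale=True, reason=failure)
```

The design choice is that a failed poll never overwrites what was last known about the device; it only adds the fact that the information is no longer current.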
bill
|
1027.14 | | TOOK::STRUTT | Management - the one word oxymoron | Mon May 27 1991 17:08 | 15 |
| It's not clear to me that an exception received while evaluating a rule
has any business being associated *directly* with the icon that will
change when the rule fires.
Overloading "indeterminate" seems equally inappropriate.
What you might be better off with is having some way to indicate that
there's a problem with "the alarm evaluation system" (sort of like
Matt's analogy with the smoke detectors). One, but perhaps not the
best, approach might be to have an icon that represents the alarm
system. You already have something like that in the ability to show
alarms - though you "cheat" by having the iconic map implement a special
way of accessing that information. Maybe there's a consistent 'model'
that could be used to deal with both things?
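The separate-icon idea could look roughly like the Python sketch below. It is purely illustrative: `ALARM_SYSTEM`, `route_event`, and the status strings are invented for this example, and the event dictionaries are an assumed shape. An exception during rule evaluation changes the status of an icon standing for the alarm system, and leaves the monitored entity's icon alone.

```python
from typing import Dict

ALARM_SYSTEM = "alarm-evaluation-system"  # hypothetical icon name


def route_event(statuses: Dict[str, str], target: str, event: dict) -> Dict[str, str]:
    """Return updated icon statuses for one rule event (input is not modified)."""
    updated = dict(statuses)
    if event["kind"] == "exception":
        # Evaluation failed: flag the alarm system, leave the target untouched.
        updated[ALARM_SYSTEM] = "degraded"
    else:
        # Rule fired normally: the target takes the rule's own severity.
        updated[target] = event["severity"]
    return updated
```

For example, starting from `{"LHPK01": "clear", ALARM_SYSTEM: "ok"}`, an exception event leaves `LHPK01` at `"clear"` and sets the alarm system entry to `"degraded"`.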
Colin
|