T.R | Title | User | Personal Name | Date | Lines |
---|
401.1 | Can't open data file | TOOK::ORENSTEIN | | Thu Oct 11 1990 11:39 | 26 |
| Hi,
I regret that this message was not put into the release notes.
When a rule fires or an exception occurs, a data file is created and
the parameters to the command procedure (P1 through P7) are written
to this datafile, along with the Severity entered on the command line.
The name of the datafile is MCC_ALARMS_DATA_<numbers>.DAT
If any error occurs in opening this datafile, the error ALARMS INTERNAL
ERROR is reported. Better error messages are on the way in a future
release and each will be documented.
I suspect (since the rule fired about 10 times before this message
appeared) that you ran out of disk space or file quota.
Depending on which version of ALARMS you are running, this file will
be located in MCC_COMMON or SYS$SCRATCH. Also depending on the version
the datafile may be automatically purged. The older version places the
file in MCC_COMMON and does NOT purge - the new version places the file
in SYS$SCRATCH and DOES purge.
Hope this helps...
aud...
|
401.2 | Alarms internal error | BCAT::CSENCSITS | | Thu Oct 11 1990 16:28 | 104 |
| >>>
>>> I suspect (since the rule fired about 10 times before this message
>>> appeared) that you ran out of disk space or file quota.
Don't think that is the problem since I am the only person on this system
with 2 RA81's, full priv and plenty of space (547k blocks)
I have seen the files you mentioned.
In testing right now: WETDRY is my 3100 in normal state
MCC> show MCC 0 ALARMS RULE WETDRY_STATE all char
MCC 0 ALARMS RULE WETDRY_STATE
Characteristics
AT 11-OCT-1990 13:10:06
Examination of attributes shows:
Procedure = USER:[MCC]MCC_DECALERT_ALERT.COM;2
Exception Handler = USER:[MCC]MCC_DECALERT_ALERT_EXCEPTION
.COM;4
Description = "Node WETDRY appears to be
unreachable, Please investigate"
Category = "Node Unreachable"
Parameter = "WETDRY not reachable"
Expression = (NODE4 WETDRY STATE = ON, AT EVERY
00:01:30)
MCC> show MCC 0 ALARMS RULE WETDRY_STATE all statu
MCC 0 ALARMS RULE WETDRY_STATE
Status
AT 11-OCT-1990 13:10:30
Examination of attributes shows:
State = Enabled
Substate = Running
Time of Last Evaluation = 11-OCT-1990 13:09:26.58
Result of Last Evaluation = True
MCC>
In my account is MCC_DECALERT_ALERT.LOG which shows the Procedure being
called and this statement in it:
DELETE SYS$SCRATCH:MCC_ALARMS_DATA_13105675.DAT;"
Everything is normal.
Now I RESTRICT WETDRY (ncp set exec state restric)
Only change is the Result of Last Evaluation = false. No new files created.
Now will stop access to WETDRY. (ncp set NML proxy none)
First pass shows:
MCC> show MCC 0 ALARMS RULE WETDRY_STATE all statu
MCC 0 ALARMS RULE WETDRY_STATE
Status
AT 11-OCT-1990 13:21:20
Examination of attributes shows:
State = Enabled
Substate = Running
Time of Last Evaluation = 11-OCT-1990 13:19:56.24
Result of Last Evaluation = Error
Error Condition = "Access control information invalid
at Node
An additional file appeared in my directory: _Exception.log. It shows that my
exception handler got called.
Then in 1:30 min looked again:
MCC> show MCC 0 ALARMS RULE WETDRY_STATE all statu
MCC 0 ALARMS RULE WETDRY_STATE
Status
AT 11-OCT-1990 13:21:30
Examination of attributes shows:
State = Disabled
Substate = Disabled by error condition
Disable Time = 11-OCT-1990 13:21:27.91
Time of Last Evaluation = 11-OCT-1990 13:19:56.24
Result of Last Evaluation = Error
Error Condition = "Access control information invalid
at Node
Now in the MCC_COMMON directory this appeared:
DSNMCC::CSENCSITS$ typ USER:[MCC]MCC_ALARMS_11-OCT-1990_ERROR.LOG;1
>>> 11-OCT-1990 13:21:26.27 MCC 0 ALARMS RULE WETDRY_STATE
Expression = (NODE4 WETDRY STATE = ON, AT EVERY 00:01:30)
Status = Alarms internal error
MCC 0 ALARMS RULE WETDRY_STATE
AT 11-OCT-1990 13:21:26
I hope this helps and does not confuse. It appears that an access problem to
the node caused the internal error. When this happens, neither my procedure nor
my exception handler is called.
(normally I would not test a node every 1:30 but it shows fast errors this way.)
|
401.3 | What happens on subsequent polls? | GOSTE::CALLANDER | | Mon Oct 15 1990 15:47 | 9 |
| Aud,
could this be due not to the firing of the exception handler but
to the next poll? When it attempts to go back to the DNA4 AM
after the exception to show the attributes again, you will get a
DNA4 exception because the node isn't there (something along the
lines of node does not exist or is not known to local node). In
this case what will alarms do?
|
401.4 | ALARMS handles multiple exceptions | TOOK::ORENSTEIN | | Mon Oct 15 1990 18:08 | 47 |
| ALARMS has no knowledge of, and does not care, whether an exception
is the first, second or the Nth. An exception is an exception is an
exception.
The algorithm ALARMS uses to evaluate rules is the following:
GET DATA:
This polls the entity for its attributes.
CHECK DATA:
If a RESPONSE is returned:
(EVALUATE (see below))
If an EXCEPTION is returned:
(write to datafile - this is where ALARMS_INTERNAL_ERROR can happen)
(queue to the batch queue the user's exception handler)
(If handle on data is MORE (for more polls) go to GET DATA)
EVALUATE:
If more data is needed to evaluate the rule (CHANGE_OF perhaps)
GET DATA.
If expression evaluates to true:
(write to datafile - this is where ALARMS_INTERNAL_ERROR can happen)
(queue to the batch queue the user's command procedure)
(If handle on data is MORE (for more polls) go to GET DATA)
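
As a rough illustration only (this is not ALARMS source code), the steps
above can be sketched in Python. Every name here is hypothetical:
`evaluate_rule`, `AlarmsInternalError`, and the `write_datafile` and
`queue_batch` callables are stand-ins, and each poll is modelled as a
`(kind, value, more)` tuple. The "more data needed" branch for CHANGE_OF
is left out to keep the sketch short.

```python
class AlarmsInternalError(Exception):
    """Hypothetical stand-in for a failure to open the data file."""


def evaluate_rule(polls, expression, write_datafile, queue_batch):
    """Walk GET DATA / CHECK DATA / EVALUATE for one rule.

    polls          -- iterable of (kind, value, more) tuples, where kind is
                      "RESPONSE" or "EXCEPTION" and more mirrors the MORE handle
    expression     -- callable returning True when the rule should fire
    write_datafile -- callable that writes the data file (may raise)
    queue_batch    -- callable that queues a procedure to the batch queue
    Returns the number of exceptions seen (kept only for display).
    """
    exceptions_seen = 0
    for kind, value, more in polls:           # GET DATA
        if kind == "EXCEPTION":               # CHECK DATA
            exceptions_seen += 1
            write_datafile(value)             # internal error can happen here
            queue_batch("exception handler")
        elif expression(value):               # EVALUATE
            write_datafile(value)             # internal error can happen here
            queue_batch("command procedure")
        if not more:                          # no MORE handle: stop polling
            break
    return exceptions_seen
```

Note that in this sketch the data file is written before anything is
queued, so a failure in `write_datafile` means neither the command
procedure nor the exception handler is queued for that poll.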
Yes, ALARMS does keep counters of how many exceptions have occurred,
but this is only to help the user see what's going on during
evaluation.
As I have mentioned before, an ALARMS_INTERNAL_ERROR can only be
produced if the data file can not be opened. Two things can cause this
to happen:
1. A bug in ALARMS corrupts the file name so that it is not a legal
VMS file specification.
2. Something in the user's environment is preventing the file from
being opened.
We are investigating this with the author and will report back with
the problem (solution).
aud...
|
401.5 | Alarms Internal Error bug .. Fixed !! | WAKEME::ROBERTS | Keith Roberts - DECmcc Alarms Team | Tue Oct 16 1990 16:08 | 12 |
|
With John's help, we have found and corrected the Alarms Internal Error bug.
The next DECmcc kit will contain the fix - if anyone else experiences the
"Alarms Internal Error" with this kit (x1.0.1) -- then please send me
mail.
Thanks,
Keith Roberts
WAKEME::ROBERTS
(dtn) 226-5394
|
401.6 | EXCEPTION not liked on ALARMS creation | ADO75A::SHARPE | C is bliss? | Wed Nov 07 1990 18:08 | 13 |
| What syntax do I use to register the alarm ... The version of DECmcc
that I am using declares itself to be: DECmcc (X1.1.0).
However, it will not accept an exception statement on a
create mcc 0 alarms ... command ...
How do I do it, or what have I done wrong (including installation
errors)?
I am trying to detect when a node has gone down.
Regards
Richard Sharpe
|
401.7 | Belay that last request, me hearties! | ADO75A::SHARPE | C is bliss? | Wed Nov 07 1990 18:27 | 7 |
| Enter stage left with sheepish look on face.
I looked through the command procedures in mcc_common: and found the
syntax. Seems like it should be "exception handler", not just exception.
Regards
Richard Sharpe
|
401.8 | EX is ambiguous -- common error as well | GOSTE::CALLANDER | | Mon Nov 26 1990 14:37 | 5 |
| Actually, to be really clear, a problem I have seen a few times is
that people abbreviate the argument to EX, which is ambiguous in
the alarms syntax because of EXpression and EXception handler.
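
To illustrate why EX cannot be resolved, here is a small Python sketch of
unique-prefix matching. It is an assumption for illustration, not the
DECmcc parser: the function name `resolve_argument` is hypothetical, and
the keyword list is simply taken from the rule attributes shown in .2.
The real parser evidently behaves differently in at least one respect,
since .6 and .7 report that `exception` on its own was not accepted.

```python
# Argument names taken from the SHOW output in reply .2; the list may be incomplete.
ARGUMENTS = [
    "PROCEDURE",
    "EXCEPTION HANDLER",
    "DESCRIPTION",
    "CATEGORY",
    "PARAMETER",
    "EXPRESSION",
]


def resolve_argument(abbrev, arguments=ARGUMENTS):
    """Return the single argument name that starts with abbrev (case-insensitive).

    Raises ValueError when the abbreviation matches nothing or matches
    more than one argument name.
    """
    wanted = abbrev.upper()
    matches = [name for name in arguments if name.startswith(wanted)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown argument: " + abbrev)
    raise ValueError("ambiguous argument " + abbrev + ": " + ", ".join(matches))
```

Under this scheme EXP and EXC each pick out one argument, while EX
matches both EXPRESSION and EXCEPTION HANDLER and is rejected.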
|