T.R | Title | User | Personal Name | Date | Lines |
---|
1787.1 | Use the Event collection capability | TOOK::CAREY | | Tue Nov 12 1991 09:59 | 15 |
|
Set up DECmcc on your node as an event sink for the reachability changes
or adjacency down events from the routing node.
Reachability takes a long time to change on a large network; on a LAN,
adjacency down events from a router give you much quicker response.
Look in the DECnet Phase IV Use manual and use the sample EVL startup
command file (something like SYS$STARTUP:MCC_STARTUP_DNA4_EVL.COM) for
some hints on setting up MCC as the event repository and dumping events
from the routing node into your DECmcc.
This will free up the logical links and give you more rapid response.
|
1787.2 | Need more to help identify the problem | TOOK::ORENSTEIN | | Tue Nov 12 1991 15:13 | 9 |
| >> The first problem I have is that once an alarm is triggered it appears
>> to trigger continuously rather than once each polling interval.
I have never seen this happen. Could you please set up your
environment the way you see this happening, use the LOG sample
command procedures on all your rules, trigger your alarm then
send me the logs you accumulate and also a DIR/DAT of the logs.
aud...
|
1787.3 | Progress.. | DUNDEE::CLEARY | A deviant having fun..." | Wed Nov 13 1991 19:14 | 48 |
| re .1
I modified all the rules to start at different times - they are
currently staggered by 30 seconds. This is not too bad for V1.1 where
you have to create individual rules but not viable for V1.2 where wild
card entities can be used. Do we need to solve this problem or warn
people about it? A solution would be preferable since a burst of
alarms all at once will cause performance problems on the management
system. I can just see someone creating a generic rule to check the
reachability of all 1000 nodes in our area.
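To put numbers on the staggering concern, here is a minimal sketch. It is not a DECmcc feature; the function names `fixed_stagger_span` and `staggered_offsets` are hypothetical, and the 15-minute polling interval in the usage note is only an assumed example value. It compares a fixed 30-second stagger with spreading the rule start times evenly across one polling interval.

```python
from datetime import timedelta


def fixed_stagger_span(rule_count, step):
    """Time between the first and last rule start with a fixed per-rule stagger."""
    return step * (rule_count - 1)


def staggered_offsets(rule_count, interval):
    """Spread rule start offsets evenly across one polling interval."""
    step = interval / rule_count
    return [step * i for i in range(rule_count)]
```

With 1000 rules, `fixed_stagger_span(1000, timedelta(seconds=30))` comes to 29970 seconds, a little over eight hours between the first and last rule start. Spreading the same 1000 rules across an assumed 15-minute interval with `staggered_offsets` puts the starts 0.9 seconds apart instead.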
I have played around with using phase IV events and will probably go
that way eventually. This is a LAN only network so reachability and
adjacency down will be about as fast as each other.
Re .3
I made substantial progress yesterday. The creation dates for the
rules all appear to be 11 days 9 hours plus or minus a few minutes in
the past. I don't know how MCC gets confused about the time but it
appears to be the source of the problem. A simple rule with no
schedule defined which will always evaluate true (snmp hub6 synoptics
s3000chassis s3chassisfanstatus=OK) should run from NOW till FOREVER
every 15:00 minutes. It runs about 3 times each second.
MCC_TDF was defined as "+10" which is the correct offset. I played around
with this and found that values like "+0-10:0:0" cause major havoc. MCC
couldn't get at the MIR at all. Eventually I used "+10:0:0" and
suddenly alarms work as expected. I'm mystified, but then I expect
nothing less from MCC :-). I guess KITINSTAL.COM should check the
syntax of the TDF if getting it wrong is going to cause such havoc.
Some history.
This system was running DECdts T1.0 but I disabled that and rebooted
before trying the above. That didn't help. Earlier still, I also had
incredible trouble getting DNS to work. It had been installed
previously and I wanted to change the node name and address so I tried
to re-install the name server. This went well except that the IVP
failed with an `unable to talk to server' type error. No further
explanatory error status was given. I gather that this can be caused by
time going backwards but I don't know how that could happen in a new
installation. Eventually I deleted all the DNS$*.* files, re-installed
the client files by copying them from another system then installed the
server and things started to work. This took three days of screwing
around to fix. Par for the course with DECdns :-(
-mark
|
1787.4 | Please use the correct syntax for specifying MCC_TDF | TOOK::GUERTIN | Don't fight fire with flames | Thu Nov 14 1991 06:52 | 12 |
| I believe for V1.1 the "true" MCC_TDF syntax is
"[+|-][dd ]hh:mm"
dd=days
hh=hours
mm=minutes
Please use "10:00" or "+10:00"
Without the colon, "+10" may be interpreted as 10 days.
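As an illustration only, the syntax above could be checked with a strict parser along these lines. This is a sketch, not DECmcc's actual parsing code; the function name `parse_tdf` and the regular expression are assumptions made for the example. It accepts only the "[+|-][dd ]hh:mm" form and rejects anything else rather than guessing.

```python
import re
from datetime import timedelta

# Hypothetical strict pattern for "[+|-][dd ]hh:mm"; not DECmcc's real parser.
_TDF_RE = re.compile(r"([+-])?(?:(\d{1,2}) )?(\d{1,2}):(\d{2})")


def parse_tdf(text):
    """Return the offset as a timedelta, or raise ValueError if malformed."""
    m = _TDF_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"TDF {text!r} does not match [+|-][dd ]hh:mm")
    sign, days, hours, minutes = m.groups()
    if int(hours) > 23 or int(minutes) > 59:
        raise ValueError(f"TDF {text!r} has hours or minutes out of range")
    offset = timedelta(days=int(days or 0), hours=int(hours), minutes=int(minutes))
    return -offset if sign == "-" else offset
```

Under this sketch, "10:00" and "+10:00" both parse to a ten-hour offset, while "+10" raises an error because the colon is missing.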
-Matt.
|
1787.5 | yes, but... | DUNDEE::CLEARY | A deviant having fun..." | Fri Nov 15 1991 20:41 | 21 |
| re .4
That's also the conclusion I came to, but I guess I was too subtle
about the implications.
Unless I am missing something fundamental, there must be a bug in the
way DECmcc handles time. It should not matter what value the tdf has
as long as it is constant. DECmcc's internal notion of time is
presumably UTC and the TDF is used to convert to and from local system
time. If the result of converting a system time to UTC, adding a 10
minute offset then converting back to system time is off by 11 days and
9 hours then there is either a bug in the UTC routines or I am way off
base.
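To spell out the argument, here is a small sketch of the round trip, assuming the convention local time = UTC + TDF. The helper names `to_utc`, `to_local` and `schedule_next` are hypothetical and say nothing about how DECmcc actually implements it. If both conversions use the same constant TDF, the offset cancels and the result is simply the local time plus the interval.

```python
from datetime import datetime, timedelta


def to_utc(local, tdf):
    """Convert local time to UTC, assuming local = UTC + TDF."""
    return local - tdf


def to_local(utc, tdf):
    """Convert UTC back to local time with the same TDF."""
    return utc + tdf


def schedule_next(local_now, interval, tdf):
    """Convert to UTC, add the interval, convert back to local time."""
    return to_local(to_utc(local_now, tdf) + interval, tdf)


now = datetime(1991, 11, 15, 20, 41)
interval = timedelta(minutes=10)
# A sensible TDF (10 hours) and a nonsensical one (10 days) give the same answer.
for tdf in (timedelta(hours=10), timedelta(days=10)):
    assert schedule_next(now, interval, tdf) == now + interval
```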
I know 10 days is a nonsensical TDF but it is easily obtained by
accident. If the consequences of getting it wrong are so severe, then
either the bug in the time routines needs to be fixed (before it bites
somewhere else) or there should be a reasonableness check on the TDF to
hide the bug.
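A reasonableness check of this kind could be as small as the sketch below. The bounds are an assumption on my part (civil UTC offsets currently fall between -12:00 and +14:00), and the name `is_reasonable_tdf` is hypothetical.

```python
from datetime import timedelta

# Assumed bounds: civil UTC offsets currently range from -12:00 to +14:00.
MIN_TDF = timedelta(hours=-12)
MAX_TDF = timedelta(hours=14)


def is_reasonable_tdf(offset):
    """Return True if the offset is plausible for a real time zone."""
    return MIN_TDF <= offset <= MAX_TDF
```

A ten-hour offset passes; a ten-day offset does not.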
-mark
|
1787.6 | Agreed. We plan on validating what the user enters | TOOK::GUERTIN | Don't fight fire with flames | Mon Nov 18 1991 11:39 | 4 |
| In the V1.2 kit, if the user enters an invalid TDF, we bring it to the
user's attention immediately.
-Matt.
|
1787.7 | Thanks. | DUNDEE::CLEARY | A deviant having fun..." | Mon Nov 18 1991 23:19 | 0
|