T.R | Title | User | Personal Name | Date | Lines |
---|
4804.1 | Can you post the Rule ? | MOLAR::ROBERTS | Keith Roberts - Network Management Applications | Fri Apr 02 1993 09:17 | 3 |
| Can you post the exact Rule you are creating, i.e. the expression you enter?
thanks /keith
|
4804.2 | Here's the rule... | BAHTAT::BOND | | Mon Apr 05 1993 04:47 | 10 |
| Here's the rule as requested:-
create mcc 0 alarms rule DDC_Remsta_Unreachable -
expression = (occurs(remote_station * remsta_unreachable)), -
procedure = /usr/mcc/ddc/ddc.scp, -
parameter = TYPE=remsta, -
category = "Remsta Event", -
description = "A remote station has become unreachable", -
perceived severity = major, -
in domain = .domain.overall
|
4804.3 | I suspect the Remote_Station Access Module .. but try this .. | MOLAR::ROBERTS | Keith Roberts - Network Management Applications | Mon Apr 05 1993 15:17 | 11 |
| Try executing the same command which Alarms uses to process your Rule:
getevent remote_station * remsta_unreachable, for dur 200-00:00:00
This tells the FCL to get the remsta_unreachable event from any
remote_station global entity ... and to keep watching for 200 days.
Run this test along with your Alarm Rule on the same system and at
the same time...and let me know what happens.
/keith
|
4804.4 | I think max_nofiles 256 is too small | BAHTAT::BOND | | Tue Apr 27 1993 13:48 | 27 |
| Hello Again...at last the problem has re-occurred (things don't go
wrong when they know you're watching!)
In a window where we enabled the alarms, we finally got an error from
the UDM TCP/IP event handler which receives events from a remote system.
This error was 'Socket Create Failure', errno=24, which is 'Too many
open files'. So I guess we will increase max_nofiles from the
MCC-recommended 256 to ... well howzabout 1024 to be safe!
In the window where we ran the getevent that you recommended, Keith, we
are seeing:
Remote_station local_ns:.remsta.twet28
at time <time> configuration_event
received event lost message
My initial thought is that this is due to the remote event forwarder
not managing to deliver its event properly because of the socket
failure. Although we killed off the process that had enabled the
alarms (because it was streaming socket create failures), the getevent
still carried on reporting the above error twice a second, always
against twet28. We killed it after an hour.
I think we will attend to max_nofiles first and see whether this
clears up the whole problem. Thank you for your help, Keith.
chris
|
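The errno=24 failure reported in 4804.4 can be reproduced in isolation. The following is a minimal sketch in Python, not anything taken from MCC itself: it assumes a Unix-like system where the per-process descriptor limit is exposed as RLIMIT_NOFILE (the role max_nofiles plays in the note), and the function name and the limit of 64 are hypothetical, chosen only for illustration. It lowers the soft limit, creates TCP sockets until the system refuses, and returns the errno from the failed create.

```python
import errno
import os
import resource
import socket


def sockets_until_emfile(soft_limit=64):
    """Lower the soft descriptor limit, then create sockets until one fails.

    Returns (number of sockets created, errno of the failure).
    soft_limit is an illustrative value, not an MCC recommendation.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit allows.
    if hard == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    opened = []
    failure = None
    try:
        while True:
            # Each socket consumes one descriptor, just as each open file does.
            opened.append(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    except OSError as exc:
        failure = exc.errno
    finally:
        # Release the descriptors and restore the original soft limit.
        for sock in opened:
            sock.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(opened), failure


if __name__ == "__main__":
    count, err = sockets_until_emfile()
    print("sockets created before failure:", count)
    print("errno:", err, "-", os.strerror(err))
```

The count is always below the limit because the process already holds descriptors for stdin, stdout, stderr and anything else it has open. That is also why a process holding many long-lived event connections can reach a limit of 256 well before it has 256 sockets of its own.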