Title: | DECmcc user notes file. Does not replace IPMT. |
Notice: | Use IPMT for problems. Newsletter location in note 6187 |
Moderator: | TAEC::BEROUD |
|
Created: | Mon Aug 21 1989 |
Last Modified: | Wed Jun 04 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 6497 |
Total number of notes: | 27359 |
5839.0. "SNMP Alarm woes" by ADO75A::BOUCHER (Reece Boucher) Sat Jan 22 1994 19:02
Greetings,
I am having some problems with SNMP IP reachability alarms.
I am running DECmcc V1.3.0 on a VMS V5.5-2 VAXstation 4000 Model 60. I am
currently monitoring approximately 70 cisco routers and need the following to
occur:
Poll for IP reachability every 60 sec. If a node is unreachable, send a
mail message to DECMCC::SYSTEM, which gets forwarded directly to Target HOTline
(a help desk application). The IP Poller is therefore no good for this.
Poll for IP reachability every 60 sec. If reachability is up, send a
notification with severity clear to IMPM. No mail is sent. (This is used only
to clear the icon on the map.)
What is actually happening is that I am getting a lot of alarms firing saying
that entities are unreachable, when in fact they are reachable. This is causing
a number
of problems WRT Target HOTline. It appears that a low value of ICMP timeout or
retry is being applied, even though I have increased them in the process that
enables the alarms.
Attached are copies of the alarms rules that are enabled, as well as the log
file that is created by the alarms process.
Q: Is there any way I can use the IP reachability Poller to fire an alarm ? I
need to be able to send a MAIL message on a poll failure.
Q: Is there a better way of achieving what I need to do ?
Your help is greatly appreciated.
Regards,
Reece Boucher
Adelaide, Australia
!
! MCC Alarm Rules
!
! IP Reachability Down rules
!
Create Domain LOCAL_NS:.state_net_hubs Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.primary_industries Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.road_transport Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.southern_power_and_water Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.environ_land_management Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.housing_urban_development Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.labour_admin_services Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.justice Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.arts_cultural_heritage Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.premier_govt_management Rule SNMP_IP_Reach_Down -
Expression = (CHANGE_OF (SNMP * ipReachability, Up,*),at every=00:01:00), -
Severity = Critical, -
Category = "Router", -
Description = " IP Reachability = DOWN. Node is unreachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_HOTLINE.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "HOTLINE", -
Batch Queue = "alarms$batch"
!
!
! IP Reachability Up rules
!
Create Domain LOCAL_NS:.state_net_hubs Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.primary_industries Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.road_transport Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.southern_power_and_water Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.environ_land_management Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.housing_urban_development Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.labour_admin_services Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.justice Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.arts_cultural_heritage Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
Create Domain LOCAL_NS:.premier_govt_management Rule SNMP_IP_Reach_Up -
Expression = (CHANGE_OF (SNMP * ipReachability, down, up),at every=00:01:00), -
Severity = Clear, -
Category = "Router", -
Description = " IP Reachability = UP. Node is now reachable by IP", -
Alarm Fired Procedure = DKA200:[ALARMS]MCC_ALARMS_BROADCAST.COM, -
Alarm Exception Procedure = DKA200:[ALARMS]MCC_ALARMS_EXCEPTION.COM, -
Alarm Fired Parameters = "NETMAN", -
Batch Queue = "alarms$batch"
!
!
$!
$ manage/enterprise
!
! enable alarms
!
SET MCC 0 TCPIP_AM UDP TIMEOUT=30,UDP RETRIES=3,ICMP TIMEOUT=30,ICMP RETRIES=3
show mcc 0 tcpip_am all attr
do mcc_alarms:enable_alarms.com
!
! wait for a very long time
show mcc 0 all id, at start=(+9999-00:00:00)
exit
$ exit
$set noverify
DECmcc (V1.3.0)
MCC 0 TCPIP_AM
AT 21-JAN-1994 13:54:51 Characteristics
Examination of Attributes Shows
UDP Timeout = 30
UDP Retries = 3
ICMP Timeout = 30
ICMP Retries = 3
MCC 0 TCPIP_AM
AT 21-JAN-1994 13:54:51 All Attributes
Component Version = V1.3.0
Component Identification = "DECmcc TCP/IP SNMP AM"
UDP Timeout = 30
UDP Retries = 3
ICMP Timeout = 30
ICMP Retries = 3
Mib Extensions Available = ( "rmon",
"EXP_RMON",
"cisco",
"novell",
"synoptics" )
Domain LOCAL_NS:.primary_industries Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:54:53
Normal operation has begun.
Domain LOCAL_NS:.primary_industries Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:54:54
Normal operation has begun.
Domain LOCAL_NS:.road_transport Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:54:55
Normal operation has begun.
Domain LOCAL_NS:.road_transport Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:54:55
Normal operation has begun.
Domain LOCAL_NS:.southern_power_and_water Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:54:56
Normal operation has begun.
Domain LOCAL_NS:.southern_power_and_water Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:54:56
Normal operation has begun.
Domain LOCAL_NS:.environ_land_management Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:54:57
Normal operation has begun.
Domain LOCAL_NS:.environ_land_management Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:54:57
Normal operation has begun.
Domain LOCAL_NS:.housing_urban_development Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:54:58
Normal operation has begun.
Domain LOCAL_NS:.housing_urban_development Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:54:58
Normal operation has begun.
Domain LOCAL_NS:.labour_admin_services Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:55:00
Normal operation has begun.
Domain LOCAL_NS:.labour_admin_services Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:55:01
Normal operation has begun.
Domain LOCAL_NS:.justice Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:55:02
Normal operation has begun.
Domain LOCAL_NS:.justice Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:55:03
Normal operation has begun.
Domain LOCAL_NS:.arts_cultural_heritage Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:55:07
Normal operation has begun.
Domain LOCAL_NS:.arts_cultural_heritage Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:55:08
Normal operation has begun.
Domain LOCAL_NS:.premier_govt_management Rule SNMP_IP_Reach_Down
AT 21-JAN-1994 13:55:10
Normal operation has begun.
Domain LOCAL_NS:.premier_govt_management Rule SNMP_IP_Reach_Up
AT 21-JAN-1994 13:55:10
Normal operation has begun.
T.R | Title | User | Personal Name | Date | Lines |
---|
5839.1 | | MOLAR::YAHEY::BOSE | | Mon Jan 24 1994 10:01 | 5 |
|
When you get an alarm for reachability down, can you immediately
issue a "SHOW SNMP xxxx ALL STATUS" and tell us the result?
Rahul.
|
5839.2 | option: 2nd poll in script | CTHQ::WOODCOCK | Skiing's 1st Human Groomer | Mon Jan 24 1994 10:52 | 15 |
| Greetings,
Assuming you get the poller as robust as it can be and you are still having
problems, I'd recommend an implementation modification.
Within the script, POLL THE DEVICE AGAIN if you can. I did this using ncp
because of a similar problem to the one you describe; however, I have not tried
this with snmp. I simply did an "ncp tell x sho exec"
and captured the $STATUS. If successful, this means the node is not really down,
so don't send mail (in your case, don't send to TARGET). You might be able
to use the same technique by issuing "mcc show snmp x ipreachability, to file
=x.txt". Then search x.txt for "down" before sending to TARGET.
just a thought,
brad...
|
5839.3 | SNMP status OK | ADO75A::BOUCHER | Reece Boucher | Mon Jan 24 1994 22:43 | 15 |
| Rahul,
Showing the status of the SNMP device directly after an IP reachability
Down normally shows IP Reachability=UP.
Brad,
I take it the way you are suggesting is to use the Data Collector.
This would work, but doesn't it go against the intention of the IP
Poller ?
Thanks for the replies.
Reece Boucher
|
5839.4 | use a second poll as proof | CTHQ::WOODCOCK | Skiing's 1st Human Groomer | Tue Jan 25 1994 09:58 | 38 |
| Hi Reece,
> Brad,
>
> I take it the way you are suggesting is to use the Data Collector.
> This would work, but doesn't it go against the intention of the IP
> Poller ?
Actually no, not quite what I was suggesting. This may actually go against
the IP Poller, but I'm not sure because I've never used it. Anyway, what we do
is write a simple alarm rule, enable it, and if something alarms down then in
the script we poll the device again. By doing this we double-check whether the
device is really down, the network glitched, the router was too busy, and so
on.
Alarm rule expression something like exp=(snmp * ipreachability=down)
procedure=node_down.com
enable rule
logic in node_down.com
- get snmp device name
- poll it again through the script
$ node_status="up"
$ mcc show snmp <device> ipreachability,to file=x.txt
$ sea x.txt down
$! SEARCH returns %x00000001 only when it found a match for "down"
$ if $status .eq. %x00000001 then node_status="down"
$ if node_status .eqs. "down"
$ then
$ send mail
$ etc, etc...
$ endif
The above stuff is not exact by any means, and it will not be fast either. What
it does do is double-check that the device is really down before sending mail
or creating a TARGET ticket, which is the major hassle we are faced with.
cheers,
brad...
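The second-poll logic above could be extended to retry a few times before
giving up. What follows is only a rough sketch under several assumptions, not
a tested procedure: the procedure name CONFIRM_DOWN.COM is made up, "mcc" is
assumed to be a symbol that runs a single MCC command, the device name is
assumed to arrive in P1, and the retry count and wait time are arbitrary
values. It also assumes the word "down" appears in x.txt only when
ipReachability really is down.

```dcl
$! CONFIRM_DOWN.COM (hypothetical name) - re-poll before raising a ticket
$! P1 = SNMP entity name (assumption: supplied by the calling procedure)
$ device = p1
$ tries = 0
$ max_tries = 3                  ! assumed value
$ node_status = "down"
$loop:
$ if tries .ge. max_tries then goto done
$ tries = tries + 1
$ mcc show snmp 'device' ipreachability, to file=x.txt
$ search/nooutput x.txt "down"
$! %x00000001 means SEARCH found "down"; anything else is treated as up
$ if $status .ne. %x00000001
$ then
$     node_status = "up"
$     goto done
$ endif
$ wait 00:00:10                  ! assumed pause between polls
$ goto loop
$done:
$! Only mail when every poll reported down
$ if node_status .eqs. "down" then -
      mail/subject="''device' unreachable" x.txt decmcc::system
$ exit
```

Mail goes out only if every poll reports down, so a single dropped ICMP
packet would not open a TARGET ticket. The cost is a delay of up to
max_tries polls plus the waits before a real outage is reported.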
|
5839.5 | | MOLAR::YAHEY::BOSE | | Tue Jan 25 1994 10:58 | 10 |
|
RE .3
Reece,
This probably means that the alarm encountered an exception.
Can you check the contents of the alarms notification to confirm
that ?
Rahul.
|
5839.6 | No exceptions encountered | ADO75A::BOUCHER | Reece Boucher | Wed Jan 26 1994 18:50 | 13 |
| Rahul,
No exceptions were logged in mcc_notification.log.
Only the normal IP Reachability UP and IP Reachability DOWN conditions are
reported.
The time between the two, however, can be as much as 10 minutes.
FYI.
Reece...
|