Title: | DEChub/HUBwatch/PROBEwatch CONFERENCE |
Notice: | Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7 |
Moderator: | NETCAD::COLELLA DT |
Created: | Wed Nov 13 1991 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 4455 |
Total number of notes: | 16761 |
--------------------------------------------------------------------------------
DANIEL JEYACHANDRAN <F/W V2.0 problem.> 05-MAR-1996 14:44
--------------------------------------------------------------------------------
HUBWATCH 4.1.1 & MAM module V4.0.2
DR900TM S/W V2.0.0, H/W V3, RO v04
            ^
            |
            |
HP OpenView is reporting "RMON rising alarm exceeded 1" etc.
Hubwatch reports "RPT900 Missing data LED14 PCOM LED program LED15 No information." etc.

I have not been able to find any docs on hubwatch messages. I found the following entry in the STARS database indicating a probable bug in the V2.0 S/W of the DR900TM. Is there a fix for this problem yet?

Thanks in advance.
Daniel. CSC, Sydney.

------------------------------------------------------------------------------
{Elev} problems with repeater after micro code upgrade getting traps

COPYRIGHT (c) 1988, 1993 by Digital Equipment Corporation. ALL RIGHTS RESERVED. No distribution except as provided under contract.

PRODUCT or COMPONENT: DECrepeater 900TM
OP/SYS: OPENVMSVAX

VERSION INFORMATION:
Operating System Version(s): OSF1 V3.2
Layered Product/Component Version(s): {include all relevant version numbers}
Polycenter netview 3.1B
Hubwatch V3.1

SOURCE: Digital Equipment Corporation

SYMPTOM:

Customer has seen a problem with traps occurring since upgrading several DECrepeater 900tm's from V1.1.0 to V2.0.0. The customer noted that some of the upgraded repeaters are not having the problem. Customer is launching hubwatch standalone; he is seeing the traps attached below via polycenter netview. His version of hubwatch does not have "alarms" in the applications pull-down window, so it appears that it cannot be set up to log the traps. Customer is seeing the traps hundreds of times a day on the affected repeaters. Some of the repeaters having the problem have hundreds of PCs connected, and some only have a device connected that monitors a remote site's UPS.

The customer clicked on one of the affected modules and read the status screen to me:

status: enabled
Health Text: 0 ports are not operational, 0 ports are auto partitioned, 11 media are not available.
Health Text Changes: 414
Partitioned Ports: 0
Media Unavailable Ports: 11
Transmit Collisions: 335000 (uptime of 35 days on a busy network per customer).

Customer can think of nothing that distinguishes the DECrepeater 900tm's that are having the problem from those that are not.

Below is information received from the customer via FAX, which is everything the customer has on the traps. There may be a few inaccuracies because a few words were not readable:

A RMON falling alarm repeater mau repeater information
repeater mau total media unavailable 0
fell below threshold 1; value = 21: (sample type = 2)
specific = 2
enterprise= rmon 1.3.6.1.2.1.16

A RMON rising alarm: repeater repeater information repeater health text
changes 0

exceeded threshold 1; value 40 (sample type =2; alarm index =4)
specific: 1
generic : 6
catagory: threshold events
enterprise: rmon 1.3.6.1.2.1.16
source: agent (A)
hostname: rep2.shost.ksc.com
severity: critical

RMON rising alarm repeater extensions, repeater basic package, repeater repeater information 5.0
exceeded threshold 1; value 83 (sample type =2; alarm index = 5)
specific: 1
generic : 6
catagory: threshold events
enterprise: rmon 1.3.6.1.2.1.16
source: agent (A)
hostname: rep2.elrio.ksc.com
severity: critical

MTI.fddi.ksc.co N elrio2.elrio.ksc.com reported different link address than obtained from mti.fddi.ksc.com by snmp
specific: 58982401 (hex: 3840001)
generic : 6
catagory: mode configuration events
enterprise: netview 1.3.6.1.4.1.2.6.3.1
source: network (N)
hostname: net1.fddi.ksc.com
severity: indeterminate

I have talked to Frank Levesque regarding the problem. Frank has agreed to look into what the traps are indicating and whether or not polycenter netview could be a factor.

Customer has been somewhat difficult to work with and does not understand why we are asking questions regarding version and topology information. I have explained to him that we cannot find information regarding what the traps are telling us, so we cannot tell him how to resolve the issue. The customer wants information on what the traps mean and how to correct them. I have reviewed RFC 1157 (SNMP) and did not see them in there.

DIGITAL RESPONSE:
This problem has been reported to Engineering.

WORKAROUND: {workaround}

ANALYSIS: {cause}
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
3323.1 | NETCAD::GALLAGHER | Tue Mar 05 1996 09:31 | 135 | ||
>Customer has seen a problem with traps occurring since upgrading several
>DECrepeater 900tm's from V1.1.0 to V2.0.0.

Why is this a problem? If customers don't want to get traps, they should not provide trap sinks (trap destination IP addresses). Traps don't always report "bad" things. Sometimes they're just informational, like the trap below:

>A RMON falling alarm repeater mau repeater information
>repeater mau total media unavailable 0
>fell below threshold 1; value = 21: (sample type = 2)
>specific = 2
>enterprise= rmon 1.3.6.1.2.1.16

Definition:

> 8: erptrMauTotalMediaUnavailable          One or more media have become
>    1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0   available or unavailable.

This usually means that someone plugged in a cable or removed a cable.

>A RMON rising alarm: repeater repeater information repeater health text
>changes 0
>
>exceeded threshold 1; value 40 (sample type =2; alarm index =4)
>specific: 1
>generic : 6
>catagory: threshold events
>enterprise: rmon 1.3.6.1.2.1.16
>source: agent (A)

Traps are also sent when changes to healthText occur. The alarmed object is in the DEC Private "Extended Repeater MIB". Its definition is:

>erptrHealthTextChanges OBJECT-TYPE
>    SYNTAX Counter
>    ACCESS read-only
>    STATUS mandatory
>    DESCRIPTION
>        "This counter increments each time the rptrHealthText object
>        defined in RFC 1516 is modified."
>    REFERENCE
>        "Reference RFC 1516 repeater MIB"
>    ::= { erptrRptrInfo 4 }

And the repeater MIB's rptrHealthText object is defined as:

> rptrHealthText OBJECT-TYPE
>     SYNTAX DisplayString (SIZE (0..255))
>     ACCESS read-only
>     STATUS mandatory
>     DESCRIPTION
>         "The health text object is a text string that
>         provides information relevant to the operational
>         state of the repeater. Agents may use this string
>         to provide detailed information on current
>         failures, including how they were detected, and/or
>         instructions for problem resolution. The contents
>         are agent-specific."
>     REFERENCE
>         "Reference IEEE 802.3 Rptr Mgt, 19.2.3.2,
>         aRepeaterHealthText."
>     ::= { rptrRptrInfo 3 }

This basically means that rptrHealthText is used to report anything deemed "interesting" by the repeater implementation. The trap is meant to alert network managers to look at the health text.

>specific: 58982401 (hex: 3840001)
>generic : 6
>catagory: mode configuration events
>enterprise: netview 1.3.6.1.4.1.2.6.3.1
>source: network (N)
>hostname: net1.fddi.ksc.com
>severity: indeterminate

I'm not sure what this is, but it looks like it's coming from a host rather than a repeater. Can you confirm this?

I've attached a list of the objects on repeaters that are alarmed.

>The customer wants information on what the traps mean and how to correct
>them, I have reviewed RFC 1157 (SNMP) and did not see them in there.

RFC 1757 describes RMON and contains definitions for the RMON rising and falling event traps.

-Shawn

-------------------------------------------------------------------------
REPEATER/PORTswitch (see Matrix below):

DEFAULT ALARMS (NAME & OBJECTID)          TRIGGER OF EVENT
--------------------------------          -----------------
1: pcomEsysNVRAMavailableOctets           There is no more memory for
    1.3.6.1.4.1.36.2.18.11.2.7.6.0        nonvolatile parameters.

2: rptrTotalPartitionedPorts              One or more ports has been
    1.3.6.1.2.1.22.1.1.6.0                autopartitioned, or a port that was
                                          previously autopartitioned is now
                                          operational.

3: erptrHealthTextChanges                 The module's operational state
    1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.4.0  has changed.

4: erptrTotalPortEvents                   The total number of times a port
    1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.5.0  has become nonoperational,
                                          autopartitioned, or unavailable.

5: erptrTotalRptrErrors                   The total number of errors for
    1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.6.0  this module.

6: erptrDprTotalStateChange               The module's link state change has
    1.3.6.1.4.1.36.2.18.11.5.1.1.3.1.1.0  occurred while using redundant-link
                                          configuration.

7: erptrSecurityRptrSecurityViolation     A security violation has occurred
    1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0  on one or more ports.

8: erptrMauTotalMediaUnavailable          One or more media have become
    1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0  available or unavailable.

9: erptrSecurityRptrSecurityViolation     A security violation has occurred
    1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0  on one or more ports.

10: erptrMauTotalMediaUnavailable         One or more media have become
    1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0  available or unavailable.

* indicates the module supports the alarm
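For readers wondering why a counter such as erptrHealthTextChanges produces both rising and falling traps, here is a minimal sketch of the RMON alarm hysteresis described in RFC 1757, assuming delta sampling (the "sample type = 2" in the traps above). The function name, the sample values, and the thresholds (rising 1, falling 0) are hypothetical and chosen only for illustration; they are not the repeater's actual configuration. The sketch also ignores counter wrap and treats the alarm as armed in both directions at startup.

```python
from typing import List, Sequence, Tuple

# (sample index, "rising" or "falling", delta value that triggered the event)
Event = Tuple[int, str, int]


def rmon_delta_alarm_events(
    counter_samples: Sequence[int],
    rising_threshold: int,
    falling_threshold: int,
) -> List[Event]:
    """Illustrative model of an RMON alarm using delta sampling.

    A rising event fires when the delta reaches the rising threshold,
    and cannot fire again until a falling event has occurred (and vice
    versa). This hysteresis is why the traps tend to arrive in pairs.
    Counter wrap is not handled; this is a simplification.
    """
    events: List[Event] = []
    armed_rising = True   # assumption: both directions armed at startup
    armed_falling = True
    previous = None

    for index, current in enumerate(counter_samples):
        if previous is None:
            previous = current  # first sample only establishes a baseline
            continue
        delta = current - previous
        previous = current

        if delta >= rising_threshold and armed_rising:
            events.append((index, "rising", delta))
            armed_rising = False
            armed_falling = True
        elif delta <= falling_threshold and armed_falling:
            events.append((index, "falling", delta))
            armed_falling = False
            armed_rising = True

    return events


if __name__ == "__main__":
    # Hypothetical readings of a health-text-changes counter, one per interval.
    samples = [414, 414, 415, 417, 417, 417, 420]
    for event in rmon_delta_alarm_events(samples, rising_threshold=1, falling_threshold=0):
        print(event)
```

With these made-up samples the counter is idle, changes, goes idle, and changes again, so the sketch reports a falling, a rising, a falling, and a rising event in that order. A port whose media state flaps repeatedly would generate a similar stream of paired traps without anything being broken.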
3323.2 | IAMOSI::DANIEL | Fri Mar 08 1996 01:04 | 18 | ||
Hi Shawn,

Thanks for your quick reply. The attachment I sent was not from my customer. My customer's log did not reveal any details like the one attached, but only single-line summaries:

"RMON rising alarm exceeded 1"
  "    "      "      "
RMON falling alarm below 1 etc.

I have asked him to give me a detailed print-out of it.

Daniel.
3323.3 | NETCAD::MILLBRANDT | answer mam | Fri Mar 08 1996 10:36 | 9 | |
from .0 -

> HUBWATCH 4.1.1 & MAM module V4.0.2
> DR900TM S/W V2.0.0, H/W V3, RO v04

You should be running V4.1 of the MAM with 4.1 HUBwatch.

Dotsie