
Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

4804.0. "Alternate identifier error in alarms FM?" by BAHTAT::BOND () Fri Apr 02 1993 03:25

    One of my customers is reporting an alarm failure problem and I am
    trying to decide whether it is a problem within alarms or within the
    access module.  He has an 'occurs alarm' set against the event
    REMSTA_UNREACHABLE for REMOTE_STATION * which is provided by the
    DECMCC/UDM Asset product (internally, used to be called colombus).
    
    Every 3 or 4 days, this alarm starts raising exceptions at a rate of
    350/minute with the message:
    
    Unable to get alternate identifier for alarm remote_station *.
    Using primary identifier.
    
    Disabling and re-enabling the alarm stops the errors but no further
    alarms are notified by this rule until the system is rebooted.
    
    From the error message, do you think it is an alarms problem or a
    REMOTE_STATION AM problem?
    
    DECmcc 1.2.3, DECmcc/UDM 1.2.1.

T.R	Title	User	Personal Name	Date	Lines
4804.1	Can you post the Rule ?	MOLAR::ROBERTS	Keith Roberts - Network Management Applications	`Fri Apr 02 1993 09:17`	3
	Can you post the exact Rule you are creating, i.e. the expression you
	enter? Thanks.
	/keith
4804.2	Here's the rule...	BAHTAT::BOND		`Mon Apr 05 1993 04:47`	10
	Here's the rule as requested:-

	create mcc 0 alarms rule DDC_Remsta_Unreachable -
	  expression = (occurs(remote_station * remsta_unreachable)), -
	  procedure = /usr/mcc/ddc/ddc.scp, -
	  parameter = TYPE=remsta, -
	  category = "Remsta Event", -
	  description = "A remote station has become unreachable", -
	  perceived severity = major, -
	  in domain = .domain.overall
4804.3	I suspect the Remote_Station Access Module .. but try this ..	MOLAR::ROBERTS	Keith Roberts - Network Management Applications	`Mon Apr 05 1993 15:17`	11
	Try executing the same command which Alarms uses to process your Rule:

	    getevent remote_station * remsta_unreachable, for dur 200-00:00:00

	This tells the FCL to get the remsta_unreachable event from any
	remote_station global entity, and to keep watching for 200 days.

	Run this test along with your Alarm Rule on the same system and at the
	same time, and let me know what happens.

	/keith
4804.4	I think max_nofiles 256 is too small	BAHTAT::BOND		`Tue Apr 27 1993 13:48`	27
	Hello again... at last the problem has recurred (things don't go wrong
	when they know you're watching!)

	In a window where we enabled the alarms, we finally got an error from
	the UDM TCP/IP event handler, which receives events from a remote
	system. This error was 'Socket Create Failure', errno=24, which is
	'Too many open files'. So I guess we will increase max_nofiles from the
	mcc recommended 256 to ... well, howzabout 1024 to be safe!

	In the window where we ran the getevent that you recommended, Keith, we
	are seeing:

	    Remote_station local_ns:.remsta.twet28 at time <time> configuration_event received event lost message

	My initial thought is that this is due to the remote event forwarder
	not managing to deliver its event properly because of the socket
	failure. Although we killed off the process that had enabled the alarms
	(because it was streaming socket create failures), the getevent still
	carried on reporting the above error twice a second, always against
	twet28. We killed it after an hour.

	I think we will attend to max_nofiles first and see whether this
	clears up the whole problem.

	Thank you for your help, Keith.

	chris
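
	For anyone wanting to check the same symptom on their own system, here
	is a minimal sketch in Python (not part of DECmcc, and not how the UDM
	event handler itself works) that looks up the errno behind 'Too many
	open files' and reads the per-process open-file limit. The helper
	names describe_errno, nofile_limits and open_fd_headroom are
	hypothetical, chosen only for illustration, and the mapping of the
	max_nofiles kernel parameter onto RLIMIT_NOFILE is an assumption that
	may not hold on every Unix variant.

```python
import errno
import os
import resource


def describe_errno(code):
    """Return (symbolic name, message text) for an errno value on this platform."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def nofile_limits():
    """Return the (soft, hard) per-process limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_headroom(in_use, soft_limit):
    """Descriptors still available before socket() or open() would fail with EMFILE."""
    return max(soft_limit - in_use, 0)


if __name__ == "__main__":
    # EMFILE is the "too many open files" error; its number is platform-defined.
    name, message = describe_errno(errno.EMFILE)
    soft, hard = nofile_limits()
    print(f"errno {errno.EMFILE} ({name}): {message}")
    print(f"open-file limit: soft={soft} hard={hard}")
    # Illustrative figures only: 250 descriptors in use against a limit of 256.
    print(f"headroom at 250 in use, limit 256: {open_fd_headroom(250, 256)}")
```

	A process that holds one socket per remote station, plus its usual
	files, can reach a limit of 256 quickly, which would be consistent
	with the socket create failures reported above.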