
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

4804.0. "Alternate identifier error in alarms FM?" by BAHTAT::BOND () Fri Apr 02 1993 03:25

    One of my customers is reporting an alarm failure problem and I am
    trying to decide whether it is a problem within alarms or within the
    access module.  He has an 'occurs alarm' set against the event
    REMSTA_UNREACHABLE for REMOTE_STATION * which is provided by the
    DECMCC/UDM Asset product (formerly known internally as colombus).
    
    Every 3 or 4 days, this alarm starts raising exceptions at a rate of
    350 per minute with the message:
    
    Unable to get alternate identifier for alarm remote_station *.
    Using primary identifier.
    
    Disabling and re-enabling the alarm stops the errors but no further
    alarms are notified by this rule until the system is rebooted.
    
    From the error message, do you think it is an alarms problem or a
    REMOTE_STATION AM problem?
    
    DECmcc 1.2.3, DECmcc/UDM 1.2.1.
Replies (T.R, Title, User, Personal Name, Date, Lines):

4804.1. "Can you post the Rule?" by MOLAR::ROBERTS (Keith Roberts - Network Management Applications) Fri Apr 02 1993 09:17 (3 lines)

  Can you post the exact Rule you are creating, that is, the expression
  you enter?

  thanks /keith

4804.2. "Here's the rule..." by BAHTAT::BOND () Mon Apr 05 1993 04:47 (10 lines)

    Here's the rule as requested:-
    
    create mcc 0 alarms rule DDC_Remsta_Unreachable -
    expression = (occurs(remote_station * remsta_unreachable)), -
    procedure  = /usr/mcc/ddc/ddc.scp, -
    parameter  = TYPE=remsta, -
    category   = "Remsta Event", -
    description = "A remote station has become unreachable", -
    perceived severity = major, -
    in domain = .domain.overall

4804.3. "I suspect the Remote_Station Access Module .. but try this .." by MOLAR::ROBERTS (Keith Roberts - Network Management Applications) Mon Apr 05 1993 15:17 (11 lines)

  Try executing the same command which Alarms uses to process your Rule:

  getevent remote_station * remsta_unreachable, for dur 200-00:00:00

  This tells the FCL to get the remsta_unreachable event from any
  remote_station global entity ... and to keep watching for 200 days.

  Run this test along with your Alarm Rule on the same system and at
  the same time...and let me know what happens.

  /keith

4804.4. "I think max_nofiles 256 is too small" by BAHTAT::BOND () Tue Apr 27 1993 13:48 (27 lines)

    Hello Again...at last the problem has re-occurred (things don't go
    wrong when they know you're watching!)
    
    In a window where we enabled the alarms, we finally got an error from
    the UDM TCP/IP event handler which receives events from a remote system. 
    This error was 'Socket Create Failure', errno=24 which is 'Too many
    open files'.  So I guess we will increase max_nofiles from the mcc
    recommended 256 to ... well howzabout 1024 to be safe!
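
    The 'Socket Create Failure' with errno=24 described above can be
    reproduced on a current POSIX system. The following is only a minimal
    Python sketch, not code from DECmcc or UDM: the function name
    sockets_until_exhausted and the limit of 64 are hypothetical, and it
    assumes a Unix-like host where Python's resource module is available.
    It lowers the soft descriptor limit for the current process, creates
    sockets until creation fails, and reports the errno it saw.

```python
import errno
import resource
import socket


def sockets_until_exhausted(soft_limit=64):
    """Create sockets under a lowered descriptor limit until creation fails.

    Returns (number_of_sockets_created, errno_of_the_failure).
    The name and the default limit are illustrative assumptions.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the current soft limit, so the request
    # always stays within the hard limit.
    if old_soft == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    held = []
    try:
        while True:
            # Each socket consumes one file descriptor.
            held.append(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    except OSError as exc:
        return len(held), exc.errno
    finally:
        # Release the descriptors and restore the original limit.
        for sock in held:
            sock.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


if __name__ == "__main__":
    count, err = sockets_until_exhausted()
    print(count, err, errno.errorcode.get(err))
```

    On systems where errno 24 is EMFILE, the failure arrives once the
    process has used up its descriptors, which matches the symptom
    reported by the event handler.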
    
    In the window where we ran the getevent that you recommended, Keith,
    we are seeing:
    
    Remote_station local_ns:.remsta.twet28
    	at time <time> configuration_event
    	received event lost message
    
    My initial thought is that this is due to the remote event forwarder
    not managing to deliver its event properly because of the socket
    failure.  Although we killed the process off that had enabled the
    alarms (because it was streaming socket create failures) the getevent
    still carried on reporting the above error twice a second, always
    against twet28.  We killed it after an hour.
    
    I think we will attend to max_nofiles first and see whether this
    clears up the whole problem.  Thank you for your help, Keith.
    
    chris
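
    The plan in 4804.4 is to raise max_nofiles from 256 to 1024. That is
    a system tuning parameter and its exact procedure is not shown in
    this thread. As a loosely analogous illustration only, the sketch
    below inspects and raises the per-process descriptor limit on a
    modern Unix-like system using Python's resource module. The function
    name raise_nofile_limit is hypothetical, and the target of 1024 is
    taken from the note.

```python
import resource


def raise_nofile_limit(target=1024):
    """Raise the soft open-file limit toward target; return the soft limit in effect.

    Illustrative helper, not part of DECmcc. The soft limit can only be
    raised as far as the hard limit allows without extra privileges.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Nothing to do if the limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft

    # Cap the request at the hard limit unless the hard limit is unlimited.
    if hard == resource.RLIM_INFINITY:
        new_soft = target
    else:
        new_soft = min(target, hard)

    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


if __name__ == "__main__":
    print("soft limit now:", raise_nofile_limit(1024))
```

    Raising the limit only buys headroom. If descriptors are being
    leaked, as the recurrence every 3 or 4 days might suggest, the
    failure would be expected to return later rather than disappear.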