
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

5662.0. "SERIOUS PROBLEM on notification of SNA-alarms" by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Thu Oct 14 1993 19:44

**********************************************************************
This note is posted in both MCC and POLYCENTER_SNA_MANAGER
**********************************************************************

We're having a *SERIOUS* problem at a customer site where I'm helping
them set up the DECmcc environment.

The environment is:  VAXstation 4000/90, VMS V5.5-2,  128 Mb, DECmcc 1.3.

The problem description is as follows:

They have two alarm rules enabled. 
The first is an occurs-rule defined like:
	(OCCURS (SNAnode SE_NS:.TELIA.C.SNA.SYS1 CLSTR XP3006
	         RECOVERY INITIATED)) 
         SEVERITY=MAJOR

The second is:
	(OCCURS (SNAnode SE_NS:.TELIA.C.SNA.SYS1 CLSTR XP3006  
		 RECOVERY COMPLETE))	
         SEVERITY=CLEAR

There are also two NOTIFY REQUESTS enabled for the RECOVERY INITIATED 
and COMPLETE events on that SNA-NODE.

In the notifications window we see the RECOVERY INITIATED event arrive,
followed very soon after (< 2 secs) by the RECOVERY COMPLETE event.

      The events are ALWAYS notified in the right order, i.e. INITIATED
      before COMPLETE.

The alarms, however, may come in various orders:

1.
    - The 'RECOVERY INITIATED'-event shows up first in notifications
      window
    - The initiated-alarm shows up AFTER the RECOVERY INITIATED-event 
      is displayed but BEFORE the 'RECOVERY COMPLETE'-event. 
    - The 'RECOVERY COMPLETE'-event is displayed AFTER the initiated-alarm
    - The complete-alarm shows up AFTER the 'RECOVERY COMPLETE'-event

      This is the behaviour one would expect.


2.
    - The 'RECOVERY INITIATED'-event shows up first in notifications
      window
    - The 'RECOVERY COMPLETE'-event is displayed AFTER 'RECOVERY 
      INITIATED'-event
    - The complete-alarm shows up AFTER the 'RECOVERY
      COMPLETE'-event  **BUT** before the initiated-alarm
    - The initiated-alarm shows up last, LEAVING THE SEVERITY OF THE
      OBJECT AT MAJOR when it should be CLEAR.


3.
    - The 'RECOVERY INITIATED'-event shows up first in notifications
      window
    - The complete-alarm shows up after the 'RECOVERY INITIATED'-event
      but (** AND THIS IS VERY SERIOUS **) before the 'RECOVERY
      COMPLETE'-event, which is the event triggering the alarm.....
    - The 'RECOVERY COMPLETE'-event shows up AFTER the complete-alarm
    - The initiated-alarm shows up last, LEAVING THE SEVERITY OF THE
      OBJECT AT MAJOR when it should be CLEAR.
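
The three orderings above can be replayed in a small Python sketch. It
assumes (the symptom suggests this, but nothing here confirms it) that
the severity displayed for an object is simply that of the most recently
delivered alarm. The names ORDERINGS, ALARM_SEVERITY and final_severity
are made up for illustration and are not part of DECmcc.

```python
# Hypothetical model, not DECmcc code: the severity shown for an object
# is whatever the most recently delivered alarm set it to.

# Each case lists the notifications in the order they appeared in the
# notifications window, as described in cases 1-3 above.
ORDERINGS = {
    1: ["initiated-event", "initiated-alarm", "complete-event", "complete-alarm"],
    2: ["initiated-event", "complete-event", "complete-alarm", "initiated-alarm"],
    3: ["initiated-event", "complete-alarm", "complete-event", "initiated-alarm"],
}

# Severity carried by each alarm rule (from the two rule definitions).
ALARM_SEVERITY = {"initiated-alarm": "MAJOR", "complete-alarm": "CLEAR"}


def final_severity(notifications):
    """Return the severity left on the object after all notifications."""
    severity = None
    for item in notifications:
        if item in ALARM_SEVERITY:  # plain events do not change severity
            severity = ALARM_SEVERITY[item]
    return severity


for case, notifications in ORDERINGS.items():
    print(case, final_severity(notifications))
```

Under that assumption only case 1 ends at CLEAR; cases 2 and 3 both end
at MAJOR, which is what we see on the map.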



Looking in the notification log, especially at the ALARM entries, I
see that the date and time of the event triggering each alarm matches
the order in which the alarms are displayed, i.e. if the time-stamp of
Recovery Complete is earlier than that of Recovery Initiated, I see the
alarms in the wrong order.
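
To make that comparison concrete, here is a minimal Python sketch that
checks the "The last event detected" time-stamps copied from the alarm
entries in the three log excerpts at the end of this note. The names
STAMPS and complete_before_initiated are only illustrative, and the
parsing assumes an English locale for the month abbreviation.

```python
from datetime import datetime

# Format of the time-stamps in the alarm entries, e.g.
# "14-OCT-1993 08:52:01.29" (assumes English month abbreviations).
FMT = "%d-%b-%Y %H:%M:%S.%f"

# "The last event detected" time-stamps from the alarm entries of the
# three examples in this note: (Recovery Initiated, Recovery Complete).
STAMPS = {
    1: ("14-OCT-1993 08:52:01.29", "14-OCT-1993 08:52:01.70"),
    2: ("14-OCT-1993 12:17:27.96", "14-OCT-1993 12:17:27.95"),
    3: ("14-OCT-1993 13:29:46.75", "14-OCT-1993 13:29:46.21"),
}


def complete_before_initiated(initiated, complete):
    """True if Recovery Complete is stamped earlier than Recovery Initiated."""
    return datetime.strptime(complete, FMT) < datetime.strptime(initiated, FMT)


for example, (initiated, complete) in STAMPS.items():
    flag = "REVERSED" if complete_before_initiated(initiated, complete) else "ok"
    print(f"example {example}: {flag}")
```

Examples 2 and 3 come out reversed (by 0.01 s and 0.54 s respectively),
and those are exactly the cases where the alarms show up in the wrong
order.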

Well, this is not acceptable. We can tell for sure that the events are
sent, from NetView, in the right order.  

Why do they end up in the wrong order when they arrive at DECmcc?
Does the SNA_SERVER process have a problem posting the incoming events
in the right order, or is it the EVENT_MANAGER causing the problem?

I know there are not too many SNA-AMs installed, but for those of you
who have customers running it, have you seen anything like this?


I would very much like to test this on another entity class, but I
don't know for which class I can generate two different events in less
than two seconds.  If anybody has an idea, please let me know....


The customer considers this a show-stopper and is not going to use
the SNA-AM (and maybe not DECmcc at all) if we can't provide them
with a solution.


One of the main reasons we got their business was the SNA-AM.......


	Please, does anybody have any idea what to do?

		/Nils

Below are three excerpts from the MCC_NOTIFICATION.LOG showing the
above-mentioned orders of notifications.

======================================================================
==========================   example 1    ============================
======================================================================
%%%%%%%%%%%%%% Event, 14-OCT-1993 08:52:02 %%%%%%%%%%%%%% [14,476]
Domain: SE_NS:.telia.c.diab2                          Severity: Indeterminate
Notification Entity: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event Source: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event: Recovery Initiated
Recovery has been initiated on this node.
                 Additional Information = (
                           Message Number = IST619I,
                             Message Text = "IST619I  ID = XP3006   FAILED - 
                                            RECOVERY IN PROGRESS" )
                           Connectivity = ( (
                         Connected Resource = link,
                              Resource Name = XL300001 ) )


!!!!!!!!!!!!!! Alarm, 14-OCT-1993 08:52:02 !!!!!!!!!!!!!! [22,477]
Domain: SE_NS:.TELIA.C.DIAB2                          Severity: Major
Notification Entity: SNAnode SE_NS:.telia.c.sna.xp3006 
Event Source: Domain SE_NS:.TELIA.C.DIAB2 Rule XP3006_DOWN 
Event: OSI Rule Fired

                             Event Type = QualityofServiceAlarm
                             Event Time = 14-OCT-1993 08:52:01.30
                         Probable Cause = Unknown
                        Additional Info = { (
                               significance = True,
                                information = "The last event detected: SNAnode 
                                              SETVTC00.SETVTM15 CLSTR XP3006 
                                              Recovery Initiated  14-OCT-1993 
                                              08:52:01.29" ),
                                            (
                               significance = True,
                                information = "Event: Recovery Initiated 
                                              Recovery has been initiated on 
                                              this node.  Additional 
                                              Information = ( Message Number = 
                                              IST619I, Message Text = ""IST619I 
                                              ID = XP3006 FAILED - RECOVERY IN 
                                              PROGRESS"" )  Connectivity = ( ( 
                                              Connected Resource = link, 
                                              Resource Name = XL300001 ) )" ),
                                            (
                               significance = True,
                                information = "(OCCURS (SNAnode 
                                              SE_NS:.TELIA.C.SNA.SYS1 CLSTR 
                                              XP3006  RECOVERY INITIATED))" ) }
                         Managed Object = SNAnode SETVTC00.SETVTM15 CLSTR 
                                          XP3006 
                     Perceived Severity = Major


%%%%%%%%%%%%%% Event, 14-OCT-1993 08:52:02 %%%%%%%%%%%%%% [14,478]
Domain: SE_NS:.telia.c.diab2                          Severity: Indeterminate
Notification Entity: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event Source: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event: Recovery Complete
Node has completed recovery.
                 Additional Information = (
                           Message Number = IST621I,
                             Message Text = "IST621I  RECOVERY SUCCESSFUL    
                                            FOR NETWORK NODE XP3006 " )
                           Connectivity = ( (
                         Connected Resource = link,
                              Resource Name = XL300001 ) )


!!!!!!!!!!!!!! Alarm, 14-OCT-1993 08:52:03 !!!!!!!!!!!!!! [22,479]
Domain: SE_NS:.TELIA.C.DIAB2                          Severity: Clear
Notification Entity: SNAnode SE_NS:.telia.c.sna.xp3006 
Event Source: Domain SE_NS:.TELIA.C.DIAB2 Rule XP3006_UP 
Event: OSI Rule Fired

                             Event Type = QualityofServiceAlarm
                             Event Time = 14-OCT-1993 08:52:01.77
                         Probable Cause = Unknown
                        Additional Info = { (
                               significance = True,
                                information = "The last event detected: SNAnode 
                                              SETVTC00.SETVTM15 CLSTR XP3006 
                                              Recovery Complete  14-OCT-1993 
                                              08:52:01.70" ),
                                            (
                               significance = True,
                                information = "Event: Recovery Complete Node 
                                              has completed recovery.  
                                              Additional Information = ( 
                                              Message Number = IST621I, Message 
                                              Text = ""IST621I RECOVERY 
                                              SUCCESSFUL FOR NETWORK NODE 
                                              XP3006 "" )  Connectivity = ( ( 
                                              Connected Resource = link, 
                                              Resource Name = XL300001 ) )" ),
                                            (
                               significance = True,
                                information = "(OCCURS (SNAnode 
                                              SE_NS:.TELIA.C.SNA.SYS1 CLSTR 
                                              XP3006  RECOVERY COMPLETE))" ) }
                         Managed Object = SNAnode SETVTC00.SETVTM15 CLSTR 
                                          XP3006 
                     Perceived Severity = Clear


======================================================================
==========================   example 2    ============================
======================================================================

%%%%%%%%%%%%%% Event, 14-OCT-1993 12:17:28 %%%%%%%%%%%%%% [14,587]
Domain: SE_NS:.telia.c.diab2                          Severity: Indeterminate
Notification Entity: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event Source: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event: Recovery Initiated
Recovery has been initiated on this node.
                 Additional Information = (
                           Message Number = IST619I,
                             Message Text = "IST619I  ID = XP3006   FAILED - 
                                            RECOVERY IN PROGRESS" )
                           Connectivity = ( (
                         Connected Resource = link,
                              Resource Name = XL300001 ) )


%%%%%%%%%%%%%% Event, 14-OCT-1993 12:17:29 %%%%%%%%%%%%%% [14,588]
Domain: SE_NS:.telia.c.diab2                          Severity: Indeterminate
Notification Entity: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event Source: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event: Recovery Complete
Node has completed recovery.
                 Additional Information = (
                           Message Number = IST621I,
                             Message Text = "IST621I  RECOVERY SUCCESSFUL    
                                            FOR NETWORK NODE XP3006 " )
                           Connectivity = ( (
                         Connected Resource = link,
                              Resource Name = XL300001 ) )


!!!!!!!!!!!!!! Alarm, 14-OCT-1993 12:17:30 !!!!!!!!!!!!!! [24,589]
Domain: SE_NS:.TELIA.C.DIAB2                          Severity: Clear
Notification Entity: SNAnode SE_NS:.telia.c.sna.xp3006 
Event Source: Domain SE_NS:.TELIA.C.DIAB2 Rule XP3006_UP 
Event: OSI Rule Fired

                             Event Type = QualityofServiceAlarm
                             Event Time = 14-OCT-1993 12:17:27.97
                         Probable Cause = Unknown
                        Additional Info = { (
                               significance = True,
                                information = "The last event detected: SNAnode 
                                              SETVTC00.SETVTM15 CLSTR XP3006 
                                              Recovery Complete  14-OCT-1993 
                                              12:17:27.95" ),
                                            (
                               significance = True,
                                information = "Event: Recovery Complete Node 
                                              has completed recovery.  
                                              Additional Information = ( 
                                              Message Number = IST621I, Message 
                                              Text = ""IST621I RECOVERY 
                                              SUCCESSFUL FOR NETWORK NODE 
                                              XP3006 "" )  Connectivity = ( ( 
                                              Connected Resource = link, 
                                              Resource Name = XL300001 ) )" ),
                                            (
                               significance = True,
                                information = "(OCCURS (SNAnode 
                                              SE_NS:.TELIA.C.SNA.SYS1 CLSTR 
                                              XP3006  RECOVERY COMPLETE))" ) }
                         Managed Object = SNAnode SETVTC00.SETVTM15 CLSTR 
                                          XP3006 
                     Perceived Severity = Clear


!!!!!!!!!!!!!! Alarm, 14-OCT-1993 12:17:30 !!!!!!!!!!!!!! [24,590]
Domain: SE_NS:.TELIA.C.DIAB2                          Severity: Major
Notification Entity: SNAnode SE_NS:.telia.c.sna.xp3006 
Event Source: Domain SE_NS:.TELIA.C.DIAB2 Rule XP3006_DOWN 
Event: OSI Rule Fired

                             Event Type = QualityofServiceAlarm
                             Event Time = 14-OCT-1993 12:17:28.32
                         Probable Cause = Unknown
                        Additional Info = { (
                               significance = True,
                                information = "The last event detected: SNAnode 
                                              SETVTC00.SETVTM15 CLSTR XP3006 
                                              Recovery Initiated  14-OCT-1993 
                                              12:17:27.96" ),
                                            (
                               significance = True,
                                information = "Event: Recovery Initiated 
                                              Recovery has been initiated on 
                                              this node.  Additional 
                                              Information = ( Message Number = 
                                              IST619I, Message Text = ""IST619I 
                                              ID = XP3006 FAILED - RECOVERY IN 
                                              PROGRESS"" )  Connectivity = ( ( 
                                              Connected Resource = link, 
                                              Resource Name = XL300001 ) )" ),
                                            (
                               significance = True,
                                information = "(OCCURS (SNAnode 
                                              SE_NS:.TELIA.C.SNA.SYS1 CLSTR 
                                              XP3006  RECOVERY INITIATED))" ) }
                         Managed Object = SNAnode SETVTC00.SETVTM15 CLSTR 
                                          XP3006 
                     Perceived Severity = Major


======================================================================
==========================   example 3    ============================
======================================================================
%%%%%%%%%%%%%% Event, 14-OCT-1993 13:29:49 %%%%%%%%%%%%%% [14,645]
Domain: SE_NS:.telia.c.diab2                          Severity: Indeterminate
Notification Entity: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event Source: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event: Recovery Initiated
Recovery has been initiated on this node.
                 Additional Information = (
                           Message Number = IST619I,
                             Message Text = "IST619I  ID = XP3006   FAILED - 
                                            RECOVERY IN PROGRESS" )
                           Connectivity = ( (
                         Connected Resource = link,
                              Resource Name = XL300001 ) )


!!!!!!!!!!!!!! Alarm, 14-OCT-1993 13:29:49 !!!!!!!!!!!!!! [24,646]
Domain: SE_NS:.TELIA.C.DIAB2                          Severity: Clear
Notification Entity: SNAnode SE_NS:.telia.c.sna.xp3006 
Event Source: Domain SE_NS:.TELIA.C.DIAB2 Rule XP3006_UP 
Event: OSI Rule Fired

                             Event Type = QualityofServiceAlarm
                             Event Time = 14-OCT-1993 13:29:46.22
                         Probable Cause = Unknown
                        Additional Info = { (
                               significance = True,
                                information = "The last event detected: SNAnode 
                                              SETVTC00.SETVTM15 CLSTR XP3006 
                                              Recovery Complete  14-OCT-1993 
                                              13:29:46.21" ),
                                            (
                               significance = True,
                                information = "Event: Recovery Complete Node 
                                              has completed recovery.  
                                              Additional Information = ( 
                                              Message Number = IST621I, Message 
                                              Text = ""IST621I RECOVERY 
                                              SUCCESSFUL FOR NETWORK NODE 
                                              XP3006 "" )  Connectivity = ( ( 
                                              Connected Resource = link, 
                                              Resource Name = XL300001 ) )" ),
                                            (
                               significance = True,
                                information = "(OCCURS (SNAnode 
                                              SE_NS:.TELIA.C.SNA.SYS1 CLSTR 
                                              XP3006  RECOVERY COMPLETE))" ) }
                         Managed Object = SNAnode SETVTC00.SETVTM15 CLSTR 
                                          XP3006 
                     Perceived Severity = Clear


%%%%%%%%%%%%%% Event, 14-OCT-1993 13:29:50 %%%%%%%%%%%%%% [14,647]
Domain: SE_NS:.telia.c.diab2                          Severity: Indeterminate
Notification Entity: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event Source: SNAnode SE_NS:.telia.c.sna.sys1 CLSTR XP3006 
Event: Recovery Complete
Node has completed recovery.
                 Additional Information = (
                           Message Number = IST621I,
                             Message Text = "IST621I  RECOVERY SUCCESSFUL    
                                            FOR NETWORK NODE XP3006 " )
                           Connectivity = ( (
                         Connected Resource = link,
                              Resource Name = XL300001 ) )


!!!!!!!!!!!!!! Alarm, 14-OCT-1993 13:29:50 !!!!!!!!!!!!!! [24,648]
Domain: SE_NS:.TELIA.C.DIAB2                          Severity: Major
Notification Entity: SNAnode SE_NS:.telia.c.sna.xp3006 
Event Source: Domain SE_NS:.TELIA.C.DIAB2 Rule XP3006_DOWN 
Event: OSI Rule Fired

                             Event Type = QualityofServiceAlarm
                             Event Time = 14-OCT-1993 13:29:46.76
                         Probable Cause = Unknown
                        Additional Info = { (
                               significance = True,
                                information = "The last event detected: SNAnode 
                                              SETVTC00.SETVTM15 CLSTR XP3006 
                                              Recovery Initiated  14-OCT-1993 
                                              13:29:46.75" ),
                                            (
                               significance = True,
                                information = "Event: Recovery Initiated 
                                              Recovery has been initiated on 
                                              this node.  Additional 
                                              Information = ( Message Number = 
                                              IST619I, Message Text = ""IST619I 
                                              ID = XP3006 FAILED - RECOVERY IN 
                                              PROGRESS"" )  Connectivity = ( ( 
                                              Connected Resource = link, 
                                              Resource Name = XL300001 ) )" ),
                                            (
                               significance = True,
                                information = "(OCCURS (SNAnode 
                                              SE_NS:.TELIA.C.SNA.SYS1 CLSTR 
                                              XP3006  RECOVERY INITIATED))" ) }
                         Managed Object = SNAnode SETVTC00.SETVTM15 CLSTR 
                                          XP3006 
                     Perceived Severity = Major


======================================================================
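
For anyone who wants to check their own notification log for the same
pattern, here is a rough Python sketch that pulls the entry order out of
text laid out like the excerpts above. It relies only on what is visible
in those excerpts: the "%%%% Event, ..." and "!!!! Alarm, ..." header
lines, the "Event:" line of an event entry and the "Perceived Severity"
line of an alarm entry. Treating the second number in the brackets as a
running sequence number is my assumption, SAMPLE is an abbreviated copy
of example 2, and summarize is a made-up helper name.

```python
import re

# Header line of an entry, e.g.
# "%%%%%%%%%%%%%% Event, 14-OCT-1993 12:17:28 %%%%%%%%%%%%%% [14,587]"
HEADER = re.compile(r"^[%!]+ (Event|Alarm), (\S+ \S+) [%!]+ \[(\d+),(\d+)\]")
EVENT = re.compile(r"^Event: (.+)$")
SEVERITY = re.compile(r"Perceived Severity = (\w+)")

# Abbreviated copy of example 2 (only the lines the sketch looks at).
SAMPLE = """\
%%%%%%%%%%%%%% Event, 14-OCT-1993 12:17:28 %%%%%%%%%%%%%% [14,587]
Event: Recovery Initiated
%%%%%%%%%%%%%% Event, 14-OCT-1993 12:17:29 %%%%%%%%%%%%%% [14,588]
Event: Recovery Complete
!!!!!!!!!!!!!! Alarm, 14-OCT-1993 12:17:30 !!!!!!!!!!!!!! [24,589]
                     Perceived Severity = Clear
!!!!!!!!!!!!!! Alarm, 14-OCT-1993 12:17:30 !!!!!!!!!!!!!! [24,590]
                     Perceived Severity = Major
"""


def summarize(log_text):
    """Return [(sequence number, kind, label)] in the order logged."""
    entries = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            # Second bracketed number is assumed to be a sequence number.
            current = {"seq": int(header.group(4)),
                       "kind": header.group(1), "label": None}
            entries.append(current)
        elif current is not None and current["label"] is None:
            if current["kind"] == "Event":
                found = EVENT.match(line.strip())
            else:
                found = SEVERITY.search(line)
            if found:
                current["label"] = found.group(1)
    return [(e["seq"], e["kind"], e["label"]) for e in entries]


for entry in summarize(SAMPLE):
    print(entry)
```

If the last alarm listed for the SNAnode is the Major one although the
Recovery Complete event was logged after Recovery Initiated, the log
shows the problem described in this note.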
5662.1. "Tracing? What logicals?" by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Thu Oct 14 1993 19:58

    Forgot in the previous note:
    
    Can logging be enabled for the SNA-AM and SNA_SERVER process
    so that we can see the exact time-stamp when the events are received?
    
    What logical names have to be defined?
    
           /Nils