Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

104.0. "mcc endurance problem" by HOULE1::HOULE (Steve, NM is the future!) Mon Apr 23 1990 17:22

Hi,

I've been using mcc and have come across the following problem.
Here's the scenario:

  I start up MCC interactively in a window and enable a set of rules (50+).
  Then I verify that they are enabled.
  Next I leave mcc running for an extended period (half a day or more).
  After this, when I issue "mcc sho alarm rule * state", mcc shows two
   alarms and "hangs": output stops and the process remains active.
   This has happened to me twice.
  In addition, until this process is stopped, another mcc image cannot display
   an alarm rule (see the error it gets below).

My guess is a problem with the alarms module.    
If this is a bug then someone needs to submit a QAR.
Below is some information. If more is required just ask.
===Steve   if you email me use:  DINSCO::

ERROR when second mcc fails:

DECmcc (T1.0.0)

MCC> sho mcc alarms rule gsfdr1-syn3
Using default ALL IDENTIFIERS
%MCC-I-CANCEL, Cancel
%MCC-I-CANCEL, Cancel
%MCC-E-KERNINITF, MCC initialization failure
-MCC-E-ALERT_TERMREQ, Thread termination requested
%MCC-E-NOTFOUND,  Dispatch entry for the specified entity does not exist



-------------------
ALARM RULE 

MCC> sho mcc alarms rule gsfdr1_syn3 all attributes
MCC ALARMS RULE GSFDR1_SYN3
ALL ATTRIBUTES
AT 23-APR-1990 16:16:20
                                   NAME = GSFDR1_SYN3
                                  State = Disabled
                               Category = "ROUTER CIRCUIT"
                            Description = "MCC GSFDR1 circuit SYN-3 "
                             Expression = "(change_of ( node4 GSFDR1 circuit syn
                                          -3 state, *,*  ) ,   at every=01:00:00
                                          )"
                              Parameter = "@mcc$common:act_mail.dis"
                              Procedure = DUA1:[MCC]ACT_MAIL_ALARM.COM;1
                                  Queue = "mcc$batch"

----------------------------------
PROCESS STATS:
                             Process _TWA9:                     16:04:21



    State               LEF                 Working set             5894

    Cur/base priority   4/4                 Virtual pages          28753

    Current PC          7FFEE44C            CPU time     000:00:11:40.71

    Current PSL         02800004            Direct I/O              1114

    Current user SP     7FF233D0            Buffered I/O           18198

    PID                 208000B0            Page faults            33204

    UIC                 [SYSTEM]            Event flags         40000001
                                                                C0000000


    HOULE2$DUA0:[SYS0.SYSCOMMON.][SYSEXE]MCC$MAIN.EXE;2
                                                                           
====================
104.1. "More info please..." by TOOK::PLOUFFE (Jerry) Tue Apr 24 1990 16:58 (37 lines)

RE: .0

Steve:

  I'm going to need more info to solve this one.  

  First of all, when the output of "mcc sho alarm rule * state" hung, would
  MCC respond to a Ctrl Y or a Ctrl C?

  When you suspect that the problem has occurred, will any MCC command work or
  does just the SHOW RULE * STATE command fail?

  Is this problem reproducible with those same 50+ rules?  Does it take 1/2 day
  for this condition to appear or will this problem occur after 30 minutes?
  1 hour?

  Could you provide a listing of these rules for us to examine?

  Could you provide us with a listing of the error log file for the day that
  these problems occurred?  The file is located in MCC$COMMON, file name =
  MCC$ALARMS_<date>_ERROR.LOG.

  I realize that some of the questions above are hard to answer, but any 
  additional information would be helpful.

  I have QAR'ed this problem in the MCC_DEV NACQAR database as QAR #765.
  I have also included this answer in the QAR.  If you can, please track this
  problem in the QAR system.  If you can't, let me know.  We'll work something
  out.

  Thanks...

                                                                    - Jerry  
 
                                                     Alarms Project Leader

104.2. "I goofed..." by TOOK::PLOUFFE (Jerry) Tue Apr 24 1990 17:21 (13 lines)

RE: .1

  I goofed, sorry.  The QAR should have been filed to the MCC_INT database
  not MCC_DEV.  

  I have closed #765 in MCC_DEV and copied the QAR to #88 in the MCC_INT
  database.

  Sorry for any confusion...

                                                                      - Jerry


104.3. "a little more info for now" by GDJUNK::HOULE (Steve, NM is the future!) Mon Apr 30 1990 10:16 (81 lines)

Jerry,

I know what it's like to work on problems with too little info. Usually I like to
troubleshoot myself before posting, but unfortunately my "MCC time" is limited.
Here's what little more info I have. I'll try recreating the problem and then
send you more.

1. Yes, Ctrl Y will terminate MCC when it's in this state (I don't remember the
    output).
2. I haven't tried any command other than 'sho rule' (both times that's what
    I wanted).
3. Reproducible? I'll try. Both times the problem showed up the "next day", not
    after 30 minutes.
4. Here's one rule. They are all identical:

type acidr3_syn1.create
 create mcc alarms rule  ACIDR3_SYN1  -
 expression = (change_of ( node4 ACIDR3 circuit syn-1 state, *,*  ) , -
 at every=01:00:00), -
 procedure = MCC$COMMON:ACT_MAIL_ALARM.COM, -
 parameter = "@mcc$common:act_mail.dis", -
 category  = "ROUTER CIRCUIT", -
 description = "MCC ACIDR3 circuit SYN-1 ", -
 QUEUE = "mcc$batch"

5. Error logs. I'm not sure which days the problem occurred, so I've included
   a few logs. These are the complete logs.

ty/page MCC$ALARMS_22-APR-1990_ERROR.LOG;

 >>> 22-APR-1990 08:26:52.00   MCC ALARMS RULE WDCDR1_SYN1
     Expression = (change_of ( node4 WDCDR1 circuit syn-1 state, *,*  ) ,   at every=01:00:00)
     Status     = %MCC-S-SPECIALIZED_EXC, Success with xM-specific exception reply
MCC ALARMS RULE WDCDR1_SYN1
AT 22-APR-1990 08:26:52


Temporary problems occurred trying to retrieve the data required to evaluate the
 expression.
                         Show Exception = 0, T1.0.0,
                                          %MCC-S-SPECIALIZED_EXC, Success with x
                                          M-specific exception reply,
                                          Show,
                                          Node4 WDCDR1 Circuit SYN-1 ,
                                          Status,
                                          Node4 WDCDR1 Circuit SYN-1 ,
                                          Node not currently accessible.,
                                          PersistenceClass = Temporary Problem

ty MCC$ALARMS_19-APR-1990_ERROR.LOG;
 >>> 19-APR-1990 14:56:04.97   MCC ALARMS RULE GSFDR1_SYN3
     Expression = (change_of ( node4 GSFDR1 circuit syn-3 state, *,*  ) ,   at every=01:00:00)
     Status     = %MCC-S-SPECIALIZED_EXC, Success with xM-specific exception reply
MCC ALARMS RULE GSFDR1_SYN3
AT 19-APR-1990 14:56:04


Permanent problems occurred trying to retrieve the data required to evaluate the
 expression.
                         Show Exception = 0, T1.0.0,
                                          %MCC-S-SPECIALIZED_EXC, Success with x
                                          M-specific exception reply,
                                          Show,
                                          Node4 GSFDR1 Circuit SYN-3 ,
                                          Status,
                                          Node4 GSFDR1 Circuit SYN-3 ,
                                          Node does not exist or is not known to
                                           local Node.

ty MCC$ALARMS_18-APR-1990_ERROR.LOG;
 >>> 18-APR-1990 21:12:04.15   MCC ALARMS RULE WR3DR3_SYN1
     Expression = (change_of ( node4 WR3DR3 circuit syn-1 state, *,*  ) ,   at every=01:00:00)
     Status     = %MCC-S-SPECIALIZED_EXC, Success with xM-specific exception reply
MCC ALARMS RULE WR3DR3_SYN1
AT 18-APR-1990 21:12:04


That's all for now.
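
When several days of MCC$ALARMS_<date>_ERROR.LOG files have to be compared, it can help to reduce each one to a list of which rule logged an exception and when. The following is a minimal Python sketch, not part of DECmcc. It assumes the entry layout seen in the logs above: a ">>>" header line carrying the date, time, and rule name, optionally followed by a "Temporary/Permanent problems occurred" line. The function name summarize and the abbreviated SAMPLE text are hypothetical, introduced only for illustration.

```python
import re

# Header line of each log entry, for example:
#  >>> 22-APR-1990 08:26:52.00   MCC ALARMS RULE WDCDR1_SYN1
ENTRY = re.compile(
    r"^\s*>>>\s+(\d{1,2}-[A-Z]{3}-\d{4})\s+(\d{2}:\d{2}:\d{2})\.\d+"
    r"\s+MCC ALARMS RULE\s+(\S+)"
)
# Line that classifies the failure, when the entry has one.
PERSISTENCE = re.compile(r"^(Temporary|Permanent) problems occurred")


def summarize(log_text):
    """Return one (date, time, rule, persistence) tuple per log entry.

    persistence is "Temporary", "Permanent", or None when the entry
    carries no such line (assumed layout, see the lead-in).
    """
    entries = []
    for line in log_text.splitlines():
        header = ENTRY.match(line)
        if header:
            entries.append([header.group(1), header.group(2),
                            header.group(3), None])
            continue
        kind = PERSISTENCE.match(line)
        # Attach the classification to the most recent entry only once.
        if kind and entries and entries[-1][3] is None:
            entries[-1][3] = kind.group(1)
    return [tuple(e) for e in entries]


# Abbreviated sample built from the header lines quoted in this note.
SAMPLE = """\
 >>> 22-APR-1990 08:26:52.00   MCC ALARMS RULE WDCDR1_SYN1
Temporary problems occurred trying to retrieve the data required to evaluate the
 expression.
 >>> 19-APR-1990 14:56:04.97   MCC ALARMS RULE GSFDR1_SYN3
Permanent problems occurred trying to retrieve the data required to evaluate the
 expression.
 >>> 18-APR-1990 21:12:04.15   MCC ALARMS RULE WR3DR3_SYN1
"""

if __name__ == "__main__":
    for entry in summarize(SAMPLE):
        print(entry)
```

Sorting the resulting tuples by date and time would show whether any entries cluster around a particular hour, which is one way to check a time-related suspicion against the logs.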

104.4. "more info, problem still exists" by GDJUNK::HOULE (Steve, NM is the future!) Wed May 09 1990 11:31 (37 lines)

I've collected a little more info.
I think the problem has something to do with time: midnight.
I ran my test from 9 am and last checked it at 8:30pm (I do go home).
All was fine during this time. However, in the morning it was "LOCKED".

Additional Info:

a. Only requesting alarm information causes the hang. I was able to do a "sho
  node4 xxxx all attrib" successfully. Again, when I issued "sho mcc alarms rule
  * state" the process hung.

b. Process info at the time of the hang (nothing changes):

sho proc/cont                           Process _TWA18:                    08:54:50



    State               LEF                 Working set             8192

    Cur/base priority   4/4                 Virtual pages          31768

    Current PC          80133E3C            CPU time     000:00:15:23.97

    Current PSL         02800000            Direct I/O              3820

    Current user SP     7FF23330            Buffered I/O           20073

    PID                 20600106            Page faults            29668

    UIC                 [SYSTEM]            Event flags         C0000001
                                                      
    HOULE2$DUA0:[SYS0.SYSCOMMON.][SYSEXE]MCC$MAIN.EXE;2
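
The "nothing changes" observation can be checked mechanically by saving two copies of the display some time apart and comparing the counters. Below is a minimal Python sketch under stated assumptions: it assumes the two-column label/value layout shown above, with columns separated by at least two spaces. The names parse_snapshot, changed_counters, and COUNTERS are hypothetical, and the "later" snapshot in the usage lines is invented purely to show what a change would look like.

```python
import re

# Counters that should keep moving in a process that is doing work.
COUNTERS = ("CPU time", "Direct I/O", "Buffered I/O", "Page faults")


def parse_snapshot(text):
    """Parse a two-column process display into a {label: value} dict."""
    fields = {}
    for line in text.splitlines():
        # Columns are assumed to be separated by two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        # Data rows hold one or two label/value pairs; skip anything else
        # (blank lines, the image name, continuation lines).
        if len(parts) in (2, 4):
            for label, value in zip(parts[0::2], parts[1::2]):
                fields[label] = value
    return fields


def changed_counters(before, after):
    """Return {counter: (old, new)} for counters that differ."""
    old, new = parse_snapshot(before), parse_snapshot(after)
    return {name: (old.get(name), new.get(name))
            for name in COUNTERS
            if old.get(name) != new.get(name)}


# Values copied from the display above.
SNAP = """\
    State               LEF                 Working set             8192
    Cur/base priority   4/4                 Virtual pages          31768
    Current PC          80133E3C            CPU time     000:00:15:23.97
    Current PSL         02800000            Direct I/O              3820
    Current user SP     7FF23330            Buffered I/O           20073
    PID                 20600106            Page faults            29668
"""

if __name__ == "__main__":
    # Identical snapshots: an empty result means no counter moved.
    print(changed_counters(SNAP, SNAP))
    # Hypothetical later snapshot with one counter advanced.
    print(changed_counters(SNAP, SNAP.replace("20073", "20100")))
```

An empty result across snapshots taken minutes apart, together with the LEF (local event flag wait) state, is consistent with a process that is waiting rather than looping.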

Someday, if I get time, I will try two more tests:
 1. I'll start the test at night and check it the next morning.
 2. I'll reset the clock so midnight occurs while I'm at work.
===Steve

104.5. "Need your help to re-test" by TOOK::PLOUFFE (Jerry) Sat Jun 30 1990 12:13 (21 lines)

Steve:

  We have not seen a similar situation to the one that you are experiencing.

  Some of the problems may have been caused by memory loss/clobber problems.
  We have fixed many of these problems in the EFT update kit.  I'm sure that
  some remain, but the situation is much better now than in the IFT or EFT kits.

  Could I ask you to please re-run this test with the EFT update kit (T1.0.1)
  and let us know what happens?

  Thanks in advance...

                                                                       - Jerry

  P.S.  This answer was also posted to the NACQAR system -- QAR #88 of the 
        MCC_INT database.  Since I believe that you now have an account, 
        from this point on, please only use the NACQAR system to pursue this
        problem.  This will help me to avoid the tedious job of duplicating 
        the paperwork.  Thanks in advance...