
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

2336.0. "EVL/MCC_DNA4_EVL dyingpartner" by SNOC01::MISNETWORK (They call me LAT) Thu Feb 13 1992 02:05

    A customer has suddenly come down with a nasty bout of the "alarms keep
    dying" illness. The alarms are using event logging, and for some reason
    EVL and MCC_DNA4_EVL die. I cannot see an obvious reason for the
    problem: all patches have been installed, and the alarms have been
    running fine for some time. All that has changed is that some extra
    exporting is being done. The logs are below -
    
    $ ty sys$manager:evl.log
    
    $!
    $! This command procedure is always run when anybody on the entire system
    $! logs in. It is equivalent to LOGIN.COM except that the instructions
    $! contained herein are executed everytime anyone on the VMS system
    $! logs in to their account.
    $!
    $! For interactive processes, turn on Control T, and set the terminal type
    $!
    $ IF (F$MODE() .EQS. "INTERACTIVE") THEN SET CONTROL=T
    $ IF (F$MODE() .EQS. "INTERACTIVE") THEN SET TERMINAL/INQUIRE
    $!
    $! For MicroVAX systems only, use the command MOUNT/NOASSIST.
    $!
    $ IF (.NOT. F$TRNLNM("SYS$MICROVAX")) THEN GOTO SKIP_MICROVAX_COMMANDS
    $SKIP_MICROVAX_COMMANDS:
    $!
    $! Place your site-specific LOGIN commands below
    $!
    $ E*DIT == "EDIT/TPU"
    $ DNSCP :== $DNS$CONTROL        !Foreign command to run DNS utility
    
    ay-91
    $ exit
    $ !  Copyright (c) 1987 Digital Equipment Corporation.  All rights reserved.
    $ SET NOON
    $ IF "" .NES. "" THEN EVL$COMMAND
    $ RUN SYS$SYSTEM:EVL
    %EVL-E-OPENMON, error creating logical link to monitor process NETMAN::"TASK=MCC_DNA4_EVL"
    -SYSTEM-F-INVLOGIN, login information invalid at remote node
    %EVL-E-WRITEMON, error writing event record to monitor process MCC_DNA4_EVL
    -SYSTEM-F-FILNOTACC, file not accessed on channel
    
    
    
    
    $ ty mcc_dna*.log
    
    
    DISK$USER:[NETMANAGER]MCC_DNA4_EVL.LOG;105
    
    
    $!
    $! This command procedure is always run when anybody on the entire system
    $! logs in. It is equivalent to LOGIN.COM except that the instructions
    $! contained herein are executed everytime anyone on the VMS system
    $! logs in to their account.
    $!
    $! For interactive processes, turn on Control T, and set the terminal type
    $!
    $ IF (F$MODE() .EQS. "INTERACTIVE") THEN SET CONTROL=T
    $ IF (F$MODE() .EQS. "INTERACTIVE") THEN SET TERMINAL/INQUIRE
    $!
    $! For MicroVAX systems only, use the command MOUNT/NOASSIST.
    $!
    $ IF (.NOT. F$TRNLNM("SYS$MICROVAX")) THEN GOTO SKIP_MICROVAX_COMMANDS
    $SKIP_MICROVAX_COMMANDS:
    $!
    $! Place your site-specific LOGIN commands below
    $!
    $ E*DIT == "EDIT/TPU"
    $ exit
    $ define/system sys$print cenp04
    %DCL-I-SUPERSEDE, previous value of SYS$PRINT has been superseded
    $ define alarmsnet disk$user:[netmanager.mcc.alarms]
    $ define/system/exec mcc_maps disk$user:[netmanager.australia]
    %DCL-I-SUPERSEDE, previous value of MCC_MAPS has been superseded
    $ alarmsnet :== set def alarmsnet
    $ manage/enter/presen=mcc_dna4_evl
    Network object MCC_DNA4_EVL is declared, Status = 52854793
    
    Waiting for the event message from EVL.....
    The connection with EVL is established.
    ** Unable to connect to NMCC  **
    Ready to read the next event message...
    Ready to read the next event message...
    Ready to read the next event message...
    Ready to read the next event message...
    .
    .
    .
    .
    Failed to receive an event from EVL, status = 8420
    %SYSTEM-F-LINKABORT, network partner aborted logical link
      NETMANAGER   job terminated at 13-FEB-1992 12:58:50.13
    
      Accounting information:
      Buffered I/O count:            1068         Peak working set size:    2378
    
    
    Any clues as to what caused this would be much appreciated.
    Cheers,
    Louis

2336.1. "EVL is culprit" by ICS::WOODCOCK Sat Feb 15 1992 21:33, 33 lines
    
>    DISK$USER:[NETMANAGER]MCC_DNA4_EVL.LOG;105
>    
>    Failed to receive an event from EVL, status = 8420
>    %SYSTEM-F-LINKABORT, network partner aborted logical link
>      NETMANAGER   job terminated at 13-FEB-1992 12:58:50.13
    
Most likely EVL has died on you. This can happen in the normal course of
running EVL. In the past users never noticed its death because it gets
restarted automatically. Unfortunately, because MCC creates a link to it,
when EVL dies so does MCC_DNA4_EVL. It's usually stress on the system that
causes EVL to go belly up. We usually see it when EVL is slammed from a
node with a circuit bouncing 1-2 times a second, but it has also died
with less event activity. The first thing to do is reduce
the number of events going to the MCC system. That is, if you're not alarming
on it, don't sink it!!! If you are only alarming on 4.7 then only sink 4.7. If
you would like to see other events for other reasons, I'd send them to a
system other than the one running MCC.
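
For illustration only, restricting what a source node sinks to the MCC system
might look like the sketch below, run on each source node. MCCSYS is a
placeholder node name, not one from this thread, and the exact NCP syntax
should be checked against the NCP documentation for your DECnet version. As
written it touches only the volatile database.

```
$! Sketch only: MCCSYS is a placeholder for the node running MCC.
$! Stop sinking everything to the MCC node, then sink just event 4.7.
$ RUN SYS$SYSTEM:NCP
CLEAR LOGGING MONITOR KNOWN EVENTS SINK NODE MCCSYS
SET LOGGING MONITOR EVENTS 4.7 SINK NODE MCCSYS
$ EXIT
```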

Also for V1.1 I believe I modified MCC_DNA4_EVL.COM. It's buried in this
conference somewhere (check DIR/author=woodcock) if you want to try it. Please
be aware of support issues if you do try it. There isn't much of a change in
it: it simply checks the value of the $status symbol on error, and if it equals
the value produced by LINKABORT it loops back up and restarts MCC_DNA4_EVL. Thru
time it has shown not to work 100%, but without it I would have given up on
events and gone back to polling due to this problem.
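
A minimal sketch of that kind of restart loop follows. It is not the actual
modified MCC_DNA4_EVL.COM. It assumes the MANAGE command shown in the log in
.0 exits with the LINKABORT status (8420 in that log). The label, the symbol
names and the 30-second delay are made up for illustration.

```
$! Hypothetical restart wrapper; an illustration only.
$ SET NOON                              ! do not abort the procedure on error
$ LINKABORT = 8420                      ! status logged with %SYSTEM-F-LINKABORT
$RESTART:
$ MANAGE/ENTER/PRESEN=MCC_DNA4_EVL      ! same command as in the log in .0
$ SAVED_STATUS = $STATUS
$! Mask off the high control bits before comparing the condition value
$ CODE = F$INTEGER(SAVED_STATUS) .AND. %X0FFFFFFF
$ IF CODE .NE. LINKABORT THEN EXIT 'SAVED_STATUS'
$ WAIT 00:00:30                         ! assumed delay to let EVL come back
$ GOTO RESTART
```

Any status other than LINKABORT ends the procedure, so a genuine startup
failure is not retried forever.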

I suggested that this process needs to be a great deal more robust, but I
don't know whether time was found in the 1.2 kits to give MCC_DNA4_EVL any
mechanism for handling EVL failure.

regards,
brad...

2336.2. "Have done so !" by SNOC01::MISNETWORK (MCC=My Constant Confusion) Mon Mar 23 1992 22:24, 8 lines
    I forgot all about this note. Thanks for the info Brad, I am already
    using your code. Your comments about receiving too many events seem to
    be the most likely cause of the problem. Will have to try and figure
    out whether we can reduce events coming in, but I am pretty sure we
    have already done the best we can.
    
    Cheers,
    Louis

2336.3. "1.2 much better" by ICS::WOODCOCK Tue Mar 24 1992 08:41, 11 lines
Hi Louis,

The good news is in V1.2. I have brought x1.2.15 into production as best I
can given its limitations in 'alarms'. One noteworthy item is the MCC_DNA4_EVL
process. The developers were definitely listening thru V1.1, because this
process now appears to handle EVL dropping in and out quite well so far. Hats
off to those who put in the effort.

cheers,
brad...