Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

2686.0. "vitalink AM and alarms" by COMICS::MISTRY () Fri Apr 03 1992 09:19

Hi,

Have a problem with the vitalink access module V1.1 (BMS 1.1). Customer has set
up an alarm against the forwarding and backup databases for some translans, 
looking for a change of state. Even though the databases haven't changed the
alarm fires and he gets the following error messages :

Cannot communicate with target 
Communication target has been interrupted.

Now he tells me that the databases haven't changed.

Does anyone have any ideas?

Bipin.

2686.1. "can mean lots of things." by TOOK::MCPHERSON (Save a tree: kill an ISO working group.) Sat Apr 04 1992 15:33 (35 lines)
>Have a problem with the vitalink access module V1.1 (BMS 1.1). Customer has set
>up an alarm against the forwarding and backup databases for some translans, 
>looking for a change of state. Even though the databases haven't changed the
>alarm fires and he gets the following error messages :
>
>Cannot communicate with target 
>Communication target has been interrupted.
>
>Now he tells me that the databases haven't changed.

    First, what is a 'backup database' ? 

    Second: Please post the alarm rule that your customer is using.

    The alarm, per se, isn't firing; the *Exception* is.

    The error you're seeing means that at some point in the dialogue
    between DECmcc and the TransLAN in question, communication was
    disrupted.   It may have been due to any number of things...
	the TransLAN rebooted, 
	someone cut the wire, 
	etc.

    It may have simply been that the TransLAN got asked to forward a bunch
    of packets while it was trying to service the AM management request and
    it just dropped the management request on the floor (Yes, they _will_
    do that.) and the AM decided that 'Communications with the target have
    been interrupted'...

    Also, forwarding databases can be *large* and the AM may have to
    'churn' for a while to do whatever evaluation it is that the alarm is
    asking for..

    regards,  
    /doug

2686.2. "more information" by COMICS::MISTRY () Tue Apr 07 1992 05:18 (17 lines)
Hi,


I am finding out how large the forwarding database is. The problem is more
or less reproducible at will: he enables the alarm and almost immediately he 
gets this message back. 

The alarm expression is something like IF line changes from forward to backup then
fire alarm; however, the line state hasn't changed (these are the sync links).

Now I believe there was a problem with translans where mcc couldn't perform loops
against the synchronous link (ie problem with the translan code). Is this still
the case? I believe this may have something to do with it.

Bipin

Rest of the info you requested I am getting.

2686.3. "" by TOOK::MCPHERSON (Save a tree: kill an ISO working group.) Tue Apr 07 1992 09:36 (43 lines)
>
>I am finding out how large the forwarding database is. The problem is more
>or less reproducible at will: he enables the alarm and almost immediately he 
>gets this message back. 
>
>The alarm expression is something like IF line changes from forward to backup
>then fire alarm; however, the line state hasn't changed (these are the sync
>links).
>

    The "Forwarding Database" is the database containing all of the
    physical and multicast entries that the Translan knows about; it has
    nothing to do with whether a line is forwarding, preforwarding_1,
    preforwarding_2, broken, etc.   What you're talking about sounds like a
    CHANGE_OF rule on LINE <line#> MODULE STATE (a status attribute).

>
>Now I believe there was a problem with translans where mcc couldn't perform
>loops against the synchronous link (ie problem with the translan code). Is this
>still the case? I believe this may have something to do with it.
>

    There is no "LOOP" command from the Translan AM; the RBMS agent on the
    Translan doesn't support that operation, so you'll need to have a
    connection to the management port on the Translan and do circuit LOOP
    testing from the REC menu.   Or are you referring to the fact that
    Vitalink recently dropped support of the Station AM's "TEST" command
    for remote Translans? 

>Rest of the info you requested I am getting.

    Also, it would be very helpful to understand:

        1) rev levels of the Translan Software (if you already
           mentioned it, forgive me.)
        2) what is the topological relationship between the mgmt station and
           the Translans (esp the one you're trying to create the alarm rule on)
        3) How heavily loaded the Translans are (traffic-wise)
        4) Speed / utilization of the links between the Translans


    regards ,
    doug

2686.4. "more info as requested" by COMICS::MISTRY () Tue Apr 07 1992 11:35 (26 lines)
Sorry for the incomplete picture, my understanding of translans isn't that 
great.

>        1) rev levels of the Translan Software (if you already
>           mentioned it, forgive me.)
>        2) what is the topological relationship between the mgmt station and
>           the Translans (esp the one you're trying to create the alarm rule on)
>        3) How heavily loaded the Translans are (traffic-wise)
>        4) Speed / utilization of the links between the Translans


The translans that he has are 300's and 320's. The 300's are 6.10.1 and the 320's
20.2.13. 

The topology is a MCC station connected to a single port repeater which in
turn is connected to a delni and then on to the main backbone. The backbone then
has the translans connected.

I'm trying to find out the utilization and the throughput of the lines. 

The alarm rule expression is line2 module state <on> forwarding, at every 30 
seconds.

Hope that is more helpful.

Bipin.

2686.5. "Getting warmer." by MCDOUG::MCPHERSON (Save a tree: kill an ISO working group.) Tue Apr 07 1992 15:13 (59 lines)
>
>The translans that he has are 300's and 320's. The 300's are 6.10.1 and the 320's
>20.2.13. 
>

    There is no such thing as a TransLAN 300 (as far as I know); I assume
    that you mean TransLAN III?

    There are several known bugs in release 6.10.1 of the TransLAN software
    related to the VBAM; you may have uncovered some new ones.   The
    recommended minimum rev levels of the TransLAN are:
        6.10.2 
        10.4.2 
        20.2.2
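
    When checking installed revs against these minimums, the fields have to
    be compared as numbers: 20.2.13 is newer than 20.2.2 even though it
    sorts lower as text. A minimal Python sketch, assuming (purely for
    illustration) that the leading number of a rev picks which of the three
    minimums applies; the function names are made up for this example:

```python
# Recommended minimum TransLAN revisions from the list above.
# ASSUMPTION: the leading number of a rev (6, 10, 20) selects the minimum.
MIN_REV = {6: (6, 10, 2), 10: (10, 4, 2), 20: (20, 2, 2)}


def parse_rev(text):
    """Turn '20.2.13' into (20, 2, 13) so each field compares as a number."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(text):
    """True if the rev is at or above the recommended minimum for its family."""
    rev = parse_rev(text)
    minimum = MIN_REV.get(rev[0])
    if minimum is None:
        raise ValueError(f"no recommended minimum known for release {rev[0]}")
    return rev >= minimum


# The two revs reported earlier in this topic.
for rev in ("6.10.1", "20.2.13"):
    print(rev, "ok" if meets_minimum(rev) else "below recommended minimum")
```

    Tuples compare element by element, which is what makes 20.2.13 come out
    above 20.2.2 here.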

>
>The topology is a MCC station connected to a single port repeater which in turn
>is connected to a delni and then on to the main backbone. The backbone then has
>the translans connected.
>

    Something like this?

    [MCC]
      |
   [DESPR]
      |
      |||||||| 
     [ DELNI  ]
         |
---------+------+----------------------------
                |
           [ translan A ]   
              /       \
             /         \
            /           \
 [ Translan B ]         [ Translan C ]    
       |                      |
   ----+---               ----+---




    What happens when you do the following: 

        MCC> SHOW TRANSLAN A line * all status
        MCC> SHOW TRANSLAN B line * all status
        MCC> SHOW TRANSLAN C line * all status

        
>
>The alarm rule expression is line2 module state <on> forwarding, at every 30 
>seconds.
>

    I get the gist, but can you reply here with the *exact* text of the
    rule that was created?  

    /doug

2686.6. "red hot..." by COMICS::MISTRY () Wed Apr 08 1992 08:04 (27 lines)
Hi,

Yep, topology is correct. Right, what I got him to do was to activate the alarm and
leave it running; the alarm fired (ie icon went red) after about an hour with 
exactly the same exception, cannot communicate with target. He then did a 

mcc> show translan xxxx line * all status. 

No problems, mcc managed to get all the line info and the state hadn't changed,
it was still in backup.

Alarm syntax is translan xxxxx line2 module state <>, forwarding or, backup at
every 30 seconds.

Now I know the translan is probably going to put an mcc request at the bottom of
its list of things to do if it is busy. However, if he can then do a show 
immediately after then it can't be that busy. 

From what I can determine, if it is a busy translan problem then there is no way 
round this, because if mcc cannot communicate with the target then it won't know
what state the line is in (ie forwarding or backup) and so is going to report a 
problem with the bridge and hence the line.

Any ideas?

Bipin.

2686.7. "Please post the verbatim output." by TOOK::MCPHERSON (Save a tree: kill an ISO working group.) Wed Apr 08 1992 10:58 (12 lines)
Please post the EXACT, VERBATIM output of DECmcc for the things I asked for. 
This means an *actual* log of the session using the commands I asked for.
This also means the output of a SHOW MCC 0 ALARM RULE <rule-name> ALL
ATTRIBUTES.

It may be just a problem with the alarm syntax, but I will never be able to
tell unless I can see the EXACT output from DECmcc.

Also, is the alarm being created/enabled interactively or from a batch
procedure?

/doug

2686.8. "sorry for lack of info" by COMICS::MISTRY () Wed Apr 08 1992 12:18 (7 lines)
doug,

I'm getting the rest of the info as requested ie log of all the output. The
alarms are enabled interactively via a command file.

Bipin.

2686.9. "Supplementary Info....." by KERNEL::MACLEAN (Sandie Maclean,Networks & Comms) Tue Apr 14 1992 13:18 (120 lines)
Hi Doug!!!,

With respect to Bipin's problem, here are the details that you have
been looking for. Get yourself a coffee & a comfy seat!! (In the last few
days he has also been seeing "requested operation cannot be completed,
MCC-E-TRANSMITERR, Error trying to transmit a packet, SYST-F-DEVREQERR Device
Request Error". The 3100 had Send Fails flagged against its DESVA, but
nothing against the physical card itself. Errors included 'Remote failure
to defer' and also 'excessive collisions', and since his MCC node is tapped
onto the net via a DESPR onto a DELNI onto an H4005 (without heartbeat) onto
the thickwire segment, it looks as if there may be a heavily loaded network
or a physical problem.) Hope this is all the info you needed.

Regards,
Sandie Maclean,Networks & Comm.s

-----------------------------------------------------------------------------
******** The following are EXTRACTS from his MCC_TRANSLAN_ALARMS.COM. They
are subsequently enabled by his ENABLE_TRANSLAN_ALARMS.COM (by e.g. Enable MCC
0 Alarms Rule alarmrulename, in Domain .UK). I can enable them with
MCC>@xxx.com, but while dialled in today, trying to Sh mcc 0 alarm
translan_king-issues etc. came up with no such alarm. The .com enabled them
quite happily, though, and a rerun of the enable_translan_alarms.com showed
they were already operating, as it failed with a 'duplicate' error. He says he
has applied the alarms patch, but his mcc_alarms_instance_mir.dat &
_attribute_mir.dat have old dates and the .dat_old_v1_1 MIR files aren't there.
Unfortunately the customer went home early so I couldn't pursue this further
today!! (This is why the command file extract is included rather than a
SHO MCC ALARM RULE * ALL ATTR.)
-----------------------------------------------------------------------------
!  F O R W A R D I N G    B R I D G E S
!
!  Translan 320 Kingsgate - Issues
!
Create MCC 0 ALARMS RULE TRANSLAN_KINGS-ISSUES  -
  Category           = "Translan", -
  Description        = "Translan .Translan.KINGS-ISSUES Line 2 not forwarding", -
  Expression         = (TRANSLAN .Translan.KINGS-ISSUES line 2 module state <> forwarding, at  every=00:00:30), -
  Procedure          = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_ALARM.COM;, -
  Exception Handler  = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;, -
  Parameter          = "Network_S", -
  Queue              = "sys$batch", -
  Perceived Severity = critical, -
  in domain = LBPLC:.UK
!

!
Create MCC 0 ALARMS RULE TRANSLAN_KINGS-COURT  -
  Category           = "Translan", -
  Description        = "Translan .Translan.KINGS-COURT Line 2 not forwarding", -
  Expression         = (TRANSLAN .Translan.KINGS-COURT line 2 module state <> forwarding, at  every=00:00:30), -
  Procedure          = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_ALARM.COM;, -
  Exception Handler  = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;, -
  Parameter          = "Network_S", -
  Queue              = "sys$batch", -
  Perceived Severity = critical, -
  in domain = LBPLC:.UK


!
!  B A C K U P    A L A R M S
!
! Translan III   KINGSGATE - YEOMAN ROAD
!
Create MCC 0 ALARMS RULE TRANSLAN_KINGS-YEOM  -
  Category           = "Translan", -
  Description        = "Translan .Translan.KINGS-YEOM Line 2 not in backup", -
  Expression         = (TRANSLAN .Translan.KINGS-YEOM line 2 module state <> Backup, at  every=00:00:30), -
  Procedure          = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_ALARM.COM;, -
  Exception Handler  = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;, -
  Parameter          = "Network_S", -
  Queue              = "sys$batch", -
  Perceived Severity = critical, -
  in domain = LBPLC:.UK


------------------------------------------------------------------------------
**Additional Information regarding the Exception Firing **********************
------------------------------------------------------------------------------
$ @SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;1 "MCC 0 ALARMS RULE TRANSLAN_KINGS-ISSUES"-          !rulename
                      "Translan .Translan.KINGS-ISSUES Line 2 not forwarding"-          !category
                      "Translan"-          !description
                      "(TRANSLAN .Translan.KINGS-ISSUES line 2 module state <> forwarding, at  every=00:00:30)"-          !expression
                      "14-APR-1992 12:04:11.87"-          !time
                      "Cannot communicate with target"-          !dtcrtf or error
                      "Network_S"-         !notification params
                      "SYS$SCRATCH:MCC_ALARMS_DATA_12041187.DAT"    !file that contains more info about the rule


$ dir mcc*alarm*.dat

Directory SYS$COMMON:[MCC]

MCC_ALARMS_ATTRIBUTE_MIR.DAT;1          MCC_ALARMS_DATA_15440132.DAT;1
MCC_ALARMS_INSTANCE_MIR.DAT;1

Total of 3 files.
$ ty MCC_ALARMS_DATA_15440132.DAT;1
RULE: MCC 0 ALARMS RULE TRANSLAN_YEOM-KINGS
DESCRIPTION: Translan .Translan.YEOM-KINGS Line 2 not forwarding
CATEGORY: Translan
EXPRESSION: (TRANSLAN .Translan.YEOM-KINGS line 2 module state <> forwarding, at  every=00:00:30)
TIME: 9-APR-1992 15:44:01.32
EVIDENCE: Communication with the target has been interrupted
PARAMETER: Network_S
SEVERITY: critical
DOMAIN: Domain LBPLC:.UK


    
2686.10. "I suspect the network configuration..." by TOOK::MCPHERSON (Save a tree: kill an ISO working group.) Tue Apr 14 1992 13:59 (53 lines)
Howdy,

    Thanks for the details, Sandie!

>In the last few
>days he has also been seeing "requested operation cannot be completed,
>MCC-E-TRANSMITERR, Error trying to transmit a packet, SYST-F-DEVREQERR Device
>Request Error". The 3100 had Send Fails flagged against its DESVA, but
>nothing against the physical card itself. Errors included 'Remote failure
>to defer' and also 'excessive collisions', and since his MCC node is tapped
>onto the net via a DESPR onto a DELNI onto an H4005 (without heartbeat) onto
>the thickwire segment, it looks as if there may be a heavily loaded network
>or a physical problem.) Hope this is all the info you needed.

    This doesn't sound too good, & it's starting to smell like a network
    configuration problem.

    Do your comments about SVA-0 mean that you were looking at the NCP LINE
    counters for SVA-0 on the MCC system?  If yes then ok, but if not, then
    please look at those counters for anomalies.
    
    Is it possible to remove the DESPR from the link & go straight to the
    DELNI?  Or better yet, bypass the DELNI altogether?   Sounds like you
    probably have some ethernet topology problems lurking in there
    somewhere....

    Does the customer (or do we) have a handle on what's really going on on the
    network? I.e. we need to be able to tell if (as you suggest) the network
    is _really_ that busy or we're just seeing the manifestation of a
    configuration snafu...  (Any LTM or even Sniffer data for the segment?)

    Next, all the rules you posted appear to be polling at 30 second
    intervals... Can you stretch it out to at least 00:01:00 intervals? 

    Also can you stagger the rules when they're enabled so they don't all try
    to poll at the same time?  (maybe with an "AT START = +00:00:23" or some
    prime number of seconds...)
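
    To make the staggering idea concrete, here is a rough Python sketch that
    works out one start offset per rule and checks that no two rules land on
    the same second of the polling period. It only does the arithmetic; the
    function names are made up for this example, and how the offsets get fed
    back into the rule definitions is left open.

```python
def fmt_offset(seconds):
    """Format a number of seconds as +hh:mm:ss."""
    return f"+{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def staggered_offsets(rule_names, step_seconds=23):
    """Give each rule a start offset that is a multiple of step_seconds."""
    return {name: i * step_seconds for i, name in enumerate(rule_names)}


def polls_collide(offsets, period_seconds):
    """True if two rules would poll at the same second within each period."""
    phases = [offset % period_seconds for offset in offsets.values()]
    return len(set(phases)) != len(phases)


# Rule names taken from the command file extracts posted in .9
rules = ["TRANSLAN_KINGS-ISSUES", "TRANSLAN_KINGS-COURT", "TRANSLAN_KINGS-YEOM"]
offsets = staggered_offsets(rules)
for name, offset in offsets.items():
    print(name, fmt_offset(offset))
print("collide at a 60 second period:", polls_collide(offsets, 60))
```

    A step that shares no factor with the period (23 against 60) keeps the
    polls apart, whereas a 30 second step against a 60 second period would
    put the first and third rules back on the same second.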

    I forget: did anyone say what the speed  & utilization of the serial link
    between the local and remote translan is?   If it's low-speed/heavily
    congested OR the Translan is busy coping with an extremely busy or
    misconfigured LAN on its ethernet port, then you'll see those kinds of
    problems.

    I realize that I keep pounding with questions, but there appears to be a
    fundamental question of network configuration/stability that is still
    nagging.   Given the current facts, I seriously doubt that the problem is
    with the Translan AM (except for the fact that it is currently the only
    module that's really "complaining" about things... ;^) )

    /doug


2686.11. "more info" by COMICS::MISTRY () Wed Apr 15 1992 10:26 (16 lines)
Hi,

Right, the translan is connected to the delni and not to the main backbone. I've
asked the customer to move the despr (not sure if he is able to do this) and 
connect the mcc station directly to the delni. The translan link is 64k, using
all the bandwidth.

As for the utilization of the network, he is not sure but is going to try and
find out.

Regarding moving the polling period and skewing the alarms, not sure whether he
is willing to do this (not a very helpful customer).


Bipin.

2686.12. "" by VCSESU::WADE (Bill Wade, VAXc Systems & Support Eng) Wed Apr 15 1992 10:27 (13 lines)
    
    Hi Sandie,
    
    The remote failure to defer errors indicate a network config problem. 
    In some way the (E-)LAN has gone beyond its length limitations and a
    remote node that should have sensed carrier on the wire, and deferred,
    did not.
    
    Although, I haven't an answer to why a SHOW is successful and an alarm poll 
    is not...
                                       
    bill                             
    
2686.13. "Thanks, Bill." by IMDOWN::SYSTEM (Save a tree: kill an ISO working group.) Wed Apr 15 1992 10:45 (18 lines)
Thanks for jumping in, Bill.   My network troubleshooting skills have gotten
kinda 'arthritic' since I haven't been in that mode for a while...

At this point, I would check for 'over-cascaded' DELNIs and >2 repeater hops
off the backbone.

>   Although, I haven't an answer to why a SHOW is successful and an alarm poll 
>   is not...

Did we actually confirm that the user *could* consistently perform an
*interactive* SHOW TRANSLAN <name> LINE * ALL STATUS ?  Every 30 seconds?    
This is what the Alarms FM is asking the AM to do.   If we can do this from
the command line (consistently) then I, too, am mystified...
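
One way to put "consistently" to the test is to repeat the same request at
the alarm's own interval and count how often it fails. A minimal Python
sketch of such a harness follows. The probe function is a hypothetical
stand-in for whatever issues the interactive SHOW; nothing here talks to
DECmcc, and the demo uses a made-up probe that fails on every third call.

```python
import time


def repeat_probe(probe, attempts, interval_seconds, sleep=time.sleep):
    """Call probe() `attempts` times, `interval_seconds` apart.

    Returns a list of (attempt number, error text) for the calls that raised.
    """
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            probe()
        except Exception as exc:
            failures.append((attempt, str(exc)))
        if attempt < attempts:
            sleep(interval_seconds)
    return failures


# Demo only: a fake probe that fails on every third call, with no real waiting.
calls = {"n": 0}


def flaky_probe():
    calls["n"] += 1
    if calls["n"] % 3 == 0:
        raise RuntimeError("Cannot communicate with target")


failures = repeat_probe(flaky_probe, attempts=6, interval_seconds=30,
                        sleep=lambda seconds: None)
print(f"{len(failures)} of 6 attempts failed:", failures)
```

A single successful SHOW right after an exception says little; a run of a
few dozen attempts at the real 30 second spacing would show whether the
interactive path drops requests at a similar rate to the alarm poll.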

/doug 

(Ah, that's doug *mcpherson* who just realized he's answering from a SYSTEM
account..)

2686.14. "check network config" by COMICS::MISTRY () Wed Apr 15 1992 12:27 (11 lines)
Doug,

The Show translan ..... status has only been tried once, straight after the 
alarm exception occurred. However, what I may have to do is to get someone on
site to check out the exact layout of the ethernet and network configuration.

The reason the other modules are not experiencing a problem is because he is
only monitoring the translans.

Bipin

2686.15. "VMS version ?" by STKMCC::LUND (Niklas Lund) Wed Apr 15 1992 12:45 (6 lines)
Hi

What version of VMS is your customer running on the MCC Station ?
Have they upgraded lately ?

/Niklas