T.R | Title | User | Personal Name | Date | Lines |
---|
2686.1 | can mean lots of things. | TOOK::MCPHERSON | Save a tree: kill an ISO working group. | Sat Apr 04 1992 15:33 | 35 |
| >Have a problem with the vitalink access module V1.1 (BMS 1.1). Customer has set
>up an alarm against the forwarding and backup databases for some translans,
>looking for a change of state. Even though the databases haven't changed the
>alarm fires and he gets the following error messages :
>
>Cannot communicate with target
>Communication target has been interrupted.
>
>Now he tells me that the databases haven't changed.
First, what is a 'backup database' ?
Second: Please post the alarm rule that your customer is using.
The alarm, per se, isn't firing; the *Exception* is.
The error you're seeing means that at some point in the dialogue
between DECmcc and the TransLAN in question, communication was
disrupted. It may have been due to any number of things...
the TransLAN rebooted,
someone cut the wire,
etc.
It may have simply been that the TransLAN got asked to forward a bunch
of packets while it was trying to service the AM management request and
it just dropped the management request on the floor (Yes, they _will_
do that.) and the AM decided that 'Communications with the target have
been interrupted'...
Also, forwarding databases can be *large* and the AM may have to
'churn' for a while to do whatever evaluation it is that the alarm is
asking for..
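To make the alarm-versus-exception distinction concrete, here is a minimal Python sketch. It is only a model of the idea, not how DECmcc Alarms is implemented: the names TargetUnreachable, evaluate_rule and unreachable are hypothetical. A poll that returns a state can fire the alarm or not; a poll that never gets an answer can only raise the exception.

```python
class TargetUnreachable(Exception):
    """Hypothetical stand-in for 'Cannot communicate with target'."""


def evaluate_rule(read_state, expected_state):
    """Evaluate one poll of a 'state <> expected_state' style rule.

    read_state is any callable that returns the current line state, or
    raises TargetUnreachable when the management request gets no answer.
    """
    try:
        state = read_state()
    except TargetUnreachable as exc:
        # No state was obtained, so the rule cannot be judged true or false.
        return ("EXCEPTION", str(exc))
    if state != expected_state:
        # The rule expression is true: the alarm itself fires.
        return ("ALARM", state)
    return ("OK", state)


def unreachable():
    """Simulate a poll that is dropped by a busy bridge."""
    raise TargetUnreachable("Cannot communicate with target")


print(evaluate_rule(lambda: "forwarding", "forwarding"))
print(evaluate_rule(lambda: "backup", "forwarding"))
print(evaluate_rule(unreachable, "forwarding"))
```

The third call is the case in this note: the state never changed, but the poll itself failed, so the exception handler runs rather than the alarm procedure.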
regards,
/doug
|
2686.2 | more information | COMICS::MISTRY | | Tue Apr 07 1992 05:18 | 17 |
| Hi,
I am finding out how large the forwarding database is. The problem is more
or less reproducible at will: he enables the alarm and almost immediately he
gets this message back.
The alarm expression is something like IF line change from forward to backup then
fire alarm, however the line state hasn't changed (these are the sync links).
Now I believe there was a problem with translans where mcc couldn't perform loops
against the synchronous link (i.e. a problem with the translan code). Is this still
the case? I believe this may have something to do with it.
Bipin
I am getting the rest of the info you requested.
|
2686.3 | | TOOK::MCPHERSON | Save a tree: kill an ISO working group. | Tue Apr 07 1992 09:36 | 43 |
| >
>I am finding out how large the forwarding database is. The problem is more
>or less reproducible at will: he enables the alarm and almost immediately he
>gets this message back.
>
>The alarm expression is something like IF line change from forward to backup
>then fire alarm, however the line state hasn't changed (these are the sync
>links).
>
The "Forwarding Database" is the database containing all of the
physical and multicast entries that the Translan knows about; it has
nothing to do with whether a line is forwarding, preforwarding_1,
preforwarding_2, broken, etc. What you're talking about sounds like a
CHANGE_OF rule on LINE <line#> MODULE STATE (a status attribute).
>
>Now I believe there was a problem with translans where mcc couldn't perform
>loops against the synchronous link (i.e. a problem with the translan code). Is
>this still the case? I believe this may have something to do with it.
>
There is no "LOOP" command from the Translan AM; the RBMS agent on the
Translan doesn't support that operation, so you'll need to have a
connection to the management port on the Translan and do circuit LOOP
testing from the REC menu. Or are you referring to the fact that
Vitalink recently dropped support of the Station AM's "TEST" command
for remote Translans?
>I am getting the rest of the info you requested.
Also, it would be very helpful to understand:
1) rev levels of the Translan Software (if you already
mentioned it, forgive me.)
2) what is the topological relationship between the mgmt station and
the Translans (esp the one you're trying to create the alarm rule on)
3) How heavily loaded the Translans are (traffic-wise)
4) Speed / utilization of the links between the Translans
regards ,
doug
|
2686.4 | more info as requested | COMICS::MISTRY | | Tue Apr 07 1992 11:35 | 26 |
| Sorry for the incomplete picture; my understanding of translans isn't that
great.
> 1) rev levels of the Translan Software (if you already
>    mentioned it, forgive me.)
> 2) what is the topological relationship between the mgmt station and
>    the Translans (esp the one you're trying to create the alarm rule on)
> 3) How heavily loaded the Translans are (traffic-wise)
> 4) Speed / utilization of the links between the Translans
The translans that he has are 300s and 320s. The 300s are at 6.10.1 and the 320s
at 20.2.13.
The topology is a MCC station connected to a single port repeater which in
turn is connected to a delni and then on to the main backbone. The backbone then
has the translans connected.
I'm trying to find out the utilization and the throughput of the lines.
The alarm rule expression is line2 module state <on> forwarding, at every 30
seconds.
Hope that is more helpful.
Bipin.
|
2686.5 | Getting warmer. | MCDOUG::MCPHERSON | Save a tree: kill an ISO working group. | Tue Apr 07 1992 15:13 | 59 |
| >
>The translans that he has are 300s and 320s. The 300s are at 6.10.1 and the 320s
>at 20.2.13.
>
There is no such thing as a TransLAN 300 (as far as I know); I assume
that you mean TransLAN III?
There are several known bugs in release 6.10.1 of the TransLAN software
related to the VBAM; you may have uncovered some new ones. The
recommended minimum rev levels of the TransLAN are:
6.10.2
10.4.2
20.2.2
>
>The topology is a MCC station connected to a single port repeater which in turn
>is connected to a delni and then on to the main backbone. The backbone then has
>the translans connected.
>
Something like this?
                 [MCC]
                   |
                [DESPR]
                   |
               ||||||||
               [ DELNI ]
                   |
   ---------+------+----------------------------
            |
     [ translan A ]
         /     \
        /       \
       /         \
[ Translan B ] [ Translan C ]
      |               |
  ----+---        ----+---
What happens when you do the following:
MCC> SHOW TRANSLAN A line * all status
MCC> SHOW TRANSLAN B line * all status
MCC> SHOW TRANSLAN C line * all status
>
>The alarm rule expression is line2 module state <on> forwarding, at every 30
>seconds.
>
I get the gist, but can you reply here with the *exact* text of the
rule that was created?
/doug
|
2686.6 | red hot... | COMICS::MISTRY | | Wed Apr 08 1992 08:04 | 27 |
| Hi,
Yep, topology is correct. Right, what I got him to do was to activate the alarm and
leave it running, the alarm fired (ie icon went red) after about an hour with
exactly the same exception, cannot communicate with target. He then did a
mcc> show translan xxxx line * all status.
No problems, mcc managed to get all the line info and the state hadn't changed,
it was still in backup.
Alarm syntax is translan xxxxx line 2 module state <> forwarding (or backup), at
every 30 seconds.
Now I know the translan is probably going to put an mcc request at the bottom of
its list of things to do if it is busy. However, if he can then do a show
immediately after then it can't be that busy.
From what I can determine, if it is a busy translan problem then there is no way
round this, because if mcc cannot communicate with the target then it won't know
what state the line is in (i.e. forwarding or backup) and so is going to report a
problem with the bridge and hence the line.
Any ideas?
Bipin.
|
2686.7 | Please post the verbatim output. | TOOK::MCPHERSON | Save a tree: kill an ISO working group. | Wed Apr 08 1992 10:58 | 12 |
| Please post the EXACT, VERBATIM output of DECmcc for the things I asked for.
This means an *actual* log of the session using the commands I asked for.
This also means the output of a SHOW MCC 0 ALARM RULE <rule-name> ALL
ATTRIBUTES.
It may be just a problem with the alarm syntax, but I will never be able to
tell unless I can see the EXACT output from DECmcc.
Also, Is the alarm being created/enabled interactively or from a batch
procedure?
/doug
|
2686.8 | sorry for lack of info | COMICS::MISTRY | | Wed Apr 08 1992 12:18 | 7 |
| doug,
I'm getting the rest of the info as requested ie log of all the output. The
alarms are enabled interactively via a command file.
Bipin.
|
2686.9 | Supplementary Info..... | KERNEL::MACLEAN | Sandie Maclean,Networks & Comms | Tue Apr 14 1992 13:18 | 120 |
|
Hi Doug!!!,
With respect to Bipin's problem, here are the details that you have
been looking for. Get yourself a coffee & a comfy seat!! (In the last few
days he has also been seeing "requested operation cannot be completed,
MCC-E-TRANSMITERR, Error trying to transmit a packet, SYST-F-DEVREQERR Device
Request Error". The 3100 had Send Fails flagged against its DESVA, but
nothing against the physical card itself. Errors included 'Remote failure
to defer' and also 'excessive collisions', and since his MCC node is tapped
onto the net via a DESPR onto a DELNI onto an H4005 (without heartbeat) onto
the thickwire segment, it looks as if there may be a heavily loaded network
or a physical problem.) Hope this is all the info you needed.
Regards,
Sandie Maclean,Networks & Comm.s
-----------------------------------------------------------------------------
******** The following are EXTRACTS from his MCC_TRANSLAN_ALARMS.COM. They
are subsequently enabled by his ENABLE_TRANSLAN_ALARMS.COM (by e.g. Enable MCC
0 Alarms Rule alarmrulename, in Domain .UK). I can enable them with MCC>@xxx.com,
but while dialled in today, trying to Sh mcc 0 alarm translan_king-issues etc.
came up with no such alarm; yet the .com enabled them quite happily, and a
rerun of the enable_translan_alarms.com showed they were already operating, as
it failed with a 'duplicate' error. He says he has applied the alarms patch,
but his mcc_alarms_instance_mir.dat & _attribute_mir.dat have old dates and
the .dat_old_v1_1 MIR files aren't there. Unfortunately the customer went home
early so I couldn't pursue this further today!! (This is why the command
file extract is included rather than a SHO MCC ALARM RULE * ALL ATTR.)
-----------------------------------------------------------------------------
! F O R W A R D I N G B R I D G E S
!
! Translan 320 Kingsgate - Issues
!
Create MCC 0 ALARMS RULE TRANSLAN_KINGS-ISSUES -
Category = "Translan", -
Description = "Translan .Translan.KINGS-ISSUES Line 2 not forwarding", -
Expression = (TRANSLAN .Translan.KINGS-ISSUES line 2 module state <> forwarding, at every=00:00:30), -
Procedure = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_ALARM.COM;, -
Exception Handler = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;, -
Parameter = "Network_S", -
Queue = "sys$batch", -
Perceived Severity = critical, -
in domain = LBPLC:.UK
!
!
Create MCC 0 ALARMS RULE TRANSLAN_KINGS-COURT -
Category = "Translan", -
Description = "Translan .Translan.KINGS-COURT Line 2 not forwarding", -
Expression = (TRANSLAN .Translan.KINGS-COURT line 2 module state <> forwarding, at every=00:00:30), -
Procedure = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_ALARM.COM;, -
Exception Handler = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;, -
Parameter = "Network_S", -
Queue = "sys$batch", -
Perceived Severity = critical, -
in domain = LBPLC:.UK
!
! B A C K U P A L A R M S
!
! Translan III KINGSGATE - YEOMAN ROAD
!
Create MCC 0 ALARMS RULE TRANSLAN_KINGS-YEOM -
Category = "Translan", -
Description = "Translan .Translan.KINGS-YEOM Line 2 not in backup", -
Expression = (TRANSLAN .Translan.KINGS-YEOM line 2 module state <> Backup, at every=00:00:30), -
Procedure = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_ALARM.COM;, -
Exception Handler = SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;, -
Parameter = "Network_S", -
Queue = "sys$batch", -
Perceived Severity = critical, -
in domain = LBPLC:.UK
------------------------------------------------------------------------------
**Additional Information regarding the Exception Firing **********************
------------------------------------------------------------------------------
$ @SYS$COMMON:[MCC]MCC_ALARMS_MAIL_EXCEPTION.COM;1 "MCC 0 ALARMS RULE TRANSLAN_KINGS-ISSUES"- !rulename
"Translan .Translan.KINGS-ISSUES Line 2 not forwarding"- !category
"Translan"- !description
"(TRANSLAN .Translan.KINGS-ISSUES line 2 module state <> forwarding, at every=00:00:30)"- !expression
"14-APR-1992 12:04:11.87"- !time
"Cannot communicate with target"- !dtcrtf or error
"Network_S"- !notification params
"SYS$SCRATCH:MCC_ALARMS_DATA_12041187.DAT" !file that contains more info about the rule
$ dir mcc*alarm*.dat
Directory SYS$COMMON:[MCC]
MCC_ALARMS_ATTRIBUTE_MIR.DAT;1 MCC_ALARMS_DATA_15440132.DAT;1
MCC_ALARMS_INSTANCE_MIR.DAT;1
Total of 3 files.
$ ty MCC_ALARMS_DATA_15440132.DAT;1
RULE: MCC 0 ALARMS RULE TRANSLAN_YEOM-KINGS
DESCRIPTION: Translan .Translan.YEOM-KINGS Line 2 not forwarding
CATEGORY: Translan
EXPRESSION: (TRANSLAN .Translan.YEOM-KINGS line 2 module state <> forwarding, at every=00:00:30)
TIME: 9-APR-1992 15:44:01.32
EVIDENCE: Communication with the target has been interrupted
PARAMETER: Network_S
SEVERITY: critical
DOMAIN: Domain LBPLC:.UK
|
2686.10 | I suspect the network configuration... | TOOK::MCPHERSON | Save a tree: kill an ISO working group. | Tue Apr 14 1992 13:59 | 53 |
| Howdy,
Thanks for the details, Sandie!
>In the last few
>days he has also been seeing "requested operation cannot be completed,
>MCC-E-TRANSMITERR, Error trying to transmit a packet, SYST-F-DEVREQERR Device
>Request Error". The 3100 had Send Fails flagged against its DESVA, but
>nothing against the physical card itself. Errors included 'Remote failure
>to defer' and also 'excessive collisions', and since his MCC node is tapped
>onto the net via a DESPR onto a DELNI onto an H4005 (without heartbeat) onto
>the thickwire segment, it looks as if there may be a heavily loaded network
>or a physical problem.) Hope this is all the info you needed.
This doesn't sound too good, and it's starting to smell like a network
configuration problem.
Do your comments about SVA-0 mean that you were looking at the NCP LINE
counters for SVA-0 on the MCC system? If yes then ok, but if not, then
please look at those counters for anomalies.
Is it possible to remove the DESPR from the link & go straight to the
DELNI? Or better yet, bypass the DELNI altogether? Sounds like you
probably have some ethernet topology problems lurking in there
somewhere....
Does the customer (or do we) have a handle on what's really going on on the
network? I.e. we need to be able to tell if (as you suggest) the network is
_really_ that busy or we're just seeing the manifestation of a
configuration snafu... (Any LTM or even Sniffer data for the segment?)
Next, all the rules you posted appear to be polling at 30 second
intervals... Can you stretch it out to at least 00:01:00 intervals?
Also can you stagger the rules when they're enabled so they don't all try
to poll at the same time? (maybe with an "AT START = +00:00:23" or some
prime number of seconds...)
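To illustrate the staggering idea, here is a small Python sketch (plain arithmetic, not DECmcc syntax) that gives each rule a different start delay and lists when its first polls would fall. The rule names come from the command file extract in .9; the 23-second step and the one-minute interval are just the values floated above, and the function names are made up for the example.

```python
from datetime import timedelta


def staggered_offsets(rule_names, step_seconds=23):
    """Give each rule a start delay that is a multiple of step_seconds."""
    return {
        name: timedelta(seconds=index * step_seconds)
        for index, name in enumerate(rule_names, start=1)
    }


def poll_times(offset, interval, count):
    """Seconds after enabling at which the first `count` polls occur."""
    return [int((offset + n * interval).total_seconds()) for n in range(count)]


rules = ["TRANSLAN_KINGS-ISSUES", "TRANSLAN_KINGS-COURT", "TRANSLAN_KINGS-YEOM"]
interval = timedelta(minutes=1)
offsets = staggered_offsets(rules)

for name in rules:
    print(name, poll_times(offsets[name], interval, 3))
```

Because 23 does not divide evenly into 60, the polls of these three rules never land on the same second, which is the point of picking an awkward step.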
I forget: did anyone say what the speed & utilization of the serial link
between the local and remote translan is? If it's low-speed/heavily
congested OR the Translan is busy coping with an extremely busy or
misconfigured LAN on its ethernet port, then you'll see those kinds of
problems.
I realize that I keep pounding with questions, but there appears to be a
fundamental question of network configuration/stability that is still
nagging. Given the current facts, I seriously doubt that the problem is
with the Translan AM (except for the fact that it is currently the only
module that's really "complaining" about things... ;^) )
/doug
|
2686.11 | more info | COMICS::MISTRY | | Wed Apr 15 1992 10:26 | 16 |
| Hi,
Right, the translan is connected to the delni and not to the main backbone. I've
asked the customer to move the despr (not sure if he is able to do this) and
connect the mcc station directly to the delni. The translan link is 64k, using
all the bandwidth.
He is not sure of the utilization of the network, but is going to try to find
out.
Regarding lengthening the polling period and skewing the alarms, I'm not sure
whether he is willing to do this (not a very helpful customer).
|
2686.12 | | VCSESU::WADE | Bill Wade, VAXc Systems & Support Eng | Wed Apr 15 1992 10:27 | 13 |
|
Hi Sandie,
The remote failure to defer errors indicate a network config problem.
In some way the (E-)LAN has gone beyond its length limitations and a
remote node that should have sensed carrier on the wire, and deferred,
did not.
Although, I haven't an answer to why a SHOW is successful and an alarm poll
is not...
bill
|
2686.13 | Thanks, Bill. | IMDOWN::SYSTEM | Save a tree: kill an ISO working group. | Wed Apr 15 1992 10:45 | 18 |
| Thanks for jumping in, Bill. My network troubleshooting skills have gotten
kinda 'arthritic' since I haven't been in that mode for a while...
At this point, I would check for 'over-cascaded' DELNIs and >2 repeater hops
off the backbone.
> Although, I haven't an answer to why a SHOW is successful and an alarm poll
> is not...
Did we actually confirm that the user *could* consistently perform an
*interactive* SHOW TRANSLAN <name> LINE * ALL STATUS ? Every 30 seconds?
This is what the Alarms FM is asking the AM to do. If we can do this from
the command line (consistently) then I, too, am mystified...
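One way to run that consistency test is sketched below in Python. It is only an outline under stated assumptions: check stands for whatever actually issues the interactive SHOW and raises an error on "Cannot communicate with target", and repeat_check, failure_rate and flaky_show are hypothetical names. The demo uses a fake check and a no-op sleep so it finishes immediately.

```python
import time


def repeat_check(check, interval_seconds=30, attempts=10, sleep=time.sleep):
    """Call check() `attempts` times, pausing interval_seconds between calls.

    Returns a list of (attempt_number, ok, detail) tuples.
    """
    results = []
    for attempt in range(1, attempts + 1):
        try:
            detail = check()
            ok = True
        except Exception as exc:
            ok, detail = False, str(exc)
        results.append((attempt, ok, detail))
        if attempt < attempts:
            sleep(interval_seconds)
    return results


def failure_rate(results):
    """Fraction of attempts that failed."""
    return sum(1 for _, ok, _ in results if not ok) / len(results)


# Demo with a fake check that fails on every third call.
calls = {"count": 0}


def flaky_show():
    calls["count"] += 1
    if calls["count"] % 3 == 0:
        raise RuntimeError("Cannot communicate with target")
    return "line 2 module state = backup"


demo = repeat_check(flaky_show, attempts=6, sleep=lambda seconds: None)
for attempt, ok, detail in demo:
    print(attempt, "ok" if ok else "FAILED", detail)
print("failure rate:", failure_rate(demo))
```

If the interactive polls fail at roughly the same rate as the alarm exceptions, the problem is the path to the bridge rather than the rule.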
/doug
(Ah, that's doug *mcpherson* who just realized he's answering from a SYSTEM
account..)
|
2686.14 | check network config | COMICS::MISTRY | | Wed Apr 15 1992 12:27 | 11 |
| Doug,
The Show translan ..... status has only been tried once, straight after the
alarm exception occurred. However, what I may have to do is to get someone on
site to check out the exact layout of the ethernet and network configuration.
The reason the other modules are not experiencing a problem is because he is
only monitoring the translans.
Bipin
|
2686.15 | VMS version ? | STKMCC::LUND | Niklas Lund | Wed Apr 15 1992 12:45 | 6 |
| Hi
What version of VMS is your customer running on the MCC Station ?
Have they upgraded lately ?
/Niklas
|