Conference 7.286::fddi

Title: FDDI - The Next Generation
Moderator: NETCAD::STEFANI
Created: Thu Apr 27 1989
Last Modified: Thu Jun 05 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 2259
Total number of notes: 8590

2220.0. "Problem with FDDI communication in VMS cluster environment(?)" by BRADEC::BALOGH (Gabriel Balogh @BRC) Thu Feb 13 1997 10:17

Hello!

      I have a problem similar to the one described in topic 1923.0. The
error messages are the same, but some status values are different, and
every time the problem occurs I get only 2 messages.

      There are no control timeout messages (after reboot), but
there are

                     "FATAL ERRORS DETECTED BY DATALINK"
	
      After a reboot there are immediately 2 errors on PEA0, and then 1
or 2 times a week we get the same errors on the running system. In this
case all network connections are broken and are restarted within a few
minutes.

      System description: 2x OpenVMS in a cluster, AS2100 with DEFPA.

      If this is the same problem as described in 1923.0, please
give me a pointer to the patch, because topic 1923 ends with a CLD
request and I could not find a patch.
	
      ALPLAN02_062 includes sys$fwdriver for V6.1, and patch
ALPLAN03_062 is on hold (?).

					Thanks!

						Gabriel 
Gabriel Balogh @ BRC



 V M S                SYSTEM ERROR REPORT         COMPILED 13-FEB-1997 11:01:47
                                                                      PAGE   1.

 ******************************* ENTRY     133. *******************************
 ERROR SEQUENCE 1.                               LOGGED ON:  CPU_TYPE 00000005
 DATE/TIME 29-JAN-1997 17:46:20.50                            SYS_TYPE 00000009
 SYSTEM UPTIME: 0 DAYS 00:00:17
 SCS NODE: BCPUPP                                           OpenVMS AXP V6.2

 HW_MODEL: 0000045F Hardware Model = 1119.

 DEVICE ATTENTION AlphaServer 2100 5/250

 NI-SCS SUB-SYSTEM, BCPUPP$PEA0:

       FATAL ERROR DETECTED BY DATALINK

       STATUS          8A2DF200
                       00001201
       DATALINK UNIT       0001
       DATALINK NAME   41574603
                       00000000
                       00000000
                       00000000
                                       DATALINK NAME = FWA1:
       REMOTE NODE     00000000
                       00000000
                       00000000
                       00000000
       REMOTE ADDR     00000000
                           0000
       LOCAL ADDR      000400AA
                           0401
                                       ETHERNET ADDR = 0E-01-01-00-00-00
       ERROR CNT           0001
                                       1. ERROR OCCURRENCES THIS ENTRY
       UCB$L_ERRCNT    00000001
                                       1. ERRORS THIS UNIT

 V M S                SYSTEM ERROR REPORT         COMPILED 13-FEB-1997 11:01:47
                                                                      PAGE   2.

 ******************************* ENTRY     134. *******************************
 ERROR SEQUENCE 2.                               LOGGED ON:  CPU_TYPE 00000005
 DATE/TIME 29-JAN-1997 17:46:21.36                            SYS_TYPE 00000009
 SYSTEM UPTIME: 0 DAYS 00:00:18
 SCS NODE: BCPUPP                                           OpenVMS AXP V6.2

 HW_MODEL: 0000045F Hardware Model = 1119.

 DEVICE ATTENTION AlphaServer 2100 5/250

 NI-SCS SUB-SYSTEM, BCPUPP$PEA0:

       FATAL ERROR DETECTED BY DATALINK

       STATUS          00000400
                       00001200
       DATALINK UNIT       0001
       DATALINK NAME   41574603
                       00000000
                       00000000
                       00000000
                                       DATALINK NAME = FWA1:
       REMOTE NODE     00000000
                       00000000
                       00000000
                       00000000
       REMOTE ADDR     00000000
                           0000
       LOCAL ADDR      000400AA
                           0401
                                       ETHERNET ADDR = 0E-01-01-00-00-00
       ERROR CNT           0001
                                       1. ERROR OCCURRENCES THIS ENTRY
       UCB$L_ERRCNT    00000002
                                       2. ERRORS THIS UNIT

 V M S                SYSTEM ERROR REPORT         COMPILED 13-FEB-1997 11:01:47
                                                                      PAGE   3.

 ******************************* ENTRY     146. *******************************
 ERROR SEQUENCE 3114.                            LOGGED ON:  CPU_TYPE 00000005
 DATE/TIME 30-JAN-1997 13:37:29.84                            SYS_TYPE 00000009
 SYSTEM UPTIME: 0 DAYS 19:51:23
 SCS NODE: BCPUPP                                           OpenVMS AXP V6.2

 HW_MODEL: 0000045F Hardware Model = 1119.

 ERL$LOGMESSAGE AlphaServer 2100 5/250

 NI-SCS SUB-SYSTEM, _BCPUPP$PEA0:

       PORT HAS CLOSED VIRTUAL CIRCUIT

       LOCAL STATION ADDRESS, FFFFFFFFFF00(X)
       LOCAL SYSTEM ID, 000000000401(X)

       REMOTE STATION ADDRESS, 0000000000DE(X)
       REMOTE SYSTEM ID, 000000000402(X)

       UCB$L_ERTCNT    00000032
                                       50. RETRIES REMAINING
       UCB$L_ERTMAX    00000032
                                       50. RETRIES ALLOWABLE
       UCB$L_ERRCNT    00000003
                                       3. ERRORS THIS UNIT
       PPD$B_PORT            00
                                       REMOTE NODE # 0.
       PPD$B_STATUS          00
       PPD$B_OPC             00
                                       UNKNOWN OPCODE
       PPD$B_FLAGS           00

 V M S                SYSTEM ERROR REPORT         COMPILED 13-FEB-1997 11:01:47
                                                                      PAGE   4.

 ******************************* ENTRY     148. *******************************
 ERROR SEQUENCE 3116.                            LOGGED ON:  CPU_TYPE 00000005
 DATE/TIME 30-JAN-1997 13:38:28.29                            SYS_TYPE 00000009
 SYSTEM UPTIME: 0 DAYS 19:52:21
 SCS NODE: BCPUPP                                           OpenVMS AXP V6.2

 HW_MODEL: 0000045F Hardware Model = 1119.

 DEVICE ATTENTION AlphaServer 2100 5/250

 NI-SCS SUB-SYSTEM, BCPUPP$PEA0:

       FATAL ERROR DETECTED BY DATALINK

       STATUS          0000045C
                       00001201
       DATALINK UNIT       0001
       DATALINK NAME   41574603
                       00000000
                       00000000
                       00000000
                                       DATALINK NAME = FWA1:
       REMOTE NODE     00000000
                       00000000
                       00000000
                       00000000
       REMOTE ADDR     00000000
                           0000
       LOCAL ADDR      000400AA
                           0401
                                       ETHERNET ADDR = 0E-01-01-00-00-00
       ERROR CNT           0001
                                       1. ERROR OCCURRENCES THIS ENTRY
       UCB$L_ERRCNT    00000004
                                       4. ERRORS THIS UNIT

 V M S                SYSTEM ERROR REPORT         COMPILED 13-FEB-1997 11:01:47
                                                                      PAGE   5.

 ******************************* ENTRY     149. *******************************
 ERROR SEQUENCE 3117.                            LOGGED ON:  CPU_TYPE 00000005
 DATE/TIME 30-JAN-1997 13:38:32.30                            SYS_TYPE 00000009
 SYSTEM UPTIME: 0 DAYS 19:52:25
 SCS NODE: BCPUPP                                           OpenVMS AXP V6.2

 HW_MODEL: 0000045F Hardware Model = 1119.

 DEVICE ATTENTION AlphaServer 2100 5/250

 NI-SCS SUB-SYSTEM, BCPUPP$PEA0:

       FATAL ERROR DETECTED BY DATALINK

       STATUS          00000400
                       00001200
       DATALINK UNIT       0001
       DATALINK NAME   41574603
                       00000000
                       00000000
                       00000000
                                       DATALINK NAME = FWA1:
       REMOTE NODE     00000000
                       00000000
                       00000000
                       00000000
       REMOTE ADDR     00000000
                           0000
       LOCAL ADDR      000400AA
                           0401
                                       ETHERNET ADDR = 0E-01-01-00-00-00
       ERROR CNT           0001
                                       1. ERROR OCCURRENCES THIS ENTRY
       UCB$L_ERRCNT    00000005
                                       5. ERRORS THIS UNIT
ANAL ERRLOG.SYS/ERR/FULL/SINCE=15-JAN-1997
00:00:00.00/INCL=PEA0/OUT=V6.TXT/ENTR=(START:133,END:149)
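
The raw hex fields in the entries above can be read by hand. As a minimal sketch, assuming the longwords and words are stored little-endian (low byte first) and that DATALINK NAME is a counted ASCII string, the following Python decodes the values copied from entry 133. The function names are hypothetical, and treating the word part of LOCAL ADDR as a DECnet Phase IV address (area in the top 6 bits, node in the low 10 bits) is also an assumption, not something the report states.

```python
import struct


def decode_counted_name(longwords):
    """Decode a counted ASCII string held in little-endian longwords (assumed layout)."""
    raw = b"".join(struct.pack("<I", lw) for lw in longwords)
    count = raw[0]                      # first byte is the character count
    return raw[1:1 + count].decode("ascii")


def decode_lan_address(longword, word):
    """Return the six address bytes of a longword + word pair, low byte first."""
    raw = struct.pack("<IH", longword, word)
    return "-".join(f"{b:02X}" for b in raw)


def decnet_phase4(word):
    """Split a 16-bit DECnet Phase IV address into (area, node)."""
    return word >> 10, word & 0x3FF


# Values copied from entry 133 in this note.
name = decode_counted_name([0x41574603, 0x00000000, 0x00000000, 0x00000000])
addr = decode_lan_address(0x000400AA, 0x0401)
area, node = decnet_phase4(0x0401)

print(name)              # device name without the unit number
print(addr)              # LOCAL ADDR as six bytes
print(f"{area}.{node}")  # DECnet-style area.node
```

Under these assumptions the name decodes to FWA, which agrees with the translated DATALINK NAME = FWA1: for unit 0001, and the address word 0401 agrees with the LOCAL SYSTEM ID shown in entry 146.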

T.R  Title  User  Personal Name  Date  Lines

2220.1. by STAR::STOCKDALE Fri Feb 14 1997 06:40 (6 lines)
What version of VMS?

Note that the error log entries are meaningless.  Do a SHOW LAN/ERROR
in SDA to find the device error information.

- Dick

2220.2. "re .1" by BRADEC::BALOGH (Gabriel Balogh @BRC) Fri Feb 14 1997 09:31 (18 lines)
Hello Dick!

	1) The version of VMS is V6.2.
	2) I got the output from SHOW LAN/ERR by fax; for this reason I put
	   here only the part with errors:

Fatal error count  2			Last error CSR 00000400
Fatal error code   3-XmtTimeout		Last fatal error 14-feb 13:09:04
.
.
Transmit timeouts   2
					Last UUB time 14-feb 13:57:03

At that moment they got the messages described in .0 (entries 146, 148, 149).

			Thanks!

				Gabriel

2220.3. by STAR::STOCKDALE Fri Feb 14 1997 14:03 (11 lines)
Normally, transmit timeouts occur when the link goes unavailable and there
are outstanding transmits issued to the device.  The driver times them out
by declaring a fatal error which results in the error log entries.

Most likely there is a ring problem and the link goes away for a while.  You
might see these sorts of errors on multiple systems at the same time, which
would be a strong indication that the problem is not related to the system
and DEFPA itself.  I'd try swapping the cable used on the DEFPA and/or the
port in the concentrator.

- Dick
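
The mechanism described in .3 can be pictured with a toy model. This is only a sketch under stated assumptions, not how SYS$FWDRIVER is actually written: the class name, the 3-second timeout value, and the choice to drop all queued transmits when the fatal error is declared are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLanDriver:
    """Toy model of transmit timeouts; not the real driver logic."""
    timeout: float = 3.0          # hypothetical timeout, in seconds
    link_up: bool = True
    pending: list = field(default_factory=list)  # issue times of outstanding transmits
    transmit_timeouts: int = 0
    fatal_errors: int = 0

    def transmit(self, now):
        """Issue a transmit; it only stays outstanding while the link is down."""
        if not self.link_up:
            self.pending.append(now)

    def tick(self, now):
        """Periodic check: time out stale transmits by declaring a fatal error."""
        if any(now - issued >= self.timeout for issued in self.pending):
            self.transmit_timeouts += 1
            self.fatal_errors += 1
            self.pending.clear()  # assumption: the reset discards queued work


driver = ToyLanDriver()
driver.link_up = False   # the link "goes away for a while"
driver.transmit(now=0.0)
driver.tick(now=1.0)     # not yet timed out
driver.tick(now=3.5)     # outstanding transmit is now stale
print(driver.fatal_errors, driver.transmit_timeouts)
```

In this model a link that is never unavailable produces no timeouts at all, which is why the counters point at the ring or cabling rather than at the transmit path itself.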

2220.4. "Re.: .3" by BRADEC::BALOGH (Gabriel Balogh @BRC) Mon Feb 17 1997 06:56 (28 lines)
Hello Dick!

  ========================================\\
//                                        ||
||                                        ||
||          MS900 Backplane               ||
||                                        ||
||         ----------       ----------    || 
\\=========|DEF6X-MA| ======|DEFBA-MA|===//
           ----------       ----------
            | | |
            | | \____-> BCPUPP
            | \______-> BCPDWN
            \________-> CAESAR 


BCPUPP and BCPDWN are in a cluster; the errors occur at different times.
On CAESAR no errors were found.
        The UTP cables and ports on the DEF6X were swapped between the
cluster members, and the DEFPA was replaced on BCPUPP.
        It is theoretically possible to swap the port between
CAESAR and one cluster member. I will try to do this today.

                                Thanks !

                
                                        Gabriel

2220.5. "+ .$4" by BRADEC::BALOGH (Gabriel Balogh @BRC) Mon Feb 17 1997 08:59 (54 lines)
You are right, the problems appear on two nodes at the same time, BUT
on node BCPDWN there is only 1 message (PORT HAS CLOSED VIRTUAL CIRCUIT)
and on BCPUPP there are 3 messages (PORT and 2 datalink, see .0).
On BCPDWN SHOW LAN/ERR reported no errors! (At this moment; I can find
the opposite case as well.)

SW versions are
MS900 4.1.1
900EF 1.5.2
900MX 3.2.3

                   		Gabriel

P.S. Here is the report from the second cluster member.
There are no datalink errors!?


 V M S                SYSTEM ERROR REPORT         COMPILED 17-FEB-1997 14:46:36
                                                                      PAGE   1.

 ******************************* ENTRY     114. *******************************
 ERROR SEQUENCE 6141.                            LOGGED ON:  CPU_TYPE 00000005
 DATE/TIME 30-JAN-1997 13:37:29.00                            SYS_TYPE 00000009
 SYSTEM UPTIME: 0 DAYS 19:51:22
 SCS NODE: BCPDWN                                           OpenVMS AXP V6.2

 HW_MODEL: 0000045F Hardware Model = 1119.

 ERL$LOGMESSAGE AlphaServer 2100 5/250

 NI-SCS SUB-SYSTEM, _BCPDWN$PEA0:

       PORT HAS CLOSED VIRTUAL CIRCUIT

       LOCAL STATION ADDRESS, FFFFFFFFFF00(X)
       LOCAL SYSTEM ID, 000000000402(X)

       REMOTE STATION ADDRESS, 0000000000DE(X)
       REMOTE SYSTEM ID, 000000000401(X)

       UCB$L_ERTCNT    00000032
                                       50. RETRIES REMAINING
       UCB$L_ERTMAX    00000032
                                       50. RETRIES ALLOWABLE
       UCB$L_ERRCNT    00000003
                                       3. ERRORS THIS UNIT
       PPD$B_PORT            00
                                       REMOTE NODE # 0.
       PPD$B_STATUS          00
       PPD$B_OPC             00
                                       UNKNOWN OPCODE
       PPD$B_FLAGS           00
ANA/ERR ERRLOG.SYS/INCL=PEA/SINCE=30-JAN-1997 00:00:00.00/BEFORE=31-JAN-1997
00:00:00.00/OUT=XXX.TXT
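
Whether entries on the two nodes coincide can be checked mechanically. The following is a minimal sketch, assuming both nodes' clocks agree to within a few seconds; the 5-second window and the function names are hypothetical choices. The timestamps are copied from the error log entries posted in .0 and in this reply.

```python
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}


def parse_vms_time(text):
    """Parse a 'DD-MMM-YYYY HH:MM:SS.CC' timestamp as printed in the reports."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    hms, hundredths = time_part.split(".")
    hour, minute, second = (int(x) for x in hms.split(":"))
    return datetime(int(year), MONTHS[month.upper()], int(day),
                    hour, minute, second, int(hundredths) * 10_000)


def coincident(times_a, times_b, window=timedelta(seconds=5)):
    """Return (a, b) pairs of events from two nodes that fall within `window`."""
    return [(a, b) for a in times_a for b in times_b if abs(a - b) <= window]


# Entries 146, 148 and 149 on BCPUPP; entry 114 on BCPDWN.
bcpupp = [parse_vms_time(t) for t in ("30-JAN-1997 13:37:29.84",
                                      "30-JAN-1997 13:38:28.29",
                                      "30-JAN-1997 13:38:32.30")]
bcpdwn = [parse_vms_time("30-JAN-1997 13:37:29.00")]

pairs = coincident(bcpupp, bcpdwn)
for a, b in pairs:
    print(a, b, abs(a - b))
```

With these inputs only the two PORT HAS CLOSED VIRTUAL CIRCUIT entries pair up; the two datalink errors on BCPUPP follow about a minute later and have no counterpart on BCPDWN.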

2220.6. by STAR::STOCKDALE Mon Feb 17 1997 13:24 (10 lines)
So it sounds like the problem is localized to the one system (where the
SHOW LAN/ERROR shows errors).  I'd verify that the revisions of the modules
that you provided are the correct versions.  And verify the DEFPA firmware
version and if everything is up to rev, start replacing hardware.  Also,
you could try the DEFPA in a different slot.

I'll send you the latest V6.2 remedial stream SYS$FWDRIVER.EXE just in
case although there were no problems fixed that I know of in this area.

- Dick

2220.7. "re.: .6" by BRADEC::BALOGH (Gabriel Balogh @BRC) Tue Feb 18 1997 04:47 (24 lines)
Hello Dick!

"So it sounds like the problem is localized to the one system..." 

	I cannot prove it now, but I think that the messages are logged in
the following order:

            BCPUPP                                  BCPDWN

PORT HAS CLOSED VIRTUAL CIRCUIT         PORT HAS CLOSED VIRTUAL CIRCUIT
FATAL ERROR DETECTED BY DATALINK                    -
FATAL ERROR DETECTED BY DATALINK                    -

LAN errors in SDA


================================================================================

I have found the above errors mirrored on the opposite machine, but I could
not find LAN errors in the errlog.sys files. This is the missing information
that would prove that the errors are symmetric.

				Gabriel

2220.8. "+.7" by BRADEC::BALOGH (Gabriel Balogh @BRC) Tue Feb 18 1997 06:32 (6 lines)
The FDDI port on the concentrator was swapped between CAESAR and BCPDWN
yesterday. Now BCPDWN reports 2 LAN errors (3-XmtTimeout) and the 3
error log entries described above in errlog.sys => it is symmetric and
does not depend on the concentrator port. Could it depend on the cluster?

				Gabriel

2220.9. by BRADEC::BALOGH (Gabriel Balogh @BRC) Tue Mar 04 1997 07:43 (10 lines)
Hi!
	We have replaced the DECconcentrator 900MX.
The LEM count is increasing on every port. What does this mean exactly?

On VMS, SDA> SHOW LAN/ERR shows no new errors, but on one of the nodes
the LAST UUB time has changed. What does LAST UUB TIME mean?

			Thanks

				Gabriel. 

2220.10. by STAR::STOCKDALE Tue Mar 04 1997 09:44 (13 lines)
>>There are increasing LEM count on every port. What does it mean exactly?

LEM = Link Error Monitor.  What counter are you seeing increment and who is
displaying the counter?

>>On VMS SDA> show lan /err => are no new errors, but on one of them
>>was changed LAST UUB time. What does it mean LAST UUB TIME ?

UUB is User Buffer Unavailable, which means an application did not keep
up with the incoming receives, so the driver discarded a received packet
for this user because the user had not supplied a buffer.

- Dick
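
The UUB condition described in .10 can be illustrated with a toy receive path. This is only a sketch of the idea, assuming a simple queue of user-posted buffers; the class and method names are hypothetical and do not correspond to any VMS LAN driver interface.

```python
from collections import deque


class ToyReceiver:
    """Toy model: a packet is discarded when the user has posted no buffer."""

    def __init__(self):
        self.buffers = deque()   # empty buffers supplied by the application
        self.delivered = []      # packets handed to the application
        self.uub_count = 0       # "user buffer unavailable" discards

    def post_buffer(self, count=1):
        """Application supplies `count` empty receive buffers."""
        for _ in range(count):
            self.buffers.append(bytearray())

    def receive(self, packet):
        """Driver side: deliver into a posted buffer, or count a UUB discard."""
        if not self.buffers:
            self.uub_count += 1
            return False
        buffer = self.buffers.popleft()
        buffer.extend(packet)
        self.delivered.append(bytes(buffer))
        return True


receiver = ToyReceiver()
receiver.post_buffer(2)                 # application posts only two buffers
for packet in (b"one", b"two", b"three"):
    receiver.receive(packet)            # the third arrives with no buffer left

print(len(receiver.delivered), receiver.uub_count)
```

In this model a changed UUB time with no new fatal errors only says that an application fell behind on its receives at that moment, which is a different condition from the transmit timeouts discussed earlier.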

2220.11. "re .10" by BRADEC::BALOGH (Gabriel Balogh @BRC) Fri Mar 07 1997 07:50 (8 lines)
The LEM counters are increasing on every port directed via front inserts.
LEMR is also non-zero on 2 of them.
These values are from the DECconcentrator 900MX, as shown in the MS manager.


			Thanks.

					Gabriel

2220.12. "Some questions on the UTP ports" by NPSS::KIRK Fri Mar 07 1997 08:32 (11 lines)
    What UTP cable lengths are used on the ports with the increasing 
    LEM counts?   Can you measure the ring utilization rate?
    
    We have been having some LEM problems with UTP FDDI connections.
    
    Can you obtain the 54 Class numbers and serial numbers from the
    UTP cards?
    
    
    Dick Kirk
    Network Product Support

2220.13. "re. .12" by BRADEC::BALOGH (Gabriel Balogh @BRC) Fri Mar 07 1997 10:13 (14 lines)
Hi Dick!


	UTP cable length is less than 20 m (customer's guess).
	UTP Card 54 Class number is : 54-22499-03
                                   SN: TA62900004


				Thanks

					Gabriel

P.S. There are another 2 UTP cards. I can check the numbers for these
	cards if you require.