T.R | Title | User | Personal Name | Date | Lines |
---|
5241.1 | | UTRTSC::thecow.uto.dec.com::JurVanDerBurg | Change mode to Panic! | Wed Mar 05 1997 01:24 | 6 |
| 1000's of ReXmt's have to be seen in relation to the time the connection has
been up. If it's not more than, let's say, .1% then no worry. If it's significant
then you have some network problem which may affect overall performance.
Jur.
|
5241.2 | | AMCFAC::RABAHY | dtn 471-5160, outside 1-810-347-5160 | Wed Mar 05 1997 08:54 | 18 |
| In 24 days, give or take, the worst case was 21387 ReXmt's. There seems to be a
load-dependent steady increase over time. In just a couple of hours one node had
83 ReXmt's to a single partner after a reboot.
The network is comprised of dedicated point-to-point MMF FDDI GIGAswitch/FDDI
links. So, they do operate in full duplex mode. 6 nodes, 4 with 1 link, 2 with
2 links. All using DEFPA-DA.
One node sticks out with especially large values of ReXmt. This node is the
only node which had NISCS_MAX_PKTSZ still at the default 1498 instead of being
increased to 4468. The next reboot will correct this and we'll see if it comes
into line. The other difference about this node is it is the only one in a
different data center. Longer fibers are required through the walls. Naturally
this involves connecting at ST-style LIU's.
I am most concerned about performance impact. My gut is telling me these
numbers are way too small to adversely affect performance noticeably -- still it
would be nice to have confirmation. Also, what do the two numbers mean?
|
5241.3 | | AMCFAC::RABAHY | dtn 471-5160, outside 1-810-347-5160 | Wed Mar 05 1997 14:58 | 10 |
| From section F.2.4 of the OpenVMS Cluster Systems manual;
A well-configured OpenVMS Cluster system should not perform excessive
retransmissions between nodes. Retransmissions between any nodes that occur
more frequently than once every few seconds deserve network investigation.
Every few seconds is too vague and too lenient, yes? It should be guided by
load.
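To put "once every few seconds" next to the numbers from .2, here is a rough
Python sketch of the arithmetic. It assumes the 24 days is exact (it was "give
or take") and reads "a couple of hours" as 2 hours; the function name
seconds_per_rexmt is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def seconds_per_rexmt(rexmt_count: int, uptime_seconds: float) -> float:
    """Average number of seconds between retransmissions."""
    if rexmt_count <= 0:
        raise ValueError("rexmt_count must be positive")
    return uptime_seconds / rexmt_count


# Worst case from .2: 21387 ReXmt's in about 24 days (assumed exact here).
worst_case = seconds_per_rexmt(21387, 24 * SECONDS_PER_DAY)

# Node after reboot: 83 ReXmt's in "a couple of hours" (assumed to be 2 hours).
after_reboot = seconds_per_rexmt(83, 2 * 3600)

print(f"worst case:   one ReXmt every {worst_case:.0f} seconds")
print(f"after reboot: one ReXmt every {after_reboot:.0f} seconds")
```

Under those assumptions this works out to roughly one retransmission every 97
seconds for the worst case and every 87 seconds for the node after its reboot.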
|
5241.4 | | AMCFAC::RABAHY | dtn 471-5160, outside 1-810-347-5160 | Wed Mar 05 1997 15:38 | 18 |
| re .3:
Section F.3.3, Table F-5 does much better;
The leftmost number (128) indicates the number of packets actually
retransmitted. For example, if the network loses two packets at the same time,
one timeout is counted but two packets are retransmitted. A retransmission
occurs when the local node does not receive an acknowledgment for a transmitted
packet within a predetermined timeout interval.
Although you should expect to see a certain number of retransmissions
(especially in heavily loaded networks), an excessive number of retransmissions
wastes network bandwidth and indicates excessive load or intermittent hardware
failure. If the leftmost value in the ReXmt field is greater than about 0.01% to
0.05% of the total number of the transmitted messages shown in the Msg Xmt
field, the OpenVMS Cluster system probably is experiencing excessive network
problems or local loss from congestion.
|
5241.5 | | AMCFAC::RABAHY | dtn 471-5160, outside 1-810-347-5160 | Thu Mar 06 1997 10:22 | 31 |
| re .4:
Again, from section F.3.3, Table F-5;
>The rightmost number (106) in the ReXmt field indicates the number of times a
>timeout occurred.
My question is, how can there be more timeouts than retransmitted packets per
the following actual example??
VMScluster data structures
--------------------------
              --- Virtual Circuit (VC) 82028D00 ---
Remote System Name: SPARTY (1:ALPHA)    Remote SCSSYSTEMID: 26725
Local System ID: 218 (DA)               Status: 0005 open,path
------ Transmit -------   ------ VC Closures ----   ---- Congestion Control ----
Msg Xmt       356679359   SeqMsg TMO            1   Pipe Quota/Slo/Max   5/ 3/31
Unsequence            5   CC DFQ Empty          0   Pipe Quota Reached   1611311
Sequence      298467108   Topology Change       0   Xmt C/T                6/320
ReXmt         3082/3106   NPAGEDYN Low          0   RndTrp uS          8700+9756
Lone ACK       58209164                             UnAcked Msgs              13
Bytes Xmt    3211438740                             CMD Queue Len/Max      0/121
------- Receive -------   - Messages Discarded -    ----- Channel Selection ----
Msg Rcv       273590792   No Xmt Chan           0   Preferred Channel   82023AC0
Unsequence            6   Rcv Short Msg         0   Delay Time          0A0A2747
Sequence      267286043   Illegal Seq Msg       0   Buffer Size             4382
ReRcv              7576   Bad Checksum          0   Channel Count              2
Lone ACK        6295562   TR DFQ Empty          0   Channel Selections    227776
Cache              1609   TR MFQ Empty          0   Protocol               1.4.0
Ill ACK               0   CC MFQ Empty          0   Open 11-FEB-1997 08:28:01.80
Bytes Rcv    2485526100   Cache Miss            0   Cls  11-FEB-1997 08:27:52.17
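
For what it's worth, the 0.01% to 0.05% guideline quoted from Table F-5 in .4
can be checked against this display with a few lines of Python. This is only a
sketch: the helper names parse_rexmt and rexmt_percent are made up, and it takes
the leftmost ReXmt number as the retransmitted-packet count, as the manual
describes.

```python
from typing import Tuple

# Guideline range quoted from Table F-5, in percent of Msg Xmt.
GUIDELINE_LOW_PCT = 0.01
GUIDELINE_HIGH_PCT = 0.05


def parse_rexmt(field: str) -> Tuple[int, int]:
    """Split a ReXmt field such as '3082/3106' into its (left, right) numbers."""
    left, right = field.split("/")
    return int(left), int(right)


def rexmt_percent(rexmt_packets: int, msg_xmt: int) -> float:
    """Leftmost ReXmt value as a percentage of Msg Xmt."""
    if msg_xmt <= 0:
        raise ValueError("msg_xmt must be positive")
    return 100.0 * rexmt_packets / msg_xmt


# Values copied from the Transmit column of the display above.
left, right = parse_rexmt("3082/3106")
pct = rexmt_percent(left, 356679359)

print(f"ReXmt is {pct:.5f}% of Msg Xmt")
print("above guideline range" if pct > GUIDELINE_HIGH_PCT
      else "within guideline range" if pct >= GUIDELINE_LOW_PCT
      else "below guideline range")
```

For this VC the result is about 0.00086%, well under even the 0.01% end of the
range.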
|
5241.6 | | AMCFAC::RABAHY | dtn 471-5160, outside 1-810-347-5160 | Thu Mar 06 1997 10:26 | 3 |
| What does it mean when the Preferred Channel under the channel selection portion
is 00000000 momentarily? I'm guessing it is just a fluke that I caught it
changing between channels.
|