
Conference help::decnet-osi_for_vms

Title:	DECnet/OSI for OpenVMS

Moderator:	TUXEDO::FONSECA

Created:	Thu Feb 21 1991
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	3990
Total number of notes:	19027

3929.0. "nsp retransmission time" by CSC32::J_RYER (MCI Mission Critical Support Team) Fri Apr 11 1997 18:06

    Customer has a time-critical application which sends data between
    Alphas (running VMS V6.2, OSI V6.3 with ECO 6) at two sites over
    multiplexed T1s (between DECnis routers).  The typical roundtrip
    delay estimate on logical links between the two systems is 
    100 milliseconds (as observed via ncl> "sho nsp port * roundtrip
    delay estimate" ).
    
    Occasionally (several times a day), a packet has to be re-transmitted
    because it apparently just got dropped down at the data link layer.
    Customer would like this retransmission to happen as quickly as
    possible.   He has the NSP delay factor parameter set to 2, which
    is as low as it's allowed to be set.  However, he sometimes observes
    the roundtrip delay estimate on one of the NSP ports suddenly jump 
    from 100 milliseconds up to three or four seconds.  (There are no
    intermediate values that he can observe by repeating the ncl> show
    command with up-arrow and carriage return.)
    
    In an attempt to prevent this, he set delay weight to its maximum
    allowable value (255).  In our reading of the NSP specification, we
    assumed that this would mean that new values for round-trip time
    would get incorporated into the rolling average very slowly, so the
    variability in the estimate should be minimal.  However, when he did
    this, what he observed was that the estimate would suddenly jump to
    five or six seconds (even longer than the previous observed value of
    three-to-four seconds).
    
    What are we misunderstanding?  Are we backwards on what high and low
    values for delay weight mean?  Is something broken in the OSI
    implementation of NSP retransmission of packets which have not been
    acknowledged in twice the roundtrip delay estimate?  Customer does
    have some CTF traces which show that the packet doesn't get 
    re-transmitted for much longer than twice the roundtrip delay estimate.
    
    Thanks in advance for any comments/advice,
    Jane Ryer
    MCI Mission Critical Support Team
    
    
    Here are his NSP parameter settings:
    
    ncl> sho node noas00 nsp all
    
    Node noas00 NSP
    AT 1997-04-11-21:41:31.750+00:00I45.540
    
    Status
    
        UID                               = 05818060-ABE3-11D0-8003-AA0004008C64
        State                             = On
        Currently Active Connections      = 13
    
    Characteristics
    
        Maximum Transport Connections     = 200
        Maximum Receive Buffers           = 4000
        Delay Weight                      = 3
        Delay Factor                      = 2
        Maximum Window                    = 20
        DNA Version                       = T4.2.1
        Acknowledgement Delay Time        = 3
        Maximum Remote NSAPS              = 201
        NSAP Selector                     = 32
        Keepalive Time                    = 60
        Retransmit Threshold              = 12
        Congestion Avoidance              = False
        Flow Control Policy               = Segment Flow Control
    
    
    Comment:  could the fact that flow control policy is set to "segment"
    rather than "no" flow control be affecting the retransmission
    algorithm?  I'll have the customer re-test with that changed to
    "no flow control" and the delay weight at 3 and then at 255 (the
    two extremes of its allowable range).
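A minimal sketch of the behaviour the note expects from the two parameters. It assumes the round-trip estimate is a weighted running average of the form estimate + (sample - estimate) / (delay weight + 1), and that the retransmission timer is delay factor times the estimate, as the note reads the NSP spec. The function names are hypothetical, and this is not the DECnet/OSI implementation.

```python
def update_rtt_estimate(estimate, sample, delay_weight):
    """Fold one round-trip sample (seconds) into the running estimate.

    Assumed smoothing rule; a larger delay_weight means each new
    sample moves the estimate less.
    """
    return estimate + (sample - estimate) / (delay_weight + 1)


def retransmit_timeout(estimate, delay_factor):
    """Timer as the note reads the spec: delay factor times the estimate."""
    return delay_factor * estimate


# One slow 3-second sample arriving on a link whose estimate is 100 ms.
for weight in (3, 255):
    new_estimate = update_rtt_estimate(0.100, 3.0, weight)
    timeout = retransmit_timeout(new_estimate, 2)
    print(f"weight={weight}: estimate={new_estimate:.3f}s timeout={timeout:.3f}s")
```

Under this assumed rule, a single 3-second sample moves a 100 ms estimate to about 0.83 s with delay weight 3, but only to about 0.11 s with delay weight 255. That is why the jump to five or six seconds seen with delay weight 255 does not fit this reading of the spec.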

T.R	Title	User	Personal Name	Date	Lines
3929.1	Lower Weight, use OSI TP if possible	HELP::TAYLOR		Thu Apr 17 1997 10:14	16
	Hi Jane,

	Yes, they should keep the delay weight as low as possible to get a smaller incremental change. Retransmits are bad because even in Phase V you are going to get a delay in seconds. Also, the routing end node cache is going to be flushed. If they have 2 Phase V systems then they should use OSI Transport.

	Cheers, Pat
3929.2	need more details . . .	CSC32::J_RYER	MCI Mission Critical Support Team	Mon Apr 21 1997 16:30	12
	Hi, Pat,

	Why is the delay before retransmission in seconds rather than milliseconds? Doesn't that go against the NSP spec (which I think says that the retransmission timer will be delay factor times the roundtrip delay estimate)? Is the extra delay related to the second part of what you said (that the entry gets flushed from the end-system cache)?

	Thanks, Jane
3929.3	traces confirm three-second pause	CSC32::J_RYER	MCI Mission Critical Support Team	Thu May 29 1997 17:52	92
	Still looking for answers. Here's a more detailed description of my customer's situation . . .

	MCI runs an application which uses Digital's RTR (Reliable Transaction Router) product to communicate between two sites (North Royalton, Ohio and Sacramento, California) via NSP logical links. MCI has experienced numerous problems with the RTR product over the last few years, and they have been told by the RTR engineers that at least one of the triggering factors is relatively "long" pauses in data arriving over the NSP logical links.

	To test exactly what the delay was (and how often it occurred), one of MCI's software engineers wrote an application which brings up an NSP logical link to the MIRROR session control application and sends a packet once a minute. Typical return times for the packet are around a tenth of a second. (Even though this is a wide-area link, there are multiple T1s between the sites.) The program keeps a log of any packet for which the mirrored response is not received within a second. On average, this occurs about twenty to thirty times a day (out of 60 x 24 = 1440 attempts), so it occurs on roughly 1.4% to 2.1% of the transmitted packets.

	MCI did some tracing of the LAN at one of the two sites and was able to correlate the retransmissions with packets that never showed up on the LAN trace. They are not disputing that packets will sometimes be dropped or corrupted on the LAN. Their concern is the length of time that the sending NSP waits before retransmitting the packet when no acknowledgement arrives from the receiving end. It seems to never be less than three seconds.

	We've all read the NSP spec until we're blue in the face and tried to correlate what it says about when to retransmit a not-yet-acknowledged data packet with what we're observing in MCI's actual network.

	As I read the NSP spec, the retransmission should occur at delay factor (2 in MCI's case) times the roundtrip delay estimate (reported as around 100 milliseconds by NCL> SHO NSP PORT * ALL). So MCI believes that a packet which just "disappears" en route to the other site should be retransmitted 200 milliseconds after the first transmission. Instead, the values they typically see for the "echo" are in the 3.2 to 3.6 second range. It certainly appears that NSP is adding 3 seconds to the value for the retransmission timer that would be implied by the NSP functional spec. The only parameter that I can find that looks like it might have an effect on this is "acknowledgement delay time", which is hard-coded at three seconds and apparently cannot be modified.

	I found a note in the DECNETVAX conference (5589), entered a couple of years ago, which seems to describe a similar situation. The last entry on that note reads:

	-------------------------------------------------------------------------------
	Can somebody verify that the following info given to a customer from the CSC is correct?

	... question that you have regarding the long time that decnet takes to retransmit a packet. The fact that you can only get it down it about 4 seconds, is because there is a hardcoded 3 second delay, and whatever is calculated using delay weight and delay factor is added on to that value. This is a leftover from when DECnet was primarily a WAN protocol. This has been addressed in DECnet/osi (phase V decnet), and the value can be as low as 1 second.

	Why is this so? Is it that once NSP creates a link transparently, the retransmit timer isn't recalculated? Do you somehow have to force the transmission of packets that have the Delay ACK bit clear?
	-------------------------------------------------------------------------------
	but was never responded to by "people in the know".

	In a previous reply (.1) to this note, Pat Taylor said "even in Phase V you are going to get a delay in seconds". What I am hoping to have answered here is:

	a) Why does my customer never see NSP retransmissions in less than three seconds, even though the estimated round-trip time for the link is about a tenth of a second, delay factor is set to 2 and delay weight is set to 3?

	b) Is there some sort of "hardcoded" value that affects NSP retransmission?

	c) If there is a fixed delay prior to retransmission, is that considered an extension to the NSP functional specification, or am I missing something in my reading of said spec?

	d) Any suggestions on anything the customer can change (he's currently running OSI V6.3 with ECO 6) to get NSP to retransmit those packets in less than a second?

	If an IPMT case is needed to answer these questions, I can do that. (I'm gathering CTF traces at the moment and actually intend to log the IPMT case on Friday afternoon unless replies to this note produce an explanation prior to then.)

	Thanks,
	Jane Ryer
	MCI Mission Critical Support Team
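The timing argument in .3 can be checked with a little arithmetic. The sketch below compares the echo time after one lost packet under two models: the timer as read from the NSP spec (delay factor times the round-trip estimate), and the hypothesis from the quoted DECNETVAX note that a fixed 3 seconds is added on top. The function name and the fixed_delay parameter are hypothetical, and the sketch assumes the retransmitted packet and its echo take one normal round trip.

```python
def echo_time_after_one_loss(rtt, delay_factor, fixed_delay=0.0):
    """Seconds from the first send until the echo of the retransmitted copy.

    The sender waits for the retransmission timer to expire, resends,
    and the echo then arrives one round trip later. fixed_delay models
    the hypothesised hard-coded offset; it is not a documented parameter.
    """
    timer = fixed_delay + delay_factor * rtt
    return timer + rtt


RTT = 0.100          # round-trip estimate reported in the thread
DELAY_FACTOR = 2     # customer's setting

spec_reading = echo_time_after_one_loss(RTT, DELAY_FACTOR)
with_fixed_3s = echo_time_after_one_loss(RTT, DELAY_FACTOR, fixed_delay=3.0)
print(f"spec reading:       {spec_reading:.1f} s")
print(f"with fixed 3 s add: {with_fixed_3s:.1f} s")
```

With these inputs the spec reading gives an echo after about 0.3 seconds, while the fixed-offset hypothesis gives about 3.3 seconds, which falls inside the 3.2 to 3.6 second range MCI observes. That agreement is suggestive only; it does not confirm where the extra three seconds come from.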
3929.4	Possible workaround	OZROCK::HARTWIG	Arthur Hartwig, TaN Engineering-Australia	Fri May 30 1997 09:32	5
	Question: Would using OSI Transport instead of NSP be a suitable work-around? (I don't know, but maybe it's worth a try.)
3929.5	trivial change or more involved?	CSC32::J_RYER	MCI Mission Critical Support Team	Fri May 30 1997 10:24	10
	Would that be as simple as just changing the session control transport precedence, or would more work be required? I don't know if Bruce (the MCI employee investigating this problem, he's well known to you, Arthur!) has discussed that possibility with the Digital RTR engineers or not. Might there be anything in their code that could break when using OSI Transport rather than NSP for its node-to-node connections? Jane Ryer
3929.6		RMULAC::S_WATTUM	Scott Wattum - FTAM/VT/OSAK Engineering	Fri May 30 1997 11:50	4
	All you should need to do is change the transport precedence. A DNA Session Control application should not be able to tell what transport it is running over. However, be aware that RTR Engineering may claim "no support" for this configuration simply because they haven't tested it.
3929.7		OZROCK::HARTWIG	Arthur Hartwig, TaN Engineering-Australia	Tue Jun 03 1997 09:46	3
	Maybe someone with more knowledge about the transport protocol specifics could comment on whether or not OSI Transport is also likely to show these delays on timeout recovery.
3929.8	OSITP - Local retransmission time (T1)	BIKINI::DITE	John Dite@RTO DTN 865-4065	Thu Jun 05 1997 04:28	30
	I don't know if this helps. Extract from ISO/IEC 8073:1992:

	--------------------------------------------------------------------------------
	12.2.1.1.4 Local retransmission time (T1)

	The local transport entity is assumed to maintain a bound on the time it will wait for an acknowledgement before retransmitting the TPDU. The value is given by

	    T1 = Elr + Erl + Ar + x

	where
	    Elr is the expected maximum transit delay local-to-remote;
	    Erl is the expected maximum transit delay remote-to-local;
	    Ar  is the remote acknowledgement time;
	    x   is the local processing time for a TPDU.
	--------------------------------------------------------------------------------

	As far as I'm aware there is no additional "hard coded value" that is added during this calculation.

	Please be aware that DECnet/OSI systems have a default value of Ar (NCL OSI Transport Template characteristic attribute 'Acknowledgement Delay Time') of 1 second. Ar is passed at connection establishment time. So if you want to ensure that every TPDU is acknowledged as quickly as possible, set 'Acknowledgement Delay Time' to 0.

	John
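As a rough illustration of the T1 formula quoted in .8, the sketch below evaluates it for the default Ar of 1 second and for Ar set to 0. The 50 ms one-way transit delays are an assumption derived from the roughly 100 ms round trip reported earlier in the thread (the standard speaks of expected maximum delays, so real values would be higher), the local processing time x is assumed to be zero, and the function name is hypothetical.

```python
def local_retransmission_time(elr, erl, ar, x=0.0):
    """T1 = Elr + Erl + Ar + x, all values in seconds (ISO/IEC 8073 extract)."""
    return elr + erl + ar + x


ELR = 0.050  # assumed local-to-remote transit delay
ERL = 0.050  # assumed remote-to-local transit delay

for ar in (1.0, 0.0):
    t1 = local_retransmission_time(ELR, ERL, ar)
    print(f"Ar={ar:.0f} s -> T1={t1:.1f} s")
```

Under these assumptions T1 is about 1.1 seconds with the default acknowledgement delay and about 0.1 seconds with it set to 0, so the acknowledgement delay term dominates the retransmission time on this link.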