
Conference 7.286::fddi

Title:FDDI - The Next Generation
Moderator:NETCAD::STEFANI
Created:Thu Apr 27 1989
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2259
Total number of notes:8590

854.0. "High interrupt stack time (6000/DEMFA)" by BONNET::STRATMAN (Peter Stratman @VBO) Tue Feb 09 1993 15:19


    Has anyone experienced very high interrupt stack time on VAX 6000s with
    a DEMFA as soon as traffic is generated?

    I have two 6530s with VMS V5.5-1 and DEMFA microcode 1.4.

    The primary CPU on each node runs at 100% interrupt stack time as
    soon as 5 network backups are started. Throughput is the same as or worse
    than previously with a DEBNA; however, there is obviously a lot less user
    CPU time left. The secondary CPUs are mostly doing MP synchronization and
    kernel-mode work.

    I've done some PC sampling, and the system seems to be doing mostly
    ENQs and DEQs.

    As soon as the backups are stopped, interrupt stack time goes way down.
    General network traffic (directed to other nodes) also seems to drive
    it up again.

    There are no NCP-detected or internal DEMFA errors. The DEMFAs are
    in the right slots.
    
    Any ideas on how to tackle this problem?

    Thanks,
    Peter.    
    
    P.S. I have a slight doubt about our building's fiber network. Could
    a fiber problem generate these interrupts?
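
    P.P.S. For reference, here is roughly how I've been watching it (standard
    MONITOR and NCP commands; MFA-0 is the DECnet line name for the DEMFA on
    these machines - yours may differ):

        $ MONITOR MODES/INTERVAL=5          ! interrupt stack vs. kernel/user time
        $ MONITOR PROCESSES/TOPCPU          ! what the backup processes still get
        $ MCR NCP SHOW LINE MFA-0 COUNTERS  ! errors/overruns on the FDDI line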

    
854.1. "More computes needed" by STAR::STOCKDALE Thu Feb 11 1993 13:43
>>    Any ideas on how to tackle this problem ?

Get more computes.

Much of the I/O processing shows up as interrupt stack usage,
so if you have a lot of network traffic, showing 100% interrupt stack
is not unusual.  It's not that the system is being swamped by
interrupts from the DEMFA.

On OpenVMS AXP, much work has been done to improve performance in
the LAN drivers.  This work will show up in a future release of
OpenVMS VAX.  But this only reduces the amount of time in the
LAN drivers which represents just a portion of the amount of
computes needed to do the backups.

- Dick
854.2. "..only if throughput increases.. but it decreases!" by BONNET::STRATMAN (Peter Stratman @VBO) Fri Feb 12 1993 06:20
>Much of the I/O processing shows up as interrupt stack usage,
>so if you have a lot of network traffic, showing 100% interrupt stack
>is not unusual.  It's not that the system is being swamped by
>interrupts from the DEMFA.

OK, I know that increasing the I/O performance will put more strain on the
CPUs too; that's normal. But overall throughput should improve.

What I'm trying to understand, however, is how I can have two identical
machines, one on FDDI and one on Ethernet, with a very similar network I/O
workload, and get only half the Ethernet machine's throughput on the
FDDI machine!

Agree that this is abnormal?

Peter.

854.3. "Strange indeed..." by KONING::KONING (Paul Koning, A-13683) Fri Feb 12 1993 11:49
Yes.  Especially since the DEMFA is very similar to the DEMNA/DEBNI, both of
which are more efficient than the DEBNA...

	paul
854.4. "The Picture.." by BONNET::STRATMAN (Peter Stratman @VBO) Fri Feb 12 1993 13:27
    Hmmm... I replaced DEBNAs with a DEMFA, but I'm COMPARING with a node
    in the cluster which has a DEMNA...

    Still, performance should not be less...
    
    This is the picture:

	    ------------------------
	====| Concentrator         |=======
    	    ------------------------
	    |	       |	   |
 	    |   ----------------   |
 	    |   |DECBRIDGE 500 |   |
 	    |   ----------------   |
  	    |	       |           | 
	    |  |---------------|   |
  	    |	       |           |<------ Throughput here is 1/2 of 
  	    |	       |<----------|------- here...
............|..........|...........|............
.	    |	       |	   |           .
.   ----DEMFA       --DEMNA--      DEMFA----   .
.   | 6530  |	    | 6530  |	   | 6530  |   . 
.   ---------	    --------|	   ---------   .
.	|               |              |       .
.	|          CI   |              |     .
.       ----------------*---------------       .
.                       |HSCs, disks           .
..CI Cluster....................................
854.5. by STAR::GAGNE (David Gagne - VMS/LAN Development) Fri Feb 12 1993 13:42
    The driver for the DEMNA is much more efficient with regard to CPU
    cycles than the driver for the DEMFA.  The DEMFA driver was the first
    FDDI driver from VMS, and it has extra code for checking that it's
    doing the right thing.  Due to release restrictions for C2
    certification of VMS, we were not able to improve the DEMFA driver
    until a future release of VMS - which will probably ship in the
    next calendar year (1994).
    
    One area where you can get CPU cycles back is the size of your
    LRPs.  Typically your LRP size is too small for FDDI packets, which
    makes all FDDI buffer allocations much more costly than Ethernet
    buffer allocations.  To reduce the cost of allocating FDDI buffers,
    increase your LRP size to about 4700 (I don't recall the exact
    number, but I believe there's a memo available about this - it
    may even be in the release notes - it was a couple of years ago that
    the FDDI driver first shipped).  The reason we don't do this
    automatically is that there are too many pieces of VMS that use
    memory for us to determine the impact on the other pieces of VMS
    that would no longer get their buffers from the LRP lookaside list.
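
    If you do try it, one way to apply the change (just a sketch - use
    whatever exact value testing shows is right for your version; as far
    as I remember, LRPSIZE only takes effect at boot time) is via
    MODPARAMS.DAT and AUTOGEN:

        $ OPEN/APPEND PARAMS SYS$SYSTEM:MODPARAMS.DAT
        $ WRITE PARAMS "LRPSIZE = 4700    ! room for full-size FDDI receive buffers"
        $ CLOSE PARAMS
        $ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS NOFEEDBACK
        $ ! reboot for the new LRPSIZE to take effect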
854.6. "If you can boost the performance, send the patches" by MUDDY::WATERS Fri Feb 12 1993 14:02
>    doing the right thing.  Due to release restrictions for C2
>    certification of VMS, we were not able to improve the DEMFA driver
>    until a future release of VMS - which will probably ship in the
>    next calendar year (1994).
    
    C'mon.  If you have a significantly better driver, provide it to the
    customers that want acceptable performance from their costly CPUs.
    These customers may not be concerned about losing C2 certification.
    --gw
854.7. "increase in LRPSIZE useful for DECnet?" by BONNET::STRATMAN (Peter Stratman @VBO) Mon Feb 15 1993 12:31
    re .5

    David, the interface is basically being used for DECnet traffic.
    
    So, if I have understood previous notes on the subject, packet size
    will not exceed Ethernet packet size. In that case it will not make
    much sense to increase the LRP size, will it?



    re .6

    I agree, if the source of the "problem" is due to known
    inefficiencies in the driver, then please don't make us wait for a
    future VMS version if not absolutely necessary.

    Anyway, hey, I'm not only getting high interrupt stack time, but also
    consistently getting 50% worse performance than with a DEMNA. If this
    were really due to the driver, then inefficiency would not be the right
    term to use! ... This can't be due to the driver. ... or can it?
    
    Peter.
    
854.8. by STAR::GAGNE (David Gagne - VMS/LAN Development) Mon Feb 15 1993 12:40
    The receive buffers allocated by the driver are for full size FDDI
    packets - the driver doesn't know what size packets will be arriving.
    So the LRP size may help receive and hinder transmit.
    
    The more efficient driver cannot be used until the future because it
    was written for a newer common routine interface which will only be
    available in a future release of OpenVMS VAX.
854.9. "LRPSIZE - formula needed" by STKHLM::ALMERLOV (Karl Jonas Almerlöv, Stockholm) Thu Feb 18 1993 12:33
    I have tried to find out what is a reasonable value for LRPSIZE when
    using FDDI as a cluster interconnect, but the values differ a lot.
    
    Our customer is running VMS v5.5-2 and hasn't started using FDDI yet.
    
    In the VAXcluster MDF configuration guidelines the recommendation is to
    have an LRPSIZE of 4471.
    
    In the VAXcluster MDF facility v 1.0a release notes it says LRPSIZE
    should be 4474.
    
    854.5 in this conference says about 4700, referring to a memo or the
    release notes (which release notes? I have checked the VMS release
    notes with no success).
    
    Has somebody out there got an exact figure (or even better, a formula)?
    
    	Karl Jonas
    
    
    
    
854.10. by STAR::GAGNE (David Gagne - VMS/LAN Development) Thu Feb 18 1993 17:19
    4700 was my guess - of course I knew that the correct number was
    documented somewhere - although I am quite sure that 4471/4474 are
    wrong.
    
    The formula is (without spending another hour verifying this):
    
    You want the FDDI receive buffers to come from the LRP lookaside list.
    Since LRPSIZE is set to 1504 on a system with Ethernet, you need
    to know the maximum space needed for FDDI (on V5.5-2) and subtract
    14 from it.
    
    The space needed for FDDI (at least the DEMFA) is:
    
      4495 (maximum packet size)
         3 (extra bytes for device packet control)
        31 (for beginning alignment)
        31 (for ending alignment)
     -----
      4560
      - 14 (subtract the amount that is already accounted for)
     -----
      4546
    
    I would add some extra to this because it would be sad to find out that
    the number was off by a few bytes.  So I would use the value 4578.  You
    should test this before having a customer try it.
    
    Maybe we can try this here if we ever get some free time.
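
    If you do test it, the quickest sanity check before and after is
    (standard SYSGEN and SHOW MEMORY commands; nothing version-specific
    as far as I know):

        $ MCR SYSGEN
        SYSGEN> SHOW LRPSIZE
        SYSGEN> SHOW LRPCOUNT
        SYSGEN> EXIT
        $ SHOW MEMORY/POOL    ! the LRP lookaside list section shows the resulting packet size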
    
    Good luck.
854.11. "LRPSIZE=4541 for FDDI and VMS 5.5-2!" by VCSESU::WADE (Bill Wade, VAXc Systems & Support Eng) Mon Feb 22 1993 12:54
                                            
    I did some testing this morning on a VAX6000 equipped with a DEMFA running 
    VMS 5.5-2.

    I found that the magic number for LRPSIZE was 4541.  With this value -

$ SHOW MEMORY
.
.
.
Large Packet (LRP) Lookaside List            Packets       Bytes       Pages
    Current Total Size                           800     3916800        7650
    Initial Size (LRPCOUNT)                      800     3916800        7650
    Maximum Size (LRPCOUNTV)                    4000    19584000       38250
    Free Space                                   728     3564288
    Space in Use                                  72      352512
    Packet Size/Upper Bound (LRPSIZE +  355)                4896
    Lower Bound on Allocation                               1088

    
    
    With this value I found that all the 4896 byte VCRPs were allocated from the
    LRP lookaside list. Setting LRPSIZE <4541 caused the 4896 byte VCRPs to be
    allocated from nonpaged pool.   
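
    For what it's worth, the numbers are consistent with the display above
    and with the sizing in .10 (my arithmetic, not checked against the
    driver sources):

        4541 (LRPSIZE) + 355 (pool packet overhead shown above) = 4896
        4896 = the size of the FDDI receive VCRPs, so anything below 4541
               leaves those VCRPs too big for the list and they fall back
               to general nonpaged pool.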
    
    
    Bill
    
854.12. by VCSESU::WADE (Bill Wade, VAXc Systems & Support Eng) Mon Feb 22 1993 17:02
    
    More on LRPSIZE -
                   
    The only reference that I've found for setting LRPSIZE on an FDDI-equipped 
    VAX system is in the release notes for VMS 5.4-3.  The value given is 4474 
    (pg. 2-30).  Earlier investigation has shown this to be insufficient, in
    that the allocation is still from nonpaged pool and not from the LRP
    lookaside list.
    
    There is no reference to LRPSIZE in the 5.5-1 or 5.5-2 release notes or 
    the VAXc SPD.  The MDF product set the number based on the number in the 
    5.4-3 release notes. 
                   
    Is there agreement that this is a severe performance problem, that
    a QAR against VMS is required, and that AUTOGEN should set LRPSIZE
    based on the presence of an FDDI adapter?
     
    
    
    
    
854.13. by STAR::GAGNE (David Gagne - VMS/LAN Development) Tue Feb 23 1993 10:17
    The next release of VMS due out does not use the LRPSIZE parameter. 
    Pool management has been changed such that there are tons of lookaside
    lists.  So you can file a QAR, but it will probably be closed
    immediately as "fixed in next release".
854.14. "what is really the next release??" by CVG::STRICKER (Cluster Validation Group, Salem NH) Tue Feb 23 1993 10:29
       Isn't VIKING the next release?  Shouldn't the VIKING kit contain
    some of these changes with its new FDDI adapter support, before the 
    next big release (BLADE), which contains all of the Pool Management
    changes?
854.15. by STAR::GAGNE (David Gagne - VMS/LAN Development) Tue Feb 23 1993 11:20
    VIKING is a release for hardware support.  So it will not include
    the new memory management code or bug fixes for generic V5.5-2 bugs.
    VIKING is (to me) just a variant of V5.5-2 (as its name will show).
    The new memory management code is in the release named BLADE, which
    will probably be numbered 6.0.
854.16. "Something that might help" by STAR::GILLUM (Kirt Gillum) Thu Mar 04 1993 15:29
    
    This may or may not be useful...
    
    Check the SYSGEN parameter POOLCHECK.  If it's non-zero, every buffer
    is "checked" upon allocation/deallocation.  If you can live with this
    turned off, you might see higher throughput (perhaps you'll have more
    CPU left too).
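
    A minimal way to check it and turn it off (POOLCHECK is, I believe, a
    dynamic parameter on this version, but verify that, and add the value
    to MODPARAMS.DAT as well so AUTOGEN doesn't put it back):

        $ MCR SYSGEN
        SYSGEN> USE ACTIVE
        SYSGEN> SHOW POOLCHECK
        SYSGEN> SET POOLCHECK 0
        SYSGEN> WRITE ACTIVE       ! takes effect immediately if dynamic
        SYSGEN> WRITE CURRENT      ! preserve across reboots
        SYSGEN> EXIT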