
Conference:               noted::atm

Title:                    atm
Moderator:                NPSS::WATERS
Created:                  Mon Oct 05 1992
Last Modified:            Thu Jun 05 1997
Last Successful Update:   Fri Jun 06 1997
Number of topics:         970
Total number of notes:    3630

825.0. "VMS 7.1 LANE performance figures (and where's Bricks?)" by VIRGIN::HOTZ (gimme an F !!!, ...D !!!, ...D !!!, ...I !!!, what's that spell !!!?) Fri Feb 14 1997 05:57

One of our customers has done some tests with VMS 7.1 and ATM LANE on two
AlphaServer 400 4/233 systems, connected with ATMw350 adapters to a Madge
(LANnet) ATM switch, and he is not very happy with the performance. Here is
what he says:

-----------------------------------------------------------------------
OS version:               OpenVMS 7.1
Last significant changes: Upgrade from V7.1-EFT2

Problem description:

The DECnet-Plus and FTP throughput using the ELAN driver and the ATMworks
adapter gives slightly better figures with the released OpenVMS V7.1.
However, the figures are still poor:

DECnet DTSEND:

Bytes    256    512   1024   1500   2048   2500   3500   4096
Mbits    3.8    7.5   14.3   16.0   20.9   25.0   27.7   30.8

DECnet Copy: 9.3 Mbits
UCX FTP:     9   Mbits

The figures are a little higher than the ones obtained with V7.1-EFT2.
Also, the figures for the CPU modes are about the same for smaller
DECnet packet sizes. For larger sizes, there is a difference in the
split between interrupt and kernel mode (31%/67%) as opposed to EFT2
(51%/47%).
-----------------------------------------------------------------------

Is this as expected? Especially the 30.8 Mb/s DTSEND result with a 4096-byte
message size and 100% CPU utilization?
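
A quick back-of-envelope (my own arithmetic, counting only the 4096-byte
payload and ignoring framing) turns that figure into a per-message CPU cost:

    /* 30.8 Mb/s of 4096-byte DTSEND messages at ~100% of one CPU:
     * how many messages per second, and how much CPU per message? */
    #include <stdio.h>

    int main(void)
    {
        double bits_per_sec = 30.8e6;               /* reported DTSEND throughput */
        double msg_bits     = 4096.0 * 8.0;         /* 4096-byte message size     */
        double msgs_per_sec = bits_per_sec / msg_bits;
        double usec_per_msg = 1.0e6 / msgs_per_sec; /* CPU cost at ~100% busy     */

        printf("%.0f msgs/s, roughly %.0f usec of CPU per message\n",
               msgs_per_sec, usec_per_msg);
        return 0;
    }

That works out to roughly 940 messages per second, i.e. on the order of a
millisecond of CPU per 4 kB message.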

Looking at note 670 in this conference, that DEC 3000-600 / UNIX 4.0 system
did 29 Mb/s with only 25% CPU utilization (LANE, TCP, Bricks).

Is there a better memory-to-memory test program, like Bricks, for VMS 7.1?
DTSEND's maximum transmit buffer is only 4096 bytes.

greg
825.1. by EVMS::JSTICKNEY (Jay Stickney, OpenVMS LAN Eng., 381-2020, ZK3-4/T12) Fri Feb 14 1997 13:41
    Greg,
    
    I did some performance measurements a few months ago that are a
    bit more flattering than your customer's results.  We used our own
    test application (XXDRIVER), which doesn't have all the overhead
    that DECnet-Plus and UCX have.  We also had quite a bit more
    horsepower.  I've attached my 'unofficial' results below (132-column
    format), which also show Ethernet and FDDI results for comparison.
    I don't think you'll do any better with Bricks, because I think
    Bricks runs over DECnet or UCX.  DECnet Phase IV may help a little.
    For V7.1 our emphasis was on functionality, not performance.  You'll
    see some improvement in future releases, but there is always going
    to be a lot of overhead running LANE.
    
    Jay.
    
    
XXDRIVER Transmit test results 22-OCT-1996 
  Test parameters .... ETH, Pipeline 16, Chaining Off, Alignment 0, User mask 00000001
  CPU type ........... Turbolaser,EV5,CPU=00000000,PAL=0.1.1.18, 300 MHz, 1 CPU active, AlphaServer 8400 Model 5/300
  Versions ........... Sys AXP-F7.1, Drv 07010018 00000002, Lan 07010116 00000001, Hw 00000000
  System parameters .. Multiprocessing 0, Poolcheck 0, Vaxcluster 0, System_check 0

ELx0

Packet_size	Device		Pkt/sec		bit/sec		uSec/IO		%CPU
-----------	------		-------		--------	-------		----
9234		ELA0 XMIT	1840		135613680	39.302		14%
		ELA0 RECV	1839		135907840			
		ELB0 XMIT	1840		135602624	39.303
		ELB0 RECV	1839		135907840
-----------	------		-------		--------	-------		----
4495		ELA0 XMIT	3759		134404192	38.934		29%
		ELA0 RECV	3757		135132992		
		ELB0 XMIT	3757		134324928	38.947
		ELB0 RECV	3757		135119680	
-----------	------		-------		--------	-------		----
1518		ELA0 XMIT	11034		131703624	39.466		68%
		ELA0 RECV	6185		75116520 		
		ELB0 XMIT	6182		73799144 	39.475
		ELB0 RECV	11032		133981856

FWx0 (DEFPA)

Packet_size	Device		Pkt/sec		bit/sec		uSec/IO		%CPU
-----------	------		-------		--------	-------		----
4495		FWA0 XMIT	1852		66564592	20.943		8%	
		FWA0 RECV	1852		66609056
		FWB0 XMIT	1851		66564592	20.943		8%
		FWB0 RECV	1849		66609056	
-----------	------		-------		--------	-------		----
1518		FWA0 XMIT	5318		64457072	22.837		24%	
		FWA0 RECV	5318		64584712	
		FWB0 XMIT	5318		64457680	22.837		24%
		FWB0 RECV	5318		64585320	

EWx0 (DE500-AA)

Packet_size	Device		Pkt/sec		bit/sec		uSec/IO		%CPU
-----------	------		-------		--------	-------		----
1518		EWA0 XMIT	8123		98655240	18.942		31%
		EWA0 RECV	8123		98656816 	      
		EWB0 XMIT	8123		98655240	18.943
		EWB0 RECV	8123		98655240
    
825.2. "maybe 'netperf' will compile on VMS?" by NPSS::WATERS (I need an egg-laying woolmilkpig.) Fri Feb 14 1997 14:28
  Common memory-to-memory network performance apps are 'netperf' and 'ttcp'.
  Feed those terms into your search engine to find sources on the Web.
  The database of 'netperf' results lives on a host at HP, for example.
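
  Neither tool's source is reproduced here, but the core of what they measure
  is just this (a minimal sketch assuming a BSD-style socket API like the one
  UCX provides; the port number, buffer size, and 10-second run time are
  arbitrary):

    /* Memory-to-memory throughput sender: connect to a sink that just
     * reads and discards, pump a RAM buffer for ~10 seconds, report Mb/s. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(int argc, char **argv)
    {
        char buf[32768];                      /* data never leaves memory */
        struct sockaddr_in sin;
        int s;
        long nbytes = 0;
        time_t start, stop;

        if (argc < 2) {
            fprintf(stderr, "usage: %s receiver-ip-address\n", argv[0]);
            return 1;
        }
        memset(buf, 0xAA, sizeof(buf));
        memset(&sin, 0, sizeof(sin));
        sin.sin_family      = AF_INET;
        sin.sin_port        = htons(5001);    /* arbitrary test port      */
        sin.sin_addr.s_addr = inet_addr(argv[1]);

        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0 || connect(s, (struct sockaddr *) &sin, sizeof(sin)) < 0) {
            perror("connect");
            return 1;
        }
        start = time(NULL);
        do {
            int n = send(s, buf, sizeof(buf), 0);
            if (n <= 0)
                break;
            nbytes += n;
        } while (time(NULL) - start < 10);
        stop = time(NULL);
        if (stop == start)
            stop = start + 1;                 /* avoid divide-by-zero     */

        printf("%ld bytes in %ld s = %.1f Mb/s\n",
               nbytes, (long) (stop - start),
               nbytes * 8.0 / (stop - start) / 1.0e6);
        close(s);
        return 0;
    }

  Run a matching receiver that discards the data (e.g. 'ttcp -r -s') on the
  far end and compare the reported rate with the FTP and DTSEND figures.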
825.3. "...what overhead?" by VIRGIN::HOTZ (gimme an F !!!, ...D !!!, ...D !!!, ...I !!!, what's that spell !!!?) Thu Feb 20 1997 09:13
re .1,
Jay, thank you for the numbers. Can you explain a few things?

- You had 2 systems, sy0 and sy1, each with 2 controllers, A0 and B0.

- A0 on sy0 was connected to A0 on sy1, in full duplex mode.
  B0 on sy0 was connected to B0 on sy1, in full duplex mode.

- Is the CPU utilization for 2 controllers with 4 data streams?

I'd like to normalize the data to one controller at 135 Mb/s full duplex with
1518-byte packets.


I'm wondering why LANE should have a "lot of overhead". Once the data-direct
VC is set up, LANE just has to add a 2-byte LECID in front of the Ethernet
frame.

I assumed the AAL5 padding, CRC, and segmentation/reassembly were done by the
hardware. Isn't that the case?
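
To make that concrete, this is roughly all the per-frame work I have in mind
(a sketch with invented names, not the actual EL driver code; the LECID value
is whatever the LES assigned at join time):

    /* LANE v1 data-frame encapsulation for an Ethernet ELAN: prepend
     * the 2-byte LE header (the LECID) to the Ethernet frame and hand
     * the result to AAL5.  Padding, CRC-32, and segmentation into
     * cells are assumed to be done by the adapter.                   */
    #include <string.h>

    #define LEC_ID 0x0001                    /* example value only            */

    /* Build the LE data frame in 'out'; returns its length. */
    size_t lane_encap(unsigned char *out,
                      const unsigned char *eth_frame, size_t eth_len)
    {
        out[0] = (LEC_ID >> 8) & 0xff;       /* 2-byte LE header = LECID      */
        out[1] =  LEC_ID       & 0xff;
        memcpy(out + 2, eth_frame, eth_len); /* Ethernet frame, minus the FCS */
        return eth_len + 2;
    }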

greg.
825.4. "don't forget mac addr to VC translation" by WASNT::"[email protected]" (Born to be Mild) Tue Feb 25 1997 15:50
Don't forget that the driver is given a MAC address and must
convert that to a VC.  This takes a certain amount of work.
There is always more overhead with LANE (or CLIP), but it
doesn't have a huge impact on performance.  I don't think the
driver is your bottleneck; something else is going on.
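
Roughly, the extra per-packet step looks like this (a toy sketch with invented
names, not the real driver code; the real thing also has to handle misses via
LE_ARP and flooding through the BUS, which is where more of the cost hides):

    /* Toy MAC-address-to-VC lookup: hash the 6-byte MAC, walk a short
     * chain, return the data-direct VC (or -1 if unknown).            */
    #include <string.h>

    #define VC_HASH_SIZE 64

    struct vc_entry {
        unsigned char    mac[6];
        int              vci;                /* data-direct VC for this MAC */
        struct vc_entry *next;
    };

    static struct vc_entry *vc_table[VC_HASH_SIZE];

    static unsigned hash_mac(const unsigned char *mac)
    {
        return (unsigned) (mac[3] ^ mac[4] ^ mac[5]) % VC_HASH_SIZE;
    }

    int mac_to_vc(const unsigned char *mac)
    {
        struct vc_entry *e = vc_table[hash_mac(mac)];

        while (e != NULL) {
            if (memcmp(e->mac, mac, 6) == 0)
                return e->vci;
            e = e->next;
        }
        return -1;                           /* miss: LE_ARP / BUS needed   */
    }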

doug
825.5. by EVMS::JSTICKNEY (Jay Stickney, OpenVMS LAN Eng., 381-2020, ZK3-4/T12) Mon Mar 03 1997 09:09
    Greg,
    
    Sorry for the delay getting back to you but I've been out
    of the office.
    
    There were two controllers on one system. The CPU utilization was
    per controller.
    
    Jay. 
825.6. "35 Mb/s ftp, 25% cpu idle" by ZUR01::HOTZ (gimme an F !!!, ...D !!!, ...D !!!, ...I !!!, what's that spell !!!?) Thu Mar 06 1997 10:22
I was on site and did a few tests:
VMS 7.1, UCX 4.1, ATMw350, LANE Ethernet, RZ28M (2 GB) disks.  Default UCX
parameters; FTP in binary mode with hash marks off.  NPAGEDYN has not expanded.
No TCP retransmissions -> no cell loss.
One Madge Collage 740 Backbone ATM switch.
B = bytes, b = bits

                       ftp> put sysdump.dmp
                           ------------->
   Sender                                  Receiver
___AlphaServer400 4/233___                _AlphaServer400 4/233

  %    %    % _Disk_  time    ___ftp____    %    %    % Disk_
Int Krnl Idle I/O  Q  secs    kB/s  Mb/s  Int Krnl Idle I/O Q
--- ---- ---- --- --  ----    ----  ----  --- ---- ---- --- -
 20    9   70  60 .3    45    1119   8.9   20   11   66  59 1    disk1->disk

 13   45   40 114 .3    14    3666  29.3   33   15   50  -- -    disk1->NL:

 18   53   25  68 .3    --    2123  16.9   40   18   40  -- -    disk1->NL:
               68 .3    --    2291  18.3                         disk2->NL:
     Total of both streams:   4414  35.2

The first row shows one FTP disk-to-disk copy.
The second row shows one FTP disk-to-memory copy (null device NL:).
The 3rd and 4th rows show two simultaneous FTP disk-to-NL: copies from two disks.

We get 35 Mb/s FTP throughput with the sender still 25% idle.

The bottleneck seems to be the disk read I/O and probably the network
round-trip delay (window size).
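
If the window is the limit, the ceiling is just window size divided by
round-trip time.  A quick way to check (the window and RTT below are
placeholders; plug in the UCX window size and a measured round trip):

    /* TCP throughput ceiling imposed by the offered window:
     *     max throughput = window_size / round_trip_time
     * The inputs below are examples, not measurements.       */
    #include <stdio.h>

    int main(void)
    {
        double window_bytes = 32768.0;   /* e.g. a 32 kB window (placeholder)    */
        double rtt_sec      = 0.005;     /* e.g. a 5 ms round trip (placeholder) */

        printf("window-limited ceiling: %.1f Mb/s\n",
               window_bytes * 8.0 / rtt_sec / 1.0e6);
        return 0;
    }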

greg