
Conference lassie::ucx

Title:DEC TCP/IP Services for OpenVMS
Notice:Note 2-SSB Kits, 3-FT Kits, 4-Patch Info, 7-QAR System
Moderator:ucxaxp.ucx.lkg.dec.com::TIBBERT
Created:Thu Nov 17 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:5568
Total number of notes:21492

5389.0. "UCX vs DECnet performance, INETDRIVER SYBASE application" by PRSSOS::DIETZ (Pierre-Etienne DIETZ, Support/France) Fri Mar 28 1997 12:21

Problem: A SYBASE application, on an Alpha 2100A, is running at 
         least 4 to 10 times slower with UCX than with DECnet.
         (It is being ported from an RDB application on a VAX 7000.)
                *** cross posted on ORAREP::NOMAHS::SQL ***
For example:
          +------------------------------------------+   Client transactions:
Config :  !  DataBase Server          Cobol Client <---- Reads 4500 lines
--------  !  SYBASE V11.0.2                 !        !        from a file
          !         !    <- SRI QIO$ ->     !        !  => 4 to 5 SQL requests
          !     INETDRIVER             INETDRIVER    !        per line
          !         !                       !        !  => ~20000 requests
          !         !----<--        --->----!        !
          !                 UCX V4.1                 !
          !                 VMS 6.2                  !
          !         Alpha 2100A Biprocessor          !
          +------------------------------------------+
  Performance figures:                         Batch Elapsed Time:
              DECnet => ~10 SQL requests /sec (completion in  5 minutes)
              UCX    => ~0.5 SQL request /sec (completion in 60 minutes)!!!
                                                             ===========
- Tests also run on a similar configuration, with similar results:
    SYBASE V10.0.2
    UCX V4.0 ECO 3
    AS 2100A  

Analysis : the following has been checked:
---------

o  The client process is running ...
    (If UCX)    50% of its time in Kernel Mode (pretty high)
    (if DECnet) 25% of its time in Kernel Mode (~normal)
o  UCX proto TCP quota increase (4K -> 32K) does not improve the link.
o  No MBUF Waits or Drops.
o  No Device_socket "SEND" Buffer or I/O waits 

Questions:
---------
- INETDRIVER Setup: Given that the QIOs issued to the INETDRIVER have to be 
    run a second time on the BGDRIVER, is there some tuning available?
- Kernel Mode activity: Is that level of Kernel Mode reducible? 
- Is there any other reference to such an application with good performance, 
    so that we can compare against it?

Thanks a lot for any advice, best regards
Pierre-Etienne
5389.1. by UCXAXP::GEMIGNANI, Fri Mar 28 1997 18:43
    There is a correction to the INETDRIVER which will be applied next week
    to fix a memory allocation problem.  I would expect that this fix could
    improve the performance, but by how much, I cannot say.
    
    Do you have any idea what the server is doing?  How is it using the
    sockets?  [I don't have this information and can only rely on you to
    provide it...]
    
    
    
5389.2. by LASSIE::GEMIGNANI, Mon Apr 14 1997 19:04
    
    The new INETDRIVER appears to run at near DECnet speeds, as reported by
    the test site.
5389.3. "Set Proto TCP /LOOPBACK => INETDRIVER = DECnet's speed" by SOS6::DIETZ (Pierre-Etienne DIETZ, Support/France) Tue Apr 15 1997 05:35
RE: 5389.2 by LASSIE::GEMIGNANI 
>   The new INETDRIVER appears to run at near DECnet speeds, 
>   as reported by the test site.

	Thanks John for your help on INETDRIVER. Your questions 
        helped us to find the path to the solution.

The solution came from a similar performance case on a UNIX system,
where a "TCP LOOPBACK" mode is available. We used that feature on UCX 
to short-circuit the data transfers at the TCP transport level.

The references I found are:
  - Note UCX 1368.1: << When loopback is enabled, both system-wide and by 
      the involved applications, TCP connections within a single host 
      will utilize a special internal code path which has been optimized 
      for performance. >>
  - Management Command Ref. page 2-139 "SET PROTOCOL /LOOPBACK":
      System-wide setup for high-performance local-host interprocess traffic.
  - System Services and C Socket Programming page 5-50 "UCX$C_USELOOPBACK":
      Application-specific setup at the socket level (see the sketch below).
  - Note UCX 1139.1: << Avoid OOB, and Buffers Larger than Quota >>
                and  << UCX Set Proto TCP /Quota={rec,send}:200000 >>
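
For what it is worth, here is a minimal sketch of what the socket-level
setup might look like from a C client. It is NOT taken from this note or
from SYBASE code: it assumes a BSD-style socket library in which the
loopback option is requested with setsockopt() and SO_USELOOPBACK, whereas
the manual page cited above documents the UCX item as UCX$C_USELOOPBACK,
so the exact option name, level, and interface ($QIO option list vs.
socket library) should be verified there. Port 5000 is the SYBASE server
port mentioned later in this note.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #ifndef SO_USELOOPBACK
    #define SO_USELOOPBACK 0x0040   /* historical BSD value; an assumption here */
    #endif

    /* Hedged sketch: ask for the optimized loopback path on a TCP socket
       before connecting to a server on the same host. */
    int connect_local(unsigned short port)
    {
        int s, on = 1;
        struct sockaddr_in sin;

        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0)
            return -1;

        /* Request the loopback fast path (assumed equivalent of
           UCX$C_USELOOPBACK; not fatal if the option is unsupported). */
        if (setsockopt(s, SOL_SOCKET, SO_USELOOPBACK,
                       (char *)&on, sizeof on) < 0)
            perror("setsockopt SO_USELOOPBACK");

        memset(&sin, 0, sizeof sin);
        sin.sin_family      = AF_INET;
        sin.sin_port        = htons(port);              /* e.g. 5000 */
        sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* 127.0.0.1 */

        if (connect(s, (struct sockaddr *)&sin, sizeof sin) < 0) {
            close(s);
            return -1;
        }
        return s;
    }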

Below are our site test results.
  If anybody can confirm such metrics, or offer advice 
  about the parameters we used, we would be very happy to read your remarks.
Best regards,
Pierre-Etienne

       ...............................................
       !!!  We almost reach DECnet performance  !!!
       ...............................................

The following information was gathered over the phone, and is therefore 
  still approximate. 


I - $UCX Set Proto TCP /NOloopback (First step = initial status)

>  Something to look for:  The INETDRIVER runs at IPL 6, and UCX runs
>  at IPL 8.  Can you tell me how much time the systems are spending
>  at each of these IPLs?  

  With:
  $UCX Set Comm /Non_Ucx_Buffers=Free:100 (=> Show Comm displays 44)
  --------------------------------------------------
  IPL           : 00 : 02 : 03 : 04 : 08 : 11 : 15 : 
  --------------------------------------------------
  % of S0 Space : 07 : 02 : 54 :    : 31 : 02 :    : (of a 200% bi-proc.)
  % of Int Stack:    :    :    : 32 : 45 :    : 22 :

>  Also, if this is an MP, can you tell me what the MP synch time 
>  percentage is running?

   " $ Monitor Modes " shows 0% of MP_SYNCH (NO percentage).
   Also, Direct_IO is around 40 /second (See II.1 below).

II - $UCX Set Proto TCP /LOOPBACK (Last step = final status)

  With:
 ......................................................................
 . $UCX Set Comm /Non_Ucx_Buffers=Free:100 (=> Show Comm displays 44) .
 . $UCX Set Comm /Quota=(Rec:200000,SEND:200000)                      .
 . $UCX Set Proto TCP /LOOPBACK                                       .
 ......................................................................

  Then, Shutdown + Restart SYBASE Data Server

  $UCX Sho Dev /port=5000 /Full   ! (5000 = SYBASE server port)
   => Options: Loop   -> Probably SYBASE has seen the TCP LOOPBACK
                         switch, and has therefore set the 
                         UCX$C_USELOOPBACK option? 

  Results: The test elapsed time fell from 60 minutes to ~5 minutes
  ========
   [II.1] The Direct_IO rate stays around 700 to 800 IOs/second instead of ~40 !
   [II.2] The elapsed time was between 5 and 6 minutes for the test
            that took 5 min with DECnet!!!                   <===***Great***
          The customer is very happy, and is going to confirm that 
            after running a few other tests this week.

III - Action Plan:

  Still, a question: Which parameter led to such an improvement? 
    (NB: The customer noticed in a previous test that 
     /Non_Ucx_buffers=free=100 reduced the elapsed time by 30%, 
     and reduced the KERNEL mode time from ~50% to ~30%).
  => The customer will try to play a little with these UCX parameters.
  => Also, if you have an idea ... thanks in advance ;-).
5389.4. "/Non_Ucx_Buffers=Free:n boosted to 255 => 20% speed increase" by PRSSOS::DIETZ (Pierre-Etienne DIETZ, Support/France) Mon Apr 21 1997 09:25
RE: 5389.3 by SOS6::DIETZ 
    -< Set Proto TCP /LOOPBACK => INETDRIVER = DECnet's speed >-
	
	Hello,

The customer is completely satisfied with:

  $UCX Set Comm /Non_Ucx_Buffers=Free:255 (=> Show Comm confirms 255)
    (maximum Non UCX Buffers)
  + AUTOGEN, because of the extra NPAGEDYN pool consumed by the above command, 
  => 
  Another 20 to 30% of elapsed time saved vs. the (5-minute) DECnet 
    Batch Test!!!

Conclusion: They will use this solution (and abandon their 
  comparison tests with TGV Multinet ;-) ).

NB: Correction to reply 5389.3. 
    Where I wrote 
    >    $UCX Set Comm /Non_Ucx_Buffers=Free:100 (=> Show Comm displays 44)
    it was actually 
       $UCX Set Comm /Non_Ucx_Buffers=Free:300
    that the customer entered.

  EXPLANATION:
    Decimal 300 	=> Hexadecimal 	12C
    Masking with Hex FF =>		 2C 	=> Decimal 44

    Analysis:
    $ UCX Set Comm /Non_Ucx_Buffers=Free:n
    The "n" value is in the range [1,255], and defaults to 10.

    Any value greater than 255 is masked with hexadecimal %XFF; 
      for example, Free:1000 => %D1000 = %X03E8 
                  Masking with            %XFF => %X00E8 => %D0232
      and  $ucx Set Comm /Non_Ucx_Buffers=Free:1000
           $ucx sho comm                  => Non UCX buffers = 232 
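
  As an aside, the truncation above is simply a bitwise AND with %XFF
  (i.e. the value is kept modulo 256). The small C snippet below is mine,
  not taken from the customer's system; it just reproduces the arithmetic
  (300 => 44, 1000 => 232):

    #include <stdio.h>

    /* Illustration only: low-byte truncation applied to
       /Non_Ucx_Buffers=Free:n values greater than 255. */
    int main(void)
    {
        const int requested[] = { 100, 255, 300, 1000 };
        size_t i;

        for (i = 0; i < sizeof requested / sizeof requested[0]; i++) {
            int effective = requested[i] & 0xFF;  /* masking with %XFF */
            printf("Free:%-5d => effective value %d\n",
                   requested[i], effective);
        }
        return 0;
    }
    /* Prints: Free:100 => 100, Free:255 => 255,
               Free:300 => 44,  Free:1000 => 232 */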

Rgds,
Pierre-Etienne
5389.5. "I'm curious ..." by LASSIE::GEMIGNANI, Tue Apr 22 1997 12:14
    What is the final time?  Also, what were the MultiNet timings?
5389.6. "Final timings + What's a Non_UCX_Buffers?" by sos6.evt.dec.com::DIETZ (Pierre-Etienne DIETZ, Support/France) Wed Apr 23 1997 05:26
RE: 5389.5 by LASSIE::GEMIGNANI -< I'm curious ... >-
.5>   What is the final time?  
      DECnet Initial test                            =>  05 minutes
      UCX Initial test                               =>  60 minutes
      UCX + TCP Loopback + Non_Ucx_Buffers:Free=44   =>  05 minutes
      UCX + TCP Loopback + Non_Ucx_Buffers:Free=255  =>  04 minutes !!!

.5>   Also, what were the MultiNet timings?
      TGV Initial test                               =>  60 minutes
      Therefore, they gave up: they would have had to tune TGV Multinet 
      as they did UCX, and to learn TGV as well...

PS: UCX Set Conf Comm /Non_UCX_Buffers:Free=n
    What exactly does "Non_UCX_Buffers" count?
    (More precisely, if they are 'preallocated' buffers, 
     how can "non UCX buffers" be pre-allocated if their size is not known?)

Thanks John, and best regards,
Pierre-Etienne