| There is a correction to the INETDRIVER which will be applied next week
to fix a memory allocation problem. I would expect that this fix could
improve the performance, but by how much, I cannot say.
Do you have any idea what the server is doing? How is it using the
sockets? [I don't have this information and can only rely on you to
provide it...]
|
| RE: 5389.2 by LASSIE::GEMIGNANI
> The new INETDRIVER appears to run at near DECnet speeds,
> as reported by the test site.
Thanks John for your help on INETDRIVER. Your questions
helped us to find the path to the solution.
The solution came from a similar performance case, on a UNIX system,
where a "TCP LOOPBACK" mode is available. We used that feature on UCX,
in order to "short-cut" the data transfers at the TCP transport level.
The references I found are:
- Note UCX 1368.1: << When loopback is enabled, both system-wide and by
the involved applications, TCP connections within a single host
will utilize a special internal code path which has been optimized
for performance. >>
- Management Command Ref. page 2-139 "SET PROTOCOL /LOOPBACK":
System-wide setup for high-performance local-host interprocess communication.
- System Services and C Socket Programming page 5-50 "UCX$C_USELOOPBACK":
Specific application setup at socket level.
- Note UCX 1139.1: << Avoid OOB, and Buffers Larger than Quota >>
and << UCX Set Proto TCP /Quota={rec,send}:200000 >>
Our on-site test results are below.
If anybody can confirm these figures, or offer advice about
the parameters we used, we would be very happy to hear it.
Best regards,
Pierre-Etienne
..........................................
!!! We almost reach DECnet performance !!!
..........................................
The following information was gathered over the phone and is therefore
still approximate.
I - $UCX Set Proto TCP /NOloopback (First step = initial status)
> Something to look for: The INETDRIVER runs at IPL 6, and UCX runs
> at IPL 8. Can you tell me how much time the systems are spending
> at each of these IPLs?
With:
$UCX Set Comm /Non_Ucx_Buffers=Free:100 (=> Show Comm displays 44)
-------------------------------------------------------
IPL            : 00 : 02 : 03 : 04 : 08 : 11 : 15 :
-------------------------------------------------------
% of S0 Space  : 07 : 02 : 54 :    : 31 : 02 :    :  (of a 200% bi-proc.)
% of Int Stack :    :    :    : 32 : 45 :    : 22 :
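As a rough cross-check of the table above, the sketch below adds the two rows per IPL. It assumes (this is my reading, not something the test site confirmed) that both rows are additive shares of the same 200% two-processor total. The names `s0_space`, `int_stack` and `combined_share` are purely illustrative. Note that the reported table has no IPL 6 column, so the INETDRIVER share cannot be read from it.

```python
# Figures copied from the table above; blank cells are treated as 0.
# ASSUMPTION: both rows are additive shares of the same 200% (two-CPU) total.
IPLS = [0, 2, 3, 4, 8, 11, 15]
s0_space = {0: 7, 2: 2, 3: 54, 8: 31, 11: 2}   # "% of S0 Space" row
int_stack = {4: 32, 8: 45, 15: 22}             # "% of Int Stack" row


def combined_share(ipl: int) -> int:
    """Sum of both rows for one IPL, in percent of the 200% total."""
    return s0_space.get(ipl, 0) + int_stack.get(ipl, 0)


total = sum(combined_share(ipl) for ipl in IPLS)

for ipl in IPLS:
    print(f"IPL {ipl:02d}: {combined_share(ipl):3d}%")
print(f"Total accounted for: {total}% of 200%")
```

Under that assumption, IPL 8 (where UCX runs) accounts for 76% of the 200% total, and the rows together cover 195%.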
> Also, if this is an MP, can you tell me what the MP synch time
> percentage is running?
" $ Monitor Modes " shows 0% of MP_SYNCH (NO percentage).
Also, Direct_IO is around 40 /second (See II.1 below).
II - $UCX Set Proto TCP /LOOPBACK (Last step = final status)
With:
......................................................................
. $UCX Set Comm /Non_Ucx_Buffers=Free:100 (=> Show Comm displays 44) .
. $UCX Set Comm /Quota=(Rec:200000,SEND:200000) .
. $UCX Set Proto TCP /LOOPBACK .
......................................................................
Then, Shutdown + Restart SYBASE Data Server
$UCX Sho Dev /port=5000 /Full ! (5000 = SYBASE server port)
=> Options: Loop -> SYBASE has probably seen the TCP LOOPBACK
switch and has therefore set the
UCX$C_USELOOPBACK option?
Results: The test elapsed time fell from 60 minutes to ~5 minutes
========
[II.1] Direct_IO stays around 700 to 800 IO/second instead of ~40!
[II.2] The elapsed time was between 5 and 6 minutes for the test
that took 5 min with DECnet!!! <===***Great***
The customer is very happy, and is going to confirm that
after running a few other tests this week.
III - Action Plan:
One question remains: which parameter led to such an improvement?
(NB: The customer noticed in a previous test that
/Non_Ucx_buffers=free=100 reduced the elapsed time by 30%,
and reduced KERNEL mode from ~50% to ~30%.)
=> The customer will experiment a little with these UCX parameters.
=> Also, if you have an idea ... thanks in advance ;-).
|
| RE: 5389.3 by SOS6::DIETZ
-< Set Proto TCP /LOOPBACK => INETDRIVER = DECnet's speed >-
Hello,
The customer is completely satisfied with:
$UCX Set Comm /Non_Ucx_Buffers=Free:255 (=> Show Comm confirms 255)
(maximum Non UCX Buffers)
+ AUTOGEN because of the NPAGEDYN pool taken by the above command,
=>
Another 20 to 30% of elapsed time saved versus the 5-minute DECnet
batch test!!!
Conclusion: They will use this solution (and abandon their
comparison tests with TGV Multinet ;-) ).
NB: Correction to reply 5389.3.
When I wrote
> $UCX Set Comm /Non_Ucx_Buffers=Free:100 (=> Show Comm displays 44)
it was actually
$UCX Set Comm /Non_Ucx_Buffers=Free:300
that was entered by the customer.
EXPLANATION:
Decimal 300 => Hexadecimal 12C
Masking with Hex FF => 2C => Decimal 44
Analysis:
$ UCX Set Comm /Non_Ucx_Buffers=Free:n
"n" value is in the range [1,255], and is set to 10 by default.
Any value greater than 255 is masked with hexadecimal %XFF;
for example, Free:1000 => %D1000 = %X03E8
Masking with %XFF => %X00E8 => %D0232
and $ucx Set Comm /Non_Ucx_Buffers=Free:1000
$ucx sho comm => Non UCX buffers = 232
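The masking described above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: `effective_free_buffers` is a hypothetical name, and the assumption that UCX keeps just the low byte is inferred from the two observed cases (300 => 44, 1000 => 232), not taken from the UCX sources.

```python
# ASSUMPTION: UCX keeps only the low 8 bits of the requested value,
# as inferred from the observed Show Comm output (300 -> 44, 1000 -> 232).
BUFFER_MASK = 0xFF


def effective_free_buffers(requested: int) -> int:
    """Value expected from 'Show Comm' if only the low byte is kept."""
    return requested & BUFFER_MASK


for requested in (100, 255, 300, 1000):
    masked = effective_free_buffers(requested)
    print(f"Free:{requested:<4} = %X{requested:04X} -> %X{masked:04X} = {masked}")
```

Values inside [1,255] pass through unchanged. An exact multiple of 256 would mask to 0, which falls outside the stated range; what UCX does in that case is not covered in this thread.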
Rgds,
Pierre-Etienne
|
| RE: 5389.5 by LASSIE::GEMIGNANI -< I'm curious ... >-
.5> What is the final time?
DECnet Initial test => 05 minutes
UCX Initial test => 60 minutes
UCX + TCP Loopback + Non_Ucx_Buffers:Free=44 => 05 minutes
UCX + TCP Loopback + Non_Ucx_Buffers:Free=255 => 04 minutes !!!
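For reference, the timings above can be turned into speedup factors with a short Python sketch. The figures are the approximate minutes quoted in this thread; the helper names `speedup` and `percent_saved` are illustrative only.

```python
# Approximate elapsed times (minutes) as reported in this thread.
timings_min = {
    "DECnet initial": 5,
    "UCX initial": 60,
    "UCX + loopback, Free=44": 5,
    "UCX + loopback, Free=255": 4,
}


def speedup(before: float, after: float) -> float:
    """How many times faster 'after' is than 'before'."""
    return before / after


def percent_saved(before: float, after: float) -> float:
    """Elapsed time saved, as a percentage of 'before'."""
    return 100 * (before - after) / before


ucx_initial = timings_min["UCX initial"]
for label, minutes in timings_min.items():
    print(f"{label:<26} {minutes:>2} min  "
          f"{speedup(ucx_initial, minutes):4.1f}x vs UCX initial")

decnet = timings_min["DECnet initial"]
best = timings_min["UCX + loopback, Free=255"]
print(f"Saved vs DECnet: {percent_saved(decnet, best):.0f}%")
```

With these rounded figures, the 4-minute run is 15 times faster than the initial UCX test and saves 20% against DECnet, which sits at the low end of the "20 to 30%" reported earlier.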
.5> Also, what were the MultiNet timings?
TGV Initial test => 60 minutes
Therefore, they gave up on it, since they would have had to tune
TGV Multinet as they did for UCX, and to learn TGV as well...
PS: UCX Set Conf Comm /Non_UCX_Buffers:Free=n
What is the unit of the "Non_UCX_Buffers" count?
(More precisely, if they are 'preallocated' buffers,
how can we pre-allocate "non UCX buffers" if the size is not known?)
Thanks John, and best regards,
Pierre-Etienne
|