Conference lassie::ucx

Title:DEC TCP/IP Services for OpenVMS
Notice:Note 2-SSB Kits, 3-FT Kits, 4-Patch Info, 7-QAR System
Moderator:ucxaxp.ucx.lkg.dec.com::TIBBERT
Created:Thu Nov 17 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:5568
Total number of notes:21492

5383.0. "INETDRIVER believed to cause excessive MPSYNCS" by BACHUS::KNEUTS (Erwin Kneuts DTN 856-8726) Thu Mar 27 1997 05:18

Hello,

I'm posting this one on behalf of my VMS support colleague,
Jan Vercammen.

Fasten your seatbelts, as it is tough reading.


OpenVMS VAX V6.2 - VAX 7760 (member of a heterogeneous cluster)
UCX 3.3 ECO11

While running several concurrent batch jobs that access a Sybase database
located on a Digital UNIX system over TCP/IP on FDDI, we observe large
amounts (up to 300%) of MP synchronization (multiprocessing synchronization)
activity.

This activity causes a slowdown of these batches and all other system
activity.
        
Evaluation of performance (DECps) figures and PC samples indicates that the
MP synch is related to excessive allocations and deallocations from variable
nonpaged pool.

Affected processes spend up to 95% of their time in kernel mode, mostly at
IPL 11.  A process is either waiting on the pool spinlock, because a
competing process holds the lock while working through EXE$ALLOCATE or
EXE$DEALLOCATE, or it is working through these routines itself.

It is unclear for the moment why the nonpaged pool allocations go to the
variable space.

Some possible causes (we have not yet been able to discriminate between
them):
1) REMQTI timeout on the secondary interlock, even though it retries 9000
   times.
2) The preallocated (lookaside) list needed is always empty?  Not sure how
   this could happen, as there is plenty of variable pool free.
3) The packet size requested is larger than the maximum handled by the
   preallocated lists (see the sketch after this list).

The system is currently booted with the _MIN versions of the system
execlets; a reboot with the _MON versions is scheduled today.  We will try
to capture relevant data this evening.
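
To make cause 3 concrete, here is a minimal C sketch of the allocation
decision as we understand it: requests that fit one of the fixed-size
preallocated (lookaside) lists are popped with an interlocked queue
instruction, while anything larger falls through to the variable-length
pool, which requires the POOL spinlock.  This is illustrative only, not
the actual executive code; the 5120-byte threshold and the routine name
are assumptions based on the figures quoted later in this note.

    #include <stdio.h>
    #include <stdbool.h>

    #define MAX_LOOKASIDE_BYTES 5120   /* largest preallocated packet (assumed) */

    /* Model of the path a nonpaged-pool request of 'size' bytes takes. */
    static const char *pool_path(unsigned size, bool lookaside_empty)
    {
        if (size <= MAX_LOOKASIDE_BYTES && !lookaside_empty)
            return "lookaside list (interlocked REMQxI, no POOL spinlock)";
        /* Oversized request, or the needed list is empty: allocate from
           variable pool while holding the POOL spinlock at IPL 11 - the
           place where the competing processes are seen spinning.        */
        return "variable pool (POOL spinlock held)";
    }

    int main(void)
    {
        /* The INETDRIVER request later seen in SDA: hex 2000 = 8192 bytes. */
        printf("8192-byte CXB   : %s\n", pool_path(8192, false));
        /* A request that still fits a preallocated list, for comparison.   */
        printf("1024-byte packet: %s\n", pool_path(1024, false));
        return 0;
    }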

The slowdown of all business processes endangers timely execution
of critical parts of these processes.

After enabling the full-checking system execlets, we took some statistics
(SHOW POOL/RING, SHOW POOL/STAT, SHOW SPIN/FULL) and observed a huge demand
for packets of size 8192 bytes coming from the INETDRIVER (indeed above
5120 bytes, and as such going to variable-length pool, which could explain
what we are seeing...).
   Packet Adr  Size  Type  Subtype  Caller's PC  Routine called     Entry Adr
    8A0AA780   8192   CXB     0      8A6912C6    EXE$DEANONPAGED    88AC0050
                                         |
                                         --- INETDRIVER calling EXE$ALONPAGVAR

As such, we strongly suspect that the INETDRIVER is the cause of this high
synch time.

The question which remains to BE answered quickly:

	WHY is the INETDRIVER allocating packets (CXBs - complex buffers, I
	assume) of size 8192 bytes?

	I sincerely think it is NOT a hardcoded value inside UCX.

	So this could then be either a UCX internal parameter, or some
	value passed at the application level.
	In the code (seen in SDA) calling the variable pool allocation
	routine, we indeed see an allocation request for hex 2000 bytes
	(= 8192 decimal):
INETDRIVER+01E17:  MOVZWL  #2000,R1                ; R1 = ^X2000 = 8192 bytes requested
INETDRIVER+01E1C:  BISL2   #-80000000,R1           ; set bit 31 of R1
INETDRIVER+01E23:  JSB     @#EXE$DEBIT_BYTCNT      ; charge the byte-count quota
INETDRIVER+01E29:  BLBC    R0,INETDRIVER+01EA9     ; low bit clear = failure, take error path
INETDRIVER+01E2C:  JSB     @#EXE$ALONONPAGED       ; allocate 8192 bytes of nonpaged pool
INETDRIVER+01E32:  BLBC    R0,INETDRIVER+01EB2     ; allocation failure path
...
	So FROM THE SOURCE LISTINGS it should be very easy to determine
	where they get this value from... (BUT I assume we don't have
	them, do we...?)

	This is WHAT we need to know now, as soon as possible.

5383.1. "some clarification" by BACHUS::VANHAVERE (Willy Van Havere 8786) Thu Mar 27 1997 08:00
    
    Hi,
    
    I was supporting Jan on site yesterday, and I think my colleague
    slipped in a little error.
    
    The MOVZWL #2000,R1 indicates that IT IS a hardcoded value.
                                       -----
    
    So the question now becomes why this value was chosen and whether we
    can patch it to a lower value.

    If lowering the value is not possible, the only route I see is changing
    the allocation strategy in the driver, e.g. working with a private pool
    of 8 KB packets (a sketch of that idea follows).
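
    A minimal sketch of that private-pool idea, assuming we could change or
    wrap the driver's allocation path: the 8 KB packets are carved once at
    initialization and recycled on a private free list, so the per-$QIO path
    no longer touches variable nonpaged pool.  The names, the packet count,
    and the use of malloc with a plain list are illustrative only; a real
    driver would carve the packets from pool once and manage them with
    interlocked queues and proper synchronization.

        #include <stdio.h>
        #include <stdlib.h>

        #define PKT_BYTES 8192            /* matches the hardcoded hex 2000       */
        #define PKT_COUNT 64              /* assumed I/O concurrency, tune to fit */

        typedef struct pkt { struct pkt *next; } pkt_t;

        static pkt_t *free_list;          /* private free list of 8 KB packets    */

        static int pool_init(void)        /* one-time carve at driver init        */
        {
            for (int i = 0; i < PKT_COUNT; i++) {
                pkt_t *p = malloc(PKT_BYTES);
                if (p == NULL)
                    return 0;
                p->next = free_list;
                free_list = p;
            }
            return 1;
        }

        static pkt_t *pkt_get(void)       /* per-$QIO: pop, no POOL spinlock      */
        {
            pkt_t *p = free_list;
            if (p != NULL)
                free_list = p->next;
            return p;                     /* NULL means the pool is exhausted     */
        }

        static void pkt_put(pkt_t *p)     /* at I/O completion: push back         */
        {
            p->next = free_list;
            free_list = p;
        }

        int main(void)
        {
            if (!pool_init())
                return 1;
            pkt_t *p = pkt_get();         /* would back one $QIO's 8 KB buffer    */
            printf("got packet at %p\n", (void *)p);
            pkt_put(p);
            return 0;
        }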
    
    
5383.2. by UCXAXP::GEMIGNANI Fri Mar 28 1997 18:37
    
    I know what the problem is ...
    
    The INETDRIVER allocates and deallocates a stack for each $QIO; the 8K
    is the size of that stack allocation.  I can make a change for V4.0,
    V4.1, and the next release, but I cannot provide a fix for V3.3, as it
    is no longer supported (this is case CFS.50024, which I just received
    today regarding this problem).
    
    The customer should actually upgrade to V4.0 or better.
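
    For anyone sizing the impact: with a per-$QIO allocate/deallocate of
    that 8 KB stack, the number of variable-pool operations (and hence POOL
    spinlock acquisitions) scales directly with the $QIO rate.  A
    back-of-the-envelope sketch, using an assumed request rate rather than
    a figure measured on this system:

        #include <stdio.h>

        int main(void)
        {
            const double qio_per_sec      = 2000.0;  /* assumed aggregate $QIO rate */
            const double pool_ops_per_qio = 2.0;     /* one alloc + one dealloc     */

            /* Each operation serializes on the POOL spinlock at IPL 11,
               which is what shows up as MP synchronization time.         */
            printf("POOL spinlock acquisitions/sec: %.0f\n",
                   qio_per_sec * pool_ops_per_qio);
            return 0;
        }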