
Conference turris::digital_unix

Title:DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

9065.0. "protocol stack tuning for NFS on large servers?" by MEOC02::JANKOWSKI () Fri Mar 07 1997 04:53

    I am looking for advice for tuning of the network protocol
    stack for large SMP servers.
    
    I read the Tuning manual and there is nothing there.
    
    The specific situation that I have is this:
    
    The customer has two Sables that each mount filesystems from the
    other. These are 3-CPU, 2 GB systems used for a large Ada development
    project. The network is squeaky clean - FDDI.
    We are using 4% of the available bandwidth.
    There are no CPU, memory or I/O bottlenecks on either machine.
    
    With all of that the customer still gets NFS timeouts (they use
    soft mounts).
    
    I looked at all the stats and the only suspicious thing I noticed was:
    
    netstat -p udp
    ............
    6 full sockets
    
    I would have read this as the stack complaining that sockets cannot be
    obtained. But how can that be if there is plenty of memory left?
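
    (For reference, the client-side RPC retransmit and timeout counters
    can be watched with nfsstat; the field names below are from memory
    and may differ slightly on V3.2C:)

        # on the NFS client
        nfsstat -c
        # watch "timeout", "retrans" and "badxid": timeouts with a low
        # badxid count suggest requests/replies are being lost, while a
        # high badxid count suggests the server is just responding slowly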
    
    I am aware of note 6850.1 and I also recommended hard mounts.
    I have not yet resorted to increasing the number of retries.
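
    (For what it's worth, a sketch of a hard-mount fstab entry with a
    longer timeout and more retransmissions - the options are the
    standard NFS mount(8) options, and the server/path names are
    placeholders:)

        # /etc/fstab - hard, interruptible, background-retried NFS mount;
        # timeo is in tenths of a second ("sable2:/proj" is made up)
        sable2:/proj  /proj  nfs  rw,hard,intr,bg,timeo=30,retrans=5  0  0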
    
    Any comments?
    
    DU is V3.2C - no patches. (The customer cannot move up yet.)
    
    Regards,
    
    Chris Jankowski
    Melbourne Australia
    
    
9065.1. "Some ideas" by NETRIX::"[email protected]" (Dave Cherkus) Fri Mar 07 1997 07:32
The 'full sockets' count is pretty darn low so I doubt it's an
issue.  It means that the operating system didn't schedule a
thread/process quickly enough to drain the socket.  You don't
say if it's the client or the server, so it could be an NFS server
thread, a client biod thread, etc.  You could patch the
kernel variable 'udp_recvspace' to be something like 63k,
but again I doubt this is the core issue.
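
A minimal sketch of that patch, assuming the usual dbx kernel-debugging
procedure and the default kernel paths (check the current value first;
the exact syntax here is from memory):

    # as root, attach dbx to the running kernel
    dbx -k /vmunix /dev/mem
    (dbx) print udp_recvspace
    (dbx) assign udp_recvspace = 64512   # ~63k, running kernel only
    (dbx) patch udp_recvspace = 64512    # writes /vmunix so it survives a reboot
    (dbx) quit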

Is the system cycle-starved?

The only other explanation I can see for the timeouts is that
the 2100s' PCI buses are so busy that the network adapter is not being
serviced often enough, i.e. the system is bus-starved.

Are the FDDIs on EISA or PCI?   Is there a lot of local disk IO
in addition to the NFS IO?  Which disk controllers are in use?
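
(A couple of quick checks for those, for what it's worth - nothing here
is specific to this configuration:)

    # interface error/drop counters: non-zero Ierrs/Oerrs on the FDDI
    # interface would point at the adapter or the bus rather than NFS
    netstat -i
    # rough view of the local disk traffic alongside the NFS load
    iostat 5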

Have you installed patch OSF350-248 (big bag o' network performance
fixes)?

Dave
[Posted by WWW Notes gateway]
9065.2. "What would udp_recvspace = 63k do?" by MEOC02::JANKOWSKI () Mon Mar 10 1997 07:08
    Re: .1
    
    The socket full message is seen on the client side.
    
    The client would very occasionally reach 100% CPU.
    
    FDDI is on the PCI.
    The SCSI controller is a SWXCR - a 3-channel PCI RAID controller
    running RAID 5 with an optimal disk configuration across the channels.
    The I/O rates are on the order of 40/s across the controller,
    with occasional higher bursts.
    
    I have recommended the networking patches, so that may help too.
    
    Question:
    
    What would setting of udp_recvspace = 63k do?
    
    Thanks and regards,
    
    Chris Jankowski
    Melbourne Australia
9065.3. "udp_recvspace" by NETRIX::"[email protected]" (Dave Cherkus) Mon Mar 10 1997 15:25
> What would setting of udp_recvspace = 63k do?

It would let more data sit in the receiver's socket buffer.
This may or may not be what you want to happen, because the
way most IP protocols learn that there is a problem is via
the loss of data - it may be good to just let the data be
dropped so the server will throttle back.
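
If you do bump it, comparing the same counters before and after should
show whether it actually helped - roughly:

    # before the change, and again after a day or so of normal load:
    netstat -p udp    # is the "full sockets" count still climbing?
    nfsstat -c        # are the client timeout/retrans counts still climbing?
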
[Posted by WWW Notes gateway]