
Conference turris::digital_unix

Title:DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

9065.0. "protocol stack tuning for NFS on large servers?" by MEOC02::JANKOWSKI () Fri Mar 07 1997 04:53

    I am looking for advice for tuning of the network protocol
    stack for large SMP servers.
    
    I read the Tuning manual and there is nothing there.
    
    The specific situation that I have is this:
    
    The customer has two Sables that each mount filesystems from the
    other. These are 3-CPU, 2 GB systems used for a large Ada development
    project. The network is squeaky clean - FDDI.
    We are using 4% of the available bandwidth.
    There are no CPU, memory or I/O bottlenecks on either machine.
    
    With all of that the customer still gets NFS timeouts (they use
    soft mounts).
    
    I looked at all the stats and the only suspicious thing I noticed was:
    
    netstat -p udp
    ............
    6 full sockets
    
    I would have read this as the stack complaining that sockets cannot be
    obtained. But how can that be if there is plenty of memory left?
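
    (For reference, the client-side RPC retransmit and timeout counters
    can be watched with nfsstat; the field names below are from memory
    and may differ slightly on V3.2C:)

        # on the NFS client
        nfsstat -c
        # watch "timeout", "retrans" and "badxid": timeouts with a low
        # badxid count suggest requests/replies are being lost, while a
        # high badxid count suggests the server is just responding slowly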
    
    I am aware of note 6850.1 and I also recommended hard mounts.
    I have not yet resorted to increasing the number of retries.
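
    (For what it's worth, a sketch of a hard-mount fstab entry with a
    longer timeout and more retransmissions - the options are the
    standard NFS mount(8) options, and the server/path names are
    placeholders:)

        # /etc/fstab - hard, interruptible, background-retried NFS mount;
        # timeo is in tenths of a second ("sable2:/proj" is made up)
        sable2:/proj  /proj  nfs  rw,hard,intr,bg,timeo=30,retrans=5  0  0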
    
    Any comments?
    
    DU is V3.2C - no patches. (The customer cannot move up yet.)
    
    Regards,
    
    Chris Jankowski
    Melbourne Australia
    
    
9065.1. "Some ideas" by NETRIX::"[email protected]" (Dave Cherkus) Fri Mar 07 1997 07:32
The 'full sockets' count is pretty darn low so I doubt it's an
issue.  It means that the operating system didn't schedule a
thread/process quickly enough to drain the socket.  You don't
say if it's the client or the server, so it could be an NFS server
thread, a client biod thread, etc.  You could patch the
kernel variable 'udp_recvspace' to be something like 63k,
but again I doubt this is the core issue.
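
A minimal sketch of that patch, assuming the usual dbx kernel-debugging
procedure and the default kernel paths (check the current value first;
the exact syntax here is from memory):

    # as root, attach dbx to the running kernel
    dbx -k /vmunix /dev/mem
    (dbx) print udp_recvspace
    (dbx) assign udp_recvspace = 64512   # ~63k, running kernel only
    (dbx) patch udp_recvspace = 64512    # writes /vmunix so it survives a reboot
    (dbx) quit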

Is the system cycle-starved?

The only other explanation I can see for the timeouts is that
the 2100s' PCI buses are so busy that the network adapter is not being
serviced often enough, i.e. the system is bus-starved.

Are the FDDIs on EISA or PCI?   Is there a lot of local disk IO
in addition to the NFS IO?  Which disk controllers are in use?
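
(A couple of quick checks for those, for what it's worth - nothing here
is specific to this configuration:)

    # interface error/drop counters: non-zero Ierrs/Oerrs on the FDDI
    # interface would point at the adapter or the bus rather than NFS
    netstat -i
    # rough view of the local disk traffic alongside the NFS load
    iostat 5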

Have you installed patch OSF350-248 (big bag o' network performance
fixes)?

Dave
[Posted by WWW Notes gateway]
9065.2. "What would udp_recvspace = 63k do?" by MEOC02::JANKOWSKI () Mon Mar 10 1997 07:08
    Re: .1
    
    The socket full message is seen on the client side.
    
    The client would very occasionally reach 100% CPU.
    
    FDDI is on the PCI.
    The SCSI controller is a SWXCR - a 3-channel PCI RAID controller
    running RAID 5 with an optimal disk configuration across the channels.
    The I/O rates are on the order of 40/s across the controller,
    with occasional higher bursts.
    
    I have recommended the networking patches, so that may help too.
    
    Question:
    
    What would setting of udp_recvspace = 63k do?
    
    Thanks and regards,
    
    Chris Jankowski
    Melbourne Australia
9065.3. "udp_recvspace" by NETRIX::"[email protected]" (Dave Cherkus) Mon Mar 10 1997 15:25
> What would setting of udp_recvspace = 63k do?

It would let more data sit in the receiver's socket buffer.
This may or may not be what you want to happen, because the
way most IP protocols learn that there is a problem is via
the loss of data - it may be good to just let the data be
dropped so the server will throttle back.
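
If you do bump it, comparing the same counters before and after should
show whether it actually helped - roughly:

    # before the change, and again after a day or so of normal load:
    netstat -p udp    # is the "full sockets" count still climbing?
    nfsstat -c        # are the client timeout/retrans counts still climbing?
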
[Posted by WWW Notes gateway]