
Conference turris::digital_unix

Title:DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

9952.0. "Poor vrestore performance" by NNTPD::"[email protected]" (Gianfrancesco Scaglioni) Tue May 27 1997 06:15

Hi,
I am Gianfrancesco Scaglioni, NSIS Milan Italy.
A large account in Italy has a critical problem with vrestore performance.
vdump performance is OK: 3 hours for a 44 GB domain.
vrestore takes 19 hours to put the data back from the tape (TZ887).
The customer's system configuration is the following:
CPU: 4000 5/466
DIGITAL UNIX 3.2G (SAP R/3 requires this version).
They use a single AdvFS domain for the SAP R/3 software and data.
This AdvFS domain is based on 4 RAID-5 volumes controlled by 2 HSZ40s
(dual-redundant configuration). The RAID volumes contain 6 disks each;
3 volumes are RZ28M based and one is RZ29 based.
The chunk size of the three RZ28M volumes is 409 blocks; the last volume has a
chunk size of 256. Write-back cache is disabled. HSZ40 firmware revision is
V30Z-2, HW B02.
I made some tests on site to see the problem. The I/O rate was around 0.7
MB/sec, a little better with tar (0.77 MB/sec) on the same amount of data.
I made a vrestore test to an internal disk (/usr); vrestore performance was
1.24 MB/sec (this would lead to a 10-hour restore time, the minimum acceptable
for the customer).
During the tests the system was running in production; the primary swap area
was on the /usr filesystem.
Tests were made from tape to disk with 290 MB and 853 MB filesets.
Is this normal RAID-5 behavior?  Can we reduce the restore time to half of
what it takes now?
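
For reference, the arithmetic behind those times (a rough Python sketch, using
the 44 GB domain size and the rates measured above):

# Rough check of the restore times quoted above (Python used only as a
# calculator; sizes and rates are the ones measured on site).
GB = 1024.0 ** 3
MB = 1024.0 ** 2
domain = 44 * GB

for label, mb_per_s in [("vrestore to RAID-5", 0.70),
                        ("tar to RAID-5",      0.77),
                        ("vrestore to /usr",   1.24)]:
    hours = domain / (mb_per_s * MB) / 3600.0
    print("%-20s %4.2f MB/s -> %4.1f hours for 44 GB" % (label, mb_per_s, hours))

# Rate needed to finish 44 GB in 10 hours:
print("needed for 10 hours:  %4.2f MB/s" % (domain / MB / (10 * 3600.0)))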

			Thanks in advance for any help 

					Gianfrancesco 
[Posted by WWW Notes gateway]
9952.1. "No surprise." by SSDEVO::ROLLOW (Dr. File System's Home for Wayward Inodes.) Tue May 27 1997 10:20
	RAID-5, in all but the most ideal conditions, has generally
	poor write performance.  When you hobble it by turning off
	the write cache, this performance is likely to get even
	worse.  And, in the case of your RAIDs, the large chunk
	sizes make large-write optimizations even less likely.

	The first thing I would do is turn on the write cache for
	the logical unit having the restore done.  That should have
	some positive effect.  After that it gets harder.  If this
	kind of write load is going to be common, the customer needs
	to think again about using RAID-5.

	Typically, a RAID-5 write requires reading the old copy of
	the data and the corresponding XOR.  From these, the new
	XOR is calculated, and then the new data and XOR can be
	written.  This is called read-modify-write and requires
	four I/Os for every write.  Implementations that do work
	to close the write hole may do even more I/O to keep track
	of the state, though those I/Os will tend to be small.
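
	As a purely illustrative sketch of that sequence (Python; the
	chunk contents are made up, only the XOR update rule matters):

# Illustrative sketch of a RAID-5 read-modify-write for one small write
# (chunk contents are made up; XOR shown on byte strings).
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

old_data   = b"\x11" * 8   # 1. read the old copy of the data chunk
old_parity = b"\x55" * 8   # 2. read the corresponding XOR (parity) chunk
new_data   = b"\x22" * 8   # the data the host wants written

# new parity = old parity XOR old data XOR new data
new_parity = xor(xor(old_parity, old_data), new_data)

# 3. write the new data, 4. write the new parity -> four I/Os per host
# write, before any extra I/O done to close the write hole.
print(new_parity.hex())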

	As a write grows wider and affects more members of the array,
	approaching the stripe width, the write algorithm changes and
	it becomes more efficient to read the unaffected data to
	calculate the XOR.  It has been too long for me to remember
	what this particular write algorithm is called.  At the point
	where the write size equals the stripe width, no data has to
	be read, since the XOR can be calculated from all the data
	being written.  At that point a RAID-5 looks remarkably like
	RAID-3 and the only extra I/O needed (aside from the state
	management ones) is the one to write the XOR data.
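
	A small sketch counting the per-stripe I/Os of the two approaches
	(I believe the middle algorithm is sometimes called a reconstruct
	write; write-hole bookkeeping is ignored):

# I/Os needed to write w data chunks of one stripe on an N-member RAID-5
# (N - 1 data chunks plus one chunk's worth of parity per stripe).
def read_modify_write(w):
    # read old data + old parity, then write new data + new parity
    return (w + 1) + (w + 1)

def reconstruct_write(n, w):
    # read the untouched data chunks, then write new data + new parity
    return (n - 1 - w) + (w + 1)

N = 6  # e.g. one of the customer's 6-disk RAID-5 sets
for w in range(1, N):
    rmw = read_modify_write(w)
    rw = reconstruct_write(N, w)
    tag = "  <- full stripe: no reads at all" if w == N - 1 else ""
    print("write %d/%d data chunks: rmw=%2d I/Os  reconstruct=%2d I/Os%s"
          % (w, N - 1, rmw, rw, tag))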

	Recalling that your chunk size is 409 sectors (about 200 KB)
	and each 6-disk RAID-5 set has five data chunks per stripe,
	your stripe width is around 1 MB.  Since the HSZ has a 64 KB
	I/O size limit, you'll always be doing a read-modify-write.
	With the write cache enabled, the HSZ would have a chance to
	collect enough data to do a full-width write.
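
	The arithmetic, as a sketch (chunk size and member count taken
	from .0; 512-byte sectors assumed):

# Stripe width vs. the HSZ's 64 KB transfer size (chunk size and member
# count from .0; 512-byte sectors assumed).
SECTOR = 512
chunk_sectors = 409                      # RZ28M RAID-5 sets
data_chunks_per_stripe = 5               # 6 members, 1 chunk's worth of parity

chunk_kb  = chunk_sectors * SECTOR / 1024.0       # ~204 KB
stripe_kb = chunk_kb * data_chunks_per_stripe     # ~1 MB

print("chunk  ~ %3.0f KB" % chunk_kb)
print("stripe ~ %4.0f KB" % stripe_kb)
print("HSZ max transfer = 64 KB, so a single host write never comes close")
print("to a full stripe; without the cache it is always a read-modify-write.")
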
9952.2. "Does the cache really improve vrestore I/O?" by NNTPD::"[email protected]" (Gianfrancesco Scaglioni) Tue May 27 1997 14:07
Hi, thanks for your quick reply.

I have two more questions: 1) What happens if vrestore continuously fills the
write-back cache?  2) What happens when vrestore writes to the RAID sets?  It
continuously adds new data; it does not modify any data.

				Thanks for any help.
 
[Posted by WWW Notes gateway]
9952.3. by NABETH::alan (Dr. File System's Home for Wayward Inodes.) Tue May 27 1997 17:42
	1.  The cache will become full, at which time the controller
	    will have to flush it to allow more data into the cache.

	2.  One hopes that the data gets written to disk...

	The typical write load served by a cache such as the one in the
	HSZ is one where many of the same blocks are being written over
	and over and the cache absorbs all the writes so that it only
	needs to write infrequently to disk.  A vrestore isn't going to
	present this kind of load to a device.

	But, the other use of the cache in RAID-5 is to accept a lot
	of hopefully contiguous data, so that the controller firmware
	can collect a lot of little writes into a single large one.
	Or, to take advantage of the RAID-3 optimizations, a smaller
	set of large writes that span an entire stripe.
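
	A toy model of that coalescing, just to show the scale of the win
	(numbers from this topic; the flush policy is invented purely for
	illustration):

# Toy model of write-back coalescing (numbers from this topic; the flush
# policy is invented purely for illustration).
HOST_IO = 64 * 1024                    # largest single transfer to the HSZ
STRIPE  = 5 * 409 * 512                # ~1 MB full stripe on the RZ28M sets
N_IOS   = 1000                         # a run of sequential 64 KB host writes

# Without the write-back cache: every host write is a read-modify-write,
# i.e. 4 member I/Os each.
uncached = N_IOS * 4

# With coalescing: contiguous data is held until a full stripe is buffered,
# then flushed as 6 member writes (5 data + 1 parity); the partial stripe
# left over at the end is ignored here.
full_stripes = (N_IOS * HOST_IO) // STRIPE
cached = full_stripes * 6

print("member I/Os without cache: %d" % uncached)
print("member I/Os with coalescing: ~%d" % cached)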

	A third use of the cache, also particular to RAID-5, is to
	handle all the metadata I/O that is needed to maintain the
	array in a safe state.  This is I/O that the host never sees,
	but it can significantly affect performance if you have to
	wait on a relatively slow disk to complete a write.

	I don't know that enabling the write-back cache will help
	this particular I/O load, but I think it is a poor use of
	the feature not to enable it.  I doubt it will hurt, and
	anything could help.