
Conference turris::digital_unix

Title:DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

9622.0. "Flat file performance" by CHEFS::rasmodem42.reo.dec.com::WorkBenchUser () Fri Apr 25 1997 10:21

All, 

I've a customer who wants some advice on getting the ultimate performance out 
of Digital UNIX, Alpha and StorageWorks.

I've pointed him in the direction of the Performance and Tuning Guide but he's 
got some additional questions.

The application will be a mixture of Oracle DB and flat files.  The Oracle 
side is a known entity (relatively speaking) but flat file performance is not 
(given the volumes that we are talking about).

Let me briefly explain ...

Part of the process will involve an extraction from the Oracle db which 
will result in the writing of anywhere between 150 million and 300 million 
records of, say, 100 bytes to a series of flat files (this is on the 
assumed peak day, when 30 settlement runs are required for a PES with 10 
million metering systems - these are hopefully the maxima that I am 
quoting).  A number of parallel processes (say 1.5 per CPU ... assume a 
minimum of 4 CPUs) will attack this task.
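
For a rough sense of scale (my arithmetic from the figures above):

    300,000,000 records x 100 bytes  =  roughly 30 GB of flat file data
                                        on the assumed peak day, or about
                                        1 GB per settlement run

so the problem is dominated by sustained sequential throughput rather 
than random access.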

The next process will read these files back in, aggregating them into a 
series of matrices (one Supplier purchase matrix as it is called per 
settlement run). Again, the problem will be tackled by multiple parallel 
processes.  These matrices are eventually written to the Oracle db.

The basic question is: what can we do to provide optimum flat file 
performance?  In particular:

o  What options are available to minimise the number of physical I/Os and 
   the time that they take (both service & queueing time)?

o  What is the maximum size of a single physical I/O?

o  Are multi-block transfers possible?

o  Does caching buy us anything - either in the file system or in a disk 
   controller (or does the sheer volume preclude any significant benefits)?

o  How effective are caches on the disks themselves likely to be in a 
   sequential process such as this?

o  How many simultaneous I/Os can a controller support, ignoring concurrent 
   seeks?  I.e. at a given point in time, how many discrete transfers can 
   go through the controller?  One?  If more than one, please explain how.

o  How long is a controller tied up with an I/O?  Latency + data transfer 
   time, or just data transfer time?  If the latter, is some form of RPS 
   (rotational position sensing) used?

o  What advantages might the use of raw I/O bring?

o  What degradation would result from the use of RAID 5 particularly, or 
   RAID 1?


Any help or pointers please? 

9622.1. by KITCHE::schott (Eric R. Schott, USG Product Management) Fri Apr 25 1997 11:03
Hi

 I assume these flat files are written to the file system?

Are the files read/written sequentially?

If the above answers are yes, then the system should be
able to do a good job with it via AdvFS, LSM striping, and
HSZxx controllers (where xx should be 50 or 70).

You should look at the I/O tuning talk from the
symposium in http://www-unix.zk3.dec.com/symp_s97/unix.htm

The machine choice and specific I/O tests would be your next
step ... you may want to discuss this with experts like
Doug Williams from the server group.

Eric

9622.2. by NABETH::alan (Dr. File System's Home for Wayward Inodes) Fri Apr 25 1997 16:12
	The short answer is host based striping.  Across many controllers.
	If you want redundancy you may want to use array controllers
	to create mirrors or RAID-5s, but RAID-5 may not do well on
	the write side of things.

	re: max I/O size.

	It depends on the I/O subsystem and higher level components
	used.  SCSI has a maximum I/O size of 16,777,215 bytes (16 MB
	- 1).  The HSZ40/50 only allow a maximum of 64 KB.  Out of
	the box, I think UFS, AdvFS and LSM also have a 64 KB maximum,
	though it may be possible to raise it.

	re: multi-block transfers

        The answer is so trivially obvious that the question is
        confusing me.  Of course.  You can hand the raw disk interface
        or the file system a very large I/O and it will handle breaking
        that down into however many physical I/Os are needed.
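
        A minimal sketch of that point (the path and sizes are made up
        for illustration, not a recommendation): one large write(2) is
        split by AdvFS/UFS/LSM and the driver into however many
        physical I/Os the 64 KB limits require.

        /* Hand the file system one big logical I/O per call; the lower
         * layers break it into physical I/Os.  Path and sizes are
         * illustrative only. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <unistd.h>

        #define XFER  (8UL * 1024 * 1024)     /* one logical write: 8 MB */
        #define TOTAL (256UL * 1024 * 1024)   /* total written: 256 MB   */

        int main(void)
        {
            char   *buf = malloc(XFER);
            size_t  done;
            int     fd;

            fd = open("/bigdisk/extract_01.dat",
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (buf == NULL || fd < 0) {
                perror("setup");
                return 1;
            }
            memset(buf, 0, XFER);             /* stand-in record data */

            for (done = 0; done < TOTAL; done += XFER) {
                /* One call; many physical I/Os underneath. */
                if (write(fd, buf, XFER) != (ssize_t)XFER) {
                    perror("write");
                    return 1;
                }
            }
            close(fd);
            free(buf);
            return 0;
        }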

	re: Caching.

	When reading, caches usually serve two purposes:

	o  Holding frequently read data.
	o  Holding the data that is about to be read (read-ahead).

        You'll be hard pressed to find a cache big enough to hold
        these large files (assuming they're sequentially read).  On
        the other hand, UFS and I think AdvFS will do read-ahead once
        they decide you're doing sequential reads.  This can help
        performance considerably when using the file system.  Most
        modern disks also do read-ahead with their caches.  The HSZ
        family doesn't do any read-ahead that I know of (a notable
        flaw in my opinion).

	Write caching is a little different.  Even when the cache
	has filled and you can't write any faster than the cache can
        flush, there may be opportunities for the I/O subsystem to
        optimize the writes.  It may collect smaller writes done to
	the cache into larger writes, it may write data out of order
	if it knows the head may be under the right sector, etc.

	re: effectiveness of disk caches.

	Probably very.  You can use scu(8) to turn off the read-ahead
	cache on a disk.  Do a benchmark.  I suspect the difference
	will be significant.

	re: Controller questions.

	I need to think about these.  You may want to ask in the
	ASK_SSAG conference on SSAG.  Be sure to specify the controller
	of interest.

	re: Advantage of raw I/O.

        It doesn't tie up any memory for the buffer cache.  It saves a
        data copy from the cache to the user buffer.  The disadvantage
        is that it doesn't tie up any memory for the buffer cache,
        meaning you get none of the features offered by the buffer
        cache; no read-ahead unless you do it yourself.  When in doubt:

		Benchmark.
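
        One possible way to "do it yourself" is to keep one POSIX AIO
        read posted ahead of the buffer you are consuming.  This is
        only a sketch under assumptions: the device name is made up,
        64 KB just matches the HSZ limit mentioned above, and the AIO
        linking details are platform specific.

        #include <aio.h>
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <unistd.h>

        #define BUFSZ (64 * 1024)      /* matches the 64 KB HSZ limit */

        static void post_read(struct aiocb *cb, int fd, void *buf, off_t off)
        {
            memset(cb, 0, sizeof *cb);
            cb->aio_fildes = fd;
            cb->aio_buf    = buf;
            cb->aio_nbytes = BUFSZ;
            cb->aio_offset = off;
            if (aio_read(cb) != 0)
                perror("aio_read");
        }

        int main(void)
        {
            struct aiocb        cb[2];
            const struct aiocb *wait[1];
            char               *buf[2];
            off_t               off = 0;
            int                 fd, cur = 0;
            ssize_t             n;

            fd = open("/dev/rrz16c", O_RDONLY); /* raw device; made-up name */
            buf[0] = valloc(BUFSZ);             /* page-aligned for raw I/O */
            buf[1] = valloc(BUFSZ);
            if (fd < 0 || buf[0] == NULL || buf[1] == NULL) {
                perror("setup");
                return 1;
            }

            post_read(&cb[0], fd, buf[0], off); /* prime the pipeline */
            for (;;) {
                /* Post the read-ahead for the next buffer first. */
                post_read(&cb[1 - cur], fd, buf[1 - cur], off + BUFSZ);

                wait[0] = &cb[cur];
                aio_suspend(wait, 1, NULL);     /* wait for current buffer */
                n = aio_return(&cb[cur]);
                if (n <= 0)                     /* end of device or error; the
                                                 * last posted read-ahead is
                                                 * abandoned for brevity */
                    break;
                /* ... consume n bytes of buf[cur] here ... */

                off += BUFSZ;
                cur  = 1 - cur;
            }
            close(fd);
            return 0;
        }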

	re: Performance problems of RAID-5.

        In the worst case, a simple RAID-5 implementation has to do
        four I/Os to write one sector.  Fancy implementations that fix
        the write hole may do more.  Write-back caches may reduce this
        somewhat by absorbing writes to the same places.  They may
        also allow the opportunity to take separate writes to the same
        general area and combine them.

        A smart RAID-5 implementation can take very large writes and
        make them appear to be RAID-3 writes, which only require one
        additional I/O for the parity (and any housekeeping I/Os).
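
        To put rough numbers on that (my arithmetic, not a measured
        figure): a small isolated write costs read old data + read old
        parity + write new data + write new parity = 4 I/Os, a 4x
        penalty.  A full-stripe write over, say, a 4+1 RAID-5 costs
        4 data writes + 1 parity write = 5 I/Os for 4 chunks of data,
        only a 25% overhead.  The big sequential writes in .0 are the
        sort of thing a smart implementation can turn into full-stripe
        writes.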

        In the normal state, reads from a RAID-5 should look like
        striping with one extra disk.  When degraded (not in the
        normal state), RAID-5 will still be reading where anything
        else would be getting I/O errors.

	If your application can take data fast enough, you don't want
	RAID-5 anyway.  A controller based array is only as fast as
	the connection to the host.  Host based striping lets you
	spread out that I/O over more controllers and generally get
	faster I/O.
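
        As a rough illustration of why spreading over controllers
        matters (my numbers, assuming a Fast/Wide SCSI bus tops out
        around 20 MB/s of data): the ~30 GB peak-day extract in .0
        needs roughly 25 minutes of bus time just to write once
        through a single bus, before counting the read-back pass.
        Four buses striped at the host cut that to roughly six
        minutes each way.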

        re: Performance problems of RAID-1.

        Without a safe write-back cache, a mirrored write doesn't
        complete until the last member write does.  This hurts the
        write performance.  With a safe write-back cache (such as when
        using controller RAID), the controller can complete the write
        when it gets the data and get the data to the disks at its
        leisure.  The usual problems of the cache filling apply...
        And there are the usual housekeeping I/Os.

        With controller-based mirroring the host only has to send
        the data once, but that controller becomes a single point
        of failure.  You can use mirroring either at the host level
        or the controller level.  Benchmark each and balance the
        tradeoffs.  If you want some device redundancy, but also want
        the highest performance, maybe host striping over controller
        mirrors is the right answer.

9622.3. "Some controller answers." by NABETH::alan (Dr. File System's Home for Wayward Inodes) Fri Apr 25 1997 16:26
	re: Concurrent transfers.

	The simple answer is one, but...

        A pair of Fast/Wide SCSI devices can exchange one 16-bit word
        of data every clock tick, with the clock ticking at 10 MHz.
        For long transfers, not all the data may be transferred at
        once, to prevent other devices from getting starved for
        bandwidth.  I think protocol transfers use 8-bit words at
        5 MHz, which cuts into the total bandwidth available.
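
        Worked out (using the numbers above): the data phase moves
        2 bytes per tick x 10 million ticks/second = 20 MB/s peak on
        a Fast/Wide bus; if the protocol (command/status/message)
        phases really run 1 byte at 5 MHz, they proceed at 5 MB/s and
        eat a disproportionate share of bus time for small transfers.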

        So, at any discrete instant (a clock tick) only one word is
	being transferred.  Over a collection of such instants
	many transfers may appear to be in progress.  If a single
	device can't saturate the bus, then two or more devices
	may be able to use the spare cycles.

	And, there is more to being able to accept multiple commands
	than concurrent seeking.  If an individual SCSI device
        supports command queuing it can accept multiple commands,
        sort them, break them up, combine them internally, etc., to
        offer higher throughput.  In an array controller, the
	multiple commands could be completed independently since
	each device of the array can operate independently.

	re: Controller doing I/O.

	What kind of controller?  For your proposed I/O load, I don't
	think it really matters much, because you'll have saturated
	the bus long before you saturate the ability of the controller
	to handle I/Os.