| Hi
I assume these flat files are written to the file system?
Are the files read/written sequentially?
If the above answers are yes, then the system should be
able to do a good job with it via AdvFS, LSM striping, and
HSZxx controllers (where xx should be 50 or 70).
You should look at the I/O tuning talk from the
symposium in http://www-unix.zk3.dec.com/symp_s97/unix.htm
The machine choice and specific I/O tests would be your next
step....you may want to discuss this with experts like
Doug Williams from the server group.
Eric
|
| The short answer is host-based striping, across many controllers.
If you want redundancy you may want to use array controllers
to create mirrors or RAID-5s, but RAID-5 may not do well on
the write side of things.
re: max I/O size.
It depends on the I/O subsystem and higher level components
used. SCSI has a maximum I/O size of 16,777,215 bytes (16 MB
- 1). The HSZ40/50 only allow a maximum of 64 KB. Out of
the box, I think UFS, AdvFS and LSM also have a 64 KB maximum,
though it may be possible to raise it.
re: multi-block transfers
The answer is so trivially obvious that the question confuses
me: of course. You can hand the raw disk interface or the file
system a very large I/O and it will handle breaking that down
into however many I/Os are needed.
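Just to illustrate, the loop the lower layers do for you is
basically this (only a sketch; the 64 KB limit and the function
name are assumptions, not anything from the real code):

    #include <sys/types.h>
    #include <unistd.h>

    #define MAX_XFER (64 * 1024)   /* assumed per-I/O limit, e.g. HSZ 64 KB */

    /* Hand the whole buffer in; it gets broken into pieces no bigger
     * than MAX_XFER, the way the raw interface or file system would. */
    ssize_t big_write(int fd, const char *buf, size_t len)
    {
        size_t done = 0;
        size_t chunk;
        ssize_t n;

        while (done < len) {
            chunk = len - done;
            if (chunk > MAX_XFER)
                chunk = MAX_XFER;
            n = write(fd, buf + done, chunk);
            if (n < 0)
                return -1;         /* caller can look at errno */
            done += (size_t)n;
        }
        return (ssize_t)done;
    }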
re: Caching.
When reading, caches usually serve two purposes:
o Holding frequently read data.
o Holding the data that is about to be read (read-ahead).
You'll be hard pressed to find a cache big enough to hold
these large files (assuming they're sequentially read). On
the other hand, UFS and I think AdvFS will do read-ahead
once they decide you're doing sequential reads. This
can help performance considerably when using the file system.
Most modern disks also do read-ahead with their caches. The
HSZ family doesn't do any read-ahead that I know of (a notable
flaw in my opinion).
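To give a feel for the read-ahead part, here is a sketch of the
kind of check a file system makes on each read before it decides
to prefetch (made-up structure and function, not the real UFS or
AdvFS code):

    #include <sys/types.h>

    /* Hypothetical per-file state; not the real UFS or AdvFS structures. */
    struct seq_state {
        off_t next_expected;    /* offset we'd see if reads are sequential */
    };

    /* Called on every read.  Returns non-zero when the access pattern
     * looks sequential, i.e. when it is worth prefetching the data
     * that is about to be read. */
    int want_readahead(struct seq_state *ss, off_t offset, size_t len)
    {
        int sequential = (offset == ss->next_expected);

        ss->next_expected = offset + (off_t)len;
        return sequential;
    }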
Write caching is a little different. Even when the cache
has filled and you can't write any faster than the cache can
flush, there may be opportunities for the I/O subsystem to
optimize the writes. It may collect smaller writes done to
the cache into larger writes, it may write data out of order
if it knows the head may be under the right sector, etc.
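The coalescing part looks roughly like this (the extent structure
is made up, not any real cache's structure):

    #include <sys/types.h>

    /* Hypothetical cached-write extent, [start, end) in bytes. */
    struct extent {
        off_t start;
        off_t end;
    };

    /* If two cached writes touch or overlap, fold b into a, so one
     * larger write goes to the disk instead of two smaller ones. */
    int try_merge(struct extent *a, const struct extent *b)
    {
        if (b->start > a->end || a->start > b->end)
            return 0;           /* disjoint: leave them as two writes */
        if (b->start < a->start)
            a->start = b->start;
        if (b->end > a->end)
            a->end = b->end;
        return 1;
    }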
re: effectiveness of disk caches.
Probably very. You can use scu(8) to turn off the read-ahead
cache on a disk. Do a benchmark. I suspect the difference
will be significant.
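The benchmark doesn't need to be anything more than this (the
default path, block size and count are placeholders; run it once
with read-ahead on, turn read-ahead off with scu, run it again):

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/time.h>

    #define BLKSZ (64 * 1024)   /* assumed transfer size */
    #define COUNT 1024          /* 64 MB total; adjust for your disks */

    int main(int argc, char **argv)
    {
        /* pass a big file, or a raw device (raw I/O may want the
         * buffer and size aligned to the sector size) */
        const char *path = (argc > 1) ? argv[1] : "/bigfile";
        char *buf = malloc(BLKSZ);
        struct timeval t0, t1;
        double secs;
        int fd, i;

        fd = open(path, O_RDONLY);
        if (fd < 0 || buf == NULL) {
            perror(path);
            return 1;
        }
        gettimeofday(&t0, NULL);
        for (i = 0; i < COUNT; i++)
            if (read(fd, buf, BLKSZ) != BLKSZ) {
                perror("read");
                return 1;
            }
        gettimeofday(&t1, NULL);
        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%.1f KB/s\n", COUNT * (BLKSZ / 1024) / secs);
        return 0;
    }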
re: Controller questions.
I need to think about these. You may want to ask in the
ASK_SSAG conference on SSAG. Be sure to specify the controller
of interest.
re: Advantage of raw I/O.
It doesn't tie up any memory for buffer cache. It saves a
data copy from the cache to the user buffer. The disadvantage
is that it doesn't tie up any memory for the buffer cache,
meaning you have none of the features offered by the buffer
cache: no read-ahead unless you do it yourself. When in doubt,
benchmark.
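If you go raw and still want read-ahead, you can roll your own
with double buffering. A sketch using POSIX aio (device name and
chunk size are placeholders; error handling is skimpy):

    #include <aio.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define CHUNK (128 * 1024)  /* assumed transfer size */

    int main(int argc, char **argv)
    {
        static char buf[2][CHUNK];
        struct aiocb cb[2];
        const struct aiocb *list[1];
        off_t off = 0;
        ssize_t got;
        int cur = 0, next, fd;

        /* the device name is only an example; pass your own */
        fd = open((argc > 1) ? argv[1] : "/dev/rrz8c", O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        memset(cb, 0, sizeof cb);

        /* prime the pump: start the first read */
        cb[cur].aio_fildes = fd;
        cb[cur].aio_buf    = buf[cur];
        cb[cur].aio_nbytes = CHUNK;
        cb[cur].aio_offset = off;
        aio_read(&cb[cur]);

        for (;;) {
            /* start the read-ahead for the next chunk... */
            next = 1 - cur;
            cb[next].aio_fildes = fd;
            cb[next].aio_buf    = buf[next];
            cb[next].aio_nbytes = CHUNK;
            cb[next].aio_offset = off + CHUNK;
            aio_read(&cb[next]);

            /* ...then wait for the current one and process it */
            list[0] = &cb[cur];
            aio_suspend(list, 1, NULL);
            got = aio_return(&cb[cur]);
            if (got <= 0)
                break;          /* EOF or error */
            /* consume buf[cur][0 .. got) here */

            off += CHUNK;
            cur = next;
        }
        close(fd);
        return 0;
    }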
re: Performance problems of RAID-5.
In the worst case a simple RAID-5 implementation has to do four
I/Os to write one sector: read the old data, read the old parity,
write the new data, write the new parity. Fancy implementations
that fix the write hole may do more. Write-back caches may reduce
this some by absorbing writes to the same places. They may also
allow the opportunity to take separate writes to the same general
area and combine them.
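To spell those four I/Os out, here is a toy sketch of the
small-write path against in-memory "disks" (not any real
controller's code):

    #include <stdio.h>
    #include <string.h>

    #define NSECT 8
    #define SSZ   16    /* tiny "sector" size, just for the demonstration */

    /* toy member disks: two data disks plus one parity disk */
    static unsigned char disk[3][NSECT][SSZ];

    /* Small-write (read-modify-write) update of one sector on one
     * data disk of a RAID-5 set: 2 reads + 2 writes = 4 member I/Os. */
    static void raid5_small_write(int data_disk, int parity_disk,
                                  int sector, const unsigned char *newdata)
    {
        unsigned char olddata[SSZ], parity[SSZ];
        int i;

        memcpy(olddata, disk[data_disk][sector], SSZ);   /* 1: read old data   */
        memcpy(parity, disk[parity_disk][sector], SSZ);  /* 2: read old parity */

        /* new parity = old parity XOR old data XOR new data */
        for (i = 0; i < SSZ; i++)
            parity[i] ^= olddata[i] ^ newdata[i];

        memcpy(disk[data_disk][sector], newdata, SSZ);   /* 3: write new data   */
        memcpy(disk[parity_disk][sector], parity, SSZ);  /* 4: write new parity */
    }

    int main(void)
    {
        unsigned char block[SSZ];
        int i, ok = 1;

        memset(block, 0xAB, SSZ);
        raid5_small_write(0, 2, 3, block);
        memset(block, 0x5C, SSZ);
        raid5_small_write(1, 2, 3, block);

        /* the parity should equal the XOR of the two data disks */
        for (i = 0; i < SSZ; i++)
            ok &= (disk[2][3][i] == (disk[0][3][i] ^ disk[1][3][i]));
        printf("parity %s\n", ok ? "consistent" : "broken");
        return 0;
    }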
A smart RAID-5 implementation can take very large writes and
make them appear to be RAID-3 writes, which only require one
additional I/O for the parity (and any housekeeping I/Os).
In the normal state for reads, RAID-5 should look like striping
with one extra disk. When not in the normal state, RAID-5 will still
be reading where anything else would be getting I/O errors.
If your application can take data fast enough, you don't want
RAID-5 anyway. A controller based array is only as fast as
the connection to the host. Host based striping lets you
spread out that I/O over more controllers and generally get
faster I/O.
re: Performance problems of RAID-1.
Without a safe write-back cache, writes don't complete until
the last one does. This hurts the write performance. With
a safe write cache (such as when using controller RAID), the
controller can complete the write when it gets the data and
get the data to the disk at its leisure. The usual problems
of the cache filling apply... And there are the usual
housekeeping I/Os.
Using controller based mirroring the host only has to send
the data once, but that controller becomes a single point
of failure. You can use mirroring either at the host level
or the controller level. Benchmark each and balance the
tradeoffs. If you want some device redundancy, but want
the highest performance, maybe host striping and controller
mirroring is the right answer.
|
| re: Concurrent transfers.
The simple answer is one, but...
A pair of Fast/Wide SCSI devices can exchange one 16-bit word
of data every clock tick, with the clock ticking at 10 MHz.
For long transfers, not all the data may be transferred at
once, to prevent other devices from getting starved for
bandwidth. I think protocol transfers use 8-bit words at
5 MHz, which cuts into the total bandwidth available.
So, at any discrete instant (a clock tick) only one word is
being transferred. Over a collection of such instants
many transfers may appear to be in progress. If a single
device can't saturate the bus, then two or more devices
may be able to use the spare cycles.
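For the peak numbers, the arithmetic is trivial (arbitration and
other protocol overhead ignored):

    #include <stdio.h>

    int main(void)
    {
        /* Fast/Wide SCSI data phase: one 16-bit word per tick at 10 MHz */
        double data_bytes_per_sec  = 2.0 * 10e6;   /* = 20 MB/s peak */
        /* command/status phases: 8-bit words at 5 MHz (shared overhead) */
        double proto_bytes_per_sec = 1.0 * 5e6;    /* =  5 MB/s */

        printf("peak data rate:  %.0f MB/s\n", data_bytes_per_sec / 1e6);
        printf("protocol phases: %.0f MB/s\n", proto_bytes_per_sec / 1e6);
        return 0;
    }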
And, there is more to being able to accept multiple commands
than concurrent seeking. If an individual SCSI device
supports command queuing, it can accept multiple commands,
sort them, break them up, combine them internally, etc.,
to offer higher throughput. In an array controller, the
multiple commands could be completed independently since
each device of the array can operate independently.
re: Controller doing I/O.
What kind of controller? For your proposed I/O load, I don't
think it really matters much, because you'll have saturated
the bus long before you saturate the ability of the controller
to handle I/Os.
|