T.R | Title | User | Personal Name | Date | Lines |
---|
9116.1 | | SSDEVO::ROLLOW | Dr. File System's Home for Wayward Inodes. | Tue Mar 11 1997 23:53 | 28 |
| One notable performance advantage of raw devices is that
they avoid a data copy compared to our file system
implementations. When data is read from or written to a file
system, the data is read into a kernel buffer (part of
the UBC) and then copied to user space. The raw device
interface moves the data directly to and from user space. I
don't recall whether memory mapped files avoid this copy.
Some vendors have file systems that allow direct I/O
to user space (SGI does), but I'm not sure Sun does.
I/O to raw devices also gives more control over data
organization. UFS tries hard to spread very large
files out across the file system, though it can easily
create large chunks of contiguous space (16 MB). AdvFS
tries hard to keep files contiguous, but unless the
underlying domain has lots of free space, it will still
be a bit fragmented. With a raw device, the application
has complete control.
In bypassing the file system, the application also gets
better control of data caching, since the file system
layer won't be doing any.
While some file system implementations may do read-ahead
which can offer excellent sequential read performance, a
well written application using raw devices can get the
same result.
|
9116.2 | a couple of comments | ALFAM7::GOSEJACOB | | Wed Mar 12 1997 05:54 | 29 |
| re .0
Well, unfortunately the answer to the UFS/raw device question is not
exactly black and white. In general, Oracle databases on top of raw
devices tend to have better performance. The simple explanation for
that is double buffering: if you are using UFS (or AdvFS) you need
to allocate memory for the buffer cache, and you are simply taking that
amount of memory away from the Oracle SGA.
Things to remember:
Will your application actually exploit a large SGA? E.g. when you do
full table scans of tables much larger than the SGA, the Oracle kernel
will have to go out to the disks anyway.
In some cases (especially with small Oracle blocks of 2 or 4 KB)
storing the datafiles on UFS may show better performance than raw
devices, because chances are that UFS will cluster the large number of
small I/Os into fewer, bigger I/Os.
Certain parts of the Unix kernel (LSM for sure; maybe others) limit I/O
sizes to 64 KB, in which case using BOBs even on LSM raw volumes will not
produce I/Os larger than this 64 KB limit.
Just a couple of thoughts. I just wanted to point out what SUN might
use as arguments. In any case, if the SUN Supercaching (which I'm not
really familiar with) is just implemented using a small SGA and
very big UFS caches, we can always do the same thing on our machines.
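For what it's worth, the "very big UFS cache" half of that recipe maps onto a tunable on our side: the UBC's share of physical memory is adjustable through the vm subsystem. A sketch of a /etc/sysconfigtab fragment (attribute names as I recall them from V4.0; check sys_attrs_vm(5) for your release):

    # /etc/sysconfigtab fragment -- let the UBC grow large for the
    # small-SGA / big-file-system-cache approach (names per V4.0,
    # verify against sys_attrs_vm(5) on your system)
    vm:
            ubc-maxpercent = 70
            ubc-minpercent = 5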
Martin
|
9116.3 | | VAXRIO::LEO | | Wed Mar 12 1997 08:02 | 30 |
| Hi,
Thank you for the prompt replies.
As Martin said in the previous reply, we can always try the same
approach that SUN is using on data warehousing opportunities, e.g. a
small SGA and very big UFS caches.
But my main question is still there. Is this kind of approach better
than using VLM (big SGA) and raw devices?
We are using bitmap indexes and parallel queries, but we still get
full table scans over big fact and aggregate tables. Some of these
tables are larger than 8 GB.
We are using right now 5GB of SGA (6GB of RAM).
What is the best approach for this kind of environment ? VLM or SUN's
Supercaching ?
Thanks once more,
Best regards,
Leo
Digital Technical Support
DEC Brazil
|
9116.4 | | SMURF::DENHAM | Digital UNIX Kernel | Wed Mar 12 1997 17:24 | 10 |
| An up-and-coming alternative to raw I/O in these situations is
the "direct I/O" feature new to AdvFS. This bypasses the buffer
cache, essentially letting the application do its own buffering,
etc.
A nice feature of this is that libaio will be able to do fast
kernel AIO through AdvFS, i.e., no threads required for
asynchrony. I believe this is a Steel feature, but enough
demand might pull it back to pt minor. That's pure conjecture
on my part, of course.
|
9116.5 | | UTRUST::PILMEYER | Questions raise the doubt | Thu Mar 13 1997 05:46 | 9 |
| Oracle has better knowledge of what's in the database than the file
system. So at least in theory it should be better to have Oracle cache
the data (VLM) than to let the file system take care of it (UBC).
However, in real life I've seen situations where Oracle doesn't handle
large SGAs very well. Oracle clearly still needs to mature in this
area.
-Han
|
9116.6 | | LEXSS1::GINGER | Ron Ginger | Fri Mar 14 1997 16:49 | 6 |
| Does Digital UNIX really still copy buffers from the buffer cache to
user space, as suggested in .1?
I just finished a Unix internals course and the instructor was quite
sure all data passing was done with buffer pointers, not by copying
the contents of buffers.
|