
Conference turris::digital_unix

Title: DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice: Welcome to the Digital UNIX Conference
Moderator: SMURF::DENHAM
Created: Thu Mar 16 1995
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 10068
Total number of notes: 35879

9116.0. "RAW devices on Datawarehousing solution" by VAXRIO::LEO () Tue Mar 11 1997 17:46

    Hi,                                           
    
    I read a lot of stuff comparing raw devices with advfs.
    
    It seems clear to me that raw devices are a less manageable solution
    than advfs but should perform better in most cases.
    
    I am facing a very big Datawarehousing bid. We are competing against
    all the main RISC vendors such as SUN, IBM, Pyramid, HP and so on. But
    SUN seems to be our real competitor on that.
    
    The real customer database has 700 GB and uses Oracle 7.3.2.3.
    
    We are going to use Oracle VLM and BOBs (Big Oracle Blocks - 32KB) and 
    RAID 3 in order to get the best performance. We are using raw devices
    as well.
    
    Our solution is based on more than 2GB SGA (Oracle Shared Global Area
    where all the database caches are located).
    
    SUN is claiming that Supercaching (which is implemented using a small
    SGA and very big UFS caches) is a better solution than VLM. I was not
    able to find SUN's Supercaching results in Datawarehousing environments.
    I know that with this kind of solution they will not be able to use RAW
    devices and will be double buffering (SGA and UFS).
    
    Bottom line, the customer is interested in knowing the advantages of
    raw devices over ufs, especially in terms of performance. If we list
    enough reasons to use raw devices instead of ufs, the customer will
    eliminate SUN's highly proprietary solution (it only works with Oracle
    and is not a real 64-bit implementation) from the bid.
    
    It would be great because we will be able to sell 4*8400 immediately.      
    
	Could you please help me get more info about the performance
    advantages of raw devices over ufs?
    
    	And about VLM versus SUN's Supercaching?
    
        Thanks in advance,
    
    	Regards,
    
    	Leo
    	Digital Technical Support 
    	Digital - Brazil
    
    	
T.R  Title  User  Personal Name  Date  Lines
9116.1  SSDEVO::ROLLOW  Dr. File System's Home for Wayward Inodes.  Tue Mar 11 1997 23:53  28
	One notable performance advantage of raw device is that
	it avoids a data copy compared to our file system imple-
	mentations.  When data is read and written to a file
	system, the data is read into a kernel buffer (part of
	the UBC) and then copied to user space.  The raw device
	interface moves the data directly to the user space.  I
	don't recall if memory mapped files avoid this copy.

	Some vendors have file systems that allow direct I/O
	to user space (SGI), but I'm not sure Sun does.

	I/O to raw devices also gives more control over data
	organization.  UFS tries hard to spread very large
	files out across the file system, though it can easily
	create large chunks of contiguous space (16 MB).  AdvFS 
	tries hard to keep files contiguous, but unless the 
	underlying domain has lots of free space, it will still 
	be a bit fragmented.  With a raw device, the application
	has complete control.

	In bypassing the file system, the application has a little
	better control of data caching, since the file system
	layer won't be doing any.

	While some file system implementations may do read-ahead
	which can offer excellent sequential read performance, a 
	well written application using raw devices can get the
	same result.
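
A minimal sketch of what such application-level read-ahead could look like. The function name, chunk size, and queue depth are illustrative assumptions, not anything from Oracle or Digital UNIX. It runs against an ordinary file; a real database would issue its reads against the raw device and follow that device's alignment rules.

```python
import os
import queue
import tempfile
import threading


def read_with_readahead(path, chunk_size=64 * 1024, depth=4):
    """Yield chunks of `path` in order while a helper thread reads ahead.

    Hypothetical helper for illustration: up to `depth` chunks are fetched
    before the consumer asks for them, which is the essence of read-ahead.
    """
    chunks = queue.Queue(maxsize=depth)  # bounded, so prefetch stays limited
    done = object()                      # sentinel marking the end of the data

    def producer():
        try:
            # buffering=0 returns an unbuffered file object, so Python does
            # not add a user-space buffer of its own.
            with open(path, "rb", buffering=0) as f:
                while True:
                    data = f.read(chunk_size)
                    if not data:
                        break
                    chunks.put(data)
        finally:
            chunks.put(done)             # always wake up the consumer

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = chunks.get()
        if item is done:
            return
        yield item


if __name__ == "__main__":
    # Small self-check on a temporary file.
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    with open(demo_path, "wb") as out:
        out.write(b"x" * 300_000)
    total = sum(len(c) for c in read_with_readahead(demo_path))
    print("bytes read:", total)
    os.remove(demo_path)
```

The bounded queue is the design choice that matters here: it keeps the reader a fixed distance ahead of the consumer instead of pulling the whole file into memory.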

9116.2  a couple of comments  ALFAM7::GOSEJACOB  Wed Mar 12 1997 05:54  29
    re .0
    Well, unfortunately the answer to the UFS/raw device question is not
    exactly black and white. In general, Oracle databases on top of raw
    devices tend to perform better. The simple explanation is double
    buffering: if you are using UFS (or AdvFS) you need to allocate
    memory for the buffer cache, and you are simply taking that amount
    of memory away from the Oracle SGA.
    
    Things to remember:
    
    Will your application actually exploit a large SGA? E.g. when you do
    full table scans of tables much larger than the SGA, the Oracle kernel
    will have to go out to the disks anyway.
    
    In some cases (especially with small Oracle blocks of 2 or 4 KB)
    storing the datafiles on UFS may perform better than raw devices,
    because chances are that UFS will cluster the large number of
    small I/O's into fewer, bigger I/O's.
    
    Certain parts in the Unix kernel (LSM for sure; maybe others) limit I/O
    sizes to 64KB in which case using BOB's even on LSM raw volumes will not
    produce I/O's larger than this 64KB limit.
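
The last two points can be sketched with a little arithmetic. This is only an illustration: the function names are made up, the clustering shown is a generic merge of adjacent requests rather than what UFS actually does internally, and the eight-block read in the usage note is an assumed example size. Only the 64KB limit and the 4KB and 32KB block sizes come from the discussion above.

```python
KB = 1024


def coalesce(requests):
    """Merge adjacent (offset, length) requests into fewer, larger ones."""
    merged = []
    for offset, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # This request starts exactly where the previous one ends.
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


def split_at_limit(offset, length, limit=64 * KB):
    """Break one request into pieces no larger than `limit` bytes."""
    pieces = []
    while length > 0:
        size = min(length, limit)
        pieces.append((offset, size))
        offset += size
        length -= size
    return pieces


if __name__ == "__main__":
    small = [(i * 4 * KB, 4 * KB) for i in range(16)]  # sixteen 4KB blocks
    print(len(small), "requests become", len(coalesce(small)))
    big = split_at_limit(0, 8 * 32 * KB)               # eight 32KB blocks
    print("one 256KB read becomes", len(big), "I/Os")
```

With these numbers, sixteen adjacent 4KB requests collapse into a single 64KB request, while a 256KB read made of eight 32KB blocks is cut into four 64KB pieces. A single 32KB block is under the limit and passes through as one I/O.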
    
    Just a couple of thoughts. I just wanted to point out what SUN might
    use as arguments. In any case if the SUN Supercaching (which I'm not
    really familiar with) is just "that is implemented using small SGA and
    very big UFS caches" we can always do the same thing on our machines.
                                                                         
    	Martin  

9116.3  VAXRIO::LEO  Wed Mar 12 1997 08:02  30
    Hi,
    
    Thank you for the prompt replies.
    
    As Martin said in the previous reply, we can always try the same
    approach that SUN is using on Datawarehousing opportunities, i.e. a
    small SGA and very big ufs caches.
    
    But my main question is still there. Is this kind of approach better
    than using VLM (big SGA) and raw devices?
    
    We are using bitmap indexes and parallel queries but we are still
    having full table scans over big fact and aggregate tables. Some of these
    tables have more than 8GB.
    
    Right now we are using 5GB of SGA (6GB of RAM).
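
The concern about repeated full scans of a table larger than the cache can be illustrated with a toy simulation. It assumes a plain LRU cache, which is a simplification: Oracle's real buffer cache replacement policy differs, and the block counts are scaled down so that 8 and 5 stand in for the 8GB table and the 5GB SGA. The function name is hypothetical.

```python
from collections import OrderedDict


def scan_hit_rate(table_blocks, cache_blocks, scans=3):
    """Hit rate of repeated sequential scans through a plain LRU cache."""
    cache = OrderedDict()                  # keys ordered from oldest to newest
    hits = accesses = 0
    for _ in range(scans):
        for block in range(table_blocks):
            accesses += 1
            if block in cache:
                hits += 1
                cache.move_to_end(block)   # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict the least recently used
    return hits / accesses


if __name__ == "__main__":
    print("table larger than cache:", scan_hit_rate(8, 5))
    print("table fits in cache:    ", scan_hit_rate(4, 5))
```

Under this simple model, a table larger than the cache gets no hits at all on repeated scans, because each block is evicted before the next scan reaches it. That holds regardless of whether the cache is an SGA or a file system cache.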
    
    What is the best approach for this kind of environment ? VLM or SUN's
    Supercaching ?
    
    Thanks once more,
    
    Best regards,
    
    Leo
    Digital Technical Support 
    DEC Brazil
    
    
    

9116.4  SMURF::DENHAM  Digital UNIX Kernel  Wed Mar 12 1997 17:24  10
    An up-and-coming alternative to raw i/o in these situations is
    the "direct I/O" feature new to advfs. This bypasses the buffer
    cache, essentially letting the application do its own buffering,
    etc.
    
    A nice feature of this is that libaio will be able to do fast
    kernel aio through advfs, i.e., no threads required for 
    asynchrony. I believe this is a Steel feature, but enough
    demand might pull it back to pt minor. That's pure conjecture
    on my part, of course.

9116.5  UTRUST::PILMEYER  Questions raise the doubt  Thu Mar 13 1997 05:46  9
    Oracle has better knowledge of what's in the database than the file
    system. So at least in theory it should be better to have Oracle cache
    the data (VLM) than to let the file system take care of it (UBC).
    
    However in real life I've seen situations where Oracle isn't handling
    large SGA's very well. Oracle clearly still needs to mature in this
    area.
    
    -Han

9116.6  LEXSS1::GINGER  Ron Ginger  Fri Mar 14 1997 16:49  6
    Does Digital UNIX really still copy buffers from the buffer cache to
    user space, as suggested in .1?
    
    I just finished a unix internals course and the instructor was quite
    sure all data passing was just buffer memory pointers, not copying
    contents of buffers.