
Conference turris::digital_unix

Title:DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

8785.0. "problem: UFS/UBC handling of > 32 GB files" by ALFAM7::GOSEJACOB () Tue Feb 11 1997 05:54

    I ran into this problem during a database benchmark. The major task
    during the tests was to measure the elapsed time for loading a flat
    ASCII file into the database. The largest file to load was
    37,000,000,000 bytes, so just above 32 GB in size. We experienced a
    pretty dramatic performance drop approaching the end of the load
    phase for this file: it took about 20% of the total elapsed time to
    load the last 2 GB.

    This test was run on 3.2g with the file located in a UFS. I
    started to experiment again, and here is what I found.

    I'm working on a 60 GB UFS that was created with 'newfs -i 1073741824'
    (to speed up filesystem creation, since I'm only using a very small
    number of files). I have recreated the input ASCII file with the
    same size, and I have coded a little C program which 'lseek's into
    the file and starts 'read'ing 128 KB chunks sequentially from there.

    Monitoring the disk I/O (LSM volume I/O actually) I find that when
    I start reading at byte 1 of the file I see 64 KB reads and 'vmubc'
    reports close to 100% UBC hits; exactly as I expected.

    Now when I read past 32 GB (either by starting at byte 1 or
    'lseek'ing to about 32 GB) the I/O transfer size drops down to 8 KB
    and I see close to 90% UBC misses. The read throughput drops down
    to about 20% of the throughput reading from earlier parts of the
    file. Looking at the offsets I'm reading from, this seems to happen
    at exactly the 32 GB boundary.

    Reading the whole file (on an 8 GB T'laser with ubc-max-percent = 90)
    the UBC gets filled with pages from the one large file (not very
    much else is happening on that machine in parallel). When the UBC
    is filled up completely it is flushed to about 10% of its size and
    the allocated memory is released. Filling and flushing the UBC
    takes place several times during the test.

    Now when I read past 32 GB, the algorithm that flushes the UBC is
    no longer activated. The UBC stays 100% full until I 'umount' the
    filesystem.

    Just for the fun of it I re-ran the same tests with the file
    located in an AdvFS fileset. The behavior is different: AdvFS
    produces the same I/O transfer rate throughout the whole file, but
    the elapsed time for reading the whole file is about double what I
    get with UFS.

    Another little experiment I did was using a sparse file (one block
    at the start and a couple of blocks around the 32 GB offset). I saw
    the very same behavior as described above.

    I have not yet been able to re-run the test with 4.0x.

    It looks like several UFS/UBC strategy routines stop working
    correctly with files larger than 32 GB. QAR-time, is it?

	Martin

    P.S. If anyone's interested: ask me for a copy of the test programs
8785.1. "Problem was fixed but not in V3.2x" by RUSURE::HEWITT () Tue Feb 11 1997 13:05
    There was a fix for this that went into Platinum, and it may have
    slipped through the cracks for 3.2x. I believe it was originally
    reported against 3.2C. I'll check on it, and if you'd like a patch we
    can make one available (and make sure it gets into the other
    versions).
    
    -Alex
    
8785.2. "same thing under 4.0B" by ALFAM7::GOSEJACOB () Wed Feb 12 1997 09:30
    I just ran the very same tests on the same machine, now using 4.0B,
    and I see the exact same behavior as described in .0.
    
    	Martin
8785.3. "QAR 51534" by ALFAM7::GOSEJACOB () Fri Feb 14 1997 04:34