
Conference turris::digital_unix

Title:DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:Welcome to the Digital UNIX Conference
Moderator:SMURF::DENHAM
Created:Thu Mar 16 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:10068
Total number of notes:35879

9356.0. "What is a sparse file?" by DEKVC::YONGJOONCHOI () Wed Apr 02 1997 05:53

    
    
    Hi!

    I think this is a basic question.
    While reading notes to learn how tar, cpio and dump differ, I keep
    running into the terms "sparse file" and "holes", and I can't
    figure out what they mean.  For example:

    tar and cpio are unaware of sparse files and can never determine
    where holes could be.
    But dump knows where the holes are and can reproduce them.

    Could someone please explain in more detail?

    Thanks
    
    
    
9356.1. "What sparse means..." by NETRIX::"[email protected]" (Tim Mark) Wed Apr 02 1997 07:23

A sparse file is one which has offsets into it at which no data has yet
been written.  For example, the pseudocode below creates a new file, seeks
8192 bytes into it, and writes 100 bytes there.  That would create a "hole"
of size 8192 starting at offset zero.

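fd = creat("myfile", 0644);    /* create a new, empty file */
lseek(fd, 8192, SEEK_SET);     /* move 8192 bytes in without writing */
write(fd, buf, 100);           /* the first real data lands after the hole */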
[Posted by WWW Notes gateway]
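
For anyone who wants to try this, here is a minimal C sketch of the
pseudocode in 9356.1 with error checks added.  The helper names
make_file_with_hole() and reads_as_zeros() are made up for this example,
and the 256-byte buffer limit is only there to keep it short.  The second
helper illustrates that reading from inside a hole returns zero bytes,
even though nothing was written there.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with a hole of `hole` bytes at offset
 * zero, followed by `len` bytes of 'x'.  Returns the file size or -1. */
off_t make_file_with_hole(const char *path, off_t hole, size_t len)
{
    char buf[256];
    int fd;
    off_t size;

    assert(len <= sizeof buf);          /* keep the sketch simple */
    memset(buf, 'x', len);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (lseek(fd, hole, SEEK_SET) == (off_t)-1 ||
        write(fd, buf, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    size = lseek(fd, 0, SEEK_CUR);      /* current offset == hole + len */
    close(fd);
    return size;
}

/* Return 1 if the first `n` bytes of `path` read back as zeros,
 * 0 if a non-zero byte is found, -1 on error or short file. */
int reads_as_zeros(const char *path, size_t n)
{
    char buf[512];
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while (n > 0) {
        size_t want = n < sizeof buf ? n : sizeof buf;
        ssize_t got = read(fd, buf, want);
        ssize_t i;

        if (got <= 0) {
            close(fd);
            return -1;
        }
        for (i = 0; i < got; i++) {
            if (buf[i] != 0) {
                close(fd);
                return 0;
            }
        }
        n -= (size_t)got;
    }
    close(fd);
    return 1;
}
```

Calling make_file_with_hole("myfile", 8192, 100) should give a file whose
size is 8292 bytes, of which the first 8192 read back as zeros.
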
9356.2. "Sparse and Zero filled" by VAXRIO::63116::CARDOSO () Wed Apr 02 1997 09:37

A quick description of what "Sparse" and "Zero filled"
mean:

When you write to a file you normally think of appending
data.  But in some cases, you instead choose to lseek()
some large amount (to avoid actually taking up disk blocks
for space that you don't expect to ever fully populate) and
then resume writing.  This leaves a "hole" between the two
points of data.  "Zero filled" would populate the area
between the two points of data with 0's, i.e. "fill in the
gap".
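
To make the difference concrete, here is a rough C sketch that writes the
same logical contents both ways.  write_with_gap() and logical_size() are
names invented for this example, not system calls.  Both files should
report the same size and read back the same bytes; only the number of disk
blocks actually allocated is expected to differ, and that depends on the
file system.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write "A", leave a gap of `gap` bytes, then write "B".
 * sparse != 0: lseek() over the gap, leaving a hole.
 * sparse == 0: write explicit zero bytes across the gap ("zero filled").
 * Returns 0 on success, -1 on error. */
int write_with_gap(const char *path, off_t gap, int sparse)
{
    static const char zeros[4096];      /* static storage is all zeros */
    int fd, ok;

    assert(gap >= 0);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    ok = write(fd, "A", 1) == 1;
    if (sparse) {
        /* Skip ahead without writing: this is what leaves the hole. */
        ok = ok && lseek(fd, gap, SEEK_CUR) != (off_t)-1;
    } else {
        /* Fill in the gap with real zero bytes, one chunk at a time. */
        off_t left = gap;
        while (ok && left > 0) {
            size_t n = left < (off_t)sizeof zeros ? (size_t)left : sizeof zeros;
            ok = write(fd, zeros, n) == (ssize_t)n;
            left -= (off_t)n;
        }
    }
    ok = ok && write(fd, "B", 1) == 1;
    close(fd);
    return ok ? 0 : -1;
}

/* Logical size in bytes, as "ls -l" would show it, or -1 on error. */
off_t logical_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : (off_t)-1;
}
```

With a gap of 100000 bytes, both variants should report a logical size of
100002 bytes.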

AdvFS represents the "hole" by a special null map
entry in the .tag entry for the file.  The UFS file
system performs a similar function within the inode table.

Why all the fuss?

Certain applications either require sparse files or, worse,
cannot handle sparse files.  Oracle is one such application.
You must know which utilities to use to provide proper
file types for these applications to function, without
unintentionally changing the file while performing backups.

SOLUTION :

O/S VER  Command        File Type (input)  File Type (output)
------- --------------- -----------------  ------------------
3.X     vdump/vrestore  Zero filled     ---->Zero filled
 Note1: vdump/vrestore  Sparse          ---->Zero filled

        cp              Zero filled     ---->Zero filled
        cp              Sparse          ---->Zero filled

 Note2: cpio (pax)      Zero filled     ---->Sparse
 Note2: cpio (pax)      Sparse          ---->Sparse

 Note2: tar (pax)       Zero filled     ---->Sparse
 Note2: tar (pax)       Sparse          ---->Sparse

        dd conv=sparse  Zero filled     ---->Sparse


The above were tested on V3.2B, V3.2C, V3.2D-1. The vdump
and vrestore utilities were also tested on an ADVFS clone-
set.

NOTE1:
------
In V3.x of Digital UNIX, the "vdump/vrestore" commands did
not properly restore the sparse file.  Instead all the
"holes" were changed to ZERO filled blocks on the physical
disk during the vrestore.  This was corrected in UNIX V4.0
(Reference: Release Notes, 4.8.3.7, page 4-35).

NOTE2:
------
In V3.x of Digital UNIX, the "pax" based "tar/cpio" commands
incorrectly restore all files as "sparse".  The workaround
for this problem is to install the "tar/cpio" commands in
the OSFOBSOLETE3xx subset.  This was also corrected in
UNIX V4.0.

O/S VER  Command        File Type (input)  File Type (output)
------- --------------- -----------------  ------------------
4.X     vdump/vrestore  Zero filled     ---->Zero filled
        vdump/vrestore  Sparse          ---->Sparse

        cp              Zero filled     ---->Zero filled
        cp              Sparse          ---->Zero filled

        cpio            Zero filled     ---->Zero filled
        cpio            Sparse          ---->Zero filled

        tar             Zero filled     ---->Zero filled
        tar             Sparse          ---->Zero filled

        dd conv=sparse  Zero filled     ---->Sparse

The above were tested on V4.0A (Rev 464) + ADVFS. The vdump
and vrestore utilities were also tested on an ADVFS clone-
set.

Other Utilities:
----------------
        Sterling tar    Zero filled     ---->Zero filled
        Sterling tar    Sparse          ---->Zero filled

        Gnu Tar (gtar)  Zero filled     ---->Zero filled
        Gnu Tar (gtar)  Sparse          ---->Zero filled

The "cp" command can be used to un-sparse a file.  It checks
for "holes", and in writing the output file expands them to
be ZERO filled.


The "dd" command can be used to sparse a file. It method-
ically checks for blocks full of repeating zeroes, and on
output does the "lseek()"s to avoid writing the zeroes.
Generally, this is the way to put the "holes" back in the
file.

        "dd" example:
         # compress file
         # tar xvf /dev/rmtnh
         # dd if=xx.Z | compress -dc | dd of=xx conv=sparse
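
The idea behind "dd conv=sparse" as described above can be sketched in C
roughly as follows.  This is not dd's actual source, just an illustration
of the technique: read a block, and if it is all zeroes, lseek() forward
on the output instead of writing it.  The function name sparse_copy() and
the 8192-byte block size are assumptions made for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK 8192      /* assumed block size for this sketch */

/* Return 1 if all `n` bytes at `p` are zero, else 0. */
static int all_zero(const char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Copy `src` to `dst`, seeking over all-zero blocks so they become holes.
 * Returns the logical number of bytes copied, or -1 on error. */
off_t sparse_copy(const char *src, const char *dst)
{
    char buf[BLOCK];
    off_t total = 0;
    ssize_t n;
    int in, out, ok;

    assert(src != NULL && dst != NULL);
    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (all_zero(buf, (size_t)n)) {
            /* Nothing but zeroes: skip ahead instead of writing. */
            if (lseek(out, (off_t)n, SEEK_CUR) == (off_t)-1)
                break;
        } else if (write(out, buf, (size_t)n) != n) {
            break;
        }
        total += n;
    }
    /* n == 0 means clean end of input.  If the file ends in a hole, the
     * seek alone does not extend it, so set the length explicitly. */
    ok = (n == 0) && ftruncate(out, total) == 0;
    close(in);
    close(out);
    return ok ? total : -1;
}
```

The output should have the same logical size and contents as the input.
How much disk space is actually saved depends on the file system and on
how the zero runs line up with the block size.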

The "ls -sl" command can be used to determine whether a file
is sparse or not.  The "s" option displays physical disk
usage in 1024-byte units.  A zero filled file will display
a number that, when multiplied by 1024, is very close to the
file size displayed.  A sparse file will display
a disk usage number significantly smaller.

Examples:
#ls -ls         !Zero filled file
1032 -rw-r--r--   1 .. ..   1048582 Nov 14 09:23 sparse_file
   (1032 * 1024) = 1056768 bytes


#ls -ls         !sparse file
 8 -rw-r--r--   1 .. ..   1048582 Nov 14 09:23 sparse_file
   (8 * 1024) = 8192 bytes
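
The same comparison can be made from a program with stat().  The sketch
below is only a heuristic, and the function names are invented for this
example.  It assumes st_blocks counts 512-byte units, which is the
traditional convention but not something every system guarantees.  A file
is flagged as sparse when fewer bytes are allocated than its logical size.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return 1 if fewer bytes are allocated than the logical size, else 0. */
int looks_sparse(off_t size_bytes, off_t allocated_bytes)
{
    return allocated_bytes < size_bytes;
}

/* Apply the check to a real file.
 * Assumption: st_blocks is in 512-byte units.
 * Returns 1 (looks sparse), 0 (does not), or -1 if stat() fails. */
int file_looks_sparse(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return looks_sparse(st.st_size, (off_t)st.st_blocks * 512);
}
```

With the numbers from the two listings above, looks_sparse(1048582,
1032 * 1024) is 0 because 1056768 bytes are allocated, and
looks_sparse(1048582, 8 * 1024) is 1 because only 8192 bytes are.
File systems that compress data or store small files inline may give
misleading results.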