|  | A sparse file is one which has offsets into it at which no data has yet
been written.  For example, the pseudocode below creates a new file, seeks
8192 bytes into it, and writes 100 bytes there.  That would create a "hole"
of size 8192 starting at offset zero.
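fd = creat("myfile", 0644);      /* creat() also takes a mode argument */
lseek(fd, 8192, SEEK_SET);       /* skip 8192 bytes without writing them */
write(fd, buf, 100);             /* the data starts at offset 8192 */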
[Posted by WWW Notes gateway]
 | 
|  | A quick description of what "Sparse" and "Zero filled"
mean:
When you write to a file you normally think of appending
data.  But in some cases, you instead choose to lseek()
some large amount (to avoid actually taking up disk blocks
for space that you don't expect to ever fully populate) and
then resume writing.  This leaves a "hole" between the two
points of data.  "Zero filled" would populate the area
between the two points of data with 0's, i.e. "fill in the
gap".
AdvFS represents the "hole" by a special null map
entry in the .tag entry for the file.  The UFS file
system performs a similar function within the inode table.
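The difference between the two file types can be sketched in C.  This is
only an illustration: the function name make_gapped_file, the 512-byte
buffers and the 'x' fill byte are made up for this example, and whether the
skipped range really takes no disk blocks depends on the file system.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` bytes of data, leave a gap of `gap` bytes, then write `len`
 * more bytes.  If zero_fill is nonzero the gap is written out as explicit
 * zeros ("zero filled"); otherwise it is skipped with lseek(), which
 * leaves a hole ("sparse").
 * Returns the final file length, or -1 on error. */
off_t make_gapped_file(const char *path, size_t len, off_t gap, int zero_fill)
{
    char data[512], zeros[512];
    off_t left, end;
    int fd;

    if (len > sizeof data)
        return -1;
    memset(data, 'x', sizeof data);
    memset(zeros, 0, sizeof zeros);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len)
        goto fail;
    if (zero_fill) {
        /* Populate the gap with real zero bytes, one buffer at a time. */
        for (left = gap; left > 0; ) {
            size_t n = left < (off_t)sizeof zeros ? (size_t)left : sizeof zeros;
            if (write(fd, zeros, n) != (ssize_t)n)
                goto fail;
            left -= (off_t)n;
        }
    } else if (lseek(fd, gap, SEEK_CUR) == (off_t)-1) {
        goto fail;
    }
    if (write(fd, data, len) != (ssize_t)len)
        goto fail;
    end = lseek(fd, 0, SEEK_CUR);   /* current offset == file length */
    close(fd);
    return end;
fail:
    close(fd);
    return -1;
}
```

Both variants give the same file length and read back the same bytes; only
the number of disk blocks allocated can differ.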
Why all the fuss?
Certain applications either require sparse files or, worse,
cannot handle sparse files.  Oracle is one such application.
You must know which utilities to use to provide proper
file types for these applications to function, without
unintentionally changing the file while performing backups.
SOLUTION :
O/S VER  Command        File Type (input)  File Type (output)
------- --------------- -----------------  ------------------
3.X     vdump/vrestore  Zero filled     ---->Zero filled
 Note1: vdump/vrestore  Sparse          ---->Zero filled
        cp              Zero filled     ---->Zero filled
        cp              Sparse          ---->Zero filled
 Note2: cpio (pax)      Zero filled     ---->Sparse
 Note2: cpio (pax)      Sparse          ---->Sparse
 Note2: tar (pax)       Zero filled     ---->Sparse
 Note2: tar (pax)       Sparse          ---->Sparse
        dd conv=sparse  Zero filled     ---->Sparse
The above were tested on V3.2B, V3.2C, V3.2D-1. The vdump
and vrestore utilities were also tested on an ADVFS clone-
set.
NOTE1:
------
In V3.x of Digital UNIX, the "vdump/vrestore" commands did
not properly restore the sparse file.  Instead all the
"holes" were changed to ZERO filled blocks on the physical
disk during the vrestore.  This was corrected in UNIX V4.0
(Reference: Release Notes, 4.8.3.7, page 4-35).
NOTE2:
------
In V3.x of Digital UNIX, the "pax" based "tar/cpio" commands
incorrectly restore all files as "sparse".  The workaround
for this problem is to install the "tar/cpio" commands in
the OSFOBSOLETE3xx subset.  This was also corrected in
UNIX V4.0.
O/S VER  Command        File Type (input)  File Type (output)
------- --------------- -----------------  ------------------
4.X     vdump/vrestore  Zero filled     ---->Zero filled
        vdump/vrestore  Sparse          ---->Sparse
        cp              Zero filled     ---->Zero filled
        cp              Sparse          ---->Zero filled
        cpio            Zero filled     ---->Zero filled
        cpio            Sparse          ---->Zero filled
        tar             Zero filled     ---->Zero filled
        tar             Sparse          ---->Zero filled
        dd conv=sparse  Zero filled     ---->Sparse
The above were tested on V4.0A (Rev 464) + ADVFS. The vdump
and vrestore utilities were also tested on an ADVFS clone-
set.
Other Utilities:
----------------
        Sterling tar    Zero filled     ---->Zero filled
        Sterling tar    Sparse          ---->Zero filled
        Gnu Tar (gtar)  Zero filled     ---->Zero filled
        Gnu Tar (gtar)  Sparse          ---->Zero filled
The "cp" command can be used to un-sparse a file.  It checks
for "holes", and in writing the output file expands them to
be ZERO filled.
The "dd" command with conv=sparse can be used to make a file
sparse again.  It checks each block for all zeroes, and on
output does "lseek()"s instead of writing those blocks.
Generally, this is the way to put the "holes" back in the
file.
        "dd" example:
         # compress file
         # tar xvf /dev/rmtnh
         # dd if=xx.Z | compress -dc | dd of=xx conv=sparse
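The zero-block check described for "dd" can be sketched in C as follows.
This is not the actual dd source, only a minimal illustration of the
technique; the name sparse_copy and the 8192-byte block size are
assumptions made for the example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK 8192   /* assumed block size for this sketch */

/* Return 1 if the n bytes at p are all zero, else 0. */
static int all_zero(const char *p, ssize_t n)
{
    ssize_t i;
    for (i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Copy src to dst, seeking over all-zero blocks instead of writing them.
 * Returns the number of bytes actually written, or -1 on error. */
long sparse_copy(const char *src, const char *dst)
{
    char buf[BLK];
    long written = 0;
    off_t total = 0;
    ssize_t n;
    int in, out;

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (all_zero(buf, n)) {
            /* Leave a hole: move the offset, write nothing. */
            if (lseek(out, n, SEEK_CUR) == (off_t)-1)
                break;
        } else {
            if (write(out, buf, n) != n)
                break;
            written += n;
        }
        total += n;
    }
    /* If the file ends in a hole, lseek() alone does not extend it,
     * so set the final length explicitly. */
    if (n == 0 && ftruncate(out, total) != 0)
        n = -1;
    close(in);
    close(out);
    return n == 0 ? written : -1;
}
```

The return value counts only the bytes that were really written, so it can
be compared against the file length to see how much was skipped.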
The "ls -sl" command can be used to determine whether a file
is sparse or not.  The "s" option displays physical disk
usage in 1024 byte units.  A zero filled file will display
a number that when multiplied by 1024 is very close to the
file size displayed.  The sparse file will display
a disk usage number significantly smaller.
Examples:
#ls -ls         !Zero filled file
1032 -rw-r--r--   1 .. ..   1048582 Nov 14 09:23 sparse_file
   (1032 * 1024) = 1056768 bytes
#ls -ls         !sparse file
 8 -rw-r--r--   1 .. ..   1048582 Nov 14 09:23 sparse_file
   (8 * 1024) = 8192 bytes
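The same comparison can be made from a program with stat().  The sketch
below is an illustration only: the function names are made up, and it
assumes st_blocks counts 512-byte units (true on most UNIX systems, but
not the 1024-byte units that "ls -s" prints above).  File systems that
compress data or store small files inline may report fewer blocks without
the file having any holes.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Pure comparison: 1 if the allocated bytes fall short of the file size.
 * `blocks` is assumed to be a count of 512-byte units, like st_blocks. */
int allocation_is_sparse(off_t size, long blocks)
{
    return (off_t)blocks * 512 < size;
}

/* Return 1 if the file at path looks sparse, 0 if not, -1 if stat() fails. */
int file_looks_sparse(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return allocation_is_sparse(st.st_size, (long)st.st_blocks);
}
```

With the figures from the listings above, 1032 KB is 2064 units of 512
bytes, which covers the 1048582-byte size, so that file is not sparse; 8 KB
is 16 units (8192 bytes), far short of 1048582, so that file is sparse.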
 |