| A sparse file is one which has offsets into it at which no data has yet
been written. For example, the pseudocode below creates a new file, seeks
8192 bytes into it, and writes 100 bytes there. That would create a "hole"
of size 8192 starting at offset zero.
fd = creat("myfile", 0644);
lseek(fd, 8192, SEEK_SET);
write(fd, buf, 100);
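As a runnable counterpart to the pseudocode above, here is a minimal Python sketch of the same seek-then-write sequence. The helper name make_file_with_hole, the temporary directory, and the contents of the 100-byte buffer are assumptions made for illustration; they are not part of the original note.

```python
import os
import tempfile

def make_file_with_hole(path, hole_size=8192, payload=b"x" * 100):
    """Create `path`, skip `hole_size` bytes, then write `payload` there."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Nothing is written for offsets 0 .. hole_size-1.
        os.lseek(fd, hole_size, os.SEEK_SET)
        os.write(fd, payload)
    finally:
        os.close(fd)

def read_all(path):
    with open(path, "rb") as f:
        return f.read()

demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "myfile")
make_file_with_hole(demo_path)
data = read_all(demo_path)
print(len(data))
```

Reading the hole back returns zero bytes even though none were written. Whether the file system really avoids allocating blocks for the hole depends on the file system in use.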
|
| A quick description of what "Sparse" and "Zero filled"
mean:
When you write to a file you normally think of appending
data. But in some cases, you instead choose to lseek()
some large amount (to avoid actually taking up disk blocks
for space that you don't expect to ever fully populate) and
then resume writing. This leaves a "hole" between the two
points of data. "Zero filled" would populate the area
between the two points of data with 0's, i.e. "fill in the
gap".
AdvFS represents the "hole" by a special null map
entry in the .tag entry for the file. The UFS file
system performs a similar function within the inode table.
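To make the difference concrete, the following Python sketch writes the same logical contents twice, once leaving a hole and once filling the gap with zeros, and then prints the logical size and the allocated space of each. The function names, the 1 MiB gap, and the assumption that st_blocks counts 512-byte units are choices made for this sketch. It does not show how AdvFS or UFS record holes internally.

```python
import os
import tempfile

GAP = 1024 * 1024  # distance between the two pieces of data (arbitrary)

def write_sparse(path, head, tail, gap=GAP):
    """Write head, seek over the gap, write tail: leaves a hole."""
    with open(path, "wb") as f:
        f.write(head)
        f.seek(gap, os.SEEK_CUR)
        f.write(tail)

def write_zero_filled(path, head, tail, gap=GAP):
    """Write head, explicit zero bytes for the gap, then tail."""
    with open(path, "wb") as f:
        f.write(head)
        f.write(b"\0" * gap)
        f.write(tail)

def allocated_bytes(path):
    # Assumption: st_blocks is in 512-byte units, as on most Unix systems.
    return os.stat(path).st_blocks * 512

workdir = tempfile.mkdtemp()
sparse_path = os.path.join(workdir, "sparse")
filled_path = os.path.join(workdir, "filled")
write_sparse(sparse_path, b"start", b"end")
write_zero_filled(filled_path, b"start", b"end")

for p in (sparse_path, filled_path):
    print(p, os.path.getsize(p), allocated_bytes(p))
```

Both files read back identically and report the same length. On a file system that supports holes, the first one should show far less allocated space.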
Why all the fuss?
Certain applications either require sparse files or, worse,
cannot handle them. Oracle is one such application.
You must know which utilities to use to provide proper
file types for these applications to function, without
unintentionally changing the file while performing backups.
SOLUTION :
O/S VER  Command          File Type (input)       File Type (output)
-------  ---------------  -----------------       ------------------
3.X      vdump/vrestore   Zero filled       ----> Zero filled
Note1:   vdump/vrestore   Sparse            ----> Zero filled
         cp               Zero filled       ----> Zero filled
         cp               Sparse            ----> Zero filled
Note2:   cpio (pax)       Zero filled       ----> Sparse
Note2:   cpio (pax)       Sparse            ----> Sparse
Note2:   tar (pax)        Zero filled       ----> Sparse
Note2:   tar (pax)        Sparse            ----> Sparse
         dd conv=sparse   Zero filled       ----> Sparse
The above were tested on V3.2B, V3.2C, V3.2D-1. The vdump
and vrestore utilities were also tested on an ADVFS clone-
set.
NOTE1:
------
In V3.x of Digital UNIX, the "vdump/vrestore" commands did
not properly restore the sparse file. Instead all the
"holes" were changed to ZERO filled blocks on the physical
disk during the vrestore. This was corrected in UNIX V4.0
(Reference: Release Notes, 4.8.3.7, page 4-35).
NOTE2:
------
In V3.x of Digital UNIX, the "pax" based "tar/cpio" commands
incorrectly restore all files as "sparse". The workaround
for this problem is to install the "tar/cpio" commands in
the OSFOBSOLETE3xx subset. This was also corrected in
UNIX V4.0.
O/S VER  Command          File Type (input)       File Type (output)
-------  ---------------  -----------------       ------------------
4.X      vdump/vrestore   Zero filled       ----> Zero filled
         vdump/vrestore   Sparse            ----> Sparse
         cp               Zero filled       ----> Zero filled
         cp               Sparse            ----> Zero filled
         cpio             Zero filled       ----> Zero filled
         cpio             Sparse            ----> Zero filled
         tar              Zero filled       ----> Zero filled
         tar              Sparse            ----> Zero filled
         dd conv=sparse   Zero filled       ----> Sparse
The above were tested on V4.0A (Rev 464) + ADVFS. The vdump
and vrestore utilities were also tested on an ADVFS clone-
set.
Other Utilities:
----------------
         Sterling tar     Zero filled       ----> Zero filled
         Sterling tar     Sparse            ----> Zero filled
         Gnu Tar (gtar)   Zero filled       ----> Zero filled
         Gnu Tar (gtar)   Sparse            ----> Zero filled
The "cp" command can be used to un-sparse a file. It checks
for "holes", and in writing the output file expands them to
be ZERO filled.
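The effect described for "cp" can be imitated with a plain read/write loop: reading across a hole returns zeros, and writing those zeros out allocates real blocks. The sketch below is only an illustration of that effect, not the actual implementation of the Digital UNIX "cp"; the function name unsparse_copy and the file names are hypothetical.

```python
import os
import tempfile

def unsparse_copy(src, dst, bufsize=64 * 1024):
    """Copy src to dst, writing every byte, including zeros read from holes."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(bufsize)
            if not chunk:
                break
            # Zeros that came from a hole are written out as real data.
            fout.write(chunk)

# Demo: a file with a 256 KiB hole followed by a short record.
tmp = tempfile.mkdtemp()
src_path = os.path.join(tmp, "with_hole")
dst_path = os.path.join(tmp, "zero_filled")
with open(src_path, "wb") as f:
    f.seek(256 * 1024)
    f.write(b"record")
unsparse_copy(src_path, dst_path)
```

The copy has the same length and contents as the source, but every byte of it was explicitly written.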
The "dd" command can be used to sparse a file. With
conv=sparse it checks each block for all zeroes, and on
output calls "lseek()" to skip such blocks rather than
writing them. Generally, this is the way to put the "holes"
back in the file.
"dd" example:
# compress file
# tar xvf /dev/rmtnh
# dd if=xx.Z | compress -dc | dd of=xx conv=sparse
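The block-skipping behaviour described for "dd conv=sparse" can be sketched in Python as follows. This is a simplified illustration under assumptions (a fixed 512-byte block size, the hypothetical function name sparse_copy, made-up file contents), not the real "dd" source. Each all-zero block is skipped with a seek instead of being written, and the file is extended at the end in case it finishes with skipped zeros.

```python
import os
import tempfile

def sparse_copy(src, dst, blocksize=512):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(blocksize)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(blocksize)
            if not block:
                break
            if block == zero_block[:len(block)]:
                # Skip the block: this leaves a hole in the output.
                fout.seek(len(block), os.SEEK_CUR)
            else:
                fout.write(block)
        # If the input ended with skipped zeros, set the final length.
        fout.truncate(fout.tell())

# Demo: a zero-filled input with data, a run of zeros, data, trailing zeros.
tmp = tempfile.mkdtemp()
filled = os.path.join(tmp, "xx.filled")
resparsed = os.path.join(tmp, "xx")
with open(filled, "wb") as f:
    f.write(b"head" + bytes(100_000) + b"tail" + bytes(5_000))
sparse_copy(filled, resparsed)
```

The output reads back identically to the input. How much disk space is saved depends on the file system and on how the zero runs line up with its allocation units.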
The "ls -sl" command can be used to determine whether a file
is sparse or not. The "s" parameter displays physical disk
usage in 1024 byte units. A zero filled file will display
a number that when multiplied by 1024 is very close to the
file size displayed. The sparse file will display
a disk usage number significantly smaller.
Examples:
#ls -ls !Zero filled file
1032 -rw-r--r-- 1 .. .. 1048582 Nov 14 09:23 sparse_file
(1032 * 1024) = 1056768 bytes
#ls -ls !sparse file
8 -rw-r--r-- 1 .. .. 1048582 Nov 14 09:23 sparse_file
(8 * 1024) = 8192 bytes
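The same comparison can be automated. The Python sketch below applies the arithmetic from the two listings above and offers a helper that reads the numbers from os.stat(). The function names, the 0.9 threshold, and the assumption that st_blocks is reported in 512-byte units are choices made for this sketch; treat the result as a heuristic only.

```python
import os

def looks_sparse(size_bytes, allocated_kb, slack=0.9):
    """True if allocated space is well below the logical size.

    `slack` is an arbitrary threshold chosen for this sketch.
    """
    return allocated_kb * 1024 < size_bytes * slack

def check_file(path):
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units; convert to 1024-byte units.
    return looks_sparse(st.st_size, st.st_blocks * 512 // 1024)

# The two listings above, both with a file size of 1048582 bytes:
zero_filled = looks_sparse(1048582, 1032)  # 1032 * 1024 = 1056768
sparse = looks_sparse(1048582, 8)          # 8 * 1024 = 8192
print(zero_filled, sparse)
```

A call such as check_file("sparse_file") would apply the same test to a file on disk.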
|