Filesystem images in local files can be used by many PC emulators and virtual machines (user-mode Linux, QEMU and Xen, to name but three). Typically these filesystems are created as sparse files using commands like:
    dd if=/dev/zero of=fs.image bs=1024 seek=2000000 count=0
    /sbin/mke2fs fs.image
where the enormous seek value causes dd to move forward by 2GB before writing nothing at all. This results in the creation of a sparse file which takes disk space only for blocks which are actually used:
    $ ls -l fs.image
    -rw-rw-r-- 1 rmy rmy 2048001024 Apr 18 19:10 fs.image
    $ du -s fs.image
    31692 fs.image
As the filesystem is used, more and more of the non-existent blocks are filled with data and the size of the file on disk grows. Sometimes it would be nice to be able to reclaim unused blocks from a filesystem image. However, deleting files from the image doesn't return the space to the underlying filesystem: even free blocks in the image still consume space. Reclaiming the space can be achieved in two stages: first fill the unused blocks in the image with zeroes, then make the image file sparse again.
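The difference between the size a sparse file claims and the space it really occupies can be checked with a small script. This is only a sketch: the file name demo.image and the 100 MB size are made up for illustration, and it assumes GNU coreutils (stat -c and dd as used above) on a filesystem that supports sparse files.

```shell
#!/bin/sh
# Sketch: create a sparse file and compare its apparent size with the
# space it actually occupies. "demo.image" and the size are illustrative
# choices, not taken from the article. Assumes GNU dd and GNU stat.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/demo.image"

# Seek forward 100000 blocks of 1024 bytes, then write nothing (count=0).
dd if=/dev/zero of="$img" bs=1024 seek=100000 count=0 2>/dev/null

# %s is the file length in bytes; %b blocks of %B bytes are really allocated.
apparent=$(stat -c %s "$img")
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"
```

The apparent size is what ls -l reports, while the allocated size corresponds to what du reports.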
One traditional way to zero unused blocks is to create a file that fills all the free space:
    dd if=/dev/zero of=junk
    sync
    rm junk
The disadvantage of dd in this context is that it destroys any sparseness that exists: free blocks that were originally represented as holes in the image file are replaced with actual blocks containing zeroes.
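The effect is easy to reproduce outside a filesystem image. The following sketch writes explicit zeroes over part of a file that starts out as one big hole and shows the allocation growing. The file name, the sizes and the allocated_bytes helper are my own illustrative choices, and GNU coreutils (truncate, dd, stat -c) is assumed.

```shell
#!/bin/sh
# Sketch: writing zeroes over a hole turns it into real, allocated blocks.
# Names and sizes are illustrative. Assumes GNU truncate, dd and stat.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/demo.image"

# Hypothetical helper: print the number of bytes actually allocated to a file.
allocated_bytes() {
    echo $(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
}

# A 16 MiB file consisting entirely of one hole.
truncate -s 16M "$img"
before=$(allocated_bytes "$img")

# Overwrite the first 8 MiB with explicit zeroes, keeping the file length.
dd if=/dev/zero of="$img" bs=1M count=8 conv=notrunc,fsync 2>/dev/null
after=$(allocated_bytes "$img")

echo "allocated before: $before bytes"
echo "allocated after:  $after bytes"
```

The file reads back as all zeroes both before and after the dd, but only afterwards do those zeroes cost disk space.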
As an alternative approach, and as practice in mucking about with ext2 filesystems, I've written a utility which scans the free blocks in an ext2 filesystem and fills any non-zero blocks with zeroes. The source, zerofree-1.0.4.tgz, is available for download.
Better than either of these would be to have the guest kernel keep the free blocks empty. My original inspiration was the ext2fs privacy (i.e. secure deletion) patch described in a Linux kernel mailing list thread. I've also made use of a later patch for ext3 entitled Secure Deletion Functionality in ext3 from the linux-fsdevel mailing list. (See also the authors' paper on Secure Deletion File Systems.) I've modified the patches to make them more suitable for the present purpose.
With the zerofree option (added by these patches), all the blocks freed when a file is deleted are filled with zeroes. Remember, this extra work will hurt disk performance. Note that the ext3 patch doesn't support data journalling mode, so deleted metadata isn't zeroed. It also hasn't been tested as thoroughly as the patch for ext2. And neither has been maintained for a very long time.
However, the above techniques are only half the story: the empty free blocks still consume space in the underlying filesystem, so something must be done to reclaim that space.
A common suggestion is to use the sparse file handling capabilities
of the GNU
cp command to take a copy of the filesystem image with
cp --sparse=always (though this does require the original
and sparse files to exist at the same time, which may be inconvenient).
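A minimal sketch of the cp approach, using a throwaway file rather than a real filesystem image. The file names, sizes and the allocated_bytes helper are invented for the example, and GNU cp, dd and stat are assumed.

```shell
#!/bin/sh
# Sketch: copy a file full of real zero blocks with cp --sparse=always and
# compare the space used by the original and the copy.
# Names and sizes are illustrative. Assumes GNU cp, dd and stat.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
orig="$workdir/orig.image"
copy="$workdir/sparse.image"

# Hypothetical helper: print the number of bytes actually allocated to a file.
allocated_bytes() {
    echo $(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
}

# 8 MiB of zeroes written as real blocks, followed by a little real data.
dd if=/dev/zero of="$orig" bs=1M count=8 conv=fsync 2>/dev/null
printf 'some data\n' >> "$orig"

# The copy gets holes wherever the source contains runs of zero bytes.
cp --sparse=always "$orig" "$copy"
sync

orig_alloc=$(allocated_bytes "$orig")
copy_alloc=$(allocated_bytes "$copy")
echo "original: $orig_alloc bytes allocated"
echo "copy:     $copy_alloc bytes allocated"

# The contents must be identical even though the allocation differs.
cmp -s "$orig" "$copy" && echo "contents identical"
```

Once the copy has been verified it can replace the original, which is the point at which both files have to exist side by side.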
As an alternative I wrote a utility which can make any specified files on an ext2 filesystem sparse, sparsify.c. This doesn't require any additional disk space to work its magic, but it does require that the filesystem containing the filesystem image is unmounted, which is just a different sort of inconvenience.
As an example, suppose we have an unmounted filesystem
fs.image, in the directory
/data, which is the
root of the
/dev/hda2 filesystem. We can reclaim deleted
blocks and make it sparse like so:
    zerofree /data/fs.image
    umount /data
    sparsify /dev/hda2 /fs.image
    mount /data
However, I really, really don't recommend using sparsify. I don't.
If your kernel and util-linux are sufficiently modern and you have a supported
filesystem you can use
fallocate -d to 'dig holes' in a file.
This makes the file sparse in-place, without using extra disk space. It
also means you get to blame someone other than me if it doesn't work.
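Here is a rough sketch of digging holes in place, again on a throwaway file. The file name, sizes and the allocated_bytes helper are invented for the example. It assumes GNU coreutils and a util-linux fallocate that understands -d. Where the kernel or filesystem doesn't support hole punching, the script simply reports that and leaves the file alone.

```shell
#!/bin/sh
# Sketch: make a file sparse in place with fallocate -d and check that its
# contents and length are unchanged. Names and sizes are illustrative.
# Assumes GNU dd/stat and util-linux fallocate with the -d option.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/demo.image"

# Hypothetical helper: print the number of bytes actually allocated to a file.
allocated_bytes() {
    echo $(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
}

# 8 MiB of zeroes written as real blocks, followed by a little real data.
dd if=/dev/zero of="$img" bs=1M count=8 conv=fsync 2>/dev/null
printf 'some data\n' >> "$img"
sync

sum_before=$(cksum < "$img")
size_before=$(stat -c %s "$img")
alloc_before=$(allocated_bytes "$img")

# Replace runs of zero bytes with holes, without making a second copy.
if fallocate -d "$img" 2>/dev/null; then
    echo "holes dug"
else
    echo "fallocate -d is not supported here; file left unchanged"
fi
sync

sum_after=$(cksum < "$img")
size_after=$(stat -c %s "$img")
alloc_after=$(allocated_bytes "$img")

echo "allocated before: $alloc_before bytes"
echo "allocated after:  $alloc_after bytes"
```

Comparing the checksum before and after is a cheap way to confirm that only the allocation changed, not the data.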