Keeping filesystem images sparse

Filesystem images in local files can be used by many PC emulators and virtual machines (user-mode Linux, QEMU and Xen, to name but three). Typically these filesystems are created as sparse files using commands like:

   dd if=/dev/zero of=fs.image bs=1024 seek=2000000 count=0
   /sbin/mke2fs fs.image
where the enormous seek value causes dd to move forward by 2GB before writing nothing at all. This results in the creation of a sparse file which takes disk space only for blocks which are actually used:

   $ ls -l fs.image 
   -rw-rw-r--  1 rmy rmy 2048001024 Apr 18 19:10 fs.image
   $ du -s fs.image 
   31692   fs.image
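
The difference between a file's apparent size and the space it actually occupies can also be checked with stat and du. The following is a minimal sketch on a scratch file, assuming GNU coreutils; the file name and the 100MB size are illustrative only and are not taken from the example above.

```shell
#!/bin/sh
# Scratch-file demonstration; name and size are illustrative assumptions.
set -e
dir=$(mktemp -d)
img="$dir/demo.image"

# Create a 100MB sparse file: seek forward, then write nothing.
dd if=/dev/zero of="$img" bs=1024 seek=102400 count=0 2>/dev/null

# Apparent size in bytes (what ls -l reports).
stat -c 'apparent size: %s bytes' "$img"
# Space actually allocated: %b blocks of %B bytes each.
stat -c 'allocated: %b blocks of %B bytes each' "$img"

# du reports allocated space by default and apparent size on request.
du -k "$img"
du -k --apparent-size "$img"
```

On a filesystem that supports sparse files, the allocated figure should be far smaller than the apparent size until data is written into the file.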
As the filesystem is used, more and more of the non-existent blocks are filled with data and the size of the file on disk grows. Sometimes it would be nice to be able to reclaim unused blocks from a filesystem image. However, deleting files from the image doesn't return the space to the underlying filesystem: even free blocks in the image still consume space. Reclaiming the space can be achieved in two stages: first the unused blocks in the filesystem image are filled with zeroes, then the image file is made sparse again.

One traditional way to zero unused blocks is to create a file that fills all the free space:

   dd if=/dev/zero of=junk
   sync
   rm junk

The disadvantage of dd in this context is that it destroys any sparseness that exists: free blocks that were originally represented as holes in the image file are replaced with actual blocks containing zeroes. Also, filling up a live filesystem is probably a bad idea.

As an alternative approach, and as practice in mucking about with ext2 filesystems, I've written a utility which scans the free blocks in an ext2 filesystem and fills any non-zero blocks with zeroes. The source, zerofree-1.1.0.tgz, is available for download.
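
As a rough sketch of how the utility might be invoked on an image, assuming zerofree is installed and that the filesystem in the image is not mounted read-write anywhere (for example by a running virtual machine): the wrapper name zero_free_blocks is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# zero_free_blocks is a hypothetical helper name, not part of zerofree.
zero_free_blocks() {
    img=$1
    # Assumption: zerofree is on the PATH.
    if ! command -v zerofree >/dev/null 2>&1; then
        echo "zerofree is not installed" >&2
        return 1
    fi
    if [ ! -f "$img" ]; then
        echo "no such image: $img" >&2
        return 1
    fi
    # -v prints progress; the filesystem must be unmounted
    # (or mounted read-only) while this runs.
    zerofree -v "$img"
}

# Example invocation (commented out so the sketch has no side effects):
# zero_free_blocks fs.image
```

Running zerofree with -n first performs a dry run that leaves the filesystem untouched, which may be a sensible precaution on an image that matters.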

However, this is only half the story: the empty free blocks still consume space in the underlying filesystem, so something must be done to reclaim that space.

A common suggestion is to use the sparse file handling capabilities of the GNU cp command to take a copy of the filesystem image with cp --sparse=always (though this does require the original and sparse files to exist at the same time, which may be inconvenient).
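
The copy-based approach can be tried out on a scratch file before being applied to a real image. This is a minimal sketch assuming GNU cp; the file names and sizes are illustrative, and the zeroes are written explicitly to stand in for zeroed free blocks.

```shell
#!/bin/sh
# Scratch-file demonstration; names and sizes are illustrative assumptions.
set -e
dir=$(mktemp -d)
img="$dir/demo.image"

# Build a 4MB file: 1MB of random data followed by 3MB of zeroes.
# The zeroes are really written, so they occupy blocks on disk.
dd if=/dev/urandom of="$img" bs=1024 count=1024 2>/dev/null
dd if=/dev/zero of="$img" bs=1024 seek=1024 count=3072 conv=notrunc 2>/dev/null

# Copy the file, turning runs of zero bytes into holes.
cp --sparse=always "$img" "$img.sparse"

# The contents are identical...
cmp "$img" "$img.sparse"
# ...but the copy should occupy less space on disk.
du -k "$img" "$img.sparse"
```

Once the copy has been checked it can replace the original with mv, at which point the space used by the non-sparse file is released.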

If your kernel and util-linux are sufficiently modern and you have a supported filesystem, you can use fallocate -d to 'dig holes' in a file. This makes the file sparse in place, without using extra disk space.
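
The in-place approach can be sketched in the same way on a scratch file. The names and sizes below are illustrative, and the extra copy exists only so the contents can be verified afterwards; it would not be needed in real use. Since hole punching depends on the underlying filesystem, the sketch treats failure of fallocate as a possibility rather than an error.

```shell
#!/bin/sh
# Scratch-file demonstration; names and sizes are illustrative assumptions.
set -e
dir=$(mktemp -d)
img="$dir/demo.image"

# 1MB of random data followed by 3MB of explicitly written zeroes.
dd if=/dev/urandom of="$img" bs=1024 count=1024 2>/dev/null
dd if=/dev/zero of="$img" bs=1024 seek=1024 count=3072 conv=notrunc 2>/dev/null

# Reference copy, kept only to verify the contents afterwards.
cp "$img" "$img.orig"

du -k "$img"
# Dig holes in place; this may fail where hole punching is unsupported.
if fallocate -d "$img"; then
    du -k "$img"
else
    echo "fallocate -d is not supported here" >&2
fi

# The apparent size and contents must be unchanged either way.
cmp "$img" "$img.orig"
```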


Ron Yorston
18th April 2004 (updated 18th January 2017)
Some obsolete information has been moved to a separate page.