September 3, 2021

On the state of Linux backups

Backup utilities available within linux kinda suck.

As a disclaimer, there are lots of different backup utilities in the Linux ecosystem, and many of them share similarities. I’ll mostly be making a general case against those systems, and will be partially focusing on desktop use.

Taking a cursory look at backup options for Linux typically results in rsync or tar popping out as answers. Additionally many other utilities are effectively wrappers around these utilities, to name a few:

Otherwise tools like borgbackup share the main flaw:

They guarantee no file consistency during backup

Some of these utilities clearly admit this, some don’t. Basically, files which get created, modified, or removed during the backup process will not be consistent with the rest of the backup. rsync attempts to alleviate this in some ways, but trying to make those guarantees outside of the filesystem layer is difficult.

To be clear: I’m not talking about something like progressive writes to a file not being “complete” during backup; that seems like a much more difficult issue. I mean the case of two files being written at the same time during a backup: file ‘A’ may have its writes captured, while file ‘B’ may not. If your backup takes 10 minutes to run, that’s a ten-minute window in which files copied near the beginning of the backup reflect an older state than files copied near the end.
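To make the failure mode concrete, here is a minimal sketch that simulates it. It is not how rsync or tar work internally; the `naive_backup` function and its `between_copies` hook are hypothetical stand-ins for "a tool that copies files one at a time" and "an application writing while the backup runs".

```python
import tempfile
from pathlib import Path


def naive_backup(src, dst, names, between_copies=None):
    """Copy files one at a time, the way a file-level tool walks a tree.

    `between_copies` is a hypothetical hook standing in for another
    process that writes to the source while the backup is still running.
    It fires once, right after the first file has been copied.
    """
    dst.mkdir(parents=True, exist_ok=True)
    for index, name in enumerate(names):
        (dst / name).write_text((src / name).read_text())
        if between_copies is not None and index == 0:
            between_copies()


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "src"), Path(tmp, "backup")
        src.mkdir()
        # Both files start out at the same logical version.
        (src / "A").write_text("v1")
        (src / "B").write_text("v1")

        def application_update():
            # The application updates both files together.
            (src / "A").write_text("v2")
            (src / "B").write_text("v2")

        naive_backup(src, dst, ["A", "B"], between_copies=application_update)
        return (dst / "A").read_text(), (dst / "B").read_text()


if __name__ == "__main__":
    # 'A' was copied before the update and 'B' after it, so the backup
    # holds a mix of versions that never existed together on disk.
    print(run_demo())  # prints ('v1', 'v2')
```

Which file ends up stale depends only on the copy order. The point is that the backup contains a combination of states the source never had at any single moment.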

So what’s the proper way to do this?

Filesystem snapshots

The only method I’ve seen to guarantee this level of consistency is in the filesystem, with a snapshot[1]. There are at least three fairly well-known implementations of this available on Linux: btrfs, zfs and lvm. (xfs also does snapshots, but I haven’t even heard of anyone seriously using xfs in the past decade.)

All these filesystems can create relatively cheap and fast snapshots for on-site rollback features, such as rolling back after update failures.

Additionally, these filesystems typically make it relatively easy to export incremental snapshots to remote machines. (There are also other utilities that can better automate backups and restores this way.)
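As a rough illustration of what "snapshot, then export incrementally" looks like with zfs, here is a small sketch that only builds the commands and does not run them. The dataset `tank/home`, the snapshot names, the host `backup.example.com` and the target `backup/home` are all made-up examples; btrfs has its own `btrfs send` / `btrfs receive` pair that plays a similar role.

```python
import shlex
from typing import List


def snapshot_cmd(dataset: str, name: str) -> List[str]:
    """argv for creating a snapshot, e.g. `zfs snapshot tank/home@monday`."""
    return ["zfs", "snapshot", f"{dataset}@{name}"]


def incremental_send_pipeline(dataset: str, prev: str, new: str,
                              host: str, target: str) -> str:
    """Shell pipeline that ships only the changes between two snapshots.

    `zfs send -i <prev> <new>` emits the delta between the snapshots and
    `zfs receive` applies it on the remote side (here reached over ssh).
    """
    send = ["zfs", "send", "-i", f"{dataset}@{prev}", f"{dataset}@{new}"]
    recv = ["ssh", host, "zfs", "receive", target]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


if __name__ == "__main__":
    # Hypothetical names throughout; nothing is executed here.
    print(shlex.join(snapshot_cmd("tank/home", "tuesday")))
    print(incremental_send_pipeline(
        "tank/home", "monday", "tuesday",
        "backup.example.com", "backup/home"))
```

Because the snapshot is atomic, everything sent this way comes from a single point in time, which is exactly the guarantee the file-level tools can’t give.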

So there’s the solution, what’s the problem?

Adoption

Due to zfs’s licensing, its use in linux distros has been extremely limited until recently with continuous efforts by the ‘ZFS on Linux’ project. There are now at least a couple distros that support a ZFS root in their installers.

btrfs was made for linux and, as far as I can tell, mostly aims to be “Linux’s zfs”, but from what I continue to see, its reliability is rather questionable. (debian’s btrfs recommendations, btrfs-convert corruption, btrfswiki’s gotchas).

lvm is quite stable from what I can tell but I’ve personally found its management to be headache-inducing. zfs has much simpler interactions, and appears to be far more “field tested” than btrfs.

I can’t exactly find any statistics on this, but I would make a confident guess that ext4 is still far more widely deployed on linux than btrfs[2], and that either of those two is more common than zfs. lvm is an exception, since several distros seem to default to enabling lvm2 during install when using ext4 these days.

At this point I should note the few backup utilities I’ve found that utilize the above-mentioned filesystems.

The case for desktop usage

On the server side, you’ll likely be comfortable writing simple scripts or cron jobs to automate snapshots, backups, and restores. On the desktop, however, I (and I think many others) would prefer to take advantage of the fact that a graphical interface exists. I’d much rather have a little dock applet that can show me the status of my backups and snapshots and pop up errors at me, rather than having to rerun a status command and check my logs every so often for the same peace of mind.
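The check such an applet (or a cron job) would need is not complicated. Below is a minimal sketch that parses the tab-separated output of `zfs list -H -p -t snapshot -o name,creation` and decides whether the newest snapshot is too old. The function names, the sample listing and the 24-hour threshold are my own assumptions for illustration, and the sketch works on a string so it never calls zfs itself.

```python
def newest_snapshot(listing):
    """Return (name, creation_epoch) of the most recent snapshot, or None.

    `listing` is expected to look like the output of
    `zfs list -H -p -t snapshot -o name,creation`: one snapshot per line,
    with the name and the creation time (epoch seconds) separated by a tab.
    """
    snapshots = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        name, creation = line.split("\t")
        snapshots.append((name, int(creation)))
    return max(snapshots, key=lambda snap: snap[1]) if snapshots else None


def backup_is_stale(listing, now, max_age_seconds=24 * 3600):
    """True if there is no snapshot at all, or the newest one is too old."""
    newest = newest_snapshot(listing)
    return newest is None or now - newest[1] > max_age_seconds


if __name__ == "__main__":
    # Hypothetical listing with two snapshots taken one day apart.
    sample = "tank/home@monday\t1630600000\ntank/home@tuesday\t1630686400\n"
    print(newest_snapshot(sample))
    print(backup_is_stale(sample, now=1630686400 + 3600))
```

A desktop applet would run something like this on a timer and raise a notification when `backup_is_stale` returns true, instead of making me go look.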

Additionally, zfs-on-linux has a special .zfs pseudo-directory which should allow for easy file manager integration for simple tasks like a quick file restoration. Obviously you can just navigate to the .zfs directory, but it would be nice if there was something … more.[3]
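For what it’s worth, the plumbing a file manager would need here is small. Each snapshot shows up as a directory under `<mountpoint>/.zfs/snapshot/<snapshot name>/`, so finding older copies of a file is mostly path manipulation. This is a sketch under that assumption; the `snapshot_versions` helper is hypothetical and not part of any existing tool.

```python
from pathlib import Path


def snapshot_versions(mountpoint, relative_path):
    """List (snapshot_name, path) pairs for every snapshot holding the file.

    Assumes the dataset mounted at `mountpoint` exposes its snapshots as
    directories under `.zfs/snapshot/`, and that `relative_path` is given
    relative to the mountpoint.
    """
    snap_root = Path(mountpoint) / ".zfs" / "snapshot"
    if not snap_root.is_dir():
        return []
    versions = []
    for snap in sorted(snap_root.iterdir()):
        candidate = snap / relative_path
        if candidate.exists():
            versions.append((snap.name, candidate))
    return versions
```

Restoring a file is then just copying the chosen path back into place, which is the sort of thing a right-click “previous versions” menu could do for you.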

Anyway: urbackup and timeshift both have seemingly decent GUIs, with urbackup even offering a TUI and web interface. However, at time of writing, timeshift has some 300 open issues (though a lot of them seem to be questions rather than bug reports), and I’m not sure how much I trust urbackup either.

Conclusion

As a final note, I’m not aware of any progress in restoring backups on desktop-oriented linux distributions past “create a livecd, boot the livecd, and manually restore.” A “nicer” recovery method might necessitate a recovery partition, but in any of [lvm, btrfs, zfs] you can make a compressed and deduplicated subvolume that would take up little space and offer something better than the experience of creating a livecd, or ending up in an initramfs busybox environment. I’m sure people will scream their vocal cords dry about how “bloated” this would be, regardless of the potential to make a recovery partition that takes up only a few MB.


  1. Technically fsfreeze also exists, but it has some finickiness and honestly, compared to the instant snapshots available in CoW filesystems, “freezing” the fs is garbage.

  2. In the context of how many people/organizations use it, not individual machines. (Although you’d be surprised how many companies are still using ext4…)

  3. If you’ve ever used a mac, you probably know what I’m getting at. I basically want Time Machine for Linux.