
vmbackup — Automated KVM/libvirt Backup Manager


The backup half of the vmbackup / vmrestore ecosystem. Automated backup manager for KVM/libvirt virtual machines, built on virtnbdbackup.

vmbackup automates virtnbdbackup — scheduling, rotation, retention, backup validation, replication and reporting. It works on personal machines, homelabs and production KVM hosts alike. For restores, see vmrestore.

Why vmbackup

virtnbdbackup handles the hard part, but it operates on one VM at a time with no scheduling, no retention management and no replication. If you run more than a couple of VMs you end up writing your own wrapper scripts for rotation, cleanup, backup validation and email alerts.

vmbackup is that wrapper. It orchestrates virtnbdbackup across your entire fleet and handles everything around it — backup validation, failure recovery, multi-destination replication and reporting.

Quick Start

Prerequisite: vmbackup requires virtnbdbackup (≥ 2.28) — install it first (see the virtnbdbackup installation instructions).

Debian / Ubuntu:

wget https://github.com/doutsis/vmbackup/releases/download/v0.5.4/vmbackup_0.5.4_all.deb
sudo dpkg -i vmbackup_0.5.4_all.deb

Any distro (Arch, Fedora, openSUSE, etc.):

git clone https://github.com/doutsis/vmbackup.git
cd vmbackup
sudo make install

Then edit /opt/vmbackup/config/default/vmbackup.conf to set your backup path and preferences:

sudo vmbackup --run                          # run a backup now
sudo systemctl enable --now vmbackup.timer   # enable the daily schedule

For the full step-by-step walkthrough — backup path setup, per-VM overrides, email, replication and more — see the Quick Setup Guide in the detailed documentation.

Features

  • Every VM, automatically — discovers and backs up all VMs on the host. No manifest to maintain — new VMs are picked up on the next run
  • Targeted backup — back up one or more specific VMs on demand with --vm, without waiting for the full scheduled run
  • Full + incremental, zero decisions — first backup is a full; every backup after that is an incremental. Period boundaries (daily, weekly, monthly) trigger a fresh full automatically
  • Self-healing — failed incrementals convert to fulls, broken chains are archived and restarted, interrupted runs clean up after themselves. Scheduled backups should never need manual intervention
  • Multi-destination replication — rsync to any mounted filesystem, rclone to cloud. Failed replication can be re-run independently without repeating a backup
  • TPM and BitLocker handled — TPM state and BitLocker recovery keys are extracted and stored alongside each VM backup
  • Host environment captured — libvirt configuration, network definitions and dependent service config are backed up so you can rebuild the host, not just the VMs
  • FSTRIM optimisation — trims guest filesystems via the QEMU agent before backup so qcow2 images compress better and incrementals are smaller. Per-path logging, configurable minimum extent, per-VM exclusions, and automatic detection of missing Windows VirtIO discard_granularity overrides
  • Paired with vmrestore — single-command disaster recovery, clones and point-in-time restores via vmrestore
  • Minimal dependencies — pure Bash + SQLite with no additional runtimes, frameworks or services to install. If your host runs libvirt, vmbackup runs too

How It Works

vmbackup wraps virtnbdbackup and manages the full backup lifecycle:

  1. Discovery — queries libvirt for every VM on the host and applies your include/exclude filters. New VMs are picked up automatically.
  2. Backup — runs full or incremental backups per VM based on what already exists on disk. Per-VM overrides let you set different policies or exclude individual VMs entirely.
  3. Rotation — organises backups into period-based directories. Daily, weekly and monthly policies archive the previous period and start a fresh full automatically. The accumulate policy runs incrementals indefinitely until a configurable limit is reached. Per-VM overrides apply here too.
  4. Retention — removes expired archives based on configurable age and count limits per policy. Runs after every backup so storage stays predictable without manual cleanup.
  5. Replication — copies the backup tree to local and cloud destinations so backups exist in more than one place. Local targets use rsync; cloud targets use rclone. Both can run in parallel. If replication fails or is interrupted, it can be re-run independently without repeating the backup.
  6. Reporting — sends an email summary with per-VM status, duration, errors and replication results.
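The full-versus-incremental decision in step 2 can be sketched as a small shell function. This is a hypothetical illustration of the logic, not vmbackup's actual code — the directory layout and `*.data` file naming here are assumptions for the example:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of step 2: choose full vs incremental per VM based on
# what already exists on disk. Layout and file names are illustrative only.
backup_mode() {
  local vm=$1 backup_root=$2
  if [[ -d "$backup_root/$vm" ]] && compgen -G "$backup_root/$vm/*.data" > /dev/null; then
    echo inc    # a chain already exists: append an incremental
  else
    echo full   # nothing on disk yet: start a new full
  fi
}

root=$(mktemp -d)
mkdir -p "$root/web" && touch "$root/web/sda.full.data"
backup_mode web "$root"   # inc
backup_mode db  "$root"   # full
```

Because the decision is derived from the on-disk state rather than a stored schedule, a deleted or archived chain naturally triggers a fresh full on the next run.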

Installation

Prerequisites

vmbackup is a wrapper around virtnbdbackup — it will not function without it. Install virtnbdbackup (≥2.28) first:

virtnbdbackup installation instructions

Also requires bash >= 5.0, libvirt-daemon-system, qemu-utils, sqlite3 and jq. Optionally msmtp for email reports and rclone for cloud replication.

From .deb Package (Debian / Ubuntu)

Download the latest .deb from Releases:

wget https://github.com/doutsis/vmbackup/releases/download/v0.5.4/vmbackup_0.5.4_all.deb
sudo dpkg -i vmbackup_0.5.4_all.deb

From Source (any distro)

git clone https://github.com/doutsis/vmbackup.git
cd vmbackup
sudo make install

Both methods install to /opt/vmbackup/ and set up:

  • vmbackup command in PATH
  • root:backup ownership with restricted permissions
  • systemd service and timer units
  • AppArmor profile for libvirt/QEMU integration

Uninstall

Debian / Ubuntu (.deb install):

sudo apt remove vmbackup    # remove but keep config
sudo apt purge vmbackup     # remove everything including config and logs

From source (make install):

sudo make uninstall

apt remove keeps your configuration under /opt/vmbackup/config/ so you can reinstall later without reconfiguring. apt purge (or make uninstall) deletes config files, logs and the AppArmor profile. Backup data is never touched — it lives wherever you configured BACKUP_PATH.

Configuration

All configuration lives in /opt/vmbackup/config/. Each config directory is a named instance containing:

File                     Purpose
vmbackup.conf            Backup path, schedule policy, compression, VM filters
email.conf               Email reporting (SMTP via msmtp)
replication_local.conf   Local replication destinations (rsync)
replication_cloud.conf   Cloud replication destinations (rclone)
vm_overrides.conf        Per-VM rotation policy and exclusion overrides
exclude_patterns.conf    Wildcard rules to exclude VMs by name (e.g. test-*)
fstrim_exclude.conf      VM name patterns to exclude from pre-backup FSTRIM

The default/ instance is used when vmbackup runs without --config-instance. The template/ directory contains fully documented reference configs — copy it to create a new instance:

cp -r /opt/vmbackup/config/template /opt/vmbackup/config/prod
vmbackup --run --config-instance prod

This lets you run separate configurations (e.g. dev, staging, prod) from the same installation.

VM discovery and exclusion

vmbackup discovers and backs up every VM on the host automatically. You don't maintain a list of VMs to back up — if libvirt knows about it, vmbackup backs it up.

To give a specific VM a different rotation policy or exclude it entirely, add an entry to vm_overrides.conf. This is the right place for permanent, per-VM decisions — a production database that needs daily rotation while everything else runs monthly, or a template VM that should never be backed up.

To exclude VMs by naming convention, add wildcard rules to exclude_patterns.conf. Patterns like test-* or *-clone-* let you skip entire classes of VMs without listing each one individually. Useful when test or scratch VMs are created and destroyed frequently.
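The pattern semantics are ordinary shell globs, so the matching can be sketched in a few lines. `is_excluded` is a hypothetical helper for illustration, and `PATTERNS` stands in for the lines read from exclude_patterns.conf:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: apply exclude_patterns.conf-style globs to VM names.
PATTERNS=("test-*" "*-clone-*")

is_excluded() {
  local vm=$1 pat
  for pat in "${PATTERNS[@]}"; do
    # unquoted right-hand side gives shell glob matching
    [[ $vm == $pat ]] && return 0
  done
  return 1
}

for vm in web test-scratch db-clone-01; do
  if is_excluded "$vm"; then
    echo "skip $vm"
  else
    echo "backup $vm"
  fi
done
# prints: backup web / skip test-scratch / skip db-clone-01
```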

Self-healing

vmbackup validates backup state, data integrity and lock health at the start of every run. If an incremental backup fails, it converts to a full and retries. If the backup sequence is broken, it archives what's there and starts fresh. If a previous run was interrupted, stale locks and partial files are cleaned up automatically. Scheduled backups should never require manual intervention to get back on track.

Usage

Once configured, vmbackup runs unattended via the systemd timer. For manual runs and operational tasks:

# Run a backup using the default config (config/default/)
sudo vmbackup --run

# Run using a named config instance (config/prod/)
sudo vmbackup --run --config-instance prod

# Preview what a backup would do without writing anything
sudo vmbackup --run --dry-run

# Cancel replication on a running session (backups continue)
sudo vmbackup --cancel-replication

# Back up a specific VM (replication skipped)
sudo vmbackup --run --vm web

# Back up multiple VMs
sudo vmbackup --run --vm web,db,mail

# Re-run replication without repeating the backup
sudo vmbackup --replicate-only

# List prune targets (archived chains and old periods)
sudo vmbackup --prune list

All commands accept --config-instance and --dry-run. See vmbackup.md for the full CLI reference.

VM State Handling

vmbackup handles VMs in any power state:

State                       Backup Method                   Consistency
Running (with QEMU agent)   FSFREEZE + incremental          Application-consistent
Running (no agent)          Pause + incremental             Crash-consistent
Shut off                    Copy backup (if disk changed)   Clean
Paused                      Treated as running              Crash-consistent

Shut off VMs are only backed up when their disk has changed since the last backup. Unchanged VMs are skipped to avoid wasting storage.
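One simple way to express that change check is a timestamp comparison — a hypothetical sketch only, since the real implementation may rely on checkpoints or checksums rather than mtimes:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: skip a shut-off VM whose disk image hasn't changed
# since the last backup, using an mtime comparison against a stamp file.
disk_changed() {
  local img=$1 stamp=$2
  # changed if no stamp exists yet, or the image is newer than the stamp
  [[ ! -e $stamp || $img -nt $stamp ]]
}

work=$(mktemp -d)
touch "$work/vm.qcow2"
touch "$work/last-backup.stamp"     # stamp written after the image: unchanged
if disk_changed "$work/vm.qcow2" "$work/last-backup.stamp"; then
  echo "copy backup"
else
  echo "skip (unchanged)"
fi
```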

Rotation & Retention

Rotation policies control how backups are organised and when old data is removed:

Policy       Behaviour
daily        Archives existing backups when the date changes and starts a fresh full. Keeps 7 daily folders by default.
weekly       Archives existing backups at the start of a new ISO week. Keeps 4 weekly folders by default.
monthly      Archives existing backups at the start of a new month. Keeps 3 monthly folders by default. This is the default policy.
accumulate   Backups accumulate indefinitely with no scheduled archival. When the number of incremental backups hits the hard limit (default 365) they are automatically archived and a fresh full backup starts.
never        VM is excluded from backup entirely. Use for templates, scratch VMs or anything you don't want backed up.

The default rotation policy is set in vmbackup.conf and applies to all VMs. Individual VMs can be assigned a different policy in vm_overrides.conf. Retention is enforced per policy.
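Period boundaries reduce to a key comparison: each policy maps a date to a period key, and a changed key means "archive the old period, start a fresh full". A minimal sketch of that idea, assuming GNU date (the function name and keys are illustrative, not vmbackup's internals):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of period-boundary detection per rotation policy.
period_key() {
  local policy=$1 when=$2
  case $policy in
    daily)   date -d "$when" +%F ;;       # e.g. 2024-03-15
    weekly)  date -d "$when" +%G-W%V ;;   # ISO year-week, e.g. 2024-W11
    monthly) date -d "$when" +%Y-%m ;;    # e.g. 2024-03
  esac
}

prev=$(period_key monthly 2024-02-29)
curr=$(period_key monthly 2024-03-01)
if [[ $curr != "$prev" ]]; then
  echo "period boundary: archive $prev, start fresh full"
fi
```

Using %G with %V keeps the ISO week's year consistent around New Year, when the calendar year and the ISO week-numbering year can differ.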

Manual cleanup

Automated retention runs after each backup, but sometimes you need to reclaim space on demand — remove archived chains, clean up old periods or wipe a decommissioned VM entirely. --prune handles this without running a backup session. All operations support --dry-run to preview, --yes to skip confirmation, and a keep-last guard that prevents removing the last period. See vmbackup.md for the full target reference.

TPM & BitLocker Support

For VMs with emulated TPM (Windows BitLocker, Linux Secure Boot), vmbackup backs up TPM state from /var/lib/libvirt/swtpm/ alongside each VM backup. TPM state is deduplicated — unchanged state is symlinked to the previous copy rather than stored again.

For Windows VMs with BitLocker, vmbackup uses the QEMU guest agent to extract recovery keys from the running guest automatically. The keys are stored alongside the TPM state so they're available if the TPM becomes unusable after restore — new UUID, hardware change or TPM corruption. If the guest agent isn't installed or the VM isn't running, extraction is skipped silently without blocking the backup.
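The deduplication step amounts to "store a real copy only when the checksum changes". A hypothetical sketch of that pattern — the function and paths are illustrative, not vmbackup's actual layout:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of TPM state deduplication: symlink to the previous
# copy when the content is identical, otherwise store a fresh copy.
store_tpm_state() {
  local new=$1 prev=$2 dest=$3
  if [[ -f $prev ]] && [[ $(sha256sum < "$new") == $(sha256sum < "$prev") ]]; then
    ln -s "$prev" "$dest"   # unchanged: reference the previous copy
  else
    cp "$new" "$dest"       # changed (or first backup): store a real copy
  fi
}

work=$(mktemp -d)
printf 'tpm state' > "$work/current"
printf 'tpm state' > "$work/previous"
store_tpm_state "$work/current" "$work/previous" "$work/stored"
ls -l "$work/stored"   # a symlink, since the state is unchanged
```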

Security

vmbackup enforces root:backup ownership across everything it touches — the install tree, backup data, logs and lock files. This is not configurable.

The backup group

The backup group (GID 34) is a standard system group. Both the .deb package and make install create it if it doesn't already exist. All vmbackup files are owned root:backup so that root can write backups and members of the backup group can read them.

To browse backups, check logs or query the SQLite database, add your user to the group:

sudo usermod -aG backup myuser
# Log out and back in for group membership to take effect

If you also want non-root access to virsh list and other libvirt commands, add the libvirt group too:

sudo usermod -aG backup,libvirt myuser

SGID and permissions

Backup directories use the SGID bit (mode 2750, shown as drwxr-s---). When SGID is set on a directory, every new file and subdirectory automatically inherits the backup group — no post-hoc chown is needed. Combined with umask 027, the result is files at 640 and directories at 2750 with root:backup ownership throughout.

On first run, vmbackup detects that BACKUP_PATH lacks SGID and applies it automatically. From that point forward, SGID propagates to all subdirectories created by vmbackup, virtnbdbackup or any other child process.

Layer         Mechanism
Script        umask 027 — files 640, dirs 750
Directories   SGID bit (2750) — group inheritance propagates to all new files and subdirectories
systemd       UMask=0027 — belt-and-suspenders with the in-script umask
Package       install -m 750/640 — nothing is world-accessible
AppArmor      Profile for libvirt/QEMU NBD socket access
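The permission math above can be verified in a scratch directory. In this sandbox the group is your user's primary group rather than backup, but the mode bits behave the same way:

```shell
#!/usr/bin/env bash
# Check umask 027 + SGID behaviour in a throwaway directory.
work=$(mktemp -d)
chmod 2750 "$work"    # rwxr-s--- : SGID set on the parent
umask 027

touch "$work/file"
mkdir "$work/subdir"

stat -c '%a' "$work/file"     # 640  (umask 027; SGID never applies to files)
stat -c '%a' "$work/subdir"   # 2750 (umask 027 plus inherited SGID bit)
```

The inherited 2750 on the subdirectory is what lets SGID propagate through trees created by virtnbdbackup or any other child process without a post-hoc chown.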

Sensitive material

TPM private keys and BitLocker recovery keys are isolated from the backup group. The tpm-state/ directory has SGID stripped and contents are owned root:root with mode 600. A user in the backup group can browse the backup tree and read VM configs and logs but cannot read TPM keys or BitLocker recovery keys.

SQLite Logging

All backup activity is logged to a SQLite database at $BACKUP_PATH/_state/vmbackup.db. The database tracks sessions, per-VM results, replication runs, retention actions and backup health events. This enables queries like "last successful backup per VM" or "total bytes replicated this month" without parsing log files.
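As an illustration, a "last successful backup per VM" query might look like the following. The table and column names here (vm_results, vm, status, finished_at) are assumptions for the example — consult vmbackup.md for the real schema:

```shell
#!/usr/bin/env bash
# Illustrative query against a mock database; schema names are hypothetical.
db="$(mktemp -d)/vmbackup.db"
sqlite3 "$db" <<'SQL'
CREATE TABLE vm_results (vm TEXT, status TEXT, finished_at TEXT);
INSERT INTO vm_results VALUES
  ('web', 'success', '2024-03-14T02:10:00'),
  ('web', 'success', '2024-03-15T02:11:00'),
  ('db',  'failed',  '2024-03-15T02:20:00'),
  ('db',  'success', '2024-03-13T02:19:00');
-- last successful backup per VM
SELECT vm, MAX(finished_at) FROM vm_results
WHERE status = 'success' GROUP BY vm;
SQL
```

Because the state lives in SQLite, ad-hoc questions like this need no log parsing — any sqlite3 client can read the database (read access requires backup group membership, as described under Security).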

Replication

Replication runs after backup completes. Local and cloud replication operate independently and can run in parallel or sequentially.

Local replication uses rsync to any locally accessible path — local disks, NFS mounts, virtiofs shares, pre-mounted CIFS, or anything else that appears as a local directory. Configurable bandwidth limits and post-sync verification (size or checksum).

Cloud replication uses rclone to sync to SharePoint, Backblaze B2, S3, or any rclone-supported backend. Currently ships with a SharePoint transport driver.

Both systems use a pluggable transport architecture. New local transports can be added by implementing five functions (init, sync, verify, cleanup, get_free_space) and a metrics contract. New cloud transports are added by implementing the cloud transport function and metrics contracts. See the full transport interface in vmbackup.md.

Run replication on demand

Replication normally runs at the end of each backup session, but --replicate-only lets you trigger it independently. Useful when pre-seeding a new destination before the first scheduled run, adding a destination to an existing setup, or re-running replication that was interrupted or cancelled during a backup. Scope can be narrowed to local or cloud only. No VMs are touched and no retention runs. See vmbackup.md for the full reference.

Restoring

vmbackup and vmrestore are two halves of one system. vmbackup backs up — vmrestore restores. They share no code and have no runtime coupling, but vmrestore exclusively restores backups created by vmbackup.

vmrestore provides single-command disaster recovery, clone restores and point-in-time recovery — with full identity management, TPM/BitLocker support and pre-flight safety checks.

sudo vmrestore --vm my-vm --restore-path /var/lib/libvirt/images

Tested

vmbackup and vmrestore are tested end-to-end on a fleet of Linux and Windows VMs across multiple config instances. Tests are destructive — VMs are backed up, checkpointed, destroyed and restored from scratch — validating the full lifecycle from first backup to disaster recovery.

Test Fleet

VM                        Instance   Disks                 TPM   Boot
Linux base                default    1× VirtIO             No    BIOS
Linux multi-disk          default    2× VirtIO + 1× SATA   No    BIOS
Linux multi-disk clone    default    2× VirtIO + 1× SATA   No    BIOS
Windows base              default    1× VirtIO             Yes   UEFI
Windows multi-disk        default    2× VirtIO + 1× SATA   Yes   UEFI
Windows multi-disk clone  default    2× VirtIO + 1× SATA   Yes   UEFI
Linux base                prod       1× VirtIO             No    BIOS
Linux multi-disk          prod       2× VirtIO + 1× SATA   No    BIOS
Windows base              prod       1× VirtIO             Yes   UEFI
Windows multi-disk        prod       2× VirtIO + 1× SATA   Yes   UEFI

The default and prod instances back up to isolated paths with separate VM filters, validating that multi-instance deployments stay fully isolated.

Testing Phases

  1. CLI and argument validation — all vmbackup and vmrestore flags, error paths, privilege enforcement and conflict guards
  2. Record identities — UUID, MAC addresses, TPM presence and disk layout for every VM
  3. Build backup chains with checkfiles — unique marker files are written inside each guest (Linux and Windows) via the QEMU agent between backup rounds. Multiple vmbackup rounds across both instances create active and archived chains, each capturing different checkfile content. This gives every restore point a verifiable fingerprint — after restore, the checkfile content proves which point in time was actually recovered
  4. Backup verification — vmrestore --verify confirms integrity across both instances
  5. Prune — archived period cleanup on live backup data
  6. Clone restore — restore as clones with new identity, verify disk integrity, boot via QEMU agent, confirm checkfile content matches the source backup, then destroy
  7. Point-in-time restore — restore to specific checkpoints across both active and archived chains. Each restored VM is booted and the checkfile inside the guest is read back to confirm it contains exactly the content that existed at that point in the backup history — not the latest, not a neighbour, but the precise checkpoint requested. This is the strongest proof that incremental chains and archive navigation produce correct results
  8. Single-disk restore — replace one disk on a multi-disk VM, verify .pre-restore backup, disk integrity and vmbackup auto-heal after chain invalidation
  9. Destroy everything — delete all original VMs including definitions, disks and NVRAM
  10. DR restore — restore all VMs from backup to a clean path, verify UUID/MAC match originals, all disks intact, TPM state preserved, checkfiles survived the full backup → destroy → restore cycle, BitLocker not triggered
  11. Multi-instance backup and restore — backup and restore across config instances (--config-instance prod), verifying that each instance resolves to its own backup path, lists only its own VMs, and restores produce correct identities. Covers VMBACKUP_INSTANCE env var equivalence and cross-instance clone and DR
  12. Windows TPM/BitLocker — clone and DR with TPM state isolation per UUID, NVRAM separation, archived chain recovery, and BitLocker unlock without recovery prompt
  13. Auto-recovery — corrupt .cpt chain marker, verify vmbackup archives the broken chain and starts fresh

Every restore verifies disk integrity (qemu-img check), identity against pre-test baselines, and successful boot via automated QEMU guest agent polling.

Documentation

Full technical documentation is included in vmbackup.md (installed to /opt/vmbackup/vmbackup.md). It covers architecture, configuration reference, rotation policies, backup lifecycle, archive management, replication transport interface, SQLite schema, failure detection and security model in detail.

Known Issues

Windows VMs: slow FSTRIM with VirtIO disks

QEMU's default discard_granularity for VirtIO block devices causes Windows to issue millions of tiny 512-byte TRIM operations instead of coalescing them. A 20 GB disk can take 10+ minutes to trim — versus 1–2 seconds with the fix applied.

Linux guests are unaffected (the kernel coalesces TRIMs regardless). SATA guests also work fine.

Fix: Add a discard_granularity override (32 MiB recommended) to each VirtIO disk in the VM's libvirt XML. vmbackup detects missing overrides automatically at backup time and logs the exact XML to add.
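A sketch of what the override can look like, assuming a libvirt version that supports the discard_granularity attribute on the blockio element (9.2+); the disk path and target are placeholders, and vmbackup logs the exact XML for your own disks at backup time:

```xml
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' discard='unmap'/>
  <source file='/var/lib/libvirt/images/win.qcow2'/>
  <!-- 32 MiB = 33554432 bytes; lets Windows coalesce TRIMs into large extents -->
  <blockio discard_granularity='33554432'/>
  <target dev='vda' bus='virtio'/>
</disk>
```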

Full details, performance benchmarks and step-by-step XML instructions: VirtIO discard_granularity & Windows TRIM Performance

Issues

Found a bug or have a feature request? Open an issue.

License

MIT


100% Vibe Coded
