
Borg vs Duplicacy: Dedup Backup Wars

By SumGuy 13 min read
You’ve probably done the “rsync + cron” thing — it works until it doesn’t: huge transfers, duplicate data, and that night where you realise your “backup” was just a copy of the trash folder. If you want safe, space-efficient snapshots (and fewer 2 AM heart attacks), you need deduplicating backups. Borg and Duplicacy are the two heavy-hitters for home labs. One’s the battle-hardened Swiss army knife; the other’s the clever kid that solved multi-source tetris. Here’s the head-to-head.

Why dedup matters (short version)

Deduplication saves storage and network: store identical chunks once and reference them from multiple snapshots. For a homelab with lots of similar VMs, dotfiles, or repeated ISOs, dedup is the difference between feasible and “please buy another drive”.

Your 2 AM self will appreciate the bandwidth and time savings.
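
To make “store identical chunks once” concrete, here is a toy sketch in shell. It assumes fixed 4 KiB chunks and synthetic `seq` data, and every file name in it is made up for the demo. Neither Borg nor Duplicacy chunks this way; the point is only the bookkeeping.

```shell
#!/usr/bin/env bash
# Toy dedup demo. Assumption: fixed 4 KiB chunks and synthetic data,
# not the content-defined chunking real backup tools use.
set -euo pipefail

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Two "snapshots": the second is the first plus some appended data.
seq 1 20000 > "$work/snap1.txt"
{ cat "$work/snap1.txt"; seq 20001 21000; } > "$work/snap2.txt"

# Cut both snapshots into 4 KiB chunks.
mkdir "$work/chunks"
split -b 4096 "$work/snap1.txt" "$work/chunks/s1_"
split -b 4096 "$work/snap2.txt" "$work/chunks/s2_"

# Chunks referenced by the snapshots vs. distinct chunks we would actually store.
total_chunks=$(find "$work/chunks" -type f | wc -l)
unique_chunks=$(sha256sum "$work/chunks"/* | awk '{print $1}' | sort -u | wc -l)

echo "chunks referenced by both snapshots: $total_chunks"
echo "chunks stored after dedup:           $unique_chunks"
```

The second number should come out well below the first, because the unchanged front of the second snapshot hashes to chunks that are already stored.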

Dedup mechanics — plain English

Think of a backup like packing cargo into a truck.

Borg: chunker (sliding / segmenting)

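Borg splits data into variable-size chunks and packs them into segment files in the repository. Chunk boundaries come from a rolling hash over the content (buzhash) rather than from fixed offsets, so a small edit only changes the chunks around it instead of shifting every chunk after it (a changed line doesn’t ruin the whole file). The result: excellent intra-repo deduplication and efficient appends.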

Pros:

Cons:

Analogy: Borg is like a careful packer who numbers crates and keeps an index — slow if three people try to jam crates through the door at once.
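
Why content-based boundaries matter is easier to see than to describe. The sketch below is a toy, not Borg’s buzhash chunker: it assumes line-oriented data and ends a chunk after any line whose number is a multiple of 50, then compares that with fixed 512-byte chunks after one line is inserted at the top of the file. The functions `fixed_hashes`, `cdc_hashes`, and `shared` are hypothetical helpers for this demo.

```shell
#!/usr/bin/env bash
# Toy comparison: fixed-size vs. content-defined chunk boundaries.
# Assumption: the "multiple of 50" rule stands in for a real rolling hash.
set -euo pipefail

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

seq 1 2000 > "$work/v1.txt"
# v2 is v1 with one line inserted at the top, which shifts every byte offset.
{ echo "inserted line"; cat "$work/v1.txt"; } > "$work/v2.txt"

# Fixed-size chunking: boundaries depend only on byte offsets.
fixed_hashes() {
  local dir
  dir=$(mktemp -d -p "$work")
  split -b 512 "$1" "$dir/c_"
  sha256sum "$dir"/* | awk '{print $1}' | sort -u
}

# Toy content-defined chunking: a boundary follows any line whose
# numeric value is a multiple of 50, wherever that line sits in the file.
cdc_hashes() {
  local dir
  dir=$(mktemp -d -p "$work")
  awk -v dir="$dir" '
    BEGIN { n = 0 }
    { print > (dir "/c_" n) }
    ($1 + 0) > 0 && ($1 % 50) == 0 { close(dir "/c_" n); n++ }
  ' "$1"
  sha256sum "$dir"/* | awk '{print $1}' | sort -u
}

# Count chunk hashes that both versions have in common.
shared() {
  comm -12 <("$1" "$work/v1.txt") <("$1" "$work/v2.txt") | wc -l
}

fixed_shared=$(shared fixed_hashes)
cdc_shared=$(shared cdc_hashes)
echo "fixed-size chunks shared between versions:      $fixed_shared"
echo "content-defined chunks shared between versions: $cdc_shared"
```

With fixed offsets, the insertion shifts every later chunk, so few or none are reused. With content-defined boundaries only the first chunk changes and the rest are found again.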

Duplicacy: content-defined chunks + lock-free multi-source

Duplicacy also uses content-defined chunking, but its storage layout is fundamentally different: chunks are stored and referenced by their hashes and snapshots are independent JSON-like manifests that reference those chunks. That enables truly lock-free concurrent backups: multiple clients can push to the same storage concurrently without server-side locking.

Pros:

Cons:

Analogy: Duplicacy is Git for chunks — many contributors can push branches (snapshots) at once; storage just holds the blobs.

If your setup looks like a spaghetti diagram, Duplicacy wins.
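
The “Git for chunks” idea can be sketched with nothing but a directory and `sha256sum`. This is a toy model under loud assumptions: the `chunks/` and `snapshots/` layout, the one-chunk-per-file rule, and the helper names `store_chunk` and `backup` are invented here and are not Duplicacy’s real storage format. It only shows why naming chunks by hash lets two writers work at once without a lock.

```shell
#!/usr/bin/env bash
# Toy model of a lock-free, content-addressed chunk store.
# Assumption: this layout is invented for illustration only.
set -euo pipefail

store=$(mktemp -d)
src=$(mktemp -d)
trap 'rm -rf "$store" "$src"' EXIT
mkdir -p "$store/chunks" "$store/snapshots"

# Store one file as a chunk named after its SHA-256 and print the hash.
# Writing to a temp name and renaming means nobody sees a partial chunk,
# and two writers racing on the same hash just produce the same file.
store_chunk() {
  local file=$1 hash tmp
  hash=$(sha256sum "$file" | awk '{print $1}')
  if [ ! -e "$store/chunks/$hash" ]; then
    tmp=$(mktemp "$store/chunks/.tmp.XXXXXX")
    cp "$file" "$tmp"
    mv -f "$tmp" "$store/chunks/$hash"
  fi
  echo "$hash"
}

# "Back up" a directory: one chunk per file, plus a manifest of hashes.
backup() {
  local id=$1 dir=$2 f
  mkdir -p "$store/snapshots/$id"
  for f in "$dir"/*; do
    echo "$(store_chunk "$f")  $(basename "$f")"
  done > "$store/snapshots/$id/1"
}

# Two hypothetical clients that share one identical file.
mkdir "$src/laptop" "$src/tower"
echo "same ISO contents" | tee "$src/laptop/distro.iso" > "$src/tower/distro.iso"
echo "laptop notes" > "$src/laptop/notes.txt"
echo "tower logs"   > "$src/tower/logs.txt"

# Run both backups at the same time, with no lock anywhere.
backup laptop "$src/laptop" &
backup tower  "$src/tower"  &
wait

stored=$(find "$store/chunks" -type f ! -name '.tmp.*' | wc -l)
echo "files backed up: 4, chunks stored: $stored"
```

Each client writes its own manifest and only ever adds chunks, so neither needs to know the other exists. The shared file ends up stored once.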

Quick install + first backup

Running these is boringly simple. Examples below use local repo or SFTP/S3 endpoints; adapt to your host.

Borg — init, backup, list

borg-quickstart.sh
# Install (Debian/Ubuntu)
sudo apt update && sudo apt install -y borgbackup
# Export the passphrase for this shell session (safer for services: BORG_PASSCOMMAND or a keyfile)
export BORG_PASSPHRASE='supersecret-passphrase'
# Initialize repository (local or remote via user@host:/path)
borg init --encryption=repokey /srv/borg-repo
# Create a snapshot (archive)
borg create --stats --progress /srv/borg-repo::home-$(hostname)-$(date +%F) /home/kingpin /etc
# List archives
borg list /srv/borg-repo

Notes:

Duplicacy — init, backup, restore (multi-source)

duplicacy-quickstart.sh
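# Download a release binary from GitHub and make it executable.
# Release assets carry the version in their name, so set it explicitly.
VERSION=3.2.3  # example only; pick the current version from the releases page
sudo curl -Lo /usr/local/bin/duplicacy \
"https://github.com/gilbertchen/duplicacy/releases/download/v${VERSION}/duplicacy_linux_x64_${VERSION}"
sudo chmod +x /usr/local/bin/duplicacy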
# On machine A (repo id 'laptop'), point to the same storage (sftp, s3, b2, etc.)
cd /home/kingpin
duplicacy init -e laptop sftp://backup.example.com/duplicacy
# Run backup (will prompt for encryption passphrase)
duplicacy backup -stats
# On machine B (same storage, different repo id)
cd /home/other
duplicacy init -e tower sftp://backup.example.com/duplicacy
duplicacy backup -stats
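# Restore revision 1 of 'laptop' into /restore. Duplicacy restores into the
# current repository directory, so initialize /restore against the same storage first.
mkdir -p /restore && cd /restore
duplicacy init -e laptop sftp://backup.example.com/duplicacy
duplicacy restore -r 1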

Notes:

Multi-source and multi-destination workflows

This is the crown jewel: Duplicacy was designed for multi-writer, multi-source vaults. Want 5 laptops + server + NAS all pushing dedup’d data into a single S3 bucket? Duplicacy does that without special servers or locks. It stores chunks by hash and snapshots separately, so concurrent writes are safe.

Borg, by contrast, expects one repository to be the authoritative metadata holder. Several clients can share a central Borg repo over SSH, but Borg takes an exclusive lock on the repository while it writes, so the clients have to take turns and you have to manage the collisions yourself. Workarounds:

If your setup is multiple machines backing up to one central vault — Duplicacy wins hands down. No locks, no merge headaches, no shepherding backups at 3 AM.
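
If you do stay on Borg with several clients sharing one repo, one way to live with the lock is to let each client wait for it instead of failing straight away. The wrapper below is a minimal sketch: `--lock-wait` is a real Borg option, but the repo path, the 600-second wait, the `run_backup` helper, and the dry-run default are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: make a Borg client wait for the repository lock.
set -euo pipefail

REPO=${REPO:-/srv/borg-repo}   # hypothetical repo path, as in the quickstart
LOCK_WAIT=${LOCK_WAIT:-600}    # assumption: 10 minutes covers the other client
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the command

# run_backup NAME PATH...: back up PATHs into an archive called NAME-<date>.
run_backup() {
  local name=$1
  shift
  local cmd=(borg create --lock-wait "$LOCK_WAIT" --stats
             "$REPO::$name-$(date +%F)" "$@")
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

run_backup "${HOSTNAME:-client}" /home /etc
```

Set `DRY_RUN=0` to actually run it. Staggering the schedules is still worth doing, since waiting clients only queue up behind each other.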

Encryption & key management — the practical story

Both tools offer strong client-side encryption built on AES-256, so the cipher is not the weak point. The key-handling story is what trips people up.

Borg:

Duplicacy:

Practical take:

Encryption example — Borg (repokey) and Duplicacy:

encryption-examples.sh
# Borg: repokey init (stores key in repo encrypted by passphrase)
export BORG_PASSPHRASE='supersecret'
borg init --encryption=repokey /srv/borg-repo
# Duplicacy: init with encryption (will prompt; same password on all clients)
duplicacy init -e laptop sftp://backup.example.com/duplicacy
# (follow prompt to set passphrase)

Prune / forget policies compared

Keeping every snapshot forever = quick way to buy a bigger shelf of drives. Both tools provide retention/forget/prune options, but their models differ.

Borg:

borg-prune.sh
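# Keep 7 daily, 4 weekly, 6 monthly
borg prune -v --list --keep-daily=7 --keep-weekly=4 --keep-monthly=6 /srv/borg-repo
# Borg 1.2+: prune only marks data as deleted; compact actually frees the space
borg compact /srv/borg-repo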

Duplicacy:

duplicacy-prune.sh
# Keep 1 revision/day for revisions older than 7 days,
# 1 revision/week for revisions older than 30 days,
# 1 revision/month for revisions older than 180 days,
# no revisions older than 365 days
# (multiple -keep flags, sorted by m descending; add -dry-run first to preview)
duplicacy prune -keep 0:365 -keep 30:180 -keep 7:30 -keep 1:7 -exhaustive
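
The `n:m` pairs are easy to misread, so here is a small sketch of which rule applies to a revision of a given age. It does not call Duplicacy. It assumes the documented meaning of `-keep n:m` (keep one revision every n days once a revision is older than m days), treats “older than” as strictly greater, and the `keep_interval` helper is hypothetical.

```shell
#!/usr/bin/env bash
# Which -keep rule applies to a revision of a given age?
# Assumption: "older than m days" is modelled as age > m.
set -euo pipefail

# Rules as "n:m", sorted by m descending, same order as the prune command.
RULES=(0:365 30:180 7:30 1:7)

# Print the n that applies to a revision that is $1 days old,
# or "all" when no rule matches and every revision is kept.
keep_interval() {
  local age=$1 rule n m
  for rule in "${RULES[@]}"; do
    n=${rule%%:*}
    m=${rule##*:}
    if [ "$age" -gt "$m" ]; then
      echo "$n"
      return
    fi
  done
  echo "all"
}

for age in 3 10 45 200 400; do
  interval=$(keep_interval "$age")
  case $interval in
    all) echo "$age days old: keep every revision" ;;
    0)   echo "$age days old: delete" ;;
    *)   echo "$age days old: keep one revision every $interval day(s)" ;;
  esac
done
```

Reading the rules from the largest m down is what makes the first match the right one, which is why the flags are listed in that order.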

Practical note: pruning speed matters when you manage hundreds of snapshots. Borg’s internal index and segment model often make metadata operations snappier for single-repo situations. Duplicacy’s prune is robust across many clients, but expect more work when the storage is huge.

Cloud story — friction matters

If you’re cloud-first, this is where opinions harden.

Borg:

Duplicacy:

If your goal is “put backups straight into cloud storage and forget”, Duplicacy has far less friction because it talks to object storage such as S3 and B2 directly. If you prefer fully open source, run-your-own-server setups, or have an SSH-accessible NAS, Borg is fine, but expect to manage the server.

Restore speed & recovery — which gets you unstuck

Restores are where courage and caffeine meet.

Borg:

Duplicacy:

The truth:

If you’re on a slow link and need one file in a hurry at 2 AM, mount the Borg archive with borg mount and copy the file out instead of restoring everything. Your 2 AM self will love you for it.
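
A single-file rescue with Borg’s FUSE mount could look roughly like this. `borg mount` and `borg umount` are real commands, but the archive name, mount point, destination, and the `run` and `restore_file` helpers are placeholders. The sketch defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: pull one file out of a Borg archive via a FUSE mount.
set -euo pipefail

DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands instead of running them

# Print a command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# restore_file REPO ARCHIVE PATH_IN_ARCHIVE DEST_DIR
restore_file() {
  local repo=$1 archive=$2 path=$3 dest=$4
  local mnt=${MOUNTPOINT:-/mnt/borg}   # hypothetical mount point
  run mkdir -p "$mnt" "$dest"
  run borg mount "$repo::$archive" "$mnt"
  run cp -a "$mnt/$path" "$dest/"
  run borg umount "$mnt"
}

# Placeholder archive name; paths inside an archive have no leading slash.
restore_file /srv/borg-repo home-tower-2024-01-01 home/kingpin/.bashrc /tmp/restore
```

Run it with `DRY_RUN=0` once the names match your repo. If the copy fails, the archive stays mounted, so unmount it by hand before retrying.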

Scheduling: Borgmatic vs Duplicacy native + Web

Borgmatic = borg + sane automation.

Updated: borgmatic 1.8 deprecated the location:/storage:/retention: section keys. This uses the modern flat configuration format.

borgmatic-config.yaml
# /etc/borgmatic/config.yaml
source_directories:
    - /home
    - /etc
repositories:
    - path: user@backup:/srv/borg-repo
# Read the passphrase from the environment instead of hard-coding it
encryption_passphrase: "${BORG_PASSPHRASE}"
keep_daily: 7
keep_weekly: 4
keep_monthly: 6

Borgmatic runs on cron/systemd timers and is rock-solid.

Duplicacy:

If you like YAML config that just works, borgmatic + borg is a polished combo. If you’d rather point-and-click and pay a few bucks for a dashboard, duplicacy web is a reasonable choice.

License reality check

Short version:

Practical impact:

Which one gets the trophy? Decision matrix by use case

Practical checklist before you pick

Final verdict — SumGuy’s pick by persona

If forced to pick one for a general-purpose, multi-machine home lab: Duplicacy for multi-source + cloud convenience; Borg for pure OSS lovers and local-first setups. Either way, stop running rsync+cron like it’s 2006. Your future self and your wallet (storage bills) will thank you.

Closing notes (and a tiny bit of sarcasm)

Backups are boring until they’re life-saving. Don’t let shiny GUI features distract you from: verifying restores, storing keys offline, and actually testing a recovery. Backing up your backups is not paranoid — it’s responsible. If backing up feels like hiring a forklift to move a couch — technically it works, but your neighbors will have questions — maybe simplify your approach: pick the tool that matches your topology and stick to a tested schedule.
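
“Verifying restores” can be as simple as comparing checksums of the source against a test restore. The sketch below is tool-agnostic and works on any two directories; the `manifest` and `verify_restore` helpers and the demo directories are hypothetical, and it assumes GNU `find`, `sort`, and `xargs`.

```shell
#!/usr/bin/env bash
# Sketch: check that a restored tree matches its source, file by file.
set -euo pipefail

# Print "hash  ./relative/path" for every file under a directory, sorted by path.
manifest() {
  ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum )
}

# verify_restore SOURCE_DIR RESTORED_DIR: succeed only if the contents match.
verify_restore() {
  if diff <(manifest "$1") <(manifest "$2") > /dev/null; then
    echo "OK: restore matches source"
  else
    echo "MISMATCH: restore differs from source" >&2
    return 1
  fi
}

# Demo with throwaway directories standing in for a real source and restore.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/src/etc" "$demo/good" "$demo/bad"
echo "hello" > "$demo/src/etc/motd"
cp -a "$demo/src/." "$demo/good/"
cp -a "$demo/src/." "$demo/bad/"
echo "bit rot" >> "$demo/bad/etc/motd"

verify_restore "$demo/src" "$demo/good"
verify_restore "$demo/src" "$demo/bad" || echo "caught the bad restore"
```

On a live system the source keeps changing, so compare against a snapshot you just took, or expect a few harmless differences in files that were modified since.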

Want a short pairing guide for your exact setup (how many machines, cloud vs local, bandwidth constraints)? Tell me your topology and I’ll map the winner.

