😤 When a Backup Fails… Because of a Filename
Recently, one of my automated backups on Linux unexpectedly crashed mid-run. No fancy error message, no graceful fallback — just a silent abort.
After some hardcore grep and log spelunking, I found the culprit: a single file with a non-compliant filename that the backup tool couldn’t process. Worse yet, the file had come from a macOS system (a MacBook Air). Of course. 😅
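Turns out macOS loves to sprinkle in some lovely Unicode quirks: accented characters that are often stored in decomposed form (a plain o followed by a separate combining diaeresis instead of a single ö), invisible characters, and creative use of special symbols. Great for humans, not so great for Linux backup tools that expect tidier names.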
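If you want to hunt for offenders before a backup trips over them, here is a minimal sketch. It assumes GNU find, and the function name find_suspect_names is made up for this example. In the C locale, every byte outside printable ASCII (space through tilde) counts as suspicious, which catches accented characters, emoji, and control characters alike.

```shell
#!/bin/bash
# Sketch only: find_suspect_names is a hypothetical helper, not part of any tool.

# List every path under DIR whose name contains a byte outside the
# printable ASCII range (space through tilde).
# usage: find_suspect_names DIR
find_suspect_names() {
    # LC_ALL=C makes find compare raw bytes instead of locale characters
    LC_ALL=C find "$1" -name '*[! -~]*' -print
}
```

Call it as find_suspect_names /data. It only lists paths and renames nothing.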
💾 Enter rdiff-backup: My Versioned Backup Weapon of Choice
I use rdiff-backup for my daily backups because it keeps versioned, incremental backups. Think Time Machine, but nerdier and CLI-friendly:
rdiff-backup /data /mnt/backup/data
But here’s the kicker: rdiff-backup choked on that malformed filename and bailed out. No incremental magic, no versioning — just failure.
Lesson learned: clean your filenames before backing them up. So I added a pre-backup step: detox.
🧼 What Is detox and Why Should You Care?
detox is a small Linux command-line tool that cleans up messy filenames (recursively, with -r) by replacing or removing problematic characters like:
- Spaces
- Umlauts (ä, ö, ü → ae, oe, ue)
- Unicode madness
- Weird symbols and control characters
Perfect for files that came from… well… anywhere that isn’t Linux.
🛠️ Install It
On Debian, Ubuntu, and other apt-based distros, it’s as simple as:
sudo apt install detox
🚀 Basic Usage
To recursively clean a directory:
detox -r /your/directory
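Since detox renames files in place, a dry run is worth doing first. Here is a minimal sketch that relies on detox's -n (dry run) flag. preview_renames and the DETOX_BIN override are hypothetical names introduced for this example.

```shell
#!/bin/bash
# Sketch only: preview_renames and DETOX_BIN are hypothetical names.
# DETOX_BIN defaults to the real detox binary.

# Show what detox would rename under DIR without touching anything.
# usage: preview_renames DIR
preview_renames() {
    local dir=$1
    # -r recurse, -n dry run (change nothing), -v list each planned rename
    "${DETOX_BIN:-detox}" -r -n -v "$dir"
}
```

Run preview_renames /your/directory, skim the list, and only then run the real thing without -n.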
🔧 Sample Before/After Magic
Original Filename | Cleaned Up |
---|---|
Urlaubsfotos 2024 (Köln).jpg | Urlaubsfotos_2024_Koeln.jpg |
Résumé_finalé.pdf | Resume_finale.pdf |
Projekt 1#Beta!.zip | Projekt_1Beta.zip |
🧪 Recommended Schemes
detox supports various “schemes” (detox itself calls them sequences, selected with -s) for how filenames are sanitized. Here’s what worked best for me:
detox -r -s utf_8 -s iso8859_1 -v /data
- -s utf_8: cleans up funky multibyte Unicode stuff
- -s iso8859_1: transliterates German/French accents
- -v: shows exactly what’s being renamed (highly recommended)
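One caveat I have not verified across detox versions: -s selects a single named sequence, and it may be that only the last -s on the command line takes effect. Assuming that, a loop that runs one sequence per invocation keeps the order explicit. The function name and the DETOX_BIN override below are hypothetical.

```shell
#!/bin/bash
# Sketch only: detox_each_sequence and DETOX_BIN are hypothetical names.
# Tip: "detox -L" lists the sequences your installation knows about.

# Run detox once per sequence, in the order given.
# usage: detox_each_sequence DIR SEQUENCE...
detox_each_sequence() {
    local dir=$1
    shift
    local name
    for name in "$@"; do
        # Stop at the first failing pass so later passes never run on a half-cleaned tree
        "${DETOX_BIN:-detox}" -r -v -s "$name" "$dir" || return 1
    done
}
```

For the setup above that would be detox_each_sequence /data utf_8 iso8859_1. If the first pass already turns everything into plain ASCII, the second pass simply finds nothing to do.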
🔁 Now My Backup Pipeline Looks Like This
# Step 1: Clean up file names
detox -r -s utf_8 -s iso8859_1 -v /data
# Step 2: Run versioned backup
rdiff-backup /data /mnt/backup/data
Ever since I added detox to the mix, my rdiff-backup runs have been smooth as butter, even with messy files from macOS, USB sticks, or synced cloud folders.
🧙 Automate It Like a Nerd
If you’re into scripting, here’s a no-nonsense example:
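#!/bin/bash
# Stop at the first failing command so a broken cleanup never feeds the backup
set -euo pipefail
SOURCE="/data"
TARGET="/mnt/backup/data"
# Clean first (renames files in place under $SOURCE)
detox -r -s utf_8 -s iso8859_1 -v "$SOURCE"
# Then back up with version history
rdiff-backup "$SOURCE" "$TARGET"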
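If the script runs unattended, a few preflight checks make failures easier to diagnose. The sketch below is one way to do it, not the only one. The function name, the exit codes, and the DETOX_BIN and RDIFF_BIN overrides are all assumptions made for this example.

```shell
#!/bin/bash
# Sketch only: backup_with_preflight, its exit codes, and the DETOX_BIN /
# RDIFF_BIN overrides are hypothetical. The overrides default to the real tools.

# usage: backup_with_preflight SOURCE TARGET
backup_with_preflight() {
    local src=$1
    local dst=$2
    local detox_bin=${DETOX_BIN:-detox}
    local rdiff_bin=${RDIFF_BIN:-rdiff-backup}
    local tool

    # Refuse to run if the source directory is missing
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 2
    fi

    # Refuse to run if either tool is not installed
    for tool in "$detox_bin" "$rdiff_bin"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 3
        fi
    done

    # Clean first; skip the backup entirely if the cleanup fails
    "$detox_bin" -r -s utf_8 -s iso8859_1 -v "$src" || return 4

    # Then back up with version history
    "$rdiff_bin" "$src" "$dst"
}
```

Call it as backup_with_preflight /data /mnt/backup/data. One thing to keep in mind: detox renames files in the source tree itself, so anything that refers to the old names (playlists, symlinks, application databases) needs a look afterwards.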
Add it to cron, systemd, or whatever your flavor of automation is. You’ll thank yourself later.
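For cron, the entry is a single line. Here is a small sketch that builds one. The function name cron_entry, the script path, and the log file are placeholders for this example, so adjust them to wherever your script actually lives.

```shell
#!/bin/bash
# Sketch only: cron_entry and all paths used with it are hypothetical.

# Print a crontab line that runs SCRIPT daily at HOUR:00 and appends
# both stdout and stderr to LOGFILE.
# usage: cron_entry SCRIPT HOUR LOGFILE
cron_entry() {
    local script=$1
    local hour=$2
    local log=$3
    printf '0 %d * * * %s >>%s 2>&1\n' "$hour" "$script" "$log"
}
```

To install it alongside your existing entries, something like ( crontab -l 2>/dev/null; cron_entry /usr/local/bin/backup.sh 2 /var/log/backup.log ) | crontab - does the job. Writing to /var/log usually requires running the job as root.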
📚 Final Thoughts
If you’re running backups, syncing across OSes, or just want filenames that won’t make your shell scripts cry, detox is a must-have.
It’s small, fast, recursive, and it saved my ass. Paired with rdiff-backup, it makes your backup pipeline a lot more resilient and predictable.
And remember: garbage in, garbage out. Clean first, back up second.