😤 When a Backup Fails… Because of a Filename

Recently, one of my automated backups on Linux unexpectedly crashed mid-run. No fancy error message, no graceful fallback — just a silent abort.

After some hardcore grep and log spelunking, I found the culprit: a single file with a non-compliant filename that the backup tool couldn’t process. Worse yet — the file had come from a macOS system (MacBook Air). Of course. 😅

Turns out macOS loves to sprinkle in some lovely Unicode quirks, invisible characters, and creative use of special symbols. Great for humans, not so great for backup tools on Linux, even though the file system itself will happily store almost any byte in a filename.

💾 Enter rdiff-backup: My Versioned Backup Weapon of Choice

I use rdiff-backup for my daily backups because it supports versioned, incremental backups — like Time Machine, but nerdier and CLI-friendly:

rdiff-backup /data /mnt/backup/data

But here’s the kicker: rdiff-backup choked on that malformed filename and bailed out. No incremental magic, no versioning — just failure.

Lesson learned: Clean your files before backing them up. So I added a pre-backup step: detox.

🧼 What Is detox and Why Should You Care?

detox is a small Linux command-line tool that cleans up messy filenames (recursively, if you ask it to) by replacing or removing problematic characters like:

  • Spaces
  • Umlauts (ä, ö, ü → ae, oe, ue)
  • Unicode madness
  • Weird symbols and control characters

Perfect for files that came from… well… anywhere that isn’t Linux.
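If you want a rough idea of how many filenames are affected before renaming anything, plain find can list the candidates. This is only a sketch, and it is not how detox decides what to change: the helper name list_suspect_names and the "safe" character set (ASCII letters, digits, dot, underscore, hyphen) are assumptions for illustration, so it will likely flag more names than detox would actually touch.

```bash
# List every path under a directory whose name contains a byte outside a
# conservative "safe" set: ASCII letters, digits, dot, underscore, hyphen.
# LC_ALL=C makes find match raw bytes, so multibyte UTF-8 characters
# (umlauts, accents, emoji) are flagged along with spaces and symbols.
# Note: list_suspect_names is a hypothetical helper, not part of detox.
list_suspect_names() {
    LC_ALL=C find "$1" -mindepth 1 -name '*[!A-Za-z0-9._-]*' -print
}
```

Calling list_suspect_names /data prints one path per line, which makes it easy to skim the list or count the offenders with wc -l.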

🛠️ Install It

On Debian, Ubuntu, and other apt-based distros, it’s as simple as:

sudo apt install detox

🚀 Basic Usage

To recursively clean a directory:

detox -r /your/directory
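Renames are hard to undo, so a preview is worth the extra second. Here is a small sketch, assuming your detox build supports the -n (dry run) and -v (verbose) flags described in its man page; preview_detox is a hypothetical helper name, not something detox ships.

```bash
# Show what detox would rename under a directory without changing anything.
# -r recurses, -n asks detox for a dry run, -v prints each planned rename.
# Note: preview_detox is a hypothetical wrapper name used for illustration.
preview_detox() {
    detox -r -n -v "$1"
}
```

Run preview_detox /your/directory first, read through the output, and only then run the real command without -n.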

🔧 Sample Before/After Magic

Original Filename → Cleaned Up
Urlaubsfotos 2024 (Köln).jpg → Urlaubsfotos_2024_Koeln.jpg
Résumé_finalé.pdf → Resume_finale.pdf
Projekt 1#Beta!.zip → Projekt_1Beta.zip

🧪 Recommended Schemes

detox supports various “schemes” (its man page calls them sequences) for how filenames are sanitized. Here’s what worked best for me:

detox -r -s utf_8 -s iso8859_1 -v /data

  • utf_8: removes funky multibyte Unicode stuff
  • iso8859_1: transliterates German/French accents
  • -v: shows exactly what’s being renamed (highly recommended)

🔁 Now My Backup Pipeline Looks Like This

# Step 1: Clean up file names
detox -r -s utf_8 -s iso8859_1 -v /data

# Step 2: Run versioned backup
rdiff-backup /data /mnt/backup/data

Ever since I added detox to the mix, my rdiff-backup runs have been smooth as butter — even when syncing messy files from macOS, USB sticks, or synced cloud folders.

🧙 Automate It Like a Nerd

If you’re into scripting, here’s a no-nonsense example:

#!/bin/bash

SOURCE="/data"
TARGET="/mnt/backup/data"

# Clean first
detox -r -s utf_8 -s iso8859_1 -v "$SOURCE"

# Then back up with version history
rdiff-backup "$SOURCE" "$TARGET"

Add it to cron, systemd, or whatever your flavor of automation is. You’ll thank yourself later.
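For unattended runs, it helps if the script refuses to back up when the cleanup step did not go through. Below is a minimal sketch of that idea, not a definitive setup: the function name run_clean_backup, the script path, the log file, and the schedule in the commented crontab line are all assumptions for illustration. The detox and rdiff-backup invocations are the same ones used above.

```bash
#!/bin/bash
# Hypothetical cron-friendly wrapper; names and paths are illustrative.

run_clean_backup() {
    local source_dir="$1" target_dir="$2"

    # Refuse to run if the source is missing (for example, an unmounted disk).
    if [ ! -d "$source_dir" ]; then
        echo "source directory not found: $source_dir" >&2
        return 1
    fi

    # Clean first; skip the backup entirely if detox reports an error.
    if ! detox -r -s utf_8 -s iso8859_1 -v "$source_dir"; then
        echo "detox failed, skipping backup" >&2
        return 1
    fi

    # Then back up with version history; its exit status becomes ours.
    rdiff-backup "$source_dir" "$target_dir"
}

# Example crontab entry (assumed install path and log file), daily at 02:30:
# 30 2 * * * /usr/local/bin/clean-backup.sh >> /var/log/clean-backup.log 2>&1
```

To use it, end the script with a call such as run_clean_backup /data /mnt/backup/data. Because the function returns a non-zero status on any failure, the log (and cron's mail, if you have it configured) shows which step went wrong.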

📚 Final Thoughts

If you’re running backups, syncing across OSes, or just want filenames that won’t make your shell scripts cry, detox is a must-have.

It’s small, fast, recursive, and it saved my ass. Especially when used with rdiff-backup, it makes your backup pipeline a lot more resilient and deterministic.

And remember: garbage in, garbage out. Clean first, back up second.

📣 Got a story where a filename ruined your day? Or a better tool than detox? Drop a comment and let’s geek out!
