In my previous article I described how I reorganized several terabytes of personal data and bundled large collections of photos into RAR archives.
This approach dramatically reduced the number of small files in my dataset and made backups much faster. However, once you start relying on archives for long-term storage, a new question naturally arises:
How do you regularly verify that all archive files are still intact?
Hard drives age, SSDs can fail, and silent corruption (bitrot) is always a theoretical possibility when storing data over many years.
Fortunately, the RAR toolchain includes built-in commands to test archives without extracting them. Running a test manually is easy:
rar t archive.rar
But doing this manually for hundreds or thousands of archives is obviously impractical.
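Like most Unix tools, `rar` signals success or failure through its exit status, which is what makes the check scriptable in the first place. A minimal sketch (it assumes `rar` is installed and that a file named `archive.rar` exists in the current directory):

```shell
#!/usr/bin/env bash
# Branch on the exit status of the archive test.
# "archive.rar" is a placeholder name; the tool prints nothing here
# because we silence its output and rely only on the exit code.
if rar t archive.rar >/dev/null 2>&1; then
    echo "archive.rar is intact"
else
    echo "archive.rar is damaged or missing"
fi
```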
So I wrote a small Bash script that recursively scans a storage tree and verifies every RAR archive it finds.
What the Script Does
- Recursively scans a storage directory
- Detects both single-volume and multi-volume RAR archives
- Tests only the first volume of each multipart set (unrar then follows the remaining volumes automatically)
- Performs a fast header scan first
- Runs a full CRC verification
- Logs all results to a logfile
- Writes failing archives into a separate error list
- Runs several tests in parallel for speed
- Displays a progress indicator and ETA
This makes it practical to periodically verify a large archive dataset without manually touching every file.
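The trickiest part of the list above is deciding which files count as "first volumes". Here is a minimal, standalone sketch of that detection logic, using hypothetical sample filenames (the numeric comparison with `10#` strips leading zeros, so `part1`, `part01`, and `part001` are all recognized):

```shell
#!/usr/bin/env bash
# Decide whether a filename is a first volume worth testing.
# Returns 0 (test it) or 1 (skip it).
is_first_volume() {
    local base=$1
    # Modern multipart naming: archive.part01.rar, archive.part2.rar, ...
    if [[ "$base" =~ \.part([0-9]+)\.rar$ ]]; then
        # Compare numerically so leading zeros don't matter
        (( 10#${BASH_REMATCH[1]} == 1 ))
        return
    fi
    # Old-style continuation volumes: archive.r00, archive.r01, ...
    [[ "$base" =~ \.r[0-9][0-9]$ ]] && return 1
    # Plain archive, or the .rar first volume of an old-style set
    return 0
}

# Hypothetical sample names to illustrate the decision
for f in photos.rar photos.part01.rar photos.part02.rar photos.r00; do
    if is_first_volume "$f"; then
        echo "test: $f"
    else
        echo "skip: $f"
    fi
done
# → test: photos.rar
#   test: photos.part01.rar
#   skip: photos.part02.rar
#   skip: photos.r00
```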
The Script
Below is the Bash script I currently use to verify my RAR archives.
#!/usr/bin/env bash
# ==========================================
# Configuration
# ==========================================
ERRORFILE="fehlerhafte_rars.txt"
LOGFILE="rar_test.log"
LISTFILE="$(mktemp)"
# Number of parallel jobs (adjust depending on storage speed)
THREADS=${THREADS:-4}
: > "$ERRORFILE"
: > "$LOGFILE"
echo "Searching for RAR archives..."
# ==========================================
# Build list of archives to test
# Only first volumes are included
# ==========================================
find /media/raphael/MASTER/daten/ -type f \( -iname "*.rar" -o -iname "*.r[0-9][0-9]" \) | while IFS= read -r f
do
base=$(basename "$f")
# Detect modern multipart archives (.part01.rar)
if [[ "$base" =~ \.part([0-9]+)\.rar$ ]]; then
part="${BASH_REMATCH[1]}"
# Only keep the first volume
[[ "$part" != "01" && "$part" != "1" ]] && continue
echo "$f"
continue
fi
# Skip old multipart volumes (.r00 .r01 ...)
if [[ "$base" =~ \.r[0-9][0-9]$ ]]; then
continue
fi
# Normal archive or first volume of an old multipart set
echo "$f"
done | sort -u > "$LISTFILE"
TOTAL=$(wc -l < "$LISTFILE")
echo "Found archives: $TOTAL"
echo "Parallel jobs: $THREADS"
echo
START=$(date +%s)
COUNT=0
# ==========================================
# Function: test archive
# ==========================================
test_archive() {
local file="$1"
# Fast header scan (quickly checks archive structure)
if ! unrar lt -idq "$file" >/dev/null 2>&1; then
echo "$(date '+%F %T') | HEADER_ERROR | $file" >> "$LOGFILE"
echo "$(date '+%F %T') | $file" >> "$ERRORFILE"
return
fi
# Full CRC verification
if unrar t -idq "$file" >/dev/null 2>&1; then
echo "$(date '+%F %T') | OK | $file" >> "$LOGFILE"
else
echo "$(date '+%F %T') | CRC_ERROR | $file" >> "$LOGFILE"
echo "$(date '+%F %T') | $file" >> "$ERRORFILE"
fi
}
export -f test_archive
export ERRORFILE LOGFILE
# ==========================================
# Main loop with parallel execution
# ==========================================
while IFS= read -r file
do
((COUNT++))
test_archive "$file" &
# Limit number of parallel jobs
if (( $(jobs -r | wc -l) >= THREADS )); then
wait -n
fi
# Progress and ETA calculation
now=$(date +%s)
runtime=$((now-START))
avg=$((runtime/COUNT))
remain=$((TOTAL-COUNT))
eta=$((remain*avg))
printf "\rProgress: %d/%d | ETA %02d:%02d:%02d" \
"$COUNT" "$TOTAL" \
$((eta/3600)) $((eta%3600/60)) $((eta%60))
done < "$LISTFILE"
wait
echo
echo
echo "Finished."
echo "Error list: $ERRORFILE"
echo "Logfile: $LOGFILE"
How It Works
The script performs two verification stages:
- Header scan: quickly verifies the archive structure
- Full CRC test: verifies the actual file contents
If an archive fails either step, it is written to a separate error file.
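The error list also makes it easy to re-check only the failures later, for example after replacing a suspect disk. A small sketch (the `sed` expression assumes the `date | path` line format the script writes; the function name `failed_paths` is my own):

```shell
#!/usr/bin/env bash
# Re-check archives recorded in the error file.
# Each line looks like: "2024-01-02 03:04:05 | /path/to/archive.rar"
ERRORFILE="fehlerhafte_rars.txt"

failed_paths() {
    # Everything after the first "| " is the path (paths may contain spaces)
    sed 's/^[^|]*| //' "$ERRORFILE"
}

# Re-run the full CRC test on each previously failing archive
if [ -s "$ERRORFILE" ]; then
    failed_paths | while IFS= read -r f; do
        unrar t -idq "$f" >/dev/null 2>&1 && echo "now OK: $f" || echo "still broken: $f"
    done
fi
```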
Because the script runs multiple verification jobs in parallel, it can test large datasets much faster than a sequential scan.
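If you would rather not manage background jobs and `wait -n` by hand, the same parallelism can be sketched with GNU `xargs -P`. This is a design alternative, not what the script above uses, and the simple `OK`/`BAD` reporting drops the logging and ETA features:

```shell
#!/usr/bin/env bash
# Parallel verification via xargs instead of manual job control.
# -print0/-0 keep filenames with spaces or newlines safe;
# -P runs up to $THREADS tests at once.
THREADS=${THREADS:-4}

verify() {
    unrar t -idq "$1" >/dev/null 2>&1 && echo "OK  $1" || echo "BAD $1"
}
# Export so the bash -c child processes spawned by xargs can see it
export -f verify

find /media/raphael/MASTER/daten/ -type f -iname "*.rar" -print0 2>/dev/null |
    xargs -0 -r -n1 -P "$THREADS" bash -c 'verify "$1"' _
```

The trade-off: `xargs` keeps exactly `$THREADS` workers busy with no bookkeeping in the script, but progress reporting and per-file logging become harder to bolt on.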
Why Archive Verification Matters
Long-term storage always carries some risk. Periodic archive verification helps detect problems early before backups become unusable.
Especially for datasets stored for many years, like family photos or personal archives, running a periodic integrity check is simply good practice.
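To make the check truly periodic, you can hand it to cron. The script path and log location below are placeholders of my own choosing, so adjust them to wherever you keep the script:

```
# Run the RAR verification on the 1st of every month at 03:00
# (script path and log file are hypothetical placeholders)
0 3 1 * * /home/raphael/bin/check_rars.sh >> /home/raphael/check_rars.cron.log 2>&1
```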
And of course… automating such tasks with Bash is half the fun.
