Backup my data to AWS S3 Glacier

🛡️ The 3-2-1 backup rule: Your data’s helmet

Are you familiar with the 3-2-1 backup rule?
It’s a CRUCIAL rule in my opinion.
If you don’t do it, it’s like riding a bike or a scooter without a helmet: as long as nothing has happened to you, you don’t realize the risk.

The 3-2-1 backup rule is:

  • 3 copies of your data
    ➡️ This means you must have three separate copies of your data.
    For example, 1 copy on your PC and 2 backup copies.
  • 2 types of media
    ➡️ Copies must be stored on at least two different types of media.
    For example, a hard drive and cloud storage (like AWS S3 Glacier).
  • 1 remote copy
    ➡️ At least one of the copies must be stored remotely.
    In the cloud, at a friend’s house, etc.

So, having learned the hard way what happens when you ride without a helmet, I apply this rule mainly to the photos and videos on my NAS, which are irreplaceable memories.

❌ Why I ditched Synology’s Glacier Backup app

Synology provides an app called Glacier Backup, but in my experience:

  • It’s obsolete and no longer maintained.
  • It crashes randomly.
  • Logs are almost non-existent, making debugging impossible.
  • It completely stopped working for me a few weeks ago, with no clear reason.

🔄 Why not use Hyper Backup app?

Hyper Backup is a more modern and reliable backup app on Synology. However, it doesn’t support Glacier Deep Archive directly. The only workaround is:

  1. Back up to a standard S3 bucket.
  2. Apply a lifecycle rule to transition the data to Glacier Deep Archive.

The problem? Lifecycle transitions are not immediate and can't be scheduled. Since I back up daily, I ended up paying for S3 Standard storage for almost 24 hours every day, which defeats the cost-saving purpose.
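
For reference, here is roughly what that workaround looks like with the AWS CLI (a sketch of the setup I abandoned, assuming a bucket named <bucket_name> and the aws CLI installed):

# Lifecycle rule that transitions every object to Deep Archive
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "ToDeepArchive",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": 0, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
EOF

# Apply it to the bucket; even with Days = 0, AWS only evaluates
# lifecycle rules about once a day, so new objects keep billing as
# S3 Standard until the transition runs.
aws s3api put-bucket-lifecycle-configuration \
    --bucket <bucket_name> \
    --lifecycle-configuration file://lifecycle.json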

✅ The rclone alternative

rclone is an incredibly powerful command-line tool that can manage a wide range of remote storage backends, including AWS S3 Glacier.

So I built a script that uses rclone to upload my NAS data directly to an AWS S3 bucket configured for Glacier Deep Archive. It gives me full control, direct sync, and reliable logging.

🛠️ How to set it up

Install rclone

Follow the documentation; it's very simple.

On Linux for example, it’s as simple as:

curl https://rclone.org/install.sh | sudo bash
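
You can then check that everything is in place:

rclone version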

Configure an S3 remote

You can configure your S3 remote with a single-line command.
For example:

rclone config create <config_name> s3 provider AWS access_key_id <my_access_key_id> secret_access_key <my_secret_access_key> region eu-west-1 location_constraint eu-west-1 acl private server_side_encryption AES256 storage_class DEEP_ARCHIVE

Fill in the <config_name>, <my_access_key_id> and <my_secret_access_key> fields

You can also directly create an rclone.conf file in your home directory:
~/.config/rclone/rclone.conf

Here is an example:

[<config_name>]
type = s3
provider = AWS
access_key_id = <my_access_key_id>
secret_access_key = <my_secret_access_key>
region = eu-west-1
location_constraint = eu-west-1
acl = private
server_side_encryption = AES256
storage_class = DEEP_ARCHIVE

Fill in the <config_name>, <my_access_key_id> and <my_secret_access_key> fields

Or you can also type this command:

rclone config

And select the options one by one with a menu.

Here is an example of the options you can select:

  • New remote: enter the name of the config to create.
  • Storage: Amazon S3
  • Provider: AWS S3
  • Credentials: enter the access_key_id and secret_access_key of an IAM user with a policy granting access to the Glacier bucket (see the example policy after this list)
  • Region: eu-west-1 (Ireland)
  • Endpoint: leave empty
  • Location constraint: eu-west-1
  • ACL: private
  • server_side_encryption: AES256
  • sse_kms_key_id: none
  • Storage Class: DEEP_ARCHIVE
  • Advanced config: no
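
For reference, here is a minimal sketch of the kind of IAM policy I mean, attached to the backup user with the AWS CLI. The bucket name, user name and policy name are placeholders; adjust the actions to your needs:

# Hypothetical inline policy limiting the backup user to one bucket
cat > glacier-backup-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::<bucket_name>"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::<bucket_name>/*"
    }
  ]
}
EOF

aws iam put-user-policy \
    --user-name <backup_user> \
    --policy-name glacier-backup \
    --policy-document file://glacier-backup-policy.json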

Test

To test your configuration, you can copy a file to the remote S3 bucket:

rclone copy /path/to/my/file <config_name>:<bucket_name>

Fill in the <config_name> and <bucket_name> fields

And list the remote bucket:

rclone ls <config_name>:<bucket_name>

Fill in the <config_name> and <bucket_name> fields
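
If you also want to confirm that the object really landed in Deep Archive, you can inspect its storage class with the AWS CLI (assuming it is installed and configured; the key is the file name at the bucket root):

aws s3api head-object \
    --bucket <bucket_name> \
    --key <file_name> \
    --query StorageClass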

📄 The script

#!/bin/bash
# glacier_backup.bash v1.0
# Author: Julián
# Description:
# - Sync a folder to an S3 Glacier Deep Archive bucket using rclone
# - Optionally use a filter file to include/exclude files

# === Configuration ===

SOURCE_DIR="/path/to/your/data"
REMOTE_NAME="<config_name>:<bucket_name>"
DRY_RUN=false
DEBUG_MODE=false
VERBOSE=false

# === Logging setup ===

SCRIPT_DIR=$(dirname "$(realpath "$0")")
BASENAME=$(basename "${0%.*}")
CURRENT_TIME=$(date +"%Y%m%d_%H%M%S")
LOG_FILE="${SCRIPT_DIR}/${BASENAME}_${CURRENT_TIME}.log"

# === echo and log function ===

echo_log() {
    local level="$1"
    local message="$2"
    local timestamp
    timestamp=$(date +'%Y-%m-%d %H:%M:%S')

    if [[ "${level^^}" == "LOG" || "${level^^}" == "ERROR" ]]; then
        local formatted="[$timestamp] [${level^^}] $message"
    else
        local formatted="[$timestamp] [UNKNOWN] $message"
    fi

    if [ "${VERBOSE^^}" == "TRUE" ]; then
        if [ "${level^^}" == "ERROR" ]; then
            echo "$formatted" | tee -a "$LOG_FILE" >&2
        else
            echo "$formatted" | tee -a "$LOG_FILE"
        fi
    else
        echo "$formatted" >> "$LOG_FILE"
    fi
}

# === Checks ===

if [ ! -d "$SOURCE_DIR" ]; then
    echo_log error "SOURCE_DIR does not exist: $SOURCE_DIR"
    exit 1
fi

# === Display header ===

echo_log log "$BASENAME"
separator=$(printf '%*s\n' "${#BASENAME}" '' | tr ' ' '-')
echo_log log "$separator"
echo_log log "SOURCE_DIR  = $SOURCE_DIR"
echo_log log "REMOTE_NAME = $REMOTE_NAME"
echo_log log "LOG_FILE    = $LOG_FILE"
echo_log log "DRY_RUN     = $DRY_RUN"
echo_log log "DEBUG_MODE  = $DEBUG_MODE"

# === Check dependencies ===

if ! command -v rclone >/dev/null 2>&1; then
    echo_log error "rclone is not installed."
    exit 1
fi

# === Set rclone options ===

FILTER_FILE="${SCRIPT_DIR}/glacier_backup.filter"
if [ -f "$FILTER_FILE" ]; then
    FILTER_OPTION="--filter-from $FILTER_FILE"
    echo_log log "FILTER_FILE = $FILTER_FILE"
else
    FILTER_OPTION=""
fi

if [ "${DRY_RUN^^}" == "TRUE" ]; then
    DRY_RUN_OPTION="--dry-run"
else
    DRY_RUN_OPTION=""
fi

if [ "${DEBUG_MODE^^}" == "TRUE" ]; then
    LOG_LEVEL="DEBUG"
    DUMP_FILTERS_OPTION="--dump filters"
else
    LOG_LEVEL="INFO"
    DUMP_FILTERS_OPTION=""
fi

# === Run backup ===

echo_log log "Starting sync..."
rclone sync "$SOURCE_DIR" "$REMOTE_NAME" \
    --s3-storage-class DEEP_ARCHIVE \
    --s3-server-side-encryption AES256 \
    --delete-excluded \
    --progress \
    --log-file="$LOG_FILE" \
    --log-level "$LOG_LEVEL" \
    $DUMP_FILTERS_OPTION \
    $FILTER_OPTION \
    $DRY_RUN_OPTION

STATUS=$?

# === Summary ===

if [ "$STATUS" -eq 0 ]; then
    echo_log log "✅ Sync completed successfully."
else
    echo_log log "⚠️ Sync completed with errors (exit code: $STATUS). Check the log file: $LOG_FILE"
fi

You simply need to schedule the execution of the script via crontab, or via the “Task Scheduler” on a Synology NAS.
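
For example, with cron (the path below is a placeholder; adapt it to wherever you store the script):

# Make the script executable and test it once by hand
# (ideally with DRY_RUN=true for the first run)
chmod +x /volume1/scripts/glacier_backup.bash
/volume1/scripts/glacier_backup.bash

# crontab entry: run the backup every day at 02:00
0 2 * * * /volume1/scripts/glacier_backup.bash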

📋 Optional: Use a filter file

You can define a file named glacier_backup.filter in the script folder to include/exclude files and folders, much like a .gitignore file, except that each line must start with + (include) or - (exclude).

Unlike a .gitignore file, rules are read from top to bottom and the first match wins.

Here is an example:

- **/@eaDir/*
- /@eaDir/**
- /#recycle/**
- *.tmp
+ *.jpg
+ /my/data/**
- *

The first two lines exclude the @eaDir directory which can be located in any subdirectory (this is a hidden directory automatically created on Synology NAS).
The third line excludes the #recycle recycle bin directory.
The fourth line excludes files with the tmp extension.
The fifth line includes files with the jpg extension.
The sixth line includes the data directory.
The seventh line is a safety net that excludes any other file or directory that does not match any of the previous filters.
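
To check what the filter actually matches before syncing anything, you can list the local source through the same filter file (or simply set DRY_RUN=true in the script and read the log):

rclone lsf --recursive --filter-from glacier_backup.filter /path/to/your/data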

📌 Conclusion

This approach gives you full control, detailed logging, and saves you from paying for unnecessary S3 Standard storage. It may not be as “plug and play” as Synology’s apps, but it’s more reliable, transparent, and flexible—and in my case, it actually works perfectly.