Backing up my data to AWS S3 Glacier
🛡️ The 3-2-1 backup rule: Your data’s helmet
Are you familiar with the 3-2-1 backup rule?
It’s a CRUCIAL rule in my opinion.
If you don’t follow it, it’s like riding a bike or a scooter without a helmet: as long as nothing has happened to you, you don’t realize the risk.
The 3-2-1 backup rule is:
- 3 copies of your data
➡️ This means you must have three separate copies of your data.
For example, 1 copy on your PC and 2 backup copies.
- 2 types of media
➡️ Copies must be stored on at least two different types of media.
For example, a hard drive and cloud storage (like AWS S3 Glacier).
- 1 remote copy
➡️ At least one of the copies must be stored remotely.
In the cloud, at a friend’s house, etc.
So, ever since I experienced first-hand what it costs to ride without a helmet, I apply this rule mainly to the photos and videos on my NAS, which are irreplaceable memories.
❌ Why I ditched Synology’s Glacier Backup app
Synology provides an app called Glacier Backup, but in my experience:
- It’s obsolete and no longer maintained.
- It crashes randomly.
- Logs are almost non-existent, making debugging impossible.
- It completely stopped working for me a few weeks ago, with no clear reason.
🔄 Why not use Hyper Backup app?
Hyper Backup is a more modern and reliable backup app on Synology. However, it doesn’t support Glacier Deep Archive directly. The only workaround is:
- Back up to a standard S3 bucket.
- Apply a lifecycle rule to transition the data to Glacier Deep Archive.
The problem? Lifecycle transitions are not immediate and can’t be scheduled. Since I back up daily, I ended up paying for S3 Standard storage for almost 24 hours every day, which defeats the cost-saving purpose.
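For reference, here is roughly what such a lifecycle rule looks like when attached with the AWS CLI (the bucket name is a placeholder). Even with Days set to 0, S3 only evaluates lifecycle rules asynchronously, roughly once a day, which is exactly the delay I’m talking about:
aws s3api put-bucket-lifecycle-configuration \
  --bucket <bucket_name> \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "to-deep-archive",
        "Status": "Enabled",
        "Filter": {},
        "Transitions": [
          { "Days": 0, "StorageClass": "DEEP_ARCHIVE" }
        ]
      }
    ]
  }'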
✅ The rclone alternative
rclone is an incredibly powerful command-line tool that supports a huge number of storage backends, including AWS S3 and its Glacier storage classes.
So I built a script that uses rclone to upload my NAS data directly to an AWS S3 bucket configured for Glacier Deep Archive. It gives me full control, direct sync, and reliable logging.
🛠️ How to set it up
Install rclone
Follow the documentation; it’s straightforward. On Linux, for example, it’s as simple as:
curl https://rclone.org/install.sh | sudo bash
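You can then check that the binary is available:
rclone version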
Configure an S3 remote
You can configure your S3 remote with a single-line command.
For example:
rclone config create <config_name> s3 provider AWS access_key_id <my_access_key_id> secret_access_key <my_secret_access_key> region eu-west-1 location_constraint eu-west-1 acl private server_side_encryption AES256 storage_class DEEP_ARCHIVE
Fill in the <config_name>, <my_access_key_id> and <my_secret_access_key> fields.
You can also create an rclone.conf file directly in your home directory:
~/.config/rclone/rclone.conf
Here is an example:
[<config_name>]
type = s3
provider = AWS
access_key_id = <my_access_key_id>
secret_access_key = <my_secret_access_key>
region = eu-west-1
location_constraint = eu-west-1
acl = private
server_side_encryption = AES256
storage_class = DEEP_ARCHIVE
Fill in the <config_name>, <my_access_key_id> and <my_secret_access_key> fields.
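Whichever method you use, you can check that the remote has been registered:
rclone listremotes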
Alternatively, you can run the rclone config command and select the options one by one from an interactive menu.
Here is an example of the options you can select:
- New remote: enter the name of the remote to create.
- Storage: Amazon S3
- Provider: AWS S3
- Credentials: enter the access_key_id and secret_access_key of an IAM user allowed to write to the bucket (a minimal policy sketch follows this list)
- Region: eu-west-1 (Ireland)
- Endpoint: leave empty
- Location constraint: eu-west-1
- ACL: private
- server_side_encryption: AES256
- sse_kms_key_id: none
- Storage Class: DEEP_ARCHIVE
- Advanced config: no
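About the policy mentioned in the Credentials step: here is a minimal IAM policy sketch attached with the AWS CLI (the user name, policy name and bucket name are placeholders; adapt it to your own security requirements):
aws iam put-user-policy \
  --user-name <iam_user> \
  --policy-name glacier-backup \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
        "Resource": "arn:aws:s3:::<bucket_name>"
      },
      {
        "Effect": "Allow",
        "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
        "Resource": "arn:aws:s3:::<bucket_name>/*"
      }
    ]
  }'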
Test
To test your configuration, you can copy a file to the remote S3 bucket:
rclone copy /path/to/my/file <config_name>:<bucket_name>
Fill in the <config_name> and <bucket_name> fields.
And list the remote bucket:
rclone ls <config_name>:<bucket_name>
Fill in the <config_name> and <bucket_name> fields.
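If you also want to confirm that the file really landed in the Deep Archive storage class, and you have the AWS CLI installed, head-object reports the storage class (bucket name and key are placeholders):
aws s3api head-object --bucket <bucket_name> --key <file_name>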
📄 The script
#!/bin/bash
# glacier_backup.bash v1.0
# Author: Julián
# Description:
# - Sync a folder to an S3 Glacier Deep Archive bucket using rclone
# - Optionally use a filter file to include/exclude files
# === Configuration ===
SOURCE_DIR="/path/to/your/data"
REMOTE_NAME="<config_name>:<bucket_name>"
DRY_RUN=false
DEBUG_MODE=false
VERBOSE=false
# === Logging setup ===
SCRIPT_DIR=$(dirname "$(realpath "$0")")
BASENAME=$(basename "${0%.*}")
CURRENT_TIME=$(date +"%Y%m%d_%H%M%S")
LOG_FILE="${SCRIPT_DIR}/${BASENAME}_${CURRENT_TIME}.log"
# === echo and log function ===
echo_log() {
    local level="$1"
    local message="$2"
    local timestamp
    timestamp=$(date +'%Y-%m-%d %H:%M:%S')
    local formatted
    if [[ "${level^^}" == "LOG" || "${level^^}" == "ERROR" ]]; then
        formatted="[$timestamp] [${level^^}] $message"
    else
        formatted="[$timestamp] [UNKNOWN] $message"
    fi
    if [ "${VERBOSE^^}" == "TRUE" ]; then
        if [ "${level^^}" == "ERROR" ]; then
            echo "$formatted" | tee -a "$LOG_FILE" >&2
        else
            echo "$formatted" | tee -a "$LOG_FILE"
        fi
    else
        echo "$formatted" >> "$LOG_FILE"
    fi
}
# === Checks ===
if [ ! -d "$SOURCE_DIR" ]; then
    echo_log error "SOURCE_DIR does not exist: $SOURCE_DIR"
    exit 1
fi
# === Display header ===
echo_log log "$BASENAME"
separator=$(printf '%*s\n' "${#BASENAME}" '' | tr ' ' '-')
echo_log log "$separator"
echo_log log "SOURCE_DIR = $SOURCE_DIR"
echo_log log "REMOTE_NAME = $REMOTE_NAME"
echo_log log "LOG_FILE = $LOG_FILE"
echo_log log "DRY_RUN = $DRY_RUN"
echo_log log "DEBUG_MODE = $DEBUG_MODE"
# === Check dependencies ===
if ! command -v rclone >/dev/null 2>&1; then
    echo_log error "rclone is not installed."
    exit 1
fi
# === Set rclone options ===
FILTER_FILE="${SCRIPT_DIR}/glacier_backup.filter"
if [ -f "$FILTER_FILE" ]; then
    FILTER_OPTION="--filter-from $FILTER_FILE"
    echo_log log "FILTER_FILE = $FILTER_FILE"
else
    FILTER_OPTION=""
fi
if [ "${DRY_RUN^^}" == "TRUE" ]; then
    DRY_RUN_OPTION="--dry-run"
else
    DRY_RUN_OPTION=""
fi
if [ "${DEBUG_MODE^^}" == "TRUE" ]; then
    LOG_LEVEL="DEBUG"
    DUMP_FILTERS_OPTION="--dump filters"
else
    LOG_LEVEL="INFO"
    DUMP_FILTERS_OPTION=""
fi
# === Run backup ===
echo_log log "Starting sync..."
rclone sync "$SOURCE_DIR" "$REMOTE_NAME" \
    --s3-storage-class DEEP_ARCHIVE \
    --s3-server-side-encryption AES256 \
    --delete-excluded \
    --progress \
    --log-file="$LOG_FILE" \
    --log-level "$LOG_LEVEL" \
    $DUMP_FILTERS_OPTION \
    $FILTER_OPTION \
    $DRY_RUN_OPTION
STATUS=$?
# === Summary ===
if [ "$STATUS" -eq 0 ]; then
echo_log log "✅ Sync completed successfully."
else
echo_log log "⚠️ Sync completed with errors (exit code: $STATUS). Check the log file: $LOG_FILE"
fi
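Before scheduling it, you can make the script executable and run it once by hand, ideally with DRY_RUN=true first, to check what would be uploaded and what ends up in the log file (the path is just an example):
chmod +x /path/to/glacier_backup.bash
/path/to/glacier_backup.bash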
You then simply need to schedule the execution of the script via cron or the “Task Scheduler” on a Synology NAS.
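With plain cron, for example, an entry like this would run the backup every night at 2 a.m. (time and path are placeholders):
0 2 * * * /path/to/glacier_backup.bash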
📋 Optional: Use a filter file
You can define a file named glacier_backup.filter in the script folder to include/exclude files and folders, a bit like a .gitignore file, except that each line must start with + (include) or - (exclude).
Also remember that rclone reads the rules from top to bottom and the first match wins.
Here is an example:
- **/@eaDir/*
- /@eaDir/**
- /#recycle/**
- *.tmp
+ *.jpg
+ /my/data/**
- *
The first two lines exclude the @eaDir directory, which can be located in any subdirectory (this is a hidden directory automatically created on Synology NAS).
The third line excludes the #recycle recycle bin directory.
The fourth line excludes files with the tmp extension.
The fifth line includes files with the jpg extension.
The sixth line includes the data directory.
The seventh line is a safety net that excludes any other file or directory that does not match any of the previous filters.
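To preview what a filter file actually selects before any upload, you can list the source directory locally through the same filter (the paths are placeholders):
rclone ls /path/to/your/data --filter-from glacier_backup.filter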
📌 Conclusion
This approach gives you full control, detailed logging, and saves you from paying for unnecessary S3 Standard storage. It may not be as “plug and play” as Synology’s apps, but it’s more reliable, transparent, and flexible—and in my case, it actually works perfectly.