How to Set Up Automated Daily PostgreSQL Backups to Amazon S3 on a Linux VM
Introduction
There are two types of developers: those who perform regular database backups, and those who haven't lost data yet. Database durability is the cornerstone of any production-grade application. While modern cloud environments are highly reliable, unexpected VM failures, corrupted filesystems, or accidental human errors can wipe out months of hard work in seconds.
If you are running PostgreSQL on a self-managed Linux virtual machine, establishing an automated offsite backup strategy is essential. In this guide, we will walk you through setting up a secure, automated daily backup system that dumps your PostgreSQL database, compresses it, and uploads it directly to an Amazon S3 bucket.
Prerequisites
Before we begin, ensure you have the following ready:
- A Linux VM (Ubuntu/Debian or RHEL-based) running your PostgreSQL database.
- Non-root user access with sudo privileges.
- An AWS Account with an S3 bucket created for storing backups.
- An AWS IAM User with programmatic access (Access Key ID and Secret Access Key) and write permissions to your specific S3 bucket.
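For the last prerequisite, a least-privilege policy might look like the sketch below. The bucket name your-s3-bucket-name and the db-backups/ prefix mirror the placeholders used later in this guide, and the file name backup-policy.json is an arbitrary choice. It grants only s3:PutObject under that prefix, which is what the upload in this guide needs; listing or downloading backups would require additional actions.

```shell
# Write a minimal IAM policy document to a local file (the file name is an assumption).
# It allows uploads only under the db-backups/ prefix of the placeholder bucket.
POLICY_FILE="backup-policy.json"

cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBackupUploads",
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::your-s3-bucket-name/db-backups/*"
    }
  ]
}
EOF

echo "Wrote $POLICY_FILE"
```

The contents of the file can be pasted into the IAM console as an inline policy for the backup user.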
Step 1: Install and Configure the AWS CLI
First, we need a way for our VM to securely talk to Amazon S3. We will use the AWS Command Line Interface (CLI) for this.
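On Ubuntu/Debian, update your package index and install the AWS CLI (on RHEL-based systems, or where the awscli package is not available, use your distribution's package manager or the official AWS CLI installer instead):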
sudo apt-get update
sudo apt-get install awscli -y
Once installed, configure the AWS CLI with the credentials of your IAM user:
aws configure
You will be prompted to enter:
- AWS Access Key ID
- AWS Secret Access Key
- Default region name (e.g., us-east-1)
- Default output format (press Enter to leave as json)
Test the connection by listing your S3 buckets:
aws s3 ls
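If your configuration is correct, this will list your S3 buckets without errors. Note that listing all buckets requires the s3:ListAllMyBuckets permission; if your IAM user can only write to the backup bucket, an AccessDenied error here does not necessarily mean the credentials are wrong.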
Step 2: Create the PostgreSQL Backup Script
Now, let's write a bash script that automates the backup process. This script will:
- Generate a PostgreSQL dump using pg_dump.
- Compress the dump file using gzip to save bandwidth and storage costs.
- Upload the compressed archive to your S3 bucket with a timestamped filename.
- Clean up the temporary local file to prevent your VM's disk from filling up.
Create a directory for your scripts and open a new file:
mkdir -p ~/scripts
nano ~/scripts/pg_backup_to_s3.sh
Paste the following script, making sure to replace the placeholder values with your actual database and S3 configurations:
#!/bin/bash
# Exit on any error, on unset variables, and when any command in a pipeline fails
set -euo pipefail
# --- CONFIGURATION ---
DB_NAME="your_database_name"
DB_USER="postgres"
S3_BUCKET="s3://your-s3-bucket-name/db-backups"
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
BACKUP_DIR="/tmp"
BACKUP_FILENAME="${DB_NAME}_backup_${TIMESTAMP}.sql.gz"
LOCAL_BACKUP_PATH="${BACKUP_DIR}/${BACKUP_FILENAME}"
# ---------------------
echo "[$(date)] Starting PostgreSQL backup..."
# Run pg_dump, compress it, and save locally
pg_dump -U "$DB_USER" -h localhost "$DB_NAME" | gzip > "$LOCAL_BACKUP_PATH"
echo "[$(date)] Backup created locally at ${LOCAL_BACKUP_PATH}"
# Upload to S3
echo "[$(date)] Uploading backup to S3..."
aws s3 cp "$LOCAL_BACKUP_PATH" "${S3_BUCKET}/${BACKUP_FILENAME}"
# Clean up the local temporary file
echo "[$(date)] Cleaning up local backup file..."
rm -f "$LOCAL_BACKUP_PATH"
echo "[$(date)] Backup process completed successfully!"
Save and close the file (Ctrl+O, Enter, Ctrl+X).
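One detail worth understanding about the dump step: pg_dump piped into gzip is a pipeline, and bash reports a pipeline's exit status as that of its last command unless pipefail is enabled. With set -e alone, a failing pg_dump would go unnoticed as long as gzip succeeds, and an empty archive could be uploaded. The sketch below demonstrates the difference using false as a stand-in for a failing pg_dump; nothing here touches PostgreSQL.

```shell
# Exit status of a pipeline whose first command fails, without pipefail.
# 'false' stands in for a failing pg_dump (an assumption for illustration).
status_default=$(bash -c 'false | gzip > /dev/null; echo $?')

# Same pipeline with pipefail enabled: the failure of 'false' is reported.
status_pipefail=$(bash -c 'set -o pipefail; false | gzip > /dev/null; echo $?')

echo "without pipefail: $status_default"
echo "with pipefail:    $status_pipefail"
```

If your copy of the script uses only set -e, adding set -o pipefail next to it makes the script stop before the upload when the dump fails.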
Step 3: Secure Your Script and Database Password
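By default, pg_dump may prompt you for a password. Since this script will run unattended, we need to supply the password without a prompt.
We can use PostgreSQL’s .pgpass file, which lives in the home directory of the user executing the script. The password is stored there in plain text, so the file is protected only by its permissions; PostgreSQL ignores the file unless access is restricted to its owner.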
Create the file in your home directory:
touch ~/.pgpass
chmod 600 ~/.pgpass
Open .pgpass and add your database credentials in the following format:
hostname:port:database:username:password
For example:
localhost:5432:your_database_name:postgres:your_secure_password
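Because libpq ignores a password file whose permissions allow group or world access, a wrong mode shows up as an unexpected password prompt. A small check, assuming GNU stat (standard on Linux); the function name pgpass_mode_ok is made up for this example:

```shell
# Return 0 if the file exists and has mode 600 or 400 (no group/world access),
# otherwise return 1.
# Assumes GNU stat; 'pgpass_mode_ok' is a hypothetical helper name.
pgpass_mode_ok() {
    file="$1"
    [ -f "$file" ] || return 1
    mode=$(stat -c '%a' "$file") || return 1
    case "$mode" in
        600|400) return 0 ;;
        *)       return 1 ;;
    esac
}

if pgpass_mode_ok "$HOME/.pgpass"; then
    echo ".pgpass permissions look fine"
else
    echo ".pgpass is missing or too permissive; run: chmod 600 ~/.pgpass"
fi
```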
Next, make your backup script executable and restrict its read/write permissions so other users on the VM cannot view your configuration details:
chmod 700 ~/scripts/pg_backup_to_s3.sh
Test the script manually to ensure everything is working as expected:
~/scripts/pg_backup_to_s3.sh
Check your S3 console; you should see your newly uploaded .sql.gz file!
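Seeing the object in the console confirms the upload, not that the archive is intact. Below is a minimal sketch of an integrity check for a downloaded copy, using gzip -t. The function name backup_looks_valid is hypothetical, and the check only verifies the gzip stream, not that the SQL restores cleanly.

```shell
# Return 0 if the file is non-empty and passes gzip's integrity test.
# 'backup_looks_valid' is a hypothetical helper name for this sketch.
backup_looks_valid() {
    file="$1"
    [ -s "$file" ] || return 1            # must exist and be non-empty
    gzip -t < "$file" 2>/dev/null         # exit status 0 if the stream is intact
}

# Self-contained demo with a throwaway file instead of a real backup.
demo=$(mktemp)
echo "SELECT 1;" | gzip > "$demo"
if backup_looks_valid "$demo"; then
    echo "archive OK"
else
    echo "archive damaged"
fi
rm -f "$demo"
```

To check a real backup, download it first with aws s3 cp from your bucket and pass the local path to the function. A periodic test restore into a scratch database remains the only full verification.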
Step 4: Automate the Process with Cron
To run this backup automatically every day, we will use cron, the built-in Linux job scheduler.
Open the crontab editor for your current user:
crontab -e
Add the following line at the bottom of the file to run the backup script every day at 2:00 AM:
0 2 * * * /bin/bash /home/your-username/scripts/pg_backup_to_s3.sh >> /home/your-username/scripts/backup.log 2>&1
Note: Replace your-username with your actual Linux system username. The entry also appends both standard output and error messages to a backup.log file, so you can debug any failures.
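Cron runs jobs with a minimal environment, so a script that works in your interactive shell can fail under cron because aws or pg_dump is not on the PATH. Here is a sketch of a guard you could place near the top of the backup script. The function name require_cmds and the extra PATH directory are assumptions to adapt to your system.

```shell
# Cron typically provides a short PATH; append a common install location.
# /usr/local/bin is an assumption; adjust to where your tools actually live.
PATH="$PATH:/usr/local/bin"
export PATH

# Return 0 if every named command is found on PATH; otherwise print the
# missing ones to stderr and return 1. 'require_cmds' is a hypothetical helper.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# In the backup script this would be: require_cmds pg_dump gzip aws || exit 1
if require_cmds gzip; then
    echo "all required commands found"
fi
```

With this guard in place, a missing tool produces a clear line in backup.log instead of a less obvious failure halfway through the run.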
Step 5: Implement an S3 Lifecycle Policy (Recommended)
Uploading daily backups to S3 gives you durable offsite copies, but over time, storage costs will accumulate. To avoid paying for years of obsolete daily backups, we highly recommend setting up an S3 Lifecycle Policy directly in your AWS Console:
- Go to your S3 bucket and click the Management tab.
- Under Lifecycle rules, click Create lifecycle rule.
- Name your rule (e.g., DeleteOldBackups).
- Apply to all objects in the bucket (or restrict by prefix, such as db-backups/).
- Under Lifecycle rule actions, select Expire current versions of objects.
- Specify the retention period (e.g., 30 days) and save.
This automatically removes backups older than 30 days, keeping your storage footprint predictable and cost-effective.
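If you prefer to keep the rule in a file rather than click through the console, the same settings can be expressed as a lifecycle configuration document. The sketch below writes one using the rule name, prefix, and 30-day period from the examples above. The file name lifecycle.json is an assumption, and the command that applies it is left commented out because it needs real credentials and a real bucket.

```shell
# Write a lifecycle configuration equivalent to the console steps above.
# The rule ID, prefix, and 30-day period mirror the examples in this guide;
# the file name lifecycle.json is an assumption.
LIFECYCLE_FILE="lifecycle.json"

cat > "$LIFECYCLE_FILE" <<'EOF'
{
  "Rules": [
    {
      "ID": "DeleteOldBackups",
      "Status": "Enabled",
      "Filter": { "Prefix": "db-backups/" },
      "Expiration": { "Days": 30 }
    }
  ]
}
EOF

echo "Wrote $LIFECYCLE_FILE"

# To apply it (requires s3:PutLifecycleConfiguration on the bucket):
# aws s3api put-bucket-lifecycle-configuration \
#     --bucket your-s3-bucket-name \
#     --lifecycle-configuration "file://$LIFECYCLE_FILE"
```

Be aware that put-bucket-lifecycle-configuration replaces the bucket's existing lifecycle configuration, so include any rules you want to keep in the same file.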
Conclusion
Congratulations! You have set up an automated daily database backup pipeline. Your PostgreSQL data now has an offsite copy that protects it against infrastructure failures, human errors, and local VM corruption. Test a restore from time to time to confirm the backups are usable.
Managing your own infrastructure can be rewarding, but configuring and monitoring backups, firewalls, and updates takes valuable time away from building your core product. If you're looking for a hassle-free hosting experience, consider trying Depnix. Depnix lets you deploy applications on virtual machines with automated backups, easy firewall configuration, and developer-friendly tooling built right into your dashboard.
