Comprehensive DuckDB database backup toolkit supporting both file-based (cp) and native (ATTACH+COPY) backup approaches. Use it when you need to back up DuckDB databases locally or to remote storage, create daily scheduled backups, verify backup integrity, or manage backup retention policies.
Two backup methods are available; choose based on your needs:
python3 scripts/backup_duckdb.py \
--db data/awsntp.duckdb \
--backup data/backup-20251223.duckdb \
--method cp
python3 scripts/backup_duckdb.py \
--db data/awsntp.duckdb \
--backup data/backup-20251223.duckdb \
--method attach
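The internals of backup_duckdb.py are not shown here, so the following is only a sketch of what the two methods could look like in Python. It assumes the cp method is a plain file copy and the attach method attaches both files to an in-memory connection and runs DuckDB's COPY FROM DATABASE (available in DuckDB 0.10 and later). The function names and the src/dst aliases are hypothetical, and the native path assumes the duckdb Python package is installed.

```python
import shutil
from pathlib import Path
from typing import List


def backup_with_cp(db_path: str, backup_path: str) -> Path:
    """File-based backup: copy the database file byte for byte."""
    target = Path(backup_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves timestamps and permissions, much like `cp -p`.
    shutil.copy2(db_path, target)
    return target


def _quote(path: str) -> str:
    """Quote a path as a SQL string literal (single quotes doubled)."""
    return "'" + path.replace("'", "''") + "'"


def attach_copy_statements(db_path: str, backup_path: str) -> List[str]:
    """Build the SQL for a native backup; the src/dst aliases are illustrative."""
    return [
        f"ATTACH {_quote(db_path)} AS src (READ_ONLY)",
        f"ATTACH {_quote(backup_path)} AS dst",
        "COPY FROM DATABASE src TO dst",
        "DETACH dst",
        "DETACH src",
    ]


def backup_with_attach(db_path: str, backup_path: str) -> None:
    """Native backup: run the statements on an in-memory connection."""
    import duckdb  # assumption: installed separately (pip install duckdb)

    con = duckdb.connect()  # in-memory database, used only as a coordinator
    try:
        for statement in attach_copy_statements(db_path, backup_path):
            con.execute(statement)
    finally:
        con.close()
```

In this sketch the native path expects the backup file not to exist yet, since COPY FROM DATABASE recreates the schema in the target.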
python3 scripts/backup_duckdb.py \
--db data/awsntp.duckdb \
--backup data/backups/ \
--method cp \
--timestamp
# Creates: data/backups/backup-20251223-153045.duckdb
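The timestamped name shown above follows the pattern backup-YYYYMMDD-HHMMSS.duckdb. As a rough illustration of how such a name can be derived, here is a minimal sketch; the function name is hypothetical, and the use of local time is an assumption, since the script's own choice of clock is not documented here.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def timestamped_backup_path(backup_dir: str, now: Optional[datetime] = None) -> Path:
    """Return <backup_dir>/backup-YYYYMMDD-HHMMSS.duckdb for the given time."""
    moment = now or datetime.now()  # assumption: local time, not UTC
    return Path(backup_dir) / f"backup-{moment:%Y%m%d-%H%M%S}.duckdb"


example = timestamped_backup_path("data/backups/", datetime(2025, 12, 23, 15, 30, 45))
print(example.as_posix())  # data/backups/backup-20251223-153045.duckdb
```

Because the fields run from year down to second, sorting these filenames alphabetically also sorts them chronologically.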
python3 scripts/backup_duckdb.py --db <source> --backup <target> [options]
Arguments:
- --db (required): Path to the source DuckDB database
- --backup (required): Backup target (a file path, or a directory when combined with --timestamp)
- --method (optional): Backup method, cp (default) or attach
- --timestamp (optional): Add a YYYYMMDD-HHMMSS timestamp to the filename

Exit codes:
- 0: Backup successful
- 1: Backup failed (check logs)

python3 scripts/backup_duckdb.py \
--db ~/LocalRepos/awsntpdagster/data/awsntp.duckdb \
--backup ~/LocalRepos/awsntpdagster/data/backup-20251223.duckdb \
--method cp
python3 scripts/backup_duckdb.py \
--db ~/LocalRepos/awsntpdagster/data/awsntp.duckdb \
--backup ~/LocalRepos/awsntpdagster/data/backups \
--method cp \
--timestamp
Add to crontab:
0 2 * * * python3 /path/to/backup_duckdb.py --db /path/to/awsntp.duckdb --backup /path/to/backups/ --method cp --timestamp >> /var/log/duckdb_backup.log 2>&1
#!/bin/bash
# Backup and keep only 7 most recent
BACKUP_DIR="/path/to/backups"
python3 scripts/backup_duckdb.py \
--db data/awsntp.duckdb \
--backup "$BACKUP_DIR" \
--method cp \
--timestamp
# Keep only 7 most recent backups
ls -t "$BACKUP_DIR"/backup-*.duckdb | tail -n +8 | xargs rm -f
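The same retention rule can be expressed in Python, which avoids parsing ls output and copes with an empty directory. This is a sketch under assumptions: the function name and the keep parameter are hypothetical, and "most recent" is taken to mean newest modification time, mirroring `ls -t`.

```python
from pathlib import Path
from typing import List


def prune_backups(backup_dir: str, keep: int = 7) -> List[Path]:
    """Delete all but the `keep` newest backup-*.duckdb files; return removed paths."""
    if keep < 0:
        raise ValueError("keep must be zero or greater")
    backups = sorted(
        Path(backup_dir).glob("backup-*.duckdb"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first, like `ls -t`
    )
    removed = backups[keep:]  # everything after the newest `keep` entries
    for path in removed:
        path.unlink()
    return removed
```

Calling `prune_backups("/path/to/backups")` after a successful backup keeps the seven newest files and returns the paths it deleted, which can be logged.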
For a 975 MB DuckDB database (awsntp.duckdb):
| Scenario | Method | Approx. duration | Best For |
|---|---|---|---|
| Daily backup before asset materialization | cp | ~1-2s | Production safety |
| Weekly archive to external drive | cp | ~1-2s | Local storage |
| Cloud backup to S3 | attach | ~30-60s | Remote storage |
| Backup during active queries | attach | N/A | Concurrent access |
# Check backup exists and is readable
duckdb data/backup-20251223-153045.duckdb \
"SELECT COUNT(*) as table_count FROM information_schema.tables;"
ls -lh data/awsntp.duckdb data/backup-*.duckdb | awk '{print $5, $9}'
ls -lt data/backups/backup-*.duckdb | head -10
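For cp backups, the size comparison can be tightened into a byte-level check. The sketch below compares file sizes and then SHA-256 digests; the checksum step is an addition for illustration, not something the toolkit is documented to do, and the function names are hypothetical. It only makes sense for cp backups of a database that has not been written to since the copy, because a backup made with the attach method is rebuilt and need not match the source byte for byte.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large databases are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_cp_backup(db_path: str, backup_path: str) -> bool:
    """True when the backup has the same size and SHA-256 digest as the source."""
    # Cheap check first: differing sizes can never be identical copies.
    if Path(db_path).stat().st_size != Path(backup_path).stat().st_size:
        return False
    return sha256_of(db_path) == sha256_of(backup_path)
```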
Issue: "duckdb: command not found"
Issue: "database is locked"
Issue: "out of disk space"
df -h
Issue: Slow backup
# Backup before running expensive assets
python3 scripts/backup_duckdb.py \
--db data/awsntp.duckdb \
--backup data/pre-materialization-backup.duckdb \
--method cp
# Then run your asset materialization
dagster asset materialize -m awsntpdagster.definitions --select awsntp_features_merged_v6
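Since the script reports failure through its exit code, the two steps above can be chained so that materialization only starts after a successful backup. The following is a minimal sketch of that idea; the helper name run_guarded and the two command constants are hypothetical, with the commands copied from the example above.

```python
import subprocess
import sys
from typing import Sequence

# Commands taken from the example above; adjust paths to your checkout.
BACKUP_CMD = [
    "python3", "scripts/backup_duckdb.py",
    "--db", "data/awsntp.duckdb",
    "--backup", "data/pre-materialization-backup.duckdb",
    "--method", "cp",
]
MATERIALIZE_CMD = [
    "dagster", "asset", "materialize",
    "-m", "awsntpdagster.definitions",
    "--select", "awsntp_features_merged_v6",
]


def run_guarded(backup_cmd: Sequence[str], next_cmd: Sequence[str]) -> int:
    """Run next_cmd only if backup_cmd exits with 0; return the final exit code."""
    backup = subprocess.run(backup_cmd)
    if backup.returncode != 0:
        # Non-zero means the backup failed, so the next step is skipped.
        print(
            f"backup failed with exit code {backup.returncode}; skipping next step",
            file=sys.stderr,
        )
        return backup.returncode
    return subprocess.run(next_cmd).returncode
```

A wrapper script would end with `sys.exit(run_guarded(BACKUP_CMD, MATERIALIZE_CMD))`, so that a scheduler sees the same exit code as the step that failed.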
For your awsntpdagster project:
macOS/Linux crontab:
0 2 * * * cd /Users/zhaoliang/LocalRepos/awsntpdagster && python3 src/awsntpdagster/scripts/backup_duckdb.py --db data/awsntp.duckdb --backup data/backups/ --method cp --timestamp
For detailed backup strategy guide, scheduling patterns, and retention policies, see: