Compare commits

78 Commits

Author SHA1 Message Date
3wordchant 2a7e564a24 Switch ENTRYPOINT to try to resolve loop on start
continuous-integration/drone/push Build is passing
2024-10-01 22:43:12 -04:00
3wordchant 5f381f395d Update requirements 2024-10-01 22:43:12 -04:00
3wordchant e0ee16426b Make entrypoint executable 2024-10-01 22:43:12 -04:00
3wordchant 92845c4142 Add --break-system-packages, surely we don't need a virtualenv 2024-10-01 22:43:12 -04:00
3wordchant d0d0f29c79 Move entrypoint script into Docker image 2024-10-01 22:43:12 -04:00
3wordchant 88168de90e Move /entrypoint.sh to Dockerfile 2024-10-01 22:43:12 -04:00
3wordchant 71c88d0428 Remove redundant stuff from entrypoint 2024-10-01 22:43:12 -04:00
3wordchant 15b2d656bb Whoops, wrong image 2024-10-01 22:43:12 -04:00
3wordchant 46522a2e9a Switch to backup-bot-two image 2024-10-01 22:43:12 -04:00
3wordchant 30e88a972a Whoops skip shellcheck 2024-10-01 22:43:12 -04:00
3wordchant 64e09a6472 Reinstate Docker image 2024-10-01 22:43:12 -04:00
3wordchant 84d606fa80 Add CHANGELOG.md
[ci skip]
2024-04-09 22:51:09 -03:00
moritz 7865907811 fix push notification precendence race condition
continuous-integration/drone/push Build is failing
2024-03-08 15:42:00 +01:00
moritz dc66c02e23 make run_cron cmd independent from push_success_notifiaction
continuous-integration/drone/push Build is failing
2024-02-13 11:53:27 +01:00
moritz f730c70bfe feat: add retry option
continuous-integration/drone/push Build is failing
2024-01-18 18:01:30 +01:00
moritz faa7ae3dd1 fix Readme
continuous-integration/drone/push Build is failing
2024-01-17 20:36:06 +01:00
moritz 79eeec428a Push Notifications #24
continuous-integration/drone/push Build is failing
2024-01-16 19:40:31 +01:00
moritz 4164760dc6 Sepcify secret and volume donwload via env, fixes #44
continuous-integration/drone/push Build is failing
2024-01-11 18:46:58 +01:00
moritz e644679b8b Clearer service name in warning message. Fixes #46
continuous-integration/drone/push Build is failing
2024-01-11 18:39:26 +01:00
moritz 0c587ac926 add spaces for missing snapshot, fixes #45
continuous-integration/drone/push Build is failing
2024-01-11 18:34:58 +01:00
moritz 65686cd891 Fix python package install error
continuous-integration/drone/push Build is failing
2023-12-19 01:16:12 +01:00
moritz ac055c932e fix: remove bash/sh wrapping
continuous-integration/drone/push Build is failing
2023-12-13 18:27:12 +01:00
moritz 64328c79b1 make --noninteractive a flag
continuous-integration/drone/push Build is failing
2023-12-12 13:39:26 +01:00
moritz 15275b2571 structured json logging with -m flag
continuous-integration/drone/push Build is failing
2023-11-23 20:16:15 +01:00
moritz 4befebba38 Merge pull request 'fix removing quotes' (#40) from p4u1/backup-bot-two:fix-quotes into main
continuous-integration/drone/push Build is failing
Reviewed-on: #40
2023-11-11 08:15:12 +00:00
p4u1 d2087a441e fix removing quotes
continuous-integration/drone/pr Build is failing
2023-11-11 08:55:12 +01:00
moritz f4d96b0875 update README
continuous-integration/drone/push Build is failing
2023-11-10 20:04:05 +01:00
moritz c73bbe8c0d Always backup all apps to increase restic performance
continuous-integration/drone/push Build is failing
2023-11-09 10:30:19 +01:00
renovate-bot ff2b5a25a2 chore(deps): update docker docker tag to v24.0.7
continuous-integration/drone/pr Build is failing
continuous-integration/drone/push Build is failing
2023-10-27 07:01:33 +00:00
moritz e186813a49 better error handling
continuous-integration/drone/push Build is failing
2023-10-25 13:37:06 +02:00
moritz 37cb51674f update README
continuous-integration/drone/push Build is failing
2023-10-24 21:14:39 +02:00
moritz 2ea59b4230 breaking change: rename env RESTIC_REPO to RESTIC_REPOSITORY
continuous-integration/drone/push Build is failing
2023-10-24 21:03:44 +02:00
moritz 354f964e7d fix(create): hande non existing secret files
continuous-integration/drone/push Build is failing
2023-10-20 00:17:44 +02:00
moritz 2bb27aadc4 fix: handle not running container
continuous-integration/drone/push Build is failing
2023-10-19 23:15:24 +02:00
moritz 66e1c9617d fix(download): dump volumes and secrets per default into /tmp/backup.tar.gz
continuous-integration/drone/push Build is failing
2023-10-18 14:10:58 +02:00
moritz 79d19e7ac5 chore: formatting
continuous-integration/drone/push Build is failing
2023-10-12 12:50:10 +02:00
moritz 359140781e fix(create): quote handling for bash pipefail wrapping of pre/post hooks
continuous-integration/drone/push Build is failing
2023-10-12 10:57:52 +02:00
moritz 8750ec1813 fix(snapshots): warn if no snapshots could be found
continuous-integration/drone/push Build is failing
2023-10-12 10:35:37 +02:00
moritz 8e76ad591e remove copy pasta line
continuous-integration/drone/push Build is failing
2023-10-11 18:16:58 +02:00
moritz a3faa5d51f fix(ls): catch error if there is no snapshot
continuous-integration/drone/push Build is failing
2023-10-11 18:13:27 +02:00
moritz a3f27fa6ba log before container command
continuous-integration/drone/push Build is failing
2023-10-11 17:46:27 +02:00
moritz fe5d846c5f Revert "Revert "Revert "feat: add backupbot label"""
continuous-integration/drone/push Build is failing
This reverts commit 79b7a01dda.
2023-10-11 14:39:18 +02:00
moritz 79b7a01dda Revert "Revert "feat: add backupbot label""
continuous-integration/drone/push Build is failing
This reverts commit f8a8547b70.
2023-10-10 19:39:57 +02:00
decentral1se f8a8547b70 Revert "feat: add backupbot label"
continuous-integration/drone/push Build is failing
This reverts commit 4c2304a962.
2023-10-10 08:20:08 +02:00
decentral1se 192b1f1d9c Merge pull request 'feat: add backupbot label' (#33) from enable-label into main
continuous-integration/drone/push Build is failing
Reviewed-on: #33
2023-10-10 05:56:16 +00:00
decentral1se 4c2304a962 feat: add backupbot label
continuous-integration/drone/pr Build is failing
2023-10-10 07:53:26 +02:00
moritz 69e7f07978 Merge pull request 'Backupbot Revolution' (#23) from backupbot_revolution into main
continuous-integration/drone/push Build is failing
Reviewed-on: #23
2023-10-09 10:54:22 +00:00
moritz d25688f312 add backupbot label
continuous-integration/drone/pr Build is failing
2023-10-09 12:53:28 +02:00
moritz b3cbb8bb46 rm unused compose.https.yml as its replaced with compose.secret.yml
continuous-integration/drone/pr Build is failing
2023-10-04 19:11:42 +02:00
moritz bb1237f9ad fix secret name
continuous-integration/drone/pr Build is failing
2023-10-04 19:08:47 +02:00
moritz 972a2c2314 extend download to download the secrets or all app volumes at once 2023-10-04 19:08:47 +02:00
moritz 4240318d20 remove package versions, to avoid conflicts 2023-10-04 19:08:47 +02:00
moritz c3f3d1a6fe restic_repo as secret option #31 2023-10-04 19:08:39 +02:00
moritz ab6c06d423 Prompt before restore 2023-10-04 19:07:57 +02:00
moritz 9398e0d83d release note for migration 2023-10-04 19:07:57 +02:00
moritz 6fc62b5516 fix typo 2023-10-04 19:07:57 +02:00
moritz 1f06af95eb fix error messages 2023-10-04 19:07:57 +02:00
moritz 15a552ef8b formatting 2023-10-04 19:07:57 +02:00
moritz 5d4def6143 feat: Backup Secrets (copy secrets) #28 2023-10-04 19:07:57 +02:00
moritz ebc0ea5d84 small fixes 2023-10-04 19:07:57 +02:00
moritz 488c59f667 Revert "feat: Backup Secrets #28"
This reverts commit 2838a36d43.
2023-10-04 19:07:57 +02:00
moritz 825565451a feat: Backup Secrets #28 2023-10-04 19:07:57 +02:00
moritz 6fa9440c76 fix restic version, timeout and cron default timer 2023-10-04 19:07:57 +02:00
moritz 33ce3c58aa fix entrypoint 2023-10-04 19:07:57 +02:00
moritz 06ad03c1d5 specify program versions to prevent future breakage 2023-10-04 19:07:57 +02:00
moritz bd8398e7dd add healthcheck 2023-10-04 19:07:57 +02:00
moritz 75a93c5456 add sftp storage 2023-10-04 19:07:57 +02:00
moritz d32337cf3a update README 2023-10-04 19:07:57 +02:00
moritz 61ffb67686 update README 2023-10-04 19:07:57 +02:00
moritz a86ac15363 README 2023-10-04 19:07:57 +02:00
moritz 5fa8f821c1 choos specific restore target 2023-10-04 19:07:57 +02:00
moritz 203719c224 change repo per option 2023-10-04 19:07:57 +02:00
moritz 3009159c82 use latest snapshot as default 2023-10-04 19:07:57 +02:00
moritz 28334a4241 mount volumes read/write to restore backups 2023-10-04 19:07:57 +02:00
moritz 447a808849 initial rewrite 2023-10-04 19:07:57 +02:00
philippr 42ae6a6b9b remove unused traefik labels 2023-10-04 19:07:57 +02:00
philippr 3261d67dca mount volume ro 2023-10-04 19:07:38 +02:00
philippr 6355f3572f Backup volumes from host instead of copying paths
* Backupbot will now copy all volumes from a service with
  backupbot.enabled = 'true' label from the /var/lib/docker/volumes/
  path directly. This reduces the resource overhead of copying
  stuff from one volume to another.
  Recipes need to be adjustet that db-dumps are saved into a volume
  now!
* Remove the Dockerfile and move stuff into a entrypoint. This
  simplifies the whole versioning thing and makes this "just"
  a recipe

Co-authored-by: Moritz < moritz.m@local-it.org>
2023-10-04 19:07:16 +02:00
16 changed files with 640 additions and 248 deletions
+5 -16
@@ -2,27 +2,16 @@
kind: pipeline
name: linters
steps:
- name: run shellcheck
image: koalaman/shellcheck-alpine
commands:
- shellcheck backup.sh
- name: publish image
image: plugins/docker
settings:
auto_tag: true
username: thecoopcloud
username: 3wordchant
password:
from_secret: thecoopcloud_password
repo: thecoopcloud/backup-bot-two
tags: latest
depends_on:
- run shellcheck
from_secret: git_coopcloud_tech_token_3wc
repo: git.coopcloud.tech/coop-cloud/backup-bot-two
tags: 2.0.0
registry: git.coopcloud.tech
when:
event:
exclude:
- pull_request
trigger:
branch:
- main
+13 -8
@@ -4,11 +4,14 @@ SECRET_RESTIC_PASSWORD_VERSION=v1
COMPOSE_FILE=compose.yml
SERVER_NAME=example.com
RESTIC_HOST=minio.example.com
RESTIC_REPOSITORY=/backups/restic
CRON_SCHEDULE='*/5 * * * *'
REMOVE_BACKUP_VOLUME_AFTER_UPLOAD=1
CRON_SCHEDULE='30 3 * * *'
# Push Notifications
#PUSH_URL_START=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=start
#PUSH_URL_SUCCESS=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=OK
#PUSH_URL_FAIL=https://status.example.com/api/push/xxxxxxxxxx?status=down&msg=fail
# swarm-cronjob, instead of built-in cron
#COMPOSE_FILE="$COMPOSE_FILE:compose.swarm-cronjob.yml"
@@ -23,7 +26,9 @@ REMOVE_BACKUP_VOLUME_AFTER_UPLOAD=1
#AWS_ACCESS_KEY_ID=something-secret
#COMPOSE_FILE="$COMPOSE_FILE:compose.s3.yml"
# HTTPS storage
#SECRET_HTTPS_PASSWORD_VERSION=v1
#COMPOSE_FILE="$COMPOSE_FILE:compose.https.yml"
#RESTIC_USER=<somebody>
# Secret restic repository
# Use a secret to store the RESTIC_REPOSITORY if the repository location contains a secret value,
# e.g. rest:https://user:SECRET_PASSWORD@host:8000/
# The secret overrides the RESTIC_REPOSITORY variable.
#SECRET_RESTIC_REPO_VERSION=v1
#COMPOSE_FILE="$COMPOSE_FILE:compose.secret.yml"
+6
@@ -0,0 +1,6 @@
# Change log
## 2.0.0 (unreleased)
- Rewrite from Bash to Python
- Add support for push notifications (#24)
+8 -10
@@ -1,13 +1,11 @@
FROM docker:24.0.6-dind
FROM docker:24.0.7-dind
RUN apk add --upgrade --no-cache \
bash \
curl \
jq \
restic
RUN apk add --upgrade --no-cache restic bash python3 py3-pip py3-click py3-docker-py py3-json-logger curl
COPY backup.sh /usr/bin/backup.sh
COPY setup-cron.sh /usr/bin/setup-cron.sh
RUN chmod +x /usr/bin/backup.sh /usr/bin/setup-cron.sh
# Todo use requirements file with specific versions
RUN pip install --break-system-packages resticpy==1.0.2
ENTRYPOINT [ "/usr/bin/setup-cron.sh" ]
COPY backupbot.py /usr/bin/backup
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT /entrypoint.sh
+158 -32
@@ -4,7 +4,21 @@
_This Time, It's Easily Configurable_
Automatically take backups from running Docker Swarm services into a volume.
Automatically takes backups of all volumes of running Docker Swarm services and runs pre- and post-hook commands.
<!-- metadata -->
* **Category**: Utilities
* **Status**: 0, work-in-progress
* **Image**: [`thecoopcloud/backup-bot-two`](https://hub.docker.com/r/thecoopcloud/backup-bot-two), 4, upstream
* **Healthcheck**: No
* **Backups**: N/A
* **Email**: N/A
* **Tests**: No
* **SSO**: N/A
<!-- endmetadata -->
## Background
@@ -20,28 +34,149 @@ Backupbot II tries to help, by
### With Co-op Cloud
1. Set up Docker Swarm and [`abra`][abra]
2. `abra app new backup-bot-two`
3. `abra app config <your-app-name>`, and set storage options. Either configure `CRON_SCHEDULE`, or set up `swarm-cronjob`
4. `abra app secret generate <your-app-name> restic-password v1`, optionally with `--pass` before `<your-app-name>` to save the generated secret in `pass`.
5. `abra app secret insert <your-app-name> ssh-key v1 ...` or similar, to load required secrets.
4. `abra app deploy <your-app-name>`
<!-- metadata -->
* **Category**: Utilities
* **Status**: 0, work-in-progress
* **Image**: [`thecoopcloud/backup-bot-two`](https://hub.docker.com/r/thecoopcloud/backup-bot-two), 4, upstream
* **Healthcheck**: No
* **Backups**: N/A
* **Email**: N/A
* **Tests**: No
* **SSO**: N/A
<!-- endmetadata -->
* `abra app new backup-bot-two`
* `abra app config <app-name>`
- set storage options. Either configure `CRON_SCHEDULE`, or set up `swarm-cronjob`
* `abra app secret generate -a <backupbot_name>`
* `abra app deploy <app-name>`
## Configuration
By default, Backupbot stores backups locally in the repository `/backups/restic`, which is accessible as a volume at `/var/lib/docker/volumes/<backupbot_name>_backups/_data/restic/`
The backup location can be changed using the `RESTIC_REPOSITORY` environment variable.
### S3 Storage
To use S3 storage as the backup location, set the following environment variables:
```
RESTIC_REPOSITORY=s3:<S3-SERVICE-URL>/<BUCKET-NAME>
SECRET_AWS_SECRET_ACCESS_KEY_VERSION=v1
AWS_ACCESS_KEY_ID=<MY_ACCESS_KEY>
COMPOSE_FILE="$COMPOSE_FILE:compose.s3.yml"
```
Then add your `<SECRET_ACCESS_KEY>` as a Docker secret:
`abra app secret insert <backupbot_name> aws_secret_access_key v1 <SECRET_ACCESS_KEY>`
See [restic s3 docs](https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html#amazon-s3) for more information.
### SFTP Storage
> With SFTP it is not possible to prevent Backupbot from deleting backups if the machine is compromised. We therefore recommend using S3, REST, or an rclone server without delete permissions.
To use SFTP storage as the backup location, set the following environment variables:
```
RESTIC_REPOSITORY=sftp:user@host:/restic-repo-path
SECRET_SSH_KEY_VERSION=v1
SSH_HOST_KEY="hostname ssh-rsa AAAAB3...
COMPOSE_FILE="$COMPOSE_FILE:compose.ssh.yml"
```
To get the `SSH_HOST_KEY`, run `ssh-keyscan <hostname>`.
Generate an SSH key pair: `ssh-keygen -t ed25519 -f backupkey -P ''`
Add the public key to `authorized_keys` on the remote host:
`ssh-copy-id -i backupkey <user>@<hostname>`
Add your `SSH_KEY` as a Docker secret:
```
abra app secret insert <backupbot_name> ssh_key v1 """$(cat backupkey)
"""
```
> Attention: Run this command exactly as shown above, because it adds a trailing newline to the secret. If the newline is missing, you will get the following error: `Load key "/run/secrets/ssh_key": error in libcrypto`
### Restic REST server Storage
You can simply set the `RESTIC_REPOSITORY` variable to your REST server URL, e.g. `rest:http://host:8000/`.
If you access the REST server with a password, as in `rest:https://user:pass@host:8000/`, you should store the whole URL, including the password, in a secret.
Uncomment these lines:
```
SECRET_RESTIC_REPO_VERSION=v1
COMPOSE_FILE="$COMPOSE_FILE:compose.secret.yml"
```
Add your REST server URL as a secret:
```
abra app secret insert <backupbot_name> restic_repo v1 "rest:https://user:pass@host:8000/"
```
The secret overrides the `RESTIC_REPOSITORY` variable.
See [restic REST docs](https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html#rest-server) for more information.
## Push notifications
The following environment variables can be used to set up push notifications for backups. `PUSH_URL_START` is requested just before the backup starts, `PUSH_URL_SUCCESS` is requested only if the backup succeeds, and `PUSH_URL_FAIL` is requested if the backup fails.
Each variable is optional and independent of the others.
```
PUSH_URL_START=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=start
PUSH_URL_SUCCESS=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=OK
PUSH_URL_FAIL=https://status.example.com/api/push/xxxxxxxxxx?status=down&msg=fail
```
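The notification flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Backupbot's actual implementation: `notify`, `run_with_notifications`, `_http_get`, and the injectable `fetch` parameter are hypothetical names, and only the three `PUSH_URL_*` variables come from this README.

```python
import os
import urllib.request


def _http_get(url):
    """Default fetcher (assumption of this sketch): a plain GET with a timeout."""
    urllib.request.urlopen(url, timeout=10).close()


def notify(env_name, environ=os.environ, fetch=_http_get):
    """Request the URL stored in env_name; return False if the variable is unset."""
    url = environ.get(env_name)
    if not url:
        return False  # every PUSH_URL_* variable is optional
    fetch(url)
    return True


def run_with_notifications(backup, environ=os.environ, fetch=_http_get):
    """Call backup() and request the start, success, or fail URL around it."""
    notify("PUSH_URL_START", environ, fetch)
    try:
        backup()
    except Exception:
        notify("PUSH_URL_FAIL", environ, fetch)
        raise
    notify("PUSH_URL_SUCCESS", environ, fetch)
```

Passing `fetch` as a parameter keeps the sketch testable without network access; each URL is looked up separately, so leaving any one of them unset does not affect the others.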
## Usage
Run the cronjob that creates a backup, including the push notifications and docker logging:
`abra app cmd <backupbot_name> app run_cron`
Create a backup of all apps:
`abra app run <backupbot_name> app -- backup create`
> The apps to be backed up need to be deployed
Create an individual backup:
`abra app run <backupbot_name> app -- backup --host <target_app_name> create`
Create a backup to a local repository:
`abra app run <backupbot_name> app -- backup create -r /backups/restic`
> It is recommended to shut down/undeploy an app before restoring its data
Restore the latest snapshot of all included apps:
`abra app run <backupbot_name> app -- backup restore`
Restore a specific snapshot of an individual app:
`abra app run <backupbot_name> app -- backup --host <target_app_name> restore --snapshot <snapshot_id>`
Show all snapshots:
`abra app run <backupbot_name> app -- backup snapshots`
Show all snapshots containing a specific app:
`abra app run <backupbot_name> app -- backup --host <target_app_name> snapshots`
Show all files inside the latest snapshot (can be very verbose):
`abra app run <backupbot_name> app -- backup ls`
Show specific files inside a selected snapshot:
`abra app run <backupbot_name> app -- backup ls --snapshot <snapshot_id> --path /var/lib/docker/volumes/`
Download files from a snapshot:
```
filename=$(abra app run <backupbot_name> app -- backup download --snapshot <snapshot_id> --path <absolute_path>)
abra app cp <backupbot_name> app:$filename .
```
## Run restic
```
abra app run <backupbot_name> app bash
export AWS_SECRET_ACCESS_KEY=$(cat $AWS_SECRET_ACCESS_KEY_FILE)
export RESTIC_PASSWORD=$(cat $RESTIC_PASSWORD_FILE)
restic snapshots
```
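The `export X=$(cat $X_FILE)` pattern above turns each secret file into a plain environment variable before calling `restic` by hand. A minimal Python sketch of the same idea follows; `read_secret_files` is a hypothetical helper name, and stripping trailing newlines is an assumption made here to mirror what shell command substitution does.

```python
import os


def read_secret_files(environ=os.environ):
    """Map every NAME_FILE variable to {NAME: contents of the referenced file}."""
    secrets = {}
    for name, path in environ.items():
        # COMPOSE_FILE is a configuration value, not a path to a secret.
        if name.endswith("_FILE") and name != "COMPOSE_FILE":
            with open(path) as handle:
                # $(cat file) drops trailing newlines, so do the same here.
                secrets[name[: -len("_FILE")]] = handle.read().rstrip("\n")
    return secrets
```

The returned dictionary could then be merged into the environment of a `restic` subprocess instead of being exported globally.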
## Recipe Configuration
Like Traefik, or `swarm-cronjob`, Backupbot II uses access to the Docker socket to read labels from running Docker Swarm services:
```
@@ -49,24 +184,15 @@ services:
db:
deploy:
labels:
backupbot.backup: "true"
backupbot.backup.pre-hook: 'mysqldump -u root -p"$(cat /run/secrets/db_root_password)" -f /tmp/dump/dump.db'
backupbot.backup.post-hook: "rm -rf /tmp/dump/dump.db"
backupbot.backup.path: "/tmp/dump/,/etc/foo/"
backupbot.backup: ${BACKUP:-"true"}
backupbot.backup.pre-hook: 'mysqldump -u root -p"$(cat /run/secrets/db_root_password)" -f /volume_path/dump.db'
backupbot.backup.post-hook: "rm -rf /volume_path/dump.db"
```
- `backupbot.backup` -- set to `true` to back up this service (REQUIRED)
- `backupbot.backup.path` -- comma separated list of file paths within the service to copy (REQUIRED)
- `backupbot.backup.pre-hook` -- command to run before copying files (optional)
- `backupbot.backup.pre-hook` -- command to run before copying files (optional); save all dumps into a volume
- `backupbot.backup.post-hook` -- command to run after copying files (optional)
As in the above example, you can reference Docker Secrets, e.g. for looking up database passwords, by reading the files in `/run/secrets` directly.
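How the labels above might be evaluated can be sketched with plain dictionaries standing in for the Docker API. The names `backup_enabled` and `collect_hooks` are hypothetical, and treating only the literal value `true` as enabling backups is an assumption of this sketch, not necessarily how Backupbot II evaluates the label.

```python
def backup_enabled(labels):
    """Return True only when backupbot.backup is literally "true" (case-insensitive)."""
    return str(labels.get("backupbot.backup", "")).strip().lower() == "true"


def collect_hooks(services):
    """Collect hooks from {service_name: labels}; returns (pre_hooks, post_hooks)."""
    pre_hooks, post_hooks = {}, {}
    for name, labels in services.items():
        if not backup_enabled(labels):
            continue  # services without the label are ignored entirely
        if hook := labels.get("backupbot.backup.pre-hook"):
            pre_hooks[name] = hook
        if hook := labels.get("backupbot.backup.post-hook"):
            post_hooks[name] = hook
    return pre_hooks, post_hooks
```

Comparing against the string `true` rather than relying on truthiness matters when the label is set through a variable such as `${BACKUP:-"true"}`, because any non-empty string, including `false`, is truthy in Python.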
## Development
1. Install `direnv`
2. `cp .envrc.sample .envrc`
3. Edit `.envrc` as appropriate, including setting `DOCKER_CONTEXT` to a remote Docker context, if you're not running a swarm server locally.
4. Run `./backup.sh` -- you can add the `--skip-backup` or `--skip-upload` options if you just want to test one other step
[abra]: https://git.autonomic.zone/autonomic-cooperative/abra
+10
@@ -0,0 +1,10 @@
export BACKUPBOT_VERSION=v1
export SSH_CONFIG_VERSION=v1
run_cron () {
    schedule="$(crontab -l | tr -s " " | cut -d ' ' -f-5)"
    rm -f /tmp/backup.log
    echo "* * * * * $(crontab -l | tr -s " " | cut -d ' ' -f6-)" | crontab -
    while [ ! -f /tmp/backup.log ]; do sleep 1; done
    echo "$schedule $(crontab -l | tr -s " " | cut -d ' ' -f6-)" | crontab -
}
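The `run_cron` helper above triggers the scheduled job right away: it splits the crontab entry into its five schedule fields and the command, temporarily installs an every-minute schedule, waits for `/tmp/backup.log` to appear, and then restores the original schedule. The field splitting can be sketched in Python as follows; `split_cron_line` and `every_minute` are hypothetical helper names used only for illustration, and the sketch assumes a single-line crontab, as the shell code does.

```python
def split_cron_line(line):
    """Split a crontab entry into (schedule, command)."""
    # The first five whitespace-separated fields are the schedule;
    # everything after them is the command.
    parts = line.split(None, 5)
    schedule = " ".join(parts[:5])
    command = parts[5] if len(parts) > 5 else ""
    return schedule, command


def every_minute(line):
    """Return the same entry rescheduled to run every minute."""
    _, command = split_cron_line(line)
    return f"* * * * * {command}"
```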
-139
@@ -1,139 +0,0 @@
#!/bin/bash
server_name="${SERVER_NAME:?SERVER_NAME not set}"
restic_password_file="${RESTIC_PASSWORD_FILE:?RESTIC_PASSWORD_FILE not set}"
restic_host="${RESTIC_HOST:?RESTIC_HOST not set}"
backup_path="${BACKUP_DEST:?BACKUP_DEST not set}"
# shellcheck disable=SC2153
ssh_key_file="${SSH_KEY_FILE}"
s3_key_file="${AWS_SECRET_ACCESS_KEY_FILE}"
# shellcheck disable=SC2153
https_password_file="${HTTPS_PASSWORD_FILE}"
restic_repo=
restic_extra_options=
if [ -n "$ssh_key_file" ] && [ -f "$ssh_key_file" ]; then
restic_repo="sftp:$restic_host:/$server_name"
# Only check server against provided SSH_HOST_KEY, if set
if [ -n "$SSH_HOST_KEY" ]; then
tmpfile=$(mktemp)
echo "$SSH_HOST_KEY" >>"$tmpfile"
echo "using host key $SSH_HOST_KEY"
ssh_options="-o 'UserKnownHostsFile $tmpfile'"
elif [ "$SSH_HOST_KEY_DISABLE" = "1" ]; then
echo "disabling SSH host key checking"
ssh_options="-o 'StrictHostKeyChecking=No'"
else
echo "neither SSH_HOST_KEY nor SSH_HOST_KEY_DISABLE set"
fi
restic_extra_options="sftp.command=ssh $ssh_options -i $ssh_key_file $restic_host -s sftp"
fi
if [ -n "$s3_key_file" ] && [ -f "$s3_key_file" ] && [ -n "$AWS_ACCESS_KEY_ID" ]; then
AWS_SECRET_ACCESS_KEY="$(cat "${s3_key_file}")"
export AWS_SECRET_ACCESS_KEY
restic_repo="s3:$restic_host:/$server_name"
fi
if [ -n "$https_password_file" ] && [ -f "$https_password_file" ]; then
HTTPS_PASSWORD="$(cat "${https_password_file}")"
export HTTPS_PASSWORD
restic_user="${RESTIC_USER:?RESTIC_USER not set}"
restic_repo="rest:https://$restic_user:$HTTPS_PASSWORD@$restic_host"
fi
if [ -z "$restic_repo" ]; then
echo "you must configure either SFTP, S3, or HTTPS storage, see README"
exit 1
fi
echo "restic_repo: $restic_repo"
# Pre-bake-in some default restic options
_restic() {
if [ -z "$restic_extra_options" ]; then
# shellcheck disable=SC2068
restic -p "$restic_password_file" \
--quiet -r "$restic_repo" \
$@
else
# shellcheck disable=SC2068
restic -p "$restic_password_file" \
--quiet -r "$restic_repo" \
-o "$restic_extra_options" \
$@
fi
}
if [ -n "$SERVICES_OVERRIDE" ]; then
# this is fine because docker service names should never include spaces or
# glob characters
# shellcheck disable=SC2206
services=($SERVICES_OVERRIDE)
else
mapfile -t services < <(docker service ls --format '{{ .Name }}')
fi
if [[ \ $*\ != *\ --skip-backup\ * ]]; then
rm -rf "${backup_path}"
for service in "${services[@]}"; do
echo "service: $service"
details=$(docker service inspect "$service" --format "{{ json .Spec.Labels }}")
if echo "$details" | jq -r '.["backupbot.backup"]' | grep -q 'true'; then
pre=$(echo "$details" | jq -r '.["backupbot.backup.pre-hook"]')
post=$(echo "$details" | jq -r '.["backupbot.backup.post-hook"]')
path=$(echo "$details" | jq -r '.["backupbot.backup.path"]')
if [ "$path" = "null" ]; then
echo "ERROR: missing 'path' for $service"
continue # or maybe exit?
fi
container=$(docker container ls -f "name=$service" --format '{{ .ID }}')
echo "backing up $service"
if [ "$pre" != "null" ]; then
# run the precommand
# shellcheck disable=SC2086
docker exec "$container" sh -c "$pre"
fi
# run the backup
for p in ${path//,/ }; do
# creates the parent folder, so `docker cp` has reliable behaviour no matter if $p ends with `/` or `/.`
dir=$backup_path/$service/$(dirname "$p")
test -d "$dir" || mkdir -p "$dir"
docker cp -a "$container:$p" "$dir/$(basename "$p")"
done
if [ "$post" != "null" ]; then
# run the postcommand
# shellcheck disable=SC2086
docker exec "$container" sh -c "$post"
fi
fi
done
# check if restic repo exists, initialise if not
if [ -z "$(_restic cat config)" ] 2>/dev/null; then
echo "initializing restic repo"
_restic init
fi
fi
if [[ \ $*\ != *\ --skip-upload\ * ]]; then
_restic backup --host "$server_name" --tag coop-cloud "$backup_path"
if [ "$REMOVE_BACKUP_VOLUME_AFTER_UPLOAD" -eq 1 ]; then
echo "Cleaning up ${backup_path}"
rm -rf "${backup_path:?}"/*
fi
fi
Executable
+365
@@ -0,0 +1,365 @@
#!/usr/bin/python3

import os
import sys
import click
import json
import subprocess
import logging
import docker
import restic
import tarfile
import io
from pythonjsonlogger import jsonlogger
from datetime import datetime, timezone
from restic.errors import ResticFailedError
from pathlib import Path
from shutil import copyfile, rmtree

VOLUME_PATH = "/var/lib/docker/volumes/"
SECRET_PATH = '/secrets/'
SERVICE = None

logger = logging.getLogger("backupbot")
logging.addLevelName(55, 'SUMMARY')
setattr(logging, 'SUMMARY', 55)
setattr(logger, 'summary', lambda message, *args, **
        kwargs: logger.log(55, message, *args, **kwargs))


def handle_exception(exc_type, exc_value, exc_traceback):
    if issubclass(exc_type, KeyboardInterrupt):
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    logger.critical("Uncaught exception", exc_info=(
        exc_type, exc_value, exc_traceback))


sys.excepthook = handle_exception


@click.group()
@click.option('-l', '--log', 'loglevel')
@click.option('-m', '--machine-logs', 'machine_logs', is_flag=True)
@click.option('service', '--host', '-h', envvar='SERVICE')
@click.option('repository', '--repo', '-r', envvar='RESTIC_REPOSITORY', required=True)
def cli(loglevel, service, repository, machine_logs):
    global SERVICE
    if service:
        SERVICE = service.replace('.', '_')
    if repository:
        os.environ['RESTIC_REPOSITORY'] = repository
    if loglevel:
        numeric_level = getattr(logging, loglevel.upper(), None)
        if not isinstance(numeric_level, int):
            raise ValueError('Invalid log level: %s' % loglevel)
        logger.setLevel(numeric_level)
    if machine_logs:
        logHandler = logging.StreamHandler()
        formatter = jsonlogger.JsonFormatter(
            "%(levelname)s %(filename)s %(lineno)s %(process)d %(message)s", rename_fields={"levelname": "message_type"})
        logHandler.setFormatter(formatter)
        logger.addHandler(logHandler)

    export_secrets()
    init_repo()


def init_repo():
    repo = os.environ['RESTIC_REPOSITORY']
    logger.debug(f"set restic repository location: {repo}")
    restic.repository = repo
    restic.password_file = '/var/run/secrets/restic_password'
    try:
        restic.cat.config()
    except ResticFailedError as error:
        if 'unable to open config file' in str(error):
            result = restic.init()
            logger.info(f"Initialized restic repo: {result}")
        else:
            raise error


def export_secrets():
    for env in os.environ:
        if env.endswith('FILE') and not "COMPOSE_FILE" in env:
            logger.debug(f"exported secret: {env}")
            with open(os.environ[env]) as file:
                secret = file.read()
                os.environ[env.removesuffix('_FILE')] = secret
                # logger.debug(f"Read secret value: {secret}")


@cli.command()
@click.option('retries', '--retries', '-r', envvar='RETRIES', default=1)
def create(retries):
    pre_commands, post_commands, backup_paths, apps = get_backup_cmds()
    copy_secrets(apps)
    backup_paths.append(SECRET_PATH)
    run_commands(pre_commands)
    backup_volumes(backup_paths, apps, int(retries))
    run_commands(post_commands)


def get_backup_cmds():
    client = docker.from_env()
    container_by_service = {
        c.labels['com.docker.swarm.service.name']: c for c in client.containers.list()}
    backup_paths = set()
    backup_apps = set()
    pre_commands = {}
    post_commands = {}
    services = client.services.list()
    for s in services:
        labels = s.attrs['Spec']['Labels']
        if (backup := labels.get('backupbot.backup')) and bool(backup):
            # volumes: s.attrs['Spec']['TaskTemplate']['ContainerSpec']['Mounts'][0]['Source']
            stack_name = labels['com.docker.stack.namespace']
            # Uncomment these lines to back up only a specific service
            # This will unfortunately decrease restic performance
            # if SERVICE and SERVICE != stack_name:
            #     continue
            backup_apps.add(stack_name)
            backup_paths = backup_paths.union(
                Path(VOLUME_PATH).glob(f"{stack_name}_*"))
            if not (container := container_by_service.get(s.name)):
                logger.error(
                    f"Container {s.name} is not running, hooks can not be executed")
                continue
            if prehook := labels.get('backupbot.backup.pre-hook'):
                pre_commands[container] = prehook
            if posthook := labels.get('backupbot.backup.post-hook'):
                post_commands[container] = posthook
    return pre_commands, post_commands, list(backup_paths), list(backup_apps)


def copy_secrets(apps):
    # TODO: check if it is deployed
    rmtree(SECRET_PATH, ignore_errors=True)
    os.mkdir(SECRET_PATH)
    client = docker.from_env()
    container_by_service = {
        c.labels['com.docker.swarm.service.name']: c for c in client.containers.list()}
    services = client.services.list()
    for s in services:
        app_name = s.attrs['Spec']['Labels']['com.docker.stack.namespace']
        if (app_name in apps and
                (app_secs := s.attrs['Spec']['TaskTemplate']['ContainerSpec'].get('Secrets'))):
            if not container_by_service.get(s.name):
                logger.error(
                    f"Container {s.name} is not running, secrets can not be copied.")
                continue
            container_id = container_by_service[s.name].id
            for sec in app_secs:
                src = f'/var/lib/docker/containers/{container_id}/mounts/secrets/{sec["SecretID"]}'
                if not Path(src).exists():
                    logger.error(
                        f"For the secret {sec['SecretName']} the file {src} does not exist for {s.name}")
                    continue
                dst = SECRET_PATH + sec['SecretName']
                copyfile(src, dst)


def run_commands(commands):
    for container, command in commands.items():
        if not command:
            continue
        # Remove bash/sh wrapping
        command = command.removeprefix('bash -c').removeprefix('sh -c').removeprefix(' ')
        # Remove quotes surrounding the command
        if (len(command) >= 2 and command[0] == command[-1] and (command[0] == "'" or command[0] == '"')):
            command = command[1:-1]
        # Use bash's pipefail to return exit codes inside a pipe to prevent silent failure
        command = f"bash -c 'set -o pipefail;{command}'"
        logger.info(f"run command in {container.name}:")
        logger.info(command)
        result = container.exec_run(command)
        if result.exit_code:
            logger.error(
                f"Failed to run command {command} in {container.name}: {result.output.decode()}")
        else:
            logger.info(result.output.decode())


def backup_volumes(backup_paths, apps, retries, dry_run=False):
    while True:
        try:
            result = restic.backup(backup_paths, dry_run=dry_run, tags=apps)
            logger.summary("backup finished", extra=result)
            return
        except ResticFailedError as error:
            logger.error(
                f"Backup failed for {apps}. Could not Backup these paths: {backup_paths}")
            logger.error(error, exc_info=True)
            if retries > 0:
                retries -= 1
            else:
                exit(1)


@cli.command()
@click.option('snapshot', '--snapshot', '-s', envvar='SNAPSHOT', default='latest')
@click.option('target', '--target', '-t', envvar='TARGET', default='/')
@click.option('noninteractive', '--noninteractive', envvar='NONINTERACTIVE', is_flag=True)
def restore(snapshot, target, noninteractive):
    # Todo: recommend to shutdown the container
    service_paths = VOLUME_PATH
    if SERVICE:
        service_paths = service_paths + f'{SERVICE}_*'
    snapshots = restic.snapshots(snapshot_id=snapshot)
    # Check the query result, not the (always non-empty) snapshot ID
    if not snapshots:
        logger.error(f"No Snapshots with ID {snapshot}")
        exit(1)
    if not noninteractive:
        snapshot_date = datetime.fromisoformat(snapshots[0]['time'])
        delta = datetime.now(tz=timezone.utc) - snapshot_date
        print(
            f"You are going to restore Snapshot {snapshot} of {service_paths} at {target}")
        print(f"This snapshot is {delta} old")
        print(
            f"THIS COMMAND WILL IRREVERSIBLY OVERWRITE {target}{service_paths.removeprefix('/')}")
        prompt = input("Type YES (uppercase) to continue: ")
        if prompt != 'YES':
            logger.error("Restore aborted")
            exit(1)
    print(f"Restoring Snapshot {snapshot} of {service_paths} at {target}")
    # TODO: use tags if no snapshot is selected, to use a snapshot including SERVICE
    result = restic.restore(snapshot_id=snapshot,
                            include=service_paths, target_dir=target)
    logger.debug(result)
@cli.command()
def snapshots():
snapshots = restic.snapshots()
no_snapshots = True
for snap in snapshots:
if not SERVICE or (tags := snap.get('tags')) and SERVICE in tags:
print(snap['time'], snap['id'])
no_snapshots = False
if no_snapshots:
err_msg = "No Snapshots found"
if SERVICE:
service_name = SERVICE.replace('_', '.')
err_msg += f' for app {service_name}'
logger.warning(err_msg)


@cli.command()
@click.option('snapshot', '--snapshot', '-s', envvar='SNAPSHOT', default='latest')
@click.option('path', '--path', '-p', envvar='INCLUDE_PATH')
def ls(snapshot, path):
    results = list_files(snapshot, path)
    for r in results:
        if r.get('path'):
            print(f"{r['ctime']}\t{r['path']}")


def list_files(snapshot, path):
    cmd = restic.cat.base_command() + ['ls']
    if SERVICE:
        cmd = cmd + ['--tag', SERVICE]
    cmd.append(snapshot)
    if path:
        cmd.append(path)
    try:
        output = restic.internal.command_executor.execute(cmd)
    except ResticFailedError as error:
        if 'no snapshot found' in str(error):
            err_msg = f'There is no snapshot "{snapshot}"'
            if SERVICE:
                err_msg += f' for the app "{SERVICE}"'
            logger.error(err_msg)
            exit(1)
        else:
            raise
    # The output holds one JSON object per line; parsing line by line avoids
    # breaking on paths that contain a '|' character
    results = [json.loads(line) for line in output.splitlines() if line.strip()]
    return results


@cli.command()
@click.option('snapshot', '--snapshot', '-s', envvar='SNAPSHOT', default='latest')
@click.option('path', '--path', '-p', envvar='INCLUDE_PATH')
@click.option('volumes', '--volumes', '-v', envvar='VOLUMES')
@click.option('secrets', '--secrets', '-c', is_flag=True, envvar='SECRETS')
def download(snapshot, path, volumes, secrets):
    file_dumps = []
    # Without any selection, download both volumes and secrets
    if not any([path, volumes, secrets]):
        volumes = secrets = True
    if path:
        path = path.removesuffix('/')
        binary_output = dump(snapshot, path)
        files = list_files(snapshot, path)
        filetype = [f.get('type') for f in files if f.get('path') == path][0]
        filename = Path(path).name
        if filetype == 'dir':
            filename = filename + ".tar"
        tarinfo = tarfile.TarInfo(name=filename)
        tarinfo.size = len(binary_output)
        file_dumps.append((binary_output, tarinfo))
    if volumes:
        if not SERVICE:
            logger.error("Please specify '--host' when using '--volumes'")
            exit(1)
        files = list_files(snapshot, VOLUME_PATH)
        for f in files[1:]:
            path = f['path']
            if Path(path).name.startswith(SERVICE) and f['type'] == 'dir':
                binary_output = dump(snapshot, path)
                filename = f"{Path(path).name}.tar"
                tarinfo = tarfile.TarInfo(name=filename)
                tarinfo.size = len(binary_output)
                file_dumps.append((binary_output, tarinfo))
    if secrets:
        if not SERVICE:
            logger.error("Please specify '--host' when using '--secrets'")
            exit(1)
        filename = f"{SERVICE}.json"
        files = list_files(snapshot, SECRET_PATH)
        secrets = {}
        for f in files[1:]:
            path = f['path']
            if Path(path).name.startswith(SERVICE) and f['type'] == 'file':
                secret = dump(snapshot, path).decode()
                secret_name = path.removeprefix(f'{SECRET_PATH}{SERVICE}_')
                secrets[secret_name] = secret
        # All secrets of the service go into a single JSON file
        binary_output = json.dumps(secrets).encode()
        tarinfo = tarfile.TarInfo(name=filename)
        tarinfo.size = len(binary_output)
        file_dumps.append((binary_output, tarinfo))
    with tarfile.open('/tmp/backup.tar.gz', "w:gz") as tar:
        print("Writing files to /tmp/backup.tar.gz...")
        for binary_output, tarinfo in file_dumps:
            tar.addfile(tarinfo, fileobj=io.BytesIO(binary_output))
    size = get_formatted_size('/tmp/backup.tar.gz')
    print(
        f"Backup has been written to /tmp/backup.tar.gz with a size of {size}")


def get_formatted_size(file_path):
    file_size = os.path.getsize(file_path)
    units = ['Bytes', 'KB', 'MB', 'GB', 'TB']
    # Stop dividing before the last unit, so sizes of 1024 TB and more
    # are still reported correctly in TB
    for unit in units[:-1]:
        if file_size < 1024:
            return f"{round(file_size, 3)} {unit}"
        file_size /= 1024
    return f"{round(file_size, 3)} {units[-1]}"


def dump(snapshot, path):
    cmd = restic.cat.base_command() + ['dump']
    if SERVICE:
        cmd = cmd + ['--tag', SERVICE]
    cmd = cmd + [snapshot, path]
    print(f"Dumping {path} from snapshot '{snapshot}'")
    output = subprocess.run(cmd, capture_output=True)
    if output.returncode:
        logger.error(
            f"error while dumping {path} from snapshot '{snapshot}': {output.stderr}")
        exit(1)
    return output.stdout


if __name__ == '__main__':
    cli()
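
The file listing above depends on the restic output containing one JSON object per line. Below is a minimal, standalone sketch of that parsing step, assuming line-delimited output. The `parse_json_lines` helper and the sample records are hypothetical and not part of backupbot; the records only imitate the general shape of a listing.

```python
import json


def parse_json_lines(output):
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in output.splitlines() if line.strip()]


# Hypothetical sample data: one snapshot record followed by two file nodes.
sample = (
    '{"struct_type": "snapshot", "short_id": "abc123"}\n'
    '{"struct_type": "node", "type": "dir", "path": "/volumes/app_data"}\n'
    '{"struct_type": "node", "type": "file", "path": "/volumes/app_data/a|b.txt"}\n'
)

entries = parse_json_lines(sample)
# Only node records carry a path; the leading snapshot record does not.
paths = [entry["path"] for entry in entries if entry.get("path")]
print(paths)
```

Parsing per line does not rely on a separator character, so a path that happens to contain `|` is handled like any other.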
-15
@@ -1,15 +0,0 @@
---
version: "3.8"
services:
  app:
    environment:
      - HTTPS_PASSWORD_FILE=/run/secrets/https_password
      - RESTIC_USER
    secrets:
      - source: https_password
        mode: 0400

secrets:
  https_password:
    external: true
    name: ${STACK_NAME}_https_password_${SECRET_HTTPS_PASSWORD_VERSION}
+13
@@ -0,0 +1,13 @@
---
version: "3.8"
services:
  app:
    environment:
      - RESTIC_REPOSITORY_FILE=/run/secrets/restic_repo
    secrets:
      - restic_repo

secrets:
  restic_repo:
    external: true
    name: ${STACK_NAME}_restic_repo_${SECRET_RESTIC_REPO_VERSION}
+8 -1
@@ -5,12 +5,19 @@ services:
environment:
- SSH_KEY_FILE=/run/secrets/ssh_key
- SSH_HOST_KEY
- SSH_HOST_KEY_DISABLE
secrets:
- source: ssh_key
mode: 0400
configs:
- source: ssh_config
target: /root/.ssh/config
secrets:
ssh_key:
external: true
name: ${STACK_NAME}_ssh_key_${SECRET_SSH_KEY_VERSION}
configs:
ssh_config:
name: ${STACK_NAME}_ssh_config_${SSH_CONFIG_VERSION}
file: ssh_config
+17 -16
@@ -2,34 +2,35 @@
version: "3.8"
services:
app:
image: thecoopcloud/backup-bot-two:latest
# build: .
image: git.coopcloud.tech/coop-cloud/backup-bot-two:2.0.0
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
- "backups:/backups"
- "/var/lib/docker/volumes/:/var/lib/docker/volumes/"
- "/var/lib/docker/containers/:/var/lib/docker/containers/:ro"
- backups:/backups
environment:
- CRON_SCHEDULE
- RESTIC_REPO
- RESTIC_REPOSITORY
- RESTIC_PASSWORD_FILE=/run/secrets/restic_password
- BACKUP_DEST=/backups
- RESTIC_HOST
- SERVER_NAME
- REMOVE_BACKUP_VOLUME_AFTER_UPLOAD=1
secrets:
- restic_password
deploy:
labels:
- "traefik.enable=true"
- "traefik.http.services.${STACK_NAME}.loadbalancer.server.port=8008"
- "traefik.http.routers.${STACK_NAME}.rule="
- "traefik.http.routers.${STACK_NAME}.entrypoints=web-secure"
- "traefik.http.routers.${STACK_NAME}.tls.certresolver=${LETS_ENCRYPT_ENV}"
- coop-cloud.${STACK_NAME}.version=0.1.0+latest
volumes:
backups:
- coop-cloud.${STACK_NAME}.timeout=${TIMEOUT:-300}
- coop-cloud.backupbot.enabled=true
#entrypoint: ['tail', '-f','/dev/null']
healthcheck:
test: "pgrep crond"
interval: 30s
timeout: 10s
retries: 10
start_period: 5m
secrets:
restic_password:
external: true
name: ${STACK_NAME}_restic_password_${SECRET_RESTIC_PASSWORD_VERSION}
volumes:
backups:
Executable
+30
@@ -0,0 +1,30 @@
#!/bin/sh

set -e

if [ -n "$SSH_HOST_KEY" ]
then
    echo "$SSH_HOST_KEY" > /root/.ssh/known_hosts
fi

cron_schedule="${CRON_SCHEDULE:?CRON_SCHEDULE not set}"

if [ -n "$PUSH_URL_START" ]
then
    push_start_notification="curl -s '$PUSH_URL_START' &&"
fi

if [ -n "$PUSH_URL_FAIL" ]
then
    push_fail_notification="|| curl -s '$PUSH_URL_FAIL'"
fi

if [ -n "$PUSH_URL_SUCCESS" ]
then
    push_notification=" && (grep -q 'backup finished' /tmp/backup.log && curl -s '$PUSH_URL_SUCCESS' $push_fail_notification)"
fi

echo "$cron_schedule $push_start_notification backup --machine-logs create 2>&1 | tee /tmp/backup.log $push_notification" | crontab -
crontab -l

crond -f -d8 -L /dev/stdout
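
To make the resulting crontab entry easier to reason about, here is a rough Python sketch that mirrors the string assembly of the entrypoint script above. The `build_cron_line` helper and the example URLs are hypothetical, for illustration only; the script itself remains the reference.

```python
def build_cron_line(env):
    """Mirror the entrypoint's crontab line assembly for a given environment."""
    schedule = env.get("CRON_SCHEDULE")
    if not schedule:
        # Same effect as ${CRON_SCHEDULE:?CRON_SCHEDULE not set}
        raise ValueError("CRON_SCHEDULE not set")
    start = fail = success = ""
    if env.get("PUSH_URL_START"):
        start = f"curl -s '{env['PUSH_URL_START']}' &&"
    if env.get("PUSH_URL_FAIL"):
        fail = f"|| curl -s '{env['PUSH_URL_FAIL']}'"
    if env.get("PUSH_URL_SUCCESS"):
        # The failure call is embedded in the success clause.
        success = (" && (grep -q 'backup finished' /tmp/backup.log"
                   f" && curl -s '{env['PUSH_URL_SUCCESS']}' {fail})")
    return (f"{schedule} {start} backup --machine-logs create 2>&1"
            f" | tee /tmp/backup.log {success}")


# Hypothetical example values.
line = build_cron_line({
    "CRON_SCHEDULE": "0 3 * * *",
    "PUSH_URL_SUCCESS": "https://push.example.com/ok",
    "PUSH_URL_FAIL": "https://push.example.com/fail",
})
print(line)
```

One consequence of this assembly is that the failure URL only ends up in the crontab line when `PUSH_URL_SUCCESS` is set as well.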
+3
@@ -0,0 +1,3 @@
Breaking change: the variables `SERVER_NAME` and `RESTIC_HOST` have been merged into `RESTIC_REPOSITORY`. See the restic documentation for the format: https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html
ssh/sftp: `sftp:user@host:/repo-path`
S3: `s3:https://s3.example.com/bucket_name`
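
When migrating an existing configuration, a quick check of the new value can catch leftovers from the old variables. The sketch below recognises only the two forms listed above; the `repository_backend` helper and its patterns are hypothetical, and restic itself accepts more backends and variations than this.

```python
import re

# Simplified patterns for the two documented forms only (an assumption,
# not restic's own parsing rules).
PATTERNS = {
    "sftp": re.compile(r"^sftp:[^@\s]+@[^:\s]+:/\S*$"),
    "s3": re.compile(r"^s3:https?://[^/\s]+/\S+$"),
}


def repository_backend(repository):
    """Return 'sftp' or 's3' for a matching RESTIC_REPOSITORY value."""
    for backend, pattern in PATTERNS.items():
        if pattern.match(repository):
            return backend
    raise ValueError(f"unrecognised RESTIC_REPOSITORY format: {repository!r}")


print(repository_backend("sftp:user@host:/repo-path"))
print(repository_backend("s3:https://s3.example.com/bucket_name"))
```

A bare host name, as it might have been used with the old variables, is rejected with a `ValueError`.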
-11
@@ -1,11 +0,0 @@
#!/bin/bash
set -e
set -o pipefail
cron_schedule="${CRON_SCHEDULE:?CRON_SCHEDULE not set}"
echo "$cron_schedule /usr/bin/backup.sh" | crontab -
crontab -l
crond -f -d8 -L /dev/stdout
+4
@@ -0,0 +1,4 @@
Host *
    IdentityFile /run/secrets/ssh_key
    ServerAliveInterval 60
    ServerAliveCountMax 240