4 Commits

Author SHA1 Message Date
e360c3d8f8 remove unused traefik labels
All checks were successful
continuous-integration/drone/pr Build is passing
2023-06-13 14:34:53 +02:00
50b317c12d fix shellcheck prevent globbing
All checks were successful
continuous-integration/drone/pr Build is passing
2023-06-13 11:57:42 +02:00
9cb3a469f3 mount volume ro
Some checks failed
continuous-integration/drone/pr Build is failing
2023-06-13 11:48:25 +02:00
24d2c0e85b Backup volumes from host instead of copying paths
Some checks failed
continuous-integration/drone/pr Build is failing
* Backupbot will now copy all volumes from a service with
  backupbot.enabled = 'true' label from the /var/lib/docker/volumes/
  path directly. This reduces the resource overhead of copying
  stuff from one volume to another.
  Recipes need to be adjustet that db-dumps are saved into a volume
  now!
* Remove the Dockerfile and move stuff into a entrypoint. This
  simplifies the whole versioning thing and makes this "just"
  a recipe

Co-authored-by: Moritz < moritz.m@local-it.org>
2023-06-05 11:15:54 +02:00
16 changed files with 185 additions and 777 deletions

View File

@ -2,16 +2,11 @@
kind: pipeline kind: pipeline
name: linters name: linters
steps: steps:
- name: publish image - name: run shellcheck
image: plugins/docker image: koalaman/shellcheck-alpine
settings: commands:
username: 3wordchant - shellcheck backup.sh
password:
from_secret: git_coopcloud_tech_token_3wc trigger:
repo: git.coopcloud.tech/coop-cloud/backup-bot-two branch:
tags: ${DRONE_SEMVER_BUILD} - main
registry: git.coopcloud.tech
when:
event:
include:
- tag

View File

@ -4,14 +4,11 @@ SECRET_RESTIC_PASSWORD_VERSION=v1
COMPOSE_FILE=compose.yml COMPOSE_FILE=compose.yml
RESTIC_REPOSITORY=/backups/restic SERVER_NAME=example.com
RESTIC_HOST=minio.example.com
CRON_SCHEDULE='30 3 * * *' CRON_SCHEDULE='*/5 * * * *'
REMOVE_BACKUP_VOLUME_AFTER_UPLOAD=1
# Push Notifiactions
#PUSH_URL_START=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=start
#PUSH_URL_SUCCESS=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=OK
#PUSH_URL_FAIL=https://status.example.com/api/push/xxxxxxxxxx?status=down&msg=fail
# swarm-cronjob, instead of built-in cron # swarm-cronjob, instead of built-in cron
#COMPOSE_FILE="$COMPOSE_FILE:compose.swarm-cronjob.yml" #COMPOSE_FILE="$COMPOSE_FILE:compose.swarm-cronjob.yml"
@ -25,10 +22,3 @@ CRON_SCHEDULE='30 3 * * *'
#SECRET_AWS_SECRET_ACCESS_KEY_VERSION=v1 #SECRET_AWS_SECRET_ACCESS_KEY_VERSION=v1
#AWS_ACCESS_KEY_ID=something-secret #AWS_ACCESS_KEY_ID=something-secret
#COMPOSE_FILE="$COMPOSE_FILE:compose.s3.yml" #COMPOSE_FILE="$COMPOSE_FILE:compose.s3.yml"
# Secret restic repository
# use a secret to store the RESTIC_REPOSITORY if the repository location contains a secret value
# i.E rest:https://user:SECRET_PASSWORD@host:8000/
# it overwrites the RESTIC_REPOSITORY variable
#SECRET_RESTIC_REPO_VERSION=v1
#COMPOSE_FILE="$COMPOSE_FILE:compose.secret.yml"

View File

@ -10,8 +10,6 @@ export DOCKER_CONTEXT=$SERVER_NAME
# or this: # or this:
#export AWS_SECRET_ACCESS_KEY_FILE=s3 #export AWS_SECRET_ACCESS_KEY_FILE=s3
#export AWS_ACCESS_KEY_ID=easter-october-emphatic-tug-urgent-customer #export AWS_ACCESS_KEY_ID=easter-october-emphatic-tug-urgent-customer
# or this:
#export HTTPS_PASSWORD_FILE=/run/secrets/https_password
# optionally limit subset of services for testing # optionally limit subset of services for testing
#export SERVICES_OVERRIDE="ghost_domain_tld_app ghost_domain_tld_db" #export SERVICES_OVERRIDE="ghost_domain_tld_app ghost_domain_tld_db"

View File

@ -1,6 +0,0 @@
# Change log
## 2.0.0 (unreleased)
- Rewrite from Bash to Python
- Add support for push notifications (#24)

View File

@ -1,11 +0,0 @@
FROM docker:24.0.7-dind
RUN apk add --upgrade --no-cache restic bash python3 py3-pip py3-click py3-docker-py py3-json-logger curl
# Todo use requirements file with specific versions
RUN pip install --break-system-packages resticpy==1.0.2
COPY backupbot.py /usr/bin/backup
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT /entrypoint.sh

178
README.md
View File

@ -6,20 +6,6 @@ _This Time, It's Easily Configurable_
Automatically take backups from all volumes of running Docker Swarm services and runs pre- and post commands. Automatically take backups from all volumes of running Docker Swarm services and runs pre- and post commands.
<!-- metadata -->
* **Category**: Utilities
* **Status**: 0, work-in-progress
* **Image**: [`git.coopcloud.tech/coop-cloud/backup-bot-two`](https://git.coopcloud.tech/coop-cloud/-/packages/container/backup-bot-two), 4, upstream
* **Healthcheck**: No
* **Backups**: N/A
* **Email**: N/A
* **Tests**: No
* **SSO**: N/A
<!-- endmetadata -->
## Background ## Background
There are lots of Docker volume backup systems; all of them have one or both of these limitations: There are lots of Docker volume backup systems; all of them have one or both of these limitations:
@ -34,149 +20,28 @@ Backupbot II tries to help, by
### With Co-op Cloud ### With Co-op Cloud
1. Set up Docker Swarm and [`abra`][abra]
2. `abra app new backup-bot-two`
3. `abra app config <your-app-name>`, and set storage options. Either configure `CRON_SCHEDULE`, or set up `swarm-cronjob`
4. `abra app secret generate <your-app-name> restic-password v1`, optionally with `--pass` before `<your-app-name>` to save the generated secret in `pass`.
5. `abra app secret insert <your-app-name> ssh-key v1 ...` or similar, to load required secrets.
4. `abra app deploy <your-app-name>`
* `abra app new backup-bot-two` <!-- metadata -->
* `abra app config <app-name>`
- set storage options. Either configure `CRON_SCHEDULE`, or set up `swarm-cronjob` * **Category**: Utilities
* `abra app secret generate -a <backupbot_name>` * **Status**: 0, work-in-progress
* `abra app deploy <app-name>` * **Image**: [`thecoopcloud/backup-bot-two`](https://hub.docker.com/r/thecoopcloud/backup-bot-two), 4, upstream
* **Healthcheck**: No
* **Backups**: N/A
* **Email**: N/A
* **Tests**: No
* **SSO**: N/A
<!-- endmetadata -->
## Configuration ## Configuration
Per default Backupbot stores the backups locally in the repository `/backups/restic`, which is accessible as volume at `/var/lib/docker/volumes/<backupbot_name>_backups/_data/restic/`
The backup location can be changed using the `RESTIC_REPOSITORY` env variable.
### S3 Storage
To use S3 storage as backup location set the following envs:
```
RESTIC_REPOSITORY=s3:<S3-SERVICE-URL>/<BUCKET-NAME>
SECRET_AWS_SECRET_ACCESS_KEY_VERSION=v1
AWS_ACCESS_KEY_ID=<MY_ACCESS_KEY>
COMPOSE_FILE="$COMPOSE_FILE:compose.s3.yml"
```
and add your `<SECRET_ACCESS_KEY>` as docker secret:
`abra app secret insert <backupbot_name> aws_secret_access_key v1 <SECRET_ACCESS_KEY>`
See [restic s3 docs](https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html#amazon-s3) for more information.
### SFTP Storage
> With sftp it is not possible to prevent the backupbot from deleting backups in case of a compromised machine. Therefore we recommend to use S3, REST or rclone server without delete permissions.
To use SFTP storage as backup location set the following envs:
```
RESTIC_REPOSITORY=sftp:user@host:/restic-repo-path
SECRET_SSH_KEY_VERSION=v1
SSH_HOST_KEY="hostname ssh-rsa AAAAB3...
COMPOSE_FILE="$COMPOSE_FILE:compose.ssh.yml"
```
To get the `SSH_HOST_KEY` run the following command `ssh-keyscan <hostname>`
Generate an ssh keypair: `ssh-keygen -t ed25519 -f backupkey -P ''`
Add the key to your `authorized_keys`:
`ssh-copy-id -i backupkey <user>@<hostname>`
Add your `SSH_KEY` as docker secret:
```
abra app secret insert <backupbot_name> ssh_key v1 """$(cat backupkey)
"""
```
> Attention: This command needs to be executed exactly as stated above, because it places a trailing newline at the end, if this is missing you will get the following error: `Load key "/run/secrets/ssh_key": error in libcrypto`
### Restic REST server Storage
You can simply set the `RESTIC_REPOSITORY` variable to your REST server URL `rest:http://host:8000/`.
If you access the REST server with a password `rest:https://user:pass@host:8000/` you should hide the whole URL containing the password inside a secret.
Uncomment these lines:
```
SECRET_RESTIC_REPO_VERSION=v1
COMPOSE_FILE="$COMPOSE_FILE:compose.secret.yml"
```
Add your REST server url as secret:
```
abra app secret insert <backupbot_name> restic_repo v1 "rest:https://user:pass@host:8000/"
```
The secret will overwrite the `RESTIC_REPOSITORY` variable.
See [restic REST docs](https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html#rest-server) for more information.
## Push notifications
The following env variables can be used to setup push notifications for backups. `PUSH_URL_START` is requested just before the backups starts, `PUSH_URL_SUCCESS` is only requested if the backup was successful and if the backup fails `PUSH_URL_FAIL` will be requested.
Each variable is optional and independent of the other.
```
PUSH_URL_START=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=start
PUSH_URL_SUCCESS=https://status.example.com/api/push/xxxxxxxxxx?status=up&msg=OK
PUSH_URL_FAIL=https://status.example.com/api/push/xxxxxxxxxx?status=down&msg=fail
```
## Usage
Run the cronjob that creates a backup, including the push notifications and docker logging:
`abra app cmd <backupbot_name> app run_cron`
Create a backup of all apps:
`abra app run <backupbot_name> app -- backup create`
> The apps to backup up need to be deployed
Create an individual backup:
`abra app run <backupbot_name> app -- backup --host <target_app_name> create`
Create a backup to a local repository:
`abra app run <backupbot_name> app -- backup create -r /backups/restic`
> It is recommended to shutdown/undeploy an app before restoring the data
Restore the latest snapshot of all including apps:
`abra app run <backupbot_name> app -- backup restore`
Restore a specific snapshot of an individual app:
`abra app run <backupbot_name> app -- backup --host <target_app_name> restore --snapshot <snapshot_id>`
Show all snapshots:
`abra app run <backupbot_name> app -- backup snapshots`
Show all snapshots containing a specific app:
`abra app run <backupbot_name> app -- backup --host <target_app_name> snapshots`
Show all files inside the latest snapshot (can be very verbose):
`abra app run <backupbot_name> app -- backup ls`
Show specific files inside a selected snapshot:
`abra app run <backupbot_name> app -- backup ls --snapshot <snapshot_id> --path /var/lib/docker/volumes/`
Download files from a snapshot:
```
filename=$(abra app run <backupbot_name> app -- backup download --snapshot <snapshot_id> --path <absolute_path>)
abra app cp <backupbot_name> app:$filename .
```
## Run restic
```
abra app run <backupbot_name> app bash
export AWS_SECRET_ACCESS_KEY=$(cat $AWS_SECRET_ACCESS_KEY_FILE)
export RESTIC_PASSWORD=$(cat $RESTIC_PASSWORD_FILE)
restic snapshots
```
## Recipe Configuration
Like Traefik, or `swarm-cronjob`, Backupbot II uses access to the Docker socket to read labels from running Docker Swarm services: Like Traefik, or `swarm-cronjob`, Backupbot II uses access to the Docker socket to read labels from running Docker Swarm services:
``` ```
@ -195,4 +60,11 @@ services:
As in the above example, you can reference Docker Secrets, e.g. for looking up database passwords, by reading the files in `/run/secrets` directly. As in the above example, you can reference Docker Secrets, e.g. for looking up database passwords, by reading the files in `/run/secrets` directly.
## Development
1. Install `direnv`
2. `cp .envrc.sample .envrc`
3. Edit `.envrc` as appropriate, including setting `DOCKER_CONTEXT` to a remote Docker context, if you're not running a swarm server locally.
4. Run `./backup.sh` -- you can add the `--skip-backup` or `--skip-upload` options if you just want to test one other step
[abra]: https://git.autonomic.zone/autonomic-cooperative/abra [abra]: https://git.autonomic.zone/autonomic-cooperative/abra

12
abra.sh
View File

@ -1,10 +1,2 @@
export BACKUPBOT_VERSION=v1 export ENTRYPOINT_VERSION=v1
export SSH_CONFIG_VERSION=v1 export BACKUP_VERSION=v1
run_cron () {
schedule="$(crontab -l | tr -s " " | cut -d ' ' -f-5)"
rm -f /tmp/backup.log
echo "* * * * * $(crontab -l | tr -s " " | cut -d ' ' -f6-)" | crontab -
while [ ! -f /tmp/backup.log ]; do sleep 1; done
echo "$schedule $(crontab -l | tr -s " " | cut -d ' ' -f6-)" | crontab -
}

119
backup.sh Executable file
View File

@ -0,0 +1,119 @@
#!/bin/bash
set -e
server_name="${SERVER_NAME:?SERVER_NAME not set}"
restic_password_file="${RESTIC_PASSWORD_FILE:?RESTIC_PASSWORD_FILE not set}"
restic_host="${RESTIC_HOST:?RESTIC_HOST not set}"
backup_paths=()
# shellcheck disable=SC2153
ssh_key_file="${SSH_KEY_FILE}"
s3_key_file="${AWS_SECRET_ACCESS_KEY_FILE}"
restic_repo=
restic_extra_options=
if [ -n "$ssh_key_file" ] && [ -f "$ssh_key_file" ]; then
restic_repo="sftp:$restic_host:/$server_name"
# Only check server against provided SSH_HOST_KEY, if set
if [ -n "$SSH_HOST_KEY" ]; then
tmpfile=$(mktemp)
echo "$SSH_HOST_KEY" >>"$tmpfile"
echo "using host key $SSH_HOST_KEY"
ssh_options="-o 'UserKnownHostsFile $tmpfile'"
elif [ "$SSH_HOST_KEY_DISABLE" = "1" ]; then
echo "disabling SSH host key checking"
ssh_options="-o 'StrictHostKeyChecking=No'"
else
echo "neither SSH_HOST_KEY nor SSH_HOST_KEY_DISABLE set"
fi
restic_extra_options="sftp.command=ssh $ssh_options -i $ssh_key_file $restic_host -s sftp"
fi
if [ -n "$s3_key_file" ] && [ -f "$s3_key_file" ] && [ -n "$AWS_ACCESS_KEY_ID" ]; then
AWS_SECRET_ACCESS_KEY="$(cat "${s3_key_file}")"
export AWS_SECRET_ACCESS_KEY
restic_repo="s3:$restic_host:/$server_name"
fi
if [ -z "$restic_repo" ]; then
echo "you must configure either SFTP or S3 storage, see README"
exit 1
fi
echo "restic_repo: $restic_repo"
# Pre-bake-in some default restic options
_restic() {
if [ -z "$restic_extra_options" ]; then
# shellcheck disable=SC2068
restic -p "$restic_password_file" \
--quiet -r "$restic_repo" \
$@
else
# shellcheck disable=SC2068
restic -p "$restic_password_file" \
--quiet -r "$restic_repo" \
-o "$restic_extra_options" \
$@
fi
}
if [ -n "$SERVICES_OVERRIDE" ]; then
# this is fine because docker service names should never include spaces or
# glob characters
# shellcheck disable=SC2206
services=($SERVICES_OVERRIDE)
else
mapfile -t services < <(docker service ls --format '{{ .Name }}')
fi
post_commands=()
if [[ \ $*\ != *\ --skip-backup\ * ]]; then
for service in "${services[@]}"; do
echo "service: $service"
details=$(docker service inspect "$service" --format "{{ json .Spec.Labels }}")
if echo "$details" | jq -r '.["backupbot.backup"]' | grep -q 'true'; then
pre=$(echo "$details" | jq -r '.["backupbot.backup.pre-hook"]')
post=$(echo "$details" | jq -r '.["backupbot.backup.post-hook"]')
container=$(docker container ls -f "name=$service" --format '{{ .ID }}')
stack_name=$(echo "$details" | jq -r '.["com.docker.stack.namespace"]')
if [ "$pre" != "null" ]; then
# run the precommand
echo "executing precommand $pre in container $container"
docker exec "$container" sh -c "$pre"
fi
if [ "$post" != "null" ]; then
# append post command
post_commands+=("docker exec $container sh -c \"$post\"")
fi
# add volume paths to backup path
backup_paths+=(/var/lib/docker/volumes/"${stack_name}"_*)
fi
done
# check if restic repo exists, initialise if not
if [ -z "$(_restic cat config)" ] 2>/dev/null; then
echo "initializing restic repo"
_restic init
fi
fi
if [[ \ $*\ != *\ --skip-upload\ * ]]; then
echo "${backup_paths[@]}"
_restic backup --host "$server_name" --tag coop-cloud "${backup_paths[@]}"
fi
# run post commands
for post in "${post_commands[@]}"; do
echo "executing postcommand $post"
eval "$post"
done

View File

@ -1,501 +0,0 @@
#!/usr/bin/python3
import os
import sys
import click
import json
import subprocess
import logging
import docker
import restic
import tarfile
import io
from pythonjsonlogger import jsonlogger
from datetime import datetime, timezone
from restic.errors import ResticFailedError
from pathlib import Path
from shutil import copyfile, rmtree
VOLUME_PATH = "/var/lib/docker/volumes/"
SECRET_PATH = '/secrets/'
SERVICE = 'ALL'
logger = logging.getLogger("backupbot")
logging.addLevelName(55, 'SUMMARY')
setattr(logging, 'SUMMARY', 55)
setattr(logger, 'summary', lambda message, *args, **
kwargs: logger.log(55, message, *args, **kwargs))
def handle_exception(exc_type, exc_value, exc_traceback):
if issubclass(exc_type, KeyboardInterrupt):
sys.__excepthook__(exc_type, exc_value, exc_traceback)
return
logger.critical("Uncaught exception", exc_info=(
exc_type, exc_value, exc_traceback))
sys.excepthook = handle_exception
@click.group()
@click.option('-l', '--log', 'loglevel')
@click.option('-m', '--machine-logs', 'machine_logs', is_flag=True)
@click.option('service', '--host', '-h', envvar='SERVICE')
@click.option('repository', '--repo', '-r', envvar='RESTIC_REPOSITORY')
def cli(loglevel, service, repository, machine_logs):
global SERVICE
if service:
SERVICE = service.replace('.', '_')
if repository:
os.environ['RESTIC_REPOSITORY'] = repository
if loglevel:
numeric_level = getattr(logging, loglevel.upper(), None)
if not isinstance(numeric_level, int):
raise ValueError('Invalid log level: %s' % loglevel)
logger.setLevel(numeric_level)
logHandler = logging.StreamHandler()
if machine_logs:
formatter = jsonlogger.JsonFormatter(
"%(levelname)s %(filename)s %(lineno)s %(process)d %(message)s", rename_fields={"levelname": "message_type"})
logHandler.setFormatter(formatter)
logger.addHandler(logHandler)
export_secrets()
init_repo()
def init_repo():
if repo:= os.environ.get('RESTIC_REPOSITORY_FILE'):
# RESTIC_REPOSITORY_FILE and RESTIC_REPOSITORY are mutually exclusive
del os.environ['RESTIC_REPOSITORY']
else:
repo = os.environ['RESTIC_REPOSITORY']
restic.repository = repo
logger.debug(f"set restic repository location: {repo}")
restic.password_file = '/var/run/secrets/restic_password'
try:
restic.cat.config()
except ResticFailedError as error:
if 'unable to open config file' in str(error):
result = restic.init()
logger.info(f"Initialized restic repo: {result}")
else:
raise error
def export_secrets():
for env in os.environ:
if env.endswith('FILE') and not "COMPOSE_FILE" in env:
logger.debug(f"exported secret: {env}")
with open(os.environ[env]) as file:
secret = file.read()
os.environ[env.removesuffix('_FILE')] = secret
# logger.debug(f"Read secret value: {secret}")
@cli.command()
@click.option('retries', '--retries', '-r', envvar='RETRIES', default=1)
def create(retries):
app_settings = parse_backup_labels()
pre_commands, post_commands, backup_paths, apps = get_backup_details(app_settings)
copy_secrets(apps)
backup_paths.append(Path(SECRET_PATH))
run_commands(pre_commands)
backup_volumes(backup_paths, apps, int(retries))
run_commands(post_commands)
@cli.command()
@click.option('snapshot', '--snapshot', '-s', envvar='SNAPSHOT', default='latest')
@click.option('target', '--target', '-t', envvar='TARGET', default='/')
@click.option('noninteractive', '--noninteractive', envvar='NONINTERACTIVE', is_flag=True)
@click.option('volumes', '--volumes', '-v', envvar='VOLUMES', multiple=True)
@click.option('container', '--container', '-c', envvar='CONTAINER', multiple=True)
@click.option('no_commands', '--no-commands', envvar='NO_COMMANDS', is_flag=True)
def restore(snapshot, target, noninteractive, volumes, container, no_commands):
app_settings = parse_backup_labels('restore', container)
if SERVICE != 'ALL':
app_settings = {SERVICE: app_settings[SERVICE]}
pre_commands, post_commands, backup_paths, apps = get_backup_details(app_settings, volumes)
snapshots = get_snapshots(snapshot_id=snapshot)
if not snapshot:
logger.error("No Snapshots with ID {snapshots} for {apps} found.")
exit(1)
if not noninteractive:
snapshot_date = datetime.fromisoformat(snapshots[0]['time'])
delta = datetime.now(tz=timezone.utc) - snapshot_date
print(f"You are going to restore Snapshot {snapshot} of {apps} at {target}")
print("The following volume paths will be restored:")
for p in backup_paths:
print(f'\t{p}')
if not no_commands:
print("The following commands will be executed:")
for container, cmd in list(pre_commands.items()) + list(post_commands.items()):
print(f"\t{container.labels['com.docker.swarm.service.name']}:\t{cmd}")
print(f"This snapshot is {delta} old")
print("\nTHIS COMMAND WILL IRREVERSIBLY OVERWRITES FILES")
prompt = input("Type YES (uppercase) to continue: ")
if prompt != 'YES':
logger.error("Restore aborted")
exit(1)
print(f"Restoring Snapshot {snapshot} at {target}")
if not no_commands and pre_commands:
print(f"Run pre commands.")
run_commands(pre_commands)
result = restic_restore(snapshot_id=snapshot, include=backup_paths, target_dir=target)
if not no_commands and post_commands:
print(f"Run post commands.")
run_commands(post_commands)
logger.debug(result)
def restic_restore(snapshot_id='latest', include=[], target_dir=None):
cmd = restic.cat.base_command() + ['restore', snapshot_id]
for path in include:
cmd.extend(['--include', path])
if target_dir:
cmd.extend(['--target', target_dir])
return restic.internal.command_executor.execute(cmd)
def get_snapshots(snapshot_id=None):
if snapshot_id and snapshot_id != 'latest':
snapshots = restic.snapshots(snapshot_id=snapshot_id)
if SERVICE not in snapshots[0]['tags']:
logger.error(f'Snapshot with ID {snapshot_id} does not contain {SERVICE}')
exit(1)
else:
snapshots = restic.snapshots()
snapshots = list(filter(lambda x: x.get('tags') and SERVICE in x.get('tags'), snapshots))
if snapshot_id == 'latest':
return snapshots[-1:]
else:
return snapshots
def parse_backup_labels(hook_type='backup', selected_container=[]):
client = docker.from_env()
container_by_service = {
c.labels.get('com.docker.swarm.service.name'): c for c in client.containers.list()}
services = client.services.list()
app_settings = {}
for s in services:
specs = s.attrs['Spec']
labels = specs['Labels']
stack_name = labels['com.docker.stack.namespace']
container_name = s.name.removeprefix(f"{stack_name}_")
settings = app_settings[stack_name] = app_settings.get(stack_name) or {}
if (backup := labels.get('backupbot.backup')) and bool(backup):
settings['enabled'] = True
if selected_container and container_name not in selected_container:
logger.debug(f"Skipping {s.name} because it's not a selected container")
continue
if mounts:= specs['TaskTemplate']['ContainerSpec'].get('Mounts'):
volumes = parse_volumes(stack_name, mounts)
volumes.update(settings.get('volumes') or {})
settings['volumes'] = volumes
excluded_volumes, included_volume_paths = parse_excludes_includes(labels)
settings['excluded_volumes'] = excluded_volumes.union(settings.get('excluded_volumes') or set())
settings['included_volume_paths'] = included_volume_paths.union(settings.get('included_volume_paths') or set())
if container := container_by_service.get(s.name):
if command := labels.get(f'backupbot.{hook_type}.pre-hook'):
if not (pre_hooks:= settings.get('pre_hooks')):
pre_hooks = settings['pre_hooks'] = {}
pre_hooks[container] = command
if command := labels.get(f'backupbot.{hook_type}.post-hook'):
if not (post_hooks:= settings.get('post_hooks')):
post_hooks = settings['post_hooks'] = {}
post_hooks[container] = command
else:
logger.debug(f"Container {s.name} is not running.")
if labels.get(f'backupbot.{hook_type}.pre-hook') or labels.get(f'backupbot.{hook_type}.post-hook'):
logger.error(f"Container {s.name} contain hooks but it's not running")
return app_settings
def get_backup_details(app_settings, volumes=[]):
backup_paths = set()
backup_apps = []
pre_hooks= {}
post_hooks = {}
for app, settings in app_settings.items():
if settings.get('enabled'):
if SERVICE != 'ALL' and SERVICE != app:
continue
backup_apps.append(app)
add_backup_paths(backup_paths, settings, app, volumes)
if hooks:= settings.get('pre_hooks'):
pre_hooks.update(hooks)
if hooks:= settings.get('post_hooks'):
post_hooks.update(hooks)
return pre_hooks, post_hooks, list(backup_paths), backup_apps
def add_backup_paths(backup_paths, settings, app, selected_volumes):
if (volumes := settings.get('volumes')):
if includes:= settings.get('included_volume_paths'):
included_volumes = list(zip(*includes))[0]
for volume, rel_paths in includes:
if not (volume_path:= volumes.get(volume)):
logger.error(f'Can not find volume with the name {volume}')
continue
if selected_volumes and volume not in selected_volumes:
logger.debug(f'Skipping {volume}:{rel_paths} because the volume is not selected')
continue
for p in rel_paths:
absolute_path = Path(f"{volume_path}/{p}")
backup_paths.add(absolute_path)
else:
included_volumes = []
excluded_volumes = settings.get('excluded_volumes') or []
for name, path in volumes.items():
if selected_volumes and name not in selected_volumes:
logger.debug(f'Skipping volume: {name} because the volume is not selected')
continue
if name in excluded_volumes:
logger.debug(f'Skipping volume: {name} because the volume is excluded')
continue
if name in included_volumes:
logger.debug(f'Skipping volume: {name} because a path is selected')
continue
backup_paths.add(path)
else:
logger.warning(f"{app} does not contain any volumes")
def parse_volumes(stack_name, mounts):
volumes = {}
for m in mounts:
if m['Type'] != 'volume':
continue
relative_path = m['Source']
name = relative_path.removeprefix(stack_name + '_')
absolute_path = Path(f"{VOLUME_PATH}{relative_path}/_data/")
volumes[name] = absolute_path
return volumes
def parse_excludes_includes(labels):
excluded_volumes = set()
included_volume_paths = set()
for label, value in labels.items():
if label.startswith('backupbot.backup.volumes.'):
volume_name = label.removeprefix('backupbot.backup.volumes.').removesuffix('.path')
if label.endswith('path'):
relative_paths = tuple(value.split(','))
included_volume_paths.add((volume_name, relative_paths))
elif bool(value):
excluded_volumes.add(volume_name)
return excluded_volumes, included_volume_paths
def copy_secrets(apps):
# TODO: check if it is deployed
rmtree(SECRET_PATH, ignore_errors=True)
os.mkdir(SECRET_PATH)
client = docker.from_env()
container_by_service = {
c.labels.get('com.docker.swarm.service.name'): c for c in client.containers.list()}
services = client.services.list()
for s in services:
app_name = s.attrs['Spec']['Labels']['com.docker.stack.namespace']
if (app_name in apps and
(app_secs := s.attrs['Spec']['TaskTemplate']['ContainerSpec'].get('Secrets'))):
if not container_by_service.get(s.name):
logger.warning(
f"Container {s.name} is not running, secrets can not be copied.")
continue
container_id = container_by_service[s.name].id
for sec in app_secs:
src = f'/var/lib/docker/containers/{container_id}/mounts/secrets/{sec["SecretID"]}'
if not Path(src).exists():
logger.error(
f"For the secret {sec['SecretName']} the file {src} does not exist for {s.name}")
continue
dst = SECRET_PATH + sec['SecretName']
logger.debug(f"Copy Secret {sec['SecretName']}")
copyfile(src, dst)
def run_commands(commands):
for container, command in commands.items():
if not command:
continue
# Remove bash/sh wrapping
command = command.removeprefix('bash -c').removeprefix('sh -c').removeprefix(' ')
# Remove quotes surrounding the command
if (len(command) >= 2 and command[0] == command[-1] and (command[0] == "'" or command[0] == '"')):
command = command[1:-1]
# Use bash's pipefail to return exit codes inside a pipe to prevent silent failure
command = f"bash -c 'set -o pipefail;{command}'"
logger.info(f"run command in {container.name}:")
logger.info(command)
result = container.exec_run(command)
if result.exit_code:
logger.error(
f"Failed to run command {command} in {container.name}: {result.output.decode()}")
else:
logger.info(result.output.decode())
def backup_volumes(backup_paths, apps, retries, dry_run=False):
while True:
try:
logger.info("Backup these paths:")
logger.debug("\n".join(map(str, backup_paths)))
backup_paths = list(filter(path_exists, backup_paths))
cmd = restic.cat.base_command()
parent = get_snapshots('latest')
if parent:
# https://restic.readthedocs.io/en/stable/040_backup.html#file-change-detection
cmd.extend(['--parent', parent[0]['short_id']])
tags = set(apps + [SERVICE])
logger.info("Start volume backup")
result = restic.internal.backup.run(cmd, backup_paths, dry_run=dry_run, tags=tags)
logger.summary("backup finished", extra=result)
return
except ResticFailedError as error:
logger.error(f"Backup failed for {SERVICE}.")
logger.error(error, exc_info=True)
if retries > 0:
retries -= 1
else:
exit(1)
def path_exists(path):
if not path.exists():
logger.error(f'{path} does not exist')
return path.exists()
@cli.command()
def snapshots():
snapshots = get_snapshots()
for snap in snapshots:
print(snap['time'], snap['id'])
if not snapshots:
err_msg = "No Snapshots found"
if SERVICE != 'ALL':
service_name = SERVICE.replace('_', '.')
err_msg += f' for app {service_name}'
logger.warning(err_msg)
@cli.command()
@click.option('snapshot', '--snapshot', '-s', envvar='SNAPSHOT', default='latest')
@click.option('path', '--path', '-p', envvar='INCLUDE_PATH')
def ls(snapshot, path):
results = list_files(snapshot, path)
for r in results:
if r.get('path'):
print(f"{r['ctime']}\t{r['path']}")
def list_files(snapshot, path):
cmd = restic.cat.base_command() + ['ls']
cmd = cmd + ['--tag', SERVICE]
cmd.append(snapshot)
if path:
cmd.append(path)
try:
output = restic.internal.command_executor.execute(cmd)
except ResticFailedError as error:
if 'no snapshot found' in str(error):
err_msg = f'There is no snapshot "{snapshot}"'
if SERVICE != 'ALL':
err_msg += f' for the app "{SERVICE}"'
logger.error(err_msg)
exit(1)
else:
raise error
output = output.replace('}\n{', '}|{')
results = list(map(json.loads, output.split('|')))
return results
@cli.command()
@click.option('snapshot', '--snapshot', '-s', envvar='SNAPSHOT', default='latest')
@click.option('path', '--path', '-p', envvar='INCLUDE_PATH')
@click.option('volumes', '--volumes', '-v', envvar='VOLUMES')
@click.option('secrets', '--secrets', '-c', is_flag=True, envvar='SECRETS')
def download(snapshot, path, volumes, secrets):
file_dumps = []
if not any([path, volumes, secrets]):
volumes = secrets = True
if path:
path = path.removesuffix('/')
binary_output = dump(snapshot, path)
files = list_files(snapshot, path)
filetype = [f.get('type') for f in files if f.get('path') == path][0]
filename = Path(path).name
if filetype == 'dir':
filename = filename + ".tar"
tarinfo = tarfile.TarInfo(name=filename)
tarinfo.size = len(binary_output)
file_dumps.append((binary_output, tarinfo))
if volumes:
if SERVICE == 'ALL':
logger.error("Please specify '--host' when using '--volumes'")
exit(1)
files = list_files(snapshot, VOLUME_PATH)
for f in files[1:]:
path = f['path']
if Path(path).name.startswith(SERVICE) and f['type'] == 'dir':
binary_output = dump(snapshot, path)
filename = f"{Path(path).name}.tar"
tarinfo = tarfile.TarInfo(name=filename)
tarinfo.size = len(binary_output)
file_dumps.append((binary_output, tarinfo))
if secrets:
if SERVICE == 'ALL':
logger.error("Please specify '--host' when using '--secrets'")
exit(1)
filename = f"{SERVICE}.json"
files = list_files(snapshot, SECRET_PATH)
secrets = {}
for f in files[1:]:
path = f['path']
if Path(path).name.startswith(SERVICE) and f['type'] == 'file':
secret = dump(snapshot, path).decode()
secret_name = path.removeprefix(f'{SECRET_PATH}{SERVICE}_')
secrets[secret_name] = secret
binary_output = json.dumps(secrets).encode()
tarinfo = tarfile.TarInfo(name=filename)
tarinfo.size = len(binary_output)
file_dumps.append((binary_output, tarinfo))
with tarfile.open('/tmp/backup.tar.gz', "w:gz") as tar:
print(f"Writing files to /tmp/backup.tar.gz...")
for binary_output, tarinfo in file_dumps:
tar.addfile(tarinfo, fileobj=io.BytesIO(binary_output))
size = get_formatted_size('/tmp/backup.tar.gz')
print(
f"Backup has been written to /tmp/backup.tar.gz with a size of {size}")
def get_formatted_size(file_path):
file_size = os.path.getsize(file_path)
units = ['Bytes', 'KB', 'MB', 'GB', 'TB']
for unit in units:
if file_size < 1024:
return f"{round(file_size, 3)} {unit}"
file_size /= 1024
return f"{round(file_size, 3)} {units[-1]}"
def dump(snapshot, path):
cmd = restic.cat.base_command() + ['dump']
cmd = cmd + ['--tag', SERVICE]
cmd = cmd + [snapshot, path]
print(f"Dumping {path} from snapshot '{snapshot}'")
output = subprocess.run(cmd, capture_output=True)
if output.returncode:
logger.error(
f"error while dumping {path} from snapshot '{snapshot}': {output.stderr}")
exit(1)
return output.stdout
if __name__ == '__main__':
cli()

View File

@ -1,13 +0,0 @@
---
version: "3.8"
services:
app:
environment:
- RESTIC_REPOSITORY_FILE=/run/secrets/restic_repo
secrets:
- restic_repo
secrets:
restic_repo:
external: true
name: ${STACK_NAME}_restic_repo_${SECRET_RESTIC_REPO_VERSION}

View File

@ -5,19 +5,12 @@ services:
environment: environment:
- SSH_KEY_FILE=/run/secrets/ssh_key - SSH_KEY_FILE=/run/secrets/ssh_key
- SSH_HOST_KEY - SSH_HOST_KEY
- SSH_HOST_KEY_DISABLE
secrets: secrets:
- source: ssh_key - source: ssh_key
mode: 0400 mode: 0400
configs:
- source: ssh_config
target: /root/.ssh/config
secrets: secrets:
ssh_key: ssh_key:
external: true external: true
name: ${STACK_NAME}_ssh_key_${SECRET_SSH_KEY_VERSION} name: ${STACK_NAME}_ssh_key_${SECRET_SSH_KEY_VERSION}
configs:
ssh_config:
name: ${STACK_NAME}_ssh_config_${SSH_CONFIG_VERSION}
file: ssh_config

View File

@ -2,35 +2,41 @@
version: "3.8" version: "3.8"
services: services:
app: app:
image: git.coopcloud.tech/coop-cloud/backup-bot-two:2.1.1-beta image: docker:24.0.2-dind
volumes: volumes:
- "/var/run/docker.sock:/var/run/docker.sock" - "/var/run/docker.sock:/var/run/docker.sock"
- "/var/lib/docker/volumes/:/var/lib/docker/volumes/" - "/var/lib/docker/volumes/:/var/lib/docker/volumes/:ro"
- "/var/lib/docker/containers/:/var/lib/docker/containers/:ro"
- backups:/backups
environment: environment:
- CRON_SCHEDULE - CRON_SCHEDULE
- RESTIC_REPOSITORY - RESTIC_REPO
- RESTIC_PASSWORD_FILE=/run/secrets/restic_password - RESTIC_PASSWORD_FILE=/run/secrets/restic_password
- BACKUP_DEST=/backups
- RESTIC_HOST
- SERVER_NAME
- REMOVE_BACKUP_VOLUME_AFTER_UPLOAD=1
secrets: secrets:
- restic_password - restic_password
deploy: deploy:
labels: labels:
- coop-cloud.${STACK_NAME}.version=2.0.1+2.1.1-beta - coop-cloud.${STACK_NAME}.version=0.1.0+latest
- coop-cloud.${STACK_NAME}.timeout=${TIMEOUT:-300} configs:
- coop-cloud.backupbot.enabled=true - source: entrypoint
#entrypoint: ['tail', '-f','/dev/null'] target: /entrypoint.sh
healthcheck: mode: 0555
test: "pgrep crond" - source: backup
interval: 30s target: /backup.sh
timeout: 10s mode: 0555
retries: 10 entrypoint: ['/entrypoint.sh']
start_period: 5m
secrets: secrets:
restic_password: restic_password:
external: true external: true
name: ${STACK_NAME}_restic_password_${SECRET_RESTIC_PASSWORD_VERSION} name: ${STACK_NAME}_restic_password_${SECRET_RESTIC_PASSWORD_VERSION}
volumes: configs:
backups: entrypoint:
name: ${STACK_NAME}_entrypoint_${ENTRYPOINT_VERSION}
file: entrypoint.sh
backup:
name: ${STACK_NAME}_backup_${BACKUP_VERSION}
file: backup.sh

22
entrypoint.sh Executable file → Normal file
View File

@ -2,29 +2,11 @@
set -e set -e
if [ -n "$SSH_HOST_KEY" ] apk add --upgrade --no-cache bash curl jq restic
then
echo "$SSH_HOST_KEY" > /root/.ssh/known_hosts
fi
cron_schedule="${CRON_SCHEDULE:?CRON_SCHEDULE not set}" cron_schedule="${CRON_SCHEDULE:?CRON_SCHEDULE not set}"
if [ -n "$PUSH_URL_START" ] echo "$cron_schedule /backup.sh" | crontab -
then
push_start_notification="curl -s '$PUSH_URL_START' &&"
fi
if [ -n "$PUSH_URL_FAIL" ]
then
push_fail_notification="|| curl -s '$PUSH_URL_FAIL'"
fi
if [ -n "$PUSH_URL_SUCCESS" ]
then
push_notification=" && (grep -q 'backup finished' /tmp/backup.log && curl -s '$PUSH_URL_SUCCESS' $push_fail_notification)"
fi
echo "$cron_schedule $push_start_notification backup --machine-logs create 2>&1 | tee /tmp/backup.log $push_notification" | crontab -
crontab -l crontab -l
crond -f -d8 -L /dev/stdout crond -f -d8 -L /dev/stdout

View File

@ -1 +0,0 @@
This is the first beta release of the new backup-bot-two rewrite in python. Be aware when updating, it can break. Please read the readme and update your config according to it.

View File

@ -1,3 +0,0 @@
Breaking Change: the variables `SERVER_NAME` and `RESTIC_HOST` are merged into `RESTIC_REPOSITORY`. The format can be looked up here: https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html
ssh/sftp: `sftp:user@host:/repo-path`
S3: `s3:https://s3.example.com/bucket_name`

View File

@ -1,4 +0,0 @@
Host *
IdentityFile /run/secrets/ssh_key
ServerAliveInterval 60
ServerAliveCountMax 240