WIP: Backup volumes from /var/lib/docker/volumes instead of copying individual paths #16

Closed
yksflip wants to merge 4 commits from backup_volumes into main
Owner

So we realized that copying several hundred gigabytes of data just for backup purposes
is a lot for our infrastructure. This is why we wanted to get rid of the docker cp stuff.
We investigated the current use of backupbot in all recipes and there are actually more or
less two use-cases:

  1. Back up stuff that is already in a volume
  2. Make a db-dump

So we thought we could reduce complexity by backing up the /var/lib/docker/volumes/
path instead. Backupbot will still look for a backupbot.backup = 'true' label,
and select all volumes prefixed with the stack_name (involves bash globbing magic).
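
A minimal sketch of what the stack_name globbing could look like, for illustration only. It is not the code from this PR: list_stack_volumes is a hypothetical helper, and the "<stack_name>_<volume>" directory naming under the volumes root is an assumption.

```shell
# Sketch only: print the host directories of all volumes whose name starts
# with the given stack name. Helper name and naming scheme are assumptions.
list_stack_volumes() {
    local stack_name="$1"
    # Second argument is optional so the root can be overridden for testing.
    local volumes_root="${2:-/var/lib/docker/volumes}"
    local dir
    # The variable parts are quoted, so only the trailing * is expanded.
    for dir in "${volumes_root}/${stack_name}"_*/; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dir" ] || continue
        # Strip the trailing slash that the glob pattern added.
        printf '%s\n' "${dir%/}"
    done
}
```

Calling list_stack_volumes with a stack name prints one matching volume directory per line, and prints nothing when no volume matches.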

We considered running docker run -v ...:... for every service too, but decided that just
copying from the host filesystem is much simpler and comes with no other costs (that we can see so far).

Secondly, we removed the Dockerfile as it does only very few things. For now it's just
simpler to have bash, jq etc. installed in the entrypoint. We don't have to build an image,
no more frankenstein-repo, versioning just like any other recipe ...

Before releasing a new version of this, we'd have to adjust all Recipes including backupbot, to:

  • Save db-dumps in volumes (e.g. /var/lib/postgresql/data/)
  • Remove .paths labels
yksflip added 1 commit 2023-06-05 08:51:09 +00:00
continuous-integration/drone/pr Build is failing Details
eda232819c
Backup volumes from host instead of copying paths
* Backupbot will now copy all volumes from a service with
  backupbot.enabled = 'true' label from the /var/lib/docker/volumes/
  path directly. This reduces the resource overhead of copying
  stuff from one volume to another.
  Recipes need to be adjusted so that db-dumps are saved into a volume
  now!
* Remove the Dockerfile and move stuff into an entrypoint. This
  simplifies the whole versioning thing and makes this "just"
  a recipe

Co-authored-by: Moritz <moritz.m@local-it.org>
yksflip changed title from Backup volumes from /var/lib/docker/volumes instead of copying individual paths to WIP: Backup volumes from /var/lib/docker/volumes instead of copying individual paths 2023-06-05 09:00:54 +00:00
yksflip force-pushed backup_volumes from eda232819c to 24d2c0e85b 2023-06-05 09:16:03 +00:00 Compare
yksflip changed title from WIP: Backup volumes from /var/lib/docker/volumes instead of copying individual paths to Backup volumes from /var/lib/docker/volumes instead of copying individual paths 2023-06-05 09:49:10 +00:00
moritz requested changes 2023-06-05 15:46:52 +00:00
@@ -7,3 +6,3 @@
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
- "backups:/backups"
- "/var/lib/docker/volumes/:/var/lib/docker/volumes/"
Member

Mounting the volumes read-only could prevent any kind of unexpected modifications:
"/var/lib/docker/volumes/:/var/lib/docker/volumes/:ro"

yksflip marked this conversation as resolved
Owner

Sorry folks, I'm not deep enough into using backup-bot-two to grok much of this. I think @3wordchant and autonomicz will have more context to help review this. Cool to see the approach adapting to larger scale.

yksflip added 1 commit 2023-06-13 09:48:32 +00:00
continuous-integration/drone/pr Build is failing Details
9cb3a469f3
mount volume ro
yksflip added 1 commit 2023-06-13 09:57:49 +00:00
continuous-integration/drone/pr Build is passing Details
50b317c12d
fix shellcheck prevent globbing
yksflip added 1 commit 2023-06-13 12:35:00 +00:00
continuous-integration/drone/pr Build is passing Details
e360c3d8f8
remove unused traefik labels
Author
Owner

I just realised we'll still need the .paths labels for the abra app backup functionality, or we'd have to alter abra so that it just backs up volumes too ...

Owner

> I just realised we'll still need the .paths labels for the abra app backup functionality, or we'd have to alter abra so that it just backs up volumes too ...

Ah yes. That explains my weird hesitation about this otherwise-great change 🤔

TBH the abra backup functionality as it currently stands is bad for a lot of cases: once there's more than a trivial amount of data, copying using docker cp is very painful because of the lack of compression (see #324).
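
To illustrate the compression point, here is a minimal sketch of packing a directory into a gzip-compressed tar archive on the host. It is only an illustration and not how abra or backupbot work: archive_dir is a hypothetical helper, and the source directory and output path are placeholders chosen by the caller.

```shell
# Sketch only: pack a directory into a gzip-compressed tar archive.
# archive_dir is a hypothetical helper, not an abra or backupbot command.
archive_dir() {
    local src_dir="$1"
    local out_file="$2"
    # -C switches into the source directory so the archive stores relative paths;
    # -z compresses the stream with gzip while it is being written.
    tar -czf "$out_file" -C "$src_dir" .
}
```

The output file should live outside the source directory, otherwise tar would try to include the archive in itself.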

The Heroku CLI -- which was the inspiration for a lot of the initial abra functionality -- does backups server-side. Running non-Docker stuff on the remote server would be a reasonably significant design change, but maybe now's the time? Perhaps we can discuss in "Co-op Cloud Tech", or on a call soon?

Author
Owner

> The Heroku CLI -- which was the inspiration for a lot of the initial abra functionality -- does backups server-side. Running non-Docker stuff on the remote server would be a reasonably significant design change, but maybe now's the time? Perhaps we can discuss in "Co-op Cloud Tech", or on a call soon?

I'd still count this approach as docker-stuff, but it's vague :D
Yes let's have a chat/call!

> I just realised we'll still need the .paths labels for the abra app backup functionality, or we'd have to alter abra so that it just backs up volumes too ...

Regarding this, I thought we could just keep the .paths labels in the recipes, but point everything to a volume folder ... So we can have both for the moment, until we find a better solution.

yksflip added this to the (deleted) project 2023-08-26 09:15:19 +00:00
yksflip modified the project from (deleted) to backupbot revolution 2023-08-26 09:23:23 +00:00
moritz changed title from Backup volumes from /var/lib/docker/volumes instead of copying individual paths to WIP: Backup volumes from /var/lib/docker/volumes instead of copying individual paths 2023-09-07 13:35:52 +00:00
Member
Moved to https://git.coopcloud.tech/coop-cloud/backup-bot-two/src/branch/backupbot_revolution
moritz closed this pull request 2023-09-07 13:36:53 +00:00
All checks were successful
continuous-integration/drone/pr Build is passing

Pull request closed

Reference: coop-cloud/backup-bot-two#16