does backupbot.backup.path need to be backwards compatible? i.e. should we make a new label? #41

Open
opened 2023-11-12 01:05:13 +00:00 by 3wordchant · 10 comments
Owner

In working on #26, I am recommending preferring database dump backups over backups of database data volumes.

Because we are looking at paths within volumes, instead of within containers, path specifications need to change.

For instance, for a mysql / mariadb container with a pre-backup hook to dump to dump.sql.gz in the data volume, the path label would need to change from:

backupbot.backup.path=/var/lib/mysql/dump.sql.gz

to

backupbot.backup.path=dump.sql.gz

This will prevent Cool Original backup-bot-two from being able to back up, as /dump.sql.gz doesn't exist in the container.

Is it OK to break backwards compatibility in this way, or should we choose a new label name for these relative paths? If we should choose a new label, what should we call it?

CC @moritz

In working on #26, I am recommending preferring database dump backups over backups of database data volumes. Because we are looking at paths within volumes, instead of within containers, path specifications need to change. For instance, for a mysql / mariadb container with a `pre-backup` hook to dump to `dump.sql.gz` in the data volume, the `path` label would need to change from: `backupbot.backup.path=/var/lib/mysql/dump.sql.gz` to `backupbot.backup.path=dump.sql.gz` This will prevent Cool Original backup-bot-two from being able to back up, as `/dump.sql.gz` doesn't exist in the container. Is it OK to break backwards compatibility in this way, or should we choose a new label name for these relative paths? If we should choose a new label, what should we call it? CC @moritz
3wordchant added this to the backupbot revolution project 2023-11-12 01:05:18 +00:00
Member

How would backupbot.backup.path=dump.sql.gz when there are multiple volumes for the container? I think it needs to be explicitly stated which volume the path corresponds to. The syntax could be either something like

backupbot.backup.path=myvolume:dump.sql.gz

or something like

backupbot.backup.volumes.myvolume=dump.sql.gz

Another question is what happens, when a container has two volumes attached, but only one has a backup path specified? Does the other volume get backed up? Personally I would prefer it, when a path is given only that gets backed up and nothing else.

One more thought: Instead of specifying the path relativ to a volume, we could keep the old absolute path and lookup up the volume which the path corresponds to.

How would `backupbot.backup.path=dump.sql.gz` when there are multiple volumes for the container? I think it needs to be explicitly stated which volume the path corresponds to. The syntax could be either something like ``` backupbot.backup.path=myvolume:dump.sql.gz ``` or something like ``` backupbot.backup.volumes.myvolume=dump.sql.gz ``` Another question is what happens, when a container has two volumes attached, but only one has a backup path specified? Does the other volume get backed up? Personally I would prefer it, when a path is given only that gets backed up and nothing else. One more thought: Instead of specifying the path relativ to a volume, we could keep the old absolute path and lookup up the volume which the path corresponds to.
Author
Owner

Ah great thoughts @p4u1, yeah that all seems legit.

backupbot.backup.volumes.myvolume=dump.sql.gz

Prefer this one, because it seems more obvious how to add a second volume (backupbot.backup.volumes.myothervolume=otherpath).

Another question is what happens, when a container has two volumes attached, but only one has a backup path specified? Does the other volume get backed up? Personally I would prefer it, when a path is given only that gets backed up and nothing else.

This seems reasonable, but I do wonder how to retain the nice functionality of the current approach where backing up "everything" is easy, just adding backupbot.backup=true. So maybe I prefer that if a service has backups enabled, all volumes are backed-up, unless e.g. backupbot.backup.volumes.myvolume.disabled=true is set? Don't feel super strongly about this though.

One more thought: Instead of specifying the path relativ to a volume, we could keep the old absolute path and lookup up the volume which the path corresponds to.

Maybe? Could be nice to avoid maintainers needing to worry about which volumes are mounted where.

More questions:

  1. What if a volume is mounted to multiple services? I guess with your suggestion of "don't back up volumes by default" it's fine, but if we do backup all volumes on enabled services, we need to disable it on additional services?
  2. What about the corner case where a volume is mounted to multiple paths in the same container?
Ah great thoughts @p4u1, yeah that all seems legit. > `backupbot.backup.volumes.myvolume=dump.sql.gz` Prefer this one, because it seems more obvious how to add a second volume (`backupbot.backup.volumes.myothervolume=otherpath`). > Another question is what happens, when a container has two volumes attached, but only one has a backup path specified? Does the other volume get backed up? Personally I would prefer it, when a path is given only that gets backed up and nothing else. This seems reasonable, but I do wonder how to retain the nice functionality of the current approach where backing up "everything" is easy, just adding `backupbot.backup=true`. So maybe I prefer that if a service has backups enabled, all volumes are backed-up, unless e.g. `backupbot.backup.volumes.myvolume.disabled=true` is set? Don't feel super strongly about this though. > One more thought: Instead of specifying the path relativ to a volume, we could keep the old absolute path and lookup up the volume which the path corresponds to. Maybe? Could be nice to avoid maintainers needing to worry about which volumes are mounted where. More questions: 1. What if a volume is mounted to multiple services? I guess with your suggestion of "don't back up volumes by default" it's fine, but if we do backup all volumes on enabled services, we need to `disable` it on additional services? 2. What about the corner case where a volume is mounted to multiple paths in the same container?
Member

Is it OK to break backwards compatibility in this way, or should we choose a new label name for these relative paths? If we should choose a new label, what should we call it?

I think we should avoid breaking backwards compatibility, so an app update won't break someones backups.
I would prefer to leave backupbot.backup.path as it is and add another label.

The default should be to backup everything and to exclude what we don't want to backup explicitly because it's always better to backup too much, maybe waste some storage and traffic, than forgetting to backup something and loosing data. Therefore one label of backupbot.backup=true at any container will trigger that every volume for every container gets backed up.
The backup behavior should be specified as explicit as possible like this:

  • Volumes could be excluded with a label backupbot.backup.exclude_volumes=myvolume1 myvolume2
  • Specific paths from the excluded volumes could be included again with backupbot.backup.include_paths=myvolume1:dump.sql.gz myvolume2:path/iwant/tokeep

These two labels should be global, only specified once at any container. This would avoid problems of shared volumes or multiple volumes mounted at one container. Only the pre-hook, post-hook and restore commands should be bound to the belonging container.

> Is it OK to break backwards compatibility in this way, or should we choose a new label name for these relative paths? If we should choose a new label, what should we call it? I think we should avoid breaking backwards compatibility, so an app update won't break someones backups. I would prefer to leave `backupbot.backup.path` as it is and add another label. The default should be to backup everything and to exclude what we don't want to backup explicitly because it's always better to backup too much, maybe waste some storage and traffic, than forgetting to backup something and loosing data. Therefore one label of `backupbot.backup=true` at any container will trigger that every volume for every container gets backed up. The backup behavior should be specified as explicit as possible like this: - Volumes could be excluded with a label `backupbot.backup.exclude_volumes=myvolume1 myvolume2` - Specific paths from the excluded volumes could be included again with `backupbot.backup.include_paths=myvolume1:dump.sql.gz myvolume2:path/iwant/tokeep` These two labels should be global, only specified once at any container. This would avoid problems of shared volumes or multiple volumes mounted at one container. Only the `pre-hook`, `post-hook` and `restore` commands should be bound to the belonging container.
Author
Owner

The default should be to backup everything and to exclude what we don't want to backup explicitly because it's always better to backup too much, maybe waste some storage and traffic, than forgetting to backup something and loosing data.

Love it

Therefore one label of backupbot.backup=true at any container will trigger that every volume for every container gets backed up.
The backup behavior should be specified as explicit as possible like this:

  • Volumes could be excluded with a label backupbot.backup.exclude_volumes=myvolume1 myvolume2
  • Specific paths from the excluded volumes could be included again with backupbot.backup.include_paths=myvolume1:dump.sql.gz myvolume2:path/iwant/tokeep

Love this also

These two labels should be global, only specified once at any container.

This seems a little unintuitive to me, but I can't think of a better option, fine with it for now. Maybe we standardise that these labels have to be set on the app service?

> The default should be to backup everything and to exclude what we don't want to backup explicitly because it's always better to backup too much, maybe waste some storage and traffic, than forgetting to backup something and loosing data. Love it > Therefore one label of backupbot.backup=true at any container will trigger that every volume for every container gets backed up. > The backup behavior should be specified as explicit as possible like this: > > - Volumes could be excluded with a label `backupbot.backup.exclude_volumes=myvolume1 myvolume2` > - Specific paths from the excluded volumes could be included again with `backupbot.backup.include_paths=myvolume1:dump.sql.gz myvolume2:path/iwant/tokeep` Love this also > These two labels should be global, only specified once at any container. This seems a little unintuitive to me, but I can't think of a better option, fine with it for now. Maybe we standardise that these labels have to be set on the `app` service?
Member

Therefore one label of backupbot.backup=true at any container will trigger that every volume for every container gets backed up.

I understand that the default should be to backup everything, but I find it very intuitive that when you add the backupbot.backup=true to one container, it also backups all volumes from other containers in that stack.

Maybe we can add add a flag/config to the backupbot itself to backup all volumes?

What I can agree with is that when only the backupbot.backup=true all volumes attached to that container get backed up.

The backup behavior should be specified as explicit as possible like this:

  • Volumes could be excluded with a label backupbot.backup.exclude_volumes=myvolume1 myvolume2
  • Specific paths from the excluded volumes could be included again with backupbot.backup.include_paths=myvolume1:dump.sql.gz myvolume2:path/iwant/tokeep

Not a very strong opinion, but I would prefer not using include/exlcude in the labels but instead a more declarative style. Maybe something like this:

backupbot.backup.volumes.myvolume1.path=path/iwant/tokeep
backupbot.backup.volumes.myvolume2.backup=false

These two labels should be global, only specified once at any container.

This seems a little unintuitive to me, but I can't think of a better option, fine with it for now. Maybe we standardise that > these labels have to be set on the app service?

This I would find very unintuitive :) I find it much nicer, if I can define backup label on the service, that gets affected (like it is done currently)

> Therefore one label of backupbot.backup=true at any container will trigger that every volume for every container gets backed up. I understand that the default should be to backup everything, but I find it very intuitive that when you add the `backupbot.backup=true` to one container, it also backups all volumes from other containers in that stack. Maybe we can add add a flag/config to the backupbot itself to backup all volumes? What I can agree with is that when only the `backupbot.backup=true` all volumes attached to that container get backed up. >The backup behavior should be specified as explicit as possible like this: > > * Volumes could be excluded with a label backupbot.backup.exclude_volumes=myvolume1 myvolume2 > * Specific paths from the excluded volumes could be included again with backupbot.backup.include_paths=myvolume1:dump.sql.gz myvolume2:path/iwant/tokeep Not a very strong opinion, but I would prefer not using include/exlcude in the labels but instead a more declarative style. Maybe something like this: ``` backupbot.backup.volumes.myvolume1.path=path/iwant/tokeep backupbot.backup.volumes.myvolume2.backup=false ``` > > These two labels should be global, only specified once at any container. > > This seems a little unintuitive to me, but I can't think of a better option, fine with it for now. Maybe we standardise that > these labels have to be set on the app service? This I would find very unintuitive :) I find it much nicer, if I can define backup label on the service, that gets affected (like it is done currently)
Member

I understand that the default should be to backup everything, but I find it very intuitive that when you add the backupbot.backup=true to one container, it also backups all volumes from other containers in that stack.

I would like to avoid cluttering the compose file with too many labels because this introduces more mistakes that could be done. I worry that people miss some labels in the recipes and data won't get saved. Therefore I favor the approach of only adding the label backupbot.backup=true once and having a recipe with a working backup strategy.
This is the way it is currently done and many recipes rely on it.

I prefer blacklisting instead of whitelisting. If maintainers have to explicitly remove everything they don't want to backup, they can't forget to add important container or volumes. Over the time different people will work on the same recipe and maybe adding new container or volumes without being aware of the backup labels. Volumes inside of an overwrite file can easily be overlooked.

Maybe we can add add a flag/config to the backupbot itself to backup all volumes?

Not sure how do you mean it? This flag would be global and affect all apps on the same machine.

The approach of having global labels at the app container would also be consistent with the other coop-cloud labels and give a direct overview of the whole backup configuration.
I also don't have a very strong opinion on this, basically I can go with putting the include/exclude label at the affected container. But I'm not sure how to get a intuitive consistency with shared volumes.

backupbot.backup.volumes.myvolume1.path=path/iwant/tokeep
backupbot.backup.volumes.myvolume2.backup=false

This way you could not specify multiple paths inside of one volume, as each label can only specified once.
It could be a list like backupbot.backup.volumes.myvolume1.paths=path1,path2/iwant/tokeep

>I understand that the default should be to backup everything, but I find it very intuitive that when you add the backupbot.backup=true to one container, it also backups all volumes from other containers in that stack. I would like to avoid cluttering the compose file with too many labels because this introduces more mistakes that could be done. I worry that people miss some labels in the recipes and data won't get saved. Therefore I favor the approach of only adding the label `backupbot.backup=true` once and having a recipe with a working backup strategy. This is the way it is currently done and many recipes rely on it. I prefer blacklisting instead of whitelisting. If maintainers have to explicitly remove everything they don't want to backup, they can't forget to add important container or volumes. Over the time different people will work on the same recipe and maybe adding new container or volumes without being aware of the backup labels. Volumes inside of an overwrite file can easily be overlooked. > Maybe we can add add a flag/config to the backupbot itself to backup all volumes? Not sure how do you mean it? This flag would be global and affect all apps on the same machine. The approach of having global labels at the `app` container would also be consistent with the other coop-cloud labels and give a direct overview of the whole backup configuration. I also don't have a very strong opinion on this, basically I can go with putting the include/exclude label at the affected container. But I'm not sure how to get a intuitive consistency with shared volumes. > ``` > backupbot.backup.volumes.myvolume1.path=path/iwant/tokeep > backupbot.backup.volumes.myvolume2.backup=false > ``` This way you could not specify multiple paths inside of one volume, as each label can only specified once. It could be a list like `backupbot.backup.volumes.myvolume1.paths=path1,path2/iwant/tokeep`
Author
Owner

I would like to avoid cluttering the compose file with too many labels because this introduces more mistakes that could be done. I worry that people miss some labels in the recipes and data won't get saved. Therefore I favor the approach of only adding the label backupbot.backup=true once and having a recipe with a working backup strategy.
This is the way it is currently done and many recipes rely on it.

I prefer blacklisting instead of whitelisting. If maintainers have to explicitly remove everything they don't want to backup, they can't forget to add important container or volumes. Over the time different people will work on the same recipe and maybe adding new container or volumes without being aware of the backup labels. Volumes inside of an overwrite file can easily be overlooked.

Yes, I'm inclined to agree with this, and while I see where @p4u1 is coming from that it might be suprising that setting backupbot.backup=true on the appservice, and having volumes from the db service backed up, I agree that it is a safer default, and that it's OK to ask recipe maintainers to add more labels to exclude things.

It could be a list like backupbot.backup.volumes.myvolume1.paths=path1,path2/iwant/tokeep

Seems good to me!


So, to attempt a summary:

Back up everything

services:
  app:
    deploy:
      labels:
        - "backupbot.backup=true"

Back up all app volumes, exclude cache volume

services:
  app:
    deploy:
      labels:
        - "backupbot.backup=true"
  cache:
    deploy:
      labels:
        - "backupbot.backup.volumes.redis=false"

Back up all app volumes, only back up a single file in db volume

services:
  app:
    deploy:
      labels:
        - "backupbot.backup=true"
  db:
    deploy:
      labels:
        - "backupbot.backup.volumes.mariadb.paths=/var/lib/mariadb/dump.sql.gz"
> I would like to avoid cluttering the compose file with too many labels because this introduces more mistakes that could be done. I worry that people miss some labels in the recipes and data won't get saved. Therefore I favor the approach of only adding the label backupbot.backup=true once and having a recipe with a working backup strategy. This is the way it is currently done and many recipes rely on it. > > I prefer blacklisting instead of whitelisting. If maintainers have to explicitly remove everything they don't want to backup, they can't forget to add important container or volumes. Over the time different people will work on the same recipe and maybe adding new container or volumes without being aware of the backup labels. Volumes inside of an overwrite file can easily be overlooked. Yes, I'm inclined to agree with this, and while I see where @p4u1 is coming from that it might be suprising that setting `backupbot.backup=true` on the `app`service, and having volumes from the `db` service backed up, I agree that it is a safer default, and that it's OK to ask recipe maintainers to add more labels to exclude things. > It could be a list like `backupbot.backup.volumes.myvolume1.paths=path1,path2/iwant/tokeep` Seems good to me! --- So, to attempt a summary: ### Back up everything ``` services: app: deploy: labels: - "backupbot.backup=true" ``` ### Back up all `app` volumes, exclude `cache` volume ``` services: app: deploy: labels: - "backupbot.backup=true" cache: deploy: labels: - "backupbot.backup.volumes.redis=false" ``` ### Back up all `app` volumes, only back up a single file in `db` volume ``` services: app: deploy: labels: - "backupbot.backup=true" db: deploy: labels: - "backupbot.backup.volumes.mariadb.paths=/var/lib/mariadb/dump.sql.gz" ```
Member

@3wordchant The summary looks good to me. This summary could also be used as a starting point for the documentation.

@3wordchant The summary looks good to me. This summary could also be used as a starting point for the documentation.
Member

I like this design. I would like to implement it soon but I'm not sure if I've the time to do this before end of this year. But feel free to implement is :)

I like this design. I would like to implement it soon but I'm not sure if I've the time to do this before end of this year. But feel free to implement is :)
decentral1se added the
help wanted
label 2024-01-11 13:32:34 +00:00
Owner

We're in a meeting now and looks like #41 (comment) is consensus on what to do now? This looks ready to document and implement. We're discussing how to pick this up soon-ish 🤘 Notes in https://pad.riseup.net/p/UEC2JUPGb6tmRCZ7RX9X-keep

We're in a meeting now and looks like https://git.coopcloud.tech/coop-cloud/backup-bot-two/issues/41#issuecomment-19094 is consensus on what to do now? This looks ready to document and implement. We're discussing how to pick this up soon-ish 🤘 Notes in https://pad.riseup.net/p/UEC2JUPGb6tmRCZ7RX9X-keep
Sign in to join this conversation.
No Milestone
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: coop-cloud/backup-bot-two#41
No description provided.