post backup restore command #26
Reference: coop-cloud/backup-bot-two#26
Do we need a post backup restore command? Something like `psql -d dbname < backup.sql` for restoring a database.
I am not quite sure, because the easiest way to restore a database is to simply shut down the container and restore the whole data volume. The database dumps are more of an alternative, especially for migrating the database to another system. I played a little bit with restoring database dumps and it's not really straightforward: on the one hand the container needs to be running, on the other hand all database sessions need to be closed.
For postgres I had to run this whole bunch of commands just to restore a database dump on a running container:
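The commands from the original comment are not preserved on this page. As a rough sketch of the kind of sequence described (close the open sessions, recreate an empty database, load the dump), something like the following could work. All function names are hypothetical, and the sketch assumes a `postgres` maintenance database exists and that the given user is allowed to drop and create databases.

```shell
# Sketch only: all names below are hypothetical helpers, not backup-bot-two code.

# Print the SQL that closes every session on database $1 except our own.
terminate_sessions_sql() {
  printf "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '%s' AND pid <> pg_backend_pid();\n" "$1"
}

# Run psql inside container $1 as user $2 against database $3, reading SQL from stdin.
psql_in_container() {
  docker exec -i "$1" psql -v ON_ERROR_STOP=1 -U "$2" -d "$3"
}

# Restore plain SQL dump $4 into database $2 of running container $1 as user $3.
restore_plain_dump() {
  local container="$1" db="$2" user="$3" dump="$4"
  # Administrative steps run against the "postgres" maintenance database
  # (assumed to exist), since a database cannot be dropped from within itself.
  terminate_sessions_sql "$db" | psql_in_container "$container" "$user" postgres &&
    echo "DROP DATABASE IF EXISTS \"$db\";" | psql_in_container "$container" "$user" postgres &&
    echo "CREATE DATABASE \"$db\";" | psql_in_container "$container" "$user" postgres &&
    psql_in_container "$container" "$user" "$db" < "$dump"
}
```

A call might look like `restore_plain_dump my_db_container appdb appuser backup.sql` (placeholder names). Note that an application which reconnects between the terminate and drop steps would still make the drop fail, so this is not a complete answer to the session problem.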
What is your opinion about a `backupbot.restore.post-hook` label?
I think `borgmatic` has a very good approach here: https://torsion.org/borgmatic/docs/how-to/backup-your-databases/
So, we need to support dumps? I typically err on the side of "do both": take a system-level snapshot, then dump and include that too. But yeah, unsure how this works out in practice... it's always a bit difficult.
I also think we should support both. Maybe restoring the filesystem per default (as it's less error prone and can be done without deploying the app) and restoring the database as extra command.
And I like the idea of baking the postgres/mysql dump and restore commands directly into the backupbot. This way the commands can be arbitrarily complex without cluttering the recipe.
I am not sure whether a `backupbot.restore.post-hook` is still necessary then?
yes
I'd prefer the other way around (reasoning follows) but as long as both options are available then either is fine.
My experience is the opposite:
DB dump backups are guaranteed consistent (as per @decentral1se's comment), whereas copying the database files on disk often leads to corrupted data, depending on which DB it is and how it's configured. When I did a survey of docker/docker swarm backup systems before writing backup-bot-two, most folks were stopping docker services before running backups, which seems like an anti-pattern I would like to avoid.
DB dump backups are also more portable, e.g. postgres is very fussy about loading files (or even its own native backup format files) from different major versions
This is fair; mysql backups usually restore fine over default data (e.g. wordpress installs some default tables and data, but these get overwritten when loading a backup) but postgres seems to want a completely blank DB to load a dump, which in practice means overriding `entrypoint` to prevent DB initialisation and doing a `--chaos` deploy, not ideal. I'd be sort of ok with keeping "restoring the filesystem" as default for this reason.
Can we do both? The original backupbot implementation had helper scripts which made the `post-restore` labels less verbose, but still allowed defining custom hooks where needed. But custom hooks could potentially come later if needed.
I also prefer restoring with the database commands, for the reasons @3wordchant wrote.
When providing `--clean` to `pg_restore`, it overwrites the existing database, so it should not be a problem.
I think we still need to manually terminate all connections to execute this command?
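For reference, the `--clean` approach mentioned above could be sketched as below. This assumes a custom-format dump (for example one made with `pg_dump -Fc`) and a running container; the function name and arguments are hypothetical, and `--if-exists` is an addition of this sketch to avoid errors for objects that are not there yet. Whether other sessions have to be closed first is left open here, as it is in the discussion.

```shell
# Sketch only: hypothetical helper, not part of backup-bot-two.

# Restore custom-format dump $4 into database $2 of running container $1 as user $3.
# --clean drops each database object before recreating it from the dump;
# --if-exists adds IF EXISTS to those drop commands.
restore_custom_dump() {
  local container="$1" db="$2" user="$3" dump="$4"
  # pg_restore reads the archive from stdin when no file name is given.
  docker exec -i "$container" pg_restore --clean --if-exists -U "$user" -d "$db" < "$dump"
}
```

A call might look like `restore_custom_dump my_db_container appdb appuser backup.dump`, with all four values being placeholders.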
We could put the restore commands into an `abra.sh` function and trigger this function after a restore. This would allow more flexibility without verbose labels in the compose file. Maybe a function called `_restore()`?
Interesting point, I haven't tested this myself.
That sounds nice, centralising them into backup-bot could also be nice, but either seems fine to start with.
Yes you are right, terminating the existing connections is probably safer.
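To make the `_restore()` idea more concrete, here is a minimal sketch of what such a function in a recipe's `abra.sh` might look like for postgres. It assumes the function runs inside the database container, that `psql` is on the path there, and that `POSTGRES_USER` and `POSTGRES_DB` are set as in the official postgres image; the default dump path is a placeholder.

```shell
# Sketch only: a hypothetical abra.sh function, meant to run inside the
# database container after the backed-up files have been restored.
_restore() {
  # Placeholder default path; a real recipe would point at its own dump location.
  local dump="${1:-/tmp/backup.sql}"
  if [ ! -r "$dump" ]; then
    echo "_restore: cannot read dump file: $dump" >&2
    return 1
  fi
  # Stop at the first SQL error instead of continuing with a half-loaded dump.
  psql -v ON_ERROR_STOP=1 -U "${POSTGRES_USER:-postgres}" -d "${POSTGRES_DB:-postgres}" < "$dump"
}
```

Keeping the function free of abra-specific calls means it could also be copied into a container and run by hand.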
Using a `_restore()` in `abra.sh` might be more convenient, but then `backup-bot-two` will be tightly coupled with abra. It might be nice if `backup-bot-two` could work without using abra. So maybe we can wait a bit longer before adding this coupling?
I don't have a good idea how to decouple this in a smart way. One way would be to have multiple modules inside the backupbot repo. Each module contains the restore command for the respective database, e.g. mysql, postgresql, mariadb, mongodb...
But this way we probably also have to maintain modules for different database versions. And in rare cases the restore command is completely unique for a specific recipe, or may change in between different recipe versions.
Therefore I think the restore command should be inside the recipe. But having a command like above #26 (comment) as label would completely clutter the compose file 🤔
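For comparison, the "one module per database" idea could be as small as a lookup from database type to restore command. The sketch below is hypothetical: the function name is made up, credentials are deliberately left out, and the environment variable names are the ones used by the official images, which a given recipe may not use.

```shell
# Sketch only: hypothetical lookup, not part of backup-bot-two.

# Print the restore command for database type $1. Each command is expected
# to read the dump from stdin; credentials are omitted on purpose.
restore_command_for() {
  case "$1" in
    postgresql) echo 'psql -v ON_ERROR_STOP=1 -U "$POSTGRES_USER" -d "$POSTGRES_DB"' ;;
    mysql)      echo 'mysql "$MYSQL_DATABASE"' ;;
    mariadb)    echo 'mariadb "$MARIADB_DATABASE"' ;;
    mongodb)    echo 'mongorestore --archive --drop' ;;
    *)
      echo "unsupported database type: $1" >&2
      return 1
      ;;
  esac
}
```

Even this small table shows the maintenance concern raised above: every version-specific or recipe-specific variation would need its own entry.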
Also see #42.
(Should we merge?)
(Separate issues).
For me, "standard bash commands in `abra.sh`" is reusable enough outside abra; if someone is bringing their own plumbing to use a Co-op Cloud recipe and backup-bot-two without `abra`, then asking them to manually handle copying that script into a container and running it after a restore seems fine.
It does seem a bit non-ideal that we'll have a proliferation of nearly-identical `restore_postgres` scripts across a bunch of recipes, but I see where you're coming from about how centralisation would make maintaining backupbot annoying, so maybe it's fine to live with more copypasta for now.
moritz referenced this issue from coop-cloud/docs.coopcloud.tech 2024-04-16 11:51:05 +00:00