Operator sync resurrected #809

Open
opened 2026-03-26 16:50:23 +00:00 by iexos · 13 comments
Member

Last night I wrote down some ideas I have for operator sync. After reading through existing conversations (toolshed/organising#467, #457 and more) I am happy to see it aligns and overlaps well with existing ideas as well as addressing some of the concerns.

Differences to previous proposals

The initial implementation would be automatic but also fully opt-in. To enable operator sync, you will need to add sync: true to abra.yml. (Struck out: you will need your env files within a git repo (its root can be anywhere) and have a remote called operator-sync. Opting in/out is as simple as renaming the git remote and can be done individually.)

The other main difference to existing ideas would be to commit and push all changes to server-side state, and only those (creating empty commits even if no files have changed), which would make the git log work as an operator log.

Initial implementation

Operator sync is activated if the server dir is part of a git repo and it has a remote called operator-sync.

Operator sync would function automatically on abra app commands in two different ways:

Commands modifying server state

Commands which modify the server state should

  • git pull (struck out: the operator-sync remote with git pull --rebase behaviour)
    • fatal out if not successful, suggest using --no-operator-sync
  • let the cmd do its thing
    • in some cases of failure, where server state has already changed, we should still continue with the steps below
  • create a commit
    • the message containing the command and its arguments and possibly output/failure state
    • only add the .env file of the app in question
    • if there are other new/modified files in the repository, list them in a warning
    • this can also be an empty commit with no files changed
  • git push (struck out: to operator-sync remote)
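
The steps above can be sketched as a small planner that only builds the git invocations and does not run them. This is a rough sketch, not abra's implementation: the function names, the REDACTED placeholder and the "[ok]"/"[failed]" suffix are made up for illustration, and the env file path is assumed to be given relative to the repository root. `git commit --allow-empty` is what permits the "empty commit" case.

```python
import shlex


def build_commit_message(argv, secrets=(), failed=False):
    """Render the invoked command as a one-line commit message.

    Any argument equal to one of `secrets` is replaced, so inserted
    secret content never reaches the log.
    """
    hidden = set(secrets)
    shown = ["REDACTED" if arg in hidden else arg for arg in argv]
    status = "failed" if failed else "ok"
    return f"{shlex.join(shown)} [{status}]"


def plan_git_steps(env_file, argv, secrets=(), failed=False):
    """Return (before, after): git commands to run around one abra command."""
    message = build_commit_message(argv, secrets, failed)
    before = [["git", "pull"]]
    after = [
        # Stage only the env file of the app in question.
        ["git", "add", "--", env_file],
        # --allow-empty keeps the log entry even when nothing changed.
        # Note: anything the operator had already staged is committed too,
        # which is the open question about staged changes.
        ["git", "commit", "--allow-empty", "-m", message],
        ["git", "push"],
    ]
    return before, after
```

The caller would run `before`, bail out if the pull fails, run the actual command, and then run `after` even in the failure cases where server state has already changed.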

Commands with this behaviour would be the following abra app commands:

  • deploy
  • move
    • should operate on both server dirs with different messages
  • new --secrets
  • remove
  • restore
  • rollback
  • secret generate/insert/remove
    • inserted secret content should not end up in log
  • undeploy
  • upgrade
  • volume remove
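
To make the list concrete, here is a hedged sketch of how a command could be classified. The function and constant names are hypothetical; the only inputs taken from the proposal are the subcommand names listed above.

```python
# Subcommands of `abra app` that the proposal treats as modifying server state.
SINGLE_WORD = {"deploy", "move", "remove", "restore", "rollback", "undeploy", "upgrade"}
TWO_WORD = {
    ("secret", "generate"),
    ("secret", "insert"),
    ("secret", "remove"),
    ("volume", "remove"),
}


def is_state_modifying(args):
    """Classify the arguments that follow `abra app`.

    Returns True for commands that should pull, commit and push,
    False for commands that should only pull.
    """
    if not args:
        return False
    if args[0] in SINGLE_WORD:
        return True
    if args[0] == "new":
        # `new` only touches the server when secrets are generated.
        return "--secrets" in args
    return tuple(args[:2]) in TWO_WORD
```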

This might lead to seemingly weird side effects: for example, the commit for secret insert will add the .env of a new app. I think that is correct though, as this is exactly the point where something about this app is created on the server.

Committing only the <domain>.env in question will prevent unintended side effects. A warning about other changed files should help you keep the repo clean. If you added other files related to this app you might want to commit them manually. I am not quite sure how to deal with staged changes yet.
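
One possible way to produce that warning is to filter the output of `git status --porcelain`. The sketch below only parses text that the caller has already captured; the function name is hypothetical, the env file path is assumed to be relative to the repository root, and paths that git quotes (for example ones containing spaces) are not handled.

```python
def other_changed_files(porcelain, env_file):
    """Return paths from `git status --porcelain` output other than env_file.

    Each line has two status characters, a space, then the path.
    """
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]
        if " -> " in path:
            # Renames are reported as "old -> new"; keep the new name.
            path = path.split(" -> ", 1)[1]
        if path != env_file:
            paths.append(path)
    return paths
```

A non-empty result would be listed in the warning; an empty one means only the app's own env file (or nothing) changed.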

(Struck out: git pull --rebase behaviour should reduce the amount of manual intervention needed to a minimum. In these cases intervention is probably needed anyway.)

Some other commands also (potentially) modify server state (command, restart, ...), though I consider them not important enough to keep a log of. I might have missed something though!

Other abra app commands

Every other abra app command should execute the pull part only, since they all depend on the list of services. In this case a pull failure could be a non-fatal error, though you might miss it in the wall of text that the command probably produces.
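
The difference between the two kinds of command on a failed pull could look roughly like this. It is a sketch with a hypothetical function name and made-up message wording; only the --no-operator-sync hint comes from the proposal.

```python
def on_pull_failure(modifying, error):
    """Decide what a failed `git pull` means for the current command.

    State-modifying commands stop; all other commands get a warning
    string back and carry on with the local state.
    """
    if modifying:
        raise SystemExit(
            f"operator sync: git pull failed: {error}\n"
            "resolve this manually or re-run with --no-operator-sync"
        )
    return f"WARNING: operator sync: git pull failed, using local state: {error}"
```

Printing the returned warning again after the command's own output is one way to keep it from getting lost.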

I do not envision creating new abra commands initially and would like it to work as automatically as possible after setup. I think we can also rely on operators' ability to use git if anything breaks.

Other considerations

Since every app command is now pulling the remote, this might slow down operations. To mitigate this, abra could take note of the last successful pull and only pull again after a few minutes or so.
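
A minimal sketch of such a cooldown check, assuming abra stores the time of the last successful pull as a Unix timestamp somewhere. The function name and the 300-second default are assumptions; the proposal only says "a few minutes or so".

```python
import time


def should_pull(last_pull, now=None, cooldown=300.0):
    """Return True if enough time has passed since the last successful pull.

    last_pull: Unix timestamp of the last successful pull, or None if
    there has not been one yet.
    """
    if last_pull is None:
        return True
    if now is None:
        now = time.time()
    return now - last_pull >= cooldown
```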

To override behaviour, I envision these flags:

  • --no-operator-sync do not do pulls and pushes
    • also, --offline should be respected
  • --no-operator-log do not create git commit (implies --no-operator-sync)
  • --force-sync always pull on commands without cooldown
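
How these flags combine can be sketched as below. The SyncPlan type and resolve_flags function are hypothetical, and the sketch reads --offline as "no network, but still commit locally", which is one interpretation of the bullet above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPlan:
    pull: bool
    commit: bool
    push: bool
    ignore_cooldown: bool


def resolve_flags(no_operator_sync=False, no_operator_log=False,
                  offline=False, force_sync=False):
    """Translate the proposed flags into what operator sync should do."""
    # --no-operator-log implies --no-operator-sync; --offline also
    # rules out any network access.
    network = not (no_operator_sync or no_operator_log or offline)
    return SyncPlan(
        pull=network,
        commit=not no_operator_log,
        push=network,
        # --force-sync only matters if a pull is allowed at all.
        ignore_cooldown=network and force_sync,
    )
```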

I tend to think a bit far sometimes and I recognize that this might be a bit much for an initial implementation. Implementing only part of this could be helpful already.

Future possibilities

This may be a bit far off right now, I just want to get it out of my head. If the initial implementation turns out to be useful, further steps could include:

  • creating dedicated abra commands related to operator sync, depending on needs (e.g. syncing, handling remote)
  • creating more abra.yml options to change sync/log behaviour
  • enabling commits without remote for creating the log
  • ask to create a repo on abra server add
  • additionally ask to create a remote on the server itself, e.g. in <server>:~/.abra/operator-sync.git
    • clone if exists, otherwise creating the bare repo and then clone
    • potential issues: operators might have different users on server, dir might not be writable, git might not be installed...

Please leave comments on what you think of this!

iexos added the enhancement and design labels 2026-03-26 16:50:24 +00:00
Owner

I'm all for this approach. Thanks for drawing it up! If it's something people can slowly opt-in to and we can fix as we go without some big breaking migration then I think it's doable. It's more a question for me of how to break this down into some logical order of bit-by-bit implementation that we can roll out and have people test.

I think we will have to invest in making sure we can somehow cover all the failure scenarios of Swarm deployments and that they are reflected in the env version. I feel like we are still building on shaky foundations. #808 shows the current limitations and I'm not sure what to do about it.

Owner

@iexos How would this work if we already work in a git repo? Could you describe that use case? Like how would this interact with normal git usage?

Author
Member

@p4u1
Assuming you have a repo with a single remote origin, you will have to rename that remote to operator-sync. You can continue to use the repo as before.

Now when using any abra app cmd this repo will be pulled.
Additionally, any time you execute one of the server-modifying cmds above like abra app deploy <domain> a commit will be created in which only changes to <domain>.env are added. Then the repo is pushed to remote.

So what you potentially need to watch out for is that all commits made before will also be pushed and that the .env file will be committed.
Also, your git log will likely grow much faster.

Owner

Hmmm good question. Being required to git remote rename origin operator-sync is kind of breaking an existing workflow? I had actually imagined it being a 2nd remote that you add alongside origin (which also seems weird now that I type it out). Another possibility is triggering via a config value so that you can keep the same git remote name?

# abra.yml
sync: true
Author
Member

Both are possible, either renaming or adding the remote. I thought if only one remote is present, cmds like pull and push will continue to work as normal, but maybe that is wrong?

Where I'm coming from is that the double remotes of recipes (origin & origin-ssh) confuse my shell prompt: it thinks origin is not up-to-date because abra pushes to the other one, so it shows me unpushed commits. But if that is no issue for you then just use both.

Owner

I would really prefer enabling via git config instead of an implicit git remote name. I was also confused about whether I would have two remotes then.

Another scenario:
What happens if I have local changes? Then git pull will error. Would Abra then run git stash, then git pull, and then git stash pop? This would increase the possibilities for errors.

Author
Member

Then let's enable sync via the config file; you are right, it might be clearer.

I also checked the git pull behaviour. I seem to have remembered it wrong: without --rebase it will sync even if there are uncommitted changes in the repo, as long as the same file is not modified both remotely and locally, whereas with --rebase it won't sync. So I would prefer to go without --rebase then.

Author
Member

As to breaking down the implementation into smaller steps, I would go with (order not that important):

  1. implementing sync: true option
  2. perform git pull before running any abra app cmd
  3. perform commit and push after running specific cmds, maybe starting with deploy & upgrade
  4. implement the no-sync/log flags
  5. evaluate if pulling needs rate-limiting
  6. expand to other modifying commands
  7. investigate failure state behaviour
Owner

Nice!

I'm fine with just bailing out explicitly on git pull failures, showing the diff/failure logs and asking to resolve stuff manually. It will then be easier to see the ways operators trip each other up and how we can give some guidelines for the social side of this new change.

This is gonna create a tonne of mind bending scenarios when it doesn't do what you expected. We'll just have to form a tight alpha testing group and see it through with all the bugs and twists and turns 🫡


Great plan! Some thoughts on the faster-growing log: It would be helpful if multiple config changes that relate to each other could be included in a single commit, with a custom summary message. For example

Deployed apps for organisation <name>
- abra app new something
- abra app deploy
- abra app new something2
- abra app deploy

I think this could be achieved by amending commits, either manually by adding something like --amend-to-last-commit or automatically with a config timewindow-commit-amend: 120 minutes.
Any command that would commit something then checks whether the last commit was made by the current user and whether it is within the configured time window for amending. If so, use git commit --amend -m "Updated message" && git push --force-with-lease instead of (and when failing, fall back to) pushing a new commit. In these cases, abra could also ask to add or update the summary message for these commits; or you could run a separate abra command or a manual git amend when you're done to add a custom summary message.
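
The check described above could be sketched like this. The function name and parameters are hypothetical; the author and timestamp of the last commit are assumed to have been read from git beforehand, and the 120-minute default mirrors the example value.

```python
from datetime import datetime, timedelta


def should_amend(last_author, last_time, current_user, now, window_minutes=120):
    """Return True if the next change should be amended to the last commit.

    Amending is only considered when the last commit belongs to the
    current user and is not older than the configured time window.
    """
    if last_author != current_user:
        return False
    age = now - last_time
    return timedelta(0) <= age <= timedelta(minutes=window_minutes)
```

If this returns False, or the amend and `git push --force-with-lease` fail, the fallback is to push a new commit.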

Or instead of putting this in the abra config, there could be abra operatorsync starttransaction -m "Summary of what I'm going to do" -t 120. That creates a commit with the summary message, stores the specified time window somewhere and uses the logic of the previous paragraph to amend all config changes to this commit.

Author
Member

@dannygroenewegen If I understand your proposal correctly, this will make sure that no other pushed changes are overridden. But that would mean if you are pulling from another machine in the meantime you will have to deal with a conflict when upstream history changed? While abra could theoretically deal with that as well, it would make things quite complex.


There's indeed a changed upstream history, but I don't think that creates a new problem:

  • No local (un)committed changes on the second machine -> git pull has no problem updating to the changed upstream history.
  • Local uncommitted changes (e.g. editing app.env outside of abra) on the second machine, but no merge conflict with upstream changes -> git pull will update the changed history and the uncommitted changes are kept.
  • Local uncommitted changes with a merge conflict -> Uncommitted changes are made outside of abra? So git pull at the start of an abra command will fail due to conflicting uncommitted changes. This means two people are working on the same thing, which I think is a valid reason to require manual resolving.
  • Local committed changes with no merge conflict -> No problem updating to the changed upstream history, but I think this creates an extra merge commit that could be prevented by using git pull --rebase
  • Local committed changes with a merge conflict -> If abra creates and pushes the commit at the end of running the command, it's unclear how to resolve the conflict. Abra already made the changes on the server, so just exiting with a message to manually resolve this could leave the repo and server state unsynced if the user doesn't do this. Maybe forcing your own changes for the conflicts (since those happened last) might have the best chance of correctly representing the state of the server. And put something in the commit message that the conflict has been resolved in this way.
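
The five cases above can be condensed into a small lookup. This only restates the outcomes as described in this comment; it is not verified git behaviour, and the names and outcome strings are made up for illustration.

```python
from enum import Enum


class Local(Enum):
    CLEAN = "no local changes"
    UNCOMMITTED = "uncommitted changes"
    COMMITTED = "committed changes"


def expected_outcome(local, conflict):
    """Outcome on the second machine after upstream history was amended."""
    if local is Local.CLEAN:
        return "pull succeeds"
    if local is Local.UNCOMMITTED:
        if conflict:
            return "pull fails, resolve manually"
        return "pull succeeds, local edits kept"
    if conflict:
        # Server state has already changed; how to resolve is still open.
        return "unresolved, possibly force own changes"
    return "pull succeeds, extra merge commit"
```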

I would argue that these merge conflicts are not caused by amended commits, but by two people working on the same files at the same time. And I think these conflicts could also happen without amended commits, so abra would have to deal with this anyway by solving or providing useful errors.

Author
Member

If it really works without additional trouble, it could be nice to explore. I personally don't really see a need for this (yet) and would not do it for the initial implementation but we can keep it in mind if the need comes up. Thank you for your input!

Reference: toolshed/abra#809