Add os hook for interrupt signal while waiting for service to converge. #333

Merged
3wordchant merged 7 commits from rix/abra:add-waiting-interrupt-handling into main 2023-08-04 19:22:51 +00:00
Member

My attempt at fixing coop-cloud/organising#413, where pressing Ctrl-C while waiting to converge would cause a success message to be printed. Now it prints a message similar to the timeout one, which just points out that the wait was interrupted. I've also corrected a small grammar problem in the timeout message.

rix added 1 commit 2023-07-30 11:36:32 +00:00
decentral1se reviewed 2023-07-30 15:34:50 +00:00
decentral1se left a comment
Owner

Amazing @rix, tysm! This is looking good. I won't be able to test rn but if you are feeling good about it, please merge away? One doubt is that ctrl-c means "i really want to cancel now" vs. "timed out but i still care" and perhaps the error message could also just be "bailing out, cancelled by user". No strong feelings on this tho.

Author
Member

Hi @decentral1se,

I'm fine to change the message. Correct me if I'm wrong, but I don't think whether or not the app gets deployed depends on when the interrupt occurs.

So for a test I did the following:

  1. run abra a deploy traefik.default
  2. pause for a second until it's 'waiting to converge'
  3. press ctrl+c
  4. run abra app ps traefik.default and get back output
  5. run abra app undeploy traefik.default and successfully remove the app
  6. run abra app ps traefik.default and be told the app is no longer installed

Full trace of that is below, but basically it installed the app because the deployment was already in progress in another thread and the only thing I interrupted was the CLI process. If you want to make it so it actually stops/undeploys the app, then that will require a bit more work to get done. Let me know what you think.

➜  abra git:(add-waiting-interrupt-handling) abra a deploy traefik.default -d
<Setup logs removed for brevity>...
+--------+---------+-------------+-----------------+---------------+
| SERVER | RECIPE  |   CONFIG    |     DOMAIN      |    VERSION    |
+--------+---------+-------------+-----------------+---------------+
| local  | traefik | compose.yml | traefik.default | 2.4.2+v2.10.4 |
+--------+---------+-------------+-----------------+---------------+
? continue with deployment? Yes
DEBU[0001] get label 'coop-cloud.traefik_default.timeout'  caller="/home/rix/src/abra/pkg/config/app.go:553 GetLabel"
DEBU[0001] timeout label: 300                            caller="/home/rix/src/abra/pkg/config/app.go:568 GetTimeoutFromLabel"
DEBU[0001] set waiting timeout to 300 s                  caller="/home/rix/src/abra/cli/internal/deploy.go:168 DeployAction"
INFO[0001] Creating config traefik_default_file_provider_yml_v8  caller="/home/rix/src/abra/pkg/upstream/stack/stack.go:280 createConfigs"
INFO[0001] Creating config traefik_default_entrypoint_v2  caller="/home/rix/src/abra/pkg/upstream/stack/stack.go:280 createConfigs"
INFO[0001] Creating config traefik_default_traefik_yml_v17  caller="/home/rix/src/abra/pkg/upstream/stack/stack.go:280 createConfigs"
INFO[0001] Creating service traefik_default_app          caller="/home/rix/src/abra/pkg/upstream/stack/stack.go:391 deployServices"
INFO[0003] waiting for services to converge: traefik_default_app  caller="/home/rix/src/abra/pkg/upstream/stack/stack.go:419 deployServices"
DEBU[0003] waiting on traefik_default_app to converge    caller="/home/rix/src/abra/pkg/upstream/stack/stack.go:422 deployServices"
^CFATA[0006]
The wait for traefik.default to converge was interrupted.

This does not necessarily mean your deployment has stopped, but we aren't
monitoring it anymore

You can track latest deployment status with:

    abra app ps --watch traefik.default

And inspect the logs with:

    abra app logs traefik.default

If a service is failing to even start, try smoke out the error with:

    abra app errors --watch traefik.default
  caller="/home/rix/src/abra/cli/internal/deploy.go:171 DeployAction" stack="/home/rix/src/abra/cli/internal/deploy.go:171                     DeployAction\n/home/rix/go/pkg/mod/github.com/urfave/cli@v1.22.9/app.go:524     HandleAction\n/home/rix/go/pkg/mod/github.com/urfave/cli@v1.22.9/command.go:173 Command.Run\n/home/rix/go/pkg/mod/github.com/urfave/cli@v1.22.9/app.go:405     (*App).RunAsSubcommand\n/home/rix/go/pkg/mod/github.com/urfave/cli@v1.22.9/command.go:378 Command.startApp\n/home/rix/go/pkg/mod/github.com/urfave/cli@v1.22.9/command.go:102 Command.Run\n/home/rix/go/pkg/mod/github.com/urfave/cli@v1.22.9/app.go:277     (*App).Run\n/home/rix/src/abra/cli/cli.go:200                                 RunApp\n/home/rix/src/abra/cmd/abra/main.go:22                            main\n/usr/local/go/src/runtime/internal/atomic/types.go:194            (*Uint32).Load\n/usr/local/go/src/runtime/asm_amd64.s:1598                        goexit"
➜  abra git:(add-waiting-interrupt-handling) abra app ps traefik.default
+--------------+-----------------+----------------+---------------------------------+---------+--------+
| SERVICE NAME |      IMAGE      |    CREATED     |             STATUS              |  STATE  | PORTS  |
+--------------+-----------------+----------------+---------------------------------+---------+--------+
| app          | traefik:v2.10.4 | 10 seconds ago | Up 9 seconds (health: starting) | running | 80/tcp |
+--------------+-----------------+----------------+---------------------------------+---------+--------+
➜  abra git:(add-waiting-interrupt-handling) abra a undeploy traefik.default
+--------+---------+-------------+-----------------+---------------+
| SERVER | RECIPE  |   CONFIG    |     DOMAIN      |    VERSION    |
+--------+---------+-------------+-----------------+---------------+
| local  | traefik | compose.yml | traefik.default | 2.4.2+v2.10.4 |
+--------+---------+-------------+-----------------+---------------+
? continue with undeploy? Yes
INFO[0002] removing service traefik_default_app
INFO[0002] removing config traefik_default_entrypoint_v2
INFO[0002] removing config traefik_default_traefik_yml_v17
INFO[0002] removing config traefik_default_file_provider_yml_v8
➜  abra git:(add-waiting-interrupt-handling) abra app ps traefik.default
FATA[0000] traefik.default is not deployed?
Owner

@rix

Righhhhhht, true! Yes, so we maybe need to do some UI/UX experimenting here. It is indeed the case that a ctrl-c only stops the polling / checking logic not the actual deployment... and I was just talking to @knoflook how the whole text output of this post-deploy thing is a bit weird 😆

Like what the heck does "services converged" even mean? We take some effort to gloss over Docker terminology in places, so I don't know why we suddenly expose it here on a thing that you have to read loads of times a day 🙃

Anyway, take or leave as much of this as you like but I think maybe we could:

  • Add a "starting to poll deployment status" message before any of the output to make it clear exactly what is happening... abra is just a "front-end" to the docker daemon reports in this moment, so we're not controlling if the stack succeeds or not directly

  • Change "services converged" to "successfully deploy x $domain" or something clear / simple

  • When ctrl-c'ing output something like "cancelling polling, deployment continues..." (potentially with a "maybe you want app undeploy if you wanna take it down?")

Up to you!

Author
Member

Ok that sounds good, it might be a day or two until I have enough time to make those mods but happy to do that as I too literally have no idea what "services converging" means : )

rix force-pushed add-waiting-interrupt-handling from 1208438cba to 65fdaf43cc 2023-08-01 11:51:11 +00:00 Compare
rix added 3 commits 2023-08-04 18:06:07 +00:00
Author
Member

Ok I've finally managed to get my deployments working again and made those changes, here are the outputs now:

abra a deploy traefik.default

INFO[0002] Creating config traefik_default_entrypoint_v2
INFO[0002] Creating config traefik_default_traefik_yml_v17
INFO[0002] Creating config traefik_default_file_provider_yml_v8
INFO[0002] Creating service traefik_default_app
INFO[0003] Starting to poll for deployment status for: traefik.default
INFO[0041] Successfully deployed traefik.default

abra a deploy traefik.default (with interrupt)

INFO[0001] Creating config traefik_default_traefik_yml_v17
INFO[0001] Creating config traefik_default_file_provider_yml_v8
INFO[0001] Creating config traefik_default_entrypoint_v2
INFO[0002] Creating service traefik_default_app
INFO[0003] Starting to poll for deployment status for: traefik.default
^CFATA[0005]
Cancelling polling for traefik.default, deployment is still continuing.

If you want to stop the deployment try:
abra app undeploy traefik.default

I think that's it from my side but if you want me to make further changes then let me know and I should be able to get around to it this weekend : )
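As a rough illustration, the interrupt message shown in the output above could be assembled from the app's domain along these lines. The function name `cancelMessage` is hypothetical and this is not the PR's actual implementation; only the message text is taken from the output.

```go
package main

import "fmt"

// cancelMessage builds the text printed when polling is interrupted.
// Hypothetical helper; %[1]s reuses the single domain argument twice.
func cancelMessage(domain string) string {
	return fmt.Sprintf(
		"\nCancelling polling for %[1]s, deployment is still continuing.\n\n"+
			"If you want to stop the deployment try:\n"+
			"\tabra app undeploy %[1]s\n", domain)
}

func main() {
	fmt.Print(cancelMessage("traefik.default"))
}
```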

rix changed title from WIP: Add os hook for interrupt signal while waiting for service to converge. to Add os hook for interrupt signal while waiting for service to converge. 2023-08-04 18:07:44 +00:00
rix added 1 commit 2023-08-04 18:08:40 +00:00
Owner
➜ abra app deploy owncast_demo_coopcloud_tech -n
+---------------------+---------+-------------+-----------------------------+-------------+
|       SERVER        | RECIPE  |   CONFIG    |           DOMAIN            |   VERSION   |
+---------------------+---------+-------------+-----------------------------+-------------+
| demo.coopcloud.tech | owncast | compose.yml | owncast.demo.coopcloud.tech | 0.2.1+0.1.1 |
+---------------------+---------+-------------+-----------------------------+-------------+
INFO[0002] Creating service owncast_demo_coopcloud_tech_app 
INFO[0005] Starting to poll for deployment status for: owncast_demo_coopcloud_tech 
^CFATA[0007] 
Cancelling polling for owncast_demo_coopcloud_tech, deployment is still continuing.  

If you want to stop the deployment try:
	abra app undeploy owncast_demo_coopcloud_tech

And indeed the deployment continued!

The owncast instance seems to have exploded, but thinking / hoping that's unrelated 😬

3wordchant merged commit 2db172ea5a into main 2023-08-04 19:22:51 +00:00