Failing to deploy app; no errors in logs #421

Closed
opened 2023-02-21 19:00:50 +00:00 by codegod100 · 6 comments
Member
```
❯ abra app deploy --chaos prime.agor.ai
WARN[0003] chaos mode engaged
+---------+-------------+-------------+---------------+----------+
| SERVER  |   RECIPE    |   CONFIG    |    DOMAIN     | VERSION  |
+---------+-------------+-------------+---------------+----------+
| agor.ai | agora-prime | compose.yml | prime.agor.ai | ed1046c0 |
+---------+-------------+-------------+---------------+----------+
? continue with deployment? Yes
INFO[0016] Creating network prime_agor_ai_default
INFO[0016] Creating service prime_agor_ai_app
INFO[0018] Creating service prime_agor_ai_bridge
INFO[0020] waiting for services to converge: prime_agor_ai_app, prime_agor_ai_bridge
FATA[0070]
prime.agor.ai has not converged (50s second timeout reached).
```

```
❯ abra app logs --debug prime.agor.ai
DEBU[0000] collecting metadata from 3 servers: agor.ai, anagora.org, cloud.vera.pink  caller="/drone/src/pkg/config/app.go:204 LoadAppFiles"
DEBU[0000] read map[AGORA_DB_PATH:/home/root/agora/agora.db COMPOSE_FILE:compose.yml DOMAIN:prime.agor.ai LETS_ENCRYPT_ENV:production TYPE:agora-prime] from /home/vera/.abra/servers/agor.ai/prime.agor.ai.env  caller="/drone/src/pkg/config/env.go:51 ReadEnv"
DEBU[0000] read env map[AGORA_DB_PATH:/home/root/agora/agora.db COMPOSE_FILE:compose.yml DOMAIN:prime.agor.ai LETS_ENCRYPT_ENV:production TYPE:agora-prime] from /home/vera/.abra/servers/agor.ai/prime.agor.ai.env  caller="/drone/src/pkg/config/app.go:158 readAppEnvFile"
DEBU[0000] retrieved {prime.agor.ai agora-prime prime.agor.ai map[AGORA_DB_PATH:/home/root/agora/agora.db COMPOSE_FILE:compose.yml DOMAIN:prime.agor.ai LETS_ENCRYPT_ENV:production TYPE:agora-prime] agor.ai /home/vera/.abra/servers/agor.ai/prime.agor.ai.env} for prime.agor.ai  caller="/drone/src/pkg/app/app.go:22 Get"
DEBU[0000] validated prime.agor.ai as app argument       caller="/drone/src/cli/internal/validate.go:143 ValidateApp"
DEBU[0000] created client for agor.ai                    caller="/drone/src/pkg/client/client.go:68 New"
DEBU[0000] commandconn: starting ssh with [-o ConnectTimeout=5 -l vera -p 444 -- agor.ai docker system dial-stdio]  caller="/drone/src/pkg/upstream/commandconn/commandconn.go:49 New"
DEBU[0002] tailing logs for all agora-prime services     caller="/drone/src/cli/app/logs.go:91 glob..func8"
DEBU[0002] COMPOSE_FILE detected (compose.yml), loading compose.yml  caller="/drone/src/pkg/config/app.go:443 GetAppComposeFiles"
DEBU[0002] retrieved /home/vera/.abra/recipes/agora-prime/compose.yml configs for agora-prime  caller="/drone/src/pkg/config/app.go:449 GetAppComposeFiles"
DEBU[0002] retrieved /home/vera/.abra/recipes/agora-prime/compose.yml for agora-prime  caller="/drone/src/pkg/config/app.go:463 GetAppComposeConfig"
DEBU[0002] commandconn: starting ssh with [-o ConnectTimeout=5 -l vera -p 444 -- agor.ai docker system dial-stdio]  caller="/drone/src/pkg/upstream/commandconn/commandconn.go:49 New"
```

```
vera@hypatia:~$ docker service ls
ID             NAME                   MODE         REPLICAS   IMAGE                         PORTS
s8lhbxc18vfn   prime_agor_ai_app      replicated   0/1        moonlion/agora-server:prime   *:5017->5017/tcp
xgusxqb508x9   prime_agor_ai_bridge   replicated   0/1        moonlion/agora-bridge:prime
9c03dyy8i9sk   traefik_agor_ai_app    replicated   1/1        traefik:v2.9.6                *:80->80/tcp, *:443->443/tcp
```
codegod100 added the bug label 2023-02-21 19:00:51 +00:00
Author
Member

My images might be the problem:

```
❯ docker run -it moonlion/agora-server:prime
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "entrypoint.sh": executable file not found in $PATH: unknown.
```

This should be in logs?

Owner

Great logs @codegod100, thank you! `abra app errors` is supposed to help with this but I believe it still has bugs. I think the container never comes up, so this kind of helpful logging never comes into the logs. But it is available elsewhere in the JSON payload that the runtime returns. We definitely need to get these logs out because it's just a blackbox otherwise.
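For illustration, here is a rough sketch of pulling that error out of the payload. It assumes the payload is a list of Swarm task objects like the ones `docker inspect <task-id>` prints, where a failed task carries its error under `Status.Err`. The sample data, the task IDs and the helper name `task_errors` are made up for this example; the error text is adapted from the `docker run` output above, and none of this is abra's actual code.

```python
# Sample shaped like Swarm task objects (heavily trimmed). The IDs are
# invented and the Err text is adapted from the `docker run` error in this
# thread; a real payload has many more fields.
SAMPLE_TASKS = [
    {
        "ID": "task-a",
        "DesiredState": "shutdown",
        "Status": {
            "State": "failed",
            "Message": "starting",
            "Err": 'exec: "entrypoint.sh": executable file not found in $PATH',
        },
    },
    {
        "ID": "task-b",
        "DesiredState": "running",
        "Status": {"State": "running", "Message": "started"},
    },
]


def task_errors(tasks):
    """Return (task ID, state, error) for every task that reports an error."""
    found = []
    for task in tasks:
        status = task.get("Status", {})
        err = status.get("Err")
        if err:  # healthy tasks have no Err field, or an empty one
            found.append((task.get("ID"), status.get("State"), err))
    return found


for task_id, state, err in task_errors(SAMPLE_TASKS):
    print(f"{task_id} [{state}]: {err}")
```

On a live server, `docker service ps --no-trunc prime_agor_ai_app` should show the same error text in its ERROR column without any scripting.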
Owner

Related: https://git.coopcloud.tech/coop-cloud/organising/issues/339
Author
Member

Why have separate logs and errors commands? Intuitively I expect to find errors in the logs
Owner

Yeh, that's a good point! I guess I was just following the same model of the docker CLI, that it wasn't giving out these errors via the logs command and was to be found elsewhere. But yeh, we could pile this logic into the `logs` command? Just check if the container is flapping, if there are healthcheck errors, etc. before polling for logs? Smaller CLI surface is nicer. The logic needs to be improved, the `errors` command doesn't really work atm.
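As a very rough sketch of that idea (not how abra does or will do it): before tailing logs, look at the service's task list and report anything suspicious. The function name `preflight`, the `FLAP_THRESHOLD` cut-off and the warning wording are all hypothetical, the input is assumed to be Swarm task objects with a `Status` map (`State`, `Err`), and healthcheck status is left out to keep it short.

```python
from collections import Counter

# Hypothetical cut-off: this many failed or rejected tasks counts as "flapping".
FLAP_THRESHOLD = 3


def preflight(tasks, flap_threshold=FLAP_THRESHOLD):
    """Return warnings to print before polling a service's logs."""
    warnings = []
    states = Counter(t.get("Status", {}).get("State") for t in tasks)

    # Repeated failures suggest the container keeps dying and being rescheduled.
    failed = states["failed"] + states["rejected"]
    if failed >= flap_threshold:
        warnings.append(f"service is flapping: {failed} failed tasks")

    # With nothing running, the container logs are likely to be empty.
    if states["running"] == 0:
        warnings.append("no task is running, so there may be no container logs")

    # Surface the runtime's own error text for each task that has one.
    for t in tasks:
        err = t.get("Status", {}).get("Err")
        if err:
            warnings.append(f"task {t.get('ID')}: {err}")
    return warnings


if __name__ == "__main__":
    failing = [
        {"ID": f"task-{i}", "Status": {"State": "failed", "Err": "executable file not found"}}
        for i in range(3)
    ]
    for line in preflight(failing):
        print("WARN", line)
```

The point of returning a list rather than printing directly is that the `logs` command could show the warnings first and then carry on polling as it does now.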
decentral1se added the abra label 2023-06-08 09:19:23 +00:00
decentral1se added this to the Medium/large enhancements project 2023-06-08 09:28:36 +00:00
decentral1se modified the project from Medium/large enhancements to Critical fixes 2023-06-08 09:28:54 +00:00
Owner

Converging on https://git.coopcloud.tech/coop-cloud/organising/issues/501.
Reference: coop-cloud/organising#421