WIP: monitoring #15

Draft
Brooke wants to merge 12 commits from monitoring into main
5 changed files with 68 additions and 1 deletions

@ -6,6 +6,11 @@ LETS_ENCRYPT_ENV=production
COMPOSE_FILE="compose.yml"
SECRET_RPC_SECRET_VERSION=v1 # length=64 charset=hex
SECRET_ADMIN_TOKEN_SECRET_VERSION=v1 # length=64 charset=hex
SECRET_METRICS_TOKEN_SECRET_VERSION=v1 # length=64 charset=hex
#COMPOSE_FILE="$COMPOSE_FILE:compose.monitoring.yml"
#MONITORING_ENABLED=true
# Changing the replication factor after initial deployment is not
# supported and requires deleting the existing cluster layout metadata.

@ -1,4 +1,4 @@
# garage
# Garage
> An open-source distributed object storage service tailored for selfhosting at a small-to-medium scale.
@ -52,7 +52,35 @@ You can optionally add this alias to your `.bashrc` (or similar) file to avoid h
### Garage Quick Start Guide
Once `garage status` works, you can follow the guide here: https://garagehq.deuxfleurs.fr/documentation/quick-start/#checking-that-garage-runs-correctly
## Monitoring
### Enabling
By default, monitoring is disabled and must be enabled in your config.
To enable it, set `MONITORING_ENABLED` to `true` and uncomment the line `#COMPOSE_FILE="$COMPOSE_FILE:compose.monitoring.yml"`.
> If you deployed Garage before version `0.0.2+v2.3.0`, you will need to add the following lines to your config:
> ```
> MONITORING_DOMAIN=monitoring.garage.example.com
> SECRET_ADMIN_TOKEN_SECRET_VERSION=v1 # length=64 charset=hex
> SECRET_METRICS_TOKEN_SECRET_VERSION=v1 # length=64 charset=hex
Brooke marked this conversation as resolved Outdated

also add instructions on which address and port to add to prometheus/alloy

Good point, instructions are definitely not finished, let me add the wip flag to the pr

The pr is a wip so I'll wait on review until the instructions are finalized.
>
> #COMPOSE_FILE="$COMPOSE_FILE:compose.monitoring.yml"
> MONITORING_ENABLED=true
> ```
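The enabling step above amounts to uncommenting two lines in the app's `.env` file. Normally you would do this by hand in your editor (for example via `abra app config <app-domain>`). As a minimal sketch only, the same change could be scripted as below. The function name `enable_monitoring` and the path argument are hypothetical, and the sketch assumes the two lines appear commented out exactly as in the sample config.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the monitoring lines in an app .env file.
# Assumes the lines appear exactly as in the sample config:
#   #COMPOSE_FILE="$COMPOSE_FILE:compose.monitoring.yml"
#   #MONITORING_ENABLED=true
enable_monitoring() {
  env_file="$1"
  tmp="$(mktemp)" || return 1
  # Strip the leading '#' from the two monitoring lines only.
  sed -e 's|^#\(COMPOSE_FILE="\$COMPOSE_FILE:compose\.monitoring\.yml"\)|\1|' \
      -e 's|^#\(MONITORING_ENABLED=true\)|\1|' \
      "$env_file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back in place so the original file's permissions are kept.
  cat "$tmp" > "$env_file"
  rm -f "$tmp"
}

# Usage (the path is a placeholder, not a real location):
#   enable_monitoring /path/to/your-app.env
```

All other lines in the file are left untouched, so running the function a second time is harmless.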
### Deploying
Now, undeploy the service, generate the new secrets, and finally re-deploy:
```
abra app undeploy <app-domain>
abra app secret generate --all <app-domain>
abra app deploy <app-domain>
```
### Utilizing metrics
Within your chosen monitoring software (e.g. Telegraf or Prometheus), configure the scrape target to use the `https` scheme and point it at `<app-domain>/metrics`. Requests are authenticated with the `metrics_token` secret generated in the Deploying step.
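Before wiring up a monitoring tool, it can help to check the endpoint by hand. The sketch below is illustrative only: the helper names `metrics_url` and `fetch_metrics` and the variables `GARAGE_DOMAIN` and `METRICS_TOKEN` are hypothetical, and it assumes the token is sent as an `Authorization: Bearer` header. Only the `https` scheme, the `/metrics` path and the `metrics_token` secret come from the text above.

```shell
#!/bin/sh
# Build the metrics URL for an app domain (https scheme, /metrics path).
metrics_url() {
  printf 'https://%s/metrics\n' "$1"
}

# Fetch the metrics once.
# Assumption: the metrics token is accepted as a Bearer token.
fetch_metrics() {
  curl --fail --silent --show-error \
    --header "Authorization: Bearer $2" \
    "$(metrics_url "$1")"
}

# Only touch the network when both (hypothetical) variables are set.
if [ -n "${GARAGE_DOMAIN:-}" ] && [ -n "${METRICS_TOKEN:-}" ]; then
  fetch_metrics "$GARAGE_DOMAIN" "$METRICS_TOKEN" | head -n 20
fi
```

If the request succeeds, the first lines of the plaintext metrics output are printed. A non-zero exit from `curl --fail` suggests the route, scheme or token is wrong.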
## Backups

compose.monitoring.yml Normal file
@ -0,0 +1,25 @@
---
version: "3.8"
services:
  app:
    secrets:
      - source: metrics_token
Brooke marked this conversation as resolved Outdated

i recently discovered the env var is not needed here because the template is rendered locally

Where should this be placed to trigger the conditional in the config? I wasn't sure if it should go in the .env or in the compose file and chose the compose with the idea that it'd be simpler to manage in the config.

as i understand, if it's user-customizable it has to go on `.env` and only defined on compose files if it's needed during container runtime. if it's for templates it's ok for it to only be on `.env`.

everything else looks ok, thanks!
        mode: 0600
      - source: admin_token
        mode: 0600
    deploy:
      labels:
Brooke marked this conversation as resolved Outdated

this means the port is proxied but not protected by https, right? wouldn't it be better to proxy it through the domain name? i'm not sure if you can do this on coopcloud, maybe the paths need to be listed

True, I didn't think of this. Going through a domain name would be easiest, because If I understand the monitoring endpoint correctly it's just served as a plaintext result.
        - "traefik.http.routers.${STACK_NAME}-metrics.rule=Host(`${DOMAIN}`) && Path(`/metrics`)"
        - "traefik.http.routers.${STACK_NAME}-metrics.entrypoints=web-secure"
        - "traefik.http.routers.${STACK_NAME}-metrics.tls.certresolver=${LETS_ENCRYPT_ENV}"
        - "traefik.http.routers.${STACK_NAME}-metrics.service=${STACK_NAME}-metrics"
        - "traefik.http.services.${STACK_NAME}-metrics.loadbalancer.server.port=3903"
secrets:
  admin_token:
    name: ${STACK_NAME}_admin_token_${SECRET_ADMIN_TOKEN_SECRET_VERSION}
    external: true
  metrics_token:
    name: ${STACK_NAME}_metrics_token_${SECRET_METRICS_TOKEN_SECRET_VERSION}
    external: true

@ -23,6 +23,7 @@ services:
- "traefik.http.routers.${STACK_NAME}.rule=Host(`${DOMAIN}`)"
- "traefik.http.routers.${STACK_NAME}.entrypoints=web-secure"
- "traefik.http.routers.${STACK_NAME}.tls.certresolver=${LETS_ENCRYPT_ENV}"
- "traefik.http.routers.${STACK_NAME}.service=${STACK_NAME}"
- "traefik.tcp.routers.${STACK_NAME}-rpc.rule=HostSNI(`*`)"
- "traefik.tcp.routers.${STACK_NAME}-rpc.entrypoints=garage-rpc"
- "traefik.tcp.services.${STACK_NAME}-rpc.loadbalancer.server.port=3901"

@ -27,3 +27,11 @@ bootstrap_peers = [
s3_region = "garage"
api_bind_addr = "[::]:3900"
root_domain = ".s3.garage"
{{ if eq (env "MONITORING_ENABLED") "true" }}
[admin]
api_bind_addr = "[::]:3903"
admin_token_file = "/run/secrets/admin_token"
metrics_require_token = true
metrics_token_file = "/run/secrets/metrics_token"
{{ end }}