yet another try on the monitoring stack
Philipp Rothmann d5a34436f9 wip 2023-02-09 10:07:29 +01:00
.env.sample wip 2023-02-09 10:07:29 +01:00
.gitignore init 2022-03-31 14:52:21 +02:00
README.md init 2022-03-31 14:52:21 +02:00
abra.sh wip 2023-02-09 10:07:29 +01:00
alertmanager.yml.tmpl init 2022-03-31 14:52:21 +02:00
compose.grafana.yml wip 2023-02-09 10:07:29 +01:00
compose.loki.yml wip 2023-02-09 10:07:29 +01:00
compose.metrics.yml wip 2023-02-09 10:07:29 +01:00
compose.prometheus.yml wip 2023-02-09 10:07:29 +01:00
compose.promtail.yml wip 2023-02-09 10:07:29 +01:00
compose.yml wip 2023-02-09 10:07:29 +01:00
grafana_custom.ini init 2022-03-31 14:52:21 +02:00
loki.yml.tmpl init 2022-03-31 14:52:21 +02:00
node-exporter-entrypoint.sh wip 2023-02-09 10:07:29 +01:00
prometheus.yml.tmpl init 2022-03-31 14:52:21 +02:00
prometheus_web.yml.tmpl init 2022-03-31 14:52:21 +02:00
promtail.yml.tmpl wip 2023-02-09 10:07:29 +01:00

README.md

monitoring-lite

A centralised grafana/prometheus/loki stack. This is an alternative approach to coop-cloud/monitoring that does not include any of the services which actually gather metrics and/or logs. Instead, this is a useful recipe for folks who need to centralise their monitoring stack into a single grafana/prometheus/loki & several instances of node_exporter/cadvisor/promtail.

  • Category: Apps
  • Status: 2, beta
  • Image: grafana/grafana, 4, upstream
  • Healthcheck: 3
  • Backups: 1
  • Email: 3
  • Tests: No
  • SSO: 1

Setup

This stack requires 3 domains: one each for grafana, prometheus & loki. This is due to the need for the gathering tools, such as node_exporter, to have a publicly accessible URL for making connections. We make use of the internal prometheus HTTP basic auth & wire up an Nginx proxy with HTTP basic auth for loki. Grafana uses Keycloak OpenID Connect sign in. The alertmanager setup remains internal and is only connected with grafana. It also assumes that you are deploying the coop-cloud/gathering recipe on the machines that you want to gather metrics & logs from. Each instance of the gathering recipe will report back to and/or be scraped by your central install of monitoring-lite.
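Since both prometheus and loki sit behind HTTP basic auth, a quick way to confirm that the public domains and credentials work is to send an authenticated request to each. The following is a minimal sketch, not part of the recipe: the example.com domains and the credentials are hypothetical placeholders, and it assumes the standard health endpoints (`/-/healthy` for prometheus, `/ready` for loki) are reachable through the proxy.

```python
import base64
import urllib.request

# Hypothetical placeholders: substitute your own domains.
PROMETHEUS_URL = "https://prometheus.example.com"
LOKI_URL = "https://loki.example.com"


def basic_auth_header(user: str, password: str) -> str:
    """Return the value of an HTTP basic auth Authorization header."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def authed_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying basic auth credentials."""
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", basic_auth_header(user, password))
    return request


def is_reachable(request: urllib.request.Request, timeout: float = 10.0) -> bool:
    """Send the request and report whether the server answered with HTTP 200."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

For example, `is_reachable(authed_request(PROMETHEUS_URL + "/-/healthy", user, password))` should return `True` once the prometheus domain and its basic auth credentials are set up, and the same call against `LOKI_URL + "/ready"` checks the Nginx proxy in front of loki.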

Post-setup guide

  • configure prometheus/loki/alertmanager as data sources in grafana under Configuration > Data sources

    • for loki, you need to set a "Custom HTTP Header": X-Scope-OrgID: fake
  • configure the SMTP mailer under Alerting > Contact points

    • edit the default contact point, choose "Alertmanager" as type & http://alertmanager:9093 as URL
    • use the "Test" button to send a test mail. It should fire a request at the alertmanager & that should send a mail
  • use abra app cp to copy your scrape_configs: ... into /prometheus/scrape_configs & log into your prometheus web UI to ensure they're working

  • load your dashboards in manually under Create > Dashboard

  • from your dashboard panels, choose Edit > Alert to create alerts based on those panels
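Two of the steps above can also be checked from a script rather than the web UI. The sketch below makes some assumptions: the loki domain is a hypothetical placeholder, the helper names are invented for illustration, and basic auth is left out for brevity (an `Authorization` header has to be added before these requests are sent to a real install). It builds a loki request carrying the same `X-Scope-OrgID: fake` header that the grafana data source needs, and lists scrape targets that prometheus does not report as healthy, based on the JSON returned by its `/api/v1/targets` endpoint.

```python
import json
import urllib.request


def loki_labels_request(base_url: str, tenant: str = "fake") -> urllib.request.Request:
    """Build (but do not send) a request for loki's label names.

    The X-Scope-OrgID header mirrors the "Custom HTTP Header" configured
    on the grafana data source.
    """
    url = base_url.rstrip("/") + "/loki/api/v1/labels"
    return urllib.request.Request(url, headers={"X-Scope-OrgID": tenant})


def unhealthy_targets(body: str) -> list:
    """Return (scrapeUrl, lastError) pairs for targets whose health is not "up".

    `body` is the JSON text returned by prometheus at /api/v1/targets.
    """
    payload = json.loads(body)
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (target.get("scrapeUrl", ""), target.get("lastError", ""))
        for target in targets
        if target.get("health") != "up"
    ]
```

An empty list from `unhealthy_targets` means every active target was scraped successfully on the last attempt; any entries point at the scrape config or the gathering host that needs attention.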