6ae2d2cf51 fix(db): make the postgres major-version migration safe and correct
The in-place pg_upgrade in the db entrypoint could crash-loop or fail on real
clusters. This reworks it:

- Idempotent, crash-safe: replace the fragile migration_in_progress marker with
  a state-driven guard on the old_data/new_data scratch dirs. An empty leftover
  means a run was interrupted before any data moved (data still intact at
  $PGDATA) so it is discarded and retried; a non-empty one means data may live
  only there, so it stops for manual recovery. Removes both the
  "mkdir: File exists" crash-loop and the silent fresh-initdb-over-live-data
  window.

- Correct install user: pg_upgrade must run as the old cluster's bootstrap
  superuser (oid 10), and the new cluster must be initialised with that same
  user. It is not necessarily $POSTGRES_USER (clusters created with the default
  "postgres" superuser plus a separate app role are common). Detect it from the
  old cluster (briefly start it and read pg_roles where oid = 10) and use it for
  both the new cluster's initdb and the pg_upgrade -U argument.

- Bump DB_ENTRYPOINT_VERSION to v3 so swarm reloads the (immutable) config.

Verified on cctest: clean 13->17, interrupted-then-retried, and prod-like
clusters whose install user is "postgres" with a separate "discourse" app role.
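
For illustration, the guard and the install-user lookup described above might
look roughly like the sketch below. This is not the recipe's entrypoint: the
function names are hypothetical, the scratch dirs are assumed to sit directly
under $PGDATA, and the lookup assumes the old cluster is already running and
that the role passed in can log in over the local socket.

```shell
#!/bin/bash
# Sketch only: scratch_state, guard_scratch_dirs and detect_install_user are
# hypothetical names, not taken from entrypoint.postgres.sh.tmpl.

# Print "absent", "empty" or "nonempty" for one scratch directory.
scratch_state() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo absent
  elif [ -z "$(ls -A "$dir")" ]; then
    echo empty
  else
    echo nonempty
  fi
}

# Discard empty leftovers (nothing was moved yet, data is still at $PGDATA);
# refuse to continue if a leftover holds files (data may live only there).
# Assumption: the scratch dirs are $PGDATA/old_data and $PGDATA/new_data.
guard_scratch_dirs() {
  local pgdata="$1" dir
  for dir in "$pgdata/old_data" "$pgdata/new_data"; do
    case "$(scratch_state "$dir")" in
      empty) rmdir "$dir" ;;
      nonempty)
        echo "leftover data in $dir: manual recovery needed" >&2
        return 1 ;;
    esac
  done
}

# Print the bootstrap superuser (oid 10) of a running cluster.
# Assumption: $1 is any existing role that is allowed to connect locally.
detect_install_user() {
  local connect_as="$1"
  psql -X -A -t -U "$connect_as" -d template1 \
    -c "SELECT rolname FROM pg_roles WHERE oid = 10"
}
```

A caller would run guard_scratch_dirs "$PGDATA" before creating the scratch
dirs, and pass the output of detect_install_user both to initdb for the new
cluster and to pg_upgrade -U.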
2026-06-16 18:27:45 +00:00
9 changed files with 117 additions and 170 deletions

@ -5,19 +5,17 @@ DOMAIN=discourse.example.com
#EXTRA_DOMAINS=', `www.discourse.example.com`'
LETS_ENCRYPT_ENV=production
# Admin / developer accounts (comma-separated); these become admins on signup
DISCOURSE_DEVELOPER_EMAILS=admin@example.com
# Outgoing email
#DISCOURSE_SMTP_HOST=
#DISCOURSE_SMTP_PORT=
#DISCOURSE_SMTP_USER=
#DISCOURSE_SMTP_PROTOCOL=
#DISCOURSE_SMTP_AUTH=
# Set this if you send e-mails from a different domain than noreply@$DOMAIN
#DISCOURSE_NOTIFICATION_EMAIL=$SMTP_USER
# Outgoing email (official discourse/discourse env names)
#DISCOURSE_SMTP_ADDRESS=
#DISCOURSE_SMTP_PORT=587
#DISCOURSE_SMTP_USER_NAME=
#DISCOURSE_SMTP_AUTHENTICATION=login
#DISCOURSE_SMTP_ENABLE_START_TLS=true
# Set this if you send e-mail from a different address than noreply@$DOMAIN
#DISCOURSE_NOTIFICATION_EMAIL=
# SMTP password as a secret
# SMTP authentication
#COMPOSE_FILE="compose.yml:compose.smtpauth.yml"
#SECRET_SMTP_PASSWORD_VERSION=v1
SECRET_DB_PASSWORD_VERSION=v1

@ -6,54 +6,67 @@ A platform for community discussion
<!-- metadata -->
* **Category**: Apps
* **Status**: 3, experimental
* **Image**: [`discourse/discourse`](https://hub.docker.com/r/discourse/discourse), 4, upstream
* **Status**:
* **Image**: [`bitnami/discourse`](https://hub.docker.com/r/bitnami/discourse)
* **Healthcheck**: yes
* **Backups**: yes
* **Backups**: no
* **Email**: yes
* **Tests**: yes
* **Tests**: no
* **SSO**: no
<!-- endmetadata -->
> **Note**: this recipe runs the official, **experimental** `discourse/discourse`
> image. Upstream does not yet recommend it for production — see
> <https://meta.discourse.org/t/380646>. Use with care.
## Basic usage
1. Set up Docker Swarm and [`abra`]
2. Deploy [`coop-cloud/traefik`]
3. `abra app new discourse --secrets`
4. `abra app config YOURAPPDOMAIN` — set `DOMAIN` and `DISCOURSE_DEVELOPER_EMAILS`
3. `abra app new discourse --secrets` (optionally with `--pass` if you'd like
to save secrets in `pass`)
4. `abra app config YOURAPPDOMAIN` - be sure to change `$DOMAIN` to something that resolves to
your Docker swarm box
5. `abra app deploy YOURAPPDOMAIN`
6. Open the configured domain in your browser to finish set-up. The first account
that registers with an address listed in `DISCOURSE_DEVELOPER_EMAILS` becomes
an admin.
6. Open the configured domain in your browser to finish set-up
[`abra`]: https://docs.coopcloud.tech/abra/
[`coop-cloud/traefik`]: https://git.coopcloud.tech/coop-cloud/traefik
[`abra`]: https://git.autonomic.zone/autonomic-cooperative/abra
[`coop-cloud/traefik`]: https://git.autonomic.zone/coop-cloud/traefik
The app serves plain HTTP on port 80; Traefik terminates TLS in front of it. The
image's built-in nginx/Let's Encrypt is disabled by the recipe (`install-ssl`
override) so it works behind the reverse proxy.
## To add a new admin user
## Add an admin user
1. Login to the instance `abra app run APPNAME app sh`
2. `cd /opt/bitnami/discourse`
3. `RAILS_ENV=production bundle exec rake admin:create` and follow prompts.
```
abra app run YOURAPPDOMAIN app discourse admin create
```
## Install plugins
## Migrating from the previous (bitnami) recipe
1. Login to instance `abra app run APPNAME app sh`
2. `cd /bitnami/discourse/plugins/`
3. `git clone plugin.git` for example `https://github.com/discourse/discourse-openid-connect.git`
4. `abra app restart YOURAPPDOMAIN app`
The official image stores uploads under `/shared` rather than bitnami's
`/bitnami/discourse`. On first boot the recipe copies uploads + backups from the
old bitnami volume (mounted read-only at `/legacy`) into `/shared`, once,
idempotently. The Postgres database is reused as-is. After a successful migration
a later recipe version will drop the transitional `/legacy` mount.
### Events / calendar plugin
If you are upgrading from the bitnami recipe, also remove the now-unused sidekiq
service that swarm leaves behind (sidekiq runs inside the app container now):
We've had some luck running [discourse-events](https://github.com/paviliondev/discourse-events).
```
docker service rm YOURSTACK_sidekiq
```
## Setup Notes
Until issue #1 is fixed, the default user is `user` and the default password is `bitnami123`
## Postgres major version upgrades
Welcome to hell.
1. `abra app run YOURAPPDOMAIN db pg_dumpall -U discourse | gzip > YOURAPPDOMAIN_db_DATE.sql.gz`
2. `abra app volume ls YOURAPPDOMAIN`, find the name of the Postgres data volume
3. `scp` the backup to your VPS
4. `abra app undeploy YOURAPPDOMAIN`
5. `abra app volume rm YOURAPPDOMAIN`, choose the Postgres data volume
6. `abra app deploy YOURAPPDOMAIN`, then `abra app undeploy YOURAPPDOMAIN`
7. `ssh` to the VPS, run (replacing `13-alpine` with the new Postgres version)
`docker run -v YOURDATAVOLUME:/var/lib/postgresql/data -e POSTGRES_HOST_AUTH_METHOD=trust -it postgres:13-alpine`
8. In another SSH session on the server, run `docker ps` to find the ID of the
new Postgres container, then `docker exec -it CONTAINERID bash`
9. In the shell you just launched, run `dropdb -U discourse discourse`, then
`createdb -U discourse discourse`, then Ctrl+D or run `exit`
10. In the second SSH session, run `zcat YOURAPPDOMAIN_db_DATE.sql.gz | docker exec -it CONTAINERID psql -U discourse`
11. Exit the second SSH session
12. Back in the first SSH session, Ctrl+C to shut down the database
13. `abra app deploy YOURAPPDOMAIN`

@ -1,5 +1,2 @@
export DB_ENTRYPOINT_VERSION=v3
export PG_BACKUP_VERSION=v2
export APP_ENTRYPOINT_VERSION=v2
export APP_INSTALL_SSL_VERSION=v1
export APP_MIGRATE_UPLOADS_VERSION=v1

@ -1,11 +0,0 @@
#!/bin/bash
# Overrides the official image's /etc/runit/1.d/install-ssl.
#
# The stock install-ssl always runs configure-ssl (and configure-letsencrypt),
# which empties the default `listen 80` nginx outlet and switches to `listen 443
# ssl` against a cert that does not exist here — nginx then crash-loops, or the
# image tries to obtain its own Let's Encrypt cert. Under Co-op Cloud, Traefik
# terminates TLS and proxies plain HTTP to port 80, so we skip the image's SSL
# setup entirely and let nginx keep its default HTTP-on-80 config.
echo "install-ssl overridden by recipe: serving plain HTTP on :80 behind Traefik"
exit 0

@ -1,15 +0,0 @@
#!/bin/bash
# Co-op Cloud wrapper around the official image's /sbin/boot.
# discourse/discourse reads passwords from the process env (pups/Ruby; it has no
# *_FILE support), so inject them from the docker secrets before booting.
set -e
if [ -f /run/secrets/db_password ]; then
export DISCOURSE_DB_PASSWORD="$(cat /run/secrets/db_password)"
fi
if [ -f /run/secrets/smtp_password ]; then
export DISCOURSE_SMTP_PASSWORD="$(cat /run/secrets/smtp_password)"
fi
exec /sbin/boot

@ -3,8 +3,14 @@ version: "3.8"
services:
app:
# the wrapper entrypoint reads /run/secrets/smtp_password and exports
# DISCOURSE_SMTP_PASSWORD (the official image has no *_FILE support)
environment:
- DISCOURSE_SMTP_PASSWORD_FILE=/var/run/secrets/smtp_password
secrets:
- smtp_password
sidekiq:
environment:
- DISCOURSE_SMTP_PASSWORD_FILE=/var/run/secrets/smtp_password
secrets:
- smtp_password

@ -3,64 +3,53 @@ version: "3.8"
services:
app:
image: discourse/discourse:3.5.3
image: bitnamilegacy/discourse:3.5.0
networks:
- proxy
- internal
# official image CMD is /sbin/boot; wrapper injects the DB password secret first
entrypoint: /usr/local/bin/cc-app-entrypoint.sh
# entrypoint: ['tail', '-f', '/dev/null']
environment:
- DISCOURSE_HOSTNAME=${DOMAIN}
- DISCOURSE_DEVELOPER_EMAILS=${DISCOURSE_DEVELOPER_EMAILS}
- DISCOURSE_DB_HOST=${STACK_NAME}_db
- DISCOURSE_DB_PORT=5432
- DISCOURSE_DB_NAME=discourse
- DISCOURSE_DB_USERNAME=discourse
- DISCOURSE_REDIS_HOST=${STACK_NAME}_redis
- DISCOURSE_REDIS_PORT=6379
- DISCOURSE_SMTP_ADDRESS
- DISCOURSE_SMTP_PORT
- DISCOURSE_SMTP_USER_NAME
- DISCOURSE_SMTP_PASSWORD
- DISCOURSE_SMTP_AUTHENTICATION
- DISCOURSE_SMTP_ENABLE_START_TLS
- ALLOW_EMPTY_PASSWORD=yes
- DISCOURSE_DATABASE_HOST=${STACK_NAME}_db
- DISCOURSE_DATABASE_NAME=discourse
- DISCOURSE_DATABASE_PASSWORD_FILE=/run/secrets/db_password
- DISCOURSE_DATABASE_USER=discourse
- DISCOURSE_HOST=${DOMAIN}
- DISCOURSE_NOTIFICATION_EMAIL
- DISCOURSE_SMTP_AUTH
- DISCOURSE_SMTP_HOST
- DISCOURSE_SMTP_PORT
- DISCOURSE_SMTP_PROTOCOL
- DISCOURSE_SMTP_USER
- PASSENGER_COMPILE_NATIVE_SUPPORT_BINARY=0
volumes:
- 'discourse_shared:/shared'
# transition only: legacy bitnami volume, read-only, for one-time upload migration
- 'discourse_data:/legacy:ro'
- 'discourse_data:/bitnami/discourse'
secrets:
- db_password
configs:
- source: app_entrypoint
target: /usr/local/bin/cc-app-entrypoint.sh
mode: 0555
- source: app_install_ssl
target: /etc/runit/1.d/install-ssl
mode: 0555
- source: app_migrate_uploads
target: /etc/runit/1.d/02-migrate-bitnami-uploads
mode: 0555
depends_on:
- db
- redis
deploy:
update_config:
failure_action: rollback
order: stop-first
order: start-first
labels:
- "traefik.enable=true"
- "traefik.http.services.${STACK_NAME}.loadbalancer.server.port=80"
- "traefik.http.services.${STACK_NAME}.loadbalancer.server.port=3000"
- "traefik.http.routers.${STACK_NAME}.rule=Host(`${DOMAIN}`${EXTRA_DOMAINS})"
- "traefik.http.routers.${STACK_NAME}.entrypoints=web-secure"
- "traefik.http.routers.${STACK_NAME}.tls.certresolver=${LETS_ENCRYPT_ENV}"
- "coop-cloud.${STACK_NAME}.version=1.0.0+3.5.3"
## Redirect from EXTRA_DOMAINS to DOMAIN
#- "traefik.http.routers.${STACK_NAME}.middlewares=${STACK_NAME}-redirect"
#- "traefik.http.middlewares.${STACK_NAME}-redirect.headers.SSLForceHost=true"
#- "traefik.http.middlewares.${STACK_NAME}-redirect.headers.SSLHost=${DOMAIN}"
- "coop-cloud.${STACK_NAME}.version=0.8.0+3.5.0"
healthcheck:
test: "curl -fsS http://localhost/srv/status || exit 1"
test: "ruby -e \"require 'uri'; require 'net/http'; uri = URI('http://localhost:3000/srv/status'); res = Net::HTTP.get_response(uri); if res.is_a?(Net::HTTPSuccess) then exit (0) else exit (1) end\""
interval: 30s
timeout: 10s
retries: 6
start_period: 25m
start_period: 20m
db:
image: pgvector/pgvector:pg17
@ -83,15 +72,6 @@ services:
- POSTGRES_USER=discourse
- POSTGRES_DB=discourse
- POSTGRES_PASSWORD_FILE=/run/secrets/db_password
healthcheck:
test: "pg_isready -U discourse -d discourse"
interval: 30s
timeout: 10s
retries: 5
# generous: a postgres major-version upgrade (apt install + pg_upgrade) runs
# in the entrypoint before the server accepts connections — don't let the
# healthcheck kill an in-progress migration
start_period: 10m
deploy:
labels:
backupbot.backup: "true"
@ -105,12 +85,35 @@ services:
- internal
volumes:
- 'redis_data:/data'
healthcheck:
test: "redis-cli ping | grep -q PONG"
interval: 30s
timeout: 5s
retries: 5
start_period: 30s
sidekiq:
image: bitnamilegacy/discourse:3.5.0
networks:
- proxy
- internal
depends_on:
- discourse
volumes:
- 'discourse_data:/bitnami/discourse'
command: /opt/bitnami/scripts/discourse-sidekiq/run.sh
secrets:
- db_password
environment:
- ALLOW_EMPTY_PASSWORD=yes
- DISCOURSE_DATABASE_HOST=db
- DISCOURSE_DATABASE_NAME=discourse
- DISCOURSE_DATABASE_PASSWORD_FILE=/run/secrets/db_password
- DISCOURSE_DATABASE_PORT_NUMBER=5432
- DISCOURSE_DATABASE_USER=discourse
- DISCOURSE_HOST=${DOMAIN}
- DISCOURSE_REDIS_HOST=redis
- DISCOURSE_REDIS_PORT_NUMBER=6379
- DISCOURSE_SMTP_HOST
- DISCOURSE_SMTP_PORT
- DISCOURSE_SMTP_PROTOCOL
- DISCOURSE_SMTP_USER
- PASSENGER_COMPILE_NATIVE_SUPPORT_BINARY=0
- DISCOURSE_SMTP_AUTH
secrets:
db_password:
@ -120,7 +123,6 @@ secrets:
volumes:
postgresql_data:
redis_data:
discourse_shared:
discourse_data:
networks:
@ -129,15 +131,6 @@ networks:
internal:
configs:
app_entrypoint:
name: ${STACK_NAME}_app_entrypoint_${APP_ENTRYPOINT_VERSION}
file: cc-app-entrypoint.sh
app_install_ssl:
name: ${STACK_NAME}_app_install_ssl_${APP_INSTALL_SSL_VERSION}
file: app-install-ssl.sh
app_migrate_uploads:
name: ${STACK_NAME}_app_migrate_uploads_${APP_MIGRATE_UPLOADS_VERSION}
file: migrate-uploads.sh
db_entrypoint:
name: ${STACK_NAME}_db_entrypoint_${DB_ENTRYPOINT_VERSION}
file: entrypoint.postgres.sh.tmpl

@ -1,24 +0,0 @@
#!/bin/bash
# One-time, idempotent, NON-destructive migration of uploads + backups from a
# legacy bitnami discourse volume into the official image's /shared.
#
# Runs on every boot as a runit 1.d hook but no-ops after the first success
# (sentinel) and when there is no legacy volume mounted (fresh installs). It only
# ever COPIES from the read-only /legacy mount, so an interruption just re-copies
# on the next boot — there is no move/delete to leave the data half-migrated.
set -e
SENTINEL=/shared/.bitnami-uploads-migrated
[ -e "$SENTINEL" ] && exit 0
if [ -d /legacy/public/uploads ]; then
echo "[migrate-uploads] copying bitnami uploads/backups -> /shared"
mkdir -p /shared/uploads /shared/backups
cp -a /legacy/public/uploads/. /shared/uploads/ 2>/dev/null || true
cp -a /legacy/public/backups/. /shared/backups/ 2>/dev/null || true
# discourse runs as uid 1000; the official boot also chowns /shared, but be explicit
chown -R discourse:discourse /shared/uploads /shared/backups 2>/dev/null || true
echo "[migrate-uploads] done"
fi
touch "$SENTINEL"

@ -1,10 +0,0 @@
This release switches from the bitnami image to the official discourse/discourse
image. Some env vars need to be renamed for this migration; everything else
should happen automatically.
Rename these in your app's .env (the values carry over):
DISCOURSE_SMTP_HOST --> DISCOURSE_SMTP_ADDRESS
DISCOURSE_SMTP_USER --> DISCOURSE_SMTP_USER_NAME
DISCOURSE_SMTP_AUTH --> DISCOURSE_SMTP_AUTHENTICATION
DISCOURSE_SMTP_PROTOCOL --> DISCOURSE_SMTP_ENABLE_START_TLS (takes a boolean true/false, not the old tls/ssl value, so translate it rather than copying it straight across)
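
As a rough sketch, the three straight renames above could be applied with sed.
The helper name rename_smtp_vars is hypothetical, and it deliberately leaves
DISCOURSE_SMTP_PROTOCOL alone, since that value has to be translated by hand:

```shell
# Sketch only: rename_smtp_vars is a hypothetical helper, not part of the recipe.
# It reads an .env file on stdin and writes the renamed version to stdout.
# Commented-out entries (leading '#') are renamed too; values are untouched.
# DISCOURSE_SMTP_PROTOCOL is not handled here and must be translated manually.
rename_smtp_vars() {
  sed -E \
    -e 's/^(#?)DISCOURSE_SMTP_HOST=/\1DISCOURSE_SMTP_ADDRESS=/' \
    -e 's/^(#?)DISCOURSE_SMTP_USER=/\1DISCOURSE_SMTP_USER_NAME=/' \
    -e 's/^(#?)DISCOURSE_SMTP_AUTH=/\1DISCOURSE_SMTP_AUTHENTICATION=/'
}
```

Because each pattern ends at the `=`, running it on an already-renamed file
changes nothing. Review the output before replacing your app's .env.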