discourse

coop-cloud/discourse

Author: notplants
SHA1: 6ae2d2cf51

fix(db): make the postgres major-version migration safe and correct

The in-place pg_upgrade in the db entrypoint could crash-loop or fail on real
clusters. This reworks it:

- Idempotent, crash-safe: replace the fragile migration_in_progress marker with
  a state-driven guard on the old_data/new_data scratch dirs. An empty leftover
  means a run was interrupted before any data moved (data still intact at
  $PGDATA) so it is discarded and retried; a non-empty one means data may live
  only there, so it stops for manual recovery. Removes both the
  "mkdir: File exists" crash-loop and the silent fresh-initdb-over-live-data
  window.

- Correct install user: pg_upgrade must run as the old cluster's bootstrap
  superuser (oid 10), and the new cluster must be initialised with that same
  user. It is not necessarily $POSTGRES_USER (clusters created with the default
  "postgres" superuser plus a separate app role are common). Detect it from the
  old cluster (briefly start it and read pg_roles where oid = 10) and use it for
  both the new cluster's initdb and the pg_upgrade -U argument.

- Bump DB_ENTRYPOINT_VERSION to v3 so swarm reloads the (immutable) config.
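
The state-driven guard from the first bullet could be sketched roughly as below. This is an illustration, not the entrypoint's actual code: the function names `scratch_state` and `migration_guard` are hypothetical, and the two-pass layout is an assumption chosen so that nothing is deleted when any scratch dir still holds files.

```shell
# Sketch only: function names and the two-pass layout are assumptions,
# not the commit's actual code.

# Print "absent", "empty", or "nonempty" for a scratch directory.
scratch_state() {
    if [ ! -d "$1" ]; then
        echo absent
    elif [ -z "$(ls -A "$1" 2>/dev/null)" ]; then
        echo empty
    else
        echo nonempty
    fi
}

# Print "proceed" if a migration may start, or "stop" (and return 1) if
# any scratch dir holds files and needs manual recovery.
migration_guard() {
    # Pass 1: refuse to touch anything if data may live only in a scratch dir.
    for scratch in "$@"; do
        if [ "$(scratch_state "$scratch")" = nonempty ]; then
            echo stop
            return 1
        fi
    done
    # Pass 2: empty leftovers mean the earlier run moved no data; discard them.
    for scratch in "$@"; do
        if [ "$(scratch_state "$scratch")" = empty ]; then
            rmdir "$scratch" || return 1
        fi
    done
    echo proceed
}
```

Under these assumptions it would be called as `migration_guard "$PGDATA/old_data" "$PGDATA/new_data"` before any data is moved; where the real scratch dirs live is not stated in the message, so those paths are placeholders.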

Verified on cctest: clean 13->17, interrupted-then-retried, and prod-like
clusters whose install user is "postgres" with a separate "discourse" app role.
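
The install-user detection from the second bullet might look something like the following. It is a sketch under several assumptions: the old cluster is already running and accepts local connections, the role passed in can log in (any role can read `pg_roles`), `template1` exists, and the `PSQL` override and both function names are hypothetical. How the real entrypoint connects is not stated in the message.

```shell
# Sketch only: PSQL, the connect role, and template1 are assumptions.

# Print the name of the role with oid 10 in the running cluster.
# $1 is any role that can log in; pg_roles is readable by every role.
detect_install_user() {
    name=$("${PSQL:-psql}" -X -At -U "$1" -d template1 \
        -c "SELECT rolname FROM pg_roles WHERE oid = 10") || return 1
    [ -n "$name" ] || return 1
    printf '%s\n' "$name"
}

# Initialise the new cluster and run pg_upgrade as the same install user.
# Both clusters must be stopped when this runs.
# Arguments: install user, old bindir, new bindir, old datadir, new datadir.
upgrade_as() {
    "$3/initdb" -U "$1" -D "$5" || return 1
    "$3/pg_upgrade" -U "$1" -b "$2" -B "$3" -d "$4" -D "$5"
}
```

On a cluster like the prod-like case above, `detect_install_user discourse` would be expected to print the bootstrap superuser's name rather than the app role, and that name is what both `initdb -U` and `pg_upgrade -U` receive.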

Date: 2026-06-16 18:27:45 +00:00

Compare commits

1 Commits

idempotent4 .. idempotent2

Diff Content Not Available
