pg_upgrade must run as the old cluster's bootstrap superuser (OID 10), and the
new cluster must be initialised with that same user; otherwise pg_upgrade fails
its "database user is the install user" consistency check. The install user is
not necessarily $POSTGRES_USER: clusters created with the default "postgres"
superuser plus a separate app role (e.g. discourse) are common.

Detect it from the old cluster by briefly starting it and, connected as the
known app role, reading the name of the role with oid = 10 from pg_roles. Use
that name for both the new cluster's initdb and the pg_upgrade -U argument.
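
A minimal sketch of that detection step, not the entrypoint's actual code. The
function name and its arguments are hypothetical, and it assumes the old
major version's binaries are still installed and that local socket connections
as the app role are allowed by pg_hba.conf without a password prompt:

```shell
# Sketch only: detect_install_user and its arguments are hypothetical names.
# Prints the name of the role with oid 10 in the old cluster.
detect_install_user() {
    old_bindir=$1          # bin directory of the OLD PostgreSQL major version
    old_data=$2            # data directory of the old cluster
    app_role=$3            # a role known to be able to log in, e.g. $POSTGRES_USER
    app_db=${4:-postgres}  # a database that role can connect to

    # Start the old server with TCP disabled so nothing else can connect
    # while it is briefly up; -w waits until it accepts connections.
    "$old_bindir/pg_ctl" -D "$old_data" -o "-c listen_addresses=''" -w start >&2 \
        || return 1

    # -X skips psqlrc; -A -t give unaligned, tuples-only output (just the name).
    status=0
    install_user=$("$old_bindir/psql" -X -A -t -U "$app_role" -d "$app_db" \
        -c "SELECT rolname FROM pg_roles WHERE oid = 10") || status=$?

    # Always stop the old server again, even if the query failed.
    "$old_bindir/pg_ctl" -D "$old_data" -m fast -w stop >&2 || return 1

    [ "$status" -eq 0 ] && [ -n "$install_user" ] || return 1
    printf '%s\n' "$install_user"
}
```

The printed name would then be passed as initdb -U for the new cluster and as
pg_upgrade -U for the upgrade itself.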

The PostgreSQL major-version migration in the db entrypoint was not safe to
re-run. If the container was killed mid-migration, it could crash-loop forever
("mkdir: cannot create directory .../old_data: File exists") or, if it died
after PG_VERSION had been moved out of $PGDATA but before the in-progress
marker was written, silently initdb a fresh empty cluster over the live data.

Replace the marker file with a state-driven guard keyed on the scratch dirs:
if old_data/new_data exist but are empty, the run was interrupted before any
data moved, so discard them and retry (idempotent); if either is non-empty, the
only copy of the data may live there, so stop for manual recovery. Bump
DB_ENTRYPOINT_VERSION v1->v2 so swarm picks up the new (immutable) config.
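
The guard can be sketched roughly as follows. This is an illustration under
assumptions, not the entrypoint's actual code: the function names and the
return-code convention are hypothetical, and the caller supplies the scratch
directory paths:

```shell
# Sketch only: function names and return codes are hypothetical.

dir_is_empty() {
    # True when the directory has no entries, including dotfiles.
    [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Returns 0 when it is safe to (re)start the migration,
# 1 when a scratch dir holds data and a human must recover it.
check_scratch_dirs() {
    for dir in "$@"; do
        [ -e "$dir" ] || continue   # never created: nothing to clean up
        if [ -d "$dir" ] && dir_is_empty "$dir"; then
            # Interrupted before any data moved: discard so the run can retry.
            # rmdir refuses to delete a non-empty directory, as a second check.
            rmdir "$dir" || return 1
        else
            echo "refusing to continue: $dir is not empty; recover manually" >&2
            return 1
        fi
    done
    return 0
}
```

An entrypoint would call it with the old_data and new_data paths before
creating them, and exit non-zero without touching anything if it returns 1.
Because the decision depends only on what is on disk, a kill at any point
leaves a state the next run can classify.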