fix(db): wait for/retry DB DNS resolution before drizzle-kit migrate (GRO-2163) #161

Merged
Flea Flicker merged 4 commits from fix/gro-2163-migrate-pre-dns-wait into dev 2026-06-08 13:37:30 +00:00

4 Commits

Author SHA1 Message Date
Flea Flicker 840675e89e Merge remote-tracking branch 'origin/dev' into fix/gro-2163-migrate-pre-dns-wait
CI / Test (pull_request) Successful in 27s
CI / Lint & Typecheck (pull_request) Successful in 31s
CI / Build & Push Docker Images (pull_request) Successful in 1m18s
2026-06-08 13:34:53 +00:00
Flea Flicker 680cfa2bf5 fix(db): run wait-for-db inline in migrate/seed/reset (pnpm skips pre-* hooks)
CI / Test (pull_request) Failing after 10m16s
CI / Lint & Typecheck (pull_request) Failing after 10m23s
CI / Build & Push Docker Images (pull_request) Has been skipped
pnpm 9 does not auto-run npm pre-* lifecycle scripts (enable-pre-post-scripts
defaults to false), so the pre-migrate/pre-seed/pre-reset hooks added in the
prior commit never executed under the Dockerfile entrypoint
`pnpm --filter @groombook/db migrate`. Chain wait-for-db.mjs directly into the
migrate/seed/reset scripts so the DNS pre-resolve actually runs on the real
invocation path. Verified locally that `pnpm --filter @groombook/db migrate`
now runs wait-for-db before drizzle-kit. (GRO-2163)

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-08 05:40:26 +00:00
Flea Flicker c5c62b5162 Merge remote-tracking branch 'origin/dev' into fix/gro-2163-migrate-pre-dns-wait 2026-06-08 05:40:26 +00:00
Flea Flicker 323f6d6bcb fix(db): wait for/retry DB DNS resolution before drizzle-kit migrate (GRO-2163)
CI / Test (pull_request) Successful in 10s
CI / Lint & Typecheck (pull_request) Successful in 16s
CI / Build & Push Docker Images (pull_request) Failing after 30m28s
A fresh migrate-schema pod occasionally hits a transient CoreDNS miss
(EAI_AGAIN) on groombook-postgres-rw.<ns>.svc on its first attempt.
With backoffLimit: 2 the retry pod usually wins, but three unlucky
attempts in a row trip BackoffLimitExceeded and the Job is recreated
on every Flux reconcile (3+ Completed events observed in 8 min in uat).

Add packages/db/scripts/wait-for-db.mjs: a tiny no-deps Node 22 script
that parses DATABASE_URL, resolves the hostname via node:dns.promises
with exponential backoff (12 attempts, ~30s total) and only exits 0
once a real IP is returned. EAI_AGAIN / ENOTFOUND / EAI_NODATA are
retried; any other DNS error is surfaced so drizzle-kit gets a clear
message instead of being starved by retries.
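
The retry loop described above might look roughly like the sketch below.
It is not the contents of wait-for-db.mjs: the function names, the injected
`lookup`/`sleep` parameters, and the 100 ms / 5000 ms delay values are
assumptions, picked so that 12 attempts add up to roughly 31 s of waiting.

```javascript
// Illustrative sketch only; names and delay values are assumptions.
const RETRYABLE = new Set(['EAI_AGAIN', 'ENOTFOUND', 'EAI_NODATA']);

// Extract the hostname that must resolve before drizzle-kit connects.
function hostFromDatabaseUrl(databaseUrl) {
  return new URL(databaseUrl).hostname;
}

// Exponential backoff: base * 2^attempt, capped at maxDelayMs.
function backoffDelay(attempt, baseDelayMs, maxDelayMs) {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

// Call `lookup(host)` until it yields an address or attempts run out.
// `lookup` and `sleep` are injected so the loop can run without real DNS.
async function waitForDns(host, options) {
  const { lookup, sleep, maxAttempts = 12, baseDelayMs = 100, maxDelayMs = 5000 } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const { address } = await lookup(host);
      return address;
    } catch (err) {
      // Non-transient DNS errors are surfaced immediately.
      if (!RETRYABLE.has(err.code)) throw err;
      // Out of attempts: surface the last transient error.
      if (attempt === maxAttempts - 1) throw err;
      await sleep(backoffDelay(attempt, baseDelayMs, maxDelayMs));
    }
  }
}
```

In a real script, `lookup` could be `lookup` from `node:dns/promises` and
`sleep` could be `setTimeout` from `node:timers/promises`. The commit message
only says node:dns.promises, so the exact call is a guess.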

Wire it as a pnpm `pre-migrate` (and `pre-seed` / `pre-reset`) hook
in @groombook/db so pnpm auto-runs it before any of the data-plane
commands. Mirrors the belt-and-braces pattern used in GRO-1985
(disable Corepack download fallback): do not try to outsmart CoreDNS;
just avoid asking drizzle-kit to perform the very first DNS lookup of a
freshly scheduled pod.

Defaults are env-tunable (WAIT_FOR_DB_MAX_ATTEMPTS, _BASE_DELAY_MS,
_MAX_DELAY_MS, _SKIP) so a future uat-debug pod can sidestep the
wait if needed.
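
One way the env-tunable defaults could be read is sketched below. It assumes
the shorthand above expands to WAIT_FOR_DB_BASE_DELAY_MS,
WAIT_FOR_DB_MAX_DELAY_MS and WAIT_FOR_DB_SKIP. The helper names, the fallback
delay values, and the accepted truthy values for the skip flag are
hypothetical, not taken from wait-for-db.mjs.

```javascript
// Illustrative sketch only; helper names and fallback values are assumptions.

// Parse a positive integer from env, falling back when unset or invalid.
function intFromEnv(env, name, fallback) {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Collect the wait-for-db tunables from an env-like object.
function readWaitConfig(env) {
  return {
    // Assumed truthy spellings for the skip flag.
    skip: env.WAIT_FOR_DB_SKIP === '1' || env.WAIT_FOR_DB_SKIP === 'true',
    maxAttempts: intFromEnv(env, 'WAIT_FOR_DB_MAX_ATTEMPTS', 12),
    baseDelayMs: intFromEnv(env, 'WAIT_FOR_DB_BASE_DELAY_MS', 100),
    maxDelayMs: intFromEnv(env, 'WAIT_FOR_DB_MAX_DELAY_MS', 5000),
  };
}
```

A debug pod could then set `WAIT_FOR_DB_SKIP=1`, and a script calling
`readWaitConfig(process.env)` would exit 0 before any DNS lookup.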

Refs: GRO-2163, GRO-1985.
2026-06-08 00:31:38 +00:00