fix(docker): bake pnpm via npm to remove Corepack runtime downloads (GRO-1981) #129
Summary
Hardens the GRO-1983 fast restoration to make the EAI_AGAIN class of failures actually go away, not just self-heal on retry.
The GRO-1997 evidence gate showed the first `reset-demo-data` pod (`...-nh7vg`) still hit `getaddrinfo EAI_AGAIN registry.npmjs.org` before the retry succeeded. The `COREPACK_HOME` + writable emptyDir fix in GRO-1984 made the cache writable; it did not eliminate the cold-cache registry download. This PR closes the loop.
Changes
Dockerfile:
- Adds `ENV COREPACK_ENABLE_DOWNLOAD_FALLBACK=0` to the `base` and `runner` stages. Belt-and-braces: even if a Corepack shim is somehow re-introduced, it will not silently try to download pnpm at runtime.
- Adds `ENV HOME=/tmp` to the `migrate`, `seed`, and `reset` stages. Under `readOnlyRootFilesystem: true` + `runAsUser: 1000`, the default HOME is read-only; pnpm fails the first time it tries to write any config or state. The job pods already mount a writable emptyDir at `/tmp`, so we point HOME there.
- The `RUN mkdir -p /home/node/.cache/node/corepack` in the `builder` stage (mentioned in the spec) was already removed in GRO-1909 (commit `0a3eb8a`), so nothing to do there.
.gitea/workflows/ci.yml:
- Adds blackhole smoke tests for the `seed` and `reset` images, matching the existing pattern for `migrate`. Each one pulls the built image, runs a container with `registry.npmjs.org` pointed at `127.0.0.1`, and asserts that `which pnpm` resolves to `/usr/local/bin/pnpm` (real binary, not a Corepack shim) and that `pnpm --version` succeeds with no network egress. If Corepack ever sneaks back in, CI catches it on every PR.
Validation
- CI pulls the `seed` and `reset` images, runs them with `registry.npmjs.org → 127.0.0.1`, and asserts `which pnpm` → `/usr/local/bin/pnpm` and `pnpm --version` exits 0. This is the same blackhole pattern that already exists for `migrate`, extended to the other two runtime stages.
- Before `npm install -g pnpm@9.15.4` runs, `/usr/local/bin/pnpm` is a Corepack shim (→ `../lib/node_modules/corepack/dist/pnpm.js`), the exact failure mode this PR fixes. The install overwrites it with a real pnpm binary (→ `../lib/node_modules/pnpm/bin/pnpm.cjs`). `pnpm --version` on the shim immediately tries to phone `registry.npmjs.org`, which is the EAI_AGAIN the GRO-1997 evidence gate captured.
- `pnpm --filter @groombook/db reset` against a throwaway Postgres with `--network none`: the CI blackhole test is the closest equivalent reproducible from the build pipeline; a full network-none run requires a throwaway Postgres, which CI does not provision. The CTO will get the full reset run in UAT as the natural next step.
Acceptance criteria
- `reset`/`seed`/`migrate` images run their commands offline (no registry access) with `readOnlyRootFilesystem: true`, `runAsNonRoot: true`, `runAsUser: 1000`; enforced by the new CI smoke tests.
- cc @cpfarhood.
Follow-on cleanup (deferred, not in this PR)
Per the issue: the per-job `COREPACK_HOME` env vars and `node-cache` emptyDir mounts in `groombook/infra` (seed-job, migrate-job, reset CronJob overlays) become unnecessary once this image is live, and dev/prod reset CronJobs pin the mutable `reset:latest` tag while UAT pins a versioned one. I'm intentionally not removing those in this PR: coordinating a flag-day infra cleanup with the image rollout is a separate change, and rolling it out half-and-half (new image, old infra) is fine because the new infra is no stricter than the old.
Will open a linked `groombook/infra` PR after this lands and the new image has been deployed to UAT.
cc @cpfarhood
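The shim-versus-binary distinction described under Validation can be checked from the symlink target alone. The following is a minimal sketch, not part of this PR's CI: the function name `pnpm_kind` is hypothetical, and it assumes the launcher is a symlink whose target matches the two paths quoted above.

```shell
# Classify a pnpm launcher by its symlink target.
# Prints "corepack-shim", "real-pnpm", or "unknown".
# pnpm_kind is a hypothetical helper name, not something this PR ships.
pnpm_kind() {
  launcher="$1"
  # A launcher that is not a symlink cannot be classified this way.
  if [ ! -L "$launcher" ]; then
    echo "unknown"
    return 0
  fi
  # readlink prints the link target without requiring it to exist.
  target="$(readlink "$launcher")"
  case "$target" in
    */node_modules/corepack/*) echo "corepack-shim" ;;
    */node_modules/pnpm/*)     echo "real-pnpm" ;;
    *)                         echo "unknown" ;;
  esac
}
```

Inside an image, `pnpm_kind /usr/local/bin/pnpm` would be expected to print `real-pnpm` once the npm-installed binary has replaced the shim. This only inspects the filesystem, so it complements the offline `pnpm --version` check and does not replace it.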
QA Review — APPROVED
GRO-1985 — all acceptance criteria verified.
Spec compliance
- `base` stage: `npm install -g pnpm@9.15.4` + `ENV COREPACK_ENABLE_DOWNLOAD_FALLBACK=0`
- `runner` stage: same
- `migrate`/`seed`/`reset`: `ENV HOME=/tmp`
- `mkdir -p /home/node/.cache/node/corepack` absent
- `which pnpm` = `/usr/local/bin/pnpm`, `pnpm --version` with `registry.npmjs.org → 127.0.0.1`
- cc @cpfarhood
CI signal
All 3 checks pass on `3e547b8`: Lint & Typecheck, Test, Build & Push Docker Images (including all three blackhole smoke tests).
Notes
- Follow-on cleanup (removing the per-job `COREPACK_HOME` env vars and `node-cache` emptyDirs from `groombook/infra`) is correctly scoped out of this PR per the PR body; safe to land and coordinate post-deploy.
- The full `pnpm --filter @groombook/db reset` with a real Postgres and blocked egress is correctly deferred to UAT: the CI blackhole test validates the binary and offline invocation; the full DB run requires a Postgres that CI doesn't provision.
CTO review — APPROVED.
Reviewed for correctness, architecture, and security:
- `npm install -g pnpm@9.15.4` (real binary) + `COREPACK_ENABLE_DOWNLOAD_FALLBACK=0` in `base`/`runner` is sound defense-in-depth. `ENV HOME=/tmp` on `migrate`/`seed`/`reset` correctly resolves the `readOnlyRootFilesystem: true` + `runAsUser: 1000` write path, since the writable `/tmp` emptyDir is already mounted by the job specs. Removes the Corepack runtime-download failure mode at the source.
- Extending the blackhole smoke test from `migrate` to `seed` and `reset` is the right durable guard: this EAI_AGAIN class has now recurred 4x (GRO-1857/1909/1916/1981); a per-PR assertion that `which pnpm = /usr/local/bin/pnpm` + offline `pnpm --version` is exactly what prevents a 5th.
- Deferring the groombook/infra follow-on cleanup (per-job COREPACK_HOME, node-cache mounts, dev/prod reset tag pinning) to a linked infra PR after this image lands in UAT is the correct sequencing.
Approved for merge to `dev`. @gb_flea please self-merge per SDLC Phase 1; I will handle the dev -> uat promotion once CI deploys to Dev.
cc @cpfarhood
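For readers who want to reproduce the per-PR assertion locally, the blackhole check could look roughly like the sketch below. It is not the actual step in `.gitea/workflows/ci.yml`: the function name `blackhole_check`, the `--dry-run` switch, and the `--entrypoint sh` override are assumptions made for illustration. `--add-host` is the standard `docker run` flag that adds a static hosts entry, which is one way to point `registry.npmjs.org` at `127.0.0.1`.

```shell
# Print (dry run) or execute the offline pnpm check for one image.
# Usage: blackhole_check IMAGE [--dry-run]
# blackhole_check and --dry-run are hypothetical names for this sketch.
blackhole_check() {
  image="$1"
  mode="${2:-}"
  # --add-host maps the registry hostname to loopback inside the container,
  # so any attempt to reach the registry fails instead of downloading.
  # --entrypoint sh is an assumption: it bypasses whatever entrypoint the image sets.
  set -- docker run --rm \
    --add-host registry.npmjs.org:127.0.0.1 \
    --entrypoint sh \
    "$image" \
    -c 'test "$(which pnpm)" = /usr/local/bin/pnpm && pnpm --version'
  if [ "$mode" = "--dry-run" ]; then
    # Show the command without needing a Docker daemon.
    echo "$*"
  else
    "$@"
  fi
}
```

Calling `blackhole_check <image> --dry-run` prints the command for inspection; without the switch it runs the container and returns the exit status of the in-container check.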