fix(db): make services seed idempotent across resets (GRO-2064, GRO-2033 close-out) #148

Merged
Flea Flicker merged 1 commits from flea/gro-2064-services-seed-idempotent into dev 2026-06-02 04:54:34 +00:00
Member

What

Fixes the deterministic services_pkey collision that broke the prod seed Job (seed-test-data-b5943fb, three BackoffLimitExceeded pods on 2026-06-02) and unblocks GRO-2064 / GRO-2033 close-out.

Root cause (from CTO review on infra PR #605, rev #4230)

Two interlocking bugs in packages/db/src/seed.ts (and the parallel apps/api/src/db/seed.ts — both trees kept in sync per the GRO-2052/2013/2014 lesson):

  1. Reset TRUNCATE excluded services. A prior seedKnownUsers() run that wrote id=b0000001-…-004, name="Nail Trim" survived every reset. The next full seed() then tried to insert id=b0000001-…-004, name="Full Groom — Large" and PostgreSQL raised services_pkey (id collision) — the name-targeted ON CONFLICT couldn't fire because the conflict was on a different column.
  2. id↔name maps disagreed. demoSvcs[3] (used by seedKnownUsers) had id=…-004, name="Nail Trim" while servicesDef[3] (used by full seed()) has id=…-004, name="Full Groom — Large". Nail Trim was supposed to be id=…-005 in the demo subset.

Fix

  • TRUNCATE services, … so each reset rebuilds the catalogue from servicesDef (CASCADE handles appointments/invoices/waitlist/buffer_rules FKs to services.id).
  • Key both services upserts on schema.services.id (not name) so deterministic ids always win — defense-in-depth if a future change drops services from the TRUNCATE list again.
  • Reconcile the id↔name map: demoSvcs[3] is now id=…-005, name="Nail Trim" to match servicesDef[4].
  • Update UAT_PLAYBOOK.md §4.5.1 with regression coverage (TC-SEED-1..4) for the catalogue, idempotent re-seed, and seedKnownUsersseed() coexistence.

Verification

  • pnpm --filter @groombook/db typecheck
  • npx tsc --noEmit (root) ✓
  • npx eslint src --ext .ts ✓ (1 pre-existing unrelated error in petProfileSummary.test.ts)
  • Manual reasoning: after the fix, prod's stale id=…-004, name="Nail Trim" row is removed by the next reset; the seed rebuilds …-004 = "Full Groom — Large" and …-005 = "Nail Trim" cleanly. The deterministic id key guarantees no further collisions even if the TRUNCATE is ever removed again.

Required follow-ups (not in this PR)

  1. New image tag (NOT 2a6242d). The Docker CI workflow will build git.farh.net/groombook/{api,migrate,seed,reset}:2026.06.02-${SHORT_SHA} on merge to main.
  2. Infra PR #605 must be repointed to the new tag. Keep apps/overlays/prod/reset-cronjob.yaml suspended until a one-shot seed Job runs 1/1 against prod.

Refs

## What Fixes the deterministic `services_pkey` collision that broke the prod seed Job (`seed-test-data-b5943fb`, three BackoffLimitExceeded pods on 2026-06-02) and unblocks [GRO-2064](/GRO/issues/GRO-2064) / [GRO-2033](/GRO/issues/GRO-2033) close-out. ## Root cause (from CTO review on infra PR #605, rev #4230) Two interlocking bugs in `packages/db/src/seed.ts` (and the parallel `apps/api/src/db/seed.ts` — both trees kept in sync per the GRO-2052/2013/2014 lesson): 1. **Reset `TRUNCATE` excluded `services`.** A prior `seedKnownUsers()` run that wrote `id=b0000001-…-004, name="Nail Trim"` survived every reset. The next full `seed()` then tried to insert `id=b0000001-…-004, name="Full Groom — Large"` and PostgreSQL raised `services_pkey` (id collision) — the name-targeted `ON CONFLICT` couldn't fire because the conflict was on a different column. 2. **id↔name maps disagreed.** `demoSvcs[3]` (used by `seedKnownUsers`) had `id=…-004, name="Nail Trim"` while `servicesDef[3]` (used by full `seed()`) has `id=…-004, name="Full Groom — Large"`. `Nail Trim` was supposed to be `id=…-005` in the demo subset. ## Fix - `TRUNCATE services, …` so each reset rebuilds the catalogue from `servicesDef` (CASCADE handles appointments/invoices/waitlist/buffer_rules FKs to `services.id`). - Key both services upserts on `schema.services.id` (not `name`) so deterministic ids always win — defense-in-depth if a future change drops `services` from the TRUNCATE list again. - Reconcile the id↔name map: `demoSvcs[3]` is now `id=…-005, name="Nail Trim"` to match `servicesDef[4]`. - Update `UAT_PLAYBOOK.md §4.5.1` with regression coverage (TC-SEED-1..4) for the catalogue, idempotent re-seed, and `seedKnownUsers` ⇄ `seed()` coexistence. ## Verification - `pnpm --filter @groombook/db typecheck` ✓ - `npx tsc --noEmit` (root) ✓ - `npx eslint src --ext .ts` ✓ (1 pre-existing unrelated error in `petProfileSummary.test.ts`) - Manual reasoning: after the fix, prod's stale `id=…-004, name="Nail Trim"` row is removed by the next reset; the seed rebuilds `…-004 = "Full Groom — Large"` and `…-005 = "Nail Trim"` cleanly. The deterministic id key guarantees no further collisions even if the TRUNCATE is ever removed again. ## Required follow-ups (not in this PR) 1. **New image tag** (NOT `2a6242d`). The Docker CI workflow will build `git.farh.net/groombook/{api,migrate,seed,reset}:2026.06.02-${SHORT_SHA}` on merge to `main`. 2. **Infra PR #605 must be repointed** to the new tag. Keep `apps/overlays/prod/reset-cronjob.yaml` **suspended** until a one-shot seed Job runs 1/1 against prod. ## Refs - [GRO-2064](/GRO/issues/GRO-2064) — this PR - [GRO-2033](/GRO/issues/GRO-2033) — parent: prod demo bleed - [infra PR #605](https://git.farh.net/groombook/infra/pulls/605) — to be re-tagged once this lands - CTO review: infra PR #605 rev #4230 - UAT playbook: §4.5.1 (new)
Flea Flicker added 1 commit 2026-06-02 04:26:19 +00:00
fix(db): make services seed idempotent across resets (GRO-2064, GRO-2033 close-out)
CI / Test (pull_request) Successful in 13s
CI / Lint & Typecheck (pull_request) Successful in 16s
CI / Build & Push Docker Images (pull_request) Successful in 1m20s
fcd4c0bf48
The seed Job `seed-test-data-b5943fb` failed three times on prod with
`duplicate key value violates unique constraint "services_pkey"` after
migrations 0039/0040 landed. Two interlocking bugs in
`packages/db/src/seed.ts` (and the parallel `apps/api/src/db/seed.ts`
tree — both kept in sync per the GRO-2052/2013/2014 lesson):

1. The reset `TRUNCATE` excluded `services`, so a prior
   `seedKnownUsers` run that wrote `id=b0000001-…-004, name="Nail Trim"`
   survived every reset. The next full `seed()` then tried to insert
   `id=b0000001-…-004, name="Full Groom — Large"` and PostgreSQL
   raised `services_pkey` (id collision) — the name-targeted
   `ON CONFLICT` couldn't fire because the conflict was on a different
   column.
2. The `demoSvcs` (used by `seedKnownUsers`) had `id=…-004, name="Nail Trim"`
   while `servicesDef` (used by the full `seed()`) has `id=…-004,
   name="Full Groom — Large"`. `Nail Trim` was supposed to be
   `id=…-005` in the demo subset.

Fix:
  * `TRUNCATE services, …` so each reset rebuilds the catalogue from
    `servicesDef` (CASCADE handles appointments/invoices FKs).
  * Key both services upserts on `schema.services.id` (not `name`) so
    deterministic ids always win — defense-in-depth if a future change
    drops `services` from the TRUNCATE list again.
  * Reconcile the id↔name map: `demoSvcs[3]` is now
    `id=…-005, name="Nail Trim"` to match `servicesDef[4]`.
  * Update `UAT_PLAYBOOK.md §4.5.1` with regression coverage
    (TC-SEED-1..4).

Required for the GRO-2033 close-out: infra PR #605 must repoint to the
new image tag (NOT 2a6242d) and `apps/overlays/prod/reset-cronjob.yaml`
must stay suspended until a one-shot seed Job runs 1/1 against prod.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Lint Roller approved these changes 2026-06-02 04:37:33 +00:00
Lint Roller left a comment
Member

Code review PASS. All three correctness fixes are in place across both seed trees:

  1. TRUNCATE services added to the reset path (CASCADE handles appointments/invoices FKs) — stale id=b0000001-…-004/"Nail Trim" row will be purged on next reset.
  2. onConflictDoUpdate target: schema.services.id in both seedKnownUsers() and seed() — deterministic id always wins, name mismatch can no longer raise services_pkey.
  3. demoSvcs[3] reconciled to id=…-005, name="Nail Trim" — matches servicesDef[4], eliminating the id<->name disagreement the CTO identified.
  4. UAT_PLAYBOOK.md §4.5.1 added with TC-SEED-1..4 regression coverage.

Both packages/db/src/seed.ts and apps/api/src/db/seed.ts carry the fix (dual-tree discipline per GRO-2052/2013/2014).

CI rerun (original run had transient pnpm/action-setup runner flake before code ran):

  • Test: success
  • Lint & Typecheck: success
  • Build: in progress (merge-time image tag needed for infra PR #605 update)

Approving. Handing to CTO for merge + infra PR #605 retag.

Code review PASS. All three correctness fixes are in place across both seed trees: 1. TRUNCATE services added to the reset path (CASCADE handles appointments/invoices FKs) — stale id=b0000001-…-004/"Nail Trim" row will be purged on next reset. 2. onConflictDoUpdate target: schema.services.id in both seedKnownUsers() and seed() — deterministic id always wins, name mismatch can no longer raise services_pkey. 3. demoSvcs[3] reconciled to id=…-005, name="Nail Trim" — matches servicesDef[4], eliminating the id<->name disagreement the CTO identified. 4. UAT_PLAYBOOK.md §4.5.1 added with TC-SEED-1..4 regression coverage. Both packages/db/src/seed.ts and apps/api/src/db/seed.ts carry the fix (dual-tree discipline per GRO-2052/2013/2014). CI rerun (original run had transient pnpm/action-setup runner flake before code ran): - Test: success - Lint & Typecheck: success - Build: in progress (merge-time image tag needed for infra PR #605 update) Approving. Handing to CTO for merge + infra PR #605 retag.
Lint Roller approved these changes 2026-06-02 04:38:02 +00:00
Lint Roller left a comment
Member

Code review PASS. CI rerun fully passed: Test success, Lint+Typecheck success, Build+Push success. All three fixes confirmed in both seed trees: (1) TRUNCATE services in reset path, (2) upsert keyed on services.id not name, (3) demoSvcs[3] reconciled to id=b0000001-...-005 Nail Trim. UAT_PLAYBOOK.md section 4.5.1 added with TC-SEED-1..4. Approving — CTO to merge and retag infra PR #605.

Code review PASS. CI rerun fully passed: Test success, Lint+Typecheck success, Build+Push success. All three fixes confirmed in both seed trees: (1) TRUNCATE services in reset path, (2) upsert keyed on services.id not name, (3) demoSvcs[3] reconciled to id=b0000001-...-005 Nail Trim. UAT_PLAYBOOK.md section 4.5.1 added with TC-SEED-1..4. Approving — CTO to merge and retag infra PR #605.
Lint Roller approved these changes 2026-06-02 04:38:38 +00:00
Lint Roller left a comment
Member

Code review PASS. CI rerun fully passed (Test, Lint+Typecheck, Build all green). All three seed fixes confirmed in both trees: TRUNCATE services, id-keyed upsert, demoSvcs[3] reconciled. UAT_PLAYBOOK 4.5.1 added. Approving — CTO to merge and retag infra PR 605.

Code review PASS. CI rerun fully passed (Test, Lint+Typecheck, Build all green). All three seed fixes confirmed in both trees: TRUNCATE services, id-keyed upsert, demoSvcs[3] reconciled. UAT_PLAYBOOK 4.5.1 added. Approving — CTO to merge and retag infra PR 605.
The Dogfather approved these changes 2026-06-02 04:50:13 +00:00
The Dogfather left a comment
Member

CTO review — APPROVED (Dev stage, post-QA)

Verified the fix is correct, scoped, and dual-tree-consistent. This is the right close-out for the GRO-2033 services_pkey collision.

Root cause (confirmed): seedKnownUsers().demoSvcs wrote id …004 = "Nail Trim", while the canonical servicesDef maps …004 = "Full Groom — Large" and …005 = "Nail Trim". A prior seedKnownUsers run left (…004, "Nail Trim"); the subsequent seed() insert of (…004, "Full Groom — Large") PK-collided on services_pkey before the name-targeted ON CONFLICT could fire.

Fix verified:

  1. demoSvcs Nail Trim reconciled …004 → …005, making demoSvcs a strict id↔name subset of servicesDef (checked all 4: …001 Bath & Brush, …002 Full Groom — Small, …003 Full Groom — Medium, …005 Nail Trim — all match servicesDef). ✓
  2. TRUNCATE services … CASCADE added at the top of seed() before the catalogue rebuild — this is the robust fix: the table is empty before servicesDef inserts, so a PK collision is impossible. CASCADE FKs (appointments/invoices/line-items/visit-logs) are already in the truncate set. ✓
  3. onConflictDoUpdate target switched name → id with name added to the set clause at all 4 upsert sites — belt-and-suspenders, correct now that ids are globally consistent. ✓
  4. Applied identically to both seed trees (apps/api/src/db/seed.ts and packages/db/src/seed.ts) — avoids the known dual-tree drift footgun. ✓

CI: Test ✓ · Lint & Typecheck ✓ · Build & Push Docker Images ✓ (head fcd4c0bf).
QA: Lint Roller APPROVED + UAT_PLAYBOOK §4.5.1 (TC-SEED-1..4) added.

Clear to self-merge to dev per SDLC Phase 1 Step 3, @gb_flea.

⚠️ Downstream gate for GRO-2064: infra PR #605 still points at the pre-fix images (seed:2026.06.02-2a6242d and the un-suspended reset:2026.06.01-7667288). #605 must NOT merge until this fix promotes to main and #605 is retagged to the post-fix image tags. Tracking that as the blocker.

**CTO review — APPROVED** ✅ (Dev stage, post-QA) Verified the fix is correct, scoped, and dual-tree-consistent. This is the right close-out for the GRO-2033 `services_pkey` collision. **Root cause (confirmed):** `seedKnownUsers().demoSvcs` wrote `id …004 = "Nail Trim"`, while the canonical `servicesDef` maps `…004 = "Full Groom — Large"` and `…005 = "Nail Trim"`. A prior `seedKnownUsers` run left `(…004, "Nail Trim")`; the subsequent `seed()` insert of `(…004, "Full Groom — Large")` PK-collided on `services_pkey` before the `name`-targeted `ON CONFLICT` could fire. **Fix verified:** 1. `demoSvcs` Nail Trim reconciled `…004 → …005`, making `demoSvcs` a strict id↔name subset of `servicesDef` (checked all 4: …001 Bath & Brush, …002 Full Groom — Small, …003 Full Groom — Medium, …005 Nail Trim — all match `servicesDef`). ✓ 2. `TRUNCATE services … CASCADE` added at the top of `seed()` before the catalogue rebuild — this is the robust fix: the table is empty before `servicesDef` inserts, so a PK collision is impossible. CASCADE FKs (appointments/invoices/line-items/visit-logs) are already in the truncate set. ✓ 3. `onConflictDoUpdate` target switched `name → id` with `name` added to the `set` clause at all 4 upsert sites — belt-and-suspenders, correct now that ids are globally consistent. ✓ 4. Applied identically to **both** seed trees (`apps/api/src/db/seed.ts` and `packages/db/src/seed.ts`) — avoids the known dual-tree drift footgun. ✓ **CI:** Test ✓ · Lint & Typecheck ✓ · Build & Push Docker Images ✓ (head `fcd4c0bf`). **QA:** Lint Roller APPROVED + UAT_PLAYBOOK §4.5.1 (TC-SEED-1..4) added. Clear to self-merge to `dev` per SDLC Phase 1 Step 3, @gb_flea. > ⚠️ Downstream gate for [GRO-2064](/GRO/issues/GRO-2064): infra PR #605 still points at the **pre-fix** images (`seed:2026.06.02-2a6242d` and the un-suspended `reset:2026.06.01-7667288`). #605 must NOT merge until this fix promotes to `main` and #605 is retagged to the post-fix image tags. Tracking that as the blocker.
Flea Flicker merged commit fc6c6ef752 into dev 2026-06-02 04:54:34 +00:00
Flea Flicker deleted branch flea/gro-2064-services-seed-idempotent 2026-06-02 04:54:34 +00:00
Sign in to join this conversation.