fix(GRO-2123): serialize seed.ts with Postgres advisory lock #155
Reference in New Issue
Block a user
Delete Branch "flea-flicker/gro-2123-seed-advisory-lock"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Fixes the non-deterministic
FK 23503oninvoice_tip_splitsin the UATreset-demo-dataCronJob (GRO-2123).Root cause
packages/db/src/seed.tsdoesTRUNCATE … invoices, invoice_tip_splits …then thousands of inserts with no enclosing transaction and no concurrency lock. The in-loop flush order is correct, so a single isolated run cannot produce this FK violation. But all threereset-demo-dataCronJob overlays usedconcurrencyPolicy: Replaceon an hourly schedule: a new pod can start while the previous one is still mid-seed, and the new pod'sTRUNCATE invoicesdeletes rows the old pod is still inserting against. Hence the in-flightinvoice_tip_splitsinsert references an invoice that was just truncated away →code: '23503', non-deterministic across nodes (wakko/mindy).Fix (this PR —
api)Acquire a session-level
pg_advisory_lockfor the full duration ofseed()so overlapping invocations block then proceed in order.Critical implementation note (architectural correctness): with postgres-js connection pooling, a session-level
pg_advisory_lockacquired on one pooled connection and released on a different one is a no-op (the lock is bound to the session / pg-backend that took it). We therefore:client.reserve().pg_advisory_lock(KEY)on that reserved connection (before theTRUNCATE).db(no per-insert connection changes).try/finallyso a thrown seed error cannot leak the lock or the connection.client.end()) after the lock releases — releasing a reserved connection back to a closed pool is a no-op that confuses the cleanup.SEED_ADVISORY_LOCK_KEY = 0x47524f4f— ASCII"GROO", stable, documented in the file. The single-argumentpg_advisory_lock(int)form is used (postgres-js serializesintas a plainnumber; nobiginttype plumbing required).Defense in depth (companion PR)
groombook/infraPR #TBD:concurrencyPolicy: Replace → Forbidon all threereset-demo-dataCronJob overlays (dev/uat/prod).Forbidskips a new scheduled run while one is still active, removing the cross-schedule overlap vector entirely.Acceptance criteria
api— this PR) opens againstdev.infra) merges first so the Forbid policy is in place when the rebuilt reset image deploys.groombook-uat):kubectl create job --from=cronjob/reset-demo-data verify-gro2123 -n groombook-uat— pod reachesCompleted, logs show✓ Acquired seed advisory lockand✓ Released seed advisory lockfromseed.ts, nocode: '23503'/invoice_tip_splitsFK error.✓ Created Nlines).UAT Playbook
UAT_PLAYBOOK.md§3.29 documents the regression check.Files changed
packages/db/src/seed.ts— addSEED_ADVISORY_LOCK_KEY,withSeedAdvisoryLockhelper, extractrunSeedBody; wrapseed()in the helper; remove the in-bodyclient.end()so the lock connection can release cleanly first.UAT_PLAYBOOK.md— new §3.29 (TC-API-3.29).cc @cpfarhood
🤖 Generated with Claude Code
The reset-demo-data CronJob in groombook-uat intermittently failed with FK 23503 on invoice_tip_splits because two pods could run the seed concurrently: the new pod's TRUNCATE deleted rows the old pod was still inserting. Acquire a session-level advisory lock for the full duration of the seed. CRITICAL: with postgres-js connection pooling, a pg_advisory_lock acquired on one pooled connection and released on a different one is a no-op (the lock is bound to the pg-backend that took it). We therefore reserve a dedicated connection for the lock, take pg_advisory_lock(KEY) on it, run the seed on the pooled connections, and release the lock + reserved connection in a try/finally so a thrown seed error cannot leak the lock or the connection. Defence-in-depth with the infra PR that switches concurrencyPolicy: Replace → Forbid on the reset-demo-data CronJob. - Adds withSeedAdvisoryLock helper and runSeedBody extracted function - Wraps seed() body in the helper; client.end() runs after the lock releases so a reserved connection is not returned to a closed pool - SEED_ADVISORY_LOCK_KEY = 0x47524f4f ("GROO" in ASCII) — arbitrary stable 32-bit key, referenced in runbooks - UAT_PLAYBOOK.md §3.29 documents the regression check cc @cpfarhood Co-Authored-By: Paperclip <noreply@paperclip.ing>