Compare commits

...

5 Commits

Author SHA1 Message Date
Flea Flicker 1a4c02476d fix(GRO-2139): serialize the entire reset→migrate→seed chain under the seed advisory lock
CI / Test (pull_request) Successful in 10s
CI / Lint & Typecheck (pull_request) Successful in 15s
CI / Build & Push Docker Images (pull_request) Successful in 1m21s
The dev reset-demo-data CronJob intermittently produced one Error pod per
run with `invoices_pkey` duplicate-key violations. The CTO analysis
(traced in GRO-2136) concluded the race is between the reset image's
three-step chain and a concurrent same-PRNG seeder (the dev
seed-test-data Job being recreated at the top of the hour by Flux).

GRO-2123 added `pg_advisory_lock(0x47524f4f)` around `runSeedBody`,
but `reset.ts` (DROP TABLE … CASCADE) and `drizzle-kit migrate`
ran as separate processes outside that lock — so a concurrent locked
seed could still interleave with the reset's drop+recreate, leaving
two same-seed writers emitting identical invoice ids (the
Mulberry32(seed=42) stream is fully deterministic per process).

This commit makes the whole chain a single locked unit:

- `reset.ts` now takes the same advisory lock and runs DROP → migrate
  → runSeedBody under a single Postgres session (max: 1). The lock
  spans the entire chain, so any concurrent `seed.ts` invocation
  (via the seed-test-data Job or CI) blocks until the reset finishes.
- `packages/db/package.json` `reset` script is now a single
  `tsx src/reset.ts` invocation — `drizzle-kit migrate` no longer
  runs as a separate un-locked process.
- `withSeedAdvisoryLock`, `runSeedBody`, `getProfile`, `profiles`,
  `SEED_ADVISORY_LOCK_KEY`, and the `SeedProfile`/`ProfileConfig`
  types are now exported from `seed.ts` so `reset.ts` can use them
  while preserving the deterministic seed contract.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 05:11:13 +00:00
Flea Flicker 93be4d8f72 chore: delete stale apps/api/src/db/seed.ts duplicate (GRO-2129) (#158)
CI / Test (push) Successful in 15s
CI / Lint & Typecheck (push) Successful in 18s
CI / Build & Push Docker Images (push) Successful in 38s
chore: delete stale apps/api/src/db/seed.ts duplicate (GRO-2129) (#158)
2026-06-04 12:44:46 +00:00
Flea Flicker f67b96ddfe Merge pull request 'fix(GRO-2123): serialize seed.ts with Postgres advisory lock' (#155) from flea-flicker/gro-2123-seed-advisory-lock into dev
CI / Test (push) Successful in 11s
CI / Lint & Typecheck (push) Successful in 16s
CI / Build & Push Docker Images (push) Successful in 25s
CI / Test (pull_request) Successful in 10s
CI / Lint & Typecheck (pull_request) Successful in 16s
CI / Build & Push Docker Images (pull_request) Successful in 28s
2026-06-04 11:23:41 +00:00
Flea Flicker d1a68d93de fix(GRO-2123): serialize seed.ts with Postgres advisory lock
CI / Test (pull_request) Successful in 13s
CI / Lint & Typecheck (pull_request) Successful in 15s
CI / Build & Push Docker Images (pull_request) Successful in 58s
The reset-demo-data CronJob in groombook-uat intermittently failed with
FK 23503 on invoice_tip_splits because two pods could run the seed
concurrently: the new pod's TRUNCATE deleted rows the old pod was still
inserting.

Acquire a session-level advisory lock for the full duration of the seed.
CRITICAL: with postgres-js connection pooling, a pg_advisory_lock
acquired on one pooled connection and released on a different one is a
no-op (the lock is bound to the pg-backend that took it). We therefore
reserve a dedicated connection for the lock, take pg_advisory_lock(KEY)
on it, run the seed on the pooled connections, and release the lock +
reserved connection in a try/finally so a thrown seed error cannot leak
the lock or the connection.

Defence-in-depth with the infra PR that switches
concurrencyPolicy: Replace → Forbid on the reset-demo-data CronJob.

- Adds withSeedAdvisoryLock helper and runSeedBody extracted function
- Wraps seed() body in the helper; client.end() runs after the lock
  releases so a reserved connection is not returned to a closed pool
- SEED_ADVISORY_LOCK_KEY = 0x47524f4f ("GROO" in ASCII) — arbitrary
  stable 32-bit key, referenced in runbooks
- UAT_PLAYBOOK.md §3.29 documents the regression check

cc @cpfarhood

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-04 11:12:17 +00:00
Flea Flicker e9f94a2bd7 fix(seed): GRO-2100 run uat-groomer linkage AFTER services seed (regression in #151) (#153)
CI / Test (push) Successful in 12s
CI / Test (pull_request) Successful in 12s
CI / Lint & Typecheck (pull_request) Successful in 15s
CI / Build & Push Docker Images (pull_request) Successful in 29s
CI / Lint & Typecheck (push) Failing after 12m57s
CI / Build & Push Docker Images (push) Has been skipped
fix(seed): GRO-2100 run uat-groomer linkage after services seed (#153)

Co-authored-by: Flea Flicker <flea@groombook.dev>
Co-committed-by: Flea Flicker <flea@groombook.dev>
2026-06-02 20:11:45 +00:00
6 changed files with 228 additions and 1400 deletions
+1
View File
@@ -166,6 +166,7 @@ Expected: one row, `role = 'groomer'`. If zero rows return, the request hit the
| TC-API-3.26 | Verify 25-35% medicalAlerts distribution | GET /api/pets (first 30 pets), count how many have non-empty medicalAlerts | Ratio is 25-35% (seed uses rand() < 0.3 for ~30% distribution) |
| TC-API-3.27 | Verify coat_type enum has all seed values | After UAT seed completes, inspect the coat_type enum on the UAT DB — it must contain: short, medium, long, double, wire, silky, curly, hairless | UAT seed jobs (`reset-demo-data`, `seed-test-data`) complete 1/1 with no `enum_in` error; coat_type includes all 8 values used by seed.ts `coatTypePool` |
| TC-API-3.28 | Verify pet_size_category enum has all seed values | After UAT seed completes, inspect the pet_size_category enum on the UAT DB — it must contain: small, medium, large, extra_large | UAT seed jobs (`reset-demo-data`, `seed-test-data`) complete 1/1 with no `enum_in` error; pet_size_category includes all 4 values used by seed.ts `petSizeCategoryPool` (regression for GRO-1999, mirrors TC-API-3.27) |
| TC-API-3.29 | Verify `reset-demo-data` CronJob does not fail with FK 23503 on `invoice_tip_splits` (GRO-2123) | Trigger the CronJob manually: `kubectl create job --from=cronjob/reset-demo-data verify-gro2123 -n groombook-uat`. Wait for pod to terminate. Inspect logs: `kubectl logs -n groombook-uat -l job-name=verify-gro2123` | Pod reaches `Completed` state; logs show `✓ Acquired seed advisory lock` and `✓ Released seed advisory lock` from `seed.ts`; no `PostgresError: … violates foreign key constraint "invoice_tip_splits_invoice_id_invoices_id_fk"` (code 23503); final counts unchanged (500 clients, ~4000 invoices) |
### 4.4 Appointment Scheduling
+2 -2
View File
@@ -12,8 +12,8 @@
"test": "vitest run",
"db:generate": "drizzle-kit generate",
"db:migrate": "drizzle-kit migrate",
"db:seed": "tsx src/db/seed.ts",
"db:reset": "tsx src/db/reset.ts && drizzle-kit migrate && tsx src/db/seed.ts",
"db:seed": "pnpm --filter @groombook/db seed",
"db:reset": "pnpm --filter @groombook/db reset",
"db:studio": "drizzle-kit studio"
},
"dependencies": {
File diff suppressed because it is too large Load Diff
+1 -1
View File
@@ -20,7 +20,7 @@
"generate": "drizzle-kit generate",
"migrate": "drizzle-kit migrate",
"seed": "tsx src/seed.ts",
"reset": "tsx src/reset.ts && drizzle-kit migrate && tsx src/seed.ts",
"reset": "tsx src/reset.ts",
"studio": "drizzle-kit studio",
"typecheck": "tsc --noEmit"
},
+103 -38
View File
@@ -1,13 +1,51 @@
/**
* reset.ts — Drop all application tables and re-run migrations + seed.
* reset.ts — Drop all application tables, re-run migrations, and re-seed.
*
* Intended for local development only. Never run against production.
*
* Usage:
* DATABASE_URL=postgres://... npx tsx packages/db/src/reset.ts
*
* GRO-2139: the entire drop→migrate→seed chain runs inside a single
* Postgres advisory lock (SEED_ADVISORY_LOCK_KEY) so a concurrent
* `seed.ts` (e.g. the dev `seed-test-data-*` Job being recreated at
* the top of the hour) cannot interleave between `reset.ts` (DROP)
* and `seed.ts` (TRUNCATE+insert) and collide on `invoices_pkey`.
*
* Why this matters: `seed.ts` derives every primary key from a single
* shared Mulberry32 PRNG seeded with 42 (see `createPrng(42)` and
* `uuid()` in seed.ts). Two concurrent same-profile seeders therefore
* emit *identical* ids for the same logical row, and any moment
* between a concurrent `seed.ts` TRUNCATE and INSERT is exactly the
* window in which the second seeder's INSERT can hit a pkey already
* taken by the first. Pre-GRO-2123 this raced unconditionally;
* GRO-2123 added the advisory lock around `runSeedBody` but left
* `reset.ts` and `drizzle-kit migrate` outside the lock. This script
* now wraps the *whole* chain in the same lock, using `max: 1` so the
* single Postgres session that holds the lock is the one that runs
* the DROP, the migrations, and the seed body.
*
* See: groombook/infra `apps/base/reset-cronjob.yaml` (CronJob) and
* `apps/base/seed-job.yaml` (one-shot Job) — both invoke the same
* `seed.ts` code path on the same database in `groombook-dev`.
*/
import postgres from "postgres";
import { drizzle } from "drizzle-orm/postgres-js";
import { migrate } from "drizzle-orm/postgres-js/migrator";
import { fileURLToPath } from "node:url";
import { dirname, resolve } from "node:path";
import * as schema from "./schema.js";
import {
SEED_ADVISORY_LOCK_KEY,
withSeedAdvisoryLock,
getProfile,
runSeedBody,
profiles,
} from "./seed.js";
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
const MIGRATIONS_FOLDER = resolve(__dirname, "../migrations");
async function reset() {
const url = process.env.DATABASE_URL;
@@ -16,52 +54,79 @@ async function reset() {
process.exit(1);
}
if (process.env.NODE_ENV === "production" && process.env.ALLOW_RESET !== "true") {
console.error("[FATAL] db:reset must not be run in production without ALLOW_RESET=true.");
if (
process.env.NODE_ENV === "production" &&
process.env.ALLOW_RESET !== "true"
) {
console.error(
"[FATAL] db:reset must not be run in production without ALLOW_RESET=true.",
);
process.exit(1);
}
// max: 1 so the advisory lock and every DROP / migrate / seed query
// share a single Postgres session. With max > 1 postgres-js could
// route a query to a different pooled connection that does NOT hold
// the lock, defeating the point of "lock spans the whole chain".
const client = postgres(url, { max: 1 });
const db = drizzle(client, { schema });
console.log("Dropping all application tables...\n");
try {
await withSeedAdvisoryLock(client, async () => {
console.log("Dropping all application tables...\n");
// Drop in dependency order (children before parents)
await client`
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (
SELECT tablename FROM pg_tables
WHERE schemaname = 'public'
) LOOP
EXECUTE 'DROP TABLE IF EXISTS public.' || quote_ident(r.tablename) || ' CASCADE';
END LOOP;
END $$;
`;
// Drop dependencies (tables) first
await client`
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (
SELECT tablename FROM pg_tables
WHERE schemaname = 'public'
) LOOP
EXECUTE 'DROP TABLE IF EXISTS public.' || quote_ident(r.tablename) || ' CASCADE';
END LOOP;
END $$;
`;
// Drop custom enums
await client`
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (
SELECT typname FROM pg_type
WHERE typtype = 'e' AND typnamespace = (
SELECT oid FROM pg_namespace WHERE nspname = 'public'
)
) LOOP
EXECUTE 'DROP TYPE IF EXISTS ' || quote_ident(r.typname) || ' CASCADE';
END LOOP;
END $$;
`;
// Drop custom enums
await client`
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (
SELECT typname FROM pg_type
WHERE typtype = 'e' AND typnamespace = (
SELECT oid FROM pg_namespace WHERE nspname = 'public'
)
) LOOP
EXECUTE 'DROP TYPE IF EXISTS ' || quote_ident(r.typname) || ' CASCADE';
END LOOP;
END $$;
`;
// Drop the drizzle migrations tracking table
await client`DROP TABLE IF EXISTS drizzle.__drizzle_migrations CASCADE`;
await client`DROP SCHEMA IF EXISTS drizzle CASCADE`;
// Drop the drizzle migrations tracking table
await client`DROP TABLE IF EXISTS drizzle.__drizzle_migrations CASCADE`;
await client`DROP SCHEMA IF EXISTS drizzle CASCADE`;
console.log("✓ All tables and enums dropped\n");
console.log("✓ All tables and enums dropped\n");
await client.end();
console.log("Running migrations...");
await migrate(db, { migrationsFolder: MIGRATIONS_FOLDER });
console.log("✓ Migrations applied\n");
console.log("Seeding database...");
const profile = getProfile();
const cfg = profiles[profile];
await runSeedBody(client, db, profile, cfg);
});
console.log(
`\n✓ Reset complete (advisory lock key=0x${SEED_ADVISORY_LOCK_KEY.toString(16)})`,
);
} finally {
await client.end();
}
}
reset().catch((err) => {
+121 -10
View File
@@ -24,9 +24,9 @@ import type { MedicalAlert } from "@groombook/types";
// ── Seed profile configuration ─────────────────────────────────────────────
type SeedProfile = "dev" | "uat" | "demo";
export type SeedProfile = "dev" | "uat" | "demo";
interface ProfileConfig {
export interface ProfileConfig {
staffCount: { manager: number; receptionist: number; groomer: number; bather: number };
clientCount: number;
appointmentsBackDays: number;
@@ -35,7 +35,7 @@ interface ProfileConfig {
includeUatClients: boolean;
}
const profiles: Record<SeedProfile, ProfileConfig> = {
export const profiles: Record<SeedProfile, ProfileConfig> = {
dev: {
staffCount: { manager: 1, receptionist: 1, groomer: 2, bather: 0 },
clientCount: 100,
@@ -70,6 +70,8 @@ function getProfile(): SeedProfile {
return "uat";
}
export { getProfile };
// ── Deterministic PRNG (Mulberry32) ──────────────────────────────────────────
/**
@@ -401,7 +403,9 @@ const servicesDef = [
*
* In seedKnownUsers() this replaces the inline UAT-staff block.
*/
async function seedUatStaffAccounts(db: ReturnType<typeof drizzle>) {
async function seedUatStaffAccounts(
db: ReturnType<typeof drizzle>,
): Promise<string | null> {
// ── Staff: UAT Super User (oidcSub from SEED_UAT_SUPER_OIDC_SUB env var) ──
const uatSuperOidcSub = process.env.SEED_UAT_SUPER_OIDC_SUB;
if (uatSuperOidcSub) {
@@ -677,7 +681,12 @@ async function seedUatStaffAccounts(db: ReturnType<typeof drizzle>) {
// We deterministically link the UAT groomer to the UAT customer's first pet
// ("UAT Pup Alpha") and leave the second pet ("UAT Pup Beta") UNLINKED so
// TC-UAT-2 (200) and TC-UAT-3 (403) can both hardcode the stable petIds.
await seedUatGroomerLinkage(db, uatCustomerClientId);
//
// The linkage call itself is performed by the caller AFTER the `services`
// catalogue has been seeded (this helper runs before services exist,
// which previously caused the linkage to be silently skipped on every
// reset). GRO-2100 follow-up.
return uatCustomerClientId;
}
/**
@@ -692,12 +701,18 @@ async function seedUatStaffAccounts(db: ReturnType<typeof drizzle>) {
*/
async function seedUatGroomerLinkage(
db: ReturnType<typeof drizzle>,
customerClientId: string,
customerClientId: string | null,
): Promise<void> {
const uatGroomerEmail = "uat-groomer@groombook.dev";
const LINKED_PET_ID = "c0000001-0000-0000-0000-000000000002"; // UAT Pup Alpha
const APPT_ID = "a0000001-0000-0000-0000-000000000001";
// Skip silently if the UAT Customer client wasn't created (non-UAT seed
// profile, e.g. seedKnownUsers() in an env without the UAT personas).
if (!customerClientId) {
return;
}
// Only run if the UAT groomer staff record actually exists — dev/test seeds
// that don't set SEED_UAT_STAFF_OIDC_SUB should not crash.
const [uatGroomerStaff] = await db
@@ -720,6 +735,19 @@ async function seedUatGroomerLinkage(
return;
}
// Skip if the linked pet hasn't been seeded yet (defensive: caller should
// ensure pets exist; if the helper is re-ordered later we don't want to
// crash here).
const [linkedPet] = await db
.select({ id: schema.pets.id })
.from(schema.pets)
.where(eq(schema.pets.id, LINKED_PET_ID))
.limit(1);
if (!linkedPet) {
console.warn(`⚠ GRO-2100: UAT Pup Alpha (${LINKED_PET_ID}) not found — skipping uat-groomer linkage`);
return;
}
// The "Bath & Brush" service id is stable across the reset; falls back to
// any active service if it has not been seeded yet (e.g. seedKnownUsers
// runs in isolation).
@@ -847,7 +875,7 @@ async function seedKnownUsers() {
// ── UAT staff accounts + Better Auth credentials (shared impl) ──────────────
// Extracted into seedUatStaffAccounts() so it runs in both seedKnownUsers()
// and the full seed() UAT branch.
await seedUatStaffAccounts(db);
const uatCustomerClientId = await seedUatStaffAccounts(db);
// ── Services: idempotent upsert keyed on `id` ─────────────────────────────
// GRO-2064: previously keyed on `services.name` while writing a
@@ -875,6 +903,12 @@ async function seedKnownUsers() {
}
console.log(`✓ Seeded ${demoSvcs.length} services`);
// GRO-2100: deterministic uat-groomer ↔ UAT Pup Alpha linkage. Must run
// AFTER services are seeded (this helper looks up an active service id
// to attach to the appointment; on a fresh reset there are none yet at
// the time seedUatStaffAccounts() returns).
await seedUatGroomerLinkage(db, uatCustomerClientId);
// ── Client: Demo Client ──
const [existingClient] = await db
.select()
@@ -944,6 +978,63 @@ async function seedKnownUsers() {
// ── Main seed ────────────────────────────────────────────────────────────────
// ── GRO-2123: serialize reset+seed with a Postgres advisory lock ────────
// The reset-demo-data CronJob runs on an hourly schedule. With
// concurrencyPolicy=Replace, a new pod can start while the previous one
// is still mid-seed; the new pod's TRUNCATE then deletes rows the old pod
// is still inserting, producing FK 23503 errors non-deterministically
// (see GRO-2123: invoice_tip_splits → invoices).
//
// We hold a session-level advisory lock for the full duration of the
// seed so that overlapping invocations block then proceed in order —
// not skip. The key is a stable 32-bit constant so it can be referenced
// from runbooks without ambiguity and binds to the single-argument
// `pg_advisory_lock(int)` form, which postgres-js serializes as a plain
// number (no bigint type plumbing required).
export const SEED_ADVISORY_LOCK_KEY = 0x47524f4f; // "GROO" in ASCII — arbitrary, stable
/**
* Reserve a dedicated connection from `pool`, take the seed advisory lock
* on it, run `fn`, and release the lock + connection in a try/finally.
*
* CRITICAL: with postgres-js connection pooling, a session-level
* `pg_advisory_lock(KEY)` acquired on one pooled connection and released
* on a *different* one is a no-op (the lock is bound to the session /
* pg-backend that took it). We therefore reserve a dedicated connection
* for the lock and release it from the same reserved connection. The
* seed work itself still runs on the pooled connections.
*/
export async function withSeedAdvisoryLock<T>(
pool: ReturnType<typeof postgres>,
fn: () => Promise<T>,
): Promise<T> {
const lockConnection = await pool.reserve();
let lockHeld = false;
try {
await lockConnection`SELECT pg_advisory_lock(${SEED_ADVISORY_LOCK_KEY})`;
lockHeld = true;
console.log(`✓ Acquired seed advisory lock (key=${SEED_ADVISORY_LOCK_KEY})`);
const result = await fn();
await lockConnection`SELECT pg_advisory_unlock(${SEED_ADVISORY_LOCK_KEY})`;
lockHeld = false;
console.log(`✓ Released seed advisory lock`);
return result;
} finally {
if (lockHeld) {
try {
await lockConnection`SELECT pg_advisory_unlock(${SEED_ADVISORY_LOCK_KEY})`;
} catch (err) {
console.error("Failed to release seed advisory lock during cleanup:", err);
}
}
try {
lockConnection.release();
} catch (err) {
console.error("Failed to release reserved lock connection:", err);
}
}
}
async function seed() {
const url = process.env.DATABASE_URL;
if (!url) {
@@ -961,6 +1052,22 @@ async function seed() {
const client = postgres(url, { max: 5 });
const db = drizzle(client, { schema });
// GRO-2123: hold the seed advisory lock for the full body of runSeedBody.
// See the withSeedAdvisoryLock comment for why a reserved connection is
// required (postgres-js pooling would silently drop the lock otherwise).
await withSeedAdvisoryLock(client, async () => {
return await runSeedBody(client, db, profile, cfg);
});
await client.end();
}
export async function runSeedBody(
client: ReturnType<typeof postgres>,
db: ReturnType<typeof drizzle>,
profile: SeedProfile,
cfg: ProfileConfig,
): Promise<void> {
console.log(`Seeding Groom Book database (profile: ${profile})...\n`);
// ── Staff ──
@@ -1031,7 +1138,7 @@ async function seed() {
// ── UAT staff accounts + Better Auth credentials (shared impl) ──────────────
// Seeds deterministic UAT staff with numeric OIDC subs and Better Auth credentials.
// Must run AFTER random staff are created so upserts land correctly.
await seedUatStaffAccounts(db);
const uatCustomerClientId = await seedUatStaffAccounts(db);
// ── Services ──
// GRO-2064: key the upsert on `services.id` (not `name`) so deterministic
@@ -1058,6 +1165,12 @@ async function seed() {
}
console.log(`✓ Created ${servicesDef.length} services`);
// GRO-2100: deterministic uat-groomer ↔ UAT Pup Alpha linkage. Must run
// AFTER services are seeded (this helper looks up an active service id
// to attach to the appointment; on a fresh reset there are none yet at
// the time seedUatStaffAccounts() returns).
await seedUatGroomerLinkage(db, uatCustomerClientId);
// ── Clients & Pets ──
const now = new Date();
const appointmentsBackDate = new Date(now);
@@ -1576,8 +1689,6 @@ async function seed() {
}
console.log(`✓ Created ${visitLogCount} grooming visit logs`);
console.log("\nSeed complete!");
await client.end();
}
seed().catch((err) => {