Compare commits

46 Commits

Author SHA1 Message Date
Coupon Carl c27f6a1e3c Merge pull request 'Promote to Production: CAR-1276 Phase 1 — auth /health 503 error-log fix' (#286) from uat into main
CI / test (push) Successful in 10s
CI / lint (push) Successful in 14s
CI / audit (push) Successful in 13s
CI / e2e (push) Successful in 40s
CI / lighthouse (push) Failing after 1m20s
CI / build-and-push-api (push) Successful in 1m4s
CI / build-and-push-receiptwitness (push) Successful in 1m52s
CI / build-and-push-auth (push) Successful in 1m14s
CI / build-and-push (push) Successful in 1m14s
CI / deploy-uat (push) Failing after 7s
CI / deploy-dev (push) Failing after 7s
Promote to Production: CAR-1276 Phase 1 — auth /health 503 error-log fix

UAT PASS (Deal Dottie) + Security PASS (Stockboy Steve) on CAR-1282.
Merged by CEO (Coupon Carl) as production gate.

cc @cpfarhood
2026-06-06 00:25:10 +00:00
Savannah Savings f283d5aa02 promote: auth /health 503 error-log fix (CAR-1276 Phase 1) dev→uat (#285)
CI / lint (push) Successful in 14s
CI / e2e (push) Successful in 48s
CI / test (push) Successful in 14s
CI / audit (push) Successful in 15s
CI / lighthouse (push) Failing after 1m19s
CI / build-and-push-api (push) Successful in 2m31s
CI / build-and-push-receiptwitness (push) Successful in 3m14s
CI / build-and-push-auth (push) Successful in 2m2s
CI / build-and-push (push) Failing after 2m13s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 7s
CI / audit (pull_request) Successful in 10s
CI / lint (pull_request) Successful in 11s
CI / test (pull_request) Successful in 12s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 40s
CI / lighthouse (pull_request) Failing after 1m22s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
2026-06-06 00:02:56 +00:00
Savannah Savings 39804135a4 fix(auth): log /health 503 error and surface message in body (#283, CAR-1276)
CI / audit (push) Successful in 13s
CI / test (push) Successful in 13s
CI / lint (pull_request) Successful in 14s
CI / test (pull_request) Successful in 13s
CI / lighthouse (push) Failing after 1m22s
CI / e2e (pull_request) Successful in 49s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / e2e (push) Successful in 44s
CI / audit (pull_request) Successful in 13s
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m23s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / lint (push) Successful in 16m17s
CI / build-and-push-auth (push) Successful in 38s
CI / build-and-push-api (push) Successful in 1m34s
CI / build-and-push (push) Successful in 2m44s
CI / build-and-push-receiptwitness (push) Successful in 3m52s
CI / deploy-uat (push) Has been skipped
CI / deploy-dev (push) Failing after 6s
2026-06-06 00:02:17 +00:00
Barcode Betty b2c4692400 fix(auth): log /health 503 error and surface message in body (CAR-1276)
CI / deploy-uat (pull_request) Has been skipped
CI / test (pull_request) Successful in 12s
CI / lint (pull_request) Successful in 13s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / audit (pull_request) Successful in 40s
CI / e2e (pull_request) Successful in 1m11s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m15s
The /health handler's catch block was empty, so when the DB probe
failed we had no log line to diagnose from. UAT auth was crashlooping
on /health 503s for that exact reason — pod logs only showed
'CartSnitch auth service listening on port 3001' and nothing else.

Add console.error with the error name/message and include the message
in the 503 response body so the next time this fails we can read the
actual error from `kubectl logs` without re-deploying.

This is the dev-side observability half of CAR-1276. The underlying
DB failure still needs investigation (likely better-auth schema
missing from the cartsnitch DB; see CAR-1276 for the analysis).

Tests updated to assert the new error field is present and a string.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
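The change described above can be sketched as follows. This is a minimal illustration in Python, not the auth service's actual handler (the message mentions console.error, so the real code is presumably JavaScript or TypeScript). The names `health_check` and `probe_db` and the `status` field are hypothetical; only the string `error` field in the 503 body comes from the commit message.

```python
import logging
from typing import Callable, Tuple

logger = logging.getLogger("auth.health")


def health_check(probe_db: Callable[[], None]) -> Tuple[int, dict]:
    """Return (status_code, body) for a /health request.

    probe_db is a hypothetical callable that raises when the DB is unreachable.
    """
    try:
        probe_db()
    except Exception as exc:
        # Previously the catch block was empty; log the error name and message
        # so the failure shows up in the pod logs.
        logger.error("/health DB probe failed: %s: %s", type(exc).__name__, exc)
        # Surface the message in the 503 body as a string `error` field.
        return 503, {"status": "error", "error": str(exc)}
    return 200, {"status": "ok"}
```

Calling `health_check` with a probe that raises returns a 503 whose body carries the exception message, which is what the updated tests are said to assert.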
2026-06-05 07:05:46 +00:00
Coupon Carl a0088acb1a Merge pull request 'Promote to Production: CAR-1215 react-router audit-gate fix' (#282) from uat into main
CI / lint (push) Successful in 11s
CI / audit (push) Successful in 11s
CI / test (push) Successful in 13s
CI / e2e (push) Successful in 42s
CI / build-and-push-receiptwitness (push) Failing after 55s
CI / build-and-push-auth (push) Failing after 21s
CI / lighthouse (push) Failing after 1m14s
CI / build-and-push (push) Successful in 30s
CI / build-and-push-api (push) Successful in 1m19s
CI / deploy-dev (push) Failing after 12s
CI / deploy-uat (push) Failing after 13s
Promote to Production: CAR-1215 react-router audit-gate fix

UAT PASS: Deal Dottie — all 5 regression steps green
Security PASS: Stockboy Steve — lockfile-only, 3 high advisories cleared

ref: CAR-1215, CAR-1217
2026-06-04 01:53:08 +00:00
Savannah Savings eff1098289 Promote to UAT: CAR-1215 react-router audit-gate fix (#280)
CI / audit (push) Successful in 10s
CI / lint (push) Successful in 11s
CI / test (push) Successful in 14s
CI / e2e (push) Successful in 58s
CI / lighthouse (push) Failing after 1m25s
CI / build-and-push-api (push) Successful in 1m26s
CI / build-and-push-auth (push) Successful in 43s
CI / build-and-push-receiptwitness (push) Successful in 1m59s
CI / build-and-push (push) Successful in 1m6s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 7s
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / build-and-push (pull_request) Has been skipped
CI / test (pull_request) Successful in 12s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 45s
CI / audit (pull_request) Successful in 10s
CI / lint (pull_request) Successful in 14s
CI / deploy-uat (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m17s
Promotes CAR-1215 to uat. audit gate green; lighthouse pre-existing red (tracked separately).
2026-06-03 22:14:58 +00:00
Savannah Savings 8eeaa92ad8 CAR-1215: bump react-router to 7.16.0 (clear audit gate) (#278)
CI / audit (push) Successful in 10s
CI / build-and-push-api (push) Successful in 1m23s
CI / build-and-push-auth (push) Successful in 43s
CI / lint (pull_request) Successful in 11s
CI / audit (pull_request) Successful in 13s
CI / lint (push) Successful in 14s
CI / test (push) Successful in 14s
CI / test (pull_request) Successful in 12s
CI / e2e (push) Successful in 57s
CI / lighthouse (push) Failing after 1m29s
CI / e2e (pull_request) Successful in 46s
CI / build-and-push-receiptwitness (push) Successful in 2m33s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / build-and-push (push) Successful in 54s
CI / lighthouse (pull_request) Failing after 1m17s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (push) Failing after 17s
CI / deploy-uat (push) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
Lockfile-only bump react-router/react-router-dom 7.14.0->7.16.0 clearing GHSA-49rj-9fvp-4h2h, GHSA-2j2x-hqr9-3h42, GHSA-8x6r-g9mw-2r78. QA PASS (cs_charlie), security PASS (cs_steve). audit gate now green; lighthouse pre-existing red (out of scope, tracked separately).
2026-06-03 22:14:12 +00:00
Barcode Betty fc3a0b4d92 chore(deps): bump react-router + react-router-dom to 7.16.0 (CAR-1215)
CI / lint (pull_request) Successful in 12s
CI / test (pull_request) Successful in 12s
CI / audit (pull_request) Successful in 11s
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 43s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m16s
Lockfile-only bump from 7.14.0 -> 7.16.0. The ^7.0.0 range in
package.json already permits 7.16.0, so no source changes.

Clears three high-severity advisories that block the audit CI gate:
- GHSA-49rj-9fvp-4h2h (turbo-stream arbitrary constructor invocation)
- GHSA-2j2x-hqr9-3h42 (protocol-relative URL open redirect)
- GHSA-8x6r-g9mw-2r78 (DoS via unbounded path expansion)

No runtime behavior change; react-router stays on 7.x. npm audit
--audit-level=high exits clean (0 high/critical) locally.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 21:56:05 +00:00
Savannah Savings 009aa92777 Merge pull request 'Promote to UAT: deploy-dev/deploy-uat approval-gate success (CAR-1212)' (#277) from dev into uat
CI / lint (push) Successful in 13s
CI / test (push) Successful in 13s
CI / audit (push) Failing after 11s
CI / e2e (push) Successful in 50s
CI / lighthouse (push) Failing after 1m19s
CI / build-and-push-auth (push) Successful in 31s
CI / build-and-push-api (push) Successful in 1m3s
CI / build-and-push-receiptwitness (push) Successful in 2m29s
CI / build-and-push (push) Successful in 1m40s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 6s
2026-06-03 21:49:34 +00:00
Savannah Savings 284b361f9b Merge pull request 'ci: deploy-dev/deploy-uat: report success on infra-main approval gate (CAR-1212)' (#276) from betty/car-1212-approval-gate-exit0 into dev
CI / lint (push) Successful in 14s
CI / audit (push) Failing after 13s
CI / test (push) Successful in 14s
CI / lint (pull_request) Successful in 12s
CI / e2e (push) Successful in 42s
CI / build-and-push-api (push) Successful in 1m2s
CI / build-and-push-auth (push) Successful in 32s
CI / lighthouse (push) Failing after 1m21s
CI / audit (pull_request) Failing after 13s
CI / test (pull_request) Successful in 16s
CI / build-and-push-receiptwitness (push) Successful in 1m52s
CI / e2e (pull_request) Successful in 47s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / build-and-push (push) Successful in 50s
CI / lighthouse (pull_request) Failing after 1m16s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-uat (push) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / deploy-dev (push) Failing after 8s
2026-06-03 21:49:04 +00:00
Barcode Betty 3dcf0ce021 ci: treat infra PR approvals gate as success in deploy jobs (CAR-1212)
CI / lint (pull_request) Successful in 12s
CI / test (pull_request) Successful in 12s
CI / audit (pull_request) Failing after 12s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 43s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m17s
Per the spec for CAR-1212 (CAR-1195 follow-up):

- deploy-dev and deploy-uat now request cs_savannah as a reviewer on the
  cartsnitch/infra PR (best-effort, log on non-2xx, never fail the job).
- After the merge attempt, classify the response:
  * .merged == true                      -> success notice
  * 'Does not have enough approvals'     -> ::notice:: + exit 0
                                           (GitOps approval gate, not a
                                           failure; the PR is correctly
                                           opened and surfaces in the CTO
                                           queue)
  * anything else                        -> keep the existing ::error::
                                           and exit 1 (genuine unexpected
                                           failure)

This unblocks the deploy jobs that were hard-failing on the branch-protection
approvals requirement, which a CI bot cannot self-satisfy. The CTO (cs_savannah)
already backstop-approves+merges these infra PRs by hand (e.g. #321, #322).

- 'No image changes to deploy' early-exit preserved.
- Still uses secrets.CI_GITEA_TOKEN for the PR/reviewer/merge API calls.
- No git push origin main: only the API path is used.

Refs CAR-1195, CAR-1194.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
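The three-way classification of the merge response can be sketched as below. This is an illustrative Python version under assumptions, not the workflow's actual shell step: the function and enum names are hypothetical, and the response is assumed to be a JSON body that either has `merged: true` or contains the approvals message somewhere in its text.

```python
import json
from enum import Enum


class MergeOutcome(Enum):
    MERGED = "merged"
    APPROVAL_GATE = "approval_gate"
    FAILED = "failed"


# Message text quoted in the commit description.
APPROVAL_GATE_MARKER = "Does not have enough approvals"


def classify_merge_response(raw_body: str) -> MergeOutcome:
    """Classify the raw response body of the merge attempt."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        payload = None  # Not JSON; fall through to the text check.
    if isinstance(payload, dict) and payload.get("merged") is True:
        return MergeOutcome.MERGED
    if APPROVAL_GATE_MARKER in raw_body:
        # GitOps approval gate: the PR is open and waiting for a human.
        return MergeOutcome.APPROVAL_GATE
    return MergeOutcome.FAILED


def exit_code_for(outcome: MergeOutcome) -> int:
    """Only a genuinely unexpected failure fails the job."""
    return 1 if outcome is MergeOutcome.FAILED else 0
```

Checking `.merged` first keeps a successful merge from being misread if the body happens to contain other text; anything unrecognized still exits 1.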
2026-06-03 21:34:18 +00:00
Savannah Savings b3a452be50 Merge pull request 'promote(dev→uat): CI deploy PR-based image bump (CAR-1195, CAR-1194)' (#275) from dev into uat
CI / lint (push) Successful in 11s
CI / audit (push) Successful in 11s
CI / test (push) Successful in 12s
CI / e2e (push) Successful in 45s
CI / build-and-push-api (push) Successful in 1m7s
CI / build-and-push-auth (push) Successful in 36s
CI / lighthouse (push) Failing after 1m20s
CI / build-and-push (push) Successful in 33s
CI / build-and-push-receiptwitness (push) Successful in 2m10s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 7s
2026-06-03 21:13:44 +00:00
Savannah Savings 440d7ac7e7 Merge pull request 'fix(ci): deploy jobs land image bump via PR (CAR-1195, CAR-1194)' (#274) from betty/car-1195-pr-based-deploy into dev
CI / e2e (push) Successful in 43s
CI / audit (push) Successful in 10s
CI / lint (push) Successful in 12s
CI / test (push) Successful in 13s
CI / build-and-push-api (push) Successful in 1m6s
CI / build-and-push-receiptwitness (push) Successful in 2m6s
CI / build-and-push (push) Successful in 47s
CI / lighthouse (push) Failing after 1m52s
CI / deploy-uat (push) Has been skipped
CI / deploy-dev (push) Failing after 7s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / lint (pull_request) Successful in 13s
CI / test (pull_request) Successful in 13s
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / build-and-push (pull_request) Has been skipped
CI / audit (pull_request) Successful in 12s
CI / e2e (pull_request) Successful in 40s
CI / lighthouse (pull_request) Failing after 1m14s
CI / build-and-push-auth (push) Successful in 30s
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
2026-06-03 21:06:44 +00:00
Barcode Betty 83b553b58e ci: delete overlay deploy branches after merge
CI / lint (pull_request) Successful in 13s
CI / test (pull_request) Successful in 12s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / audit (pull_request) Successful in 10s
CI / e2e (pull_request) Successful in 43s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m16s
Set delete_branch_after_merge:true on the auto-merge POST in both
deploy-dev and deploy-uat so the per-deploy branches in
cartsnitch/infra (ci/deploy-{dev,uat}-${GITHUB_SHA}) are removed
once their overlay image-tag bump lands on main. Without this flag
every successful deploy would leave a branch behind, accumulating
in cartsnitch/infra and making future re-runs of the same SHA
un-actionable from the existing branch name.

Refs CAR-1195 (CTO fix #2).
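As a rough sketch of what such a merge POST might look like, the Python below builds (but does not send) the request. It assumes the Gitea pull-request merge endpoint and its `Do` and `delete_branch_after_merge` body fields; the helper name, base URL, and the choice of `"merge"` as the merge style are hypothetical, and the real workflow may build the call differently.

```python
import json
import urllib.request


def build_merge_request(base_url: str, owner: str, repo: str,
                        pr_index: int, token: str) -> urllib.request.Request:
    """Build a merge POST that asks Gitea to delete the head branch afterwards."""
    url = f"{base_url}/api/v1/repos/{owner}/{repo}/pulls/{pr_index}/merge"
    payload = {
        "Do": "merge",                      # merge style; an assumption here
        "delete_branch_after_merge": True,  # the flag this commit adds
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_merge_request("https://gitea.example.com", "cartsnitch", "infra", 1, "dummy-token")
print(request.get_method(), request.full_url)
```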
2026-06-03 20:53:54 +00:00
Barcode Betty 3a69ec29b5 fix(ci): bind deploy PR API to secrets.CI_GITEA_TOKEN (CAR-1195)
CI / test (pull_request) Successful in 12s
CI / audit (pull_request) Successful in 11s
CI / lint (pull_request) Successful in 13s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 43s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m16s
deploy-dev and deploy-uat had CI_GITEA_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
which is the package-scoped container-registry token. PR creation and
auto-merge against cartsnitch/infra would 403 on the first real push.
Bind to secrets.CI_GITEA_TOKEN (the token the infra checkout already
uses for branch push) so the Gitea API calls have repo-write scope.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-03 20:39:21 +00:00
Barcode Betty 2573de86d5 Update .gitea/workflows/ci.yml
CI / lint (pull_request) Successful in 11s
CI / test (pull_request) Successful in 11s
CI / audit (pull_request) Successful in 12s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 45s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m12s
2026-06-03 20:09:56 +00:00
Barcode Betty 06162f9f15 fix(ci): unblock dev build/deploy (CAR-1195)
CI / audit (push) Failing after 3s
CI / lint (push) Successful in 13s
CI / test (push) Successful in 13s
CI / e2e (push) Successful in 41s
CI / build-and-push-auth (push) Successful in 41s
CI / lighthouse (push) Failing after 1m15s
CI / build-and-push (push) Successful in 58s
CI / build-and-push-api (push) Successful in 2m48s
CI / build-and-push-receiptwitness (push) Failing after 3m35s
CI / deploy-uat (push) Has been skipped
CI / deploy-dev (push) Failing after 5s
2026-06-03 19:43:54 +00:00
Savannah Savings fb70b816f2 Merge pull request 'fix(receiptwitness): pool DB engine and Redis client to prevent connection exhaustion' (#273) from barcode-betty/car-1078-email-worker-dragonfly-reset into dev
CI / audit (push) Successful in 11s
CI / test (push) Successful in 11s
CI / lint (push) Successful in 14s
CI / e2e (push) Successful in 45s
CI / lighthouse (push) Failing after 1m19s
CI / build-and-push-api (push) Failing after 3m12s
CI / build-and-push-auth (push) Failing after 2m44s
CI / build-and-push (push) Failing after 2m14s
CI / build-and-push-receiptwitness (push) Failing after 3m45s
CI / deploy-uat (push) Has been skipped
CI / deploy-dev (push) Failing after 34s
2026-06-03 19:20:31 +00:00
Coupon Carl d92bcf433b fix(ci): remove actions/setup-node from lint job to bypass corrupted runner cache
CI / test (pull_request) Successful in 12s
CI / lint (pull_request) Successful in 13s
CI / audit (pull_request) Successful in 12s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 45s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m16s
Runner pod gitea-act-runner-cartsnitch-85b5984bb-527xw has a corrupt
/root/.cache/act clone of actions/setup-node (missing dist/setup/index.js).
SHA-pinning changed the cache hash but the fresh clone on that pod still
ends up missing the dist directory.

catthehacker/ubuntu:act-latest ships Node pre-installed; the lint job only
needs ESLint + tsc, both of which are devDependencies installed by npm ci.
Removing actions/setup-node from lint bypasses the corrupt pod cache entirely
without affecting other jobs.

Refs CAR-1162

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-03 19:07:14 +00:00
Barcode Betty 01ed6dac00 fix(deps): pin safe versions of audit-flagged transitive deps (CAR-1162 audit)
CI / lint (pull_request) Failing after 6s
CI / test (pull_request) Successful in 12s
CI / audit (pull_request) Successful in 12s
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / e2e (pull_request) Successful in 40s
CI / build-and-push (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 1m11s
The CI's npm audit (10.8.2) flagged three transitive vulnerabilities
that local newer-npm runs (11.x) miss due to advisory-DB divergence:

- @babel/plugin-transform-modules-systemjs: 7.29.0 -> ^7.29.4
  (CVE-2026-44728: arbitrary code generation, fixed in 7.29.4)
- fast-uri: 3.1.0 -> ^3.1.2
  (path traversal / host confusion via percent-encoded segments)
- brace-expansion: 5.0.5 -> >=5.0.6
  (DoS via large numeric range defeating max protection)

These are non-breaking transitive updates within the same major
version. The previous override for brace-expansion (>=1.1.13) was
too loose to exclude 5.0.2-5.0.5; tightening it to >=5.0.6.

Ref CAR-1162, CAR-1122, CAR-1078

Co-Authored-By: Paperclip <noreply@paperclip.ing>
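To illustrate why the old brace-expansion override was too loose, here is a small Python sketch of a ">=" range check. It is a simplification that handles only plain x.y.z versions (no pre-release tags) and is not npm's actual semver implementation; the helper names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn 'x.y.z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies_minimum(version: str, minimum: str) -> bool:
    """Simplified check for a '>=minimum' range."""
    return parse_version(version) >= parse_version(minimum)


for candidate in ("5.0.2", "5.0.5", "5.0.6"):
    old_ok = satisfies_minimum(candidate, "1.1.13")  # previous override
    new_ok = satisfies_minimum(candidate, "5.0.6")   # tightened override
    print(f"{candidate}: old override allows={old_ok}, new override allows={new_ok}")
```

Under this check, 5.0.2 through 5.0.5 all pass the old `>=1.1.13` floor but fail `>=5.0.6`, which matches the reasoning in the message.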
2026-06-03 15:53:46 +00:00
Barcode Betty a7a55bbf79 fix(ci): unblock dev PR #271 CI
CI / audit (pull_request) Failing after 14s
CI / lighthouse (pull_request) Has been skipped
CI / build-and-push (pull_request) Has been skipped
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / test (pull_request) Successful in 16s
CI / lint (pull_request) Successful in 18s
CI / e2e (pull_request) Successful in 52s
- Remove .mcp.json (scope creep, unrelated to CAR-1078)
- Bump vitest to ^4.1.8 (fixes GHSA-5xrq-8626-4rwp critical)
- Run npm audit fix for non-breaking vulns
- Pin actions/checkout and actions/setup-node to commit SHAs
  in .gitea/workflows/ci.yml to force a clean cache fetch on
  the act runner (workaround for corrupted /root/.cache/act cache)

Refs CAR-1162, CAR-1122, CAR-1078
2026-06-03 11:41:19 +00:00
Flea Flicker fb0bb0102c fix(receiptwitness): pool DB engine and Redis client to prevent connection exhaustion
CI / test (pull_request) Failing after 3s
CI / lighthouse (pull_request) Has been skipped
CI / e2e (pull_request) Failing after 4s
CI / audit (pull_request) Failing after 14s
CI / lint (pull_request) Successful in 16s
CI / build-and-push (pull_request) Has been skipped
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
email_worker calls get_async_session_factory() inside every resolve_user()
call, which creates a brand-new async engine (and thus a brand-new
connection pool) on every message.  In a tight consumer loop processing
5 messages per batch, this rapidly exhausts DragonflyDB/Postgres
connection limits and manifests as ConnectionResetError.

Fix: cache the async engine in a module-level dict keyed by URL in
cartsnitch_common.database:get_async_engine(), matching the pattern
already used in receiptwitness:events.py for the Redis connection pool.
Also add pool_size=10, max_overflow=20, pool_pre_ping=True for
robust connection management.

Similarly, receiptwitness/queue/email.py:get_redis() was creating a new
Redis connection on every call with no pooling.  Share a
ConnectionPool (max_connections=30) across all get_redis() callers.

Fixes CAR-1078
Co-Authored-By: Paperclip <noreply@paperclip.ing>
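The caching pattern described above can be sketched like this. It is a minimal, dependency-free illustration, not the code in cartsnitch_common.database: the real signature of get_async_engine() is not shown in the message, and the `create` parameter here is a hypothetical stand-in for an engine factory such as SQLAlchemy's `create_async_engine` (assumed from the pool option names).

```python
from typing import Any, Callable, Dict

# Module-level cache: one engine (and therefore one connection pool) per URL.
_ENGINES: Dict[str, Any] = {}


def get_async_engine(url: str, create: Callable[..., Any]) -> Any:
    """Return the cached engine for `url`, creating it only on first use."""
    engine = _ENGINES.get(url)
    if engine is None:
        # Pool settings taken from the commit message.
        engine = create(url, pool_size=10, max_overflow=20, pool_pre_ping=True)
        _ENGINES[url] = engine
    return engine
```

With this shape, repeated calls from a consumer loop reuse the same engine instead of opening a new pool per message. The sketch takes no lock, which is an assumption that callers share a single event loop or thread.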
2026-05-28 18:53:05 +00:00
Coupon Carl 80786b9f1f fix(ci): use CI_GITEA_TOKEN for cross-repo checkout
CI / audit (push) Failing after 16s
CI / e2e (push) Successful in 52s
CI / lint (push) Successful in 1m14s
CI / test (push) Successful in 1m16s
CI / build-and-push (push) Failing after 14s
CI / build-and-push-api (push) Failing after 17s
CI / build-and-push-auth (push) Failing after 12s
CI / lighthouse (push) Failing after 1m5s
CI / build-and-push-receiptwitness (push) Failing after 3m23s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 10s
Update deploy-dev and deploy-uat jobs to use CI_GITEA_TOKEN for
checking out the cartsnitch/infra repository instead of REGISTRY_TOKEN.

CI_GITEA_TOKEN is the org-level Actions secret configured for cross-repo
access, while REGISTRY_TOKEN continues to be used for Docker registry login.

This resolves CAR-986 by enabling CI to commit image tag updates to
the private infra repository.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-25 22:47:40 +00:00
Chris Farhood d90b00d7ac Add .mcp.json
CI / lint (push) Successful in 14s
CI / test (push) Successful in 11s
CI / audit (push) Failing after 10s
CI / e2e (push) Successful in 43s
CI / lighthouse (push) Failing after 44s
CI / build-and-push-receiptwitness (push) Failing after 11s
CI / build-and-push-api (push) Failing after 16s
CI / build-and-push-auth (push) Failing after 10s
CI / build-and-push (push) Failing after 12s
CI / deploy-uat (push) Failing after 27s
CI / deploy-dev (push) Failing after 31s
2026-05-25 21:47:10 +00:00
Savannah Savings 8983fe5d8f Merge pull request 'Promote to Production: CAR-894 Gitea workflows migration' (#270) from uat into main
CI / lint (push) Successful in 12s
CI / test (push) Successful in 12s
CI / audit (push) Failing after 21s
CI / e2e (push) Successful in 40s
CI / build-and-push-receiptwitness (push) Failing after 31s
CI / build-and-push-api (push) Failing after 15s
CI / lighthouse (push) Failing after 46s
CI / build-and-push-auth (push) Failing after 19s
CI / build-and-push (push) Failing after 12s
CI / deploy-dev (push) Failing after 33s
CI / deploy-uat (push) Failing after 32s
2026-05-24 18:51:41 +00:00
Savannah Savings a26082d099 Merge pull request 'Promote dev → uat: Fix API crash (dispose_engine import)' (#268) from dev into uat
CI / build-and-push-auth (push) Failing after 10s
CI / test (push) Successful in 14s
CI / build-and-push-receiptwitness (push) Failing after 12s
CI / build-and-push-api (push) Failing after 13s
CI / deploy-dev (push) Has been skipped
CI / audit (push) Failing after 10s
CI / lint (push) Successful in 15s
CI / e2e (push) Successful in 39s
CI / build-and-push (push) Failing after 10s
CI / lighthouse (push) Failing after 41s
CI / deploy-uat (push) Failing after 48s
CI / lint (pull_request) Successful in 40s
CI / e2e (pull_request) Successful in 40s
CI / audit (pull_request) Failing after 1m12s
CI / test (pull_request) Successful in 1m18s
CI / build-and-push (pull_request) Has been skipped
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / lighthouse (pull_request) Failing after 46s
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
Merge PR #268: Promote dev → uat — Fix API crash (dispose_engine import)

Promotes fix for ImportError/CrashLoopBackOff to UAT environment.

Approved-by: Savannah Savings (CTO)
2026-05-23 15:52:56 +00:00
Savannah Savings f8b8f4feef Merge pull request 'Fix API crash: remove dead dispose_engine import' (#266) from fix/dispose-engine-import into dev
CI / build-and-push (push) Has been skipped
CI / build-and-push-api (pull_request) Has been skipped
CI / build-and-push (pull_request) Has been skipped
CI / deploy-uat (push) Has been skipped
CI / build-and-push-receiptwitness (pull_request) Has been skipped
CI / build-and-push-auth (pull_request) Has been skipped
CI / deploy-dev (pull_request) Has been skipped
CI / deploy-uat (pull_request) Has been skipped
CI / deploy-dev (push) Failing after 1m46s
CI / test (push) Failing after 1m38s
CI / test (pull_request) Failing after 1m27s
CI / lighthouse (pull_request) Has been skipped
CI / audit (pull_request) Failing after 1m31s
CI / e2e (pull_request) Failing after 1m34s
CI / build-and-push-api (push) Has been skipped
CI / lint (push) Failing after 1m29s
CI / audit (push) Failing after 1m31s
CI / lighthouse (push) Has been skipped
CI / e2e (push) Failing after 1m36s
CI / build-and-push-receiptwitness (push) Has been skipped
CI / build-and-push-auth (push) Has been skipped
CI / lint (pull_request) Failing after 1m30s
Merge PR #266: Fix API crash — remove dead dispose_engine import

Removes non-existent dispose_engine import from main.py that caused ImportError and CrashLoopBackOff on API pods.

Reviewed-by: Checkout Charlie (QA PASS)
Approved-by: Savannah Savings (CTO)
2026-05-23 15:52:33 +00:00
Savannah Savings c39b26050b Merge pull request 'Promote dev → uat: CI registry migration [CAR-933]' (#265) from dev into uat
CI / test (push) Failing after 1m28s
CI / lighthouse (push) Has been skipped
CI / build-and-push-api (push) Has been skipped
CI / build-and-push-auth (push) Has been skipped
CI / lint (push) Failing after 1m34s
CI / build-and-push-receiptwitness (push) Has been skipped
CI / build-and-push (push) Has been skipped
CI / audit (push) Failing after 1m32s
CI / e2e (push) Failing after 1m37s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 1m30s
Promote dev → uat: CI registry migration [CAR-933] (#265)
2026-05-23 14:39:41 +00:00
Savannah Savings 6b6a50b9ec Merge pull request 'Promote dev → uat: .gitea/workflows migration [CAR-934]' (#261) from dev into uat
CI / lint (push) Successful in 14s
CI / test (push) Successful in 12s
CI / audit (push) Failing after 10s
CI / e2e (push) Successful in 39s
CI / build-and-push-receiptwitness (push) Failing after 8s
CI / build-and-push-api (push) Failing after 7s
CI / build-and-push-auth (push) Failing after 7s
CI / build-and-push (push) Failing after 8s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 42s
CI / lighthouse (push) Failing after 17m22s
Promote dev → uat: .gitea/workflows migration [CAR-934]

cc @cpfarhood
2026-05-21 19:19:40 +00:00
Savannah Savings 7c021c4eb5 Merge pull request 'chore: promote dev to uat - Gitea Actions workflow conversion' (#254) from dev into uat
CI / test (push) Successful in 13s
CI / lint (push) Successful in 14s
CI / audit (push) Failing after 10s
CI / build-and-push-receiptwitness (push) Failing after 7s
CI / build-and-push-api (push) Failing after 7s
CI / build-and-push-auth (push) Failing after 8s
CI / e2e (push) Successful in 42s
CI / build-and-push (push) Failing after 8s
CI / deploy-dev (push) Has been skipped
CI / deploy-uat (push) Failing after 23s
CI / lighthouse (push) Failing after 1m16s
2026-05-21 04:23:11 +00:00
savannah-savings-cto[bot] a5404dc824 promote: dev → uat (fix auth tsc build) (#252)
2026-05-05 11:19:44 +00:00
savannah-savings-cto[bot] 618da593a6 Merge pull request #250 from cartsnitch/dev
ci: promote dev → uat (auth CI pipeline)
2026-05-05 10:56:35 +00:00
coupon-carl-ceo[bot] e3ed19f98c release: promote uat → main (seed tooling CAR-812 + auth health)
CI / test (pull_request) Has been cancelled
CI / audit (pull_request) Has been cancelled
CI / e2e (pull_request) Has been cancelled
CI / build-and-push-receiptwitness (pull_request) Has been cancelled
CI / deploy-uat (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
CI / lighthouse (pull_request) Has been cancelled
CI / build-and-push (pull_request) Has been cancelled
CI / build-and-push-api (pull_request) Has been cancelled
CI / deploy-dev (pull_request) Has been cancelled
UAT PASS (Deal Dottie, 2026-05-04) + Security PASS (Stockboy Steve, 2026-05-04)

Merged with admin privileges due to 1-commit divergence (README/UI-only release commit from PR #245 with no file overlap with uat changes). No functional conflict.

Refs: CAR-842, CAR-812
2026-05-04 21:55:13 +00:00
savannah-savings-cto[bot] e54736d900 chore: promote dev → uat (seed tooling, CAR-812) (#247)
chore: promote dev → uat (seed tooling, CAR-812)
2026-05-04 21:44:34 +00:00
savannah-savings-cto[bot] 40abf64888 chore: promote dev → uat (auth health routing fix) (#246)
chore: promote dev → uat (auth health routing fix)
2026-05-04 21:17:31 +00:00
savannah-savings-cto[bot] 3615a78f0e release: remove mock auth bypass + README expansion (CAR-813/CAR-829)
release: remove mock auth bypass + README expansion (CAR-813/CAR-829)
2026-05-04 19:42:36 +00:00
savannah-savings-cto[bot] d785606bd1 Merge main into uat to bring up to date for production release
2026-05-04 19:41:47 +00:00
savannah-savings-cto[bot] 48eaf45121 Merge pull request #244 from cartsnitch/dev
promote: dev → uat (README expansion)
2026-05-04 19:00:18 +00:00
savannah-savings-cto[bot] 4bf5cd3826 Merge pull request #242 from cartsnitch/dev
Promote dev → uat: remove VITE_MOCK_AUTH bypass (#181)
2026-05-04 16:23:33 +00:00
coupon-carl-ceo[bot] a3fca65ea1 Merge pull request #239 from cartsnitch/uat
release: lifespan DB/Redis connection pooling (CAR-550)
2026-05-04 15:41:53 +00:00
savannah-savings-cto[bot] 25c27d08fe Merge pull request #241 from cartsnitch/dev
promote: dev → uat (color contrast accessibility fix)
2026-05-04 15:31:13 +00:00
Chris Farhood aaf645fbe9 ci: retrigger e2e after runner network outage [CAR-799]
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-04 15:30:28 +00:00
savannah-savings-cto[bot] 80aa58b37a Merge pull request #240 from cartsnitch/dev
Promote dev → uat: PR #178 (fix N+1 UPC scan with Postgres JSON containment)
2026-05-04 15:20:28 +00:00
savannah-savings-cto[bot] 062f6be8ea Merge pull request #238 from cartsnitch/dev
Promote dev to UAT: lifespan DB/Redis connection pooling
2026-05-04 15:07:59 +00:00
savannah-savings-cto[bot] 60beb2d89e Merge pull request #237 from cartsnitch/uat
release: remove auth image build from monorepo CI (CAR-749)
2026-04-20 18:53:47 +00:00
savannah-savings-cto[bot] 9120c834e4 Merge pull request #236 from cartsnitch/dev
Promote dev to UAT: remove auth image build from CI
2026-04-20 18:01:29 +00:00
7 changed files with 528 additions and 450 deletions
+145 -41
@@ -26,11 +26,7 @@ jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: "20"
cache: npm
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
- run: npm ci
- name: ESLint
run: npx eslint .
@@ -40,8 +36,8 @@ jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
with:
node-version: "20"
cache: npm
@@ -52,8 +48,8 @@ jobs:
audit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
with:
node-version: "20"
cache: npm
@@ -64,8 +60,8 @@ jobs:
e2e:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
with:
node-version: "20"
cache: npm
@@ -77,8 +73,8 @@ jobs:
runs-on: ubuntu-latest
needs: [test]
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
with:
node-version: "20"
cache: npm
@@ -106,7 +102,7 @@ jobs:
calver_tag: ${{ steps.calver.outputs.version }}
sha_tag: sha-${{ github.sha }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
with:
fetch-depth: 0
@@ -160,8 +156,8 @@ jobs:
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
target: prod
cache-from: type=gha
cache-to: type=gha,mode=max
cache-from: type=inline
cache-to: type=inline,mode=max
- name: Scan frontend image for vulnerabilities
uses: anchore/scan-action@v5
@@ -186,7 +182,7 @@ jobs:
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
target: prod
cache-from: type=gha
cache-from: type=inline
- name: Create git tag
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
@@ -202,7 +198,7 @@ jobs:
calver_tag: ${{ steps.calver.outputs.version }}
sha_tag: sha-${{ github.sha }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
with:
fetch-depth: 0
@@ -252,8 +248,8 @@ jobs:
labels: ${{ steps.meta.outputs.labels }}
build-args: |
APT_CACHE_BUST=${{ github.run_id }}
cache-from: type=gha
cache-to: type=gha,mode=max
cache-from: type=inline
cache-to: type=inline,mode=max
- name: Scan receiptwitness image for vulnerabilities
uses: anchore/scan-action@v5
@@ -280,7 +276,7 @@ jobs:
labels: ${{ steps.meta.outputs.labels }}
build-args: |
APT_CACHE_BUST=${{ github.run_id }}
cache-from: type=gha
cache-from: type=inline
build-and-push-api:
runs-on: ubuntu-latest
@@ -290,7 +286,7 @@ jobs:
calver_tag: ${{ steps.calver.outputs.version }}
sha_tag: sha-${{ github.sha }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
with:
fetch-depth: 0
@@ -340,8 +336,8 @@ jobs:
labels: ${{ steps.meta.outputs.labels }}
build-args: |
APT_CACHE_BUST=${{ github.run_id }}
cache-from: type=gha
cache-to: type=gha,mode=max
cache-from: type=inline
cache-to: type=inline,mode=max
- name: Scan api image for vulnerabilities
uses: anchore/scan-action@v5
@@ -368,7 +364,7 @@ jobs:
labels: ${{ steps.meta.outputs.labels }}
build-args: |
APT_CACHE_BUST=${{ github.run_id }}
cache-from: type=gha
cache-from: type=inline
build-and-push-auth:
runs-on: ubuntu-latest
@@ -378,7 +374,7 @@ jobs:
calver_tag: ${{ steps.calver.outputs.version }}
sha_tag: sha-${{ github.sha }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
with:
fetch-depth: 0
@@ -428,8 +424,8 @@ jobs:
labels: ${{ steps.meta.outputs.labels }}
build-args: |
APT_CACHE_BUST=${{ github.run_id }}
cache-from: type=gha
cache-to: type=gha,mode=max
cache-from: type=inline
cache-to: type=inline,mode=max
- name: Scan auth image for vulnerabilities
uses: anchore/scan-action@v5
@@ -456,7 +452,7 @@ jobs:
labels: ${{ steps.meta.outputs.labels }}
build-args: |
APT_CACHE_BUST=${{ github.run_id }}
cache-from: type=gha
cache-from: type=inline
deploy-dev:
runs-on: ubuntu-latest
@@ -464,10 +460,10 @@ jobs:
if: always() && !cancelled() && github.event_name == 'push' && (github.ref == 'refs/heads/dev' || github.ref == 'refs/heads/main')
steps:
- name: Checkout infra repo
uses: actions/checkout@v4
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
with:
repository: cartsnitch/infra
token: ${{ secrets.REGISTRY_TOKEN }}
token: ${{ secrets.CI_GITEA_TOKEN }}
ref: main
path: infra
@@ -475,7 +471,16 @@ jobs:
uses: azure/setup-kubectl@v4
- name: Install kustomize
uses: imranismail/setup-kustomize@v2
# imranismail/setup-kustomize@v2 calls the Gitea API to record
# telemetry under the "kubernetes-sigs" user, which doesn't exist
# on this Gitea instance. Install the binary directly instead.
run: |
set -euo pipefail
version="5.4.3"
url="https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv${version}/kustomize_v${version}_linux_amd64.tar.gz"
curl -fsSL --retry 3 "$url" | tar -xz -C /tmp kustomize
sudo install -m 0755 /tmp/kustomize /usr/local/bin/kustomize
kustomize version
- name: Determine image tag for frontend
id: frontend_tag
@@ -537,16 +542,61 @@ jobs:
cd infra/apps/overlays/dev
kustomize edit set image ghcr.io/cartsnitch/auth=git.farh.net/cartsnitch/auth:${{ steps.auth_tag.outputs.tag }}
- name: Commit and push to infra
- name: Commit and push to infra (via PR)
env:
CI_GITEA_TOKEN: ${{ secrets.CI_GITEA_TOKEN }}
run: |
cd infra
git config user.name "cartsnitch-ci[bot]"
git config user.email "cartsnitch-ci[bot]@users.noreply.git.farh.net"
git add apps/overlays/dev/kustomization.yaml
git diff --cached --quiet && echo "No image changes to deploy" && exit 0
BRANCH="ci/deploy-dev-${GITHUB_SHA}"
git checkout -b "$BRANCH"
git commit -m "ci(dev): update cartsnitch, receiptwitness, api, and auth images"
git pull --rebase origin main
git push origin main
git push origin "$BRANCH"
PR_BODY=$(printf 'Auto-opened by deploy-dev (CAR-1195).\n\nBuild SHA: %s' "${GITHUB_SHA}")
PR_JSON=$(curl -sS -X POST \
-H "Authorization: token ${CI_GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d "$(jq -n --arg head "cartsnitch:${BRANCH}" --arg base main --arg title "ci(dev): update overlay image tags (${GITHUB_SHA::12})" --arg body "$PR_BODY" '{head:$head,base:$base,title:$title,body:$body}')" \
"https://git.farh.net/api/v1/repos/cartsnitch/infra/pulls")
PR_NUM=$(echo "$PR_JSON" | jq -r '.number // empty')
if [ -z "$PR_NUM" ]; then
echo "::error::Failed to open PR against cartsnitch/infra: $PR_JSON"
exit 1
fi
echo "Opened cartsnitch/infra PR #${PR_NUM} (head=${BRANCH})"
# Request CTO (cs_savannah) review as the GitOps hand-off. Best-effort:
# log on non-2xx but never fail the job for this.
REVIEW_HTTP=$(curl -sS -o /dev/null -w '%{http_code}' -X POST \
-H "Authorization: token ${CI_GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"reviewers":["cs_savannah"]}' \
"https://git.farh.net/api/v1/repos/cartsnitch/infra/pulls/${PR_NUM}/requested_reviewers")
if [ "${REVIEW_HTTP}" -lt 200 ] || [ "${REVIEW_HTTP}" -ge 300 ]; then
echo "::notice::Failed to request reviewers for cartsnitch/infra PR #${PR_NUM} (HTTP ${REVIEW_HTTP}); continuing"
fi
MERGE_RESP=$(curl -sS -X POST \
-H "Authorization: token ${CI_GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"Do":"merge","delete_branch_after_merge":true}' \
"https://git.farh.net/api/v1/repos/cartsnitch/infra/pulls/${PR_NUM}/merge")
MERGED=$(echo "$MERGE_RESP" | jq -r '.merged // false')
if [ "$MERGED" = "true" ]; then
echo "PR #${PR_NUM} merged into cartsnitch/infra main"
elif echo "$MERGE_RESP" | grep -qi 'does not have enough approvals'; then
# GitOps approval gate: the PR is correctly opened and surfaces in
# the CTO queue via the reviewers request above. Treat as success
# (exit 0) so the deploy job does not hard-fail on the approvals
# requirement that only a human maintainer can satisfy.
echo "::notice::infra PR #${PR_NUM} opened and awaiting CTO (cs_savannah) approve+merge — GitOps approval gate, not a failure"
exit 0
else
echo "::error::Auto-merge of cartsnitch/infra PR #${PR_NUM} failed: $MERGE_RESP"
echo "::error::Reassign to cs_savannah (authorized merger for cartsnitch/infra main) for backstop merge."
exit 1
fi
deploy-uat:
runs-on: ubuntu-latest
@@ -554,10 +604,10 @@ jobs:
if: always() && !cancelled() && github.event_name == 'push' && (github.ref == 'refs/heads/uat' || github.ref == 'refs/heads/main')
steps:
- name: Checkout infra repo
uses: actions/checkout@v4
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
with:
repository: cartsnitch/infra
token: ${{ secrets.REGISTRY_TOKEN }}
token: ${{ secrets.CI_GITEA_TOKEN }}
ref: main
path: infra
@@ -565,7 +615,16 @@ jobs:
uses: azure/setup-kubectl@v4
- name: Install kustomize
uses: imranismail/setup-kustomize@v2
# imranismail/setup-kustomize@v2 calls the Gitea API to record
# telemetry under the "kubernetes-sigs" user, which doesn't exist
# on this Gitea instance. Install the binary directly instead.
run: |
set -euo pipefail
version="5.4.3"
url="https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv${version}/kustomize_v${version}_linux_amd64.tar.gz"
curl -fsSL --retry 3 "$url" | tar -xz -C /tmp kustomize
sudo install -m 0755 /tmp/kustomize /usr/local/bin/kustomize
kustomize version
- name: Determine image tag for frontend
id: frontend_tag
@@ -627,13 +686,58 @@ jobs:
cd infra/apps/overlays/uat
kustomize edit set image ghcr.io/cartsnitch/auth=git.farh.net/cartsnitch/auth:${{ steps.auth_tag.outputs.tag }}
- name: Commit and push to infra
- name: Commit and push to infra (via PR)
env:
CI_GITEA_TOKEN: ${{ secrets.CI_GITEA_TOKEN }}
run: |
cd infra
git config user.name "cartsnitch-ci[bot]"
git config user.email "cartsnitch-ci[bot]@users.noreply.git.farh.net"
git add apps/overlays/uat/kustomization.yaml
git diff --cached --quiet && echo "No image changes to deploy" && exit 0
BRANCH="ci/deploy-uat-${GITHUB_SHA}"
git checkout -b "$BRANCH"
git commit -m "ci(uat): update cartsnitch, receiptwitness, api, and auth images"
git pull --rebase origin main
git push origin main
git push origin "$BRANCH"
PR_BODY=$(printf 'Auto-opened by deploy-uat (CAR-1195).\n\nBuild SHA: %s' "${GITHUB_SHA}")
PR_JSON=$(curl -sS -X POST \
-H "Authorization: token ${CI_GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d "$(jq -n --arg head "cartsnitch:${BRANCH}" --arg base main --arg title "ci(uat): update overlay image tags (${GITHUB_SHA::12})" --arg body "$PR_BODY" '{head:$head,base:$base,title:$title,body:$body}')" \
"https://git.farh.net/api/v1/repos/cartsnitch/infra/pulls")
PR_NUM=$(echo "$PR_JSON" | jq -r '.number // empty')
if [ -z "$PR_NUM" ]; then
echo "::error::Failed to open PR against cartsnitch/infra: $PR_JSON"
exit 1
fi
echo "Opened cartsnitch/infra PR #${PR_NUM} (head=${BRANCH})"
# Request CTO (cs_savannah) review as the GitOps hand-off. Best-effort:
# log on non-2xx but never fail the job for this.
REVIEW_HTTP=$(curl -sS -o /dev/null -w '%{http_code}' -X POST \
-H "Authorization: token ${CI_GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"reviewers":["cs_savannah"]}' \
"https://git.farh.net/api/v1/repos/cartsnitch/infra/pulls/${PR_NUM}/requested_reviewers")
if [ "${REVIEW_HTTP}" -lt 200 ] || [ "${REVIEW_HTTP}" -ge 300 ]; then
echo "::notice::Failed to request reviewers for cartsnitch/infra PR #${PR_NUM} (HTTP ${REVIEW_HTTP}); continuing"
fi
MERGE_RESP=$(curl -sS -X POST \
-H "Authorization: token ${CI_GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"Do":"merge","delete_branch_after_merge":true}' \
"https://git.farh.net/api/v1/repos/cartsnitch/infra/pulls/${PR_NUM}/merge")
MERGED=$(echo "$MERGE_RESP" | jq -r '.merged // false')
if [ "$MERGED" = "true" ]; then
echo "PR #${PR_NUM} merged into cartsnitch/infra main"
elif echo "$MERGE_RESP" | grep -qi 'does not have enough approvals'; then
# GitOps approval gate: the PR is correctly opened and surfaces in
# the CTO queue via the reviewers request above. Treat as success
# (exit 0) so the deploy job does not hard-fail on the approvals
# requirement that only a human maintainer can satisfy.
echo "::notice::infra PR #${PR_NUM} opened and awaiting CTO (cs_savannah) approve+merge — GitOps approval gate, not a failure"
exit 0
else
echo "::error::Auto-merge of cartsnitch/infra PR #${PR_NUM} failed: $MERGE_RESP"
echo "::error::Reassign to cs_savannah (authorized merger for cartsnitch/infra main) for backstop merge."
exit 1
fi
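The merge step in both deploy jobs above sorts the merge response into three outcomes: merged, blocked on the approval gate (treated as success), and any other failure. The following is only a sketch of that decision in Python, not the workflow's actual implementation. The names `MergeOutcome`, `classify_merge_response`, and `exit_code` are hypothetical, and the response bodies a real Gitea instance returns may differ from what this assumes.

```python
import json
from enum import Enum


class MergeOutcome(Enum):
    MERGED = "merged"
    AWAITING_APPROVAL = "awaiting_approval"
    FAILED = "failed"


# Same substring the shell step greps for (case-insensitively).
APPROVAL_MARKER = "does not have enough approvals"


def classify_merge_response(raw: str) -> MergeOutcome:
    """Classify the raw body returned by the merge request."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Empty or non-JSON body: fall through to the text checks.
        parsed = None
    if isinstance(parsed, dict) and parsed.get("merged") is True:
        return MergeOutcome.MERGED
    if APPROVAL_MARKER in raw.lower():
        return MergeOutcome.AWAITING_APPROVAL
    return MergeOutcome.FAILED


def exit_code(outcome: MergeOutcome) -> int:
    """Only an unexplained failure should fail the job."""
    return 1 if outcome is MergeOutcome.FAILED else 0
```

Keeping the approval gate as its own outcome makes the intent explicit: a PR waiting for a human approval is a hand-off, while an empty or unexpected body still fails the job.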
+23 -4
@@ -19,9 +19,18 @@ describe('Auth health endpoint', () => {
}
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ status: 'ok', db: 'reachable' }));
} catch {
} catch (err) {
// Mirror src/index.ts: log the error and include the message in the
// response body so /health 503s are diagnosable from pod logs.
console.error(
'[auth /health] DB probe failed:',
err instanceof Error ? `${err.name}: ${err.message}` : err,
);
const detail = err instanceof Error ? err.message : 'unknown error';
res.writeHead(503, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ status: 'error', db: 'unreachable' }));
res.end(
JSON.stringify({ status: 'error', db: 'unreachable', error: detail }),
);
}
return;
}
@@ -76,7 +85,10 @@ describe('Auth health endpoint', () => {
close();
equal(status, 503);
equal(body, '{"status":"error","db":"unreachable"}');
const parsed = JSON.parse(body);
equal(parsed.status, 'error');
equal(parsed.db, 'unreachable');
equal(parsed.error, 'connection refused');
});
it('returns 503 with db=unreachable when query times out', async () => {
@@ -95,7 +107,14 @@ describe('Auth health endpoint', () => {
close();
equal(status, 503);
equal(body, '{"status":"error","db":"unreachable"}');
const parsed = JSON.parse(body);
equal(parsed.status, 'error');
equal(parsed.db, 'unreachable');
// The query promise rejects with a synthetic 'timeout' error; the
// Promise.race wrapper also rejects with 'DB timeout'. The body should
// surface whichever error was thrown — accept either to stay robust.
equal(typeof parsed.error, 'string');
equal(parsed.error.length > 0, true);
});
it('returns a terminal response for unknown paths (no hang)', async () => {
+12 -2
@@ -21,9 +21,19 @@ const server = createServer(async (req, res) => {
}
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ status: "ok", db: "reachable" }));
} catch {
} catch (err) {
// Log the actual error so /health 503s are diagnosable from pod logs
// (CAR-1276: UAT auth was crashlooping with no log output beyond the
// initial "listening on port 3001" line because this catch was empty).
console.error(
"[auth /health] DB probe failed:",
err instanceof Error ? `${err.name}: ${err.message}` : err,
);
const detail = err instanceof Error ? err.message : "unknown error";
res.writeHead(503, { "Content-Type": "application/json" });
res.end(JSON.stringify({ status: "error", db: "unreachable" }));
res.end(
JSON.stringify({ status: "error", db: "unreachable", error: detail }),
);
}
return;
}
+23 -4
@@ -1,17 +1,36 @@
"""Database engine and session factories for sync and async usage."""
from collections.abc import AsyncGenerator, Generator
from typing import TYPE_CHECKING
from sqlalchemy import create_engine
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from sqlalchemy.ext.asyncio import AsyncEngine, AsyncSession, async_sessionmaker, create_async_engine
from sqlalchemy.orm import Session, sessionmaker
from cartsnitch_common.config import settings
if TYPE_CHECKING:
from sqlalchemy.engine import Engine
def get_async_engine(url: str | None = None):
"""Create an async SQLAlchemy engine."""
return create_async_engine(url or settings.database_url, echo=settings.debug)
# Module-level async engine cache — one engine per unique URL, shared across all callers.
# This prevents pool exhaustion in high-throughput workers (e.g. email-worker hitting
# DragonflyDB/Postgres repeatedly per message). pool_size=10, max_overflow=20 gives
# headroom for bursts while capping max connections at 30 per URL.
_async_engine_cache: dict[str, "AsyncEngine"] = {}
def get_async_engine(url: str | None = None) -> "AsyncEngine":
"""Get or create a cached async engine for the given URL."""
target = url or settings.database_url
if target not in _async_engine_cache:
_async_engine_cache[target] = create_async_engine(
target,
echo=settings.debug,
pool_size=10,
max_overflow=20,
pool_pre_ping=True,
)
return _async_engine_cache[target]
def get_sync_engine(url: str | None = None):
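The engine cache in this hunk comes down to two ideas: build at most one engine per URL, and cap connections at `pool_size + max_overflow`. The sketch below illustrates both with the standard library only. `FakeEngine` and `get_engine` are hypothetical stand-ins, not SQLAlchemy APIs; the real code calls `create_async_engine`.

```python
from dataclasses import dataclass


@dataclass
class FakeEngine:
    """Stand-in for an engine; records the pool settings it was built with."""

    url: str
    pool_size: int
    max_overflow: int

    @property
    def max_connections(self) -> int:
        # Upper bound on simultaneous connections for this engine.
        return self.pool_size + self.max_overflow


# One engine per unique URL, shared by every caller in the process.
_engine_cache: dict[str, FakeEngine] = {}


def get_engine(url: str) -> FakeEngine:
    """Return the cached engine for url, creating it on first use."""
    if url not in _engine_cache:
        _engine_cache[url] = FakeEngine(url=url, pool_size=10, max_overflow=20)
    return _engine_cache[url]


first = get_engine("postgresql://db-a/app")
second = get_engine("postgresql://db-a/app")
other = get_engine("postgresql://db-b/app")
print(first is second, first is other, first.max_connections)
```

The check-and-insert has no lock. That is fine when all callers run on one event loop, since nothing awaits between the check and the assignment. Callers on multiple threads could race and briefly build two engines for the same URL.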
+297 -391
File diff suppressed because it is too large
+5 -3
@@ -45,14 +45,16 @@
"typescript-eslint": "^8.56.1",
"vite": "^6.4.2",
"vite-plugin-pwa": "^0.21.2",
"vitest": "^3.2.4"
"vitest": "^4.1.8"
},
"overrides": {
"@rollup/pluginutils": "5.3.0",
"flatted": "^3.4.2",
"serialize-javascript": "7.0.5",
"brace-expansion": ">=1.1.13",
"brace-expansion": ">=5.0.6",
"lodash": ">=4.17.24",
"minimatch": "^10.2.4"
"minimatch": "^10.2.4",
"@babel/plugin-transform-modules-systemjs": "^7.29.4",
"fast-uri": "^3.1.2"
}
}
@@ -16,6 +16,29 @@ logger = logging.getLogger(__name__)
STREAM_KEY = "email:receipts"
CONSUMER_GROUP = "email-workers"
# Module-level Redis/DragonflyDB connection pool — shared across all worker calls.
# Without pooling, each call to get_redis() opens a new TCP connection. In a tight
# consumer loop this causes ConnectionResetError when DragonflyDB's connection limit
# is hit under load. max_connections=30 (10 base + 20 overflow) mirrors the engine pool.
_redis_pool: aioredis.ConnectionPool | None = None
def _get_redis_pool() -> aioredis.ConnectionPool:
"""Get or create the shared DragonflyDB connection pool."""
global _redis_pool
if _redis_pool is None:
_redis_pool = aioredis.ConnectionPool.from_url(
settings.redis_url,
decode_responses=True,
max_connections=30,
)
return _redis_pool
async def get_redis() -> aioredis.Redis:
"""Get async Redis/DragonflyDB client backed by a shared connection pool."""
return aioredis.Redis(connection_pool=_get_redis_pool())
@dataclass
class EmailJob:
@@ -31,11 +54,6 @@ class EmailJob:
message_id: str # from email provider, for dedup
async def get_redis() -> aioredis.Redis:
"""Get async Redis/DragonflyDB client."""
return cast(aioredis.Redis, aioredis.from_url(settings.redis_url, decode_responses=True))
async def ensure_consumer_group(client: aioredis.Redis) -> None:
"""Create consumer group if it does not exist."""
try: