fix(api): add timeouts for OIDC discovery fetch and DB connection #66

Merged
Scrubs McBarkley merged 1 commit from fix/gro-1678-econnreset-robustness into dev 2026-05-24 20:11:44 +00:00
Owner

Summary

  • OIDC discovery fetch in initAuth() now has a 5s AbortSignal.timeout so it fails fast instead of hanging indefinitely when the auth server is unreachable. This was identified as a root cause of the startup ECONNRESET crashes on UAT, where ztunnel drops TCP connections before headers arrive.
  • DB postgres client now sets connect_timeout: 5 (seconds) so failed connection attempts fail fast rather than hanging the startup sequence.
  • Graceful shutdown timeout tightened to 8s (from 10s) to avoid being killed by the Kubernetes liveness-probe deadline while draining.
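
For reviewers, a minimal sketch of what the discovery timeout could look like. Only the `AbortSignal.timeout(5000)` signal on the fetch is taken from this PR; the names `fetchDiscovery`, `discoveryUrl`, `DISCOVERY_TIMEOUT_MS`, and the injectable `fetchImpl` parameter are assumptions for illustration, and the real code in src/lib/auth.ts may be structured differently.

```typescript
// Hypothetical constant; the PR only states a 5s timeout.
const DISCOVERY_TIMEOUT_MS = 5000;

// Build the standard OIDC discovery URL for an issuer,
// tolerating a trailing slash on the issuer.
function discoveryUrl(issuer: string): string {
  return `${issuer.replace(/\/+$/, "")}/.well-known/openid-configuration`;
}

// Fetch the discovery document, aborting if no response arrives in time.
// fetchImpl is injectable (an assumption) so the call can be exercised
// without a live auth server.
async function fetchDiscovery(
  issuer: string,
  timeoutMs: number = DISCOVERY_TIMEOUT_MS,
  fetchImpl: typeof fetch = fetch,
): Promise<Record<string, unknown>> {
  const res = await fetchImpl(discoveryUrl(issuer), {
    // Aborts the request once timeoutMs elapses, so the promise rejects
    // instead of staying pending forever.
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!res.ok) {
    throw new Error(`OIDC discovery failed: HTTP ${res.status}`);
  }
  return (await res.json()) as Record<string, unknown>;
}
```

With this shape, an unreachable auth server surfaces as a rejected promise after roughly `timeoutMs`, which the caller can log and turn into a clean startup failure.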

Changes

  • packages/db/src/index.ts: postgres() → postgres(..., { connect_timeout: 5 })
  • src/lib/auth.ts: OIDC discovery fetch gets signal: AbortSignal.timeout(5000)
  • src/index.ts: shutdown timeout 10s → 8s
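
The shutdown change can be sketched roughly as below. This is an illustration under assumptions, not the code in src/index.ts: `closeWithDeadline`, the `Closable` interface, and `SHUTDOWN_TIMEOUT_MS` are hypothetical names, and only the 8s value comes from this PR. A Node `http.Server` satisfies `Closable` structurally.

```typescript
// Hypothetical constant; the PR tightens the deadline from 10s to 8s.
const SHUTDOWN_TIMEOUT_MS = 8000;

// Minimal shape of something that can drain connections and close.
interface Closable {
  close(callback: (err?: Error) => void): unknown;
  closeAllConnections?(): void; // present on http.Server in Node 18.2+
}

// Ask the server to stop accepting work and wait for in-flight requests,
// but give up once the deadline passes so the process can exit on its own
// terms rather than being killed mid-drain.
function closeWithDeadline(
  server: Closable,
  timeoutMs: number = SHUTDOWN_TIMEOUT_MS,
): Promise<"drained" | "timed_out"> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      // Deadline hit: drop any connections that are still open.
      server.closeAllConnections?.();
      resolve("timed_out");
    }, timeoutMs);

    server.close(() => {
      clearTimeout(timer);
      resolve("drained");
    });
  });
}

// Possible wiring (assumption), e.g. in the entry point:
//   process.once("SIGTERM", async () => {
//     const outcome = await closeWithDeadline(server);
//     process.exit(outcome === "drained" ? 0 : 1);
//   });
```

The deadline needs to stay below whatever grace period the cluster allows before it force-kills the pod; 8s is the value this PR chose.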

Test plan

  • [x] All 524 unit tests pass
  • [ ] Deploy to dev, verify /api/health still returns 200
  • [ ] Verify startup no longer hangs on OIDC discovery failure

Fixes GRO-1678.

cc @cpfarhood

Scrubs McBarkley added 1 commit 2026-05-24 19:46:36 +00:00
fix(api): add timeouts for OIDC discovery fetch and DB connection
CI / Test (pull_request) Successful in 16s
CI / Lint & Typecheck (pull_request) Successful in 19s
CI / Build & Push Docker Images (pull_request) Successful in 52s
dc3c23055a
- OIDC discovery fetch in initAuth() now has a 5s AbortSignal.timeout
  to fail fast instead of hanging indefinitely when the auth server is unreachable.
  This was identified as a root cause of startup ECONNRESET crashes on UAT
  where ztunnel drops TCP connections before headers arrive.

- DB postgres client now sets connect_timeout: 5 so failed connection attempts
  fail fast rather than hanging the startup sequence.

- Graceful shutdown timeout tightened to 8s (from 10s) to avoid
  being killed by the Kubernetes liveness-probe deadline while draining.

Fixes GRO-1678.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Scrubs McBarkley merged commit b486c44a82 into dev 2026-05-24 20:11:44 +00:00