Files
paperclip/packages/db/src/schema/issues.ts
T
Dotta 57229d0f24 [codex] Add issue monitor liveness controls (#4988)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies where work
must stay observable, governable, and recoverable.
> - The task/heartbeat subsystem owns agent execution continuity, issue
state transitions, and visible recovery behavior.
> - Waiting on an external service is not the same as being blocked when
the assignee still owns a future check.
> - The gap was that agents had no first-class one-shot monitor state
for external-service waits, so recovery could look stalled or require ad
hoc comments.
> - This pull request adds bounded issue monitors that can wake the
owner, clear exhausted waits, and produce explicit recovery behavior.
> - It also surfaces monitor status in the board UI and documents when
to use monitors versus `blocked`.
> - The benefit is clearer liveness semantics for asynchronous waits
without weakening single-assignee task ownership.

## What Changed

- Added issue monitor fields, shared types, validators, constants, and
an idempotent `0075` migration for scheduled monitor state.
- Added server-side monitor scheduling, dispatch, recovery bounds,
activity logging, and external-ref redaction.
- Added board/agent route coverage for monitor permissions and child
monitor scheduling.
- Added issue detail/property UI for monitor state, a monitor activity
card, and Storybook stories for review surfaces.
- Documented monitor semantics and recovery policy behavior in
`doc/execution-semantics.md`.
- Addressed Greptile review feedback by preserving monitor state in
skipped-stage builders and making board monitor saves send `scheduledBy:
"board"`.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issue-execution-policy-routes.test.ts
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/lib/activity-format.test.ts`
- First run passed 5 files and failed to collect 2 server suites because
the worktree was missing the optional `acpx/runtime` dependency.
- After `pnpm install --frozen-lockfile`, reran the 2 failed suites
successfully.
- `pnpm exec vitest run
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck
&& pnpm --filter @paperclipai/ui typecheck`
- `pnpm exec vitest run
server/src/__tests__/issue-execution-policy.test.ts
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/server typecheck && pnpm --filter
@paperclipai/ui typecheck`
- `pnpm exec vitest run
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- Storybook screenshot captured from
`http://127.0.0.1:6006/iframe.html?viewMode=story&id=product-issue-monitor-surfaces--monitor-surfaces`
with Playwright.

## Screenshots

![Issue monitor Storybook
surfaces](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2945-when-a-task-is-waiting-for-an-_external-service_-what-state-should-it-be-in-and-what-recovery-method-could-it-h/docs/pr-screenshots/pap-2945/monitor-surfaces.png)

## Risks

- Medium: this changes heartbeat recovery behavior for scheduled
external-service waits, so regressions could affect wake timing or
recovery issue creation.
- Migration risk is reduced by using `IF NOT EXISTS` for the new issue
monitor columns and index.
- External monitor references are treated as secret-adjacent and are
intentionally omitted from visible activity/wake payloads.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and terminal
execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or Storybook review surfaces
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 08:58:53 -05:00

143 lines
7.3 KiB
TypeScript

import { sql } from "drizzle-orm";
import {
type AnyPgColumn,
pgTable,
uuid,
text,
timestamp,
integer,
jsonb,
index,
uniqueIndex,
} from "drizzle-orm/pg-core";
import { agents } from "./agents.js";
import { projects } from "./projects.js";
import { goals } from "./goals.js";
import { companies } from "./companies.js";
import { heartbeatRuns } from "./heartbeat_runs.js";
import { projectWorkspaces } from "./project_workspaces.js";
import { executionWorkspaces } from "./execution_workspaces.js";
export const issues = pgTable(
"issues",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id),
projectId: uuid("project_id").references(() => projects.id),
projectWorkspaceId: uuid("project_workspace_id").references(() => projectWorkspaces.id, { onDelete: "set null" }),
goalId: uuid("goal_id").references(() => goals.id),
parentId: uuid("parent_id").references((): AnyPgColumn => issues.id),
title: text("title").notNull(),
description: text("description"),
status: text("status").notNull().default("backlog"),
priority: text("priority").notNull().default("medium"),
assigneeAgentId: uuid("assignee_agent_id").references(() => agents.id),
assigneeUserId: text("assignee_user_id"),
checkoutRunId: uuid("checkout_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
executionRunId: uuid("execution_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
executionAgentNameKey: text("execution_agent_name_key"),
executionLockedAt: timestamp("execution_locked_at", { withTimezone: true }),
createdByAgentId: uuid("created_by_agent_id").references(() => agents.id),
createdByUserId: text("created_by_user_id"),
issueNumber: integer("issue_number"),
identifier: text("identifier"),
originKind: text("origin_kind").notNull().default("manual"),
originId: text("origin_id"),
originRunId: text("origin_run_id"),
originFingerprint: text("origin_fingerprint").notNull().default("default"),
requestDepth: integer("request_depth").notNull().default(0),
billingCode: text("billing_code"),
assigneeAdapterOverrides: jsonb("assignee_adapter_overrides").$type<Record<string, unknown>>(),
executionPolicy: jsonb("execution_policy").$type<Record<string, unknown>>(),
executionState: jsonb("execution_state").$type<Record<string, unknown>>(),
monitorNextCheckAt: timestamp("monitor_next_check_at", { withTimezone: true }),
monitorWakeRequestedAt: timestamp("monitor_wake_requested_at", { withTimezone: true }),
monitorLastTriggeredAt: timestamp("monitor_last_triggered_at", { withTimezone: true }),
monitorAttemptCount: integer("monitor_attempt_count").notNull().default(0),
monitorNotes: text("monitor_notes"),
monitorScheduledBy: text("monitor_scheduled_by"),
executionWorkspaceId: uuid("execution_workspace_id")
.references((): AnyPgColumn => executionWorkspaces.id, { onDelete: "set null" }),
executionWorkspacePreference: text("execution_workspace_preference"),
executionWorkspaceSettings: jsonb("execution_workspace_settings").$type<Record<string, unknown>>(),
startedAt: timestamp("started_at", { withTimezone: true }),
completedAt: timestamp("completed_at", { withTimezone: true }),
cancelledAt: timestamp("cancelled_at", { withTimezone: true }),
hiddenAt: timestamp("hidden_at", { withTimezone: true }),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
companyStatusIdx: index("issues_company_status_idx").on(table.companyId, table.status),
assigneeStatusIdx: index("issues_company_assignee_status_idx").on(
table.companyId,
table.assigneeAgentId,
table.status,
),
assigneeUserStatusIdx: index("issues_company_assignee_user_status_idx").on(
table.companyId,
table.assigneeUserId,
table.status,
),
parentIdx: index("issues_company_parent_idx").on(table.companyId, table.parentId),
projectIdx: index("issues_company_project_idx").on(table.companyId, table.projectId),
originIdx: index("issues_company_origin_idx").on(table.companyId, table.originKind, table.originId),
projectWorkspaceIdx: index("issues_company_project_workspace_idx").on(table.companyId, table.projectWorkspaceId),
executionWorkspaceIdx: index("issues_company_execution_workspace_idx").on(table.companyId, table.executionWorkspaceId),
dueMonitorIdx: index("issues_company_monitor_due_idx").on(table.companyId, table.monitorNextCheckAt),
identifierIdx: uniqueIndex("issues_identifier_idx").on(table.identifier),
titleSearchIdx: index("issues_title_search_idx").using("gin", table.title.op("gin_trgm_ops")),
identifierSearchIdx: index("issues_identifier_search_idx").using("gin", table.identifier.op("gin_trgm_ops")),
descriptionSearchIdx: index("issues_description_search_idx").using("gin", table.description.op("gin_trgm_ops")),
openRoutineExecutionIdx: uniqueIndex("issues_open_routine_execution_uq")
.on(table.companyId, table.originKind, table.originId, table.originFingerprint)
.where(
sql`${table.originKind} = 'routine_execution'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.executionRunId} is not null
and ${table.status} in ('backlog', 'todo', 'in_progress', 'in_review', 'blocked')`,
),
activeLivenessRecoveryIncidentIdx: uniqueIndex("issues_active_liveness_recovery_incident_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'harness_liveness_escalation'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeLivenessRecoveryLeafIdx: uniqueIndex("issues_active_liveness_recovery_leaf_uq")
.on(table.companyId, table.originKind, table.originFingerprint)
.where(
sql`${table.originKind} = 'harness_liveness_escalation'
and ${table.originFingerprint} <> 'default'
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeStaleRunEvaluationIdx: uniqueIndex("issues_active_stale_run_evaluation_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'stale_active_run_evaluation'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeProductivityReviewIdx: uniqueIndex("issues_active_productivity_review_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'issue_productivity_review'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeStrandedIssueRecoveryIdx: uniqueIndex("issues_active_stranded_issue_recovery_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'stranded_issue_recovery'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
}),
);