[codex] Add run liveness continuations (#4083)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat runs are the control-plane record of each agent execution window.
> - Long-running local agents can exhaust context or stop while still holding useful next-step state.
> - Operators need that stop reason, next action, and continuation path to be durable and visible.
> - This pull request adds run liveness metadata, continuation summaries, and UI surfaces for issue run ledgers.
> - The benefit is that interrupted or long-running work can resume with clearer context instead of losing the agent's last useful handoff.

## What Changed

- Added heartbeat-run liveness fields, continuation attempt tracking, and an idempotent `0058` migration.
- Added server services and tests for run liveness, continuation summaries, stop metadata, and activity backfill.
- Wired local and HTTP adapters to surface continuation/liveness context through shared adapter utilities.
- Added shared constants, validators, and heartbeat types for liveness continuation state.
- Added issue-detail UI surfaces for continuation handoffs and the run ledger, with component tests.
- Updated agent runtime docs, heartbeat protocol docs, prompt guidance, onboarding assets, and skills instructions to explain continuation behavior.
- Addressed Greptile feedback by scoping document evidence by run, excluding system continuation-summary documents from liveness evidence, importing shared liveness types, surfacing hidden ledger run counts, documenting bounded retry behavior, and moving run-ledger liveness backfill off the request path.

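As a rough illustration of the evidence-scoping rules in the last bullet, the sketch below classifies a run from a list of evidence rows. Everything in it is hypothetical: `classifyLiveness`, `concreteEvidenceForRun`, `EvidenceRow`, and the placeholder value of `CONTINUATION_SUMMARY_KEY` are illustrative stand-ins, and the sketch collapses every no-evidence case into `plan_only`. Only the state names `completed`, `advanced`, and `plan_only` are taken from the tests in this diff; the actual service logic is not shown here and may differ.

```typescript
// Hypothetical sketch; not the server's actual implementation.
type LivenessState = "completed" | "advanced" | "plan_only";

interface EvidenceRow {
  kind: "comment" | "document_revision";
  createdByRunId: string;
  // Key of the issue document a revision belongs to (undefined for comments).
  documentKey?: string;
}

// Assumed placeholder value; the real constant is exported by @paperclipai/shared.
const CONTINUATION_SUMMARY_KEY = "continuation-summary";

function concreteEvidenceForRun(runId: string, rows: EvidenceRow[]): EvidenceRow[] {
  return rows.filter(
    (row) =>
      // Scope evidence to the run being classified.
      row.createdByRunId === runId &&
      // System continuation summaries do not count as concrete progress.
      row.documentKey !== CONTINUATION_SUMMARY_KEY,
  );
}

function classifyLiveness(
  runId: string,
  issueStatus: string,
  rows: EvidenceRow[],
): LivenessState {
  if (issueStatus === "done") return "completed";
  // Simplification: any run without concrete evidence is treated as plan_only.
  return concreteEvidenceForRun(runId, rows).length > 0 ? "advanced" : "plan_only";
}
```

With these rules, a plan revision written by a different run, or a continuation-summary revision written by the same run, leaves the run at `plan_only`.
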
## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/run-continuations.test.ts server/src/__tests__/run-liveness.test.ts server/src/__tests__/activity-service.test.ts server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-continuation-summary.test.ts server/src/services/heartbeat-stop-metadata.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/components/IssueDocumentsSection.test.tsx`
- `pnpm --filter @paperclipai/db build`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/run-continuations.test.ts ui/src/components/IssueRunLedger.test.tsx`
- `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a plan document update"`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity service|treats a plan document update"`
- Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and Snyk all passed.
- Confirmed `public-gh/master` is an ancestor of this branch after fetching `public-gh master`.
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.
- Confirmed migration `0058_wealthy_starbolt.sql` is ordered after `0057` and uses `IF NOT EXISTS` guards for repeat application.
- Greptile inline review threads are resolved.

## Risks

- Medium risk: this touches heartbeat execution, liveness recovery, activity rendering, issue routes, shared contracts, docs, and UI.
- Migration risk is mitigated by additive columns/indexes and idempotent guards.
- Run-ledger liveness backfill is now asynchronous, so the first ledger response can briefly show historical missing liveness until the background backfill completes.
- UI screenshot coverage is not included in this packaging pass; validation is currently through focused component tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git, GitHub connector, GitHub CLI, and Paperclip API access.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

Screenshot note: no before/after screenshots were captured in this PR packaging pass; the UI changes are covered by focused component tests listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:01:49 -05:00
parent b9a80dcf22
commit 236d11d36f
71 changed files with 18254 additions and 85 deletions
@@ -1,6 +1,17 @@
 import { randomUUID } from "node:crypto";
 import { afterAll, afterEach, beforeAll, describe, expect, it } from "vitest";
-import { agents, companies, createDb, heartbeatRuns } from "@paperclipai/db";
+import {
+  agents,
+  companies,
+  createDb,
+  documentRevisions,
+  documents,
+  heartbeatRuns,
+  issueComments,
+  issueDocuments,
+  issues,
+} from "@paperclipai/db";
+import { ISSUE_CONTINUATION_SUMMARY_DOCUMENT_KEY } from "@paperclipai/shared";
 import {
  getEmbeddedPostgresTestSupport,
  startEmbeddedPostgresTestDatabase,
@@ -9,6 +20,8 @@ import { activityService } from "../services/activity.ts";

 const embeddedPostgresSupport = await getEmbeddedPostgresTestSupport();
 const describeEmbeddedPostgres = embeddedPostgresSupport.supported ? describe : describe.skip;
+type ActivityService = ReturnType<typeof activityService>;
+type IssueRun = Awaited<ReturnType<ActivityService["runsForIssue"]>>[number];

 if (!embeddedPostgresSupport.supported) {
  console.warn(
@@ -16,6 +29,23 @@ if (!embeddedPostgresSupport.supported) {
  );
 }

+async function waitForIssueRun(
+  service: ActivityService,
+  companyId: string,
+  issueId: string,
+  predicate: (run: IssueRun) => boolean,
+) {
+  const deadline = Date.now() + 2_000;
+  let latestRuns: IssueRun[] = [];
+  while (Date.now() < deadline) {
+    latestRuns = await service.runsForIssue(companyId, issueId);
+    const run = latestRuns.find(predicate);
+    if (run) return { run, runs: latestRuns };
+    await new Promise((resolve) => setTimeout(resolve, 25));
+  }
+  throw new Error(`Timed out waiting for issue run. Latest run count: ${latestRuns.length}`);
+}
+
 describeEmbeddedPostgres("activity service", () => {
  let db!: ReturnType<typeof createDb>;
  let tempDb: Awaited<ReturnType<typeof startEmbeddedPostgresTestDatabase>> | null = null;
@@ -26,6 +56,11 @@ describeEmbeddedPostgres("activity service", () => {
  }, 20_000);

  afterEach(async () => {
+    await db.delete(issueComments);
+    await db.delete(issueDocuments);
+    await db.delete(documentRevisions);
+    await db.delete(documents);
+    await db.delete(issues);
    await db.delete(heartbeatRuns);
    await db.delete(agents);
    await db.delete(companies);
@@ -78,9 +113,17 @@ describeEmbeddedPostgres("activity service", () => {
      resultJson: {
        billing_type: "metered",
        total_cost_usd: 0.42,
+        stopReason: "timeout",
+        effectiveTimeoutSec: 30,
+        timeoutFired: true,
        summary: "done",
        nestedHuge: { payload: "y".repeat(256_000) },
      },
+      livenessState: "advanced",
+      livenessReason: "Run produced concrete action evidence: 1 issue comment(s)",
+      continuationAttempt: 2,
+      lastUsefulActionAt: new Date("2026-04-18T19:59:00.000Z"),
+      nextAction: "Review the completed output.",
    });

    const runs = await activityService(db).runsForIssue(companyId, issueId);
@@ -111,6 +154,337 @@ describeEmbeddedPostgres("activity service", () => {
      costUsd: 0.42,
      cost_usd: 0.42,
      total_cost_usd: 0.42,
+      stopReason: "timeout",
+      effectiveTimeoutSec: 30,
+      timeoutFired: true,
+    });
+    expect(runs[0]).toMatchObject({
+      livenessState: "advanced",
+      livenessReason: "Run produced concrete action evidence: 1 issue comment(s)",
+      continuationAttempt: 2,
+      lastUsefulActionAt: new Date("2026-04-18T19:59:00.000Z"),
+      nextAction: "Review the completed output.",
+    });
+  });
+
+  it("backfills missing liveness for completed issue runs before returning the ledger", async () => {
+    const companyId = randomUUID();
+    const agentId = randomUUID();
+    const issueId = randomUUID();
+    const runId = randomUUID();
+    const completedAt = new Date("2026-04-18T20:04:00.000Z");
+
+    await db.insert(companies).values({
+      id: companyId,
+      name: "Paperclip",
+      issuePrefix: `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`,
+      requireBoardApprovalForNewAgents: false,
+    });
+
+    await db.insert(agents).values({
+      id: agentId,
+      companyId,
+      name: "CodexCoder",
+      role: "engineer",
+      status: "idle",
+      adapterType: "codex_local",
+      adapterConfig: {},
+      runtimeConfig: {},
+      permissions: {},
+    });
+
+    await db.insert(issues).values({
+      id: issueId,
+      companyId,
+      title: "Fix run ledger",
+      description: "Make the run ledger answer whether a run advanced.",
+      status: "done",
+      priority: "medium",
+      assigneeAgentId: agentId,
+      completedAt,
+    });
+
+    await db.insert(heartbeatRuns).values({
+      id: runId,
+      companyId,
+      agentId,
+      invocationSource: "assignment",
+      status: "succeeded",
+      startedAt: new Date("2026-04-18T20:00:00.000Z"),
+      finishedAt: completedAt,
+      contextSnapshot: { issueId },
+      resultJson: {
+        summary: "Finished the implementation.",
+      },
+      livenessState: null,
+      livenessReason: null,
+      lastUsefulActionAt: null,
+      nextAction: null,
+    });
+
+    await db.insert(issueComments).values({
+      companyId,
+      issueId,
+      authorAgentId: agentId,
+      createdByRunId: runId,
+      body: "Done",
+      createdAt: completedAt,
+    });
+
+    const service = activityService(db);
+    const { run, runs } = await waitForIssueRun(
+      service,
+      companyId,
+      issueId,
+      (entry) => entry.runId === runId && entry.livenessState === "completed",
+    );
+
+    expect(runs).toHaveLength(1);
+    expect(run).toMatchObject({
+      runId,
+      livenessState: "completed",
+      livenessReason: "Issue is done",
+      continuationAttempt: 0,
+      lastUsefulActionAt: completedAt,
+    });
+
+    const [persisted] = await db.select().from(heartbeatRuns);
+    expect(persisted).toMatchObject({
+      id: runId,
+      livenessState: "completed",
+      livenessReason: "Issue is done",
+      continuationAttempt: 0,
+      lastUsefulActionAt: completedAt,
+    });
+  });
+
+  it("does not backfill document evidence from a different run", async () => {
+    const companyId = randomUUID();
+    const agentId = randomUUID();
+    const issueId = randomUUID();
+    const runId = randomUUID();
+    const otherRunId = randomUUID();
+    const documentId = randomUUID();
+    const revisionId = randomUUID();
+    const createdAt = new Date("2026-04-18T20:08:00.000Z");
+
+    await db.insert(companies).values({
+      id: companyId,
+      name: "Paperclip",
+      issuePrefix: `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`,
+      requireBoardApprovalForNewAgents: false,
+    });
+
+    await db.insert(agents).values({
+      id: agentId,
+      companyId,
+      name: "CodexCoder",
+      role: "engineer",
+      status: "idle",
+      adapterType: "codex_local",
+      adapterConfig: {},
+      runtimeConfig: {},
+      permissions: {},
+    });
+
+    await db.insert(issues).values({
+      id: issueId,
+      companyId,
+      title: "Fix run ledger",
+      description: "Make the run ledger answer whether a run advanced.",
+      status: "in_progress",
+      priority: "medium",
+      assigneeAgentId: agentId,
+    });
+
+    await db.insert(heartbeatRuns).values([
+      {
+        id: runId,
+        companyId,
+        agentId,
+        invocationSource: "assignment",
+        status: "succeeded",
+        startedAt: new Date("2026-04-18T20:00:00.000Z"),
+        finishedAt: new Date("2026-04-18T20:02:00.000Z"),
+        contextSnapshot: { issueId },
+        resultJson: {
+          summary: "Next steps:\n- inspect files",
+        },
+        livenessState: null,
+        livenessReason: null,
+      },
+      {
+        id: otherRunId,
+        companyId,
+        agentId,
+        invocationSource: "assignment",
+        status: "succeeded",
+        startedAt: new Date("2026-04-18T20:05:00.000Z"),
+        finishedAt: createdAt,
+        contextSnapshot: { issueId },
+        resultJson: {
+          summary: "Updated the plan document.",
+        },
+        livenessState: "advanced",
+        livenessReason: "Run produced concrete action evidence: 1 document revision(s)",
+      },
+    ]);
+
+    await db.insert(documents).values({
+      id: documentId,
+      companyId,
+      title: "Plan",
+      format: "markdown",
+      latestBody: "# Plan\n\n- Inspect files",
+      latestRevisionId: revisionId,
+      latestRevisionNumber: 1,
+      createdByAgentId: agentId,
+      updatedByAgentId: agentId,
+      createdAt,
+      updatedAt: createdAt,
+    });
+
+    await db.insert(documentRevisions).values({
+      id: revisionId,
+      companyId,
+      documentId,
+      revisionNumber: 1,
+      title: "Plan",
+      format: "markdown",
+      body: "# Plan\n\n- Inspect files",
+      createdByAgentId: agentId,
+      createdByRunId: otherRunId,
+      createdAt,
+    });
+
+    await db.insert(issueDocuments).values({
+      companyId,
+      issueId,
+      documentId,
+      key: "plan",
+      createdAt,
+      updatedAt: createdAt,
+    });
+
+    const service = activityService(db);
+    const { run: backfilledRun } = await waitForIssueRun(
+      service,
+      companyId,
+      issueId,
+      (entry) => entry.runId === runId && entry.livenessState === "plan_only",
+    );
+
+    expect(backfilledRun).toMatchObject({
+      runId,
+      livenessState: "plan_only",
+      livenessReason: "Run described future work without concrete action evidence",
+      lastUsefulActionAt: null,
+    });
+  });
+
+  it("does not treat continuation summary revisions as concrete backfill evidence", async () => {
+    const companyId = randomUUID();
+    const agentId = randomUUID();
+    const issueId = randomUUID();
+    const runId = randomUUID();
+    const documentId = randomUUID();
+    const revisionId = randomUUID();
+    const createdAt = new Date("2026-04-18T20:12:00.000Z");
+
+    await db.insert(companies).values({
+      id: companyId,
+      name: "Paperclip",
+      issuePrefix: `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`,
+      requireBoardApprovalForNewAgents: false,
+    });
+
+    await db.insert(agents).values({
+      id: agentId,
+      companyId,
+      name: "CodexCoder",
+      role: "engineer",
+      status: "idle",
+      adapterType: "codex_local",
+      adapterConfig: {},
+      runtimeConfig: {},
+      permissions: {},
+    });
+
+    await db.insert(issues).values({
+      id: issueId,
+      companyId,
+      title: "Fix run ledger",
+      description: "Make the run ledger answer whether a run advanced.",
+      status: "in_progress",
+      priority: "medium",
+      assigneeAgentId: agentId,
+    });
+
+    await db.insert(heartbeatRuns).values({
+      id: runId,
+      companyId,
+      agentId,
+      invocationSource: "assignment",
+      status: "succeeded",
+      startedAt: new Date("2026-04-18T20:10:00.000Z"),
+      finishedAt: createdAt,
+      contextSnapshot: { issueId },
+      resultJson: {
+        summary: "Next steps:\n- inspect files",
+      },
+      livenessState: null,
+      livenessReason: null,
+    });
+
+    await db.insert(documents).values({
+      id: documentId,
+      companyId,
+      title: "Continuation Summary",
+      format: "markdown",
+      latestBody: "# Continuation Summary",
+      latestRevisionId: revisionId,
+      latestRevisionNumber: 1,
+      createdByAgentId: agentId,
+      updatedByAgentId: agentId,
+      createdAt,
+      updatedAt: createdAt,
+    });
+
+    await db.insert(documentRevisions).values({
+      id: revisionId,
+      companyId,
+      documentId,
+      revisionNumber: 1,
+      title: "Continuation Summary",
+      format: "markdown",
+      body: "# Continuation Summary",
+      createdByAgentId: agentId,
+      createdByRunId: runId,
+      createdAt,
+    });
+
+    await db.insert(issueDocuments).values({
+      companyId,
+      issueId,
+      documentId,
+      key: ISSUE_CONTINUATION_SUMMARY_DOCUMENT_KEY,
+      createdAt,
+      updatedAt: createdAt,
+    });
+
+    const service = activityService(db);
+    const { run: backfilledRun } = await waitForIssueRun(
+      service,
+      companyId,
+      issueId,
+      (entry) => entry.runId === runId && entry.livenessState === "plan_only",
+    );
+
+    expect(backfilledRun).toMatchObject({
+      runId,
+      livenessState: "plan_only",
+      livenessReason: "Run described future work without concrete action evidence",
+      lastUsefulActionAt: null,
    });
  });
 });
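
The tests above poll `runsForIssue` through `waitForIssueRun` because, per the Risks section, the liveness backfill no longer runs on the request path. A minimal in-memory sketch of that pattern follows, assuming a ledger that returns its current snapshot immediately and defers the backfill. `createLedger`, `LedgerRun`, `Defer`, and the injectable `defer` parameter are hypothetical names invented for this illustration; the real `activityService` reads from and writes to the database, and its scheduling mechanism is not shown in this diff.

```typescript
// Hypothetical in-memory sketch; names and data shapes are illustrative only.
interface LedgerRun {
  runId: string;
  livenessState: string | null;
}

// Scheduler hook so callers (and tests) control when deferred work runs.
type Defer = (task: () => void) => void;

function createLedger(
  runs: LedgerRun[],
  classify: (run: LedgerRun) => string,
  defer: Defer = (task) => {
    setTimeout(task, 0);
  },
) {
  let backfillPending = false;

  return {
    // Returns the current snapshot right away; missing liveness is filled in later.
    runsForIssue(): LedgerRun[] {
      const snapshot = runs.map((run) => ({ ...run }));
      const needsBackfill = runs.some((run) => run.livenessState === null);
      if (needsBackfill && !backfillPending) {
        backfillPending = true; // avoid stacking duplicate background passes
        defer(() => {
          for (const run of runs) {
            if (run.livenessState === null) run.livenessState = classify(run);
          }
          backfillPending = false;
        });
      }
      return snapshot;
    },
  };
}
```

In this sketch the first call can still report `livenessState: null`, which mirrors the brief gap the Risks section describes; a later call observes the backfilled value once the deferred task has run.
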