[codex] Add run liveness continuations (#4083)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat runs are the control-plane record of each agent execution
window.
> - Long-running local agents can exhaust context or stop while still
holding useful next-step state.
> - Operators need that stop reason, next action, and continuation path
to be durable and visible.
> - This pull request adds run liveness metadata, continuation
summaries, and UI surfaces for issue run ledgers.
> - The benefit is that interrupted or long-running work can resume with
clearer context instead of losing the agent's last useful handoff.

## What Changed

- Added heartbeat-run liveness fields, continuation attempt tracking,
and an idempotent `0058` migration.
- Added server services and tests for run liveness, continuation
summaries, stop metadata, and activity backfill.
- Wired local and HTTP adapters to surface continuation/liveness context
through shared adapter utilities.
- Added shared constants, validators, and heartbeat types for liveness
continuation state.
- Added issue-detail UI surfaces for continuation handoffs and the run
ledger, with component tests.
- Updated agent runtime docs, heartbeat protocol docs, prompt guidance,
onboarding assets, and skills instructions to explain continuation
behavior.
- Addressed Greptile feedback by scoping document evidence by run,
excluding system continuation-summary documents from liveness evidence,
importing shared liveness types, surfacing hidden ledger run counts,
documenting bounded retry behavior, and moving run-ledger liveness
backfill off the request path.
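
The bounded retry behavior mentioned in the last bullet can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the code in this PR: `decideContinuation`, `ContinuationDecision`, and `MAX_CONTINUATION_ATTEMPTS` are hypothetical names, and the bound of one attempt is inferred from the tests, which expect a single continuation. Only the state names `plan_only`, `advanced`, and `failed` and the wake reason `run_liveness_continuation` come from this PR's tests.

```typescript
// Hypothetical sketch: function, type, and constant names are illustrative.
type LivenessState = "advanced" | "plan_only" | "failed";

interface ContinuationDecision {
  enqueue: boolean;
  reason: string | null;
  continuationAttempt: number | null;
}

// Assumed bound: the tests in this PR exercise one continuation attempt.
const MAX_CONTINUATION_ATTEMPTS = 1;

function decideContinuation(
  state: LivenessState,
  attemptsUsed: number,
): ContinuationDecision {
  const skip: ContinuationDecision = {
    enqueue: false,
    reason: null,
    continuationAttempt: null,
  };
  // Only a run that planned without acting is a continuation candidate.
  if (state !== "plan_only") return skip;
  // Stop once the retry budget is spent so work cannot loop forever.
  if (attemptsUsed >= MAX_CONTINUATION_ATTEMPTS) return skip;
  return {
    enqueue: true,
    reason: "run_liveness_continuation",
    continuationAttempt: attemptsUsed + 1,
  };
}
```

Under these assumptions, a `plan_only` run with no prior attempts yields `continuationAttempt: 1`, while an `advanced` run or a run whose attempt was already used yields no wake.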

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/run-liveness.test.ts
server/src/__tests__/activity-service.test.ts
server/src/__tests__/documents-service.test.ts
server/src/__tests__/issue-continuation-summary.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/components/IssueContinuationHandoff.test.tsx
ui/src/components/IssueDocumentsSection.test.tsx`
- `pnpm --filter @paperclipai/db build`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/run-continuations.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a
plan document update"`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity
service|treats a plan document update"`
- Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and
Snyk all passed.
- Confirmed `public-gh/master` is an ancestor of this branch after
fetching `public-gh master`.
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.
- Confirmed migration `0058_wealthy_starbolt.sql` is ordered after
`0057` and uses `IF NOT EXISTS` guards for repeat application.
- Greptile inline review threads are resolved.
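
As an illustration of the `IF NOT EXISTS` guards noted for the migration, the sketch below builds guarded PostgreSQL statements. It is not the contents of `0058_wealthy_starbolt.sql`: the helper names and the table, column, and index names in the usage note are assumptions chosen for the example.

```typescript
// Hypothetical helpers: they only show the shape of an idempotent,
// additive migration statement; real names live in the migration file.
function guardedAddColumn(table: string, column: string, sqlType: string): string {
  // ADD COLUMN IF NOT EXISTS makes a second application a no-op.
  return `ALTER TABLE "${table}" ADD COLUMN IF NOT EXISTS "${column}" ${sqlType};`;
}

function guardedCreateIndex(index: string, table: string, columns: string[]): string {
  const cols = columns.map((name) => `"${name}"`).join(", ");
  // CREATE INDEX IF NOT EXISTS skips creation when the index is present.
  return `CREATE INDEX IF NOT EXISTS "${index}" ON "${table}" (${cols});`;
}
```

For example, `guardedAddColumn("heartbeat_runs", "liveness_state", "text")` (names assumed) produces a statement that can be applied repeatedly without failing on the second run.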

## Risks

- Medium risk: this touches heartbeat execution, liveness recovery,
activity rendering, issue routes, shared contracts, docs, and UI.
- Migration risk is mitigated by additive columns/indexes and idempotent
guards.
- Run-ledger liveness backfill is now asynchronous, so the first ledger
response can briefly show historical runs without liveness data until
the background backfill completes.
- UI screenshot coverage is not included in this packaging pass;
validation is currently through focused component tests.
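
The asynchronous backfill trade-off described above can be sketched as follows. This is an assumption-laden illustration rather than the route code in this PR: `listLedgerRuns`, `LedgerRun`, and the `backfill` callback are hypothetical names.

```typescript
// Hypothetical sketch of moving backfill off the request path.
interface LedgerRun {
  id: string;
  livenessState: string | null;
}

function listLedgerRuns(
  runs: LedgerRun[],
  backfill: (runIds: string[]) => Promise<void>,
  onError: (err: unknown) => void = () => {},
): LedgerRun[] {
  const missing = runs
    .filter((run) => run.livenessState === null)
    .map((run) => run.id);
  if (missing.length > 0) {
    // Deliberately not awaited: the response must not wait on the backfill.
    void backfill(missing).catch(onError);
  }
  // Rows are returned as they are now, so some may still lack liveness.
  return runs;
}
```

The caller gets the current rows immediately; a later request would see the backfilled values once the background work finishes.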

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) and discuss it
in `#dev` before opening the PR. Feature PRs that overlap with planned
core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git,
GitHub connector, GitHub CLI, and Paperclip API access.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshot note: no before/after screenshots were captured in this PR
packaging pass; the UI changes are covered by focused component tests
listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Dotta
2026-04-20 06:01:49 -05:00
committed by GitHub
parent b9a80dcf22
commit 236d11d36f
71 changed files with 18254 additions and 85 deletions
@@ -10,9 +10,12 @@ import {
companySkills,
companies,
createDb,
documentRevisions,
documents,
heartbeatRunEvents,
heartbeatRuns,
issueComments,
issueDocuments,
issues,
} from "@paperclipai/db";
import {
@@ -22,6 +25,17 @@ import {
import { runningProcesses } from "../adapters/index.ts";
const mockTelemetryClient = vi.hoisted(() => ({ track: vi.fn() }));
const mockTrackAgentFirstHeartbeat = vi.hoisted(() => vi.fn());
const mockAdapterExecute = vi.hoisted(() =>
vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
errorMessage: null,
summary: "Recovered stranded heartbeat work.",
provider: "test",
model: "test-model",
})),
);
vi.mock("../telemetry.ts", () => ({
getTelemetryClient: () => mockTelemetryClient,
@@ -43,14 +57,7 @@ vi.mock("../adapters/index.ts", async () => {
...actual,
getServerAdapter: vi.fn(() => ({
supportsLocalAgentJwt: false,
-execute: vi.fn(async () => ({
-exitCode: 0,
-signal: null,
-timedOut: false,
-errorMessage: null,
-provider: "test",
-model: "test-model",
-})),
+execute: mockAdapterExecute,
})),
};
});
@@ -104,6 +111,20 @@ async function waitForRunToSettle(
return heartbeat.getRun(runId);
}
async function waitForValue<T>(
read: () => Promise<T | null | undefined>,
timeoutMs = 3_000,
) {
const deadline = Date.now() + timeoutMs;
let latest: T | null | undefined = null;
while (Date.now() < deadline) {
latest = await read();
if (latest) return latest;
await new Promise((resolve) => setTimeout(resolve, 50));
}
return latest ?? null;
}
async function spawnOrphanedProcessGroup() {
const leader = spawn(
process.execPath,
@@ -157,6 +178,15 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
afterEach(async () => {
vi.clearAllMocks();
mockAdapterExecute.mockImplementation(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
errorMessage: null,
summary: "Recovered stranded heartbeat work.",
provider: "test",
model: "test-model",
}));
runningProcesses.clear();
for (const child of childProcesses) {
child.kill("SIGKILL");
@@ -170,10 +200,26 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
}
}
cleanupPids.clear();
-for (let attempt = 0; attempt < 10; attempt += 1) {
-const runs = await db.select({ status: heartbeatRuns.status }).from(heartbeatRuns);
-if (runs.every((run) => run.status !== "queued" && run.status !== "running")) {
-break;
+let idlePolls = 0;
+for (let attempt = 0; attempt < 100; attempt += 1) {
+const runs = await db
+.select({
+status: heartbeatRuns.status,
+processPid: heartbeatRuns.processPid,
+processGroupId: heartbeatRuns.processGroupId,
+})
+.from(heartbeatRuns);
+const managedExecutionStillActive = runs.some(
+(run) =>
+(run.status === "queued" || run.status === "running") &&
+!run.processPid &&
+!run.processGroupId,
+);
+if (!managedExecutionStillActive) {
+idlePolls += 1;
+if (idlePolls >= 3) break;
+} else {
+idlePolls = 0;
}
await new Promise((resolve) => setTimeout(resolve, 50));
}
@@ -182,6 +228,9 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
await db.delete(agentRuntimeState);
await db.delete(companySkills);
await db.delete(issueComments);
await db.delete(issueDocuments);
await db.delete(documentRevisions);
await db.delete(documents);
await db.delete(issues);
await db.delete(heartbeatRunEvents);
await db.delete(heartbeatRuns);
@@ -439,6 +488,13 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
const retryRun = runs.find((row) => row.id !== runId);
expect(failedRun?.status).toBe("failed");
expect(failedRun?.errorCode).toBe("process_lost");
expect(failedRun?.livenessState).toBe("failed");
expect(failedRun?.livenessReason).toContain("process_lost");
expect(failedRun?.resultJson).toMatchObject({
stopReason: "process_lost",
timeoutConfigured: false,
timeoutFired: false,
});
expect(retryRun?.status).toBe("queued");
expect(retryRun?.retryOfRunId).toBe(runId);
expect(retryRun?.processLossRetryCount).toBe(1);
@@ -553,6 +609,23 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
);
});
it("records manual cancellation stop metadata", async () => {
const { runId } = await seedRunFixture({
agentStatus: "running",
includeIssue: false,
});
const heartbeat = heartbeatService(db);
const cancelled = await heartbeat.cancelRun(runId);
expect(cancelled?.status).toBe("cancelled");
expect(cancelled?.resultJson).toMatchObject({
stopReason: "cancelled",
effectiveTimeoutSec: 0,
timeoutConfigured: false,
timeoutFired: false,
});
});
it("re-enqueues assigned todo work when the last issue run died and no wake remains", async () => {
const { agentId, issueId, runId } = await seedStrandedIssueFixture({
status: "todo",
@@ -629,6 +702,106 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
}
});
it("classifies actionable plan-only recovery and enqueues one liveness continuation", async () => {
mockAdapterExecute.mockResolvedValueOnce({
exitCode: 0,
signal: null,
timedOut: false,
errorMessage: null,
summary: "I will inspect the repo next and then implement the fix.",
provider: "test",
model: "test-model",
});
const { agentId, issueId, runId } = await seedStrandedIssueFixture({
status: "in_progress",
runStatus: "failed",
});
const heartbeat = heartbeatService(db);
await heartbeat.reconcileStrandedAssignedIssues();
const livenessWake = await waitForValue(async () => {
const rows = await db.select().from(agentWakeupRequests).where(eq(agentWakeupRequests.agentId, agentId));
return rows.find((row) => row.reason === "run_liveness_continuation") ?? null;
});
expect(livenessWake).toBeTruthy();
expect(livenessWake?.payload).toMatchObject({
issueId,
livenessState: "plan_only",
continuationAttempt: 1,
});
const sourceRunId = (livenessWake?.payload as Record<string, unknown> | null)?.sourceRunId;
expect(sourceRunId).toBeTruthy();
const sourceRun = await db
.select()
.from(heartbeatRuns)
.where(eq(heartbeatRuns.id, String(sourceRunId)))
.then((rows) => rows[0] ?? null);
expect(sourceRun?.id).not.toBe(runId);
expect(sourceRun?.livenessState).toBe("plan_only");
});
it("treats a plan document update as progress and does not enqueue liveness continuation", async () => {
const { agentId, companyId, issueId, runId } = await seedStrandedIssueFixture({
status: "in_progress",
runStatus: "failed",
});
mockAdapterExecute.mockImplementationOnce(async (ctx: { runId: string }) => {
const documentId = randomUUID();
const revisionId = randomUUID();
await db.insert(documents).values({
id: documentId,
companyId,
title: "Plan",
format: "markdown",
latestBody: "# Plan\n\n- Inspect files\n- Implement fix",
latestRevisionId: revisionId,
latestRevisionNumber: 1,
createdByAgentId: agentId,
updatedByAgentId: agentId,
});
await db.insert(documentRevisions).values({
id: revisionId,
companyId,
documentId,
revisionNumber: 1,
title: "Plan",
format: "markdown",
body: "# Plan\n\n- Inspect files\n- Implement fix",
createdByAgentId: agentId,
createdByRunId: ctx.runId,
});
await db.insert(issueDocuments).values({
companyId,
issueId,
documentId,
key: "plan",
});
return {
exitCode: 0,
signal: null,
timedOut: false,
errorMessage: null,
summary: "Plan:\n- Inspect files\n- Implement fix",
provider: "test",
model: "test-model",
};
});
const heartbeat = heartbeatService(db);
await heartbeat.reconcileStrandedAssignedIssues();
const retryRun = await waitForValue(async () => {
const rows = await db.select().from(heartbeatRuns).where(eq(heartbeatRuns.agentId, agentId));
return rows.find((row) => row.id !== runId && row.livenessState === "advanced") ?? null;
});
expect(retryRun?.livenessState).toBe("advanced");
const wakes = await db.select().from(agentWakeupRequests).where(eq(agentWakeupRequests.agentId, agentId));
expect(wakes.some((row) => row.reason === "run_liveness_continuation")).toBe(false);
});
it("blocks stranded in-progress work after the continuation retry was already used", async () => {
const { issueId } = await seedStrandedIssueFixture({
status: "in_progress",