[codex] Harden recovery issue handling (#4600)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The control plane must recover stranded agent work without creating
new operational loops
> - Stranded recovery issues can themselves fail, and exposing raw retry
errors in comments can leak sensitive adapter details
> - New local companies also should not force a hire-approval gate
unless operators enable that policy
> - This pull request hardens recovery issue handling, redacts retry
failure details in issue copy, preserves `maxConcurrentRuns: 1`, and
flips new-hire approval to an opt-in default
> - The benefit is safer automatic recovery and smoother default company
setup without hidden migration conflicts

## What Changed

- Added migration `0071_default_hire_approval_off` and updated company
schema/import/export/docs so hire approvals default off and serialize
only when enabled.
- Added migration `0072_large_sandman` with a partial unique index
preventing duplicate active stranded recovery issues for the same source
issue.
- Blocked failed `stranded_issue_recovery` issues in place instead of
creating nested recovery issues.
- Redacted latest retry failure details from recovery issue comments
while still linking reviewers to run evidence.
- Allowed `maxConcurrentRuns: 1` to be honored by heartbeat concurrency
normalization.
- Added focused regression coverage for recovery recursion, redaction,
migration ordering, and concurrency behavior.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-portability.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-process-recovery.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
--pool=forks --poolOptions.forks.isolate=true` exits 0, but this host
skipped the embedded Postgres tests with the existing init guard.

## Risks

- Migration risk is low but this PR intentionally owns both new
migrations to avoid separate PR migration-journal conflicts.
- Recovery comments now require operators to inspect linked run evidence
for details instead of reading raw errors inline.
- The hire approval default changes behavior for newly created/imported
companies only; existing persisted company settings are not changed
except by the SQL default for future rows.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
This commit is contained in:
Dotta
2026-04-27 15:02:47 -05:00
committed by GitHub
parent 6ccf80bcf2
commit 7a9b3a6037
18 changed files with 16535 additions and 65 deletions
+1 -1
View File
@@ -47,7 +47,7 @@ You do **not** need to tell the CEO to engage specific agents. After you approve
- **Breaks goals into concrete tasks** with clear descriptions, priorities, and acceptance criteria - **Breaks goals into concrete tasks** with clear descriptions, priorities, and acceptance criteria
- **Assigns tasks to the right agent** based on role and capabilities (e.g., engineering tasks go to the CTO or engineers, marketing tasks go to the CMO) - **Assigns tasks to the right agent** based on role and capabilities (e.g., engineering tasks go to the CTO or engineers, marketing tasks go to the CMO)
- **Creates subtasks** when work needs to be decomposed further - **Creates subtasks** when work needs to be decomposed further
- **Hires new agents** when the team lacks capacity for a goal (subject to your approval) - **Hires new agents** when the team lacks capacity for a goal, with hire approvals available when enabled in company settings
- **Monitors progress** on each heartbeat, checking task status and unblocking reports - **Monitors progress** on each heartbeat, checking task status and unblocking reports
- **Escalates to you** when it encounters something it can't resolve — budget issues, blocked approvals, or strategic ambiguity - **Escalates to you** when it encounters something it can't resolve — budget issues, blocked approvals, or strategic ambiguity
+2 -2
View File
@@ -57,9 +57,9 @@ The CEO is the primary delegator. When you set company goals, the CEO:
1. Creates a strategy and submits it for your approval 1. Creates a strategy and submits it for your approval
2. Breaks approved goals into tasks 2. Breaks approved goals into tasks
3. Assigns tasks to agents based on their role and capabilities 3. Assigns tasks to agents based on their role and capabilities
4. Hires new agents when needed (subject to your approval) 4. Hires new agents when needed, with hire approvals available when you enable them
You don't need to manually assign every task — set the goals and let the CEO organize the work. You approve key decisions (strategy, hiring) and monitor progress. See the [How Delegation Works](/guides/board-operator/delegation) guide for the full lifecycle. You don't need to manually assign every task — set the goals and let the CEO organize the work. You approve key decisions such as strategy, can enable hire approvals when you want a gate, and monitor progress. See the [How Delegation Works](/guides/board-operator/delegation) guide for the full lifecycle.
## Heartbeats ## Heartbeats
@@ -0,0 +1 @@
ALTER TABLE "companies" ALTER COLUMN "require_board_approval_for_new_agents" SET DEFAULT false;
@@ -0,0 +1,6 @@
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_stranded_issue_recovery_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_id")
WHERE "origin_kind" = 'stranded_issue_recovery'
AND "origin_id" IS NOT NULL
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');
File diff suppressed because it is too large Load Diff
@@ -498,6 +498,20 @@
"when": 1776780004000, "when": 1776780004000,
"tag": "0070_active_run_output_watchdog", "tag": "0070_active_run_output_watchdog",
"breakpoints": true "breakpoints": true
},
{
"idx": 71,
"version": "7",
"when": 1777131234000,
"tag": "0071_default_hire_approval_off",
"breakpoints": true
},
{
"idx": 72,
"version": "7",
"when": 1777305216238,
"tag": "0072_large_sandman",
"breakpoints": true
} }
] ]
} }
+1 -1
View File
@@ -15,7 +15,7 @@ export const companies = pgTable(
spentMonthlyCents: integer("spent_monthly_cents").notNull().default(0), spentMonthlyCents: integer("spent_monthly_cents").notNull().default(0),
requireBoardApprovalForNewAgents: boolean("require_board_approval_for_new_agents") requireBoardApprovalForNewAgents: boolean("require_board_approval_for_new_agents")
.notNull() .notNull()
.default(true), .default(false),
feedbackDataSharingEnabled: boolean("feedback_data_sharing_enabled") feedbackDataSharingEnabled: boolean("feedback_data_sharing_enabled")
.notNull() .notNull()
.default(false), .default(false),
+8
View File
@@ -115,5 +115,13 @@ export const issues = pgTable(
and ${table.hiddenAt} is null and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`, and ${table.status} not in ('done', 'cancelled')`,
), ),
activeStrandedIssueRecoveryIdx: uniqueIndex("issues_active_stranded_issue_recovery_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'stranded_issue_recovery'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
}), }),
); );
@@ -19,7 +19,6 @@ const baseAgent = {
adapterType: "process", adapterType: "process",
adapterConfig: {}, adapterConfig: {},
runtimeConfig: {}, runtimeConfig: {},
defaultEnvironmentId: null,
budgetMonthlyCents: 0, budgetMonthlyCents: 0,
spentMonthlyCents: 0, spentMonthlyCents: 0,
pauseReason: null, pauseReason: null,
@@ -352,7 +351,6 @@ describe.sequential("agent permission routes", () => {
mockCompanySkillService.listRuntimeSkillEntries.mockResolvedValue([]); mockCompanySkillService.listRuntimeSkillEntries.mockResolvedValue([]);
mockCompanySkillService.resolveRequestedSkillKeys.mockImplementation(async (_companyId, requested) => requested); mockCompanySkillService.resolveRequestedSkillKeys.mockImplementation(async (_companyId, requested) => requested);
mockBudgetService.upsertPolicy.mockResolvedValue(undefined); mockBudgetService.upsertPolicy.mockResolvedValue(undefined);
mockEnvironmentService.getById.mockResolvedValue(null);
mockAgentInstructionsService.materializeManagedBundle.mockImplementation( mockAgentInstructionsService.materializeManagedBundle.mockImplementation(
async (agent: Record<string, unknown>, files: Record<string, string>) => ({ async (agent: Record<string, unknown>, files: Record<string, string>) => ({
bundle: null, bundle: null,
@@ -375,9 +373,6 @@ describe.sequential("agent permission routes", () => {
mockInstanceSettingsService.getGeneral.mockResolvedValue({ mockInstanceSettingsService.getGeneral.mockResolvedValue({
censorUsernameInLogs: false, censorUsernameInLogs: false,
}); });
mockEnsureOpenCodeModelConfiguredAndAvailable.mockResolvedValue([
{ id: "opencode/gpt-5-nano", label: "opencode/gpt-5-nano" },
]);
mockLogActivity.mockResolvedValue(undefined); mockLogActivity.mockResolvedValue(undefined);
}); });
@@ -139,12 +139,12 @@ describe("company portability", () => {
brandColor: "#5c5fff", brandColor: "#5c5fff",
logoAssetId: null, logoAssetId: null,
logoUrl: null, logoUrl: null,
requireBoardApprovalForNewAgents: true, requireBoardApprovalForNewAgents: false,
}); });
companySvc.create.mockResolvedValue({ companySvc.create.mockResolvedValue({
id: "company-imported", id: "company-imported",
name: "Imported Paperclip", name: "Imported Paperclip",
requireBoardApprovalForNewAgents: true, requireBoardApprovalForNewAgents: false,
}); });
agentSvc.list.mockResolvedValue([ agentSvc.list.mockResolvedValue([
{ {
@@ -461,6 +461,32 @@ describe("company portability", () => {
expect(exported.warnings).toContain("Agent claudecoder PATH override was omitted from export because it is system-dependent."); expect(exported.warnings).toContain("Agent claudecoder PATH override was omitted from export because it is system-dependent.");
}); });
it("exports hire approval policy only when approval is required", async () => {
const portability = companyPortabilityService({} as any);
companySvc.getById.mockResolvedValueOnce({
id: "company-1",
name: "Paperclip",
description: null,
issuePrefix: "PAP",
brandColor: "#5c5fff",
logoAssetId: null,
logoUrl: null,
requireBoardApprovalForNewAgents: true,
});
const exported = await portability.exportBundle("company-1", {
include: {
company: true,
agents: false,
projects: false,
issues: false,
},
});
expect(asTextFile(exported.files[".paperclip.yaml"])).toContain("requireBoardApprovalForNewAgents: true");
});
it("exports default sidebar order into the Paperclip extension and manifest", async () => { it("exports default sidebar order into the Paperclip extension and manifest", async () => {
const portability = companyPortabilityService({} as any); const portability = companyPortabilityService({} as any);
@@ -2554,7 +2580,7 @@ describe("company portability", () => {
status: "idle", status: "idle",
})); }));
expect(companySvc.create).toHaveBeenCalledWith(expect.objectContaining({ expect(companySvc.create).toHaveBeenCalledWith(expect.objectContaining({
requireBoardApprovalForNewAgents: true, requireBoardApprovalForNewAgents: false,
})); }));
}); });
@@ -96,7 +96,16 @@ describeEmbeddedPostgres("heartbeat dependency-aware queued run selection", () =
}, 20_000); }, 20_000);
afterEach(async () => { afterEach(async () => {
vi.clearAllMocks(); mockAdapterExecute.mockReset();
mockAdapterExecute.mockImplementation(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
errorMessage: null,
summary: "Dependency-aware heartbeat test run.",
provider: "test",
model: "test-model",
}));
runningProcesses.clear(); runningProcesses.clear();
let idlePolls = 0; let idlePolls = 0;
for (let attempt = 0; attempt < 100; attempt += 1) { for (let attempt = 0; attempt < 100; attempt += 1) {
@@ -347,6 +356,126 @@ describeEmbeddedPostgres("heartbeat dependency-aware queued run selection", () =
expect(blockedWakeRequestCount).toBeGreaterThanOrEqual(2); expect(blockedWakeRequestCount).toBeGreaterThanOrEqual(2);
}); });
it("honors maxConcurrentRuns 1 by leaving a second assignment wake queued", async () => {
const companyId = randomUUID();
const agentId = randomUUID();
const firstIssueId = randomUUID();
const secondIssueId = randomUUID();
let finishFirstRun!: () => void;
const firstRunFinished = new Promise<void>((resolve) => {
finishFirstRun = resolve;
});
mockAdapterExecute.mockImplementationOnce(async () => {
await firstRunFinished;
return {
exitCode: 0,
signal: null,
timedOut: false,
errorMessage: null,
summary: "First assignment run completed.",
provider: "test",
model: "test-model",
};
});
await db.insert(companies).values({
id: companyId,
name: "Paperclip",
issuePrefix: `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`,
requireBoardApprovalForNewAgents: false,
});
await db.insert(agents).values({
id: agentId,
companyId,
name: "CodexCoder",
role: "engineer",
status: "active",
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: {
heartbeat: {
wakeOnDemand: true,
maxConcurrentRuns: 1,
},
},
permissions: {},
});
await db.insert(issues).values([
{
id: firstIssueId,
companyId,
title: "First assignment",
status: "todo",
priority: "high",
assigneeAgentId: agentId,
},
{
id: secondIssueId,
companyId,
title: "Second assignment",
status: "todo",
priority: "high",
assigneeAgentId: agentId,
},
]);
try {
const firstWake = await heartbeat.wakeup(agentId, {
source: "assignment",
triggerDetail: "system",
reason: "issue_assigned",
payload: { issueId: firstIssueId },
contextSnapshot: { issueId: firstIssueId, wakeReason: "issue_assigned" },
});
expect(firstWake).not.toBeNull();
const firstRunStarted = await waitForCondition(async () => {
const run = await db
.select({ status: heartbeatRuns.status })
.from(heartbeatRuns)
.where(eq(heartbeatRuns.id, firstWake!.id))
.then((rows) => rows[0] ?? null);
return run?.status === "running";
});
expect(firstRunStarted).toBe(true);
const firstAdapterStarted = await waitForCondition(async () => mockAdapterExecute.mock.calls.length === 1);
expect(firstAdapterStarted).toBe(true);
const secondWake = await heartbeat.wakeup(agentId, {
source: "assignment",
triggerDetail: "system",
reason: "issue_assigned",
payload: { issueId: secondIssueId },
contextSnapshot: { issueId: secondIssueId, wakeReason: "issue_assigned" },
});
expect(secondWake).not.toBeNull();
const secondRunWhileFirstRunning = await db
.select({ status: heartbeatRuns.status })
.from(heartbeatRuns)
.where(eq(heartbeatRuns.id, secondWake!.id))
.then((rows) => rows[0] ?? null);
expect(secondRunWhileFirstRunning?.status).toBe("queued");
expect(mockAdapterExecute).toHaveBeenCalledTimes(1);
finishFirstRun();
const secondRunSucceeded = await waitForCondition(async () => {
const run = await db
.select({ status: heartbeatRuns.status })
.from(heartbeatRuns)
.where(eq(heartbeatRuns.id, secondWake!.id))
.then((rows) => rows[0] ?? null);
return run?.status === "succeeded";
});
expect(secondRunSucceeded).toBe(true);
expect(mockAdapterExecute).toHaveBeenCalledTimes(2);
} finally {
finishFirstRun();
}
});
it("cancels stale queued runs when issue blockers are still unresolved", async () => { it("cancels stale queued runs when issue blockers are still unresolved", async () => {
const companyId = randomUUID(); const companyId = randomUUID();
const agentId = randomUUID(); const agentId = randomUUID();
@@ -468,6 +468,8 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
retryReason?: "assignment_recovery" | "issue_continuation_needed" | null; retryReason?: "assignment_recovery" | "issue_continuation_needed" | null;
assignToUser?: boolean; assignToUser?: boolean;
activePauseHold?: boolean; activePauseHold?: boolean;
runErrorCode?: string | null;
runError?: string | null;
}) { }) {
const companyId = randomUUID(); const companyId = randomUUID();
const agentId = randomUUID(); const agentId = randomUUID();
@@ -509,7 +511,9 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
runId, runId,
claimedAt: now, claimedAt: now,
finishedAt: new Date("2026-03-19T00:05:00.000Z"), finishedAt: new Date("2026-03-19T00:05:00.000Z"),
error: input.runStatus === "succeeded" ? null : "run failed before issue advanced", error: input.runStatus === "succeeded"
? null
: ("runError" in input ? input.runError : "run failed before issue advanced"),
}); });
await db.insert(heartbeatRuns).values({ await db.insert(heartbeatRuns).values({
@@ -531,8 +535,12 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
startedAt: now, startedAt: now,
finishedAt: new Date("2026-03-19T00:05:00.000Z"), finishedAt: new Date("2026-03-19T00:05:00.000Z"),
updatedAt: new Date("2026-03-19T00:05:00.000Z"), updatedAt: new Date("2026-03-19T00:05:00.000Z"),
errorCode: input.runStatus === "succeeded" ? null : "process_lost", errorCode: input.runStatus === "succeeded"
error: input.runStatus === "succeeded" ? null : "run failed before issue advanced", ? null
: ("runErrorCode" in input ? input.runErrorCode : "process_lost"),
error: input.runStatus === "succeeded"
? null
: ("runError" in input ? input.runError : "run failed before issue advanced"),
}); });
await db.insert(issues).values([ await db.insert(issues).values([
@@ -659,6 +667,20 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
return recovery; return recovery;
} }
async function sourceBlockerIssueIds(companyId: string, sourceIssueId: string) {
return db
.select({ blockerIssueId: issueRelations.issueId })
.from(issueRelations)
.where(
and(
eq(issueRelations.companyId, companyId),
eq(issueRelations.relatedIssueId, sourceIssueId),
eq(issueRelations.type, "blocks"),
),
)
.then((rows) => rows.map((row) => row.blockerIssueId));
}
async function seedQueuedIssueRunFixture() { async function seedQueuedIssueRunFixture() {
const companyId = randomUUID(); const companyId = randomUUID();
const agentId = randomUUID(); const agentId = randomUUID();
@@ -930,6 +952,81 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
expect(comments[0]?.body).toContain(`Recovery issue: [${recovery.identifier}]`); expect(comments[0]?.body).toContain(`Recovery issue: [${recovery.identifier}]`);
}); });
it("blocks failed recovery work in place during immediate terminal-run cleanup", async () => {
const sourceIssueId = randomUUID();
const { companyId, agentId, runId, issueId } = await seedRunFixture({
agentStatus: "idle",
processPid: 999_999_999,
processLossRetryCount: 1,
runErrorCode: "process_lost",
runError: "Authorization: Bearer sk-test-recovery-secret",
});
await db
.update(issues)
.set({
title: "Recover stalled issue PAP-1",
originKind: "stranded_issue_recovery",
originId: sourceIssueId,
})
.where(eq(issues.id, issueId));
const issuePrefix = `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`;
await db.insert(issues).values({
id: sourceIssueId,
companyId,
title: "Original stranded source",
status: "blocked",
priority: "medium",
issueNumber: 2,
identifier: `${issuePrefix}-2`,
});
await db.insert(issueRelations).values({
companyId,
issueId,
relatedIssueId: sourceIssueId,
type: "blocks",
});
const heartbeat = heartbeatService(db);
const result = await heartbeat.reapOrphanedRuns();
expect(result.reaped).toBe(1);
expect(result.runIds).toEqual([runId]);
const runs = await db
.select()
.from(heartbeatRuns)
.where(eq(heartbeatRuns.agentId, agentId));
expect(runs).toHaveLength(1);
expect(runs[0]?.status).toBe("failed");
const recoveryIssue = await waitForValue(async () =>
db.select().from(issues).where(eq(issues.id, issueId)).then((rows) => {
const issue = rows[0] ?? null;
return issue?.status === "blocked" ? issue : null;
})
);
expect(recoveryIssue?.assigneeAgentId).toBe(agentId);
expect(recoveryIssue?.originKind).toBe("stranded_issue_recovery");
expect(recoveryIssue?.originId).toBe(sourceIssueId);
expect(recoveryIssue?.executionRunId).toBeNull();
const nestedRecoveries = await db
.select()
.from(issues)
.where(and(eq(issues.companyId, companyId), eq(issues.originKind, "stranded_issue_recovery"), eq(issues.originId, issueId)));
expect(nestedRecoveries).toHaveLength(0);
const comments = await waitForValue(async () => {
const rows = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId));
return rows.length > 0 ? rows : null;
});
expect(comments).toHaveLength(1);
expect(comments[0]?.body).toContain("stopped automatic stranded-work recovery");
expect(comments[0]?.body).toContain("recovery issues do not create nested `stranded_issue_recovery` issues");
expect(comments[0]?.body).toContain("Latest retry failure details were withheld from the issue thread");
expect(comments[0]?.body).not.toContain("sk-test-recovery-secret");
await expect(sourceBlockerIssueIds(companyId, sourceIssueId)).resolves.toEqual([issueId]);
});
it("does not block paused-tree work when immediate continuation recovery is suppressed by the hold", async () => { it("does not block paused-tree work when immediate continuation recovery is suppressed by the hold", async () => {
const { companyId, agentId, runId, issueId } = await seedRunFixture({ const { companyId, agentId, runId, issueId } = await seedRunFixture({
agentStatus: "idle", agentStatus: "idle",
@@ -1108,6 +1205,8 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
status: "todo", status: "todo",
runStatus: "failed", runStatus: "failed",
retryReason: "assignment_recovery", retryReason: "assignment_recovery",
runErrorCode: "process_lost",
runError: "Authorization: Bearer sk-test-recovery-secret",
}); });
const heartbeat = heartbeatService(db); const heartbeat = heartbeatService(db);
@@ -1127,11 +1226,12 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
previousStatus: "todo", previousStatus: "todo",
retryReason: "assignment_recovery", retryReason: "assignment_recovery",
}); });
expect(recovery.description ?? "").not.toContain("sk-test-recovery-secret");
const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId)); const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId));
expect(comments).toHaveLength(1); expect(comments).toHaveLength(1);
expect(comments[0]?.body).toContain("retried dispatch"); expect(comments[0]?.body).toContain("retried dispatch");
expect(comments[0]?.body).toContain("Latest retry failure: `process_lost` - run failed before issue advanced."); expect(comments[0]?.body).toContain("Latest retry failure details were withheld from the issue thread");
expect(comments[0]?.body).toContain(`Recovery issue: [${recovery.identifier}]`); expect(comments[0]?.body).toContain(`Recovery issue: [${recovery.identifier}]`);
}); });
@@ -1446,10 +1546,217 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId)); const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId));
expect(comments).toHaveLength(1); expect(comments).toHaveLength(1);
expect(comments[0]?.body).toContain("retried continuation"); expect(comments[0]?.body).toContain("retried continuation");
expect(comments[0]?.body).toContain("Latest retry failure: `process_lost` - run failed before issue advanced."); expect(comments[0]?.body).toContain("Latest retry failure details were withheld from the issue thread");
expect(comments[0]?.body).toContain(`Recovery issue: [${recovery.identifier}]`); expect(comments[0]?.body).toContain(`Recovery issue: [${recovery.identifier}]`);
}); });
it("redacts error-code-only stranded recovery failures in issue copy", async () => {
const { companyId, agentId, issueId, runId } = await seedStrandedIssueFixture({
status: "in_progress",
runStatus: "failed",
retryReason: "issue_continuation_needed",
runErrorCode: "adapter_exit_code",
runError: null,
});
const heartbeat = heartbeatService(db);
const result = await heartbeat.reconcileStrandedAssignedIssues();
expect(result.escalated).toBe(1);
const recovery = await expectStrandedRecoveryArtifacts({
companyId,
agentId,
issueId,
runId,
previousStatus: "in_progress",
retryReason: "issue_continuation_needed",
});
expect(recovery.description).toContain("Latest retry failure details were withheld from the issue thread");
expect(recovery.description).not.toContain("- Failure: none recorded");
const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId));
expect(comments).toHaveLength(1);
expect(comments[0]?.body).toContain("Latest retry failure details were withheld from the issue thread");
expect(comments[0]?.body).not.toContain("- Failure: none recorded");
});
it("reuses the raced stranded recovery issue when duplicate active recovery creation conflicts", async () => {
const { companyId, issueId } = await seedStrandedIssueFixture({
status: "in_progress",
runStatus: "failed",
retryReason: "issue_continuation_needed",
});
const heartbeat = heartbeatService(db);
const results = await Promise.allSettled(
Array.from({ length: 8 }, () => heartbeat.reconcileStrandedAssignedIssues()),
);
expect(results.every((result) => result.status === "fulfilled")).toBe(true);
const recoveries = await db
.select()
.from(issues)
.where(and(
eq(issues.companyId, companyId),
eq(issues.originKind, "stranded_issue_recovery"),
eq(issues.originId, issueId),
));
expect(recoveries).toHaveLength(1);
await expect(sourceBlockerIssueIds(companyId, issueId)).resolves.toEqual([recoveries[0]?.id]);
});
it("blocks stranded recovery issues in place instead of creating nested recovery issues", async () => {
const sourceIssueId = randomUUID();
const { companyId, agentId, issueId, runId } = await seedStrandedIssueFixture({
status: "in_progress",
runStatus: "failed",
});
await db
.update(issues)
.set({
title: "Recover stalled issue PAP-1",
originKind: "stranded_issue_recovery",
originId: sourceIssueId,
})
.where(eq(issues.id, issueId));
const issuePrefix = `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`;
await db.insert(issues).values({
id: sourceIssueId,
companyId,
title: "Original stranded source",
status: "blocked",
priority: "medium",
issueNumber: 2,
identifier: `${issuePrefix}-2`,
});
await db.insert(issueRelations).values({
companyId,
issueId,
relatedIssueId: sourceIssueId,
type: "blocks",
});
const heartbeat = heartbeatService(db);
const result = await heartbeat.reconcileStrandedAssignedIssues();
expect(result.dispatchRequeued).toBe(0);
expect(result.continuationRequeued).toBe(0);
expect(result.escalated).toBe(1);
expect(result.issueIds).toEqual([issueId]);
const recoveryIssue = await db.select().from(issues).where(eq(issues.id, issueId)).then((rows) => rows[0] ?? null);
expect(recoveryIssue?.status).toBe("blocked");
expect(recoveryIssue?.assigneeAgentId).toBe(agentId);
expect(recoveryIssue?.originKind).toBe("stranded_issue_recovery");
expect(recoveryIssue?.originId).toBe(sourceIssueId);
const nestedRecoveries = await db
.select()
.from(issues)
.where(and(eq(issues.companyId, companyId), eq(issues.originKind, "stranded_issue_recovery"), eq(issues.originId, issueId)));
expect(nestedRecoveries).toHaveLength(0);
const runs = await db.select().from(heartbeatRuns).where(eq(heartbeatRuns.agentId, agentId));
expect(runs).toHaveLength(1);
expect(runs[0]?.id).toBe(runId);
const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId));
expect(comments).toHaveLength(1);
expect(comments[0]?.body).toContain("stopped automatic stranded-work recovery");
expect(comments[0]?.body).toContain("Latest retry failure details were withheld from the issue thread");
expect(comments[0]?.body).toContain("recovery issues do not create nested `stranded_issue_recovery` issues");
await expect(sourceBlockerIssueIds(companyId, sourceIssueId)).resolves.toEqual([issueId]);
});
it("keeps repeated recovery failures on the same canonical recovery issue", async () => {
const sourceIssueId = randomUUID();
const { companyId, agentId, issueId, runId } = await seedStrandedIssueFixture({
status: "in_progress",
runStatus: "failed",
});
const issuePrefix = `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`;
await db.insert(issues).values({
id: sourceIssueId,
companyId,
title: "Original stranded source",
status: "blocked",
priority: "medium",
issueNumber: 2,
identifier: `${issuePrefix}-2`,
});
await db
.update(issues)
.set({
title: "Recover stalled issue PAP-1",
originKind: "stranded_issue_recovery",
originId: sourceIssueId,
})
.where(eq(issues.id, issueId));
await db.insert(issueRelations).values({
companyId,
issueId,
relatedIssueId: sourceIssueId,
type: "blocks",
});
const heartbeat = heartbeatService(db);
const firstResult = await heartbeat.reconcileStrandedAssignedIssues();
expect(firstResult.escalated).toBe(1);
expect(firstResult.issueIds).toEqual([issueId]);
const secondRunId = randomUUID();
await db.insert(heartbeatRuns).values({
id: secondRunId,
companyId,
agentId,
invocationSource: "assignment",
triggerDetail: "system",
status: "failed",
contextSnapshot: {
issueId,
taskId: issueId,
wakeReason: "issue_assigned",
source: "stranded_issue_recovery",
},
startedAt: new Date("2030-03-19T00:10:00.000Z"),
finishedAt: new Date("2030-03-19T00:15:00.000Z"),
createdAt: new Date("2030-03-19T00:10:00.000Z"),
updatedAt: new Date("2030-03-19T00:15:00.000Z"),
errorCode: "adapter_failed",
error: "adapter failed while retrying recovery issue",
});
await db
.update(issues)
.set({
status: "in_progress",
checkoutRunId: secondRunId,
executionRunId: null,
})
.where(eq(issues.id, issueId));
const secondResult = await heartbeat.reconcileStrandedAssignedIssues();
expect(secondResult.dispatchRequeued).toBe(0);
expect(secondResult.continuationRequeued).toBe(0);
expect(secondResult.escalated).toBe(1);
expect(secondResult.issueIds).toEqual([issueId]);
const recoveryIssuesForSource = await db
.select()
.from(issues)
.where(and(eq(issues.companyId, companyId), eq(issues.originKind, "stranded_issue_recovery"), eq(issues.originId, sourceIssueId)));
expect(recoveryIssuesForSource.map((issue) => issue.id)).toEqual([issueId]);
const nestedRecoveries = await db
.select()
.from(issues)
.where(and(eq(issues.companyId, companyId), eq(issues.originKind, "stranded_issue_recovery"), eq(issues.originId, issueId)));
expect(nestedRecoveries).toHaveLength(0);
await expect(sourceBlockerIssueIds(companyId, sourceIssueId)).resolves.toEqual([issueId]);
const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, issueId));
expect(comments).toHaveLength(2);
expect(comments[1]?.body).toContain("Latest retry failure details were withheld from the issue thread");
});
it("does not escalate paused-tree recovery when the automatic continuation retry was cancelled by the hold", async () => { it("does not escalate paused-tree recovery when the automatic continuation retry was cancelled by the hold", async () => {
const { companyId, agentId, issueId } = await seedStrandedIssueFixture({ const { companyId, agentId, issueId } = await seedStrandedIssueFixture({
status: "in_progress", status: "in_progress",
@@ -10,6 +10,7 @@ import {
buildRunLivenessContinuationIdempotencyKey, buildRunLivenessContinuationIdempotencyKey,
classifyIssueGraphLiveness, classifyIssueGraphLiveness,
decideRunLivenessContinuation, decideRunLivenessContinuation,
isStrandedIssueRecoveryOriginKind,
parseIssueGraphLivenessIncidentKey, parseIssueGraphLivenessIncidentKey,
} from "../services/recovery/index.ts"; } from "../services/recovery/index.ts";
@@ -143,4 +144,11 @@ describe("recovery classifier boundary", () => {
nextAttempt: 1, nextAttempt: 1,
})).toBe("run_liveness_continuation:issue-1:run-1:plan_only:1"); })).toBe("run_liveness_continuation:issue-1:run-1:plan_only:1");
}); });
it("classifies stranded recovery origins as recovery-owned work", () => {
expect(isStrandedIssueRecoveryOriginKind("stranded_issue_recovery")).toBe(true);
expect(isStrandedIssueRecoveryOriginKind("harness_liveness_escalation")).toBe(false);
expect(isStrandedIssueRecoveryOriginKind("manual")).toBe(false);
expect(isStrandedIssueRecoveryOriginKind(null)).toBe(false);
});
}); });
+4 -4
View File
@@ -2264,7 +2264,7 @@ function buildEnvInputMap(inputs: CompanyPortabilityEnvInput[]) {
} }
function readCompanyApprovalDefault(_frontmatter: Record<string, unknown>) { function readCompanyApprovalDefault(_frontmatter: Record<string, unknown>) {
return true; return false;
} }
function readIncludeEntries(frontmatter: Record<string, unknown>): CompanyPackageIncludeEntry[] { function readIncludeEntries(frontmatter: Record<string, unknown>): CompanyPackageIncludeEntry[] {
@@ -3465,7 +3465,7 @@ export function companyPortabilityService(db: Db, storage?: StorageService) {
company: stripEmptyValues({ company: stripEmptyValues({
brandColor: company.brandColor ?? null, brandColor: company.brandColor ?? null,
logoPath: companyLogoPath, logoPath: companyLogoPath,
requireBoardApprovalForNewAgents: company.requireBoardApprovalForNewAgents ? undefined : false, requireBoardApprovalForNewAgents: company.requireBoardApprovalForNewAgents ? true : undefined,
feedbackDataSharingEnabled: company.feedbackDataSharingEnabled ? true : undefined, feedbackDataSharingEnabled: company.feedbackDataSharingEnabled ? true : undefined,
feedbackDataSharingConsentAt: company.feedbackDataSharingConsentAt?.toISOString() ?? null, feedbackDataSharingConsentAt: company.feedbackDataSharingConsentAt?.toISOString() ?? null,
feedbackDataSharingConsentByUserId: company.feedbackDataSharingConsentByUserId ?? null, feedbackDataSharingConsentByUserId: company.feedbackDataSharingConsentByUserId ?? null,
@@ -3986,8 +3986,8 @@ export function companyPortabilityService(db: Db, storage?: StorageService) {
description: include.company ? (sourceManifest.company?.description ?? null) : null, description: include.company ? (sourceManifest.company?.description ?? null) : null,
brandColor: include.company ? (sourceManifest.company?.brandColor ?? null) : null, brandColor: include.company ? (sourceManifest.company?.brandColor ?? null) : null,
requireBoardApprovalForNewAgents: include.company requireBoardApprovalForNewAgents: include.company
? (sourceManifest.company?.requireBoardApprovalForNewAgents ?? true) ? (sourceManifest.company?.requireBoardApprovalForNewAgents ?? false)
: true, : false,
feedbackDataSharingEnabled: include.company feedbackDataSharingEnabled: include.company
? (sourceManifest.company?.feedbackDataSharingEnabled ?? false) ? (sourceManifest.company?.feedbackDataSharingEnabled ?? false)
: false, : false,
+4 -1
View File
@@ -101,6 +101,7 @@ import {
} from "./execution-workspace-policy.js"; } from "./execution-workspace-policy.js";
import { instanceSettingsService } from "./instance-settings.js"; import { instanceSettingsService } from "./instance-settings.js";
import { import {
RECOVERY_ORIGIN_KINDS,
RUN_LIVENESS_CONTINUATION_REASON, RUN_LIVENESS_CONTINUATION_REASON,
buildRunLivenessContinuationIdempotencyKey, buildRunLivenessContinuationIdempotencyKey,
decideRunLivenessContinuation, decideRunLivenessContinuation,
@@ -133,6 +134,7 @@ const MAX_RUN_EVENT_PAYLOAD_ARRAY_ITEMS = 50;
const MAX_RUN_EVENT_PAYLOAD_OBJECT_KEYS = 100; const MAX_RUN_EVENT_PAYLOAD_OBJECT_KEYS = 100;
const MAX_RUN_EVENT_PAYLOAD_DEPTH = 6; const MAX_RUN_EVENT_PAYLOAD_DEPTH = 6;
const HEARTBEAT_MAX_CONCURRENT_RUNS_DEFAULT = AGENT_DEFAULT_MAX_CONCURRENT_RUNS; const HEARTBEAT_MAX_CONCURRENT_RUNS_DEFAULT = AGENT_DEFAULT_MAX_CONCURRENT_RUNS;
const HEARTBEAT_MAX_CONCURRENT_RUNS_MIN = 1;
const HEARTBEAT_MAX_CONCURRENT_RUNS_MAX = 10; const HEARTBEAT_MAX_CONCURRENT_RUNS_MAX = 10;
const LIVENESS_BOOKKEEPING_ACTIVITY_ACTIONS = [ const LIVENESS_BOOKKEEPING_ACTIVITY_ACTIONS = [
"environment.lease_acquired", "environment.lease_acquired",
@@ -848,7 +850,7 @@ export function compactRunLogChunk(chunk: string, maxChars = MAX_PERSISTED_LOG_C
function normalizeMaxConcurrentRuns(value: unknown) { function normalizeMaxConcurrentRuns(value: unknown) {
const parsed = Math.floor(asNumber(value, HEARTBEAT_MAX_CONCURRENT_RUNS_DEFAULT)); const parsed = Math.floor(asNumber(value, HEARTBEAT_MAX_CONCURRENT_RUNS_DEFAULT));
if (!Number.isFinite(parsed)) return HEARTBEAT_MAX_CONCURRENT_RUNS_DEFAULT; if (!Number.isFinite(parsed)) return HEARTBEAT_MAX_CONCURRENT_RUNS_DEFAULT;
return Math.max(HEARTBEAT_MAX_CONCURRENT_RUNS_DEFAULT, Math.min(HEARTBEAT_MAX_CONCURRENT_RUNS_MAX, parsed)); return Math.max(HEARTBEAT_MAX_CONCURRENT_RUNS_MIN, Math.min(HEARTBEAT_MAX_CONCURRENT_RUNS_MAX, parsed));
} }
interface WakeupOptions { interface WakeupOptions {
@@ -6193,6 +6195,7 @@ export function heartbeatService(db: Db, options: HeartbeatServiceOptions = {})
} }
const shouldBlockImmediately = const shouldBlockImmediately =
issue.originKind === RECOVERY_ORIGIN_KINDS.strandedIssueRecovery ||
!recoveryAgentInvokable || !recoveryAgentInvokable ||
!recoveryAgent || !recoveryAgent ||
didAutomaticRecoveryFail(run, issue.status === "todo" ? "assignment_recovery" : "issue_continuation_needed"); didAutomaticRecoveryFail(run, issue.status === "todo" ? "assignment_recovery" : "issue_continuation_needed");
+1
View File
@@ -4,6 +4,7 @@ export {
RECOVERY_REASON_KINDS, RECOVERY_REASON_KINDS,
buildIssueGraphLivenessIncidentKey, buildIssueGraphLivenessIncidentKey,
buildIssueGraphLivenessLeafKey, buildIssueGraphLivenessLeafKey,
isStrandedIssueRecoveryOriginKind,
parseIssueGraphLivenessIncidentKey, parseIssueGraphLivenessIncidentKey,
} from "./origins.js"; } from "./origins.js";
export type { export type {
+4
View File
@@ -17,6 +17,10 @@ export type RecoveryOriginKind = typeof RECOVERY_ORIGIN_KINDS[keyof typeof RECOV
export type RecoveryReasonKind = typeof RECOVERY_REASON_KINDS[keyof typeof RECOVERY_REASON_KINDS]; export type RecoveryReasonKind = typeof RECOVERY_REASON_KINDS[keyof typeof RECOVERY_REASON_KINDS];
export type RecoveryKeyPrefix = typeof RECOVERY_KEY_PREFIXES[keyof typeof RECOVERY_KEY_PREFIXES]; export type RecoveryKeyPrefix = typeof RECOVERY_KEY_PREFIXES[keyof typeof RECOVERY_KEY_PREFIXES];
export function isStrandedIssueRecoveryOriginKind(originKind: string | null | undefined) {
return originKind === RECOVERY_ORIGIN_KINDS.strandedIssueRecovery;
}
export function buildIssueGraphLivenessIncidentKey(input: { export function buildIssueGraphLivenessIncidentKey(input: {
companyId: string; companyId: string;
issueId: string; issueId: string;
+158 -42
View File
@@ -35,6 +35,7 @@ import { getRunLogStore } from "../run-log-store.js";
import { import {
RECOVERY_ORIGIN_KINDS, RECOVERY_ORIGIN_KINDS,
buildIssueGraphLivenessLeafKey, buildIssueGraphLivenessLeafKey,
isStrandedIssueRecoveryOriginKind,
parseIssueGraphLivenessIncidentKey, parseIssueGraphLivenessIncidentKey,
} from "./origins.js"; } from "./origins.js";
import { import {
@@ -101,22 +102,9 @@ function readNonEmptyString(value: unknown): string | null {
function summarizeRunFailureForIssueComment(run: LatestIssueRun) { function summarizeRunFailureForIssueComment(run: LatestIssueRun) {
if (!run) return null; if (!run) return null;
const errorCode = readNonEmptyString(run.errorCode)?.trim() ?? null; if (readNonEmptyString(run.error) || readNonEmptyString(run.errorCode)) {
const rawError = readNonEmptyString(run.error)?.trim() ?? null; return " Latest retry failure details were withheld from the issue thread; inspect the linked run for evidence.";
const apiMessageMatch = rawError?.match(/"message"\s*:\s*"([^"]+)"/); }
const firstLine = rawError
?.split(/\r?\n/)
.map((line) => line.trim())
.find(Boolean) ?? null;
const summarySource = apiMessageMatch?.[1] ?? firstLine;
const summary =
summarySource && summarySource.length > 240
? `${summarySource.slice(0, 237)}...`
: summarySource;
if (errorCode && summary) return ` Latest retry failure: \`${errorCode}\` - ${summary}.`;
if (errorCode) return ` Latest retry failure: \`${errorCode}\`.`;
if (summary) return ` Latest retry failure: ${summary}.`;
return null; return null;
} }
@@ -187,6 +175,19 @@ function isAgentInvokable(agent: typeof agents.$inferSelect | null | undefined)
return Boolean(agent && !["paused", "terminated", "pending_approval"].includes(agent.status)); return Boolean(agent && !["paused", "terminated", "pending_approval"].includes(agent.status));
} }
function isStrandedIssueRecoveryIssue(issue: Pick<typeof issues.$inferSelect, "originKind">) {
return isStrandedIssueRecoveryOriginKind(issue.originKind);
}
function isUnsuccessfulTerminalIssueRun(latestRun: LatestIssueRun) {
return Boolean(
latestRun &&
UNSUCCESSFUL_HEARTBEAT_RUN_TERMINAL_STATUSES.includes(
latestRun.status as (typeof UNSUCCESSFUL_HEARTBEAT_RUN_TERMINAL_STATUSES)[number],
),
);
}
function parseLivenessIncidentKey(incidentKey: string | null | undefined) { function parseLivenessIncidentKey(incidentKey: string | null | undefined) {
if (!incidentKey) return null; if (!incidentKey) return null;
return parseIssueGraphLivenessIncidentKey(incidentKey); return parseIssueGraphLivenessIncidentKey(incidentKey);
@@ -813,6 +814,16 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
); );
} }
function isUniqueStrandedIssueRecoveryConflict(error: unknown) {
if (!error || typeof error !== "object") return false;
const maybe = error as { code?: string; constraint?: string; message?: string };
return maybe.code === "23505" &&
(
maybe.constraint === "issues_active_stranded_issue_recovery_uq" ||
typeof maybe.message === "string" && maybe.message.includes("issues_active_stranded_issue_recovery_uq")
);
}
async function ensureSourceIssueBlockedByStaleEvaluation(input: { async function ensureSourceIssueBlockedByStaleEvaluation(input: {
sourceIssue: typeof issues.$inferSelect | null; sourceIssue: typeof issues.$inferSelect | null;
evaluationIssue: { id: string; identifier: string | null }; evaluationIssue: { id: string; identifier: string | null };
@@ -1257,6 +1268,8 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
latestRun: LatestIssueRun; latestRun: LatestIssueRun;
previousStatus: "todo" | "in_progress"; previousStatus: "todo" | "in_progress";
}) { }) {
if (isStrandedIssueRecoveryIssue(input.issue)) return null;
const existing = await findOpenStrandedIssueRecoveryIssue(input.issue.companyId, input.issue.id); const existing = await findOpenStrandedIssueRecoveryIssue(input.issue.companyId, input.issue.id);
if (existing) return existing; if (existing) return existing;
@@ -1264,32 +1277,40 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
if (!ownerAgentId) return null; if (!ownerAgentId) return null;
const prefix = await getCompanyIssuePrefix(input.issue.companyId); const prefix = await getCompanyIssuePrefix(input.issue.companyId);
const recovery = await issuesSvc.create(input.issue.companyId, { let recovery: Awaited<ReturnType<typeof issuesSvc.create>>;
title: `Recover stalled issue ${input.issue.identifier ?? input.issue.title}`, try {
description: buildStrandedIssueRecoveryDescription({ recovery = await issuesSvc.create(input.issue.companyId, {
issue: input.issue, title: `Recover stalled issue ${input.issue.identifier ?? input.issue.title}`,
latestRun: input.latestRun, description: buildStrandedIssueRecoveryDescription({
previousStatus: input.previousStatus, issue: input.issue,
prefix, latestRun: input.latestRun,
}), previousStatus: input.previousStatus,
status: "todo", prefix,
priority: input.issue.priority, }),
parentId: input.issue.id, status: "todo",
projectId: input.issue.projectId, priority: input.issue.priority,
goalId: input.issue.goalId, parentId: input.issue.id,
assigneeAgentId: ownerAgentId, projectId: input.issue.projectId,
originKind: STRANDED_ISSUE_RECOVERY_ORIGIN_KIND, goalId: input.issue.goalId,
originId: input.issue.id, assigneeAgentId: ownerAgentId,
originRunId: input.latestRun?.id ?? null, originKind: STRANDED_ISSUE_RECOVERY_ORIGIN_KIND,
originFingerprint: [ originId: input.issue.id,
STRANDED_ISSUE_RECOVERY_ORIGIN_KIND, originRunId: input.latestRun?.id ?? null,
input.issue.companyId, originFingerprint: [
input.issue.id, STRANDED_ISSUE_RECOVERY_ORIGIN_KIND,
input.latestRun?.id ?? "no-run", input.issue.companyId,
].join(":"), input.issue.id,
billingCode: input.issue.billingCode, input.latestRun?.id ?? "no-run",
inheritExecutionWorkspaceFromIssueId: input.issue.id, ].join(":"),
}); billingCode: input.issue.billingCode,
inheritExecutionWorkspaceFromIssueId: input.issue.id,
});
} catch (error) {
if (!isUniqueStrandedIssueRecoveryConflict(error)) throw error;
const raced = await findOpenStrandedIssueRecoveryIssue(input.issue.companyId, input.issue.id);
if (!raced) throw error;
return raced;
}
await deps.enqueueWakeup(ownerAgentId, { await deps.enqueueWakeup(ownerAgentId, {
source: "assignment", source: "assignment",
@@ -1315,6 +1336,78 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
return recovery; return recovery;
} }
function buildRecoveryIssueInPlaceEscalationComment(input: {
issue: typeof issues.$inferSelect;
previousStatus: "todo" | "in_progress";
latestRun: LatestIssueRun;
prefix: string;
}) {
const runLink = input.latestRun
? runUiLink({ id: input.latestRun.id, agentId: input.latestRun.agentId }, input.prefix)
: "none";
const retryReason = readNonEmptyString(parseObject(input.latestRun?.contextSnapshot)?.retryReason) ?? "none";
const failureSummary = summarizeRunFailureForIssueComment(input.latestRun);
return [
"Paperclip stopped automatic stranded-work recovery for this recovery issue.",
"",
`- Recovery issue: ${issueUiLink({ identifier: input.issue.identifier, id: input.issue.id }, input.prefix)}`,
`- Previous status: \`${input.previousStatus}\``,
`- Latest run: ${runLink}`,
`- Latest run status: \`${input.latestRun?.status ?? "unknown"}\``,
`- Retry reason: \`${retryReason}\``,
failureSummary ? `- Failure: ${failureSummary.trim()}` : "- Failure: none recorded",
"- Guard: recovery issues do not create nested `stranded_issue_recovery` issues.",
"",
"Next action: the current recovery owner should inspect the failed run evidence, restore a live execution path or record the manual resolution, then move this recovery issue out of `blocked`.",
].join("\n");
}
async function escalateStrandedRecoveryIssueInPlace(input: {
issue: typeof issues.$inferSelect;
previousStatus: "todo" | "in_progress";
latestRun: LatestIssueRun;
}) {
const updated = await issuesSvc.update(input.issue.id, { status: "blocked" });
if (!updated) return null;
const prefix = await getCompanyIssuePrefix(input.issue.companyId);
await issuesSvc.addComment(
input.issue.id,
buildRecoveryIssueInPlaceEscalationComment({
issue: input.issue,
previousStatus: input.previousStatus,
latestRun: input.latestRun,
prefix,
}),
{},
);
await logActivity(db, {
companyId: input.issue.companyId,
actorType: "system",
actorId: "system",
agentId: null,
runId: null,
action: "issue.updated",
entityType: "issue",
entityId: input.issue.id,
details: {
identifier: input.issue.identifier,
status: "blocked",
previousStatus: input.previousStatus,
source: "recovery.reconcile_stranded_recovery_issue",
latestRunId: input.latestRun?.id ?? null,
latestRunStatus: input.latestRun?.status ?? null,
latestRunErrorCode: input.latestRun?.errorCode ?? null,
originKind: input.issue.originKind,
originId: input.issue.originId,
},
});
return updated;
}
async function existingBlockerIssueIds(companyId: string, issueId: string) { async function existingBlockerIssueIds(companyId: string, issueId: string) {
return db return db
.select({ blockerIssueId: issueRelations.issueId }) .select({ blockerIssueId: issueRelations.issueId })
@@ -1357,6 +1450,14 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
latestRun: LatestIssueRun; latestRun: LatestIssueRun;
comment: string; comment: string;
}) { }) {
if (isStrandedIssueRecoveryIssue(input.issue)) {
return escalateStrandedRecoveryIssueInPlace({
issue: input.issue,
previousStatus: input.previousStatus,
latestRun: input.latestRun,
});
}
const recoveryIssue = await ensureStrandedIssueRecoveryIssue({ const recoveryIssue = await ensureStrandedIssueRecoveryIssue({
issue: input.issue, issue: input.issue,
previousStatus: input.previousStatus, previousStatus: input.previousStatus,
@@ -1457,6 +1558,21 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
} }
const latestRun = await getLatestIssueRun(issue.companyId, issue.id); const latestRun = await getLatestIssueRun(issue.companyId, issue.id);
if (isStrandedIssueRecoveryIssue(issue) && isUnsuccessfulTerminalIssueRun(latestRun)) {
const updated = await escalateStrandedRecoveryIssueInPlace({
issue,
previousStatus: issue.status as "todo" | "in_progress",
latestRun,
});
if (updated) {
result.escalated += 1;
result.issueIds.push(issue.id);
} else {
result.skipped += 1;
}
continue;
}
if (issue.status === "todo") { if (issue.status === "todo") {
if (!latestRun || latestRun.status === "succeeded") { if (!latestRun || latestRun.status === "succeeded") {
result.skipped += 1; result.skipped += 1;