Guard cheap recovery model usage (#6371)

## Thinking Path

> - Paperclip is the control plane that coordinates AI-agent work through issues, heartbeats, comments, approvals, and auditable recovery paths.
> - The affected subsystem is heartbeat/recovery orchestration, especially the optional cheap model profile used for operational recovery overhead.
> - Cheap recovery should repair status and liveness, but it must not become the worker lane that writes deliverables, continues source work, or propagates cheap execution hints into downstream retries.
> - The gap was that cheap-profile hints could follow recovery wake contexts and assignment overrides farther than intended, making real work eligible to run on the cheap model.
> - This pull request separates status-only cheap recovery from normal source-work continuations, adds route guards for deliverable mutations during cheap status-only runs, and documents the invariant.
> - The benefit is safer retry/recovery behavior: cheap runs can clean up control-plane state, while any remaining source work resumes through a normal/original model path.

## What Changed

- Added recovery model-profile work classes so status-only recovery carries explicit guard context and normal-model continuations scrub cheap hints.
- Updated heartbeat, productivity review, liveness continuation, and recovery service wakeups to request cheap only for bounded status-only recovery work.
- Blocked cheap status-only recovery runs from writing issue documents, plans, attachments, work products, or assigning downstream work back to `modelProfile: "cheap"`.
- Added/updated server tests for cheap profile propagation, artifact/document guards, route authorization, retry scheduling, and successful-run handoff behavior.
- Documented the recovery model-profile lane in `doc/SPEC-implementation.md` and `doc/execution-semantics.md`.
- After rebasing onto current `public-gh/master`, stabilized the new `InstanceSidebar` plugin-filter tests so the PR check lane stays green.
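
As a rough illustration of the work-class split described above, the following is a minimal sketch and not the code in this PR. The diff only shows that `withRecoveryModelProfileHint` now takes a second argument of `"normal_model"` or `"status_only"`. The helper name `withHintSketch`, the `recoveryWorkClass` field, and the scrub-then-set behavior are assumptions made for illustration.

```typescript
// Hypothetical sketch: the field names and scrub logic are assumptions,
// not the actual implementation of withRecoveryModelProfileHint.
type RecoveryWorkClass = "status_only" | "normal_model";
type Payload = Record<string, unknown>;

function withHintSketch(payload: Payload, workClass: RecoveryWorkClass): Payload {
  // Copy first so the caller's object is never mutated.
  const next: Payload = { ...payload };

  // Always drop inherited hints so a stale cheap hint cannot leak downstream.
  delete next["modelProfile"];
  delete next["recoveryWorkClass"];

  if (workClass === "status_only") {
    // Only bounded status-only recovery asks for the cheap profile,
    // and it carries an explicit marker that guards can check later.
    next["modelProfile"] = "cheap";
    next["recoveryWorkClass"] = "status_only";
  }
  return next;
}

// A retry that inherited a cheap hint is scrubbed when it continues source work.
const continuation = withHintSketch(
  { issueId: "issue-1", modelProfile: "cheap" },
  "normal_model",
);

// A stale-run evaluation wakeup is tagged as cheap, status-only work.
const statusOnly = withHintSketch({ issueId: "issue-2" }, "status_only");
```

Scrubbing before setting means the result depends only on the requested work class, never on whatever hint the incoming payload happened to carry.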

## Verification

- Local: `pnpm exec vitest run --config vitest.config.ts src/services/recovery/model-profile-hint.test.ts src/__tests__/issue-agent-mutation-ownership-routes.test.ts src/__tests__/issue-document-restore-routes.test.ts` from `server/` - 3 files, 37 tests passed after final edits.
- Local: `pnpm exec vitest run --config vitest.config.ts src/__tests__/heartbeat-process-recovery.test.ts` from `server/` - 44 tests passed after rerunning the cleanup-sensitive file alone.
- Local: `pnpm --filter @paperclipai/ui exec vitest run src/components/InstanceSidebar.test.tsx` - 4 tests passed.
- Local: `pnpm --filter @paperclipai/server typecheck` - passed.
- Local: `pnpm --filter @paperclipai/ui typecheck` - passed.
- PR checks on latest head `6f8c3b1380f5bd872c6f49f6f7188ecf3bb6d263` - all green, including `verify`, build, typecheck, server/general/serialized tests, e2e, Snyk, and policy.
- Greptile: pass 3 returned Confidence Score 5/5 with zero unresolved Greptile review threads.

## Risks

- Medium risk: recovery behavior is intentionally stricter, so any path that incorrectly relies on cheap recovery to keep doing source work will now need to hand back to a normal-model run.
- Low migration risk: no schema changes.
- No product UI changes; the UI file touched is a test-only stabilization after rebasing onto current `master`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool use and local code execution enabled; context window not exposed in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots (N/A: no product UI changes)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
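
For reviewers who want a mental model of the route guard listed under What Changed, here is a minimal sketch under stated assumptions. The names `RunContext`, `DELIVERABLE_MUTATIONS`, and `guardDeliverableMutation`, as well as the mutation labels, are hypothetical and are not taken from the PR. The sketch covers only deliverable writes, not the downstream assignment check.

```typescript
// Hypothetical sketch: all names and mutation labels here are illustrative.
type RunContext = { modelProfile?: string; recoveryWorkClass?: string };

type GuardDecision =
  | { allowed: true }
  | { allowed: false; reason: string };

// Mutations treated as deliverable writes in this sketch.
const DELIVERABLE_MUTATIONS = new Set<string>([
  "issue_document",
  "plan",
  "attachment",
  "work_product",
]);

function guardDeliverableMutation(run: RunContext, mutation: string): GuardDecision {
  const cheapStatusOnly =
    run.modelProfile === "cheap" && run.recoveryWorkClass === "status_only";

  // Cheap status-only runs may repair control-plane state but not write deliverables.
  if (cheapStatusOnly && DELIVERABLE_MUTATIONS.has(mutation)) {
    return {
      allowed: false,
      reason: `cheap status-only recovery runs may not write ${mutation}`,
    };
  }
  return { allowed: true };
}

const cheapRun: RunContext = { modelProfile: "cheap", recoveryWorkClass: "status_only" };
const blocked = guardDeliverableMutation(cheapRun, "plan");
const statusUpdate = guardDeliverableMutation(cheapRun, "status_update");
const normalRun = guardDeliverableMutation({}, "plan");
```

In this sketch a normal-model run and a non-deliverable mutation both pass through, so the guard only narrows what the cheap lane can do.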
2026-05-19 13:46:02 -05:00
parent 24748de421
commit bfe6369ef5
17 changed files with 529 additions and 78 deletions
@@ -499,7 +499,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
      payload: withRecoveryModelProfileHint({
        issueId: input.issueId,
        ...(input.retryOfRunId ? { retryOfRunId: input.retryOfRunId } : {}),
-      }),
+      }, "normal_model"),
      requestedByActorType: "system",
      requestedByActorId: null,
      contextSnapshot: withRecoveryModelProfileHint({
@@ -509,7 +509,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        retryReason: input.retryReason,
        source: input.source,
        ...(input.retryOfRunId ? { retryOfRunId: input.retryOfRunId } : {}),
-      }),
+      }, "normal_model"),
    });

    if (queued && input.retryOfRunId) {
@@ -535,7 +535,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
      payload: withRecoveryModelProfileHint({
        issueId: issue.id,
        mutation: "assigned_todo_liveness_dispatch",
-      }),
+      }, "normal_model"),
      requestedByActorType: "system",
      requestedByActorId: null,
      contextSnapshot: withRecoveryModelProfileHint({
@@ -543,7 +543,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        taskId: issue.id,
        wakeReason: "issue_assigned",
        source: "issue.assigned_todo_liveness_dispatch",
-      }),
+      }, "normal_model"),
    });
  }

@@ -650,7 +650,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        payload: withRecoveryModelProfileHint({
          issueId: candidate.id,
          mutation: "unassigned_blocker_recovery",
-        }),
+        }, "normal_model"),
        requestedByActorType: "system",
        requestedByActorId: null,
        contextSnapshot: withRecoveryModelProfileHint({
@@ -658,7 +658,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
          taskId: candidate.id,
          wakeReason: "issue_assigned",
          source: "issue.unassigned_blocker_recovery",
-        }),
+        }, "normal_model"),
      });

      if (queued) {
@@ -1455,7 +1455,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        goalId: sourceIssue?.goalId ?? null,
        billingCode: sourceIssue?.billingCode ?? null,
        assigneeAgentId: ownerAgentId,
-        assigneeAdapterOverrides: recoveryAssigneeAdapterOverrides(),
+        assigneeAdapterOverrides: recoveryAssigneeAdapterOverrides("status_only"),
        originKind: STALE_ACTIVE_RUN_EVALUATION_ORIGIN_KIND,
        originId: input.run.id,
        originRunId: input.run.id,
@@ -1501,7 +1501,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
          issueId: evaluation.id,
          staleRunId: input.run.id,
          sourceIssueId: sourceIssue?.id ?? null,
-        }),
+        }, "status_only"),
        requestedByActorType: "system",
        requestedByActorId: null,
        contextSnapshot: withRecoveryModelProfileHint({
@@ -1511,7 +1511,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
          source: STALE_ACTIVE_RUN_EVALUATION_ORIGIN_KIND,
          staleRunId: input.run.id,
          sourceIssueId: sourceIssue?.id ?? null,
-        }),
+        }, "status_only"),
      });
    }
    return { kind: "created" as const, evaluationIssueId: evaluation.id };
@@ -1890,7 +1890,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        projectId: input.issue.projectId,
        goalId: input.issue.goalId,
        assigneeAgentId: ownerAgentId,
-        assigneeAdapterOverrides: recoveryAssigneeAdapterOverrides(),
+        assigneeAdapterOverrides: recoveryAssigneeAdapterOverrides("status_only"),
        originKind: STRANDED_ISSUE_RECOVERY_ORIGIN_KIND,
        originId: input.issue.id,
        originRunId: input.latestRun?.id ?? null,
@@ -1920,7 +1920,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        sourceIssueId: input.issue.id,
        strandedRunId: input.latestRun?.id ?? null,
        recoveryCause,
-      }),
+      }, "status_only"),
      requestedByActorType: "system",
      requestedByActorId: null,
      contextSnapshot: withRecoveryModelProfileHint({
@@ -1931,7 +1931,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        sourceIssueId: input.issue.id,
        strandedRunId: input.latestRun?.id ?? null,
        recoveryCause,
-      }),
+      }, "status_only"),
    });

    return recovery;
@@ -2050,7 +2050,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        recoveryActionId: input.action.id,
        strandedRunId: input.latestRun?.id ?? null,
        recoveryCause: input.recoveryCause,
-      }),
+      }, "status_only"),
      requestedByActorType: "system",
      requestedByActorId: null,
      contextSnapshot: withRecoveryModelProfileHint({
@@ -2063,7 +2063,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        sourceIssueId: input.issue.id,
        strandedRunId: input.latestRun?.id ?? null,
        recoveryCause: input.recoveryCause,
-      }),
+      }, "status_only"),
    });
  }

@@ -3256,7 +3256,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        projectId: recoveryIssue.projectId,
        goalId: recoveryIssue.goalId,
        assigneeAgentId: ownerSelection.agentId,
-        assigneeAdapterOverrides: recoveryAssigneeAdapterOverrides(),
+        assigneeAdapterOverrides: recoveryAssigneeAdapterOverrides("status_only"),
        originKind: RECOVERY_ORIGIN_KINDS.issueGraphLivenessEscalation,
        originId: input.finding.incidentKey,
        originFingerprint: livenessRecoveryLeafFingerprint(input.finding),
@@ -3342,7 +3342,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        sourceIssueId: issue.id,
        recoveryIssueId: recoveryIssue.id,
        incidentKey: input.finding.incidentKey,
-      }),
+      }, "status_only"),
      requestedByActorType: "system",
      requestedByActorId: null,
      contextSnapshot: withRecoveryModelProfileHint({
@@ -3353,7 +3353,7 @@ export function recoveryService(db: Db, deps: { enqueueWakeup: RecoveryWakeup })
        sourceIssueId: issue.id,
        recoveryIssueId: recoveryIssue.id,
        incidentKey: input.finding.incidentKey,
-      }),
+      }, "status_only"),
    });

    logger.warn({