paperclip/ui/src/components/IssueRunLedger.test.tsx
Dotta a3de1d764d Add cheap model profiles for local adapters (#4881)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where
adapters are the boundary between the board, agents, and execution
runtimes.
> - Local adapters currently expose a primary runtime configuration, but
operators often need a cheaper model lane for routine or low-risk work.
> - That cheap lane has to stay adapter-owned: runtime profile settings
should not mutate the primary adapter config or bypass existing
auth/secret mediation.
> - Issue creation also needs an ergonomic way to request primary,
cheap, or custom model behavior for a selected assignee.
> - This pull request adds a first-class `cheap` model profile contract
across adapter capabilities, heartbeat config resolution, agent
configuration, and issue creation.
> - The benefit is that cheaper task execution can be configured and
requested explicitly while preserving adapter boundaries, secret
handling, and audit visibility.

## What Changed

- Added adapter model-profile capability metadata and a `cheap` profile
contract for supported local adapters.
- Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during
heartbeat config resolution, including requested/applied/fallback run
metadata.
- Added agent configuration UI for cheap model profile settings without
writing those settings into primary `adapterConfig`.
- Added New Issue assignee model lane controls for Primary / Cheap /
Custom and request payload handling.
- Added run ledger profile badges and Storybook stories for the new
cheap-lane UI states.
- Added tests for validators, heartbeat model profile application,
permission/secret mediation, UI payload helpers, and run ledger
rendering.
- Added committed UI verification screenshots under
`docs/pr-screenshots/pap-2837/`.
- Addressed Greptile review feedback around cheap-profile defaults,
shared profile types, and fallback test data.
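
To illustrate the heartbeat resolution step described above, here is a minimal sketch of how a `cheap` profile overlay could be applied without mutating the primary adapter config. The metadata fields (`requested`, `applied`, `configSource`, `fallbackReason`) mirror the `modelProfile` object asserted in the run ledger test; everything else is an assumption made for illustration, including the function name `resolveAdapterConfig`, the `enabled` flag, the shallow merge, and the reuse of a single fallback reason for both a missing and a disabled profile. It is not the server implementation.

```typescript
// Hypothetical types for illustration; only the four metadata field names
// are taken from the run ledger test in this PR.
type ModelProfileName = "cheap";

interface ModelProfileMetadata {
  requested: ModelProfileName | null;
  applied: ModelProfileName | null;
  configSource: "agent_runtime" | null;
  fallbackReason: string | null;
}

interface RuntimeConfig {
  modelProfiles?: {
    // `enabled` is an assumed flag, not confirmed by the source.
    cheap?: { enabled?: boolean; adapterConfig?: Record<string, unknown> };
  };
}

interface ResolvedConfig {
  adapterConfig: Record<string, unknown>;
  modelProfile: ModelProfileMetadata;
}

function resolveAdapterConfig(
  primary: Readonly<Record<string, unknown>>,
  runtimeConfig: RuntimeConfig,
  requested: ModelProfileName | null,
): ResolvedConfig {
  // No profile requested: run with a copy of the primary config.
  if (requested === null) {
    return {
      adapterConfig: { ...primary },
      modelProfile: { requested: null, applied: null, configSource: null, fallbackReason: null },
    };
  }

  const profile = runtimeConfig.modelProfiles?.[requested];

  // Profile missing or disabled: fall back to primary and record why.
  if (!profile?.adapterConfig || profile.enabled === false) {
    return {
      adapterConfig: { ...primary },
      modelProfile: {
        requested,
        applied: null,
        configSource: null,
        fallbackReason: "agent_runtime_profile_disabled",
      },
    };
  }

  // Shallow overlay on a fresh object, so `primary` is never mutated.
  return {
    adapterConfig: { ...primary, ...profile.adapterConfig },
    modelProfile: { requested, applied: requested, configSource: "agent_runtime", fallbackReason: null },
  };
}
```

Returning a new object on every path keeps the primary `adapterConfig` untouched, which is the property the cheap lane is meant to preserve.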

## Verification

Local:

- `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/heartbeat-model-profile.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts
ui/src/lib/issue-assignee-overrides.test.ts
ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103
tests.
- `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts
ui/src/components/IssueRunLedger.test.tsx` — passed after
Greptile/rebase follow-up, 2 files / 17 tests.
- `pnpm --filter @paperclipai/ui typecheck` — passed after
Greptile/rebase follow-up.
- `pnpm -r typecheck` — passed.
- `pnpm build` — passed.
- `pnpm test:run` — did not complete successfully in this local
worktree: it failed in pre-existing `@paperclipai/adapter-utils`
sandbox/SSH fixture suites that are outside this PR's diff. The failures
were 5s local timeouts, plus `git init -b` not being supported by this
machine's Git 2.21.0. The branch-specific targeted suites above passed.
- Branch was fetched/rebased onto `public-gh/master`; `git rev-list
--left-right --count public-gh/master...HEAD` reports `0 9`.
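
The `git init -b` failure noted above comes from the `-b` / `--initial-branch` flag only being available from Git 2.28 onward. As a rough sketch, a fixture could guard against older clients as shown below. The helper names are hypothetical and this is not how the `@paperclipai/adapter-utils` suites are actually written.

```typescript
// Sketch only: these helper names are hypothetical and not part of the repository.

// Parses output such as "git version 2.21.0" into [major, minor].
function parseGitVersion(versionOutput: string): [number, number] | null {
  const match = /git version (\d+)\.(\d+)/.exec(versionOutput);
  return match ? [Number(match[1]), Number(match[2])] : null;
}

// `git init -b <name>` (--initial-branch) is available from Git 2.28 onward.
function supportsInitialBranchFlag(versionOutput: string): boolean {
  const version = parseGitVersion(versionOutput);
  if (version === null) return false;
  const [major, minor] = version;
  return major > 2 || (major === 2 && minor >= 28);
}

// Returns the git argument lists needed to create a repository on `branch`.
function initRepoCommands(versionOutput: string, branch: string): string[][] {
  if (supportsInitialBranchFlag(versionOutput)) {
    return [["init", "-b", branch]];
  }
  // Older Git: initialise first, then point HEAD at the still-unborn branch.
  return [["init"], ["symbolic-ref", "HEAD", `refs/heads/${branch}`]];
}
```

Passing the output of `git --version` to `initRepoCommands` yields one command on a current client and a two-step fallback on Git 2.21.0.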

Remote PR checks on latest head
`e30bf399146451c86cee98ed528d51d33fa5af5a`:

- `policy` — passed.
- `verify` — passed.
- `e2e` — passed.
- `Greptile Review` — passed, confidence score 5/5; Greptile review
threads resolved.
- `security/snyk (cryppadotta)` — passed.

Screenshots:

- [New issue cheap lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png)
- [New issue custom lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png)
- [New issue unsupported adapter
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png)
- [Run ledger model profile badges
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png)
- Mobile variants are also in `docs/pr-screenshots/pap-2837/`.

## Risks

- Medium: heartbeat config mediation now merges runtime model profiles
into adapter configs, so adapter secret normalization and host-command
restrictions must keep covering nested config paths.
- Medium: the UI adds another issue creation choice; unsupported
adapters must keep hiding the cheap lane and preserve primary behavior.
- Low migration risk: no database migration is included.
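
The first risk above concerns secret handling reaching nested config paths. A small sketch of the kind of check that could catch a regression follows: a recursive walk that reports every secret-looking key, including ones under `modelProfiles.cheap.adapterConfig`. The key pattern and the function name `findSecretPaths` are assumptions for illustration and do not reflect the actual normalization code.

```typescript
// Hypothetical heuristic: key names that suggest a secret value.
const SECRET_KEY_PATTERN = /(secret|token|password|api[_-]?key)/i;

// Recursively collects dotted paths of string values stored under secret-looking keys.
function findSecretPaths(value: unknown, prefix = ""): string[] {
  if (value === null || typeof value !== "object") return [];
  const paths: string[] = [];
  for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (SECRET_KEY_PATTERN.test(key) && typeof child === "string") {
      paths.push(path);
    } else {
      // Descend into nested objects and arrays (array indices become path segments).
      paths.push(...findSecretPaths(child, path));
    }
  }
  return paths;
}
```

A test could assert that every path returned for an agent record is also covered by whatever redaction or mediation step runs before the config reaches an adapter.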

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use
and command execution. Exact served model/context window was not exposed
by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 15:32:04 -05:00

487 lines · 16 KiB · TypeScript

// @vitest-environment jsdom
import { act } from "react";
import type { ComponentProps, ReactNode } from "react";
import { createRoot, type Root } from "react-dom/client";
import type { ActivityEvent, Issue, RunLivenessState } from "@paperclipai/shared";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import type { RunForIssue } from "../api/activity";
import type { ActiveRunForIssue } from "../api/heartbeats";
import { IssueRunLedgerContent } from "./IssueRunLedger";

vi.mock("@/lib/router", () => ({
  Link: ({ children, to, ...props }: { children: ReactNode; to: string } & ComponentProps<"a">) => (
    <a href={to} {...props}>{children}</a>
  ),
}));

// eslint-disable-next-line @typescript-eslint/no-explicit-any
(globalThis as any).IS_REACT_ACT_ENVIRONMENT = true;

let container: HTMLDivElement;
let root: Root;

beforeEach(() => {
  vi.useFakeTimers();
  vi.setSystemTime(new Date("2026-04-18T20:00:00.000Z"));
  container = document.createElement("div");
  document.body.appendChild(container);
  root = createRoot(container);
});

afterEach(() => {
  act(() => root.unmount());
  container.remove();
  vi.useRealTimers();
});

function render(ui: ReactNode) {
  act(() => {
    root.render(ui);
  });
}

function createRun(overrides: Partial<RunForIssue> = {}): RunForIssue {
  return {
    runId: "run-00000000",
    status: "succeeded",
    agentId: "agent-1",
    adapterType: "codex_local",
    startedAt: "2026-04-18T19:58:00.000Z",
    finishedAt: "2026-04-18T19:59:00.000Z",
    createdAt: "2026-04-18T19:58:00.000Z",
    invocationSource: "assignment",
    usageJson: null,
    resultJson: null,
    livenessState: "advanced",
    livenessReason: "Run produced concrete action evidence: 2 activity event(s)",
    continuationAttempt: 0,
    lastUsefulActionAt: "2026-04-18T19:59:00.000Z",
    nextAction: null,
    ...overrides,
  };
}

function createActivity(overrides: Partial<ActivityEvent> = {}): ActivityEvent {
  return {
    id: "activity-1",
    companyId: "company-1",
    actorType: "system",
    actorId: "system",
    action: "issue.updated",
    entityType: "issue",
    entityId: "issue-1",
    agentId: null,
    runId: null,
    details: null,
    createdAt: new Date("2026-04-18T19:57:00.000Z"),
    ...overrides,
  };
}

function createIssue(overrides: Partial<Issue> = {}): Issue {
  return {
    id: "issue-1",
    companyId: "company-1",
    projectId: null,
    projectWorkspaceId: null,
    goalId: null,
    parentId: null,
    title: "Child issue",
    description: null,
    status: "todo",
    priority: "medium",
    assigneeAgentId: null,
    assigneeUserId: null,
    checkoutRunId: null,
    executionRunId: null,
    executionAgentNameKey: null,
    executionLockedAt: null,
    createdByAgentId: null,
    createdByUserId: null,
    issueNumber: null,
    identifier: "PAP-1",
    requestDepth: 0,
    billingCode: null,
    assigneeAdapterOverrides: null,
    executionWorkspaceId: null,
    executionWorkspacePreference: null,
    executionWorkspaceSettings: null,
    startedAt: null,
    completedAt: null,
    cancelledAt: null,
    hiddenAt: null,
    createdAt: new Date("2026-04-18T19:00:00.000Z"),
    updatedAt: new Date("2026-04-18T19:00:00.000Z"),
    ...overrides,
  };
}

function createActiveRun(overrides: Partial<ActiveRunForIssue> = {}): ActiveRunForIssue {
  return {
    id: "run-live-1",
    status: "running",
    invocationSource: "assignment",
    triggerDetail: null,
    startedAt: "2026-04-18T19:58:00.000Z",
    finishedAt: null,
    createdAt: "2026-04-18T19:58:00.000Z",
    agentId: "agent-1",
    agentName: "CodexCoder",
    adapterType: "codex_local",
    outputSilence: {
      lastOutputAt: "2026-04-18T19:00:00.000Z",
      lastOutputSeq: 4,
      lastOutputStream: "stdout",
      silenceStartedAt: "2026-04-18T19:30:00.000Z",
      silenceAgeMs: 45 * 60 * 1000,
      level: "critical",
      suspicionThresholdMs: 10 * 60 * 1000,
      criticalThresholdMs: 30 * 60 * 1000,
      snoozedUntil: null,
      evaluationIssueId: "issue-eval-1",
      evaluationIssueIdentifier: "PAP-404",
      evaluationIssueAssigneeAgentId: "agent-owner",
    },
    ...overrides,
  };
}

function renderLedger(props: Partial<ComponentProps<typeof IssueRunLedgerContent>> = {}) {
  render(
    <IssueRunLedgerContent
      runs={props.runs ?? []}
      liveRuns={props.liveRuns}
      activeRun={props.activeRun}
      issueStatus={props.issueStatus ?? "in_progress"}
      childIssues={props.childIssues ?? []}
      agentMap={props.agentMap ?? new Map([["agent-1", { name: "CodexCoder" }]])}
      activityEvents={props.activityEvents}
      renderActivityEvent={props.renderActivityEvent}
      pendingWatchdogDecision={props.pendingWatchdogDecision}
      canRecordWatchdogDecisions={props.canRecordWatchdogDecisions}
      watchdogDecisionError={props.watchdogDecisionError}
      onWatchdogDecision={props.onWatchdogDecision}
    />,
  );
}

describe("IssueRunLedger", () => {
  it("renders every liveness state with exhausted continuation context", () => {
    const states: RunLivenessState[] = [
      "advanced",
      "plan_only",
      "empty_response",
      "blocked",
      "failed",
      "completed",
      "needs_followup",
    ];
    renderLedger({
      runs: states.map((state, index) =>
        createRun({
          runId: `run-${index}0000000`,
          createdAt: `2026-04-18T19:5${index}:00.000Z`,
          livenessState: state,
          livenessReason: state === "needs_followup"
            ? "Run produced useful output but no concrete action evidence; continuation attempts exhausted"
            : `state ${state}`,
          continuationAttempt: state === "needs_followup" ? 3 : 0,
        }),
      ),
    });
    expect(container.textContent).toContain("Advanced");
    expect(container.textContent).toContain("Plan only");
    expect(container.textContent).toContain("Empty response");
    expect(container.textContent).toContain("Blocked");
    expect(container.textContent).toContain("Failed");
    expect(container.textContent).toContain("Completed");
    expect(container.textContent).toContain("Needs follow-up");
    expect(container.textContent).toContain("Exhausted");
    expect(container.textContent).toContain("Continuation attempt 3");
  });

  it("renders historical runs without liveness metadata as unavailable", () => {
    renderLedger({
      runs: [
        createRun({
          livenessState: null,
          livenessReason: null,
          continuationAttempt: undefined,
          lastUsefulActionAt: null,
          nextAction: null,
          resultJson: null,
        }),
      ],
    });
    expect(container.textContent).toContain("No liveness data");
    expect(container.textContent).toContain("Stop Unavailable");
    expect(container.textContent).toContain("Last useful action Unavailable");
  });

  it("interleaves run rows and activity rows by timestamp", () => {
    renderLedger({
      runs: [
        createRun({
          runId: "run-oldest",
          startedAt: "2026-04-18T19:55:00.000Z",
          createdAt: "2026-04-18T19:55:00.000Z",
        }),
        createRun({
          runId: "run-newest",
          startedAt: "2026-04-18T19:59:00.000Z",
          createdAt: "2026-04-18T19:59:00.000Z",
        }),
      ],
      activityEvents: [
        createActivity({
          id: "activity-middle",
          action: "activity-middle",
          createdAt: new Date("2026-04-18T19:57:00.000Z"),
        }),
      ],
      renderActivityEvent: (event) => (
        <div data-testid={`activity-${event.id}`}>{event.action}</div>
      ),
    });
    const text = container.textContent ?? "";
    const newestIndex = text.indexOf("run-newe");
    const activityIndex = text.indexOf("activity-middle");
    const oldestIndex = text.indexOf("run-olde");
    expect(newestIndex).toBeGreaterThanOrEqual(0);
    expect(activityIndex).toBeGreaterThan(newestIndex);
    expect(oldestIndex).toBeGreaterThan(activityIndex);
  });

  it("shows live runs as pending final checks without missing-data language", () => {
    renderLedger({
      runs: [
        createRun({
          status: "running",
          finishedAt: null,
          livenessState: null,
          livenessReason: null,
          continuationAttempt: 0,
          lastUsefulActionAt: null,
          nextAction: null,
          resultJson: null,
        }),
      ],
    });
    expect(container.textContent).toContain("Running now by CodexCoder");
    expect(container.textContent).toContain("Checks after finish");
    expect(container.textContent).toContain("Last useful action No action recorded yet");
    expect(container.textContent).toContain("Stop Still running");
    expect(container.textContent).not.toContain("Liveness pending");
    expect(container.textContent).not.toContain("initial attempt");
  });

  it("surfaces scheduled retry timing and exhaustion state without opening logs", () => {
    renderLedger({
      runs: [
        createRun({
          runId: "run-scheduled",
          status: "scheduled_retry",
          finishedAt: null,
          livenessState: null,
          livenessReason: null,
          retryOfRunId: "run-root",
          scheduledRetryAt: "2026-04-18T20:15:00.000Z",
          scheduledRetryAttempt: 2,
          scheduledRetryReason: "transient_failure",
        }),
        createRun({
          runId: "run-exhausted",
          status: "failed",
          createdAt: "2026-04-18T19:57:00.000Z",
          retryOfRunId: "run-root",
          scheduledRetryAttempt: 4,
          scheduledRetryReason: "transient_failure",
          retryExhaustedReason: "Bounded retry exhausted after 4 scheduled attempts; no further automatic retry will be queued",
        }),
      ],
    });
    expect(container.textContent).toContain("Retry scheduled");
    expect(container.textContent).toContain("Attempt 2");
    expect(container.textContent).toContain("Transient failure");
    expect(container.textContent).toContain("Next retry");
    expect(container.textContent).toContain("Retry exhausted");
    expect(container.textContent).toContain("no further automatic retry will be queued");
    expect(container.textContent).toContain("Manual intervention required");
  });

  it("shows timeout, cancel, and budget stop reasons without raw logs", () => {
    renderLedger({
      runs: [
        createRun({
          runId: "run-timeout",
          resultJson: { stopReason: "timeout", timeoutFired: true, effectiveTimeoutSec: 30 },
        }),
        createRun({
          runId: "run-cancel",
          resultJson: { stopReason: "cancelled" },
          createdAt: "2026-04-18T19:57:00.000Z",
        }),
        createRun({
          runId: "run-budget",
          resultJson: { stopReason: "budget_paused" },
          createdAt: "2026-04-18T19:56:00.000Z",
        }),
        createRun({
          runId: "run-paused",
          resultJson: { stopReason: "paused" },
          createdAt: "2026-04-18T19:55:00.000Z",
        }),
      ],
    });
    expect(container.textContent).toContain("timeout (30s timeout)");
    expect(container.textContent).toContain("cancelled");
    expect(container.textContent).toContain("budget paused");
    expect(container.textContent).toContain("paused by board");
  });

  it("surfaces active and completed child issue summaries", () => {
    renderLedger({
      childIssues: [
        createIssue({ id: "child-1", identifier: "PAP-2", title: "Implement worker handoff", status: "in_progress" }),
        createIssue({ id: "child-2", identifier: "PAP-3", title: "Verify final report", status: "done" }),
        createIssue({ id: "child-3", identifier: "PAP-4", title: "Cancelled experiment", status: "cancelled" }),
      ],
    });
    expect(container.textContent).toContain("Child work");
    expect(container.textContent).toContain("1 active, 1 done, 1 cancelled");
    expect(container.textContent).toContain("PAP-2");
    expect(container.textContent).toContain("Implement worker handoff");
    renderLedger({
      childIssues: [
        createIssue({ id: "child-2", identifier: "PAP-3", title: "Verify final report", status: "done" }),
        createIssue({ id: "child-3", identifier: "PAP-4", title: "Cancelled experiment", status: "cancelled" }),
      ],
    });
    expect(container.textContent).toContain("all 2 terminal (1 done, 1 cancelled)");
  });

  it("uses wrapping-friendly markup for long next action text", () => {
    renderLedger({
      runs: [
        createRun({
          nextAction: "Continue investigating this intentionally-long-next-action-token-that-needs-to-wrap-cleanly-on-mobile-and-desktop-without-overlapping-controls.",
        }),
      ],
    });
    const nextAction = [...container.querySelectorAll("span")]
      .find((node) => node.textContent?.includes("intentionally-long-next-action-token"));
    expect(nextAction?.className).toContain("break-words");
    expect(container.textContent).toContain("Next action:");
  });

  it("shows when older runs are clipped from the ledger", () => {
    renderLedger({
      runs: Array.from({ length: 22 }, (_, index) =>
        createRun({
          runId: `run-${index.toString().padStart(8, "0")}`,
          createdAt: `2026-04-18T19:${String(index).padStart(2, "0")}:00.000Z`,
        }),
      ),
    });
    expect(container.textContent).toContain("2 older items not shown");
  });

  it("renders stale-run banner, watchdog actions, and silence badge for live runs", () => {
    const onWatchdogDecision = vi.fn();
    renderLedger({
      runs: [createRun({ runId: "run-live-1", status: "running", finishedAt: null })],
      activeRun: createActiveRun(),
      onWatchdogDecision,
    });
    expect(container.textContent).toContain("Stale-run watchdog alert");
    expect(container.textContent).toContain("PAP-404");
    expect(container.textContent).toContain("Stale run");
    const watchdogBanner = Array.from(container.querySelectorAll("p"))
      .find((node) => node.textContent?.includes("Stale-run watchdog alert"))
      ?.closest("div");
    expect(watchdogBanner?.className).toContain("border-red-500/30");
    expect(watchdogBanner?.className).toContain("bg-red-500/10");
    const continueButton = Array.from(container.querySelectorAll("button")).find(
      (button) => button.textContent?.includes("Continue monitoring"),
    );
    expect(continueButton).not.toBeUndefined();
    act(() => {
      continueButton?.dispatchEvent(new MouseEvent("click", { bubbles: true }));
    });
    expect(onWatchdogDecision).toHaveBeenCalledWith({
      runId: "run-live-1",
      decision: "continue",
      evaluationIssueId: "issue-eval-1",
    });
  });

  it("renders requested/applied model profile and surfaces fallback reasons", () => {
    renderLedger({
      runs: [
        createRun({
          runId: "run-cheap-applied",
          resultJson: {
            modelProfile: {
              requested: "cheap",
              applied: "cheap",
              configSource: "agent_runtime",
              fallbackReason: null,
            },
          },
        }),
        createRun({
          runId: "run-cheap-fallback",
          createdAt: "2026-04-18T19:50:00.000Z",
          resultJson: {
            modelProfile: {
              requested: "cheap",
              applied: null,
              configSource: null,
              fallbackReason: "agent_runtime_profile_disabled",
            },
          },
        }),
      ],
    });
    expect(container.textContent).toContain("Profile: cheap");
    expect(container.textContent).toContain("Profile: cheap (unavailable)");
    expect(container.textContent).toContain("Cheap profile fell back to primary");
    expect(container.textContent).toContain("agent_runtime_profile_disabled");
  });

  it("hides watchdog decision actions for known non-owner viewers", () => {
    const onWatchdogDecision = vi.fn();
    renderLedger({
      runs: [createRun({ runId: "run-live-1", status: "running", finishedAt: null })],
      activeRun: createActiveRun(),
      canRecordWatchdogDecisions: false,
      onWatchdogDecision,
    });
    expect(container.textContent).toContain("Stale-run watchdog alert");
    expect(container.textContent).toContain("PAP-404");
    expect(container.textContent).not.toContain("Continue monitoring");
    expect(container.textContent).not.toContain("Snooze 1h");
    expect(container.textContent).not.toContain("Mark false positive");
    expect(container.querySelectorAll("button")).toHaveLength(0);
    expect(onWatchdogDecision).not.toHaveBeenCalled();
  });
});