paperclip/packages/adapters/codex-local/src/server/parse.test.ts
Dotta 09d0678840 [codex] Harden heartbeat scheduling and runtime controls (#4223)
## Thinking Path

> - Paperclip orchestrates AI agents through issue checkout, heartbeat
runs, routines, and auditable control-plane state
> - The runtime path has to recover from lost local processes, transient
adapter failures, blocked dependencies, and routine coalescing without
stranding work
> - The existing branch carried several reliability fixes across
heartbeat scheduling, issue runtime controls, routine dispatch, and
operator-facing run state
> - These changes belong together because they share backend contracts,
migrations, and runtime status semantics
> - This pull request groups the control-plane/runtime slice so it can
merge independently from board UI polish and adapter sandbox work
> - The benefit is safer heartbeat recovery, clearer runtime controls,
and more predictable recurring execution behavior

## What Changed

- Adds bounded heartbeat retry scheduling, scheduled retry state, and
Codex transient failure recovery handling.
- Tightens heartbeat process recovery, blocker wake behavior, issue
comment wake handling, routine dispatch coalescing, and
activity/dashboard bounds.
- Adds runtime-control MCP tools and Paperclip skill docs for issue
workspace runtime management.
- Adds migrations `0061_lively_thor_girl.sql` and
`0062_routine_run_dispatch_fingerprint.sql`.
- Surfaces retry state in run ledger/agent UI and keeps related shared
types synchronized.
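
As a rough illustration of what "bounded heartbeat retry scheduling" with transient failure handling could look like, here is a minimal sketch. Every name in it (`RetryPolicy`, `ScheduledRetry`, `isTransient`, `scheduleRetry`) is hypothetical and not taken from the Paperclip codebase, and the exponential backoff is an assumed policy; the only grounded detail is the "high demand" wording, which comes from the test fixture in `parse.test.ts`.

```typescript
// Hypothetical sketch: these names and the backoff policy are assumptions,
// not the Paperclip implementation.
interface RetryPolicy {
  maxAttempts: number; // maximum number of retries after the initial failure
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
}

interface ScheduledRetry {
  attempt: number; // 1-based retry number
  runAt: Date; // when the retry should be dispatched
}

// Wording taken from the transient-failure fixture in parse.test.ts.
const TRANSIENT_PATTERN = /currently experiencing high demand/i;

function isTransient(errorMessage: string): boolean {
  return TRANSIENT_PATTERN.test(errorMessage);
}

// Returns the next scheduled retry, or null when the failure is not
// transient or the retry budget is exhausted.
function scheduleRetry(
  errorMessage: string,
  previousAttempts: number,
  now: Date,
  policy: RetryPolicy,
): ScheduledRetry | null {
  if (!isTransient(errorMessage)) return null;
  if (previousAttempts >= policy.maxAttempts) return null;
  // Exponential backoff, capped so a long outage cannot push retries out indefinitely.
  const delayMs = Math.min(
    policy.baseDelayMs * Math.pow(2, previousAttempts),
    policy.maxDelayMs,
  );
  return {
    attempt: previousAttempts + 1,
    runAt: new Date(now.getTime() + delayMs),
  };
}
```

The two `null` branches are what make the schedule bounded: deterministic failures are never retried, and transient ones stop once `maxAttempts` is reached.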

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/routines-service.test.ts`
- `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server`

## Risks

- Medium risk: this touches heartbeat recovery and routine dispatch,
which are central execution paths.
- Migration order matters if split branches land out of order: merge
this PR before branches that assume the new runtime/routine fields.
- Runtime retry behavior should be watched in CI and in local operator
smoke tests because it changes how transient failures are resumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:24:11 -05:00

121 lines
3.9 KiB
TypeScript
import { describe, expect, it } from "vitest";
import {
  isCodexTransientUpstreamError,
  isCodexUnknownSessionError,
  parseCodexJsonl,
} from "./parse.js";

describe("parseCodexJsonl", () => {
  it("captures session id, assistant summary, usage, and error message", () => {
    const stdout = [
      JSON.stringify({ type: "thread.started", thread_id: "thread_123" }),
      JSON.stringify({
        type: "item.completed",
        item: { type: "agent_message", text: "Recovered response" },
      }),
      JSON.stringify({
        type: "turn.completed",
        usage: { input_tokens: 10, cached_input_tokens: 2, output_tokens: 4 },
      }),
      JSON.stringify({ type: "turn.failed", error: { message: "resume failed" } }),
    ].join("\n");
    expect(parseCodexJsonl(stdout)).toEqual({
      sessionId: "thread_123",
      summary: "Recovered response",
      usage: {
        inputTokens: 10,
        cachedInputTokens: 2,
        outputTokens: 4,
      },
      errorMessage: "resume failed",
    });
  });

  it("uses the last agent message as the summary when commentary updates precede the final answer", () => {
    const stdout = [
      JSON.stringify({ type: "thread.started", thread_id: "thread_123" }),
      JSON.stringify({
        type: "item.completed",
        item: { type: "reasoning", text: "Checking the heartbeat procedure" },
      }),
      JSON.stringify({
        type: "item.completed",
        item: { type: "agent_message", text: "I'm checking out the issue and reading the docs now." },
      }),
      JSON.stringify({
        type: "item.completed",
        item: { type: "agent_message", text: "Fixed the issue and verified the targeted tests pass." },
      }),
      JSON.stringify({
        type: "turn.completed",
        usage: { input_tokens: 10, cached_input_tokens: 2, output_tokens: 4 },
      }),
    ].join("\n");
    expect(parseCodexJsonl(stdout)).toEqual({
      sessionId: "thread_123",
      summary: "Fixed the issue and verified the targeted tests pass.",
      usage: {
        inputTokens: 10,
        cachedInputTokens: 2,
        outputTokens: 4,
      },
      errorMessage: null,
    });
  });
});

describe("isCodexUnknownSessionError", () => {
  it("detects the current missing-rollout thread error", () => {
    expect(
      isCodexUnknownSessionError(
        "",
        "Error: thread/resume: thread/resume failed: no rollout found for thread id d448e715-7607-4bcc-91fc-7a3c0c5a9632",
      ),
    ).toBe(true);
  });

  it("still detects existing stale-session wordings", () => {
    expect(isCodexUnknownSessionError("unknown thread id", "")).toBe(true);
    expect(isCodexUnknownSessionError("", "state db missing rollout path for thread abc")).toBe(true);
  });

  it("does not classify unrelated Codex failures as stale sessions", () => {
    expect(isCodexUnknownSessionError("", "model overloaded")).toBe(false);
  });
});

describe("isCodexTransientUpstreamError", () => {
  it("classifies the remote-compaction high-demand failure as transient upstream", () => {
    expect(
      isCodexTransientUpstreamError({
        errorMessage:
          "Error running remote compact task: We're currently experiencing high demand, which may cause temporary errors.",
      }),
    ).toBe(true);
    expect(
      isCodexTransientUpstreamError({
        stderr: "We're currently experiencing high demand, which may cause temporary errors.",
      }),
    ).toBe(true);
  });

  it("does not classify deterministic compaction errors as transient", () => {
    expect(
      isCodexTransientUpstreamError({
        errorMessage: [
          "Error running remote compact task: {",
          ' "error": {',
          ' "message": "Unknown parameter: \'prompt_cache_retention\'.",',
          ' "type": "invalid_request_error",',
          ' "param": "prompt_cache_retention",',
          ' "code": "unknown_parameter"',
          " }",
          "}",
        ].join("\n"),
      }),
    ).toBe(false);
  });
});
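
The `parseCodexJsonl` tests above pin down a small contract: the thread id becomes the session id, the last `agent_message` wins as the summary, usage fields are converted from snake_case to camelCase, and a `turn.failed` event supplies the error message. The real `parse.ts` is not shown here, so the following is only a sketch of one parser that would satisfy those expectations. The function name `sketchParseCodexJsonl`, the helper functions, and the defaults for missing fields (empty summary, zeroed usage) are assumptions.

```typescript
// Hypothetical sketch: not the adapter's actual parse.ts.
interface CodexUsage {
  inputTokens: number;
  cachedInputTokens: number;
  outputTokens: number;
}

interface ParsedCodexOutput {
  sessionId: string | null;
  summary: string;
  usage: CodexUsage;
  errorMessage: string | null;
}

function asRecord(value: unknown): Record<string, unknown> | null {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Record<string, unknown>)
    : null;
}

function asNumber(value: unknown): number {
  return typeof value === "number" && Number.isFinite(value) ? value : 0;
}

function sketchParseCodexJsonl(stdout: string): ParsedCodexOutput {
  const result: ParsedCodexOutput = {
    sessionId: null,
    summary: "", // assumed default when no agent_message is seen
    usage: { inputTokens: 0, cachedInputTokens: 0, outputTokens: 0 },
    errorMessage: null,
  };
  for (const line of stdout.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let event: Record<string, unknown> | null = null;
    try {
      event = asRecord(JSON.parse(trimmed));
    } catch {
      continue; // ignore lines that are not valid JSON
    }
    if (!event) continue;
    const type = event["type"];
    if (type === "thread.started") {
      const threadId = event["thread_id"];
      if (typeof threadId === "string") result.sessionId = threadId;
    } else if (type === "item.completed") {
      const item = asRecord(event["item"]);
      const text = item ? item["text"] : undefined;
      // Later agent messages overwrite earlier ones; reasoning items are skipped.
      if (item && item["type"] === "agent_message" && typeof text === "string") {
        result.summary = text;
      }
    } else if (type === "turn.completed") {
      const usage = asRecord(event["usage"]);
      if (usage) {
        result.usage = {
          inputTokens: asNumber(usage["input_tokens"]),
          cachedInputTokens: asNumber(usage["cached_input_tokens"]),
          outputTokens: asNumber(usage["output_tokens"]),
        };
      }
    } else if (type === "turn.failed") {
      const error = asRecord(event["error"]);
      const message = error ? error["message"] : undefined;
      if (typeof message === "string") result.errorMessage = message;
    }
  }
  return result;
}
```

Processing events in stream order and letting later `agent_message` items overwrite earlier ones is what makes interim commentary drop out of the summary, which is the behaviour the second test case checks.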