Files
paperclip/packages/adapters/codex-local/src/server/parse.test.ts
T
Dotta 8f1cd0474f [codex] Improve transient recovery and Codex model refresh (#4383)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapter execution and retry classification decide whether agent work
pauses, retries, or recovers automatically
> - Transient provider failures need to be classified precisely so
Paperclip does not convert retryable upstream conditions into false hard
failures
> - At the same time, operators need an up-to-date model list for
Codex-backed agents and prompts should nudge agents toward targeted
verification instead of repo-wide sweeps
> - This pull request tightens transient recovery classification for
Claude and Codex, updates the agent prompt guidance, and adds Codex
model refresh support end-to-end
> - The benefit is better automatic retry behavior plus fresher
operator-facing model configuration

## What Changed

- added Codex usage-limit retry-window parsing and Claude extra-usage
transient classification
- normalized the heartbeat transient-recovery contract across adapter
executions and heartbeat scheduling
- documented that deferred comment wakes only reopen completed issues
for human/comment-reopen interactions, while system follow-ups leave
closed work closed
- updated adapter-utils prompt guidance to prefer targeted verification
- added Codex model refresh support in the server route, registry,
shared types, and agent config form
- added adapter/server tests covering the new parsing, retry scheduling,
and model-refresh behavior

## Verification

- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-claude-local
packages/adapters/claude-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-codex-local
packages/adapters/codex-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/adapter-model-refresh-routes.test.ts
server/src/__tests__/adapter-models.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts`

## Risks

- Moderate behavior risk: retry classification affects whether runs
auto-recover or block, so mistakes here could either suppress needed
retries or over-retry real failures
- Low workflow risk: deferred comment wake reopening is intentionally
scoped to human/comment-reopen interactions so system follow-ups do not
revive completed issues unexpectedly

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:40:40 -05:00

141 lines
4.8 KiB
TypeScript
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
import { describe, expect, it } from "vitest";
import {
extractCodexRetryNotBefore,
isCodexTransientUpstreamError,
isCodexUnknownSessionError,
parseCodexJsonl,
} from "./parse.js";
describe("parseCodexJsonl", () => {
it("captures session id, assistant summary, usage, and error message", () => {
const stdout = [
JSON.stringify({ type: "thread.started", thread_id: "thread_123" }),
JSON.stringify({
type: "item.completed",
item: { type: "agent_message", text: "Recovered response" },
}),
JSON.stringify({
type: "turn.completed",
usage: { input_tokens: 10, cached_input_tokens: 2, output_tokens: 4 },
}),
JSON.stringify({ type: "turn.failed", error: { message: "resume failed" } }),
].join("\n");
expect(parseCodexJsonl(stdout)).toEqual({
sessionId: "thread_123",
summary: "Recovered response",
usage: {
inputTokens: 10,
cachedInputTokens: 2,
outputTokens: 4,
},
errorMessage: "resume failed",
});
});
it("uses the last agent message as the summary when commentary updates precede the final answer", () => {
const stdout = [
JSON.stringify({ type: "thread.started", thread_id: "thread_123" }),
JSON.stringify({
type: "item.completed",
item: { type: "reasoning", text: "Checking the heartbeat procedure" },
}),
JSON.stringify({
type: "item.completed",
item: { type: "agent_message", text: "Im checking out the issue and reading the docs now." },
}),
JSON.stringify({
type: "item.completed",
item: { type: "agent_message", text: "Fixed the issue and verified the targeted tests pass." },
}),
JSON.stringify({
type: "turn.completed",
usage: { input_tokens: 10, cached_input_tokens: 2, output_tokens: 4 },
}),
].join("\n");
expect(parseCodexJsonl(stdout)).toEqual({
sessionId: "thread_123",
summary: "Fixed the issue and verified the targeted tests pass.",
usage: {
inputTokens: 10,
cachedInputTokens: 2,
outputTokens: 4,
},
errorMessage: null,
});
});
});
describe("isCodexUnknownSessionError", () => {
it("detects the current missing-rollout thread error", () => {
expect(
isCodexUnknownSessionError(
"",
"Error: thread/resume: thread/resume failed: no rollout found for thread id d448e715-7607-4bcc-91fc-7a3c0c5a9632",
),
).toBe(true);
});
it("still detects existing stale-session wordings", () => {
expect(isCodexUnknownSessionError("unknown thread id", "")).toBe(true);
expect(isCodexUnknownSessionError("", "state db missing rollout path for thread abc")).toBe(true);
});
it("does not classify unrelated Codex failures as stale sessions", () => {
expect(isCodexUnknownSessionError("", "model overloaded")).toBe(false);
});
});
describe("isCodexTransientUpstreamError", () => {
it("classifies the remote-compaction high-demand failure as transient upstream", () => {
expect(
isCodexTransientUpstreamError({
errorMessage:
"Error running remote compact task: We're currently experiencing high demand, which may cause temporary errors.",
}),
).toBe(true);
expect(
isCodexTransientUpstreamError({
stderr: "We're currently experiencing high demand, which may cause temporary errors.",
}),
).toBe(true);
});
it("classifies usage-limit windows as transient and extracts the retry time", () => {
const errorMessage = "You've hit your usage limit for GPT-5.3-Codex-Spark. Switch to another model now, or try again at 11:31 PM.";
const now = new Date(2026, 3, 22, 22, 29, 2);
expect(isCodexTransientUpstreamError({ errorMessage })).toBe(true);
expect(extractCodexRetryNotBefore({ errorMessage }, now)?.getTime()).toBe(
new Date(2026, 3, 22, 23, 31, 0, 0).getTime(),
);
});
it("parses explicit timezone hints on usage-limit retry windows", () => {
const errorMessage = "You've hit your usage limit for GPT-5.3-Codex-Spark. Switch to another model now, or try again at 11:31 PM (America/Chicago).";
const now = new Date("2026-04-23T03:29:02.000Z");
expect(extractCodexRetryNotBefore({ errorMessage }, now)?.toISOString()).toBe(
"2026-04-23T04:31:00.000Z",
);
});
it("does not classify deterministic compaction errors as transient", () => {
expect(
isCodexTransientUpstreamError({
errorMessage: [
"Error running remote compact task: {",
' "error": {',
' "message": "Unknown parameter: \'prompt_cache_retention\'.",',
' "type": "invalid_request_error",',
' "param": "prompt_cache_retention",',
' "code": "unknown_parameter"',
" }",
"}",
].join("\n"),
}),
).toBe(false);
});
});