Make ACPX-Claude adapter work seamlessly (PAPA-388) (#6590)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, so when
an adapter fails, the platform must surface enough detail for the next
agent (or human reviewer) to act
> - The `acpx_local` adapter wraps `claude-agent-acp`, which in turn
drives the Claude Code SDK — three layers, three different permission
and error-handling models
> - A user created a `Claude Local ACPX` agent in PAPA-387 and it failed
instantly with the generic `acpx.error / "Internal error"` log,
stranding the work and triggering an opaque `stranded_assigned_issue`
recovery to the CTO
> - Once the diagnostic blackbox was opened, the underlying cause turned
out to be two SDK-level mismatches: a model-name allowlist that rejects
bare IDs like `claude-opus-4-7`, and a Claude Code
permission/Read-sandbox configuration that silently denies every
non-allowlisted tool when the user's `~/.claude/settings.json` has
`defaultMode: "dontAsk"`
> - This pull request fixes both classes of failure in the adapter
itself so new ACPX agents work seamlessly without per-host
configuration, and widens the diagnostic surface so the *next* failure
of any kind is actionable
> - The benefit is that ACPX-Claude can join the regular agent roster —
verified end to end on PAPA-401, where the agent successfully reached
the Paperclip API, opened a worktree, surveyed existing notification
PRs, and posted a structured plan

## What Changed

- Widen ACPX failure diagnostics
  (`packages/adapters/acpx-local/src/server/execute.ts`):
  - Capture `err.name`, ACP code, `cause.message`, retryable flag, and a
    5-frame stack preview into `errorMeta`.
  - Promote phase-specific error codes:
    `ensure_session → acpx_session_init_failed`,
    `configure_session → acpx_session_config_failed`,
    `turn → acpx_turn_failed`, plus mapping for
    `ACP_BACKEND_MISSING` / `ACP_BACKEND_UNAVAILABLE`.
  - Set `verbose: true` on the ACPX runtime for the `claude` agent so its
    session-event log flows through `ctx.onLog`. Other agents (`codex`,
    `custom`) keep `verbose: false`.
  - Capture child-process stderr via a wrapper-script tee into
    `<stateDir>/run-stderr/<runId>.log`, inline the tail into the
    `acpx.error` payload as `childStderrTail`, and forward it through
    `ctx.onLog("stderr", …)` so it lands in the heartbeat `stderrExcerpt`
    column (existing redaction applies).
- Set the model via the `ANTHROPIC_MODEL` env variable for the `claude`
  agent instead of `set_config_option(model, …)`. The ACP server's
  `set_config_option` handler validates against an internal allowlist and
  rejects bare IDs like `claude-opus-4-7`. `ANTHROPIC_MODEL` is read during
  initialization and bypasses that check.
- Seed `<worktree>/.claude/settings.local.json` before spawning
  `claude-agent-acp` (the seamless-API fix). Because `claude-agent-acp`
  hard-codes `settingSources: ["user", "project", "local"]` and "local"
  has the highest precedence, the seeded file overrides the user-level
  settings that caused the failure:
  - Set `permissions.defaultMode: "default"`, but **only** if the user's
    value is missing or `"dontAsk"` (the broken case). Other modes like
    `acceptEdits`/`plan` are preserved.
  - Pre-allow Paperclip's Bash surface (`Bash(curl:*)`, `Bash(env:*)`,
    `Bash(<cwd>/scripts/paperclip-issue-update.sh:*)`,
    `Bash(<cwd>/scripts/paperclip:*)`).
  - Widen `permissions.additionalDirectories` to include `stateDir`,
    `agentHome`, and the per-company instance root
    (`~/.paperclip/instances/<id>/companies/<companyId>`). This is scoped
    to the current company only and does not expose other tenants.
  - Existing user entries are merged, not replaced. The resolved roots are
    folded into the session fingerprint so warm-session handles invalidate
    when they change.
- Sync the existing server-side integration test
  (`server/src/__tests__/acpx-local-execute.test.ts`) to assert
  `acpx_session_init_failed` instead of the now-removed
  `acpx_protocol_error` for `ACP_SESSION_INIT_FAILED` (a follow-up to
  commit 1).
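
The phase-to-error-code promotion and the `errorMeta` fields listed above can be illustrated with a small sketch. This is not the code in `execute.ts`: the helper names `errorCodeForPhase` and `buildErrorMeta` are hypothetical, and the `ACP_BACKEND_MISSING` / `ACP_BACKEND_UNAVAILABLE` mapping is left out because its target codes are not stated here. Only the phase names, error codes, and field names come from this PR and its tests.

```typescript
// Hypothetical sketch of the diagnostics shape; names other than the
// phases, error codes, and meta fields are assumptions for illustration.
type Phase = "ensure_session" | "configure_session" | "turn";

const PHASE_ERROR_CODES: Record<Phase, string> = {
  ensure_session: "acpx_session_init_failed",
  configure_session: "acpx_session_config_failed",
  turn: "acpx_turn_failed",
};

interface ErrorMeta {
  phase: Phase;
  errorName: string;
  acpCode?: string;
  causeMessage?: string;
  retryable?: boolean;
  stackPreview?: string;
}

function errorCodeForPhase(phase: Phase): string {
  return PHASE_ERROR_CODES[phase];
}

function buildErrorMeta(err: unknown, phase: Phase, maxFrames = 5): ErrorMeta {
  // Treat the thrown value as a loosely typed bag; anything may be missing.
  const e = (err ?? {}) as {
    name?: unknown;
    code?: unknown;
    cause?: unknown;
    retryable?: unknown;
    stack?: unknown;
  };
  const cause = e.cause as { message?: unknown } | undefined;

  const meta: ErrorMeta = {
    phase,
    errorName: typeof e.name === "string" ? e.name : "Error",
  };
  if (typeof e.code === "string") meta.acpCode = e.code;
  if (cause && typeof cause.message === "string") meta.causeMessage = cause.message;
  if (typeof e.retryable === "boolean") meta.retryable = e.retryable;
  if (typeof e.stack === "string") {
    // The first stack line is the message; keep it plus up to maxFrames frames.
    meta.stackPreview = e.stack.split("\n").slice(0, maxFrames + 1).join("\n");
  }
  return meta;
}
```

With a shape like this, a rejected `ensureSession` call would yield `errorCodeForPhase("ensure_session")` as the run's `errorCode`, while the richer fields travel in `errorMeta` and the `acpx.error` log payload.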

## Verification

- `pnpm --filter "@paperclipai/adapter-acpx-local" run typecheck` —
passes.
- `pnpm vitest run` in `packages/adapters/acpx-local` — 35/35 pass,
including 4 new tests covering the `settings.local.json` write path
(claude only, merge with pre-existing content, `dontAsk` override, codex
no-op).
- `pnpm vitest run src/__tests__/acpx-local-execute.test.ts` in
`server/` — 15/15 pass after the test-sync commit.
- End-to-end manual verification (PAPA-401): the `Claude Local ACPX`
agent that previously hit "restricted environment" now successfully
reaches the Paperclip API, opens its worktree, posts structured plan
comments, and flips the issue to `in_review` without any external
configuration.

## Risks

- **Low**, scoped to the `acpx_local` adapter. The settings.local.json
write is per-worktree (worktrees live under
`.paperclip/worktrees/<issue>/`) and only triggers when `acpxAgent ===
"claude"`. Existing user content is merged with `[...existing,
...paperclip]` and deduped — nothing is overwritten outright.
- The `defaultMode` override is intentionally narrow: it only flips
`"dontAsk"` (which silently denies every tool and is the root cause) to
`"default"`. Users who explicitly picked `acceptEdits`, `plan`, or any
other mode keep their choice.
- Stderr capture goes through the existing `log-redaction` pass before
persisting, so `PAPERCLIP_API_KEY` and similar secrets in the wrapper
env don't leak into heartbeat logs.
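
The merge and narrow-override rules described above can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the adapter's implementation: `mergeClaudeSettings` and `mergeUnique` are hypothetical names, and the sketch only models the three `permissions` keys this PR mentions.

```typescript
// Hypothetical sketch of the settings.local.json merge; helper names and
// the exact settings shape are assumptions for illustration.
interface ClaudePermissions {
  defaultMode?: string;
  allow?: string[];
  additionalDirectories?: string[];
  [key: string]: unknown;
}

interface ClaudeSettings {
  permissions?: ClaudePermissions;
  [key: string]: unknown;
}

// Union two lists, keeping existing (user) entries first and dropping duplicates.
function mergeUnique(existing: string[] | undefined, paperclip: string[]): string[] {
  return Array.from(new Set([...(existing ?? []), ...paperclip]));
}

function mergeClaudeSettings(
  existing: ClaudeSettings,
  paperclipAllow: string[],
  paperclipDirs: string[],
): ClaudeSettings {
  const perms = existing.permissions ?? {};
  const userMode = perms.defaultMode;
  // Only a missing or "dontAsk" mode is replaced; any other choice is kept.
  const defaultMode =
    userMode === undefined || userMode === "dontAsk" ? "default" : userMode;

  return {
    ...existing, // unrelated top-level keys (e.g. statusLine) survive
    permissions: {
      ...perms,
      defaultMode,
      allow: mergeUnique(perms.allow, paperclipAllow),
      additionalDirectories: mergeUnique(perms.additionalDirectories, paperclipDirs),
    },
  };
}
```

Keeping the merge pure makes the "nothing is overwritten outright" claim easy to test: the function never drops a user entry, and the only value it can change is a missing or `"dontAsk"` `defaultMode`.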

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), running in the `claude_local`
adapter via Paperclip's harness. Extended thinking enabled, tool use
enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (adapter-only)
- [ ] I have updated relevant documentation to reflect my changes — no
user-facing docs changed; internal commentary in the code change
explains the SDK constraints
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Devin Foley
2026-05-23 13:01:27 -07:00
committed by GitHub
parent 897cc322c7
commit 96f0279e08
3 changed files with 578 additions and 19 deletions
@@ -2,6 +2,7 @@ import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it } from "vitest";
import type { AcpRuntimeOptions } from "acpx/runtime";
import { createAcpxLocalExecutor } from "./execute.js";
const tempRoots: string[] = [];
@@ -376,6 +377,126 @@ describe("acpx_local runtime skill isolation", () => {
    expect(envFiles.filter((contents) => contents.includes("PAPERCLIP_API_KEY='second-key'"))).toHaveLength(1);
  });

  it("enriches acpx.error diagnostics and child stderr when ensureSession rejects", async () => {
    const root = await makeTempRoot();
    const stateDir = path.join(root, "state");
    const runStderrDir = path.join(stateDir, "run-stderr");
    await fs.mkdir(runStderrDir, { recursive: true });
    const stderrTail = "claude-agent-acp: SDK init failed (auth missing)";
    await fs.writeFile(path.join(runStderrDir, "run-1.log"), `${stderrTail}\n`, "utf8");

    class FakeAcpRuntimeError extends Error {
      readonly code = "ACP_SESSION_INIT_FAILED";
      readonly cause: Error;
      readonly retryable = false;
      constructor(message: string, cause: Error) {
        super(message);
        this.name = "AcpRuntimeError";
        this.cause = cause;
      }
    }

    const logs: Array<{ stream: string; text: string }> = [];
    const execute = createAcpxLocalExecutor({
      createRuntime: () => ({
        ensureSession: async () => {
          throw new FakeAcpRuntimeError(
            "session/new failed: backend rejected initialize",
            new Error("upstream timeout"),
          );
        },
        startTurn: () => ({
          events: (async function* () {})(),
          result: Promise.resolve({ status: "completed", stopReason: "end_turn" }),
          cancel: async () => {},
        }),
        close: async () => {},
      }) as never,
    });

    const result = await execute({
      runId: "run-1",
      agent: { id: "agent-1", companyId: "company-1" },
      runtime: {},
      config: {
        agent: "custom",
        agentCommand: "node ./fake-acp.js",
        stateDir,
      },
      context: {},
      onLog: async (stream: "stdout" | "stderr", text: string) => {
        logs.push({ stream, text });
      },
      onMeta: async () => {},
    } as never);

    expect(result.exitCode).toBe(1);
    expect(result.errorCode).toBe("acpx_session_init_failed");
    const meta = result.errorMeta ?? {};
    expect(meta.errorName).toBe("AcpRuntimeError");
    expect(meta.acpCode).toBe("ACP_SESSION_INIT_FAILED");
    expect(meta.causeMessage).toBe("upstream timeout");
    expect(meta.retryable).toBe(false);
    expect(typeof meta.stackPreview).toBe("string");
    expect(meta.phase).toBe("ensure_session");

    const errorLogLine = logs.find((entry) => entry.stream === "stdout" && entry.text.includes("\"type\":\"acpx.error\""));
    expect(errorLogLine).toBeTruthy();
    const errorPayload = JSON.parse(errorLogLine!.text.trim());
    expect(errorPayload.phase).toBe("ensure_session");
    expect(errorPayload.errorName).toBe("AcpRuntimeError");
    expect(errorPayload.acpCode).toBe("ACP_SESSION_INIT_FAILED");
    expect(errorPayload.causeMessage).toBe("upstream timeout");
    expect(errorPayload.childStderrTail).toContain("SDK init failed");

    const stderrLog = logs.find((entry) => entry.stream === "stderr" && entry.text.includes("ACPX child stderr tail"));
    expect(stderrLog).toBeTruthy();
    expect(stderrLog!.text).toContain(stderrTail);
  });

  it("writes wrapper that redirects child stderr to a per-run log file", async () => {
    const root = await makeTempRoot();
    const stateDir = path.join(root, "state");
    const runtimeOptions: AcpRuntimeOptions[] = [];
    const execute = createAcpxLocalExecutor({
      createRuntime: (options) => {
        runtimeOptions.push(options as unknown as AcpRuntimeOptions);
        return buildRuntime() as never;
      },
    });

    const result = await execute({
      runId: "run-stderr-1",
      agent: { id: "agent-1", companyId: "company-1" },
      runtime: {},
      config: {
        agent: "custom",
        agentCommand: "node ./fake-acp.js",
        stateDir,
      },
      context: {},
      onLog: async () => {},
      onMeta: async () => {},
    } as never);

    expect(result.exitCode).toBe(0);
    const verboseFlags = runtimeOptions.map((options) => (options as { verbose?: boolean }).verbose);
    // verbose is scoped to the claude agent (PAPA-388); the custom agent here
    // should not opt in to ACPX runtime verbose session-event logs.
    expect(verboseFlags.every((flag) => flag === false)).toBe(true);

    const wrappers = await fs.readdir(path.join(stateDir, "wrappers"));
    const wrapperFile = wrappers.find((name) => name.endsWith(".sh"));
    expect(wrapperFile).toBeTruthy();
    const wrapper = await fs.readFile(path.join(stateDir, "wrappers", wrapperFile!), "utf8");
    expect(wrapper).toContain("stderr_dir=");
    expect(wrapper).toContain("run-stderr");
    expect(wrapper).toContain("PAPERCLIP_RUN_ID");
    expect(wrapper).toContain("tee -a");
    expect(wrapper).toContain("exec node ./fake-acp.js");
  });

  it("passes Paperclip env through the ACP agent wrapper instead of process.env", async () => {
    let observedApiKeyDuringStream: string | undefined;
    const execute = createAcpxLocalExecutor({
@@ -422,4 +543,160 @@ describe("acpx_local runtime skill isolation", () => {
      else process.env.PAPERCLIP_API_KEY = previousApiKey;
    }
  });

  it("writes a Paperclip-managed .claude/settings.local.json for the claude agent so it can reach the Paperclip API", async () => {
    const root = await makeTempRoot();
    const stateDir = path.join(root, "state");
    const cwd = path.join(root, "worktree");
    await fs.mkdir(cwd, { recursive: true });

    const { meta } = await runExecutor(
      { agent: "claude", stateDir, cwd },
      { context: { paperclipWorkspace: { cwd, agentHome: path.join(root, "agent-home") } } },
    );

    const settingsPath = path.join(cwd, ".claude", "settings.local.json");
    const written = JSON.parse(await fs.readFile(settingsPath, "utf8")) as {
      permissions?: {
        allow?: unknown;
        additionalDirectories?: unknown;
        defaultMode?: unknown;
      };
    };
    expect(written.permissions?.defaultMode).toBe("default");
    const allow = written.permissions?.allow;
    expect(Array.isArray(allow)).toBe(true);
    expect(allow).toContain("Bash(curl:*)");
    expect(allow).toContain(`Bash(${cwd}/scripts/paperclip-issue-update.sh:*)`);
    const additionalDirectories = written.permissions?.additionalDirectories as string[] | undefined;
    expect(Array.isArray(additionalDirectories)).toBe(true);
    expect(additionalDirectories).toContain(stateDir);
    expect(additionalDirectories).toContain(path.join(root, "agent-home"));

    const note = (meta[0]?.commandNotes as string[] | undefined)?.find((entry) =>
      entry.includes("Paperclip-managed Claude settings"),
    );
    expect(note).toBeTruthy();
  });

  it("merges Paperclip allowlist into an existing .claude/settings.local.json without losing user entries", async () => {
    const root = await makeTempRoot();
    const stateDir = path.join(root, "state");
    const cwd = path.join(root, "worktree");
    await fs.mkdir(path.join(cwd, ".claude"), { recursive: true });
    await fs.writeFile(
      path.join(cwd, ".claude", "settings.local.json"),
      JSON.stringify(
        {
          statusLine: { type: "command", command: "preserve-me" },
          permissions: {
            allow: ["Bash(npm test:*)"],
            additionalDirectories: ["/Users/example/custom"],
            defaultMode: "acceptEdits",
          },
        },
        null,
        2,
      ),
      "utf8",
    );

    await runExecutor(
      { agent: "claude", stateDir, cwd },
      { context: { paperclipWorkspace: { cwd } } },
    );

    const written = JSON.parse(
      await fs.readFile(path.join(cwd, ".claude", "settings.local.json"), "utf8"),
    ) as {
      statusLine?: unknown;
      permissions?: {
        allow?: string[];
        additionalDirectories?: string[];
        defaultMode?: string;
      };
    };
    expect(written.statusLine).toEqual({ type: "command", command: "preserve-me" });
    expect(written.permissions?.defaultMode).toBe("acceptEdits");
    expect(written.permissions?.allow).toContain("Bash(npm test:*)");
    expect(written.permissions?.allow).toContain("Bash(curl:*)");
    expect(written.permissions?.additionalDirectories).toContain("/Users/example/custom");
    expect(written.permissions?.additionalDirectories).toContain(stateDir);
  });

  it("overrides a user-supplied dontAsk defaultMode so ACPX can route Bash through canUseTool", async () => {
    const root = await makeTempRoot();
    const stateDir = path.join(root, "state");
    const cwd = path.join(root, "worktree");
    await fs.mkdir(path.join(cwd, ".claude"), { recursive: true });
    await fs.writeFile(
      path.join(cwd, ".claude", "settings.local.json"),
      JSON.stringify({ permissions: { defaultMode: "dontAsk" } }, null, 2),
      "utf8",
    );

    const { meta } = await runExecutor(
      { agent: "claude", stateDir, cwd },
      { context: { paperclipWorkspace: { cwd } } },
    );

    const written = JSON.parse(
      await fs.readFile(path.join(cwd, ".claude", "settings.local.json"), "utf8"),
    ) as { permissions?: { defaultMode?: string } };
    expect(written.permissions?.defaultMode).toBe("default");

    const overrideNote = (meta[0]?.commandNotes as string[] | undefined)?.find((entry) =>
      entry.includes("overrode user dontAsk"),
    );
    expect(overrideNote).toBeTruthy();
  });

  it("opts the claude agent into ACPX runtime verbose logs but leaves codex/custom agents quiet", async () => {
    const root = await makeTempRoot();
    const cwd = path.join(root, "worktree");
    await fs.mkdir(cwd, { recursive: true });

    const verboseByAgent: Record<string, boolean | undefined> = {};
    for (const agent of ["claude", "codex", "custom"] as const) {
      const runtimeOptions: AcpRuntimeOptions[] = [];
      const execute = createAcpxLocalExecutor({
        createRuntime: (options) => {
          runtimeOptions.push(options as AcpRuntimeOptions);
          return buildRuntime() as never;
        },
      });
      const result = await execute({
        runId: `run-${agent}`,
        agent: { id: `agent-${agent}`, companyId: "company-1" },
        runtime: {},
        config:
          agent === "custom"
            ? { agent, agentCommand: "node ./fake-acp.js", stateDir: path.join(root, `state-${agent}`), cwd }
            : { agent, stateDir: path.join(root, `state-${agent}`), cwd },
        context: { paperclipWorkspace: { cwd } },
        onLog: async () => {},
        onMeta: async () => {},
      } as never);
      expect(result.exitCode).toBe(0);
      verboseByAgent[agent] = (runtimeOptions[0] as { verbose?: boolean } | undefined)?.verbose;
    }

    expect(verboseByAgent.claude).toBe(true);
    expect(verboseByAgent.codex).toBe(false);
    expect(verboseByAgent.custom).toBe(false);
  });

  it("does not touch .claude/settings.local.json for the codex agent", async () => {
    const root = await makeTempRoot();
    const stateDir = path.join(root, "state");
    const cwd = path.join(root, "worktree");
    await fs.mkdir(cwd, { recursive: true });

    await runExecutor(
      { agent: "codex", stateDir, cwd },
      { context: { paperclipWorkspace: { cwd } } },
    );

    expect(await pathExists(path.join(cwd, ".claude", "settings.local.json"))).toBe(false);
  });
});