paperclip/server/src/__tests__/adapter-registry.test.ts
Devin Foley b24c6909e8 Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs inside a sandbox environment so its CLI is isolated
from the host
> - Sandbox-backed adapter runs go through a small set of shared helpers
— `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback
bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings
> - When standing up new sandbox provider plugins, the existing helpers
timed out, missed install fallbacks, or leaned on assumptions that only
held for E2B
> - Local adapters (`claude-local`, `codex-local`, `gemini-local`,
`opencode-local`) needed slightly hardened probes so they could install
themselves and validate inside *any* remote sandbox transport, not just
E2B
> - This pull request bundles those runtime fixes so future sandbox
provider plugins inherit a working baseline
> - The benefit is that adding a new sandbox provider plugin no longer
requires touching adapter-utils or each local-adapter probe — the
supporting infra is already correct

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: introduce
`DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and
`resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters
keep the historical "0 means no adapter timeout" behavior;
sandbox-backed runs without an explicit `timeoutSec` get an explicit
30-minute default so remote installs and warm-up don't time out at the
per-RPC default. Plumb `timeoutSec` through
`ensureAdapterExecutionTargetCommandResolvable` so install probes inside
a sandbox honor adapter-level overrides instead of the bridge's 5-minute
default.
- `packages/adapters/opencode-local/src/index.ts`: switch
`SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl
-fsSL https://opencode.ai/install | bash`. The npm package reifies four
large prebuilt-binary subpackages in parallel even though only one
matches the host arch; on bandwidth-constrained sandboxes that blew
through the 240s install budget. The official installer fetches one
arch-specific binary and adds `$HOME/.opencode/bin` to PATH via
`~/.bashrc`, which the sandbox-callback-bridge login-shell script
already sources.
- `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden
remote-target probes — pass `--skip-git-repo-check` for Codex when
probing outside a repo, normalize permission flags for Claude, and add
`*.remote.test.ts` coverage that exercises the remote-sandbox path
explicitly for each adapter.
- `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}`
(new): add `buildSandboxNpmInstallCommand` helper.
`server/src/adapters/registry.ts` + new
`server/src/__tests__/adapter-registry.test.ts`: wire adapter install
commands so they fall back to a writable `$HOME/.local` prefix when
global install isn't available.
- `server/src/__tests__/plugin-worker-manager.test.ts` + new
`server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call
timeout overrides so plugin worker exec calls honor the caller's timeout
instead of the worker's default.
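
The timeout rule in the first bullet can be illustrated with a minimal sketch. This is not the code from `execution-target.ts`: the `TargetKind` union, the function name `resolveTimeoutSecSketch`, and the choice to treat `0`, `null`, and `undefined` alike as "not configured" are assumptions made for illustration. Only the constant name and its value of 1800 come from the description above.

```typescript
// Value taken from the description above.
const DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800;

// Hypothetical target classification; the real execution-target type is not shown here.
type TargetKind = "local" | "ssh" | "sandbox";

// Hypothetical stand-in for resolveAdapterExecutionTargetTimeoutSec(...).
// Assumption: a missing, zero, or non-finite value counts as "no timeoutSec configured".
function resolveTimeoutSecSketch(kind: TargetKind, configured?: number | null): number {
  if (typeof configured === "number" && Number.isFinite(configured) && configured > 0) {
    // An explicit adapter-level override always wins.
    return configured;
  }
  // Sandbox-backed runs get the 30-minute default; local and SSH keep
  // the historical "0 means no adapter timeout" behavior.
  return kind === "sandbox" ? DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC : 0;
}
```

Under these assumptions, `resolveTimeoutSecSketch("sandbox")` yields 1800, `resolveTimeoutSecSketch("sandbox", 600)` yields 600, and `resolveTimeoutSecSketch("local")` yields 0.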

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/execution-target-sandbox.test.ts
packages/adapter-utils/src/sandbox-install-command.test.ts`
- `pnpm exec vitest run --no-coverage
server/src/__tests__/plugin-worker-manager.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/claude-local-adapter-environment.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter-environment.test.ts`
- `pnpm exec vitest run --no-coverage
packages/adapters/codex-local/src/server/test.remote.test.ts
packages/adapters/opencode-local/src/server/test.remote.test.ts
packages/adapters/codex-local/src/server/codex-args.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts`

All passing locally.

## Risks

- Touches shared `adapter-utils` and several `*-local` adapters. The
30-minute default applies only when both (a) the target is
`remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH
paths are unchanged. New test coverage was added alongside each behavior
change to pin the contracts.
- Switching OpenCode's install command to the official installer is a
behavior change for any operator running OpenCode inside a remote
sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND`
only runs when an adapter is being installed inside a sandbox).
- Low risk overall — no migrations, no API surface change.
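
For readers assessing the install-command risk, the guarded install string that the registry test asserts can be sketched as follows. The `if ! command -v ... then ...; fi` wrapper shape is taken from the expectations in `adapter-registry.test.ts`. Everything else is hypothetical: `shellQuote`, `sketchNpmInstallWithFallback`, and `wrapWithDetect` are illustrative names, and the fallback line is only one plausible way to express "fall back to a writable `$HOME/.local` prefix". The actual output of `buildSandboxNpmInstallCommand` is not shown in this description.

```typescript
// Quote a value for POSIX sh by wrapping it in single quotes and escaping embedded ones.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, "'\\''")}'`;
}

// Hypothetical stand-in for buildSandboxNpmInstallCommand: try a global install,
// then retry against a user-writable prefix if the first attempt fails.
function sketchNpmInstallWithFallback(pkg: string): string {
  const quoted = shellQuote(pkg);
  return `npm install -g ${quoted} || npm install -g --prefix "$HOME/.local" ${quoted}`;
}

// Wrapper shape asserted by the registry test: skip the install when the CLI already resolves.
function wrapWithDetect(detectCommand: string, installCommand: string): string {
  return `if ! command -v ${shellQuote(detectCommand)} >/dev/null 2>&1; then ${installCommand}; fi`;
}

const claudeInstall = wrapWithDetect(
  "claude",
  sketchNpmInstallWithFallback("@anthropic-ai/claude-code"),
);
```

With a `--prefix` fallback like this one, npm places executables under `$HOME/.local/bin` on Unix-like systems, so that directory would also need to be on `PATH` inside the sandbox.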

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep),
no code execution beyond local repo commands

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 00:31:54 -07:00


import { describe, expect, it, beforeEach, afterEach, vi } from "vitest";
import { buildSandboxNpmInstallCommand } from "@paperclipai/adapter-utils";
import type { ServerAdapterModule } from "../adapters/index.js";
const hermesExecuteMock = vi.hoisted(() =>
  vi.fn(async () => ({
    exitCode: 0,
    signal: null,
    timedOut: false,
  })),
);
vi.mock("hermes-paperclip-adapter/server", () => ({
  execute: hermesExecuteMock,
  testEnvironment: async () => ({
    adapterType: "hermes_local",
    status: "pass",
    checks: [],
    testedAt: new Date(0).toISOString(),
  }),
  sessionCodec: null,
  listSkills: async () => [],
  syncSkills: async () => ({ entries: [] }),
  detectModel: async () => null,
}));
import {
  detectAdapterModel,
  findActiveServerAdapter,
  findServerAdapter,
  listAdapterModels,
  listAdapterModelProfiles,
  registerServerAdapter,
  requireServerAdapter,
  unregisterServerAdapter,
} from "../adapters/index.js";
import {
  resolveExternalAdapterRegistration,
  setOverridePaused,
} from "../adapters/registry.js";
const externalAdapter: ServerAdapterModule = {
  type: "external_test",
  execute: async () => ({
    exitCode: 0,
    signal: null,
    timedOut: false,
  }),
  testEnvironment: async () => ({
    adapterType: "external_test",
    status: "pass",
    checks: [],
    testedAt: new Date(0).toISOString(),
  }),
  models: [{ id: "external-model", label: "External Model" }],
  supportsLocalAgentJwt: false,
};
describe("server adapter registry", () => {
  beforeEach(() => {
    unregisterServerAdapter("external_test");
    unregisterServerAdapter("claude_local");
    setOverridePaused("claude_local", false);
  });
  afterEach(() => {
    unregisterServerAdapter("external_test");
    unregisterServerAdapter("claude_local");
    setOverridePaused("claude_local", false);
    hermesExecuteMock.mockClear();
  });
  it("registers external adapters and exposes them through lookup helpers", async () => {
    expect(findServerAdapter("external_test")).toBeNull();
    registerServerAdapter(externalAdapter);
    expect(requireServerAdapter("external_test")).toBe(externalAdapter);
    expect(await listAdapterModels("external_test")).toEqual([
      { id: "external-model", label: "External Model" },
    ]);
  });
  it("exposes adapter model profiles when adapters declare them", async () => {
    const adapterWithProfiles: ServerAdapterModule = {
      ...externalAdapter,
      modelProfiles: [
        {
          key: "cheap",
          label: "Cheap",
          adapterConfig: { model: "external-mini" },
          source: "adapter_default",
        },
      ],
    };
    registerServerAdapter(adapterWithProfiles);
    expect(await listAdapterModelProfiles("external_test")).toEqual([
      {
        key: "cheap",
        label: "Cheap",
        adapterConfig: { model: "external-mini" },
        source: "adapter_default",
      },
    ]);
  });
  it("removes external adapters when unregistered", () => {
    registerServerAdapter(externalAdapter);
    unregisterServerAdapter("external_test");
    expect(findServerAdapter("external_test")).toBeNull();
    expect(() => requireServerAdapter("external_test")).toThrow(
      "Unknown adapter type: external_test",
    );
  });
  it("allows external plugin to override a built-in adapter type", () => {
    // claude_local is always built-in
    const builtIn = findServerAdapter("claude_local");
    expect(builtIn).not.toBeNull();
    const plugin: ServerAdapterModule = {
      type: "claude_local",
      execute: async () => ({
        exitCode: 0,
        signal: null,
        timedOut: false,
      }),
      testEnvironment: async () => ({
        adapterType: "claude_local",
        status: "pass",
        checks: [],
        testedAt: new Date(0).toISOString(),
      }),
      models: [{ id: "plugin-model", label: "Plugin Override" }],
      supportsLocalAgentJwt: false,
    };
    registerServerAdapter(plugin);
    // Plugin wins
    const resolved = requireServerAdapter("claude_local");
    expect(resolved).toBe(plugin);
    expect(resolved.models).toEqual([
      { id: "plugin-model", label: "Plugin Override" },
    ]);
  });
  it("exposes capability flags from registered adapters", () => {
    const adapterWithCaps: ServerAdapterModule = {
      type: "external_test",
      execute: async () => ({ exitCode: 0, signal: null, timedOut: false }),
      testEnvironment: async () => ({
        adapterType: "external_test",
        status: "pass" as const,
        checks: [],
        testedAt: new Date(0).toISOString(),
      }),
      supportsLocalAgentJwt: true,
      supportsInstructionsBundle: true,
      instructionsPathKey: "customPathKey",
      requiresMaterializedRuntimeSkills: true,
    };
    registerServerAdapter(adapterWithCaps);
    const resolved = findActiveServerAdapter("external_test");
    expect(resolved).not.toBeNull();
    expect(resolved!.supportsInstructionsBundle).toBe(true);
    expect(resolved!.instructionsPathKey).toBe("customPathKey");
    expect(resolved!.requiresMaterializedRuntimeSkills).toBe(true);
    expect(resolved!.supportsLocalAgentJwt).toBe(true);
  });
  it("returns undefined for capability flags on adapters that do not set them", () => {
    registerServerAdapter(externalAdapter);
    const resolved = findActiveServerAdapter("external_test");
    expect(resolved).not.toBeNull();
    expect(resolved!.supportsInstructionsBundle).toBeUndefined();
    expect(resolved!.instructionsPathKey).toBeUndefined();
    expect(resolved!.requiresMaterializedRuntimeSkills).toBeUndefined();
  });
  it("built-in claude_local adapter declares capability flags", () => {
    const adapter = findActiveServerAdapter("claude_local");
    expect(adapter).not.toBeNull();
    expect(adapter!.supportsInstructionsBundle).toBe(true);
    expect(adapter!.instructionsPathKey).toBe("instructionsFilePath");
    expect(adapter!.requiresMaterializedRuntimeSkills).toBe(false);
    expect(adapter!.supportsLocalAgentJwt).toBe(true);
  });
  it("built-in local adapters declare cheap model profile defaults where supported", async () => {
    await expect(listAdapterModelProfiles("claude_local")).resolves.toEqual([
      expect.objectContaining({
        key: "cheap",
        adapterConfig: expect.objectContaining({ model: "claude-sonnet-4-6" }),
        source: "adapter_default",
      }),
    ]);
    await expect(listAdapterModelProfiles("codex_local")).resolves.toEqual([
      expect.objectContaining({
        key: "cheap",
        adapterConfig: expect.objectContaining({ model: "gpt-5.3-codex-spark" }),
        source: "adapter_default",
      }),
    ]);
    await expect(listAdapterModelProfiles("gemini_local")).resolves.toEqual([
      expect.objectContaining({
        key: "cheap",
        adapterConfig: expect.objectContaining({ model: "gemini-2.5-flash-lite" }),
        source: "adapter_default",
      }),
    ]);
    await expect(listAdapterModelProfiles("opencode_local")).resolves.toEqual([
      expect.objectContaining({
        key: "cheap",
        adapterConfig: expect.objectContaining({ model: "openai/gpt-5.1-codex-mini" }),
        source: "adapter_default",
      }),
    ]);
    await expect(listAdapterModelProfiles("cursor")).resolves.toEqual([
      expect.objectContaining({
        key: "cheap",
        adapterConfig: expect.objectContaining({ model: "gpt-5.1-codex-mini" }),
        source: "adapter_default",
      }),
    ]);
    await expect(listAdapterModelProfiles("pi_local")).resolves.toEqual([]);
  });
  it("wraps built-in npm runtime installs with the sandbox-aware install helper", () => {
    const expectedClaudeInstall = `if ! command -v 'claude' >/dev/null 2>&1; then ${buildSandboxNpmInstallCommand("@anthropic-ai/claude-code")}; fi`;
    const expectedCodexInstall = `if ! command -v 'codex' >/dev/null 2>&1; then ${buildSandboxNpmInstallCommand("@openai/codex")}; fi`;
    const expectedGeminiInstall = `if ! command -v 'gemini' >/dev/null 2>&1; then ${buildSandboxNpmInstallCommand("@google/gemini-cli")}; fi`;
    const expectedOpenCodeInstall = `if ! command -v 'opencode' >/dev/null 2>&1; then ${buildSandboxNpmInstallCommand("opencode-ai")}; fi`;
    expect(findActiveServerAdapter("claude_local")?.getRuntimeCommandSpec?.({})).toEqual({
      command: "claude",
      detectCommand: "claude",
      installCommand: expectedClaudeInstall,
    });
    expect(findActiveServerAdapter("codex_local")?.getRuntimeCommandSpec?.({})).toEqual({
      command: "codex",
      detectCommand: "codex",
      installCommand: expectedCodexInstall,
    });
    expect(findActiveServerAdapter("gemini_local")?.getRuntimeCommandSpec?.({})).toEqual({
      command: "gemini",
      detectCommand: "gemini",
      installCommand: expectedGeminiInstall,
    });
    expect(findActiveServerAdapter("opencode_local")?.getRuntimeCommandSpec?.({})).toEqual({
      command: "opencode",
      detectCommand: "opencode",
      installCommand: expectedOpenCodeInstall,
    });
  });
  it("switches active adapter behavior back to the builtin when an override is paused", async () => {
    const builtIn = findServerAdapter("claude_local");
    expect(builtIn).not.toBeNull();
    const detectModel = vi.fn(async () => ({
      model: "plugin-model",
      provider: "plugin-provider",
      source: "plugin-source",
    }));
    const plugin: ServerAdapterModule = {
      type: "claude_local",
      execute: async () => ({
        exitCode: 0,
        signal: null,
        timedOut: false,
      }),
      testEnvironment: async () => ({
        adapterType: "claude_local",
        status: "pass",
        checks: [],
        testedAt: new Date(0).toISOString(),
      }),
      models: [{ id: "plugin-model", label: "Plugin Override" }],
      detectModel,
      supportsLocalAgentJwt: false,
    };
    registerServerAdapter(plugin);
    expect(findActiveServerAdapter("claude_local")).toBe(plugin);
    expect(await listAdapterModels("claude_local")).toEqual([
      { id: "plugin-model", label: "Plugin Override" },
    ]);
    expect(await detectAdapterModel("claude_local")).toMatchObject({
      model: "plugin-model",
      provider: "plugin-provider",
    });
    expect(setOverridePaused("claude_local", true)).toBe(true);
    expect(findActiveServerAdapter("claude_local")).not.toBe(plugin);
    expect(await listAdapterModels("claude_local")).toEqual(builtIn?.models ?? []);
    expect(await detectAdapterModel("claude_local")).toBeNull();
    expect(detectModel).toHaveBeenCalledTimes(1);
  });
  it("injects the local agent JWT and Paperclip API auth guidance into Hermes", async () => {
    const adapter = requireServerAdapter("hermes_local");
    await adapter.execute({
      runId: "run-123",
      agent: {
        id: "agent-123",
        companyId: "company-123",
        name: "Hermes Agent",
        role: "engineer",
        adapterType: "hermes_local",
        adapterConfig: {
          env: {
            OPENAI_API_KEY: "llm-token",
          },
          promptTemplate: "Existing prompt",
        },
      },
      runtime: {},
      config: {},
      context: {},
      onLog: async () => {},
      onMeta: async () => {},
      onSpawn: async () => {},
      authToken: "agent-run-jwt",
    });
    expect(hermesExecuteMock).toHaveBeenCalledTimes(1);
    const [patchedCtx] = hermesExecuteMock.mock.calls[0];
    expect(patchedCtx.agent.adapterConfig).toMatchObject({
      env: {
        OPENAI_API_KEY: "llm-token",
        PAPERCLIP_API_KEY: "agent-run-jwt",
        PAPERCLIP_RUN_ID: "run-123",
      },
    });
    expect(patchedCtx.agent.adapterConfig.promptTemplate).toContain(
      "Authorization: Bearer $PAPERCLIP_API_KEY",
    );
    expect(patchedCtx.agent.adapterConfig.promptTemplate).toContain(
      "X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID",
    );
    expect(patchedCtx.agent.adapterConfig.promptTemplate).toContain("Existing prompt");
  });
  it("preserves Hermes command normalization while injecting auth", async () => {
    const adapter = requireServerAdapter("hermes_local");
    await adapter.execute({
      runId: "run-123",
      agent: {
        id: "agent-123",
        companyId: "company-123",
        name: "Hermes Agent",
        role: "engineer",
        adapterType: "hermes_local",
        adapterConfig: {
          command: "agent-hermes",
        },
      },
      runtime: {},
      config: {
        command: "runtime-hermes",
      },
      context: {},
      onLog: async () => {},
      onMeta: async () => {},
      onSpawn: async () => {},
      authToken: "agent-run-jwt",
    });
    expect(hermesExecuteMock).toHaveBeenCalledTimes(1);
    const [patchedCtx] = hermesExecuteMock.mock.calls[0];
    expect(patchedCtx.config.hermesCommand).toBe("runtime-hermes");
    expect(patchedCtx.agent.adapterConfig.hermesCommand).toBe("agent-hermes");
    expect(patchedCtx.agent.adapterConfig.env.PAPERCLIP_API_KEY).toBe("agent-run-jwt");
  });
  it("passes the original Hermes context through when authToken is absent", async () => {
    const adapter = requireServerAdapter("hermes_local");
    const ctx = {
      runId: "run-123",
      agent: {
        id: "agent-123",
        companyId: "company-123",
        name: "Hermes Agent",
        role: "engineer",
        adapterType: "hermes_local",
        adapterConfig: {
          env: {
            PAPERCLIP_API_KEY: "server-level-key",
          },
          promptTemplate: "Existing prompt",
        },
      },
      runtime: {},
      config: {},
      context: {},
      onLog: async () => {},
      onMeta: async () => {},
      onSpawn: async () => {},
    };
    await adapter.execute(ctx);
    expect(hermesExecuteMock).toHaveBeenCalledTimes(1);
    expect(hermesExecuteMock).toHaveBeenCalledWith(ctx);
  });
  it("preserves an explicit Hermes Paperclip API key and does not set promptTemplate when none was configured", async () => {
    const adapter = requireServerAdapter("hermes_local");
    await adapter.execute({
      runId: "run-123",
      agent: {
        id: "agent-123",
        companyId: "company-123",
        name: "Hermes Agent",
        role: "engineer",
        adapterType: "hermes_local",
        adapterConfig: {
          env: {
            PAPERCLIP_API_KEY: "explicit-agent-key",
            PAPERCLIP_RUN_ID: "stale-run-id",
          },
        },
      },
      runtime: {},
      config: {},
      context: {},
      onLog: async () => {},
      onMeta: async () => {},
      onSpawn: async () => {},
      authToken: "agent-run-jwt",
    });
    const [patchedCtx] = hermesExecuteMock.mock.calls[0];
    expect(patchedCtx.agent.adapterConfig.env.PAPERCLIP_API_KEY).toBe("explicit-agent-key");
    expect(patchedCtx.agent.adapterConfig.env.PAPERCLIP_RUN_ID).toBe("run-123");
    // No custom promptTemplate was set; Hermes must use its built-in default.
    // Setting promptTemplate here would replace the full default with just the auth guard text,
    // stripping assigned issue / workflow instructions.
    expect(patchedCtx.agent.adapterConfig.promptTemplate).toBeUndefined();
  });
  it("does not set promptTemplate when no custom template is configured, preserving Hermes default", async () => {
    const adapter = requireServerAdapter("hermes_local");
    await adapter.execute({
      runId: "run-123",
      agent: {
        id: "agent-123",
        companyId: "company-123",
        name: "Hermes Agent",
        role: "engineer",
        adapterType: "hermes_local",
        adapterConfig: {},
      },
      runtime: {},
      config: {},
      context: {},
      onLog: async () => {},
      onMeta: async () => {},
      onSpawn: async () => {},
      authToken: "agent-run-jwt",
    });
    const [patchedCtx] = hermesExecuteMock.mock.calls[0];
    // promptTemplate must remain unset so Hermes uses its built-in heartbeat/task prompt.
    expect(patchedCtx.agent.adapterConfig.promptTemplate).toBeUndefined();
    // Auth token is still injected.
    expect(patchedCtx.agent.adapterConfig.env.PAPERCLIP_API_KEY).toBe("agent-run-jwt");
  });
});
describe("resolveExternalAdapterRegistration", () => {
  it("preserves module-provided sessionManagement", () => {
    const sessionManagement = {
      supportsSessionResume: true,
      nativeContextManagement: "unknown" as const,
      defaultSessionCompaction: {
        enabled: true,
        maxSessionRuns: 200,
        maxRawInputTokens: 2_000_000,
        maxSessionAgeHours: 72,
      },
    };
    const adapter: ServerAdapterModule = {
      type: "external_session_test",
      execute: async () => ({ exitCode: 0, signal: null, timedOut: false }),
      testEnvironment: async () => ({
        adapterType: "external_session_test",
        status: "pass",
        checks: [],
        testedAt: new Date(0).toISOString(),
      }),
      sessionManagement,
    };
    const resolved = resolveExternalAdapterRegistration(adapter);
    expect(resolved.sessionManagement).toBe(sessionManagement);
  });
  it("falls back to the hardcoded registry when the module omits sessionManagement", () => {
    // An external that overrides a built-in type should inherit the built-in's
    // sessionManagement when it does not provide its own.
    const adapter: ServerAdapterModule = {
      type: "claude_local",
      execute: async () => ({ exitCode: 0, signal: null, timedOut: false }),
      testEnvironment: async () => ({
        adapterType: "claude_local",
        status: "pass",
        checks: [],
        testedAt: new Date(0).toISOString(),
      }),
    };
    const resolved = resolveExternalAdapterRegistration(adapter);
    expect(resolved.sessionManagement).toBeDefined();
    expect(resolved.sessionManagement?.supportsSessionResume).toBe(true);
    expect(resolved.sessionManagement?.nativeContextManagement).toBe("confirmed");
  });
  it("leaves sessionManagement undefined when neither module nor registry provides one", () => {
    const adapter: ServerAdapterModule = {
      type: "external_unknown_test",
      execute: async () => ({ exitCode: 0, signal: null, timedOut: false }),
      testEnvironment: async () => ({
        adapterType: "external_unknown_test",
        status: "pass",
        checks: [],
        testedAt: new Date(0).toISOString(),
      }),
    };
    const resolved = resolveExternalAdapterRegistration(adapter);
    expect(resolved.sessionManagement).toBeUndefined();
  });
});