paperclip/server/src/__tests__/environment-run-orchestrator.test.ts
Devin Foley 90631b09b3 Let adapters declare runtime command spec for remote provisioning (#5141)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, running adapter commands like `claude`, `codex`, and `pi` either locally or on remote runtimes (SSH hosts, sandboxes, etc.).
> - On a fresh remote runtime, particularly an ephemeral sandbox, the adapter's CLI may not be installed yet. Today operators handle this via external configuration (e.g. a project-level `provisionCommand` shell script) that has to know about every adapter the operator might want to use.
> - Every adapter has its own well-known npm package, yet operators end up writing duplicate provision shell scripts that paste together `npm install -g @anthropic-ai/claude-code`, `npm install -g @openai/codex`, etc., which is knowledge the adapter itself already has.
> - This PR moves that knowledge into the adapter modules: each adapter declares how its runtime command should be detected and (if applicable) installed via `getRuntimeCommandSpec(config)`. The execution path runs the adapter's own install command on remote sandbox targets before launching, so a fresh sandbox bootstraps itself instead of requiring a hand-written provision script.
> - The benefit is fewer footguns for operators provisioning remote runtimes, and a clean place for new adapters to plug in their install recipe.

## What Changed

- New types in `packages/adapter-utils/src/types.ts`:
    - `AdapterRuntimeCommandSpec` describing `command`, optional `detectCommand`, and optional `installCommand`
    - Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule`
    - Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters receive the resolved spec at execute time
- New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)` in `packages/adapter-utils/src/execution-target.ts` that runs the install command on remote targets when `transport === "sandbox"`. SSH and local targets are no-ops. Throws on timeout or non-zero exit so failures surface early.
- The `execute.ts` of each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`, `opencode-local`, and `pi-local` now reads `ctx.runtimeCommandSpec?.installCommand` and calls the helper before launching the adapter command.
- `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for each adapter:
    - claude/codex/gemini/opencode/pi-local: `npm install -g <package>` recipe via a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard that only auto-installs when the configured `command` matches the well-known fallback (custom binaries are left alone).
    - cursor-local: declares `command` only; no auto-install (no public npm package), preserving the existing manual setup.
- `server/src/services/heartbeat.ts` resolves the spec via `adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through to `AdapterExecutionContext`.
- Tests added in `execution-target.test.ts` (~75 lines), e2b `plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts` (~76 lines).
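
The spec shape and the npm recipe listed above could look roughly like the following. This is a minimal sketch, not the code from this PR: the field types, the `buildNpmRuntimeCommandSpecSketch` name and its signature, and the exact `detectCommand` string are assumptions made for illustration. Only the field names, the fallback-command guard, and the idempotent install shape come from this description.

```typescript
// Shape named in the PR description; plain-string field types are an assumption.
interface AdapterRuntimeCommandSpec {
  command: string;
  detectCommand?: string;
  installCommand?: string;
}

// Hypothetical stand-in for the shared `buildNpmRuntimeCommandSpec` helper.
function buildNpmRuntimeCommandSpecSketch(
  configuredCommand: string | undefined,
  fallbackCommand: string,
  npmPackage: string,
): AdapterRuntimeCommandSpec {
  const command = configuredCommand?.trim() || fallbackCommand;

  // Defensive guard: a custom binary is left alone, so no install recipe is attached.
  if (command !== fallbackCommand) {
    return { command };
  }

  const detectCommand = `command -v ${fallbackCommand} >/dev/null 2>&1`;
  return {
    command,
    detectCommand,
    // Idempotent: npm only runs when the binary is missing.
    installCommand: `if ! ${detectCommand}; then npm install -g ${npmPackage}; fi`,
  };
}

const claudeSpec = buildNpmRuntimeCommandSpecSketch(
  undefined,
  "claude",
  "@anthropic-ai/claude-code",
);
const customSpec = buildNpmRuntimeCommandSpecSketch(
  "/opt/bin/my-claude", // hypothetical custom binary path
  "claude",
  "@anthropic-ai/claude-code",
);

console.log(claudeSpec.installCommand);
console.log(customSpec.installCommand);
```

With the default command the sketch yields an install recipe; with a custom binary path `installCommand` stays undefined, which is the "custom binaries are left alone" behaviour described above.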

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test -- environment-run-orchestrator`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: run an adapter (claude/codex/etc.) against a fresh sandbox-backed environment that does NOT have the adapter CLI pre-installed. Confirm the install runs once at the start of the agent run and the adapter then launches successfully. Re-run on the same sandbox; confirm the install command is idempotent and the second run starts faster.
- Confirm SSH and local execution paths are unaffected (gated by
  `transport === "sandbox"`).
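
The gating and failure rules being verified here can be expressed as two small pure functions. This is a sketch under assumptions: `TargetSketch`, `shouldAutoInstall`, and `assertInstallSucceeded` are hypothetical names, the target type is reduced to the fields this description mentions, and the result shape mirrors the mocked runtime result used in the test file rather than the helper's real types.

```typescript
// Result shape mirrors the mocked `execute` result in the test file (an assumption
// about what the real helper receives).
interface CommandResult {
  exitCode: number | null;
  signal: string | null;
  timedOut: boolean;
  stdout: string;
  stderr: string;
}

// Simplified target; the real execution-target type carries more fields.
type TargetSketch =
  | { kind: "local" }
  | { kind: "remote"; transport: "ssh" | "sandbox" };

// Only remote sandbox targets with a declared install command trigger an install.
function shouldAutoInstall(target: TargetSketch, installCommand?: string): boolean {
  return (
    Boolean(installCommand) &&
    target.kind === "remote" &&
    target.transport === "sandbox"
  );
}

// Throws on timeout or non-zero exit so the failure surfaces before the adapter launches.
function assertInstallSucceeded(result: CommandResult): void {
  if (result.timedOut) {
    throw new Error("runtime command install timed out");
  }
  if (result.exitCode !== 0) {
    throw new Error(
      `runtime command install failed (exit ${result.exitCode}): ${result.stderr.trim()}`,
    );
  }
}
```

Keeping the decision separate from the command execution makes the "SSH and local are no-ops" claim checkable without a live sandbox.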

## Risks

- Behavioural shift on sandbox runs: a new install step now runs at the start of every sandbox agent run for adapters with `installCommand` set. The install commands are idempotent (`if ! command -v X >/dev/null 2>&1; then npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold sandbox, the first run takes longer.
- Operators who used the legacy project-level `provisionCommand` to install adapter CLIs can drop that part of their script; the adapter handles it now. Existing scripts continue to work because installs are idempotent.
- The cursor-local adapter has no auto-install (no public npm package). Behaviour for cursor-local on sandboxes is unchanged.
- New optional surface on `ServerAdapterModule`. Plugins that don't implement `getRuntimeCommandSpec` retain previous behaviour (no auto-install).
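
The backward-compatibility point in the last bullet follows from optional chaining on the new hook. The sketch below illustrates that pattern only: `AdapterModuleSketch`, `resolveInstallCommand`, the adapter type strings, and the `example-cli` command are all hypothetical, and whether the real hook may return `null` is an assumption.

```typescript
interface RuntimeCommandSpecSketch {
  command: string;
  installCommand?: string;
}

// Only the optional hook is modelled; the real `ServerAdapterModule` has more members.
interface AdapterModuleSketch {
  type: string;
  getRuntimeCommandSpec?: (
    config: Record<string, unknown>,
  ) => RuntimeCommandSpecSketch | null;
}

// Returns the install command to run, or null when nothing should be installed.
function resolveInstallCommand(
  adapter: AdapterModuleSketch,
  runtimeConfig: Record<string, unknown>,
): string | null {
  // Adapters without the hook short-circuit to null here.
  const spec = adapter.getRuntimeCommandSpec?.(runtimeConfig) ?? null;
  return spec?.installCommand ?? null;
}

// A plugin written before this change: no hook, so no auto-install.
const legacyPlugin: AdapterModuleSketch = { type: "legacy_plugin" };

// An adapter that declares `command` only, as described for cursor-local.
const commandOnly: AdapterModuleSketch = {
  type: "command_only_adapter",
  getRuntimeCommandSpec: () => ({ command: "example-cli" }),
};

// An adapter that declares a full install recipe.
const withInstall: AdapterModuleSketch = {
  type: "npm_adapter",
  getRuntimeCommandSpec: () => ({
    command: "example-cli",
    installCommand:
      "if ! command -v example-cli >/dev/null 2>&1; then npm install -g example-cli; fi",
  }),
};

console.log(resolveInstallCommand(legacyPlugin, {}));
console.log(resolveInstallCommand(commandOnly, {}));
console.log(resolveInstallCommand(withInstall, {}));
```

Both the hook-less plugin and the command-only adapter resolve to `null`, so only adapters that opt in with an `installCommand` change behaviour on sandbox runs.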

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots (N/A)
- [ ] I have updated relevant documentation to reflect my changes (N/A)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
2026-05-03 18:35:36 -07:00


import { beforeEach, describe, expect, it, vi } from "vitest";
// ---------------------------------------------------------------------------
// Hoisted mocks — must be declared before any imports that reference them
// ---------------------------------------------------------------------------
const mockResolveEnvironmentExecutionTarget = vi.hoisted(() => vi.fn());
const mockAdapterExecutionTargetToRemoteSpec = vi.hoisted(() => vi.fn());
const mockBuildWorkspaceRealizationRequest = vi.hoisted(() => vi.fn());
const mockUpdateLeaseMetadata = vi.hoisted(() => vi.fn());
const mockUpdateExecutionWorkspace = vi.hoisted(() => vi.fn());
const mockLogActivity = vi.hoisted(() => vi.fn());
vi.mock("../services/environment-execution-target.js", () => ({
  resolveEnvironmentExecutionTarget: mockResolveEnvironmentExecutionTarget,
  resolveEnvironmentExecutionTransport: vi.fn().mockResolvedValue(null),
}));

vi.mock("@paperclipai/adapter-utils/execution-target", () => ({
  adapterExecutionTargetToRemoteSpec: mockAdapterExecutionTargetToRemoteSpec,
}));

vi.mock("../services/workspace-realization.js", () => ({
  buildWorkspaceRealizationRequest: mockBuildWorkspaceRealizationRequest,
}));

vi.mock("../services/environments.js", () => ({
  environmentService: vi.fn(() => ({
    ensureLocalEnvironment: vi.fn(),
    getById: vi.fn(),
    acquireLease: vi.fn(),
    releaseLease: vi.fn(),
    updateLeaseMetadata: mockUpdateLeaseMetadata,
  })),
}));

vi.mock("../services/execution-workspaces.js", () => ({
  executionWorkspaceService: vi.fn(() => ({
    update: mockUpdateExecutionWorkspace,
  })),
}));

vi.mock("../services/activity-log.js", () => ({
  logActivity: mockLogActivity,
}));
// ---------------------------------------------------------------------------
// Imports after mocks
// ---------------------------------------------------------------------------
import {
  environmentRunOrchestrator,
  EnvironmentRunError,
} from "../services/environment-run-orchestrator.ts";
import type { Environment, EnvironmentLease, ExecutionWorkspace } from "@paperclipai/shared";
import type { RealizedExecutionWorkspace } from "../services/workspace-runtime.ts";
import type { EnvironmentRuntimeService } from "../services/environment-runtime.ts";
// ---------------------------------------------------------------------------
// Fixtures
// ---------------------------------------------------------------------------
function makeEnvironment(driver: string = "local"): Environment {
  return {
    id: "env-1",
    companyId: "company-1",
    name: "Test Environment",
    description: null,
    driver: driver as Environment["driver"],
    status: "active",
    config: {},
    metadata: null,
    createdAt: new Date(),
    updatedAt: new Date(),
  };
}

function makeLease(overrides: Partial<EnvironmentLease> = {}): EnvironmentLease {
  return {
    id: "lease-1",
    companyId: "company-1",
    environmentId: "env-1",
    executionWorkspaceId: null,
    issueId: null,
    heartbeatRunId: "run-1",
    status: "active",
    leasePolicy: "ephemeral",
    provider: "local",
    providerLeaseId: null,
    acquiredAt: new Date(),
    lastUsedAt: new Date(),
    expiresAt: null,
    releasedAt: null,
    failureReason: null,
    cleanupStatus: null,
    metadata: null,
    createdAt: new Date(),
    updatedAt: new Date(),
    ...overrides,
  };
}
function makeExecutionWorkspace(cwd: string = "/workspace/project"): RealizedExecutionWorkspace {
  return {
    baseCwd: "/workspace",
    source: "project_primary",
    projectId: "project-1",
    workspaceId: "ws-1",
    repoUrl: null,
    repoRef: null,
    strategy: "project_primary",
    cwd,
    branchName: null,
    worktreePath: null,
    warnings: [],
    created: false,
  };
}

function makePersistedExecutionWorkspace(
  overrides: Partial<ExecutionWorkspace> = {},
): ExecutionWorkspace {
  return {
    id: "ew-1",
    companyId: "company-1",
    projectId: "project-1",
    projectWorkspaceId: null,
    sourceIssueId: null,
    mode: "standard",
    strategyType: "project_primary",
    name: "workspace",
    status: "open",
    cwd: "/workspace/project",
    repoUrl: null,
    baseRef: null,
    branchName: null,
    providerType: "local",
    providerRef: null,
    derivedFromExecutionWorkspaceId: null,
    lastUsedAt: new Date(),
    openedAt: new Date(),
    closedAt: null,
    cleanupEligibleAt: null,
    cleanupReason: null,
    config: null,
    metadata: null,
    createdAt: new Date(),
    updatedAt: new Date(),
    ...overrides,
  };
}
function makeRealizeInput(overrides: {
  environment?: Environment;
  lease?: EnvironmentLease;
  persistedExecutionWorkspace?: ExecutionWorkspace | null;
} = {}): Parameters<ReturnType<typeof environmentRunOrchestrator>["realizeForRun"]>[0] {
  return {
    environment: overrides.environment ?? makeEnvironment("local"),
    lease: overrides.lease ?? makeLease(),
    adapterType: "claude_local",
    companyId: "company-1",
    issueId: null,
    heartbeatRunId: "run-1",
    executionWorkspace: makeExecutionWorkspace(),
    effectiveExecutionWorkspaceMode: null,
    persistedExecutionWorkspace: overrides.persistedExecutionWorkspace !== undefined
      ? overrides.persistedExecutionWorkspace
      : null,
  };
}

function makeMockRuntime(overrides: Partial<EnvironmentRuntimeService> = {}): EnvironmentRuntimeService {
  return {
    acquireRunLease: vi.fn(),
    releaseRunLeases: vi.fn(),
    execute: vi.fn().mockResolvedValue({
      exitCode: 0,
      signal: null,
      timedOut: false,
      stdout: "",
      stderr: "",
    }),
    realizeWorkspace: vi.fn().mockResolvedValue({
      cwd: "/workspace/project",
      metadata: {
        workspaceRealization: {
          version: 1,
          driver: "local",
          cwd: "/workspace/project",
        },
      },
    }),
    ...overrides,
  } as unknown as EnvironmentRuntimeService;
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe("environmentRunOrchestrator — realizeForRun", () => {
  const mockDb = {} as any;

  beforeEach(() => {
    vi.clearAllMocks();
    mockBuildWorkspaceRealizationRequest.mockReturnValue({
      version: 1,
      adapterType: "claude_local",
      companyId: "company-1",
      environmentId: "env-1",
      executionWorkspaceId: null,
      issueId: null,
      heartbeatRunId: "run-1",
      requestedMode: null,
      source: {
        kind: "project_primary",
        localPath: "/workspace/project",
        projectId: null,
        projectWorkspaceId: null,
        repoUrl: null,
        repoRef: null,
        strategy: "project_primary",
        branchName: null,
        worktreePath: null,
      },
      runtimeOverlay: {
        provisionCommand: null,
      },
    });
    mockAdapterExecutionTargetToRemoteSpec.mockReturnValue({
      kind: "local",
      environmentId: "env-1",
      leaseId: "lease-1",
    });
    mockUpdateLeaseMetadata.mockResolvedValue(null);
    mockUpdateExecutionWorkspace.mockResolvedValue(null);
    mockLogActivity.mockResolvedValue(undefined);
  });

  it("happy path: returns lease, executionTarget, and remoteExecution on successful realization", async () => {
    const executionTarget = { kind: "local", environmentId: "env-1", leaseId: "lease-1" };
    const remoteExecution = { kind: "local", environmentId: "env-1", leaseId: "lease-1" };
    mockResolveEnvironmentExecutionTarget.mockResolvedValue(executionTarget);
    mockAdapterExecutionTargetToRemoteSpec.mockReturnValue(remoteExecution);
    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    const result = await orchestrator.realizeForRun(makeRealizeInput());
    expect(result.lease).toBeDefined();
    expect(result.executionTarget).toEqual(executionTarget);
    expect(result.remoteExecution).toEqual(remoteExecution);
    expect(result.workspaceRealization).toEqual(
      expect.objectContaining({ version: 1, driver: "local" }),
    );
    expect(runtime.realizeWorkspace).toHaveBeenCalledOnce();
    expect(mockResolveEnvironmentExecutionTarget).toHaveBeenCalledOnce();
  });

  it("realization failure: runtime.realizeWorkspace throws → EnvironmentRunError with code workspace_realization_failed", async () => {
    const runtime = makeMockRuntime({
      realizeWorkspace: vi.fn().mockRejectedValue(new Error("sandbox unreachable")),
    });
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    await expect(orchestrator.realizeForRun(makeRealizeInput())).rejects.toSatisfy(
      (err: unknown) =>
        err instanceof EnvironmentRunError &&
        err.code === "workspace_realization_failed" &&
        err.environmentId === "env-1" &&
        err.driver === "local",
    );
    expect(mockResolveEnvironmentExecutionTarget).not.toHaveBeenCalled();
  });

  it("target resolution failure: resolveEnvironmentExecutionTarget throws → EnvironmentRunError with code transport_resolution_failed", async () => {
    mockResolveEnvironmentExecutionTarget.mockRejectedValue(new Error("network error"));
    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    await expect(orchestrator.realizeForRun(makeRealizeInput())).rejects.toSatisfy(
      (err: unknown) =>
        err instanceof EnvironmentRunError &&
        err.code === "transport_resolution_failed" &&
        err.environmentId === "env-1",
    );
  });

  it("non-sandbox driver skips workspace realization and goes straight to target resolution", async () => {
    const environment = makeEnvironment("plugin" as Environment["driver"]);
    const executionTarget = null;
    mockResolveEnvironmentExecutionTarget.mockResolvedValue(executionTarget);
    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    const result = await orchestrator.realizeForRun(
      makeRealizeInput({ environment }),
    );
    expect(runtime.realizeWorkspace).not.toHaveBeenCalled();
    expect(result.workspaceRealization).toEqual({});
    expect(result.executionTarget).toBeNull();
  });

  it("persisted metadata is updated on lease and execution workspace after realization", async () => {
    const persistedExecutionWorkspace = makePersistedExecutionWorkspace();
    const updatedLease = makeLease({
      metadata: { workspaceRealization: { version: 1, driver: "local", cwd: "/workspace/project" } },
    });
    const updatedEw = {
      ...persistedExecutionWorkspace,
      metadata: { workspaceRealizationRequest: {}, workspaceRealization: {} },
    };
    mockUpdateLeaseMetadata.mockResolvedValue(updatedLease);
    mockUpdateExecutionWorkspace.mockResolvedValue(updatedEw);
    mockResolveEnvironmentExecutionTarget.mockResolvedValue({ kind: "local", environmentId: "env-1", leaseId: "lease-1" });
    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    const result = await orchestrator.realizeForRun(
      makeRealizeInput({ persistedExecutionWorkspace }),
    );
    // Lease metadata should have been updated with workspaceRealization
    expect(mockUpdateLeaseMetadata).toHaveBeenCalledOnce();
    expect(mockUpdateLeaseMetadata).toHaveBeenCalledWith(
      "lease-1",
      expect.objectContaining({ workspaceRealization: expect.any(Object) }),
    );
    // Execution workspace metadata should have been updated
    expect(mockUpdateExecutionWorkspace).toHaveBeenCalledOnce();
    expect(mockUpdateExecutionWorkspace).toHaveBeenCalledWith(
      "ew-1",
      expect.objectContaining({
        metadata: expect.objectContaining({
          workspaceRealizationRequest: expect.any(Object),
          workspaceRealization: expect.any(Object),
        }),
      }),
    );
    // The returned lease should reflect the updated value
    expect(result.lease).toEqual(updatedLease);
    expect(result.persistedExecutionWorkspace).toEqual(updatedEw);
  });

  it("runs a remote provision command after workspace realization when configured", async () => {
    mockBuildWorkspaceRealizationRequest.mockReturnValue({
      version: 1,
      adapterType: "claude_local",
      companyId: "company-1",
      environmentId: "env-1",
      executionWorkspaceId: null,
      issueId: null,
      heartbeatRunId: "run-1",
      requestedMode: null,
      source: {
        kind: "project_primary",
        localPath: "/workspace/project",
        projectId: null,
        projectWorkspaceId: null,
        repoUrl: null,
        repoRef: null,
        strategy: "project_primary",
        branchName: null,
        worktreePath: null,
      },
      runtimeOverlay: {
        provisionCommand: "npm install -g @anthropic-ai/claude-code",
      },
    });
    mockResolveEnvironmentExecutionTarget.mockResolvedValue({
      kind: "remote",
      transport: "sandbox",
      providerKey: "e2b",
      remoteCwd: "/remote/workspace",
      environmentId: "env-1",
      leaseId: "lease-1",
    });
    const runtime = makeMockRuntime({
      realizeWorkspace: vi.fn().mockResolvedValue({
        cwd: "/remote/workspace",
        metadata: {
          workspaceRealization: {
            version: 1,
            transport: "sandbox",
            remote: { path: "/remote/workspace" },
          },
        },
      }),
    });
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    await orchestrator.realizeForRun(makeRealizeInput({
      environment: makeEnvironment("sandbox"),
    }));
    expect(runtime.execute).toHaveBeenCalledOnce();
    expect(runtime.execute).toHaveBeenCalledWith(expect.objectContaining({
      environment: expect.objectContaining({ driver: "sandbox" }),
      lease: expect.objectContaining({ id: "lease-1" }),
      command: "bash",
      args: ["-lc", "npm install -g @anthropic-ai/claude-code"],
      cwd: "/remote/workspace",
      env: {
        SHELL: "/bin/bash",
      },
    }));
  });

  it("runs project-level provision commands for ssh environments", async () => {
    mockBuildWorkspaceRealizationRequest.mockReturnValue({
      version: 1,
      adapterType: "gemini_local",
      companyId: "company-1",
      environmentId: "env-1",
      executionWorkspaceId: null,
      issueId: null,
      heartbeatRunId: "run-1",
      requestedMode: null,
      source: {
        kind: "project_primary",
        localPath: "/workspace/project",
        projectId: null,
        projectWorkspaceId: null,
        repoUrl: null,
        repoRef: null,
        strategy: "project_primary",
        branchName: null,
        worktreePath: null,
      },
      runtimeOverlay: {
        provisionCommand: "npm install -g @google/gemini-cli",
      },
    });
    mockResolveEnvironmentExecutionTarget.mockResolvedValue({
      kind: "remote",
      transport: "ssh",
      remoteCwd: "/remote/workspace",
      environmentId: "env-1",
      leaseId: "lease-1",
      spec: {
        host: "ssh.example.test",
        port: 22,
        username: "ssh-user",
        remoteCwd: "/remote/workspace",
        remoteWorkspacePath: "/remote/workspace",
        privateKey: null,
        knownHosts: null,
        strictHostKeyChecking: true,
      },
    });
    const runtime = makeMockRuntime({
      realizeWorkspace: vi.fn().mockResolvedValue({
        cwd: "/remote/workspace",
        metadata: {
          workspaceRealization: {
            version: 1,
            transport: "ssh",
            remote: { path: "/remote/workspace" },
          },
        },
      }),
    });
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    await orchestrator.realizeForRun(makeRealizeInput({
      environment: makeEnvironment("ssh"),
      lease: makeLease({
        provider: "ssh",
        metadata: {
          driver: "ssh",
          remoteCwd: "/remote/workspace",
          remoteWorkspacePath: "/remote/workspace",
          host: "ssh.example.test",
          port: 22,
          username: "ssh-user",
        },
      }),
    }));
    expect(runtime.execute).toHaveBeenCalledWith(expect.objectContaining({
      command: "bash",
      args: ["-lc", "npm install -g @google/gemini-cli"],
    }));
    expect(mockResolveEnvironmentExecutionTarget).toHaveBeenCalledOnce();
  });

  it("surfaces remote provision command failures before resolving the adapter target", async () => {
    mockBuildWorkspaceRealizationRequest.mockReturnValue({
      version: 1,
      adapterType: "claude_local",
      companyId: "company-1",
      environmentId: "env-1",
      executionWorkspaceId: null,
      issueId: null,
      heartbeatRunId: "run-1",
      requestedMode: null,
      source: {
        kind: "project_primary",
        localPath: "/workspace/project",
        projectId: null,
        projectWorkspaceId: null,
        repoUrl: null,
        repoRef: null,
        strategy: "project_primary",
        branchName: null,
        worktreePath: null,
      },
      runtimeOverlay: {
        provisionCommand: "install-tool",
      },
    });
    const runtime = makeMockRuntime({
      execute: vi.fn().mockResolvedValue({
        exitCode: 127,
        signal: null,
        timedOut: false,
        stdout: "",
        stderr: "/bin/sh: install-tool: not found\n",
      }),
    });
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
    await expect(orchestrator.realizeForRun(makeRealizeInput({
      environment: makeEnvironment("sandbox"),
    }))).rejects.toSatisfy(
      (err: unknown) =>
        err instanceof EnvironmentRunError &&
        err.code === "workspace_realization_failed" &&
        String(err.message).includes("install-tool: not found"),
    );
    expect(mockResolveEnvironmentExecutionTarget).not.toHaveBeenCalled();
  });
});