Files
paperclip/server/src/__tests__/environment-run-orchestrator.test.ts
T
Devin Foley 70679a3321 Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.

## What Changed

- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.

## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly

## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00

351 lines
12 KiB
TypeScript

import { beforeEach, describe, expect, it, vi } from "vitest";
// ---------------------------------------------------------------------------
// Hoisted mocks — must be declared before any imports that reference them
// ---------------------------------------------------------------------------
const mockResolveEnvironmentExecutionTarget = vi.hoisted(() => vi.fn());
const mockAdapterExecutionTargetToRemoteSpec = vi.hoisted(() => vi.fn());
const mockBuildWorkspaceRealizationRequest = vi.hoisted(() => vi.fn());
const mockUpdateLeaseMetadata = vi.hoisted(() => vi.fn());
const mockUpdateExecutionWorkspace = vi.hoisted(() => vi.fn());
const mockLogActivity = vi.hoisted(() => vi.fn());
vi.mock("../services/environment-execution-target.js", () => ({
resolveEnvironmentExecutionTarget: mockResolveEnvironmentExecutionTarget,
resolveEnvironmentExecutionTransport: vi.fn().mockResolvedValue(null),
}));
vi.mock("@paperclipai/adapter-utils/execution-target", () => ({
adapterExecutionTargetToRemoteSpec: mockAdapterExecutionTargetToRemoteSpec,
}));
vi.mock("../services/workspace-realization.js", () => ({
buildWorkspaceRealizationRequest: mockBuildWorkspaceRealizationRequest,
}));
vi.mock("../services/environments.js", () => ({
environmentService: vi.fn(() => ({
ensureLocalEnvironment: vi.fn(),
getById: vi.fn(),
acquireLease: vi.fn(),
releaseLease: vi.fn(),
updateLeaseMetadata: mockUpdateLeaseMetadata,
})),
}));
vi.mock("../services/execution-workspaces.js", () => ({
executionWorkspaceService: vi.fn(() => ({
update: mockUpdateExecutionWorkspace,
})),
}));
vi.mock("../services/activity-log.js", () => ({
logActivity: mockLogActivity,
}));
// ---------------------------------------------------------------------------
// Imports after mocks
// ---------------------------------------------------------------------------
import {
environmentRunOrchestrator,
EnvironmentRunError,
} from "../services/environment-run-orchestrator.ts";
import type { Environment, EnvironmentLease, ExecutionWorkspace } from "@paperclipai/shared";
import type { RealizedExecutionWorkspace } from "../services/workspace-runtime.ts";
import type { EnvironmentRuntimeService } from "../services/environment-runtime.ts";
// ---------------------------------------------------------------------------
// Fixtures
// ---------------------------------------------------------------------------
function makeEnvironment(driver: string = "local"): Environment {
return {
id: "env-1",
companyId: "company-1",
name: "Test Environment",
description: null,
driver: driver as Environment["driver"],
status: "active",
config: {},
metadata: null,
createdAt: new Date(),
updatedAt: new Date(),
};
}
function makeLease(overrides: Partial<EnvironmentLease> = {}): EnvironmentLease {
return {
id: "lease-1",
companyId: "company-1",
environmentId: "env-1",
executionWorkspaceId: null,
issueId: null,
heartbeatRunId: "run-1",
status: "active",
leasePolicy: "ephemeral",
provider: "local",
providerLeaseId: null,
acquiredAt: new Date(),
lastUsedAt: new Date(),
expiresAt: null,
releasedAt: null,
failureReason: null,
cleanupStatus: null,
metadata: null,
createdAt: new Date(),
updatedAt: new Date(),
...overrides,
};
}
function makeExecutionWorkspace(cwd: string = "/workspace/project"): RealizedExecutionWorkspace {
return {
baseCwd: "/workspace",
source: "project_primary",
projectId: "project-1",
workspaceId: "ws-1",
repoUrl: null,
repoRef: null,
strategy: "project_primary",
cwd,
branchName: null,
worktreePath: null,
warnings: [],
created: false,
};
}
function makePersistedExecutionWorkspace(
overrides: Partial<ExecutionWorkspace> = {},
): ExecutionWorkspace {
return {
id: "ew-1",
companyId: "company-1",
projectId: "project-1",
projectWorkspaceId: null,
sourceIssueId: null,
mode: "standard",
strategyType: "project_primary",
name: "workspace",
status: "open",
cwd: "/workspace/project",
repoUrl: null,
baseRef: null,
branchName: null,
providerType: "local",
providerRef: null,
derivedFromExecutionWorkspaceId: null,
lastUsedAt: new Date(),
openedAt: new Date(),
closedAt: null,
cleanupEligibleAt: null,
cleanupReason: null,
config: null,
metadata: null,
createdAt: new Date(),
updatedAt: new Date(),
...overrides,
};
}
function makeRealizeInput(overrides: {
environment?: Environment;
lease?: EnvironmentLease;
persistedExecutionWorkspace?: ExecutionWorkspace | null;
} = {}): Parameters<ReturnType<typeof environmentRunOrchestrator>["realizeForRun"]>[0] {
return {
environment: overrides.environment ?? makeEnvironment("local"),
lease: overrides.lease ?? makeLease(),
adapterType: "claude_local",
companyId: "company-1",
issueId: null,
heartbeatRunId: "run-1",
executionWorkspace: makeExecutionWorkspace(),
effectiveExecutionWorkspaceMode: null,
persistedExecutionWorkspace: overrides.persistedExecutionWorkspace !== undefined
? overrides.persistedExecutionWorkspace
: null,
};
}
function makeMockRuntime(overrides: Partial<EnvironmentRuntimeService> = {}): EnvironmentRuntimeService {
return {
acquireRunLease: vi.fn(),
releaseRunLeases: vi.fn(),
realizeWorkspace: vi.fn().mockResolvedValue({
cwd: "/workspace/project",
metadata: {
workspaceRealization: {
version: 1,
driver: "local",
cwd: "/workspace/project",
},
},
}),
...overrides,
} as unknown as EnvironmentRuntimeService;
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe("environmentRunOrchestrator — realizeForRun", () => {
const mockDb = {} as any;
beforeEach(() => {
vi.clearAllMocks();
mockBuildWorkspaceRealizationRequest.mockReturnValue({
version: 1,
adapterType: "claude_local",
companyId: "company-1",
environmentId: "env-1",
executionWorkspaceId: null,
issueId: null,
heartbeatRunId: "run-1",
requestedMode: null,
source: {
kind: "project_primary",
localPath: "/workspace/project",
projectId: null,
projectWorkspaceId: null,
repoUrl: null,
repoRef: null,
strategy: "project_primary",
branchName: null,
worktreePath: null,
},
runtimeOverlay: {
provisionCommand: null,
},
});
mockAdapterExecutionTargetToRemoteSpec.mockReturnValue({
kind: "local",
environmentId: "env-1",
leaseId: "lease-1",
});
mockUpdateLeaseMetadata.mockResolvedValue(null);
mockUpdateExecutionWorkspace.mockResolvedValue(null);
mockLogActivity.mockResolvedValue(undefined);
});
it("happy path: returns lease, executionTarget, and remoteExecution on successful realization", async () => {
const executionTarget = { kind: "local", environmentId: "env-1", leaseId: "lease-1" };
const remoteExecution = { kind: "local", environmentId: "env-1", leaseId: "lease-1" };
mockResolveEnvironmentExecutionTarget.mockResolvedValue(executionTarget);
mockAdapterExecutionTargetToRemoteSpec.mockReturnValue(remoteExecution);
const runtime = makeMockRuntime();
const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
const result = await orchestrator.realizeForRun(makeRealizeInput());
expect(result.lease).toBeDefined();
expect(result.executionTarget).toEqual(executionTarget);
expect(result.remoteExecution).toEqual(remoteExecution);
expect(result.workspaceRealization).toEqual(
expect.objectContaining({ version: 1, driver: "local" }),
);
expect(runtime.realizeWorkspace).toHaveBeenCalledOnce();
expect(mockResolveEnvironmentExecutionTarget).toHaveBeenCalledOnce();
});
it("realization failure: runtime.realizeWorkspace throws → EnvironmentRunError with code workspace_realization_failed", async () => {
const runtime = makeMockRuntime({
realizeWorkspace: vi.fn().mockRejectedValue(new Error("sandbox unreachable")),
});
const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
await expect(orchestrator.realizeForRun(makeRealizeInput())).rejects.toSatisfy(
(err: unknown) =>
err instanceof EnvironmentRunError &&
err.code === "workspace_realization_failed" &&
err.environmentId === "env-1" &&
err.driver === "local",
);
expect(mockResolveEnvironmentExecutionTarget).not.toHaveBeenCalled();
});
it("target resolution failure: resolveEnvironmentExecutionTarget throws → EnvironmentRunError with code transport_resolution_failed", async () => {
mockResolveEnvironmentExecutionTarget.mockRejectedValue(new Error("network error"));
const runtime = makeMockRuntime();
const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
await expect(orchestrator.realizeForRun(makeRealizeInput())).rejects.toSatisfy(
(err: unknown) =>
err instanceof EnvironmentRunError &&
err.code === "transport_resolution_failed" &&
err.environmentId === "env-1",
);
});
it("non-sandbox driver skips workspace realization and goes straight to target resolution", async () => {
const environment = makeEnvironment("plugin" as Environment["driver"]);
const executionTarget = null;
mockResolveEnvironmentExecutionTarget.mockResolvedValue(executionTarget);
const runtime = makeMockRuntime();
const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
const result = await orchestrator.realizeForRun(
makeRealizeInput({ environment }),
);
expect(runtime.realizeWorkspace).not.toHaveBeenCalled();
expect(result.workspaceRealization).toEqual({});
expect(result.executionTarget).toBeNull();
});
it("persisted metadata is updated on lease and execution workspace after realization", async () => {
const persistedExecutionWorkspace = makePersistedExecutionWorkspace();
const updatedLease = makeLease({
metadata: { workspaceRealization: { version: 1, driver: "local", cwd: "/workspace/project" } },
});
const updatedEw = { ...persistedExecutionWorkspace, metadata: { workspaceRealizationRequest: {}, workspaceRealization: {} } };
mockUpdateLeaseMetadata.mockResolvedValue(updatedLease);
mockUpdateExecutionWorkspace.mockResolvedValue(updatedEw);
mockResolveEnvironmentExecutionTarget.mockResolvedValue({ kind: "local", environmentId: "env-1", leaseId: "lease-1" });
const runtime = makeMockRuntime();
const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });
const result = await orchestrator.realizeForRun(
makeRealizeInput({ persistedExecutionWorkspace }),
);
// Lease metadata should have been updated with workspaceRealization
expect(mockUpdateLeaseMetadata).toHaveBeenCalledOnce();
expect(mockUpdateLeaseMetadata).toHaveBeenCalledWith(
"lease-1",
expect.objectContaining({ workspaceRealization: expect.any(Object) }),
);
// Execution workspace metadata should have been updated
expect(mockUpdateExecutionWorkspace).toHaveBeenCalledOnce();
expect(mockUpdateExecutionWorkspace).toHaveBeenCalledWith(
"ew-1",
expect.objectContaining({
metadata: expect.objectContaining({
workspaceRealizationRequest: expect.any(Object),
workspaceRealization: expect.any(Object),
}),
}),
);
// The returned lease should reflect the updated value
expect(result.lease).toEqual(updatedLease);
expect(result.persistedExecutionWorkspace).toEqual(updatedEw);
});
});