forked from farhoodlabs/paperclip
Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed execution needs a first-class environment model instead of one-off adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin SDK surfaces, server runtime orchestration, and the UI environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane capability while keeping provider implementations extensible and testable.

## What Changed

- Added sandbox runtime support to the environment execution path, including runtime URL discovery, sandbox execution targeting, orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK helpers to carry sandbox provider and workspace-runtime contracts across package boundaries.
- Updated server routes and services so companies can create sandbox environments, select them for work, and execute work through the sandbox runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams, environment route guards, orchestration, and the fake provider plugin.
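
To illustrate the provider seam described under "What Changed", here is a minimal TypeScript sketch of a pluggable sandbox provider with a deterministic fake. It is not the contract added in this branch: `SandboxProvider`, `SandboxLease`, `FakeSandboxProvider`, `fakeLeaseId`, `sandboxProviders`, and `registerSandboxProvider` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical provider contract (illustrative only, not Paperclip's plugin SDK).
interface SandboxLease {
  providerLeaseId: string;
  cwd: string;
}

interface SandboxAcquireInput {
  environmentId: string;
  heartbeatRunId: string;
}

interface SandboxProvider {
  readonly key: string;
  acquire(input: SandboxAcquireInput): Promise<SandboxLease>;
  release(providerLeaseId: string): Promise<void>;
}

// Deterministic ID derivation: the same input always yields the same ID,
// so tests can assert exact values without clocks or random sources.
function fakeLeaseId(input: SandboxAcquireInput): string {
  return `fake-${input.environmentId}-${input.heartbeatRunId}`;
}

// Assumed shape of a test-oriented fake provider.
class FakeSandboxProvider implements SandboxProvider {
  readonly key = "fake";
  private readonly active = new Set<string>();

  async acquire(input: SandboxAcquireInput): Promise<SandboxLease> {
    const providerLeaseId = fakeLeaseId(input);
    this.active.add(providerLeaseId);
    return { providerLeaseId, cwd: `/sandbox/${providerLeaseId}` };
  }

  async release(providerLeaseId: string): Promise<void> {
    this.active.delete(providerLeaseId);
  }

  // Number of leases that were acquired and not yet released.
  activeCount(): number {
    return this.active.size;
  }
}

// Providers are looked up by key, so core code never imports a concrete implementation.
const sandboxProviders = new Map<string, SandboxProvider>();

function registerSandboxProvider(provider: SandboxProvider): void {
  sandboxProviders.set(provider.key, provider);
}
```

Under these assumptions, a plugin would call `registerSandboxProvider(new FakeSandboxProvider())` at load time, and the server would resolve the provider by key when a sandbox environment is selected for a run.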
## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
  - confirm sandbox provider execution does not inherit host secrets implicitly

## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime orchestration, and UI environment/workspace flows, so regressions would likely show up as cross-layer mismatches rather than isolated type errors.
- Runtime URL discovery and sandbox callback selection are sensitive to host/bind configuration; if that logic is wrong, sandbox-backed callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and test-oriented; future providers may expose capability gaps that this branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the Paperclip/Codex harness. Exact backend model ID is not exposed in-session. Tool-assisted workflow with shell execution, file editing, git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge
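
As a reviewer aid for the spot check about host secrets under "Verification", one way to keep a sandbox from inheriting host environment variables implicitly is to build its environment from an explicit allowlist. The sketch below is an assumption about how such a guard could look, not the logic in this branch; `buildSandboxEnv` and the variable names in the usage note are hypothetical.

```typescript
// Hypothetical helper: builds the sandbox environment from an explicit allowlist
// so nothing from the host leaks through unless it is named.
function buildSandboxEnv(
  hostEnv: Record<string, string | undefined>,
  allowlist: readonly string[],
  explicit: Record<string, string> = {},
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const name of allowlist) {
    const value = hostEnv[name];
    // Copy only allowlisted variables that are actually set on the host.
    if (value !== undefined) env[name] = value;
  }
  // Explicitly supplied values win over anything copied from the host.
  return { ...env, ...explicit };
}
```

For example, `buildSandboxEnv(process.env, ["PATH"], { EXAMPLE_RUN_ID: "run-1" })` would pass through `PATH` and the explicit run ID while dropping every other host variable, including credentials.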
@@ -0,0 +1,350 @@
import { beforeEach, describe, expect, it, vi } from "vitest";

// ---------------------------------------------------------------------------
// Hoisted mocks — must be declared before any imports that reference them
// ---------------------------------------------------------------------------

const mockResolveEnvironmentExecutionTarget = vi.hoisted(() => vi.fn());
const mockAdapterExecutionTargetToRemoteSpec = vi.hoisted(() => vi.fn());
const mockBuildWorkspaceRealizationRequest = vi.hoisted(() => vi.fn());
const mockUpdateLeaseMetadata = vi.hoisted(() => vi.fn());
const mockUpdateExecutionWorkspace = vi.hoisted(() => vi.fn());
const mockLogActivity = vi.hoisted(() => vi.fn());

vi.mock("../services/environment-execution-target.js", () => ({
  resolveEnvironmentExecutionTarget: mockResolveEnvironmentExecutionTarget,
  resolveEnvironmentExecutionTransport: vi.fn().mockResolvedValue(null),
}));

vi.mock("@paperclipai/adapter-utils/execution-target", () => ({
  adapterExecutionTargetToRemoteSpec: mockAdapterExecutionTargetToRemoteSpec,
}));

vi.mock("../services/workspace-realization.js", () => ({
  buildWorkspaceRealizationRequest: mockBuildWorkspaceRealizationRequest,
}));

vi.mock("../services/environments.js", () => ({
  environmentService: vi.fn(() => ({
    ensureLocalEnvironment: vi.fn(),
    getById: vi.fn(),
    acquireLease: vi.fn(),
    releaseLease: vi.fn(),
    updateLeaseMetadata: mockUpdateLeaseMetadata,
  })),
}));

vi.mock("../services/execution-workspaces.js", () => ({
  executionWorkspaceService: vi.fn(() => ({
    update: mockUpdateExecutionWorkspace,
  })),
}));

vi.mock("../services/activity-log.js", () => ({
  logActivity: mockLogActivity,
}));

// ---------------------------------------------------------------------------
// Imports after mocks
// ---------------------------------------------------------------------------

import {
  environmentRunOrchestrator,
  EnvironmentRunError,
} from "../services/environment-run-orchestrator.ts";
import type { Environment, EnvironmentLease, ExecutionWorkspace } from "@paperclipai/shared";
import type { RealizedExecutionWorkspace } from "../services/workspace-runtime.ts";
import type { EnvironmentRuntimeService } from "../services/environment-runtime.ts";

// ---------------------------------------------------------------------------
// Fixtures
// ---------------------------------------------------------------------------

function makeEnvironment(driver: string = "local"): Environment {
  return {
    id: "env-1",
    companyId: "company-1",
    name: "Test Environment",
    description: null,
    driver: driver as Environment["driver"],
    status: "active",
    config: {},
    metadata: null,
    createdAt: new Date(),
    updatedAt: new Date(),
  };
}

function makeLease(overrides: Partial<EnvironmentLease> = {}): EnvironmentLease {
  return {
    id: "lease-1",
    companyId: "company-1",
    environmentId: "env-1",
    executionWorkspaceId: null,
    issueId: null,
    heartbeatRunId: "run-1",
    status: "active",
    leasePolicy: "ephemeral",
    provider: "local",
    providerLeaseId: null,
    acquiredAt: new Date(),
    lastUsedAt: new Date(),
    expiresAt: null,
    releasedAt: null,
    failureReason: null,
    cleanupStatus: null,
    metadata: null,
    createdAt: new Date(),
    updatedAt: new Date(),
    ...overrides,
  };
}

function makeExecutionWorkspace(cwd: string = "/workspace/project"): RealizedExecutionWorkspace {
  return {
    baseCwd: "/workspace",
    source: "project_primary",
    projectId: "project-1",
    workspaceId: "ws-1",
    repoUrl: null,
    repoRef: null,
    strategy: "project_primary",
    cwd,
    branchName: null,
    worktreePath: null,
    warnings: [],
    created: false,
  };
}

function makePersistedExecutionWorkspace(
  overrides: Partial<ExecutionWorkspace> = {},
): ExecutionWorkspace {
  return {
    id: "ew-1",
    companyId: "company-1",
    projectId: "project-1",
    projectWorkspaceId: null,
    sourceIssueId: null,
    mode: "standard",
    strategyType: "project_primary",
    name: "workspace",
    status: "open",
    cwd: "/workspace/project",
    repoUrl: null,
    baseRef: null,
    branchName: null,
    providerType: "local",
    providerRef: null,
    derivedFromExecutionWorkspaceId: null,
    lastUsedAt: new Date(),
    openedAt: new Date(),
    closedAt: null,
    cleanupEligibleAt: null,
    cleanupReason: null,
    config: null,
    metadata: null,
    createdAt: new Date(),
    updatedAt: new Date(),
    ...overrides,
  };
}

function makeRealizeInput(overrides: {
  environment?: Environment;
  lease?: EnvironmentLease;
  persistedExecutionWorkspace?: ExecutionWorkspace | null;
} = {}): Parameters<ReturnType<typeof environmentRunOrchestrator>["realizeForRun"]>[0] {
  return {
    environment: overrides.environment ?? makeEnvironment("local"),
    lease: overrides.lease ?? makeLease(),
    adapterType: "claude_local",
    companyId: "company-1",
    issueId: null,
    heartbeatRunId: "run-1",
    executionWorkspace: makeExecutionWorkspace(),
    effectiveExecutionWorkspaceMode: null,
    persistedExecutionWorkspace: overrides.persistedExecutionWorkspace !== undefined
      ? overrides.persistedExecutionWorkspace
      : null,
  };
}

function makeMockRuntime(overrides: Partial<EnvironmentRuntimeService> = {}): EnvironmentRuntimeService {
  return {
    acquireRunLease: vi.fn(),
    releaseRunLeases: vi.fn(),
    realizeWorkspace: vi.fn().mockResolvedValue({
      cwd: "/workspace/project",
      metadata: {
        workspaceRealization: {
          version: 1,
          driver: "local",
          cwd: "/workspace/project",
        },
      },
    }),
    ...overrides,
  } as unknown as EnvironmentRuntimeService;
}

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

describe("environmentRunOrchestrator — realizeForRun", () => {
  const mockDb = {} as any;

  beforeEach(() => {
    vi.clearAllMocks();

    mockBuildWorkspaceRealizationRequest.mockReturnValue({
      version: 1,
      adapterType: "claude_local",
      companyId: "company-1",
      environmentId: "env-1",
      executionWorkspaceId: null,
      issueId: null,
      heartbeatRunId: "run-1",
      requestedMode: null,
      source: {
        kind: "project_primary",
        localPath: "/workspace/project",
        projectId: null,
        projectWorkspaceId: null,
        repoUrl: null,
        repoRef: null,
        strategy: "project_primary",
        branchName: null,
        worktreePath: null,
      },
      runtimeOverlay: {
        provisionCommand: null,
      },
    });

    mockAdapterExecutionTargetToRemoteSpec.mockReturnValue({
      kind: "local",
      environmentId: "env-1",
      leaseId: "lease-1",
    });

    mockUpdateLeaseMetadata.mockResolvedValue(null);
    mockUpdateExecutionWorkspace.mockResolvedValue(null);
    mockLogActivity.mockResolvedValue(undefined);
  });

  it("happy path: returns lease, executionTarget, and remoteExecution on successful realization", async () => {
    const executionTarget = { kind: "local", environmentId: "env-1", leaseId: "lease-1" };
    const remoteExecution = { kind: "local", environmentId: "env-1", leaseId: "lease-1" };

    mockResolveEnvironmentExecutionTarget.mockResolvedValue(executionTarget);
    mockAdapterExecutionTargetToRemoteSpec.mockReturnValue(remoteExecution);

    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });

    const result = await orchestrator.realizeForRun(makeRealizeInput());

    expect(result.lease).toBeDefined();
    expect(result.executionTarget).toEqual(executionTarget);
    expect(result.remoteExecution).toEqual(remoteExecution);
    expect(result.workspaceRealization).toEqual(
      expect.objectContaining({ version: 1, driver: "local" }),
    );

    expect(runtime.realizeWorkspace).toHaveBeenCalledOnce();
    expect(mockResolveEnvironmentExecutionTarget).toHaveBeenCalledOnce();
  });

  it("realization failure: runtime.realizeWorkspace throws → EnvironmentRunError with code workspace_realization_failed", async () => {
    const runtime = makeMockRuntime({
      realizeWorkspace: vi.fn().mockRejectedValue(new Error("sandbox unreachable")),
    });
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });

    await expect(orchestrator.realizeForRun(makeRealizeInput())).rejects.toSatisfy(
      (err: unknown) =>
        err instanceof EnvironmentRunError &&
        err.code === "workspace_realization_failed" &&
        err.environmentId === "env-1" &&
        err.driver === "local",
    );

    expect(mockResolveEnvironmentExecutionTarget).not.toHaveBeenCalled();
  });

  it("target resolution failure: resolveEnvironmentExecutionTarget throws → EnvironmentRunError with code transport_resolution_failed", async () => {
    mockResolveEnvironmentExecutionTarget.mockRejectedValue(new Error("network error"));

    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });

    await expect(orchestrator.realizeForRun(makeRealizeInput())).rejects.toSatisfy(
      (err: unknown) =>
        err instanceof EnvironmentRunError &&
        err.code === "transport_resolution_failed" &&
        err.environmentId === "env-1",
    );
  });

  it("non-sandbox driver skips workspace realization and goes straight to target resolution", async () => {
    const environment = makeEnvironment("plugin" as Environment["driver"]);
    const executionTarget = null;

    mockResolveEnvironmentExecutionTarget.mockResolvedValue(executionTarget);

    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });

    const result = await orchestrator.realizeForRun(
      makeRealizeInput({ environment }),
    );

    expect(runtime.realizeWorkspace).not.toHaveBeenCalled();
    expect(result.workspaceRealization).toEqual({});
    expect(result.executionTarget).toBeNull();
  });

  it("persisted metadata is updated on lease and execution workspace after realization", async () => {
    const persistedExecutionWorkspace = makePersistedExecutionWorkspace();
    const updatedLease = makeLease({
      metadata: { workspaceRealization: { version: 1, driver: "local", cwd: "/workspace/project" } },
    });
    const updatedEw = { ...persistedExecutionWorkspace, metadata: { workspaceRealizationRequest: {}, workspaceRealization: {} } };

    mockUpdateLeaseMetadata.mockResolvedValue(updatedLease);
    mockUpdateExecutionWorkspace.mockResolvedValue(updatedEw);
    mockResolveEnvironmentExecutionTarget.mockResolvedValue({ kind: "local", environmentId: "env-1", leaseId: "lease-1" });

    const runtime = makeMockRuntime();
    const orchestrator = environmentRunOrchestrator(mockDb, { environmentRuntime: runtime });

    const result = await orchestrator.realizeForRun(
      makeRealizeInput({ persistedExecutionWorkspace }),
    );

    // Lease metadata should have been updated with workspaceRealization
    expect(mockUpdateLeaseMetadata).toHaveBeenCalledOnce();
    expect(mockUpdateLeaseMetadata).toHaveBeenCalledWith(
      "lease-1",
      expect.objectContaining({ workspaceRealization: expect.any(Object) }),
    );

    // Execution workspace metadata should have been updated
    expect(mockUpdateExecutionWorkspace).toHaveBeenCalledOnce();
    expect(mockUpdateExecutionWorkspace).toHaveBeenCalledWith(
      "ew-1",
      expect.objectContaining({
        metadata: expect.objectContaining({
          workspaceRealizationRequest: expect.any(Object),
          workspaceRealization: expect.any(Object),
        }),
      }),
    );

    // The returned lease should reflect the updated value
    expect(result.lease).toEqual(updatedLease);
    expect(result.persistedExecutionWorkspace).toEqual(updatedEw);
  });
});