forked from farhoodlabs/paperclip
e3c875c1c7
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Heartbeats run inside managed sandboxes (E2B, Cloudflare Sandbox), and each run begins by uploading the agent's workspace as a tar archive
> - PAPA-381's E2B runs were failing at 5 and 11 minutes. Two distinct failure modes were entangled: workspace tar extraction errors on Linux, and sandbox idle/lease timeouts during normal heartbeat gaps
> - Workspace tar extraction failed because macOS bsdtar embeds `LIBARCHIVE.xattr.*` PAX headers that GNU tar on Linux rejects with "This does not look like a tar archive"; the existing `COPYFILE_DISABLE=1` only suppresses AppleDouble `._*` sidecars, not inline PAX xattr entries
> - E2B sandboxes also expired between heartbeats because `timeoutMs` defaulted to a short window and was never refreshed per execute, and Cloudflare sandboxes idled out because `sleepAfter` defaulted to 10 minutes
> - This pull request adds `--no-xattrs` to the workspace tar invocation, refreshes the E2B sandbox lifetime on each execute and bumps the default `timeoutMs` to 1h, and raises the Cloudflare `sleepAfter` default to 1h
> - The benefit is that long-running heartbeat-driven runs (Claude, Codex, etc.) survive across both their initial workspace upload and the natural idle gaps between executes on both E2B and Cloudflare

## What Changed

- `packages/adapter-utils/src/sandbox-managed-runtime.ts`: added `--no-xattrs` to `createTarballFromDirectory` so macOS bsdtar produces a clean POSIX tar that GNU tar on Linux can extract, with an inline comment explaining why `COPYFILE_DISABLE=1` alone is insufficient.
- `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: refresh the sandbox lifetime on every execute (so long runs don't expire mid-job) and raised the default `timeoutMs` to 1h.
- `packages/plugins/sandbox-providers/e2b/src/manifest.ts` and `plugin.test.ts`: updated manifest defaults and added regression coverage for the new behavior.
- `packages/plugins/sandbox-providers/cloudflare/src/config.ts`, `manifest.ts`, `plugin.test.ts`: raised default `sleepAfter` from 10m to 1h, mirroring the E2B 1h default, and added a regression test asserting the acquire-lease request body sends `sleepAfter: "1h"` when not overridden.

## Verification

- `pnpm --filter @paperclipai/plugin-e2b test`
- `pnpm --filter @paperclipai/plugin-cloudflare-sandbox test`
- Locally cherry-picked the `--no-xattrs` fix onto master and confirmed end-to-end via a real PAPA-381-style heartbeat-driven E2B run that the workspace upload now extracts cleanly on Linux. The user (board operator) tested this on master and reported "Ok, that worked."
- Manual reviewer steps: trigger an E2B heartbeat from a macOS host (this is where the bsdtar xattr headers come from), confirm the workspace tar extracts on the Linux sandbox side; run a long (>15 min) Cloudflare sandbox flow and confirm no lost-lease/idle errors between executes.

## Risks

- Low risk overall.
- `--no-xattrs` is widely supported by both macOS bsdtar and GNU tar (Linux). Worst case it silently no-ops on a future host that doesn't support it; in that case the existing failure mode reappears, but it doesn't introduce a new one.
- Raising default `timeoutMs` (E2B) and `sleepAfter` (Cloudflare) from short values to 1h means sandboxes stay alive longer between executes by default. This is the intended behavior; operators that want a tighter idle window can still override via plugin config.
- E2B per-execute sandbox lifetime refresh adds a small API call per execute; it is bounded by the same client that already handles execute traffic, so no new dependencies or retry semantics.

## Model Used

- Claude (Anthropic), `claude-opus-4-7`, extended thinking enabled, tool use enabled (file/grep/git tools and Paperclip control-plane API). Used to diagnose the dual failure mode (workspace tar PAX xattr headers + sandbox lifetime), write the fixes and tests, and drive the verification loop with the board operator.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
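
To illustrate the `--no-xattrs` change described under What Changed, here is a minimal sketch of how a tar invocation might combine both safeguards. It is not the actual `createTarballFromDirectory` code: the helper name `buildWorkspaceTarCommand`, the `TarCommand` shape, and the `-cf`/`-C` argument layout are assumptions made for this example. Only the `--no-xattrs` flag and the `COPYFILE_DISABLE=1` environment variable come from the description above.

```typescript
// Hypothetical description of a tar invocation; not part of the Paperclip codebase.
interface TarCommand {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Hypothetical helper: builds the command line for archiving a workspace directory.
function buildWorkspaceTarCommand(directory: string, archivePath: string): TarCommand {
  return {
    command: "tar",
    args: [
      // Stops macOS bsdtar from embedding LIBARCHIVE.xattr.* PAX headers,
      // which GNU tar on Linux may reject during extraction.
      "--no-xattrs",
      "-cf",
      archivePath,
      // Archive the directory contents relative to the directory itself.
      "-C",
      directory,
      ".",
    ],
    // Suppresses AppleDouble "._*" sidecar files only; it does not cover
    // inline PAX xattr entries, hence the explicit flag above.
    env: { COPYFILE_DISABLE: "1" },
  };
}

const example = buildWorkspaceTarCommand("/tmp/workspace", "/tmp/workspace.tar");
console.log([example.command, ...example.args].join(" "));
```

Returning the command as data rather than spawning it directly keeps the sketch easy to unit test; a real caller would pass `command`, `args`, and `env` to whatever process-spawning API it already uses.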
398 lines
12 KiB
TypeScript
import { beforeEach, describe, expect, it, vi } from "vitest";

const fetchMock = vi.fn();
let plugin: typeof import("./plugin.js").default;

function jsonResponse(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

function requestInitAt(index = 0): RequestInit {
  return fetchMock.mock.calls[index]?.[1] as RequestInit;
}

function requestHeadersAt(index = 0): Headers {
  return requestInitAt(index).headers as Headers;
}

function requestBodyAt(index = 0): Record<string, unknown> {
  return JSON.parse(String(requestInitAt(index).body ?? "{}")) as Record<string, unknown>;
}

describe("Cloudflare sandbox provider plugin", () => {
  beforeEach(async () => {
    fetchMock.mockReset();
    vi.stubGlobal("fetch", fetchMock);
    vi.resetModules();
    plugin = (await import("./plugin.js")).default;
  });

  it("declares the Cloudflare environment lifecycle handlers", async () => {
    expect(await plugin.definition.onHealth?.()).toEqual({
      status: "ok",
      message: "Cloudflare sandbox provider plugin healthy",
    });
    expect(plugin.definition.onEnvironmentAcquireLease).toBeTypeOf("function");
    expect(plugin.definition.onEnvironmentExecute).toBeTypeOf("function");
  });

  it("normalizes and validates Cloudflare config", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig?.({
      driverKey: "cloudflare",
      config: {
        bridgeBaseUrl: " https://bridge.example.workers.dev/ ",
        bridgeAuthToken: " secret-ref://bridge-token ",
        reuseLease: true,
        keepAlive: true,
        normalizeId: false,
        requestedCwd: " /workspace/custom ",
        sessionStrategy: "default",
        timeoutMs: "450000.9",
        bridgeRequestTimeoutMs: "40000.1",
      },
    });

    expect(result).toEqual({
      ok: true,
      normalizedConfig: {
        bridgeBaseUrl: "https://bridge.example.workers.dev/",
        bridgeAuthToken: "secret-ref://bridge-token",
        reuseLease: true,
        keepAlive: true,
        sleepAfter: "1h",
        normalizeId: false,
        requestedCwd: "/workspace/custom",
        sessionStrategy: "default",
        sessionId: "paperclip",
        timeoutMs: 450000,
        bridgeRequestTimeoutMs: 40000,
        previewHostname: null,
      },
    });
  });

  it("rejects insecure or contradictory config", async () => {
    await expect(plugin.definition.onEnvironmentValidateConfig?.({
      driverKey: "cloudflare",
      config: {
        bridgeBaseUrl: "http://bridge.example.workers.dev",
        bridgeAuthToken: "secret-ref://bridge-token",
        reuseLease: true,
        keepAlive: false,
        requestedCwd: "workspace/not-absolute",
      },
    })).resolves.toEqual({
      ok: false,
      errors: [
        "bridgeBaseUrl must use HTTPS unless it points at localhost.",
        "reuseLease requires keepAlive for Cloudflare sandboxes.",
        "requestedCwd must be an absolute POSIX path.",
      ],
    });
  });

  it("maps acquire lease responses from the bridge", async () => {
    fetchMock.mockResolvedValueOnce(
      jsonResponse({
        providerLeaseId: "pc-run-1-abcd1234",
        metadata: {
          provider: "cloudflare",
          remoteCwd: "/workspace/paperclip",
          resumedLease: false,
        },
      }),
    );

    const lease = await plugin.definition.onEnvironmentAcquireLease?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      issueId: "issue-1",
      runId: "run-1",
      requestedCwd: "/workspace/paperclip",
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
      },
    });

    expect(lease).toEqual({
      providerLeaseId: "pc-run-1-abcd1234",
      metadata: {
        provider: "cloudflare",
        remoteCwd: "/workspace/paperclip",
        resumedLease: false,
      },
    });
    expect(fetchMock).toHaveBeenCalledWith(
      "https://bridge.example.workers.dev/api/paperclip-sandbox/v1/leases/acquire",
      expect.objectContaining({
        method: "POST",
        headers: expect.any(Headers),
      }),
    );
    expect(requestHeadersAt().get("X-Paperclip-Run-Id")).toBe("run-1");
    expect(requestHeadersAt().get("X-Paperclip-Environment-Id")).toBe("env-1");
    expect(requestHeadersAt().get("X-Paperclip-Issue-Id")).toBe("issue-1");
    expect(requestBodyAt()).toMatchObject({
      environmentId: "env-1",
      runId: "run-1",
      issueId: "issue-1",
      requestedCwd: "/workspace/paperclip",
    });
  });

  it("defaults the sleepAfter passed to the bridge to 1h so long runs don't idle out", async () => {
    fetchMock.mockResolvedValueOnce(
      jsonResponse({
        providerLeaseId: "pc-run-1-abcd1234",
        metadata: { provider: "cloudflare", remoteCwd: "/workspace/paperclip", resumedLease: false },
      }),
    );

    await plugin.definition.onEnvironmentAcquireLease?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      runId: "run-1",
      requestedCwd: "/workspace/paperclip",
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
      },
    });

    expect(requestBodyAt()).toMatchObject({ sleepAfter: "1h" });
  });

  it("returns expired lease semantics when resume reports lost state", async () => {
    fetchMock.mockResolvedValueOnce(
      jsonResponse(
        {
          error: "sandbox_state_lost",
          message: "Cloudflare sandbox state is no longer available.",
        },
        409,
      ),
    );

    const lease = await plugin.definition.onEnvironmentResumeLease?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      providerLeaseId: "pc-env-env-1",
      leaseMetadata: { remoteCwd: "/workspace/paperclip" },
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
      },
    });

    expect(lease).toEqual({
      providerLeaseId: null,
      metadata: {
        provider: "cloudflare",
        expired: true,
      },
    });
  });

  it("passes bridge execute results through unchanged", async () => {
    fetchMock.mockResolvedValueOnce(
      jsonResponse({
        exitCode: 0,
        signal: null,
        timedOut: false,
        stdout: "/workspace/paperclip\n",
        stderr: "",
      }),
    );

    const result = await plugin.definition.onEnvironmentExecute?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      lease: { providerLeaseId: "pc-run-1-abcd1234", metadata: {} },
      command: "pwd",
      args: [],
      cwd: "/workspace/paperclip",
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
      },
    });

    expect(result).toEqual({
      exitCode: 0,
      signal: null,
      timedOut: false,
      stdout: "/workspace/paperclip\n",
      stderr: "",
    });
  });

  it("routes bridge-channel execute calls through a dedicated session", async () => {
    // pluginLogger must be set for the streaming branch to be reachable, so
    // we can assert that bridge-channel calls take the non-streaming path
    // even when adapter sessions would otherwise stream.
    await plugin.definition.setup?.({
      logger: { info: () => undefined, warn: () => undefined, error: () => undefined, debug: () => undefined },
    } as never);
    fetchMock.mockResolvedValueOnce(
      jsonResponse({
        exitCode: 0,
        signal: null,
        timedOut: false,
        stdout: "ok\n",
        stderr: "",
      }),
    );

    await plugin.definition.onEnvironmentExecute?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      lease: { providerLeaseId: "pc-run-1-abcd1234", metadata: {} },
      command: "sh",
      args: ["-lc", "ls"],
      cwd: "/workspace/paperclip",
      env: {
        PAPERCLIP_SANDBOX_EXEC_CHANNEL: "bridge",
        KEEP_ME: "visible",
      },
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
        sessionStrategy: "default",
        sessionId: "paperclip",
      },
    });

    expect(requestBodyAt()).toMatchObject({
      sessionStrategy: "named",
      sessionId: "paperclip-bridge",
      env: {
        KEEP_ME: "visible",
      },
    });
    expect(requestBodyAt().env).not.toHaveProperty("PAPERCLIP_SANDBOX_EXEC_CHANNEL");
    // Bridge-channel commands must use the non-streaming exec path. The
    // @cloudflare/sandbox SDK's streaming mode can drop the final stdout
    // chunk when a short shell exits the same tick it writes. Bridge ops
    // carry machine-consumed stdout (readiness JSON, base64 file payloads,
    // queue response bodies) where that data loss surfaces as opaque
    // "invalid readiness JSON" / "Invalid bridge request payload" errors.
    expect(requestBodyAt().streamOutput).toBe(false);
  });

  it("uses streaming exec for non-bridge adapter commands so live logs flow", async () => {
    // Streaming is gated on `pluginLogger` being set, which normally happens
    // in `setup()`. Wire a minimal logger so the streaming branch is reachable.
    await plugin.definition.setup?.({
      logger: { info: () => undefined, warn: () => undefined, error: () => undefined, debug: () => undefined },
    } as never);
    fetchMock.mockResolvedValueOnce(
      new Response(
        "event: stdout\ndata: {\"data\":\"hello\\n\"}\n\nevent: complete\ndata: {\"exitCode\":0,\"signal\":null,\"timedOut\":false,\"stdout\":\"hello\\n\",\"stderr\":\"\"}\n\n",
        {
          status: 200,
          headers: { "Content-Type": "text/event-stream" },
        },
      ),
    );

    await plugin.definition.onEnvironmentExecute?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      lease: { providerLeaseId: "pc-run-1-abcd1234", metadata: {} },
      command: "echo",
      args: ["hello"],
      cwd: "/workspace/paperclip",
      env: { KEEP_ME: "visible" },
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
        sessionStrategy: "named",
        sessionId: "paperclip",
      },
    });

    expect(requestBodyAt().streamOutput).toBe(true);
  });

  it("maps lost-lease execute errors into a deterministic command failure", async () => {
    fetchMock.mockResolvedValueOnce(
      jsonResponse(
        {
          error: "sandbox_state_lost",
          message: "Cloudflare sandbox state is no longer available.",
        },
        409,
      ),
    );

    const result = await plugin.definition.onEnvironmentExecute?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      lease: { providerLeaseId: "pc-run-1-abcd1234", metadata: {} },
      command: "pwd",
      args: [],
      cwd: "/workspace/paperclip",
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
      },
    });

    expect(result).toEqual({
      exitCode: 1,
      signal: null,
      timedOut: false,
      stdout: "",
      stderr: "Cloudflare sandbox state is no longer available.\n",
    });
  });

  it("wraps realizeWorkspace bridge failures and forwards the issue header", async () => {
    fetchMock.mockResolvedValueOnce(
      jsonResponse(
        {
          error: "command_failed",
          message: "mkdir: permission denied",
        },
        500,
      ),
    );

    await expect(plugin.definition.onEnvironmentRealizeWorkspace?.({
      driverKey: "cloudflare",
      companyId: "company-1",
      environmentId: "env-1",
      issueId: "issue-1",
      lease: {
        providerLeaseId: "pc-run-1-abcd1234",
        metadata: { remoteCwd: "/workspace/paperclip" },
      },
      workspace: {
        localPath: "/tmp/project",
        metadata: {
          workspaceRealizationRequest: {
            issueId: "issue-1",
          },
        },
      },
      config: {
        bridgeBaseUrl: "https://bridge.example.workers.dev",
        bridgeAuthToken: "resolved-token",
      },
    })).rejects.toThrow("Failed to prepare Cloudflare sandbox workspace at /workspace/paperclip: mkdir: permission denied");

    expect(requestHeadersAt().get("X-Paperclip-Issue-Id")).toBe("issue-1");
  });
});