forked from farhoodlabs/paperclip
486fb88a1515f0d301295854db0250f75ccfb098
4 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
486fb88a15 |
Add Cloudflare sandbox provider plugin (#5687)
> _Stacked on top of #5685 → #5686. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Extend sandbox callback bridge for Worker-hosted plugins` + `Add Cloudflare sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose which provider backs that sandbox — today E2B and Daytona are bundled with the platform > - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a credible new option: globally distributed, cheap idle, and operator-deployable as a single Worker > - To plug it in, Paperclip needs (a) a provider plugin that speaks the `PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the Cloudflare Sandbox SDK > - The plugin extends the existing sandbox-callback-bridge with a `bridge.transport: "worker"` discriminator so the platform routes runtime RPCs through the Worker bridge instead of the in-process runner > - This pull request adds the plugin, the bridge Worker template, and the supporting adapter-utils + server hooks the new transport needs > - The benefit is that operators can run sandboxes on Cloudflare's edge with no new platform code beyond installing the plugin and deploying the Worker ## What Changed **Shared support (`Extend sandbox callback bridge for Worker-hosted plugins`):** - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: expose `expectedHostHeader` so plugin-side bridge clients can verify the canonical request envelope before forwarding. - `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`: relax the always-fresh runner construction so callers can re-use a runner across exec calls (Worker-hosted bridges hold the runner inside a Durable Object). - `server/src/services/environment-runtime.ts` + `environment-runtime.test.ts`: route Worker-hosted bridges through the same env-shaping path as E2B and pin the `requestEnv` contract. - `server/src/services/plugin-environment-driver.ts`: thread an optional `issueId` through the runtime descriptor so bridges can scope leases to the originating issue (used by Cloudflare to map a sandbox to the issue/workflow for billing and audit). - `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to `PluginEnvironmentDriverBaseParams` and the new `bridge.transport: "worker"` discriminator that the new plugin declares. - `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the heartbeat path against the new runtime descriptor. **The Cloudflare plugin itself (`Add Cloudflare sandbox provider plugin`):** - `packages/plugins/sandbox-providers/cloudflare/`: plugin entry, manifest, plugin runtime (lifecycle + bridge client), config parsing, and Vitest coverage. Manifest declares `bridge.transport: "worker"` so the platform routes runtime RPCs through the bridge client. - `bridge-template/`: a Worker template the operator deploys with `wrangler`. Owns Durable Object-backed sessions (`sessions.ts`), exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer (`auth.ts`) that pins the `Host` header surface. Includes the SDK-contract-correct exec implementation, lease recovery, and chunked stdout/stderr streaming. - Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`, `routes.test.ts`), bridge client request shaping (`src/bridge-client.test.ts`), and end-to-end plugin behavior (`src/plugin.test.ts`) including streamed exec output. 27 tests in total. - `README.md` walks the operator through deploying the bridge Worker, registering the plugin, and configuring the runtime. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/sandbox-callback-bridge.test.ts packages/adapter-utils/src/command-managed-runtime.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/heartbeat-plugin-environment.test.ts` - `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27 passing For an operator-side smoke test: 1. Deploy the bridge: `cd packages/plugins/sandbox-providers/cloudflare/bridge-template && wrangler deploy` 2. Register the plugin in your Paperclip instance, point its bridge URL at the deployed Worker, set the HMAC shared secret. 3. Create a sandbox environment whose provider is `cloudflare`, then run a Codex or Claude job against it. ## Risks - Adds a new `bridge.transport: "worker"` code path, but the existing E2B / Daytona transports go through the same shaped helpers and have explicit test coverage that pins their behavior unchanged. - The Worker bridge stores session state in a Durable Object; operator instances must be aware of the corresponding Cloudflare costs (DO requests, storage). Documented in the README. - The `issueId` plumbing is optional throughout — existing plugins that don't supply it continue to work. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes (plugin README, bridge-template README) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
29401b231b |
fix(ci): gate new release packages on npm bootstrap (#5146)
## Thinking Path > - Paperclip is a control plane for autonomous agent companies, so its release automation is part of the core operator trust boundary. > - The affected subsystem is npm/GitHub Actions release publishing for the public monorepo packages. > - The concrete failure was that a newly added package reached `master`, the canary workflow attempted its first publish, and npm trusted publishing was not yet bootstrapped for that package. > - That means the problem is not just one broken run; it is a missing pre-merge guard that lets release-ineligible packages land and only fail once `publish_canary` runs. > - This pull request makes release enrollment explicit, validates that enrollment in CI, and adds a PR-time bootstrap check against npm for changed release-enabled package manifests. > - The result is that we keep trusted publishing, avoid teaching CI to `npm adduser`, and move this class of failure from post-merge canary time to pre-merge review time. ## What Changed - Added `scripts/release-package-manifest.json` so release-managed public packages are explicitly enrolled instead of being inferred from every non-private workspace package. - Hardened `scripts/release-package-map.mjs` to validate the manifest before release workflows rewrite versions or assemble publish payloads. - Added `scripts/check-release-package-bootstrap.mjs` and wired it into `.github/workflows/pr.yml` so PRs that change a release-enabled package manifest fail if that package does not already exist on npm. - Added release-package manifest coverage tests to `scripts/release-package-map.test.mjs` and included them in `pnpm run test:release-registry`. - Wired manifest validation into `.github/workflows/release.yml` and documented the first-publish bootstrap policy in `doc/PUBLISHING.md` and `doc/RELEASE-AUTOMATION-SETUP.md`. ## Verification - `pnpm run test:release-registry` - `./scripts/release.sh canary --skip-verify --dry-run` - Confirmed the committed diff contains no obvious PII/secrets via targeted pattern scan before pushing. ## Risks - Low risk overall: this is CI/release-policy code, not product runtime logic. - The new PR bootstrap check depends on npm metadata availability, so a transient npm outage could block a PR that changes a release-enabled package manifest. - The manifest introduces a new source of truth that must stay aligned with public package additions, but that is intentional and now enforced. ## Model Used - OpenAI Codex via the `codex_local` Paperclip adapter; GPT-5-based coding agent with tool use, terminal execution, git, and GitHub CLI. Exact served model ID/context window are not exposed by the local runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
0e51fa2b0d |
Honor reuse-existing preference and assignee default environment in issue runs (#5139)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run inside execution workspaces (a per-issue cwd + env), and an issue > can prefer to reuse an existing workspace or get a fresh one each time > - The heartbeat service was reading the existing workspace's config to derive > environment selection regardless of whether the issue actually wanted to reuse > it. So fresh-run issues were inheriting stale config from a workspace that was > about to be discarded > - Separately, when an issue is assigned to an agent, the issue's execution > workspace settings weren't picking up the agent's `defaultEnvironmentId`, > even though the agent's choice is the natural default for that issue > - This PR makes both selection paths honor the obvious source of truth: > workspace config flows only when the issue actually wants `reuse_existing`, > and the assignee agent's default environment is applied at assignment time if > nothing else is set on the issue > - The benefit is that re-running a flaky issue picks up the right environment > instead of inheriting the previous run's config, and assigning an agent to an > issue does the obvious thing without operator intervention ## What Changed - `server/src/services/heartbeat.ts`: introduce `reusableExecutionWorkspaceConfig` that is non-null only when `shouldReuseExisting` is true. Both `resolveExecutionWorkspaceEnvironmentId(...)` and `applyPersistedExecutionWorkspaceConfig(...)` now read from it instead of unconditionally consulting `existingExecutionWorkspace?.config`. Fresh-run issues no longer inherit stale environment config from an in-flight workspace about to be discarded. - `server/src/services/issues.ts`: when an issue update sets a new `assigneeAgentId` and isolated workspaces are enabled, populate `executionWorkspaceSettings.environmentId` from the assignee agent's `defaultEnvironmentId` if the issue doesn't have an explicit `environmentId` set yet. - Tests added in `heartbeat-plugin-environment.test.ts` (~216 lines) and `issues-service.test.ts` (~85 lines) covering both paths. ## Verification - `pnpm --filter @paperclipai/server test -- heartbeat-plugin-environment issues-service` - Manual QA: assign an issue to an agent that has a non-default `defaultEnvironmentId`, confirm the issue's workspace settings now include that environment id without operator intervention. Trigger a rerun on an issue whose existing workspace points at a stale environment, confirm the rerun uses the freshly-resolved environment. ## Risks - Behavioural shift on assignment: previously assigning an agent didn't propagate the agent's default environment to the issue. Now it does. Callers that explicitly want the issue to keep its existing/null environment must set `executionWorkspaceSettings.environmentId` themselves; the new logic only fires when no explicit value is set. - Behavioural shift on rerun: stale workspace config is no longer applied to fresh runs. Operators who relied on this implicit inheritance may see different environment selection on the first rerun after deploy. Mitigation: the explicit isssue settings and project policy are still honored as before. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A (no UI changes) - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
70679a3321 |
Add sandbox environment support (#4415)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The environment/runtime layer decides where agent work executes and how the control plane reaches those runtimes. > - Today Paperclip can run locally and over SSH, but sandboxed execution needs a first-class environment model instead of one-off adapter behavior. > - We also want sandbox providers to be pluggable so the core does not hardcode every provider implementation. > - This branch adds the Sandbox environment path, the provider contract, and a deterministic fake provider plugin. > - That required synchronized changes across shared contracts, plugin SDK surfaces, server runtime orchestration, and the UI environment/workspace flows. > - The result is that sandbox execution becomes a core control-plane capability while keeping provider implementations extensible and testable. ## What Changed - Added sandbox runtime support to the environment execution path, including runtime URL discovery, sandbox execution targeting, orchestration, and heartbeat integration. - Added plugin-provider support for sandbox environments so providers can be supplied via plugins instead of hardcoded server logic. - Added the fake sandbox provider plugin with deterministic behavior suitable for local and automated testing. - Updated shared types, validators, plugin protocol definitions, and SDK helpers to carry sandbox provider and workspace-runtime contracts across package boundaries. - Updated server routes and services so companies can create sandbox environments, select them for work, and execute work through the sandbox runtime path. - Updated the UI environment and workspace surfaces to expose sandbox environment configuration and selection. - Added test coverage for sandbox runtime behavior, provider seams, environment route guards, orchestration, and the fake provider plugin. ## Verification - Ran locally before the final fixture-only scrub: - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - Ran locally after the final scrub amend: - `pnpm vitest run server/src/__tests__/runtime-api.test.ts` - Reviewer spot checks: - create a sandbox environment backed by the fake provider plugin - run work through that environment - confirm sandbox provider execution does not inherit host secrets implicitly ## Risks - This touches shared contracts, plugin SDK plumbing, server runtime orchestration, and UI environment/workspace flows, so regressions would likely show up as cross-layer mismatches rather than isolated type errors. - Runtime URL discovery and sandbox callback selection are sensitive to host/bind configuration; if that logic is wrong, sandbox-backed callbacks may fail even when execution succeeds. - The fake provider plugin is intentionally deterministic and test-oriented; future providers may expose capability gaps that this branch does not yet cover. ## Model Used - OpenAI Codex coding agent on a GPT-5-class backend in the Paperclip/Codex harness. Exact backend model ID is not exposed in-session. Tool-assisted workflow with shell execution, file editing, git history inspection, and local test execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |