8014445b238531a00a6b91e79e94de6be9981f59
2 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
868d08903e |
test: isolate CLI company import e2e state (#4560)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, and its CLI import/export path is part of how operators move company state safely between environments. > - The `paperclipai company import/export` e2e test is supposed to validate that portability flow inside a hermetic harness, not against a developer's live Paperclip home. > - This regression showed nested CLI subprocesses could silently fall back to ambient `PAPERCLIP_*` state and mutate a real local instance by creating extra companies such as `CLI-1-Roundtrip-Test`. > - The first job was to pin the test subprocesses to isolated config, home, instance, auth, and context paths, and to add a regression assertion that proves the nested CLI writes stay inside the test-owned state. > - Once the PR was up, CI and Greptile exposed two follow-on issues that were blocking merge: plugin SDK typecheck bootstrap was racing across packages in fresh CI, and the new lock helper needed one more fix to release its lock on failure. > - This pull request therefore ends up doing two tightly related things: fixing the original CLI isolation leak, and hardening the supporting typecheck/bootstrap path enough for the fix to verify cleanly in CI. > - The benefit is that the portability e2e test is now actually isolated, and the PR verification path is stable enough to catch regressions instead of introducing its own nondeterministic failures. ## What Changed - Hardened `cli/src/__tests__/company-import-export-e2e.test.ts` so nested CLI subprocesses re-seed isolated `PAPERCLIP_CONFIG`, `PAPERCLIP_HOME`, `PAPERCLIP_INSTANCE_ID`, `PAPERCLIP_CONTEXT`, `PAPERCLIP_AUTH_STORE`, and throwaway `HOME` values instead of falling back to ambient machine state. - Added a regression assertion around `paperclipai context set --json`, then cleared the temporary `context.json` so the isolation check and the later export/import flow stay independent. - Passed the same isolated `HOME` into the server subprocess so both sides of the e2e harness are symmetric. - Introduced locking in `scripts/ensure-plugin-build-deps.mjs` and switched the server/plugin example `typecheck` scripts to use that helper instead of launching concurrent raw `@paperclipai/plugin-sdk` builds. - Fixed the helper failure path so it releases the lock before exiting non-zero, which prevents stale-lock timeouts during parallel typecheck runs. ## Verification - `pnpm vitest run cli/src/__tests__/company-import-export-e2e.test.ts --project paperclipai` - `pnpm --filter paperclipai typecheck` - `pnpm -r typecheck` - PR checks now pass on the current head, including `policy`, `verify`, `e2e`, `security/snyk`, and `Greptile Review`. ## Risks - Low risk. The product-facing behavior change is scoped to test harness code in the CLI e2e suite. - The CI stabilization changes only affect bootstrap/typecheck helper paths for the server and plugin/example packages, but they do touch shared verification plumbing; the main risk is changing how fresh build artifacts are prepared in local/CI typecheck runs. ## Model Used - Anthropic Claude via Paperclip `claude_local`, model `claude-opus-4-7`, high-effort local coding agent, used for the initial implementation and first peer-reviewed verification. - OpenAI Codex via Paperclip `codex_local`, model `gpt-5.4`, high reasoning-effort local coding agent with tool use, used for CI triage, Greptile follow-up fixes, verification, and PR maintenance. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
4ef969f084 |
Add E2B sandbox provider plugin (#4452)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox environments are part of that execution layer, and the recent core refactor moved provider-specific behavior to a generic plugin seam > - This pull request adds a dedicated `@paperclipai/plugin-e2b` package so E2B can live entirely outside core host code > - Because the feature is still unreleased, the plugin should model third-party packaging directly instead of carrying extra backward-compatibility complexity in core or the workspace lockfile > - This branch therefore makes the E2B provider a standalone publishable package, documents the package-local dev flow, and keeps the publish manifest/runtime dependency story correct > - The benefit is that E2B becomes a true plugin reference implementation that can be installed by package name without reopening core Paperclip code ## What Changed - Added `packages/plugins/paperclip-plugin-e2b` as the E2B sandbox provider plugin package - Implemented config validation, lease acquire/resume/release/destroy handlers, workspace realization, and command execution for E2B sandboxes - Excluded the E2B plugin package from the root workspace so the repo no longer needs `pnpm-lock.yaml` churn for its third-party dependency graph - Added package-local development/install support plus a prepack manifest generator so the published tarball still declares `@paperclipai/plugin-sdk` and `e2b` runtime dependencies - Addressed review feedback by fixing sandbox cleanup on acquire failures, rejecting blank templates, normalizing fractional `timeoutMs`, and always passing the configured template name to the E2B SDK - Updated focused Vitest coverage for config normalization, validation, acquire cleanup, command execution, and lease release behavior - Updated the Dockerfile deps stage to copy the E2B package manifest so the policy check stays in sync ## Verification - `cd packages/plugins/paperclip-plugin-e2b && pnpm install --ignore-workspace --no-lockfile` - `cd packages/plugins/paperclip-plugin-e2b && pnpm build` - `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace test` - `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace typecheck` - `cd packages/plugins/paperclip-plugin-e2b && npm pack --dry-run` ## Risks - The package now relies on a prepack manifest rewrite so the publish-time dependency list stays correct while the repo-local dev manifest stays workspace-light - The current repo snapshot is still unreleased, so the generated publish manifest points at the repo SDK version until the normal release flow rewrites versions before publish - Real-world E2B environments may still expose edge cases around lifecycle timing or sandbox metadata beyond the mocked unit coverage > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via `codex_local` - Model ID: `gpt-5.4` - Reasoning effort: `high` - Context window observed in runtime session metadata: `258400` tokens - Capabilities used: terminal tool execution, git, GitHub CLI, and local build/test inspection ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |