forked from farhoodlabs/paperclip
08dc3d9ff4fdc5c8f29cc4f3aef1f9fc8969f7c4
11 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b24c6909e8 |
Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs inside a sandbox environment so its CLI is isolated from the host > - Sandbox-backed adapter runs go through a small set of shared helpers — `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings > - When standing up new sandbox provider plugins, the existing helpers timed out, missed install fallbacks, or leaned on assumptions that only held for E2B > - Local adapters (`claude-local`, `codex-local`, `gemini-local`, `opencode-local`) needed slightly hardened probes so they could install themselves and validate inside *any* remote sandbox transport, not just E2B > - This pull request bundles those runtime fixes so future sandbox provider plugins inherit a working baseline > - The benefit is that adding a new sandbox provider plugin no longer requires touching adapter-utils or each local-adapter probe — the supporting infra is already correct ## What Changed - `packages/adapter-utils/src/execution-target.ts`: introduce `DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and `resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters keep the historical "0 means no adapter timeout" behavior; sandbox-backed runs without an explicit `timeoutSec` get an explicit 30-minute default so remote installs and warm-up don't time out at the per-RPC default. Plumbed `timeoutSec` through `ensureAdapterExecutionTargetCommandResolvable` so install probes inside a sandbox honor adapter-level overrides instead of the bridge's 5-minute default. - `packages/adapters/opencode-local/src/index.ts`: switch `SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl -fsSL https://opencode.ai/install | bash`. The npm package reifies four large prebuilt-binary subpackages in parallel even though only one matches the host arch; on bandwidth-constrained sandboxes that blew through the 240s install budget. The official installer fetches one arch-specific binary and adds `$HOME/.opencode/bin` to PATH via `~/.bashrc`, which the sandbox-callback-bridge login-shell script already sources. - `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden remote-target probes — pass `--skip-git-repo-check` for Codex when probing outside a repo, normalize permission flags for Claude, and add `*.remote.test.ts` coverage that exercises the remote-sandbox path explicitly for each adapter. - `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}` (new): add `buildSandboxNpmInstallCommand` helper. `server/src/adapters/registry.ts` + new `server/src/__tests__/adapter-registry.test.ts`: wire adapter install commands so they fall back to a writable `$HOME/.local` prefix when global install isn't available. - `server/src/__tests__/plugin-worker-manager.test.ts` + new `server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call timeout overrides so plugin worker exec calls honor the caller's timeout instead of the worker's default. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/execution-target-sandbox.test.ts packages/adapter-utils/src/sandbox-install-command.test.ts` - `pnpm exec vitest run --no-coverage server/src/__tests__/plugin-worker-manager.test.ts server/src/__tests__/adapter-registry.test.ts server/src/__tests__/claude-local-adapter-environment.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/gemini-local-adapter-environment.test.ts` - `pnpm exec vitest run --no-coverage packages/adapters/codex-local/src/server/test.remote.test.ts packages/adapters/opencode-local/src/server/test.remote.test.ts packages/adapters/codex-local/src/server/codex-args.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts` All passing locally. ## Risks - Touches shared `adapter-utils` and several `*-local` adapters. The 30-minute default applies only when both (a) the target is `remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH paths are unchanged. New test coverage was added alongside each behavior change to pin the contracts. - Switching OpenCode's install command to the official installer is a behavior change for any operator running OpenCode inside a remote sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND` only runs when an adapter is being installed inside a sandbox). - Low risk overall — no migrations, no API surface change. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep), no code execution beyond local repo commands ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
9578dc3da7 |
Wire per-adapter sandbox install commands through test and execute paths (#5280)
> **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin staging) and #5279 (honest-resolvability + login-profiles). The cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are the new `maybeRunSandboxInstallCommand` helper in `packages/adapter-utils/src/execution-target.ts` and the per-adapter `index.ts`/`server/test.ts`/`server/execute.ts` wiring under `packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The honest resolvability check from #5279 is what gives this PR's install command a meaningful "did it actually land on PATH" follow-up. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox execution targets are ephemeral — each fresh lease starts from a template image that may or may not have the agent CLIs preinstalled > - When a CLI isn't preinstalled, the resolvability probe fails at `command -v` and the hello probe never runs > - There's no shared mechanism for "before you probe or provision, install the CLI on this sandbox" > - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per adapter and a `maybeRunSandboxInstallCommand` helper that runs it via the existing sandbox login shell, captures structured output, and never throws (so the resolvability + hello probe still run after); each adapter's `test()` and `execute()` share the constant so the two callsites can't drift > - The benefit is a fresh sandbox lease without a preinstalled CLI now installs it once via `sh -lc` before the resolvability probe and before managed-runtime provisioning, with a uniform `<adapter>_install_command_run` check on the test report ## What Changed - `packages/adapter-utils/src/execution-target.ts`: add `AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand` (runs the install via existing sandbox shell, captures exit/stdout/stderr, returns a structured info/warn check, never throws) - Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()` and `execute()` share a single source of truth - Wire each of the 6 affected adapter `testEnvironment()`s to call `maybeRunSandboxInstallCommand` before `ensureAdapterExecutionTargetCommandResolvable` - Pass `installCommand: SANDBOX_INSTALL_COMMAND` through `prepareAdapterExecutionTargetRuntime` in each adapter's `execute()` - Per-adapter install commands use npm globals where possible so binaries land on a PATH segment the template already exports: - claude → `npm install -g @anthropic-ai/claude-code` - codex → `npm install -g @openai/codex` - cursor → `curl https://cursor.com/install -fsS | bash` - gemini → `npm install -g @google/gemini-cli` - opencode → `npm install -g opencode-ai` - pi → `npm install -g @mariozechner/pi-coding-agent` SSH and local targets ignore `installCommand` (SSH runtime takes no such param; local short-circuits before runtime prep), so this is a no-op for non-sandbox environments. ## Verification - `pnpm typecheck` clean - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` and per-adapter projects pass - Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) — each goes `install_command_run → resolvable → hello_probe_passed` (Codex and Pi land on `hello_probe_auth_required`, which is the configured-credentials problem, not an install issue) - SSH no-regression: SSH Claude still passes; the helper short-circuits on non-sandbox targets ## Risks Medium — adds a network/CPU cost (npm install / curl) on every fresh sandbox lease. Cost is bounded (one-time per lease, typically tens of seconds for npm globals), and the helper never throws so a failing install still lets the report run resolvability and hello probes. If a sandbox image already has the CLI, the install is an idempotent reinstall. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
f9cf1d2f6a |
Add cursor sandbox support and fix SSH workspace sync (#4803)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents can run inside sandboxed environments like E2B, or on remote hosts via SSH > - The cursor adapter needs to resolve `cursor-agent` inside sandbox environments where it's installed in `~/.local/bin` > - But when using the default `agent` command on a sandbox target, the adapter didn't know to look in `~/.local/bin/cursor-agent`, causing "command not found" failures > - Additionally, repeated SSH runs failed because `git checkout` during workspace sync conflicted with leftover `.paperclip-runtime` files from previous runs > - This PR adds sandbox-aware command resolution for cursor and fixes the SSH workspace sync conflict > - The benefit is cursor works in E2B sandboxes out of the box, and repeated SSH runs don't fail on workspace sync ## What Changed - `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and prefers `~/.local/bin/cursor-agent` when the default command is requested; tightened the sandbox command probe to validate the binary exists before launching; preserves explicit custom command overrides - `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH workspace sync to handle `.paperclip-runtime` untracked file conflicts from previous runs ## Verification - `pnpm test` — all existing and new tests pass, including cursor sandbox probe, sandbox execution, and custom command override tests - `pnpm typecheck` — clean - Manual: configure an E2B environment, run a cursor-local task, verify it resolves cursor-agent from the sandbox install path ## Risks - Low-medium. The `--force` flag on git checkout could discard uncommitted changes in the remote workspace, but the workspace is managed by Paperclip and should not contain user edits. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
9b99d30330 |
Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run inside environments (local, SSH, E2B sandbox) > - Operators need to configure and manage these environments > - But environment settings were buried inside the general company settings page, making them hard to find > - Additionally, when testing an agent from the configuration form, the test always ran locally regardless of which environment was selected > - This PR moves environments into a dedicated top-level company settings section and wires the "Test Environment" button to run inside the selected environment > - The benefit is operators can find and manage environments more easily, and the test button now validates the actual environment the agent will use ## What Changed - Added a dedicated `CompanyEnvironments` settings page with its own route and sidebar entry - Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include the new environments section - Modified the agent test route (`POST /agents/:id/test`) to accept an optional `environmentId` parameter - Updated all adapter `test.ts` handlers to resolve and use the specified execution target environment - Added `resolveTestExecutionTarget` to `execution-target.ts` for remote environment test resolution with cwd fallback - Moved the "Test Environment" button and its feedback display into the `NewAgent` page footer for better UX flow ## Verification - `pnpm test` — all existing and new tests pass - `pnpm typecheck` — clean - Manual: navigate to Company Settings, confirm "Environments" appears as a top-level section - Manual: configure an agent with a non-local environment, click "Test Environment", confirm the test runs inside that environment ## Risks - Low risk. UI-only routing change for the settings page. The test-in-environment change adds an optional parameter with a local fallback, so existing behavior is preserved when no environment is specified. ## Model Used Codex GPT 5.4 high via Paperclip. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
07987d75ad |
feat(claude-local): add Bedrock model selection support
Previously, --model was completely skipped for Bedrock users, so the model dropdown selection was silently ignored and the CLI always used its default model. Selecting Haiku would still run Opus. - Add listClaudeModels() that returns Bedrock-native model IDs (us.anthropic.*) when Bedrock env is detected - Register listModels on claude_local adapter so the UI dropdown shows Bedrock models instead of Anthropic API names - Allow --model to pass through when the ID is a Bedrock-native identifier (us.anthropic.* or ARN) - Add isBedrockModelId() helper shared by execute.ts and test.ts Follows up on #2793 which added basic Bedrock auth detection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
b6e40fec54 |
feat: add AWS Bedrock auth support on "claude-local" (#2793)
Closes #2412 Related: #2681, #498, #128 ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Claude Code adapter spawns the `claude` CLI to run agent tasks > - The adapter detects auth mode by checking for `ANTHROPIC_API_KEY` — recognizing only "api" and "subscription" modes > - But users running Claude Code via **AWS Bedrock** (`CLAUDE_CODE_USE_BEDROCK=1`) fall through to the "subscription" path > - This causes a misleading "ANTHROPIC_API_KEY is not set; subscription-based auth can be used" message in the environment check > - Additionally, the hello probe passes `--model claude-opus-4-6` which is **not a valid Bedrock model identifier**, causing `400 The provided model identifier is invalid` and a probe failure > - This pull request adds Bedrock auth detection, skips the Anthropic-style `--model` flag for Bedrock, and returns the correct billing type > - The benefit is that Bedrock users get a working environment check and correct cost tracking out of the box --- ## Pain Point Many enterprise teams use **Claude Code through AWS Bedrock** rather than Anthropic's direct API — for compliance, billing consolidation, or VPC requirements. Currently, these users hit a **hard wall during onboarding**: | Problem | Impact | |---|---| | ❌ Adapter environment check **always fails** | Users cannot create their first agent — blocked at step 1 | | ❌ `--model claude-opus-4-6` is **invalid on Bedrock** (requires `us.anthropic.*` format) | Hello probe exits with code 1: `400 The provided model identifier is invalid` | | ❌ Auth shown as _"subscription-based"_ | Misleading — Bedrock is neither subscription nor API-key auth | | ❌ Quota polling hits Anthropic OAuth endpoint | Fails silently for Bedrock users who have no Anthropic subscription | > **Bottom line**: Paperclip is completely unusable for Bedrock users out of the box. ## Why Bedrock Matters AWS Bedrock is a major deployment path for Claude in enterprise environments: - **Enterprise compliance** — data stays within the customer's AWS account and VPC - **Unified billing** — Claude usage appears on the existing AWS invoice, no separate Anthropic billing - **IAM integration** — access controlled through AWS IAM roles and policies - **Regional deployment** — models run in the customer's preferred AWS region Supporting Bedrock unlocks Paperclip for organizations that **cannot** use Anthropic's direct API due to procurement, security, or regulatory constraints. --- ## What Changed - **`execute.ts`**: Added `isBedrockAuth()` helper that checks `CLAUDE_CODE_USE_BEDROCK` and `ANTHROPIC_BEDROCK_BASE_URL` env vars. `resolveClaudeBillingType()` now returns `"metered_api"` for Bedrock. Biller set to `"aws_bedrock"`. Skips `--model` flag when Bedrock is active (Anthropic-style model IDs are invalid on Bedrock; the CLI uses its own configured model). - **`test.ts`**: Environment check now detects Bedrock env vars (from adapter config or server env) and shows `"AWS Bedrock auth detected. Claude will use Bedrock for inference."` instead of the misleading subscription message. Also skips `--model` in the hello probe for Bedrock. - **`quota.ts`**: Early return with `{ ok: true, windows: [] }` when Bedrock is active — Bedrock usage is billed through AWS, not Anthropic's subscription quota system. - **`ui/src/lib/utils.ts`**: Added `"aws_bedrock"` → `"AWS Bedrock"` to `providerDisplayName()` and `quotaSourceDisplayName()`. ## Verification 1. `pnpm -r typecheck` — all packages pass 2. Unit tests added and passing (6/6) 3. Environment check with Bedrock env vars: | | Before | After | |---|---|---| | **Status** | 🔴 Failed | ✅ Passed | | **Auth message** | `ANTHROPIC_API_KEY is not set; subscription-based auth can be used if Claude is logged in.` | `AWS Bedrock auth detected. Claude will use Bedrock for inference.` | | **Hello probe** | `ERROR · Claude hello probe failed.` (exit code 1 — `--model claude-opus-4-6` is invalid on Bedrock) | `INFO · Claude hello probe succeeded.` | | **Screenshot** | <img height="500" alt="Screenshot 2026-04-05 at 8 25 27 AM" src="https://github.com/user-attachments/assets/476431f6-6139-425a-8abc-97875d653657" /> | <img height="500" alt="Screenshot 2026-04-05 at 8 31 58 AM" src="https://github.com/user-attachments/assets/d388ce87-c5e6-4574-b8d2-fd8b86135299" /> | 4. Existing API key / subscription paths are completely untouched unless Bedrock env vars are present ## Risks - **Low risk.** All changes are additive — existing "api" and "subscription" code paths are only entered when Bedrock env vars are absent. - When Bedrock is active, the `--model` flag is skipped, so the Paperclip model dropdown selection is ignored in favor of the Claude CLI's own model config. This is intentional since Bedrock requires different model identifiers. ## Model Used - Claude Opus 4.6 (`claude-opus-4-6`, 1M context window) via Claude Code CLI ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> |
||
|
|
14d59da316 |
feat(adapters): external adapter plugin system with dynamic UI parser
- Plugin loader: install/reload/remove/reinstall external adapters from npm packages or local directories - Plugin store persisted at ~/.paperclip/adapter-plugins.json - Self-healing UI parser resolution with version caching - UI: Adapter Manager page, dynamic loader, display registry with humanized names for unknown adapter types - Dev watch: exclude adapter-plugins dir from tsx watcher to prevent mid-request server restarts during reinstall - All consumer fallbacks use getAdapterLabel() for consistent display - AdapterTypeDropdown uses controlled open state for proper close behavior - Remove hermes-local from built-in UI (externalized to plugin) - Add docs for external adapters and UI parser contract |
||
|
|
9513bef7e7 | Improve onboarding adapter diagnostics with live hello probes | ||
|
|
8351f7f1bd | Auto-create missing cwd for claude_local and codex_local | ||
|
|
f60c1001ec |
refactor: rename packages to @paperclipai and CLI binary to paperclipai
Rename all workspace packages from @paperclip/* to @paperclipai/* and the CLI binary from `paperclip` to `paperclipai` in preparation for npm publishing. Bump CLI version to 0.1.0 and add package metadata (description, keywords, license, repository, files). Update all imports, documentation, user-facing messages, and tests accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
f80a802592 |
Add adapter environment testing infrastructure
Introduce testEnvironment() on ServerAdapterModule with structured pass/warn/fail diagnostics for all four adapter types (claude_local, codex_local, process, http). Adds POST test-environment endpoint, shared types/validators, adapter test implementations, and UI API client. Includes asset type foundations used by related features. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |