Devin Foley ad0bb57350 Fix exe.dev sandbox installs for gemini/opencode local adapters (#5737)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, including
running adapter CLIs inside remote sandboxes
> - The QA matrix in PAPA-316 spins up local-runtime adapters
(claude/gemini/opencode) against both SSH and the new exe.dev sandbox
provider, and "Test" exercises the same install + probe path the real
runtime uses
> - On exe.dev the QA matrix failed at three different points:
SSH/sandbox secret refs would not resolve, gemini-local could not find
npm, and opencode-local installed a binary that was not on the
probe-shell PATH
> - These are all environment-shape issues the runtime should handle,
not regressions in any individual adapter, so they need to be fixed in
the shared install/resolve layer before the matrix can pass
> - This pull request wires the environment id through to secret-ref
resolution, bootstraps npm from a portable Node tarball when the sandbox
image lacks Node, and symlinks the opencode binary into a directory that
non-login shells see
> - The benefit is that the QA matrix passes end-to-end on exe.dev, and
any future sandbox provider that ships without Node or relies on rc-file
PATH wiring gets the same fixes for free

## What Changed

- `server/src/services/environment-execution-target.ts`: pass the
environment `id` into `resolveEnvironmentDriverConfigForRuntime` for
both the sandbox and SSH branches, so `privateKeySecretRef` /
sandbox-provider secret refs (e.g. exe.dev `apiKey`) can resolve against
the secret store at runtime instead of throwing `Runtime secret
resolution requires an environment id`.
- `packages/adapter-utils/src/sandbox-install-command.ts`: extend
`buildSandboxNpmInstallCommand` with an `ENSURE_NPM_PREAMBLE` that, when
`npm` is missing, downloads a portable Node v22 tarball into
`$HOME/.local` and sets `PAPERCLIP_NPM_BOOTSTRAPPED=1` so the install
step skips sudo (sudo's `secure_path` would lose the freshly-installed
`npm` in `$HOME/.local/bin`). Distro-packaged Node from apt-get is
intentionally avoided because it tends to be too old to parse modern JS
syntax used by `@google/gemini-cli`.
- `packages/adapters/gemini-local/src/index.ts`: switch the hardcoded
`npm install -g @google/gemini-cli` to `buildSandboxNpmInstallCommand`,
so gemini-local picks up the same sudo-aware + npm-bootstrap behavior as
the other local adapters.
- `packages/adapters/opencode-local/src/index.ts`: append a step to the
install command that symlinks `$HOME/.opencode/bin/opencode` into
`$HOME/.local/bin`. The upstream installer only adds `~/.opencode/bin`
to PATH via `~/.bashrc`, which non-login `sh -c` probe invocations do
not source.
- `packages/adapter-utils/src/sandbox-install-command.test.ts`: cover
the new preamble plus the unchanged root/sudo/user-prefix branches.

## Verification

- `cd packages/adapter-utils && npm test -- sandbox-install-command`
(passes; new "bootstraps npm from a portable Node tarball when missing"
case is included).
- Manual: ran the in-app `Test` action against the QA matrix dev
instance for `QA exe.dev Claude`, `QA exe.dev Gemini`, and `QA exe.dev
OpenCode` — all three now report `status=pass` including the hello
probe. `QA SSH Claude` also passes; without the environment-id fix, SSH
resolution threw before the wrapper / install fixes could run.
- Suggested reviewer check: re-run the matrix on a fresh exe.dev
environment and confirm the install step no longer hits `npm: command
not found` for gemini and the opencode probe no longer hits `opencode:
command not found`.

## Risks

- Low/medium. The npm bootstrap pins Node `v22.11.0` from
`nodejs.org/dist`; if that URL becomes unreachable the install will fail
with a clear `curl` error rather than corrupting state. The bootstrap
path is only taken when `npm` is genuinely missing, so existing sandbox
images that ship with Node are unaffected.
- The opencode symlink uses `ln -sf` into `$HOME/.local/bin`, which is
created with `mkdir -p`; idempotent on re-install.
- The `id` change is a strict additive: callers previously got
`undefined` and only the secret-ref code paths actually read it. No
behavior change for environments without secret refs.

## Model Used

- Claude (Anthropic), `claude-opus-4-7`, with extended thinking and tool
use enabled. Iterated through the Paperclip QA matrix harness; no other
model assisted.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (n/a — runtime/install path only)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 14:28:22 -07:00
2026-05-03 22:52:33 -05:00
2026-03-07 02:57:28 +09:00

Paperclip — runs your business

Quickstart · Docs · GitHub · Discord · Twitter

MIT License Stars Discord



What is Paperclip?

Open-source orchestration for zero-human companies

If OpenClaw is an employee, Paperclip is the company

Paperclip is a Node.js server and React UI that orchestrates a team of AI agents to run a business. Bring your own agents, assign goals, and track your agents' work and costs from one dashboard.

It looks like a task manager — but under the hood it has org charts, budgets, governance, goal alignment, and agent coordination.

Manage business goals, not pull requests.

Step Example
01 Define the goal "Build the #1 AI note-taking app to $1M MRR."
02 Hire the team CEO, CTO, engineers, designers, marketers — any bot, any provider.
03 Approve and run Review strategy. Set budgets. Hit go. Monitor from the dashboard.

COMING SOON: Clipmart — Download and run entire companies with one click. Browse pre-built company templates — full org structures, agent configs, and skills — and import them into your Paperclip instance in seconds.


Works
with
OpenClaw
OpenClaw
Claude
Claude Code
Codex
Codex
Cursor
Cursor
Bash
Bash
HTTP
HTTP

If it can receive a heartbeat, it's hired.


Paperclip is right for you if

  • You want to build autonomous AI companies
  • You coordinate many different agents (OpenClaw, Codex, Claude, Cursor) toward a common goal
  • You have 20 simultaneous Claude Code terminals open and lose track of what everyone is doing
  • You want agents running autonomously 24/7, but still want to audit work and chime in when needed
  • You want to monitor costs and enforce budgets
  • You want a process for managing agents that feels like using a task manager
  • You want to manage your autonomous businesses from your phone

Features

🔌 Bring Your Own Agent

Any agent, any runtime, one org chart. If it can receive a heartbeat, it's hired.

🎯 Goal Alignment

Every task traces back to the company mission. Agents know what to do and why.

💓 Heartbeats

Agents wake on a schedule, check work, and act. Delegation flows up and down the org chart.

💰 Cost Control

Monthly budgets per agent. When they hit the limit, they stop. No runaway costs.

🏢 Multi-Company

One deployment, many companies. Complete data isolation. One control plane for your portfolio.

🎫 Ticket System

Every conversation traced. Every decision explained. Full tool-call tracing and immutable audit log.

🛡️ Governance

You're the board. Approve hires, override strategy, pause or terminate any agent — at any time.

📊 Org Chart

Hierarchies, roles, reporting lines. Your agents have a boss, a title, and a job description.

📱 Mobile Ready

Monitor and manage your autonomous businesses from anywhere.

Problems Paperclip solves

Without Paperclip With Paperclip
You have 20 Claude Code tabs open and can't track which one does what. On reboot you lose everything. Tasks are ticket-based, conversations are threaded, sessions persist across reboots.
You manually gather context from several places to remind your bot what you're actually doing. Context flows from the task up through the project and company goals — your agent always knows what to do and why.
Folders of agent configs are disorganized and you're re-inventing task management, communication, and coordination between agents. Paperclip gives you org charts, ticketing, delegation, and governance out of the box — so you run a company, not a pile of scripts.
Runaway loops waste hundreds of dollars of tokens and max your quota before you even know what happened. Cost tracking surfaces token budgets and throttles agents when they're out. Management prioritizes with budgets.
You have recurring jobs (customer support, social, reports) and have to remember to manually kick them off. Heartbeats handle regular work on a schedule. Management supervises.
You have an idea, you have to find your repo, fire up Claude Code, keep a tab open, and babysit it. Add a task in Paperclip. Your coding agent works on it until it's done. Management reviews their work.

Why Paperclip is special

Paperclip handles the hard orchestration details correctly.

Atomic execution. Task checkout and budget enforcement are atomic, so no double-work and no runaway spend.
Persistent agent state. Agents resume the same task context across heartbeats instead of restarting from scratch.
Runtime skill injection. Agents can learn Paperclip workflows and project context at runtime, without retraining.
Governance with rollback. Approval gates are enforced, config changes are revisioned, and bad changes can be rolled back safely.
Goal-aware execution. Tasks carry full goal ancestry so agents consistently see the "why," not just a title.
Portable company templates. Export/import orgs, agents, and skills with secret scrubbing and collision handling.
True multi-company isolation. Every entity is company-scoped, so one deployment can run many companies with separate data and audit trails.

What's Under the Hood

Paperclip is a full control plane, not a wrapper. Before you build any of this yourself, know that it already exists:

┌──────────────────────────────────────────────────────────────┐
│                       PAPERCLIP SERVER                       │
│                                                              │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐  │
│  │Identity & │  │  Work &   │  │ Heartbeat │  │Governance │  │
│  │  Access   │  │   Tasks   │  │ Execution │  │& Approvals│  │
│  └───────────┘  └───────────┘  └───────────┘  └───────────┘  │
│                                                              │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐  │
│  │ Org Chart │  │Workspaces │  │  Plugins  │  │  Budget   │  │
│  │ & Agents  │  │ & Runtime │  │           │  │ & Costs   │  │
│  └───────────┘  └───────────┘  └───────────┘  └───────────┘  │
│                                                              │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐  │
│  │ Routines  │  │ Secrets & │  │ Activity  │  │  Company  │  │
│  │& Schedules│  │  Storage  │  │ & Events  │  │Portability│  │
│  └───────────┘  └───────────┘  └───────────┘  └───────────┘  │
└──────────────────────────────────────────────────────────────┘
         ▲              ▲              ▲              ▲
   ┌─────┴─────┐  ┌─────┴─────┐  ┌─────┴─────┐  ┌─────┴─────┐
   │  Claude   │  │   Codex   │  │   CLI     │  │ HTTP/web  │
   │   Code    │  │           │  │  agents   │  │   bots    │
   └───────────┘  └───────────┘  └───────────┘  └───────────┘

The Systems

Identity & Access — Two deployment modes (trusted local or authenticated), board users, agent API keys, short-lived run JWTs, company memberships, invite flows, and OpenClaw onboarding. Every mutating request is traced to an actor.

Org Chart & Agents — Agents have roles, titles, reporting lines, permissions, and budgets. Adapter examples match the diagram: Claude Code, Codex, CLI agents such as Cursor/Gemini/bash, HTTP/webhook bots such as OpenClaw, and external adapter plugins. If it can receive a heartbeat, it's hired.

Work & Task System — Issues carry company/project/goal/parent links, atomic checkout with execution locks, first-class blocker dependencies, comments, documents, attachments, work products, labels, and inbox state. No double-work, no lost context.

Heartbeat Execution — DB-backed wakeup queue with coalescing, budget checks, workspace resolution, secret injection, skill loading, and adapter invocation. Runs produce structured logs, cost events, session state, and audit trails. Recovery handles orphaned runs automatically.

Workspaces & Runtime — Project workspaces, isolated execution workspaces (git worktrees, operator branches), and runtime services (dev servers, preview URLs). Agents work in the right directory with the right context every time.

Governance & Approvals — Board approval workflows, execution policies with review/approval stages, decision tracking, budget hard-stops, agent pause/resume/terminate, and full audit logging. You're the board — nothing ships without your sign-off.

Budget & Cost Control — Token and cost tracking by company, agent, project, goal, issue, provider, and model. Scoped budget policies with warning thresholds and hard stops. Overspend pauses agents and cancels queued work automatically.

Routines & Schedules — Recurring tasks with cron, webhook, and API triggers. Concurrency and catch-up policies. Each routine execution creates a tracked issue and wakes the assigned agent — no manual kick-offs needed.

Plugins — Instance-wide plugin system with out-of-process workers, capability-gated host services, job scheduling, tool exposure, and UI contributions. Extend Paperclip without forking it.

Secrets & Storage — Instance and company secrets, encrypted local storage, provider-backed object storage, attachments, and work products. Sensitive values stay out of prompts unless a scoped run explicitly needs them.

Activity & Events — Mutating actions, heartbeat state changes, cost events, approvals, comments, and work products are recorded as durable activity so operators can audit what happened and why.

Company Portability — Export and import entire organizations — agents, skills, projects, routines, and issues — with secret scrubbing and collision handling. One deployment, many companies, complete data isolation.


What Paperclip is not

Not a chatbot. Agents have jobs, not chat windows.
Not an agent framework. We don't tell you how to build agents. We tell you how to run a company made of them.
Not a workflow builder. No drag-and-drop pipelines. Paperclip models companies — with org charts, goals, budgets, and governance.
Not a prompt manager. Agents bring their own prompts, models, and runtimes. Paperclip manages the organization they work in.
Not a single-agent tool. This is for teams. If you have one agent, you probably don't need Paperclip. If you have twenty — you definitely do.
Not a code review tool. Paperclip orchestrates work, not pull requests. Bring your own review process.

Quickstart

Open source. Self-hosted. No Paperclip account required.

npx paperclipai onboard --yes

That quickstart path now defaults to trusted local loopback mode for the fastest first run. To start in authenticated/private mode instead, choose a bind preset explicitly:

npx paperclipai onboard --yes --bind lan
# or:
npx paperclipai onboard --yes --bind tailnet

If you already have Paperclip configured, rerunning onboard keeps the existing config in place. Use paperclipai configure to edit settings.

Or manually:

git clone https://github.com/paperclipai/paperclip.git
cd paperclip
pnpm install
pnpm dev

This starts the API server at http://localhost:3100. An embedded PostgreSQL database is created automatically — no setup required.

Requirements: Node.js 20+, pnpm 9.15+


FAQ

What does a typical setup look like? Locally, a single Node.js process manages an embedded Postgres and local file storage. For production, point it at your own Postgres and deploy however you like. Configure projects, agents, and goals — the agents take care of the rest.

If you're a solo-entreprenuer you can use Tailscale to access Paperclip on the go. Then later you can deploy to e.g. Vercel when you need it.

Can I run multiple companies? Yes. A single deployment can run an unlimited number of companies with complete data isolation.

How is Paperclip different from agents like OpenClaw or Claude Code? Paperclip uses those agents. It orchestrates them into a company — with org charts, budgets, goals, governance, and accountability.

Why should I use Paperclip instead of just pointing my OpenClaw to Asana or Trello? Agent orchestration has subtleties in how you coordinate who has work checked out, how to maintain sessions, monitoring costs, establishing governance - Paperclip does this for you.

(Bring-your-own-ticket-system is on the Roadmap)

Do agents run continuously? By default, agents run on scheduled heartbeats and event-based triggers (task assignment, @-mentions). You can also hook in continuous agents like OpenClaw. You bring your agent and Paperclip coordinates.


Development

pnpm dev              # Full dev (API + UI, watch mode)
pnpm dev:once         # Full dev without file watching
pnpm dev:server       # Server only
pnpm build            # Build all
pnpm typecheck        # Type checking
pnpm test             # Cheap default test run (Vitest only)
pnpm test:watch       # Vitest watch mode
pnpm test:e2e         # Playwright browser suite
pnpm db:generate      # Generate DB migration
pnpm db:migrate       # Apply migrations

pnpm test does not run Playwright. Browser suites stay separate and are typically run only when working on those flows or in CI.

See doc/DEVELOPING.md for the full development guide.


Roadmap

  • Plugin system (e.g. add a knowledge base, custom tracing, queues, etc)
  • Get OpenClaw / claw-style agent employees
  • companies.sh - import and export entire organizations
  • Easy AGENTS.md configurations
  • Skills Manager
  • Scheduled Routines
  • Better Budgeting
  • Agent Reviews and Approvals
  • Multiple Human Users
  • Cloud / Sandbox agents (e.g. Cursor / e2b agents)
  • Artifacts & Work Products
  • Memory / Knowledge
  • Enforced Outcomes
  • MAXIMIZER MODE
  • Deep Planning
  • Work Queues
  • Self-Organization
  • Automatic Organizational Learning
  • CEO Chat
  • Cloud deployments
  • Desktop App

This is the short roadmap preview. See the full roadmap in ROADMAP.md.


Community & Plugins

Find Plugins and more at awesome-paperclip

Telemetry

Paperclip collects anonymous usage telemetry to help us understand how the product is used and improve it. No personal information, issue content, prompts, file paths, or secrets are ever collected. Private repository references are hashed with a per-install salt before being sent.

Telemetry is enabled by default and can be disabled with any of the following:

Method How
Environment variable PAPERCLIP_TELEMETRY_DISABLED=1
Standard convention DO_NOT_TRACK=1
CI environments Automatically disabled when CI=true
Config file Set telemetry.enabled: false in your Paperclip config

Contributing

We welcome contributions. See the contributing guide for details.


Community


License

MIT © 2026 Paperclip

Star History

Star History Chart



Open source under MIT. Built for people who want to run companies, not babysit agents.

S
Description
Open-source orchestration for zero-human companies
Readme MIT 40 MiB
Languages
TypeScript 97.9%
JavaScript 1.1%
Shell 0.7%
CSS 0.2%