Compare commits

..

188 Commits

Author SHA1 Message Date
Chris Farhood 5499a0b4a6 ci: adapt workflows for Gitea migration
Change runner from runners-farhoodlabs to ubuntu-latest across all fork
workflows. Update container registry from ghcr.io to git.farh.net and
authenticate with REGISTRY_TOKEN. Migrate update-infra API calls from
GitHub to Gitea. Disable refresh-lockfile.yml (requires GitHub gh CLI).
Update CLAUDE.md references.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-20 11:17:45 +00:00
Chris Farhood 55faea456f Merge pull request #16 from farhoodlabs/dev
Dev
2026-05-16 08:38:38 -07:00
Chris Farhood 329ba3fd2e Merge pull request #15 from farhoodlabs/feat/portability-git-backend-agnostic
refactor(portability): migrate to git-source; delete github-fetch.ts
2026-05-16 07:43:35 -07:00
Chris Farhood bf251188df test(portability): cover resolveSource orchestration via previewImport
Closes the coverage gap on the actual migrated function. Mocks the
two network-touching git-source exports (resolveGitRef, openRepoSnapshot)
while keeping parseGitSourceUrl real so the parseGitHubSourceUrl shim
contract stays honest. Adds 5 cases:

- happy path: opens one snapshot, calls listFiles, readFileOptional
  on COMPANY.md, readFile on candidate paths
- ref fallback: when openRepoSnapshot('main') rejects, falls back to
  'master' and emits the expected warning
- COMPANY.md absent everywhere: throws "missing COMPANY.md"
- referenced logo: readBinary is called for the logoPath from
  .paperclip.yaml
- logo read failure: warning emitted, no throw

57/57 portability tests passing; existing 52 unchanged via shim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 10:35:56 -04:00
Chris Farhood 80f7d8270c refactor(portability): migrate to git-source; delete github-fetch.ts
Mirrors the skills refactor: company-portability was the second user of
the per-host REST shim (its own parallel parseGitHubSourceUrl + fetch
helpers + raw.githubusercontent URL builder), so importing a company
package from a non-github URL hit the same Gitea 404 the skills path did.

- Extend git-source.ts:
  - parseGitSourceUrl: also recognises query-string shape
    (?ref=...&path=...) used by portability URLs, with precedence over
    path-style segments when both are present.
  - RepoSnapshot: add readBinary (Uint8Array for the company logo
    fetch) and readFileOptional (null on NotFoundError, for the
    COMPANY.md probe + main->master fallback).
- Rewrite resolveSource in company-portability.ts to open a single
  in-memory snapshot per import and serve all reads (COMPANY.md,
  candidate tree, includes, logo) from it. Drops fetchText/fetchJson/
  fetchBinary/fetchOptionalText.
- parseGitHubSourceUrl stays exported with its original return shape
  ({hostname, owner, repo, ref, basePath, companyPath}) so the existing
  test suite passes unchanged. It now delegates URL parsing to
  parseGitSourceUrl and layers companyPath derivation on top.
- Delete server/src/services/github-fetch.ts: zero remaining callers.

Test coverage:
- 7 new git-source tests (query-string parse variants, query-string
  precedence over path style, readBinary, readFileOptional NotFound
  null + non-NotFound rethrow) — 34/34 passing.
- 52 existing company-portability tests still pass via the
  parseGitHubSourceUrl shim contract.
- Smoke-tested end-to-end against https://git.farh.net/.../?ref=main:
  ref resolves, snapshot opens, readFile/readBinary/readFileOptional
  all return expected results.

Note: two pre-existing failures in company-skills-routes.test.ts
("does not expose a skill reference...") exist on dev too and are
unrelated to this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 10:28:22 -04:00
Chris Farhood 5703fa225c Merge dev into local; drop dead assemble-local workflow
- Resolves the duplicate-SHA conflict on the gitea/skills commits
  by taking dev's versions (canonical after PR #13 superseded the
  original shim with the git-source refactor).
- Deletes .github/workflows/assemble-local.yml -- the workflow
  triggered on master push but lived on local, so it never fired
  automatically; promotion happens via dev->local PRs instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 10:16:28 -04:00
Chris Farhood 4317d2a3b4 Merge pull request #13 from farhoodlabs/feat/skills-git-backend-agnostic
refactor(skills): backend-agnostic git via wire protocol (fixes Gitea/Forgejo)
2026-05-16 06:51:22 -07:00
Chris Farhood d30afdb1b2 test(skills): add vitest coverage for git-source module
27 tests covering the surface that had none:

- parseGitSourceUrl: bare URLs (github/gitea/gitlab), tree/blob/src
  shapes, subpaths, file paths, trailing .git stripping, https-only
  enforcement, malformed/missing-segment rejection.
- resolveGitRef: 40-hex SHA passthrough (no network call), default
  branch via HEAD symref, named branch, peeled annotated tag, lightweight
  tag, ref-not-found, network/401/404 error translation, onAuth
  callback shape (token-as-username, x-oauth-basic) and absence.
- openRepoSnapshot: clone args (singleBranch/depth=1/noCheckout),
  tree walk filtering trees vs blobs, readFile path, SHA fallback
  when tracking ref is null, 404 translation.

Mocks at the isomorphic-git boundary; verifies our adaptation logic,
not isomorphic-git itself.

Known limit surfaced by a test (not fixed here): gitea URLs with
slash-containing branch names like /src/branch/feature/x are
ambiguous without server-side disambiguation. The test uses a
single-segment branch; the multi-segment case needs a separate fix
(refCandidates from longest-to-shortest, resolved against
listServerRefs output).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 09:36:27 -04:00
Chris Farhood 0fd4e9c4d1 refactor(skills): replace per-host REST shims with git wire protocol
The skill import/update/file-read pipeline talked to host-specific REST
APIs (GitHub /commits/{ref}, /git/trees/{sha}, raw.githubusercontent.com)
and the recent Gitea support was a parallel shim on top of the same
pattern. The result was multiple ref-resolution shapes that needed
per-host branching, and on Gitea the /commits/{ref} endpoint returns
404 outright -- so even public Gitea/Forgejo repos failed to import.

Replace with a single git-source module backed by isomorphic-git +
memfs. It speaks the smart-HTTP protocol any sane git server already
serves:

- resolveGitRef: one listServerRefs call, no host API. Handles default
  branch (symref on HEAD), named branches, annotated/lightweight tags,
  and SHA passthrough.
- openRepoSnapshot: shallow singleBranch clone into an in-memory fs;
  listFiles via git.walk, readFile via git.readBlob. No tempdirs, no
  execFile, no per-host endpoints.
- Universal auth via onAuth (token-as-username) covering GitHub PATs,
  GitLab PATs, Gitea/Forgejo tokens.
- parseGitSourceUrl recognises github tree/blob, gitea src/branch|
  commit|tag, gitlab /-/tree, bitbucket /src/{ref} URL shapes plus
  bare clone URLs.

Stored skill metadata is unchanged (hostname/owner/repo/ref/trackingRef/
repoSkillDir), so existing rows keep working -- the clone URL is
derived at fetch time.

company-portability.ts still imports github-fetch.ts (same broken
pattern, separate feature). Left as a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 09:16:00 -04:00
Chris Farhood 8dbe99e32e feat(skills): support Gitea/Forgejo git hosts end-to-end
The skills source pipeline was hardcoded to GitHub conventions, so even
though the UI now accepts non-GitHub URLs, the server couldn't actually
fetch from anywhere else.

- github-fetch.ts: dispatch by host family (github.com → GitHub API +
  raw.githubusercontent.com; everything else → Gitea/Forgejo API v1 +
  /api/v1/repos/.../media for raw content).
- parseGitHubSourceUrl: also accept Gitea/Forgejo web URLs
  (/{owner}/{repo}/src/{branch|commit|tag}/{ref}/{path}).
- routes/company-skills.ts: drop the hostname='github.com' gate in
  deriveTrackedSkillRef so non-GitHub skills are still tracked.
- Generalize user-facing strings ('GitHub PAT' → 'PAT', 'GitHub source URL'
  → 'Source URL', etc.).

GitHub Enterprise (was assumed by '/api/v3') is no longer a special case —
non-github.com hosts are treated as Gitea/Forgejo. If GHE support is needed
later, add a per-source host-family override.
2026-05-14 11:49:51 -04:00
Chris Farhood 818a8eade8 feat(skills): support Gitea/Forgejo git hosts end-to-end
The skills source pipeline was hardcoded to GitHub conventions, so even
though the UI now accepts non-GitHub URLs, the server couldn't actually
fetch from anywhere else.

- github-fetch.ts: dispatch by host family (github.com → GitHub API +
  raw.githubusercontent.com; everything else → Gitea/Forgejo API v1 +
  /api/v1/repos/.../media for raw content).
- parseGitHubSourceUrl: also accept Gitea/Forgejo web URLs
  (/{owner}/{repo}/src/{branch|commit|tag}/{ref}/{path}).
- routes/company-skills.ts: drop the hostname='github.com' gate in
  deriveTrackedSkillRef so non-GitHub skills are still tracked.
- Generalize user-facing strings ('GitHub PAT' → 'PAT', 'GitHub source URL'
  → 'Source URL', etc.).

GitHub Enterprise (was assumed by '/api/v3') is no longer a special case —
non-github.com hosts are treated as Gitea/Forgejo. If GHE support is needed
later, add a per-source host-family override.
2026-05-14 11:49:49 -04:00
Chris Farhood 9e854e33d9 fix(skills): drop GitHub-only regex gate on PAT input
The PAT input on the skill import flow was hidden by a regex that matched
github.com or org/repo shorthand. Self-hosted Gitea/Forgejo/GitLab sources
got no auth field at all. Always show the input when a source is entered,
and label it generically ('Personal access token') instead of 'GitHub PAT'.

UI only — backend already accepts any token via /skills/:id/auth and
/companies/:companyId/skills POST {source, authToken}.
2026-05-14 11:41:40 -04:00
Chris Farhood 26e814a426 fix(skills): drop GitHub-only regex gate on PAT input
The PAT input on the skill import flow was hidden by a regex that matched
github.com or org/repo shorthand. Self-hosted Gitea/Forgejo/GitLab sources
got no auth field at all. Always show the input when a source is entered,
and label it generically ('Personal access token') instead of 'GitHub PAT'.

UI only — backend already accepts any token via /skills/:id/auth and
/companies/:companyId/skills POST {source, authToken}.
2026-05-14 11:41:39 -04:00
Chris Farhood fccbc7e39e feat(ci): install gitea tea CLI in fork Dockerfile
Adds the official Gitea 'tea' CLI (v0.14.0) alongside the existing forgejo
CLIs (fj, fj-ex, fgj). Useful when interacting with Gitea instances whose API
surface is covered by tea but not by the forgejo variants.
2026-05-14 10:04:18 -04:00
Chris Farhood 729ef021e9 feat(ci): install gitea tea CLI in fork Dockerfile
Adds the official Gitea 'tea' CLI (v0.14.0) alongside the existing forgejo
CLIs (fj, fj-ex, fgj). Useful when interacting with Gitea instances whose API
surface is covered by tea but not by the forgejo variants.
2026-05-14 10:03:30 -04:00
Dotta 9b275c332a fix(plugin): fail fast on upload protocol drift 2026-05-13 22:35:26 -04:00
Dotta 9035b70aa9 fix(plugin): close timed out kubernetes exec sockets 2026-05-13 22:35:26 -04:00
Dotta 1eccb71213 fix(plugin): guard kubernetes upload edge cases 2026-05-13 22:35:26 -04:00
Dotta f8b8303089 fix(plugin): harden kubernetes exec upload parsing 2026-05-13 22:35:26 -04:00
Dotta 3e998bda97 fix(plugin): close kubernetes exec timeout edges 2026-05-13 22:35:26 -04:00
Dotta 40e8638aa3 fix(plugin): harden kubernetes fast upload edges 2026-05-13 22:35:26 -04:00
Dotta 713fb6eb4e fix(plugin): share kubernetes shell quoting helper 2026-05-13 22:35:26 -04:00
Dotta 58d1b19206 fix(plugin): address kubernetes fast upload review 2026-05-13 22:35:26 -04:00
Dotta fcbbd50b60 feat(plugin): add kubernetes fast upload interceptor 2026-05-13 22:35:26 -04:00
Dotta a6c2e0392b fix(plugin): address kubernetes greptile follow-up
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-13 22:35:26 -04:00
Dotta a98c5cdfa9 fix(plugin): warn on missing kubernetes adapter env 2026-05-13 22:35:26 -04:00
Dotta 94fc81266f fix(plugin): reconcile kubernetes namespace labels 2026-05-13 22:35:26 -04:00
Dotta b248acd46c fix(plugin): align kubernetes config validation 2026-05-13 22:35:26 -04:00
Dotta c37e5919ce fix(plugin): restrict kubernetes cilium cidr egress 2026-05-13 22:35:26 -04:00
Dotta 45621aac53 fix(plugin): address kubernetes greptile timeouts 2026-05-13 22:35:26 -04:00
Dotta 39d81c732c fix(plugin): bound kubernetes sandbox execution 2026-05-13 22:35:26 -04:00
Dotta e691d30d12 fix(plugin): harden kubernetes sandbox orchestration 2026-05-13 22:35:26 -04:00
Dotta 163e3ca1a5 feat(plugin): add kubernetes sandbox provider 2026-05-13 22:35:26 -04:00
Chris Farhood 55d6c5bfa4 Merge upstream/master into dev (13 commits — includes #5922, #5938, blocked inbox, recovery actions) 2026-05-13 22:35:18 -04:00
Chris Farhood b6b81f2f06 Merge updated feat/plugin-acquire-lease-agent-id into dev (adds tests) 2026-05-13 18:54:32 -04:00
Chris Farhood 4c4eeaba2b test: cover agentId threading on plugin lease RPCs and call sites
Adds focused tests for every code path the agentId addition touches:

- environment-runtime.test.ts (4 new tests):
  - plugin-driver acquireLease forwards agentId in RPC payload when present
  - plugin-driver acquireLease omits agentId from RPC payload when null
  - sandbox-provider acquireLease forwards agentId when present
  - sandbox-provider resumeLease forwards agentId when reuseLease=true matches
  - seedEnvironment helper now exposes the seeded agentId

- environment-run-orchestrator.test.ts (2 new tests):
  - acquireForRun threads agentId through to runtime.acquireRunLease
  - logActivity records the same agentId on environment.lease_acquired
  - new vi.hoisted mocks for environmentService.getById + ensureLocalEnvironment

- agent-test-environment-routes.test.ts (1 new assertion):
  - ad-hoc operator test-environment probe calls acquireRunLease with
    agentId: null and heartbeatRunId: null (no agent context)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:52:28 -04:00
Dotta f4bed4a70f Release changelog v2026.513.0 (#5944)
## Summary

- Add `releases/v2026.513.0.md` covering the stable release range
`v2026.512.0..origin/master` (6 PRs).
- Includes one new DB migration (`0084_issue_recovery_actions`) under
the Upgrade Guide.
- No breaking changes detected; all PRs are core-maintainer commits so
the Contributors section is omitted.

## Highlights captured

- Source-scoped recovery actions
([#5599](https://github.com/paperclipai/paperclip/pull/5599))
- Blocked Inbox attention view
([#5603](https://github.com/paperclipai/paperclip/pull/5603))
- Local plugin development workflow
([#5821](https://github.com/paperclipai/paperclip/pull/5821))

## Test plan

- [ ] Reviewer confirms the highlight/improvement/fix categorization
matches release intent
- [ ] Reviewer confirms `0084_issue_recovery_actions` upgrade note is
accurate
- [ ] Reviewer signs off on `releases/v2026.513.0.md` for the stable
release cut

Generated under [PAP-9378](/PAP/issues/PAP-9378) via the
`release-changelog` skill.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 16:56:19 -05:00
Dotta 4142559c37 [codex] Add blocked inbox attention view (#5603)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies through
company-scoped issues, comments, approvals, and execution workspaces.
> - Operators need the Inbox to show not only active work, but also
blocked work that may need human or agent attention.
> - The existing inbox experience did not have a dedicated blocked-work
surface, so blocked tasks were harder to triage and resume deliberately.
> - Backend consumers also needed a compact attention signal that
distinguishes actionable blockers from covered or waiting blocker
states.
> - This pull request adds a Blocked Inbox tab backed by issue
blocker-attention metadata, shared validators, and UI helpers.
> - The benefit is a clearer triage path for stalled or blocked
Paperclip work without exposing external wait internals in the
operator-facing UI.

## What Changed

- Added shared issue blocker-attention types, validators, and exports
for the API/UI contract.
- Added backend blocker-attention computation and issue route support
for blocked inbox data.
- Added the Blocked Inbox tab, blocked reason chips, filtering/search
UI, responsive layouts, and Storybook stories.
- Updated inbox helpers and page behavior so toolbar controls only
appear where they apply.
- Added coverage for shared validators, server blocker-attention
behavior, blocked inbox UI helpers/components, and the Inbox page.
- Added a screenshot helper script for the blocked inbox Storybook
stories.
- Addressed Greptile feedback by making urgency sorting deterministic
for null stop times, avoiding full blocked-inbox list enrichment for
counts, and hardening the screenshot helper.

## Verification

- Rebased the branch cleanly onto `public-gh/master`.
- Confirmed the diff does not include `pnpm-lock.yaml`.
- Confirmed the diff does not include database migration files.
- Ran `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
ui/src/components/BlockedInboxView.test.tsx
ui/src/components/BlockedReasonChip.test.tsx
ui/src/lib/blockedInbox.test.ts ui/src/lib/inbox.test.ts
ui/src/pages/Inbox.test.tsx`.
- Ran `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`.
- Checked `ROADMAP.md`; this is scoped inbox/operator triage work and
does not duplicate a listed roadmap feature.
- Greptile Review is green on the latest head and all four Greptile
review threads are resolved.
- GitHub PR checks are green on the latest head: policy, security/snyk,
e2e, verify, Canary Dry Run, Greptile Review, and serialized server
suites 1/4 through 4/4.

## Risks

- Medium review surface because this touches the shared issue contract,
server issue services, and the Inbox UI together.
- Blocker-attention classification may need product tuning after
operators use it on real blocked queues.
- UI screenshots were not attached in this PR-opening pass; the branch
includes `scripts/screenshot-blocked-inbox.mjs` and Storybook stories
for visual capture.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex, GPT-5-based coding agent with shell, git, GitHub CLI,
GitHub connector, and Paperclip API tool use. Reasoning mode: medium.
Context window: not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 16:41:36 -05:00
Dotta d1a8c873b2 fix(remote-sandbox): harden host workspace resumes (#5922)
## Thinking Path

> - Paperclip orchestrates AI agents through a control plane while
adapters execute work in local, remote, or sandboxed runtimes.
> - Remote sandbox execution depends on a strict host-versus-remote
workspace boundary: the host prepares/restores files, while the adapter
command runs inside the sandbox cwd.
> - Jannes' PR #5823 identified host-side failure modes that were not
covered by replacement PR #5822.
> - Persisting a remote pod cwd in session params could poison the next
host heartbeat resume and make Paperclip inspect or upload system temp
roots.
> - Plugin sandbox providers also need a narrow way to receive
model-provider API keys without exposing the full server environment to
every plugin worker.
> - This pull request ports the host-side fixes from #5823 in the
current codebase style, with focused regression coverage.
> - The benefit is safer remote sandbox resumes and plugin worker
environment handling without broadening core plugin privileges.

## What Changed

- Persist host workspace cwd, not remote sandbox cwd, in `claude_local`
session params while retaining remote execution identity metadata.
- Reject saved session cwds that point at system roots before heartbeat
falls back to agent home workspace.
- Skip sockets, FIFOs, devices, and other non-file entries during
workspace restore snapshot capture/comparison.
- Pass a small model-provider API-key allowlist only to plugins
declaring `environment.drivers.register`.
- Added focused regression tests for remote Claude session params,
unsafe session cwd detection, plugin worker env filtering, and non-file
snapshot entries.

Credits: ports host-side fixes from Jannes' #5823.

## Verification

- `pnpm vitest run
packages/adapter-utils/src/workspace-restore-merge.test.ts
server/src/services/session-workspace-cwd.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/plugin-database.test.ts` (25 passed, 7 skipped by
existing embedded-Postgres host guard)
- `pnpm --filter @paperclipai/adapter-utils typecheck`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck`
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Low risk: changes are scoped to remote sandbox/session metadata,
workspace snapshot filtering, and plugin worker env setup.
- Sandbox-provider plugins now receive only the explicit model-provider
key allowlist; any provider needing another key name will need a
deliberate allowlist update.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled local code
execution and repository editing.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 16:23:04 -05:00
Dotta 012a738729 Add ordered sub-issue navigation (#5938)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through company-scoped
issues, comments, and execution context.
> - The issue detail page is the board surface where operators and
agents inspect a task in its parent/child workflow.
> - Ordered sub-issues need a low-friction way to move through work
without returning to the parent list after every issue.
> - Existing issue detail navigation only covered sibling transitions
and did not continue into a parent issue's first ordered child.
> - This pull request adds ordered previous/next navigation for issue
detail views and extends it to continue from a parent or last sibling
into the first direct child.
> - The benefit is a smoother review/execution path through hierarchical
work while preserving hidden issue filtering and dependency-aware
ordering.

## What Changed

- Added `IssueSiblingNavigation` and route-state handling so issue
detail footers can link to previous/next ordered issues.
- Extended sub-issue ordering helpers to build navigation from siblings
plus direct children, including root-parent and
last-sibling-to-first-child cases.
- Added page, component, and library tests for ordered sibling
navigation, child fallback navigation, hidden issues, and link
rendering.
- Fixed the quicklook blur/click race Greptile found by deferring close
until after portaled link clicks can complete, with a regression test.
- Polished the navigation landmark label so it remains accurate when the
next target is a direct child rather than a sibling.

## Verification

- `pnpm exec vitest run src/components/IssueLinkQuicklook.test.tsx
src/lib/issue-detail-subissues.test.ts
src/components/IssueSiblingNavigation.test.tsx
src/pages/IssueDetail.test.tsx --config vitest.config.ts` from `ui/` -
31 tests passed.
- `pnpm --filter @paperclipai/ui typecheck` - passed.
- `git diff --check` - passed.
- GitHub PR checks on latest head `34046be2` - passed: Greptile Review,
verify, e2e, Canary Dry Run, policy, Snyk, and serialized server shards.
- Screenshots: not captured in this heartbeat; this PR is a draft and
the changed states are covered by focused component/page tests.

## Risks

- Low risk; this is a UI navigation addition with no database or API
contract changes.
- The main behavioral risk is navigation ordering drift if
`workflowSort` expectations change later.
- The IssueDetail navigation now waits for child issue loading, which
avoids stale child fallback links but can delay footer navigation
briefly while data loads.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and shell
execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 15:43:51 -05:00
Dotta eb452fba30 Fix comment date binding regression (#5919)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and
issue comments are the primary durable communication surface between
operators and agents.
> - Commit `c445e592` (`fix(ui): fix message attribution for
agent-posted comments with user author IDs (#5780)`) added server-side
derived attribution for historical comments by scanning heartbeat runs
near comment timestamps.
> - That scan accidentally bound JavaScript `Date` objects directly into
postgres-js SQL fragments for the run timestamp window.
> - On real Postgres, that can fail while listing issue comments with
`ERR_INVALID_ARG_TYPE`, which makes comments disappear from issue pages
such as `PAP-9284`.
> - This pull request keeps the attribution behavior intact while
changing only the broken timestamp binding path.
> - The benefit is that comments load again without weakening the
conservative attribution recovery introduced by `c445e592`.

## What Changed

- Convert the derived-attribution heartbeat-run window bounds to ISO
timestamp strings before binding them into SQL, with explicit
`::timestamptz` casts.
- Add an embedded Postgres regression that inserts a heartbeat run and
user-authored comment, then verifies `issueService.listComments()`
returns the comment while the attribution scan runs.
- Delete `heartbeat_runs` during the issue service test cleanup before
deleting agents so the new test data does not leak across cases.

## Verification

- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t
"lists user comments when derived run attribution scans a timestamp
window"`
- `pnpm --filter @paperclipai/server typecheck`
- `git diff --check`

## Risks

- Low risk. The change is limited to how timestamp parameters are bound
for an existing query.
- The derived attribution logic remains conservative and still requires
exact run-log proof before relabeling a comment.
- The regression uses embedded Postgres so it covers the postgres-js
binding path that failed in production-like local runs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter; GPT-5
coding-agent family with local terminal, file-editing, and git/GitHub
CLI tool use. Exact hosted model deployment ID is not exposed by this
local adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: server-side comment API bugfix)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no documented behavior or command changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 12:56:51 -05:00
Chris Farhood 7a8afbb719 Merge pull request #12 from farhoodlabs/dev
Dev
2026-05-12 16:39:28 -07:00
Dotta b947a7d76c [codex] Improve local plugin development workflow (#5821)
## Thinking Path

> - Paperclip is the control plane for autonomous AI-agent companies.
> - Plugins are the extension point for adding capabilities without
expanding the core product surface.
> - Local plugin development needed a tighter CLI-first loop so plugin
authors can scaffold, run, install, inspect, and reload plugins without
reaching into internal package paths.
> - The server plugin install path also needed local-path handling that
keeps plugin identity, dashboard routes, and development watchers
coherent.
> - This pull request adds the CLI scaffold/install workflow, fixes the
server and SDK edge cases that blocked that loop, and updates the
agent-facing plugin creation skill and docs.
> - The benefit is that contributors can develop plugins from local
folders with a documented, repeatable happy path.

## What Changed

- Added `paperclipai plugin init` coverage and CLI wiring for local
plugin scaffolding.
- Improved local plugin install handling, plugin key route resolution,
dashboard capability behavior, and dev watcher startup/reload behavior.
- Fixed plugin SDK worker entrypoint validation for symlinked package
layouts.
- Added targeted tests for plugin init, server plugin authz/watcher
behavior, SDK worker host validation, and the authoring smoke example.
- Added a short local plugin development guide and refreshed the plugin
authoring guide plus `paperclip-create-plugin` skill instructions.

## Verification

- `pnpm run preflight:workspace-links && pnpm --filter
@paperclipai/plugin-sdk build && pnpm --filter
@paperclipai/create-paperclip-plugin typecheck && pnpm --filter
paperclipai typecheck && pnpm --filter @paperclipai/plugin-sdk typecheck
&& pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project paperclipai
cli/src/__tests__/plugin-init.test.ts`
- `pnpm exec vitest run --project @paperclipai/plugin-sdk
packages/plugins/sdk/tests/worker-rpc-host.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/plugin-dev-watcher.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/plugin-routes-authz.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --dir packages/plugins/examples/plugin-authoring-smoke-example
test`
- Confirmed `pnpm-lock.yaml` is not included in the PR diff.

## Risks

- Medium risk: this touches plugin install routing, CLI command
behavior, and the local development watcher.
- Local path plugin installs execute trusted local code by design; the
new docs call out that trust boundary.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local shell and git
workflow, medium reasoning effort. Context window details were not
exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

UI screenshots: not applicable; this PR changes CLI/server/plugin docs
and tests, not board UI rendering.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-12 17:38:24 -05:00
Dotta 0808b388ee [codex] Add source-scoped recovery actions (#5599)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where work
must end with a clear disposition rather than ambiguous agent liveness.
> - Recovery currently detects stalled or missing-next-step issues, but
source issue recovery can become split across child recovery issues,
blockers, and comments.
> - That makes it harder for operators and agents to see who owns
recovery and what exact action is needed on the original issue.
> - Source-scoped recovery actions give the original issue a first-class
active recovery state with owner, evidence, wake policy, and resolution
outcome.
> - This pull request adds the recovery-action data model, backend
reconciliation and resolution APIs, and board UI indicators/actions.
> - The benefit is clearer stalled-work recovery without losing source
issue context or relying on comments as the liveness path.

## What Changed

- Added the `issue_recovery_actions` schema, shared
types/constants/validators, and an idempotent
`0084_issue_recovery_actions` migration ordered after current `master`
migrations.
- Updated stranded/missing-disposition recovery to create source-scoped
recovery actions, wake the recovery owner on the source issue, and avoid
locking the source issue for recovery-action wakes.
- Added API support for reading active recovery actions on issue
detail/list surfaces and resolving them with restored, blocked,
cancelled, or false-positive outcomes.
- Require blocked recovery resolutions to have an unresolved first-class
blocker, and removed the UI shortcut that could mark recovery blocked
without a blocker selection path.
- Surfaced recovery indicators/actions in the issue UI, blocker notices,
active run panels, issue rows, and Storybook coverage.
- Updated docs and focused tests for recovery semantics, ownership,
races, stale comments, and UI behavior.

## Verification

- `pnpm exec vitest run
server/src/__tests__/issue-recovery-actions.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
ui/src/components/IssueRecoveryActionCard.test.tsx
ui/src/components/IssueBlockedNotice.test.tsx ui/src/api/issues.test.ts`
— 5 files, 72 tests passed.
- `pnpm --filter @paperclipai/shared typecheck` — passed.
- `pnpm --filter @paperclipai/db typecheck` — passed, including
migration numbering check.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Follow-up verification after blocker-resolution guard: `pnpm exec
vitest run server/src/__tests__/issue-recovery-actions.test.ts
ui/src/components/IssueRecoveryActionCard.test.tsx
ui/src/api/issues.test.ts` — 3 files, 27 tests passed.
- Follow-up `pnpm --filter @paperclipai/server typecheck` — passed.
- Follow-up `pnpm --filter @paperclipai/ui typecheck` — passed.
- UI states are available in
`ui/storybook/stories/source-issue-recovery.stories.tsx`; screenshot
capture helper is `scripts/screenshot-recovery-card.cjs`.

## Risks

- Medium: recovery behavior changes from child recovery issue ownership
toward source-scoped actions, so operators may see stalled-work state in
new places.
- Migration risk is mitigated by using the next migration slot after
`master` and making the table/constraints/index creation idempotent for
anyone who previously applied the old branch-local
`0082_dizzy_master_mold` migration.
- Existing child recovery issue paths are still guarded for
already-created recovery issues, but new source-scoped flows should be
watched in CI and Greptile review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use enabled for shell, Git,
GitHub, and local test execution. Context window not exposed by the
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-12 09:37:15 -05:00
Chris Farhood b61455373c Merge updated feat/plugin-acquire-lease-agent-id into dev (adds resumeLease agentId) 2026-05-12 07:36:19 -04:00
Chris Farhood 73f4685729 feat(plugin-sdk): also thread agentId into environmentResumeLease params
Symmetric with the acquireLease change. Lets plugin-backed sandbox
providers reject a reusable lease whose stored agentId doesn't match
the current run's agent, forcing the host to acquire a fresh lease
instead of stomping the previous agent's workspace state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 07:36:08 -04:00
Chris Farhood 7cee02ddf3 Merge branch 'feat/plugin-acquire-lease-agent-id' into dev
Thread agentId into PluginEnvironmentAcquireLeaseParams + host call sites
so plugin-backed sandbox providers (e.g. paperclip-plugin-k8s) can scope
lease state per-agent without needing an SDK callback or DB lookup.
2026-05-12 07:34:00 -04:00
Chris Farhood 417782a6ec feat(plugin-sdk): thread agentId into environmentAcquireLease params
Add an optional agentId field to PluginEnvironmentAcquireLeaseParams and
thread it through the host's environment-runtime + run-orchestrator call
sites so plugin-backed sandbox providers can scope lease state (subdirs,
PVCs, etc.) per agent without an SDK callback or DB lookup.

The field is required-but-nullable on the internal EnvironmentDriverAcquireInput
(string | null) so every call site has to think about whether it has an
agent context. Ad-hoc operator probes (agent test-environment route)
pass null. The plugin RPC payload omits the field entirely when null,
keeping wire compatibility with older plugin worker SDKs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 07:33:10 -04:00
Devin Foley c445e59256 fix(ui): fix message attribution for agent-posted comments with user author IDs (#5780)
## Thinking Path

> - Paperclip’s issue chat is an audit surface: reviewers need to trust
who actually authored a message.
> - Some historical agent comments were persisted with `authorUserId`
and no surviving `createdByRunId`, so the UI rendered real agent output
as if it came from the board user.
> - A pure timestamp-window fallback is too risky because human
reviewers can comment while agents are running.
> - The safe recovery path is to derive attribution only when the server
can prove it from same-issue run logs that include the exact posted
comment id, then let the chat renderer prefer that recovered agent
attribution.
> - This keeps historical threads trustworthy without mutating old
database rows or guessing in ambiguous cases.

## What Changed

- Added shared `IssueComment` fields for derived attribution so server
and UI can carry recovered `derivedAuthorAgentId`,
`derivedCreatedByRunId`, and `derivedAuthorSource` consistently.
- Added server-side attribution recovery in
`server/src/services/issues.ts` that reads same-issue run logs and only
derives agent authorship when a run log contains the exact `comment id:
...` emitted during posting.
- Updated issue chat rendering in `ui/src/lib/issue-chat-messages.ts` to
prefer direct agent authorship, then activity-log `runAgentId`, then the
server-derived attribution.
- Removed the unsafe UI-only run-window fallback from
`ui/src/pages/IssueDetail.tsx` so human comments posted during an active
run are not silently relabeled as agent output.
- Added regression coverage for both the run-log derivation path and the
chat-rendering fallback behavior.
- Bounded server-side run-log enrichment to 8 concurrent reads per
request and removed the unused `issueCommentSchema` declaration during
PR cleanup.

## Verification

- `pnpm exec vitest run ui/src/lib/issue-chat-messages.test.ts
server/src/__tests__/issues-service.test.ts`
- `pnpm test:run:general`
- Live validation on May 12, 2026 in `PAPA-322`: confirmed the
previously misattributed historical comments on `PAPA-316` now render as
Claude-authored on `http://goldie.gerbil-company.ts.net:3100`.
- Reviewer check: open `PAPA-316` in the running instance and confirm
historical comments such as `## Investigation: exe.dev 422 + codex
re-test` render under Claude instead of the board user.

## Risks

- Low risk. The change is scoped to comment attribution recovery and
rendering.
- Derived attribution is intentionally conservative: if there is no
exact run-log proof, the comment remains user-authored instead of
guessing.
- Run-log recovery depends on retained same-issue logs, so older
comments without that evidence remain unchanged.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter (GPT-5-class
coding agent with tool use in the local Paperclip runtime; the exact
deployment/model ID is not surfaced by this workspace).

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-12 01:20:49 -07:00
Dotta 9746dab4e8 Bump release changelog to v2026.512.0 (#5764)
## Summary

PR [#5366](https://github.com/paperclipai/paperclip/pull/5366) already
merged the v2026.511.0 changelog. This follow-up bumps the artifact to
the actual cut date and drops the pre-alpha sandbox work per maintainer
feedback.

- **Rename** `releases/v2026.511.0.md` → `releases/v2026.512.0.md`
- **Bump header / date** to `# v2026.512.0` / `> Released: 2026-05-12`
- **Drop new sandbox content** (pre-alpha, not yet ready):
- Daytona sandbox provider plugin highlight
([#5580](https://github.com/paperclipai/paperclip/pull/5580),
[#5586](https://github.com/paperclipai/paperclip/pull/5586))
- Cursor sandbox support improvement
([#4803](https://github.com/paperclipai/paperclip/pull/4803))
- Cursor sandbox runtime resolution fix
([#5446](https://github.com/paperclipai/paperclip/pull/5446))
- Sandbox provider messaging polish
([#4902](https://github.com/paperclipai/paperclip/pull/4902))
- **Add LLM Wiki plugin package highlight**
([#5716](https://github.com/paperclipai/paperclip/pull/5716)) — the
package itself landed on master after #5366 merged.
- **Update Upgrade Guide closer** to mention only the `cursor_cloud`
adapter as opt-in.

The `cursor_cloud` adapter is kept in (adapter, not sandbox). The
exe.dev and Cloudflare sandbox provider plugins that landed since the
merge are also excluded as pre-alpha.

No breaking changes; the nine new migrations (`0075`–`0083`) carry over
unchanged from the merged 511 file.

## Test plan

- [ ] Maintainer review of the dropped entries — confirm I caught
everything sandbox-related you wanted out
- [ ] Confirm Cursor Cloud adapter staying in is intentional (flag for
removal if not)
- [ ] Confirm LLM Wiki plugin package highlight phrasing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 22:06:43 -05:00
Dotta 563413ecd4 Fix LLM wiki type contracts (#5758)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and
plugins extend that control plane without bloating core.
> - The LLM Wiki plugin adds a knowledge surface through the plugin
runtime and shared plugin UI components.
> - After the LLM Wiki work merged to `master`, CI exposed TypeScript
contract drift between plugin code, SDK component types, and update
settings types.
> - The ingestion settings update path intentionally accepts partial
source toggles, but its type intersected with the full settings shape
and required every source key.
> - The LLM Wiki UI also passes managed routine default-drift metadata
through the shared routine list item shape, but that metadata was
missing from the public item type.
> - This pull request narrows those type contracts to match the existing
runtime behavior.
> - The benefit is restoring typecheck on `master` with a small,
non-behavioral follow-up.

## What Changed

- Added a `WikiEventIngestionSettingsUpdate` type that permits partial
source updates without weakening normalized stored settings.
- Added managed routine default-drift metadata to the plugin SDK
`ManagedRoutinesListItem` type.
- Mirrored that managed routine default-drift type in the host UI
component item type.

## Verification

- `pnpm --filter @paperclipai/plugin-llm-wiki typecheck`
- `pnpm --filter @paperclipai/plugin-sdk typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `git diff --check`

## Risks

- Low risk. This is a TypeScript type-contract fix only; no runtime
behavior or database schema changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled local repository
editing and command execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Notes on checklist applicability: no screenshots are included because
the UI change is a shared type-only contract update with no visual
behavior change; no docs were required because no behavior or commands
changed.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 21:07:06 -05:00
github-actions[bot] 94ce7af715 chore(lockfile): refresh pnpm-lock.yaml (#5756)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-11 20:46:58 -05:00
Dotta 508355b8fc [codex] Add LLM Wiki plugin package to master (#5716)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system is the extension surface for optional product
capabilities without baking every workflow into core.
> - The LLM Wiki plugin package was reviewed in stacked PR #5592, which
targeted `pap-9173-llm-wiki-rest`.
> - The stack base PR #5597 merged to `master` before #5592 was merged
into that branch, so the plugin package never reached `master`.
> - A direct PR from `pap-9173-llm-wiki-rest` back to `master` would be
noisy because that branch has diverged from current `master`.
> - This pull request reapplies the reviewed
`packages/plugins/plugin-llm-wiki/` package onto current `master` and
updates Docker deps-stage manifest coverage.
> - The branch intentionally no longer changes `pnpm-workspace.yaml`
after maintainer feedback; because the new package is now a root
workspace importer, the remaining integration question is how
maintainers want the root lockfile handled under the current PR policy.

## What Changed

- Added the LLM Wiki plugin package under
`packages/plugins/plugin-llm-wiki/` from the merged PR #5592 head.
- Preserved the post-review cleanup from #5592: generated
design/screenshot artifacts are not committed, and `src/ui/index.tsx` /
`src/wiki.ts` are small public entrypoints.
- Added the new plugin package manifest to the Docker deps stage so
policy can validate package manifest coverage.
- Removed the earlier `pnpm-workspace.yaml` exclusion per maintainer
request, so the plugin is included by the existing `packages/plugins/*`
workspace glob.

## Verification

Current head:
- PGlite migration harness: ran migrations 001-003, verified old
non-space distillation unique constraints were removed, inserted
duplicate cursor and work-item keys in a second space, then reran
migration 003 successfully
- `node ./scripts/check-docker-deps-stage.mjs`
- `git diff --check`

Known current-head install result after removing the workspace
exclusion:
- `pnpm install --frozen-lockfile` fails because `pnpm-lock.yaml` has no
importer for `packages/plugins/plugin-llm-wiki/package.json`.

Previously verified on the same plugin source before the
workspace-exclusion removal:
- `pnpm --filter @paperclipai/plugin-sdk build`
- `cd packages/plugins/plugin-llm-wiki && pnpm install --lockfile=false
&& pnpm test`

## Risks

- The branch now includes `packages/plugins/plugin-llm-wiki` in the root
workspace but does not update `pnpm-lock.yaml`. Root frozen install will
fail until maintainers choose a lockfile path that fits repo policy.
- Committing `pnpm-lock.yaml` directly on this PR conflicts with the
current PR policy check, while excluding the package from
`pnpm-workspace.yaml` was rejected in maintainer feedback.
- The package includes UI code already reviewed in #5592; generated
screenshot/design artifacts were intentionally removed per maintainer
request, so visual review should regenerate screenshots locally if
needed.
- The package depends on plugin host support from #5597, which is
already merged to `master`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution
enabled; context window not exposed.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run the targeted checks listed above
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Stack context: #5592 was merged into `pap-9173-llm-wiki-rest` after
#5597 had already merged that branch to `master`, so this follow-up PR
is needed to carry the plugin package itself into `master`.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 20:45:41 -05:00
Chris Farhood 30ef61bb25 Merge pull request #11 from farhoodlabs/dev
Dev
2026-05-11 17:02:31 -07:00
Chris Farhood 872dd664ed revert(secrets): drop fork's usages-tracking + delete guard
Now that upstream's #5429 provides provider-based secrets management with formal
bindings (companySecretBindings table populated at config-save time) plus the
/secrets/:id/usage endpoint backed by listBindingReferences(), the fork's
parallel usages() scan is redundant for agent/routine bindings.

The fork's scan did cover one path upstream doesn't track: skill metadata
sourceAuthSecretId references. Dropping this means accidental deletion of a
skill's auth secret is no longer rejected — accepted as a chase-upstream tradeoff.

- server/src/services/secrets.ts: drop usages(), SecretUsage* types, in-use guard
  in remove(), and companySkills/agentService imports
- server/src/routes/secrets.ts: drop GET /secrets/:id/usages route
- ui/src/api/secrets.ts: drop usages() client method

Typechecks clean on server and ui.
2026-05-11 19:38:56 -04:00
Chris Farhood d9d9bbcf06 fix(secrets): drop fork's CompanySecrets page and dedupe sidebar/route
The merge of upstream's #5429 (provider vaults + AWS Secrets Manager) added a
new 2,155-line Secrets.tsx page covering everything the fork's 528-line
CompanySecrets page did, plus vault management and AWS import. Auto-merge took
both sides' sidebar entry and route registration, so React Router picked the
fork's older page first and upstream's AWS UI never rendered.

- ui/src/App.tsx: drop CompanySecrets import + duplicate route at /company/settings/secrets
- ui/src/components/CompanySettingsSidebar.tsx: drop duplicate Secrets nav item
- ui/src/pages/CompanySecrets.tsx: delete (superseded)

Server-side delete guard (services/secrets.ts: refuse delete when secret is
referenced by agents or skills) is preserved and still enforced; upstream's
page surfaces the API error message in place of the fork's inline reference list.
2026-05-11 19:32:08 -04:00
Chris Farhood c2a49879d5 fix(ci): add cursor-cloud adapter package.json to fork Dockerfile deps stage
Upstream #5664 added the cursor-cloud adapter; the .farhoodlabs/Dockerfile
overlay was missing the deps-stage COPY for its package.json, so pnpm install
didn't wire it as a workspace package and the build stage failed type-checking
cursor-cloud's UI sources from ui's tsc -b.
2026-05-11 19:16:51 -04:00
Chris Farhood 5d0d076704 chore(docs): remove FAR-108 from pending PR list (k8s adapter dropped) 2026-05-11 18:02:15 -04:00
Chris Farhood 08dc3d9ff4 Merge upstream/master into dev (76 commits)
Resolved 5 conflicts:
- .github/workflows/docker.yml, release.yml: kept fork stubs (CI handled by build-prod/build-dev)
- server/src/routes/secrets.ts: kept fork's /usages route alongside upstream's /usage, /access-events
- server/src/services/secrets.ts: kept fork's usages() function and in-use deletion guard,
  layered before upstream's soft-delete + provider cleanup in remove()
- ui/src/api/secrets.ts: kept fork's usages() method alongside upstream's vault methods

Typechecks pass on @paperclipai/shared, @paperclipai/server, @paperclipai/ui.
2026-05-11 18:01:56 -04:00
Devin Foley ad0bb57350 Fix exe.dev sandbox installs for gemini/opencode local adapters (#5737)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, including
running adapter CLIs inside remote sandboxes
> - The QA matrix in PAPA-316 spins up local-runtime adapters
(claude/gemini/opencode) against both SSH and the new exe.dev sandbox
provider, and "Test" exercises the same install + probe path the real
runtime uses
> - On exe.dev the QA matrix failed at three different points:
SSH/sandbox secret refs would not resolve, gemini-local could not find
npm, and opencode-local installed a binary that was not on the
probe-shell PATH
> - These are all environment-shape issues the runtime should handle,
not regressions in any individual adapter, so they need to be fixed in
the shared install/resolve layer before the matrix can pass
> - This pull request wires the environment id through to secret-ref
resolution, bootstraps npm from a portable Node tarball when the sandbox
image lacks Node, and symlinks the opencode binary into a directory that
non-login shells see
> - The benefit is that the QA matrix passes end-to-end on exe.dev, and
any future sandbox provider that ships without Node or relies on rc-file
PATH wiring gets the same fixes for free

## What Changed

- `server/src/services/environment-execution-target.ts`: pass the
environment `id` into `resolveEnvironmentDriverConfigForRuntime` for
both the sandbox and SSH branches, so `privateKeySecretRef` /
sandbox-provider secret refs (e.g. exe.dev `apiKey`) can resolve against
the secret store at runtime instead of throwing `Runtime secret
resolution requires an environment id`.
- `packages/adapter-utils/src/sandbox-install-command.ts`: extend
`buildSandboxNpmInstallCommand` with an `ENSURE_NPM_PREAMBLE` that, when
`npm` is missing, downloads a portable Node v22 tarball into
`$HOME/.local` and sets `PAPERCLIP_NPM_BOOTSTRAPPED=1` so the install
step skips sudo (sudo's `secure_path` would lose the freshly-installed
`npm` in `$HOME/.local/bin`). Distro-packaged Node from apt-get is
intentionally avoided because it tends to be too old to parse modern JS
syntax used by `@google/gemini-cli`.
- `packages/adapters/gemini-local/src/index.ts`: switch the hardcoded
`npm install -g @google/gemini-cli` to `buildSandboxNpmInstallCommand`,
so gemini-local picks up the same sudo-aware + npm-bootstrap behavior as
the other local adapters.
- `packages/adapters/opencode-local/src/index.ts`: append a step to the
install command that symlinks `$HOME/.opencode/bin/opencode` into
`$HOME/.local/bin`. The upstream installer only adds `~/.opencode/bin`
to PATH via `~/.bashrc`, which non-login `sh -c` probe invocations do
not source.
- `packages/adapter-utils/src/sandbox-install-command.test.ts`: cover
the new preamble plus the unchanged root/sudo/user-prefix branches.

## Verification

- `cd packages/adapter-utils && npm test -- sandbox-install-command`
(passes; new "bootstraps npm from a portable Node tarball when missing"
case is included).
- Manual: ran the in-app `Test` action against the QA matrix dev
instance for `QA exe.dev Claude`, `QA exe.dev Gemini`, and `QA exe.dev
OpenCode` — all three now report `status=pass` including the hello
probe. `QA SSH Claude` also passes; without the environment-id fix, SSH
resolution threw before the wrapper / install fixes could run.
- Suggested reviewer check: re-run the matrix on a fresh exe.dev
environment and confirm the install step no longer hits `npm: command
not found` for gemini and the opencode probe no longer hits `opencode:
command not found`.

## Risks

- Low/medium. The npm bootstrap pins Node `v22.11.0` from
`nodejs.org/dist`; if that URL becomes unreachable the install will fail
with a clear `curl` error rather than corrupting state. The bootstrap
path is only taken when `npm` is genuinely missing, so existing sandbox
images that ship with Node are unaffected.
- The opencode symlink uses `ln -sf` into `$HOME/.local/bin`, which is
created with `mkdir -p`; idempotent on re-install.
- The `id` change is a strict additive: callers previously got
`undefined` and only the secret-ref code paths actually read it. No
behavior change for environments without secret refs.

## Model Used

- Claude (Anthropic), `claude-opus-4-7`, with extended thinking and tool
use enabled. Iterated through the Paperclip QA matrix harness; no other
model assisted.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (n/a — runtime/install path only)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 14:28:22 -07:00
Devin Foley eaa80cf88b Enable CI publishing for cursor-cloud, cloudflare, and exe.dev release packages (#5728)
## Thinking Path

> - Paperclip orchestrates AI agents, and its release flow depends on
explicit package enrollment for automated publishing.
> - The release registry tooling uses
`scripts/release-package-manifest.json` as the source of truth for which
public packages CI is allowed to publish.
> - The cursor cloud adapter plus the Cloudflare and exe.dev sandbox
plugins are public packages that now need to ship through the normal CI
release path.
> - Leaving those entries at `publishFromCi: false` keeps release
automation and registry validation out of sync with the intended package
set.
> - This pull request updates only those three manifest entries and
leaves the release tooling itself unchanged.
> - The benefit is that CI release enrollment now matches the packages
we intend to publish, with the existing manifest checks continuing to
guard correctness.

## What Changed

- Enabled CI publishing for `@paperclipai/adapter-cursor-cloud` in
`scripts/release-package-manifest.json`.
- Enabled CI publishing for `@paperclipai/plugin-cloudflare-sandbox` in
`scripts/release-package-manifest.json`.
- Enabled CI publishing for `@paperclipai/plugin-exe-dev` in
`scripts/release-package-manifest.json`.

## Verification

- `node ./scripts/release-package-map.mjs check`
- `pnpm test:release-registry`

## Risks

- Low risk. This is a manifest-only change, but a wrong enrollment flag
would affect release automation, so the release-registry checks are the
main guardrail.

## Model Used

- OpenAI GPT-5.4 via Paperclip `codex_local` (`adapterConfig.model:
gpt-5.4`), high reasoning effort, with tool use and shell/code
execution. The adapter does not expose a separate context-window value
in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-11 11:47:07 -07:00
Dotta 8af38fb054 Revert "fix(ui): prevent lossy cron rewrites + redesign routine triggers tab" (#5725)
## Thinking Path

> - Paperclip orchestrates AI agents through visible, governable task
and routine workflows.
> - Routines are the recurring-work surface where operators configure
schedules, runs, and activity.
> - PR #3569 moved routine operational tabs into the right-hand
properties panel while also redesigning the routine trigger editor.
> - The current product request is to remove that routine properties
right-tab change for now and come back to it later.
> - The cleanest way to do that is a direct revert of #3569 on top of
current `master`, which already includes the #5703 revert.
> - This pull request restores the pre-#3569 routine trigger/detail
behavior and removes the right-tab properties-panel routine layout.
> - The benefit is a simple, reviewable rollback with no schema or API
changes.

## What Changed

- Reverted #3569: `fix(ui): prevent lossy cron rewrites + redesign
routine triggers tab`.
- Restored the previous `RoutineDetail` inline tabs and trigger editing
flow.
- Restored the earlier `ScheduleEditor` implementation.
- Removed the UI components and tests introduced by #3569:
`ConfirmDialog`, `TriggerDialog`, `TriggerListCard`, and
`ScheduleEditor.test.ts`.

## Verification

- `git diff --check origin/master..HEAD`
- `pnpm vitest run ui/src/pages/Routines.test.tsx
ui/src/components/RoutineHistoryTab.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

Notes:

- `pnpm install --frozen-lockfile` was run in the clean worktree before
verification. It completed with known workspace bin-link warnings for
`paperclip-plugin-dev-server` because the plugin SDK `dist/dev-cli.js`
has not been built in that fresh worktree.
- `Routines.test.tsx` emitted existing Radix dialog accessibility
warnings during the test run; the tests passed.

### Screenshots

This is a direct revert of #3569. The visual state after this PR
corresponds to the old screenshot from #3569, and the state being
removed corresponds to the new/right-panel screenshots from #3569.

| Before this revert | After this revert |
| --- | --- |
| <img width="1410" height="1325"
alt="routine-triggers-before-this-revert"
src="https://github.com/user-attachments/assets/d70dd35b-e72f-4fc6-bb21-be9b0d92b3b1"
/> | <img width="721" height="707"
alt="routine-triggers-after-this-revert"
src="https://github.com/user-attachments/assets/260bb682-32cb-4dff-b038-d55e45824b04"
/> |

Right-hand properties panel state removed by this revert:

<img width="1409" height="830" alt="routine-properties-panel-removed"
src="https://github.com/user-attachments/assets/f1d42f07-7cd3-4614-8e93-5b585affd4bf"
/>

## Risks

- Low technical risk: this is a clean Git revert of a UI-only PR.
- Product risk: #3569 also fixed lossy cron editing and added broader
schedule presets, so this rollback intentionally removes those
improvements along with the right-tab routine layout.
- Follow-up risk: if we want only the schedule-editor fixes back later,
they should be reintroduced separately from the routine properties-panel
layout.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with local shell and
GitHub CLI access. Context window size was not exposed in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 13:24:48 -05:00
Dotta 0c6f9bdcf8 Revert "fix(ui): improve routine properties panel and history UX" (#5723)
## Thinking Path

> - Paperclip orchestrates AI agents through visible, governable task
and routine workflows.
> - The routines UI includes the routine detail page, properties panel,
history tab, and shared sidebar components.
> - PR #5703 changed that workflow by widening the routine properties
panel and moving revision inspection/comparison into dialogs.
> - The product direction for that change is being paused for now, so
the safest path is a direct revert instead of partial edits.
> - This pull request reverts merge commit
`74cb560c41305ac3283067d1ec8d3060ffdc28cb` from #5703.
> - The benefit is restoring the prior routines UI behavior while
keeping the revert easy to review and re-apply later if needed.

## What Changed

- Reverted #5703: `fix(ui): improve routine properties panel and history
UX`.
- Restored the previous routine properties panel sizing, panel context
API, routine detail layout, and routine history rendering behavior.
- Removed the reverted sidebar pane test additions and restored the
previous focused routine history test expectations.

## Verification

- `git diff --check origin/master..HEAD`
- `pnpm vitest run ui/src/components/RoutineHistoryTab.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

### Screenshots

This is a direct revert of #5703. The visual state after this PR
corresponds to the "Before" screenshots from #5703, and the state being
removed corresponds to the "After" screenshots from #5703.

#### Trigger Panel Width

| Before this revert | After this revert |
| --- | --- |
| <img width="1742" height="1288" alt="triggers-before-this-revert"
src="https://github.com/user-attachments/assets/9e818978-283c-49a3-9401-879be550c67b"
/> | <img width="1741" height="1289" alt="triggers-after-this-revert"
src="https://github.com/user-attachments/assets/2a391769-c355-4219-8da3-d1ea18698430"
/> |

#### History Panel

| Before this revert | After this revert |
| --- | --- |
| <img width="1741" height="1290" alt="history-before-this-revert"
src="https://github.com/user-attachments/assets/4c139238-8494-4438-89e1-4277d05bc3aa"
/> | <img width="1739" height="1289" alt="history-after-this-revert"
src="https://github.com/user-attachments/assets/eaea4f3d-bb65-4af6-b67f-3ba3026fe0c9"
/> |

## Risks

- Low technical risk: this is a clean Git revert of a recently merged
UI-only PR.
- Product risk: the routine properties panel and revision history return
to the older, narrower workflow that #5703 was improving.
- Re-application risk: future work that wants the #5703 behavior back
should re-apply it deliberately rather than cherry-picking around this
revert.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with local shell and
GitHub CLI access. Context window size was not exposed in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 13:10:40 -05:00
Dotta 21404e8a34 [codex] Fix Docker build without LLM wiki plugin package (#5714)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and its
Docker image needs to build from the checked-in core repository.
> - The Docker `deps` stage copies workspace package manifests before
running `pnpm install --frozen-lockfile` so dependency installation can
be cached.
> - Current `master` copied
`packages/plugins/plugin-llm-wiki/package.json`, but that plugin package
has not been merged into core yet.
> - Docker fails before install with a missing build-context path, so
the release image cannot build from the current repository state.
> - This pull request removes the premature plugin manifest copy while
leaving the plugin SDK and existing sandbox plugin package copies
intact.
> - The benefit is that the Docker build no longer depends on an
unmerged plugin package.

## What Changed

- Removed the `packages/plugins/plugin-llm-wiki/package.json` copy from
the Dockerfile `deps` stage.

## Verification

- `git diff --check`
- Static Dockerfile source validation: parsed non-stage `COPY` sources
and confirmed every source exists in the build context.
- Attempted `docker build --target deps --progress=plain -t
paperclip-pap-9235-deps-check .`, but Docker is unavailable in this
execution environment: `Cannot connect to the Docker daemon at
unix:///Users/dotta/.docker/run/docker.sock`.

## Risks

- Low risk. The removed path points to a package that is absent from the
repository, so retaining it is what breaks the build. The plugin can add
its manifest copy back when the package itself lands.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex using GPT-5, tool-enabled coding agent in a local
repository workspace. Exact context-window metadata is not exposed in
this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 10:19:08 -05:00
Devin Foley 5a64cf52a1 Add exe.dev sandbox provider plugin (#5688)
> _Stacked on top of #5685#5686#5687. Diff against master includes
commits from earlier PRs in the stack — review focuses on the two new
commits (`Add long-secret textarea variant to JsonSchemaForm
SecretField` + `Add exe.dev sandbox provider plugin`)._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs in a sandbox environment, and operators choose the
provider — today E2B, Daytona, and (in this stack) Cloudflare
> - exe.dev offers per-VM sandboxes via a small CLI / HTTP API — useful
for operators who want full Linux VMs (vs container/runtime-only
sandboxes)
> - The plugin shape mirrors the e2b plugin: lifecycle hooks (`new`,
`ls`, `rm`) drive exe.dev's CLI; SSH plumbing handles direct VM access
for adapters that need it
> - exe.dev VMs come up bare — `node` is not preinstalled, so the
Paperclip sandbox callback bridge (a Node script) needs Node 20
installed at VM init via `--setup-script`. The plugin defaults the setup
script to a Nodesource install
> - The auth field accepts long SSH private keys, which need a textarea
variant of the existing `SecretField` in `JsonSchemaForm` — added behind
a `maxLength > THRESHOLD` opt-in so other secret fields are unaffected
> - The benefit is that operators get exe.dev as a fully working sandbox
provider out of the box, with no manual VM provisioning required

## What Changed

**Shared UI support (`Add long-secret textarea variant to JsonSchemaForm
SecretField`):**

- `ui/src/components/JsonSchemaForm.tsx` + new
`JsonSchemaForm.test.tsx`: when a secret-formatted field declares
`maxLength` larger than the existing single-line threshold, render a
monospace textarea instead of the masked input. Short secrets (API keys,
tokens) keep the existing masked-input + show/hide toggle behavior.

**The exe.dev plugin (`Add exe.dev sandbox provider plugin`):**

- `packages/plugins/sandbox-providers/exe-dev/`: plugin entry, manifest,
plugin runtime, README, and 19-test Vitest suite.
- Manifest fields: API token (with `secret-ref` + `/exec` permission
notes — needs `new`, `ls`, `rm`), API URL override, optional SSH
username, optional SSH private key (uses the new `JsonSchemaForm`
textarea variant via `maxLength: 4096`), optional SSH identity-file
path, optional setup script.
- Default `--setup-script` is a Nodesource Node 20 install. exe.dev VMs
come up bare and the Paperclip sandbox callback bridge is a Node script,
so without Node preinstalled the bridge can't start. Operators can
override by supplying their own setup script.
- `runLifecycleCommand` redacts env values from the executed command
before surfacing it in error messages, so secrets passed via
`--env=KEY=VALUE` don't leak into operator-visible failures.
- The plugin distinguishes exe.dev's SSH onboarding failures (`Please
complete registration by running: ssh exe.dev`) from general SSH
failures and surfaces a clear remediation message.
- `scripts/release-package-manifest.json`: register the new plugin for
CI publish alongside the existing daytona / e2b providers.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
ui/src/components/JsonSchemaForm.test.tsx`
- `(cd packages/plugins/sandbox-providers/exe-dev && pnpm test)` — 19
passing

For an operator-side smoke test:

1. Get an exe.dev API token with `/exec` permission for `new`, `ls`,
`rm`.
2. Register the plugin in your Paperclip instance, configure an
environment with the token.
3. Create a sandbox env whose provider is `exe-dev`, then run a Codex or
Claude job against it. The default Node 20 setup script should bring the
VM up automatically.

## Risks

- Adds a new sandbox provider plugin that follows the existing daytona /
e2b shape; behavior on existing providers is unchanged.
- The `JsonSchemaForm` textarea variant only engages for fields that opt
in via `maxLength` larger than the existing threshold. All existing
secret fields (which don't declare a `maxLength`) keep their current
rendering. Test coverage pins both paths.
- The redaction in `runLifecycleCommand` is a defense-in-depth measure;
the test suite exercises the redaction path. If the redaction misses a
future env-arg shape, the worst case is restored behavior (secrets in
error messages), which is what the existing daytona / e2b plugins also
do today.
- Default setup script downloads from `deb.nodesource.com` over HTTPS at
VM init. Operators on air-gapped networks or with a different package
strategy can override the setup script.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — UI change is a textarea variant of an existing secret
field; will attach screenshots before requesting merge
- [x] I have updated relevant documentation to reflect my changes
(plugin README, manifest descriptions)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:42:18 -07:00
Aron Prins 74cb560c41 fix(ui): improve routine properties panel and history UX (#5703)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Routines are the recurring-work surface where operators configure
schedules, executions, activity, and revision history.
> - The routine detail view uses a contextual right properties panel for
triggers, runs, activity, and history.
> - That panel was too cramped for routine workflows: the routine header
could collapse at constrained widths, and revision previews/comparisons
were trying to live inside the same narrow panel.
> - This pull request makes the routine properties panel wider and
responsive without changing the default panel behavior for other pages.
> - It also moves routine revision viewing and comparison into focused
dialogs so history stays usable instead of rendering dense revision
content inside the right panel.
> - The benefit is a cleaner routine workflow: triggers remain
scannable, the main routine stays readable, and revisions can be
inspected, compared, and restored without fighting the sidebar width.

## What Changed

- Added optional per-panel layout options for storage key, default
width, min/max width, and compact viewport behavior.
- Set the routine properties panel to use its own 400px default width
and persistence key, while compacting to 320px on narrower viewports.
- Made the shared resizable sidebar support right-side panes, custom
width bounds, compact max width, and keyboard resizing.
- Fixed the routine detail header so title text and action controls
remain readable beside the properties panel at constrained widths.
- Reworked routine history so selecting a revision opens a read-only
snapshot dialog instead of trying to render the whole revision inside
the right panel.
- Added a side-by-side current-vs-selected revision comparison dialog
with clearer diff markers for structured fields, triggers, and
variables.
- Added focused tests for the resizable pane and routine history
behavior.

## Verification

- `pnpm vitest run ui/src/components/RoutineHistoryTab.test.tsx
ui/src/components/ResizableSidebarPane.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm -r typecheck`
- `git diff --check`
- Browser E2E in TestCo at `http://localhost:3100/TES/dashboard`:
  - created and edited a routine
  - added, edited, toggled, and deleted schedule triggers
  - paused automation
  - ran the routine and stopped the live run
- verified runs, activity, history, snapshot dialog, compare mode,
restore confirmation, routine list, recent runs, row actions, panel
close/reopen, and constrained-width layout

### Screenshots

#### Trigger Panel Width

| Before | After |
| --- | --- |
| <img width="1741" height="1289" alt="triggers-before"
src="https://github.com/user-attachments/assets/2a391769-c355-4219-8da3-d1ea18698430"
/> | <img width="1742" height="1288" alt="triggers-after"
src="https://github.com/user-attachments/assets/9e818978-283c-49a3-9401-879be550c67b"
/> |

#### History Panel

Before, selecting a revision attempted to show dense revision content
inside the already narrow right panel. After, history remains a compact
list and revision details open separately.

| Before | After |
| --- | --- |
| <img width="1739" height="1289" alt="history-before"
src="https://github.com/user-attachments/assets/eaea4f3d-bb65-4af6-b67f-3ba3026fe0c9"
/> | <img width="1741" height="1290" alt="history-after"
src="https://github.com/user-attachments/assets/4c139238-8494-4438-89e1-4277d05bc3aa"
/> |

#### Revision Snapshot

The selected revision now opens in a dedicated read-only dialog instead
of crowding the properties panel.

<img width="1740" height="1289" alt="revision-single"
src="https://github.com/user-attachments/assets/f930f50f-7016-434b-bd81-d8d97304c528"
/>

#### Revision Compare

Historical revisions can be compared side-by-side with the current
revision, including changed structured fields and trigger differences.

<img width="1740" height="1287" alt="revision-compare"
src="https://github.com/user-attachments/assets/5640201e-de4f-446b-8941-1b0f140c56d7"
/>

## Risks

- Low to moderate UI risk: the shared resizable pane API gained optional
layout parameters, but existing callers keep the previous defaults.
- Routine history now uses dialogs for revision viewing and comparison,
so reviewers should confirm the new workflow feels right for restore and
compare.
- Routine panel width now persists under a routine-specific key, so
previous global properties panel width preferences do not carry into
routines.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent in Codex Desktop, tool-enabled with
local shell, git, and in-app browser automation. Context window size was
not exposed in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-11 07:37:30 -07:00
Devin Foley 486fb88a15 Add Cloudflare sandbox provider plugin (#5687)
> _Stacked on top of #5685#5686. Diff against master includes commits
from earlier PRs in the stack — review focuses on the two new commits
(`Extend sandbox callback bridge for Worker-hosted plugins` + `Add
Cloudflare sandbox provider plugin`)._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs in a sandbox environment, and operators choose which
provider backs that sandbox — today E2B and Daytona are bundled with the
platform
> - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a
credible new option: globally distributed, cheap idle, and
operator-deployable as a single Worker
> - To plug it in, Paperclip needs (a) a provider plugin that speaks the
`PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed
Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the
Cloudflare Sandbox SDK
> - The plugin extends the existing sandbox-callback-bridge with a
`bridge.transport: "worker"` discriminator so the platform routes
runtime RPCs through the Worker bridge instead of the in-process runner
> - This pull request adds the plugin, the bridge Worker template, and
the supporting adapter-utils + server hooks the new transport needs
> - The benefit is that operators can run sandboxes on Cloudflare's edge
with no new platform code beyond installing the plugin and deploying the
Worker

## What Changed

**Shared support (`Extend sandbox callback bridge for Worker-hosted
plugins`):**

- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
expose `expectedHostHeader` so plugin-side bridge clients can verify the
canonical request envelope before forwarding.
- `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`:
relax the always-fresh runner construction so callers can re-use a
runner across exec calls (Worker-hosted bridges hold the runner inside a
Durable Object).
- `server/src/services/environment-runtime.ts` +
`environment-runtime.test.ts`: route Worker-hosted bridges through the
same env-shaping path as E2B and pin the `requestEnv` contract.
- `server/src/services/plugin-environment-driver.ts`: thread an optional
`issueId` through the runtime descriptor so bridges can scope leases to
the originating issue (used by Cloudflare to map a sandbox to the
issue/workflow for billing and audit).
- `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to
`PluginEnvironmentDriverBaseParams` and the new `bridge.transport:
"worker"` discriminator that the new plugin declares.
- `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the
heartbeat path against the new runtime descriptor.

**The Cloudflare plugin itself (`Add Cloudflare sandbox provider
plugin`):**

- `packages/plugins/sandbox-providers/cloudflare/`: plugin entry,
manifest, plugin runtime (lifecycle + bridge client), config parsing,
and Vitest coverage. Manifest declares `bridge.transport: "worker"` so
the platform routes runtime RPCs through the bridge client.
- `bridge-template/`: a Worker template the operator deploys with
`wrangler`. Owns Durable Object-backed sessions (`sessions.ts`),
exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer
(`auth.ts`) that pins the `Host` header surface. Includes the
SDK-contract-correct exec implementation, lease recovery, and chunked
stdout/stderr streaming.
- Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`,
`routes.test.ts`), bridge client request shaping
(`src/bridge-client.test.ts`), and end-to-end plugin behavior
(`src/plugin.test.ts`) including streamed exec output. 27 tests in
total.
- `README.md` walks the operator through deploying the bridge Worker,
registering the plugin, and configuring the runtime.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/sandbox-callback-bridge.test.ts
packages/adapter-utils/src/command-managed-runtime.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/heartbeat-plugin-environment.test.ts`
- `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27
passing

For an operator-side smoke test:

1. Deploy the bridge: `cd
packages/plugins/sandbox-providers/cloudflare/bridge-template &&
wrangler deploy`
2. Register the plugin in your Paperclip instance, point its bridge URL
at the deployed Worker, set the HMAC shared secret.
3. Create a sandbox environment whose provider is `cloudflare`, then run
a Codex or Claude job against it.

## Risks

- Adds a new `bridge.transport: "worker"` code path, but the existing
E2B / Daytona transports go through the same shaped helpers and have
explicit test coverage that pins their behavior unchanged.
- The Worker bridge stores session state in a Durable Object; operator
instances must be aware of the corresponding Cloudflare costs (DO
requests, storage). Documented in the README.
- The `issueId` plumbing is optional throughout — existing plugins that
don't supply it continue to work.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
(plugin README, bridge-template README)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:33:13 -07:00
Dotta 4ad1c83b84 Write release changelog for v2026.511.0 (#5366)
## Summary

- Adds the user-facing stable release changelog at
`releases/v2026.511.0.md`, generated per
[`.agents/skills/release-changelog/SKILL.md`](../blob/master/.agents/skills/release-changelog/SKILL.md)
from the 89 commits between `v2026.428.0` and `origin/master`.
- Surfaces nine headline themes: planning mode, full company search,
routine revision history, recovery system notices, expanded plugin host,
**secrets provider vaults + remote import**, **Cursor cloud adapter**,
**Daytona sandbox provider plugin**, and the ACPX local adapter — plus
categorized improvements/fixes.
- Documents nine new additive/idempotent migrations (`0075`–`0083`) in
the Upgrade Guide, including the `fuzzystrmatch` extension requirement
for company search and the `provider_config_id` text→uuid retype on
`company_secrets`.

Branch retains its `release/v2026.506.0` name to avoid PR churn; only
the artifact filename and content track the actual release date. Branch
was rebased onto current `origin/master` and force-pushed with a single
squashed commit.

No `BREAKING:` / `BREAKING CHANGE:` / `feat!:` signals across the full
range.

## Test plan

- [ ] Maintainer review of the draft changelog content for tone/scope
- [ ] Confirm the headline grouping reflects intended release messaging
- [ ] Confirm the migration list and upgrade guide match runtime
behavior
- [ ] Confirm the `0083` `provider_config_id` retype guidance is
acceptable for the release notes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 09:28:02 -05:00
Aron Prins c0c58d6b01 fix(ui): prevent lossy cron rewrites + redesign routine triggers tab (#3569)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Humans configure when those agents run via **routines**, which are
driven by cron-backed triggers
> - The routine detail page exposed triggers through an always-visible
inline add form and per-row inline editor, with a ScheduleEditor that
only understood a narrow set of cron shapes
> - That combination was actively lossy: pasting `0 9,13,17 * * *`
silently collapsed to `0 10 * * *` on save, and common shapes
(every-N-minutes within a window, multiple times per day, monthly on
several dates) had no first-class UI
> - This pull request rebuilds the triggers tab around a list of cards +
add/edit modal, teaches ScheduleEditor the cron shapes users actually
want, and prevents cron round-trips from dropping data
> - It also *optionally* tucks the Triggers/Runs/Activity tabs into the
shared right-hand PropertiesPanel (same pattern as Issues and Goals) so
they stay in view alongside the routine instead of being hidden below
the main content
> - The benefit is that routine scheduling becomes non-destructive and
legible — operators can see, describe, and edit real-world schedules
without dropping into raw cron and without fear that saving will
silently rewrite their trigger

## What Changed

**Core fixes + redesign (required):**
- **ScheduleEditor correctness** — `parseCronToPreset` now detects comma
lists, ranges, steps, and unknown tokens across every cron field and
routes anything it can't round-trip losslessly to the `custom` preset
(except `dow === "1-5"` → `weekdays`). Fixes the `0 9,13,17 * * *` → `0
10 * * *` regression.
- **ScheduleEditor presets** — adds first-class support for
every-N-minutes (with optional hour window + weekdays-only),
every-N-hours, hourly at minute offset, daily with multiple times/day,
selected-days-of-week with multiple times, and monthly on multiple
dates. `describeSchedule` unfolds multi-value hour/day lists into
readable sentences.
- **ScheduleEditor polish** — swaps raw `<input type=\"checkbox\">` for
the shadcn `Checkbox` primitive so hour-window and weekdays-only toggles
match the rest of the app.
- **Triggers tab redesign** — replaces the inline add form + inline
editor with a header + \"Add trigger\" button, compact `TriggerListCard`
entries, and a `TriggerDialog` add/edit modal. Enable/disable is now a
single-click switch on each card; delete goes through a `ConfirmDialog`.
- **Webhook trigger gating** — webhook kind is visible but disabled with
\"— COMING SOON\" in the add dialog, matching the old inline form's
production behaviour. Editing existing webhook triggers still works.
- **Tests** — adds `ScheduleEditor.test.ts` covering the regression cron
strings (`0 9,13,17 * * *`, `0 */4 * * *`, `0 10,16 * * *`) plus
existing preset patterns as regression guards in the other direction.

**Optional layout change (commit `145a86b5` — can be dropped without
affecting the rest):**
- Moves Triggers/Runs/Activity into the shared right-hand
`PropertiesPanel` (persisted open/close, header toggle button),
mirroring `IssueDetail` and `GoalDetail`. The reasoning: these tabs are
the primary way a human *operates* a routine, and keeping them docked on
the right means they're always in view next to the routine content
rather than hidden below the fold. Mobile parity is preserved by
rendering the same tabs inline below `md`. Trigger cards and
run/activity rows were restructured into vertical stacks so they fit the
320px panel without overflow, and the last-result badge became a
wrapping inline chip so long error strings no longer fill the card
width.
- **If reviewers prefer to keep the tabs inline below the routine, this
commit can be reverted cleanly without touching any of the fixes
above.**

## Screenshots:

Old:
<img width="721" height="707" alt="triggers-old"
src="https://github.com/user-attachments/assets/260bb682-32cb-4dff-b038-d55e45824b04"
/>

New: 
<img width="1410" height="1325" alt="Screenshot 2026-04-13 at 12 25 00"
src="https://github.com/user-attachments/assets/d70dd35b-e72f-4fc6-bb21-be9b0d92b3b1"
/>

New Add Trigger modal:
<img width="1408" height="1321" alt="Screenshot 2026-04-13 at 12 25 07"
src="https://github.com/user-attachments/assets/0f23a83d-ba2c-47ed-9efa-829e777dcdf5"
/>

Commit 145a86b5 Properties panel:
<img width="1409" height="830"
alt="commit-145a86b51265e326160cb8c48e0874cb36d86f37"
src="https://github.com/user-attachments/assets/f1d42f07-7cd3-4614-8e93-5b585affd4bf"
/>

## Verification

- `cd ui && npm test -- ScheduleEditor` — new cron parser/describer
cases pass.
- Full UI test suite + typecheck green locally.
- Manual:
1. Open a routine → Triggers tab → verify cards render with enable
switch, edit, and delete (confirm dialog).
2. Create a schedule trigger with each preset (every-N-min with window,
every-N-hours, hourly@offset, daily multi-time, weekly multi-time,
monthly multi-date) → save → reopen → preset + values round-trip intact.
3. Paste `0 9,13,17 * * *` into an existing trigger → editor routes to
Custom with the raw cron preserved → save → value unchanged.
4. Try to add a webhook trigger → kind option shows \"— COMING SOON\"
and is disabled; edit an existing webhook trigger still works.
5. Toggle the properties panel via header button → state persists across
reload. Resize below `md` → tabs render inline.
- **Before/after screenshots:** attached in PR description (inline
triggers tab → list+modal; raw-cron save hazard → custom preset
preservation; bottom-of-page tabs → right-hand PropertiesPanel).

## Risks

- **Medium-low.** UI-only change; no API, schema, or migration impact.
- `parseCronToPreset` / `describeSchedule` signatures are preserved, but
their *behaviour* shifts: more cron strings now resolve to `custom` than
before. Any external caller relying on the old (lossy) classification
would see different preset tags — none known in-repo.
- PropertiesPanel reuse (optional commit) depends on the existing
localStorage key behaviour; if two routes ever write conflicting
open/close state under the same key, one could clobber the other.
Mirrors the established `IssueDetail`/`GoalDetail` pattern, so risk is
bounded. Reverting `145a86b5` removes this risk entirely while keeping
the fixes.
- Webhook kind is disabled in the add dialog only; existing webhook
triggers remain editable, so no data is stranded.

## Model Used

- **Authoring / PR drafting:** Anthropic Claude — `claude-opus-4-6` (1M
context window), via Claude Code CLI. Used for diff review and PR
description drafting. Code authored by @aronprins.
- **Post-hoc audit:** OpenAI Codex — `gpt-5.4` (high reasoning). Audited
the completed work after implementation; found no issues.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified with version + capability details
- [x] Tests run locally and pass
- [x] Added/updated tests (`ScheduleEditor.test.ts`)
- [x] Before/after screenshots attached
- [ ] Documentation updated — none required (internal UI only)
- [x] Risks documented
- [x] Will address all Greptile + reviewer comments before merge
2026-05-11 00:53:10 -07:00
Devin Foley 0fe39a2d5c fix(cursor-local): resolve sandbox agent installs from cursor bin (#5686)
> _Stacked on top of #5685 (Harden remote sandbox runtime). Diff against
master includes commits from earlier PRs in the stack — review focuses
on the new commit only._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The cursor-local adapter wraps the Cursor Agent CLI so a Paperclip
workflow can drive it inside a sandbox
> - When the adapter runs in a remote sandbox, the Cursor Agent CLI
installs under `$HOME/.local/bin/cursor-agent` (or wherever
`$XDG_BIN_HOME` points), not on the global PATH
> - The existing post-install resolution assumed `cursor-agent` would
resolve via the sandbox's login shell PATH after `npm install -g`, which
fails on sandboxes where the install lands in a user-prefixed directory
that isn't on PATH at probe time
> - This pull request resolves the agent CLI from the cursor binary's
own directory (`dirname "$(command -v cursor)"`) so the install probe
and execute path agree on a real binary location
> - The benefit is that cursor-local works correctly on any sandbox
provider where `npm install` lands in a user-prefixed directory

## What Changed

- `packages/adapters/cursor-local/src/server/remote-command.ts`: resolve
the cursor-agent binary from the cursor bin directory after install,
instead of relying on PATH.
- `packages/adapters/cursor-local/src/server/test.ts`: corresponding
probe tweak.
- `packages/adapters/cursor-local/src/server/test.test.ts` (new) +
`remote-command.test.ts`: focused coverage that exercises the install +
resolve path against a sandbox runner that places the binary in a
user-prefixed directory.

## Verification

- `pnpm exec vitest run --no-coverage
packages/adapters/cursor-local/src/server/test.test.ts
packages/adapters/cursor-local/src/server/remote-command.test.ts
packages/adapters/cursor-local/src/server/execute.test.ts`

All passing locally.

## Risks

- Local cursor-local runs are unaffected — the resolution change only
kicks in for the sandbox install path.
- Low risk; isolated to one adapter.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: tool use (Read/Edit/Bash), no code execution beyond
local repo commands

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 00:41:20 -07:00
Devin Foley b24c6909e8 Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs inside a sandbox environment so its CLI is isolated
from the host
> - Sandbox-backed adapter runs go through a small set of shared helpers
— `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback
bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings
> - When standing up new sandbox provider plugins, the existing helpers
timed out, missed install fallbacks, or leaned on assumptions that only
held for E2B
> - Local adapters (`claude-local`, `codex-local`, `gemini-local`,
`opencode-local`) needed slightly hardened probes so they could install
themselves and validate inside *any* remote sandbox transport, not just
E2B
> - This pull request bundles those runtime fixes so future sandbox
provider plugins inherit a working baseline
> - The benefit is that adding a new sandbox provider plugin no longer
requires touching adapter-utils or each local-adapter probe — the
supporting infra is already correct

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: introduce
`DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and
`resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters
keep the historical "0 means no adapter timeout" behavior;
sandbox-backed runs without an explicit `timeoutSec` get an explicit
30-minute default so remote installs and warm-up don't time out at the
per-RPC default. Plumbed `timeoutSec` through
`ensureAdapterExecutionTargetCommandResolvable` so install probes inside
a sandbox honor adapter-level overrides instead of the bridge's 5-minute
default.
- `packages/adapters/opencode-local/src/index.ts`: switch
`SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl
-fsSL https://opencode.ai/install | bash`. The npm package reifies four
large prebuilt-binary subpackages in parallel even though only one
matches the host arch; on bandwidth-constrained sandboxes that blew
through the 240s install budget. The official installer fetches one
arch-specific binary and adds `$HOME/.opencode/bin` to PATH via
`~/.bashrc`, which the sandbox-callback-bridge login-shell script
already sources.
- `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden
remote-target probes — pass `--skip-git-repo-check` for Codex when
probing outside a repo, normalize permission flags for Claude, and add
`*.remote.test.ts` coverage that exercises the remote-sandbox path
explicitly for each adapter.
- `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}`
(new): add `buildSandboxNpmInstallCommand` helper.
`server/src/adapters/registry.ts` + new
`server/src/__tests__/adapter-registry.test.ts`: wire adapter install
commands so they fall back to a writable `$HOME/.local` prefix when
global install isn't available.
- `server/src/__tests__/plugin-worker-manager.test.ts` + new
`server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call
timeout overrides so plugin worker exec calls honor the caller's timeout
instead of the worker's default.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/execution-target-sandbox.test.ts
packages/adapter-utils/src/sandbox-install-command.test.ts`
- `pnpm exec vitest run --no-coverage
server/src/__tests__/plugin-worker-manager.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/claude-local-adapter-environment.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter-environment.test.ts`
- `pnpm exec vitest run --no-coverage
packages/adapters/codex-local/src/server/test.remote.test.ts
packages/adapters/opencode-local/src/server/test.remote.test.ts
packages/adapters/codex-local/src/server/codex-args.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts`

All passing locally.

## Risks

- Touches shared `adapter-utils` and several `*-local` adapters. The
30-minute default applies only when both (a) the target is
`remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH
paths are unchanged. New test coverage was added alongside each behavior
change to pin the contracts.
- Switching OpenCode's install command to the official installer is a
behavior change for any operator running OpenCode inside a remote
sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND`
only runs when an adapter is being installed inside a sandbox).
- Low risk overall — no migrations, no API surface change.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep),
no code execution beyond local repo commands

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 00:31:54 -07:00
github-actions[bot] 6e4fa78d86 chore(lockfile): refresh pnpm-lock.yaml (#5668)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-10 17:30:05 -07:00
Devin Foley 534aee66ae Add cursor_cloud adapter for Cursor SDK + Cloud Agents API v1 (#5664)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - There are many adapter types, one per agent-runtime product (Claude,
Codex, OpenCode, Cursor local CLI, etc.)
> - Cursor shipped a public TypeScript SDK on 2026-04-29 that exposes
Cursor's full hosted-agent platform (cloud VMs, harness, MCP, skills,
hooks)
> - Paperclip had no first-class adapter for this — agents that wanted
to use Cursor's managed cloud runtime had to fall back to the local CLI
adapter, which loses the cloud session, streaming, and durable run model
> - This PR adds a new `cursor_cloud` adapter built directly on
`@cursor/sdk`, with Paperclip's heartbeat mapped to Cursor's
durable-agent + per-run model
> - The benefit is that any Paperclip agent can now drive a Cursor cloud
agent across heartbeats with native session reuse, streaming, and
cancellation, while Paperclip remains the source of truth for issue/task
state

## What Changed

- New built-in adapter package `packages/adapters/cursor-cloud` (15
files, ~1.7k LOC) backed by `@cursor/sdk` ^1.0.12
- `src/server/execute.ts` — SDK-first lifecycle: `Agent.create` /
`Agent.resume` / `Agent.getRun` / `agent.send` / `run.stream` /
`run.wait`, with session reuse keyed on the (runtime env type, env name,
repo set) tuple
- `src/server/session.ts` — codec for `cursorAgentId` + `latestRunId` +
repo metadata, persisted in `runtime.sessionParams`
- `src/server/test.ts` — environment probe via `Cursor.me()` and
optional model validation via `Cursor.models.list()`
- `src/ui/parse-stdout.ts` + `src/cli/format-event.ts` — normalize
Cursor SDK message types (`status`, `thinking`, `assistant`, `user`,
`tool_call`, `tool_result`, `result`) into Paperclip transcript events
for the UI and CLI
- Registrations: `packages/shared/src/constants.ts`,
`packages/adapter-utils/src/session-compaction.ts`,
`server/src/adapters/{registry,builtin-adapter-types}.ts`,
`ui/src/adapters/{registry,adapter-display-registry}.ts` +
`ui/src/adapters/cursor-cloud/index.ts`, `cli/src/adapters/registry.ts`,
plus workspace deps in `cli`/`server`/`ui` `package.json`
- `ui/src/components/AgentConfigForm.tsx` — hide local-Cursor
`mode`/thinking-effort field for `cursor_cloud` (different config
surface)
- 11 vitest tests covering execute paths (fresh create, matching-resume,
active-run reattach, non-finished result), session codec round-trip,
transcript parsing, and config building

## Verification

Reviewer steps:

```bash
pnpm install
pnpm --filter @paperclipai/adapter-cursor-cloud typecheck   # → clean
pnpm vitest run packages/adapters/cursor-cloud              # → 11/11 passing
```

End-to-end check against a real Cursor cloud agent (requires
`CURSOR_API_KEY` and Cursor GitHub-app install on the target repo):

1. Create a `cursor_cloud` agent in Paperclip with `repoUrl` set to the
test repo, `repoStartingRef: main`, and `env.CURSOR_API_KEY` set
2. Trigger a heartbeat → adapter calls `Agent.create({ cloud: { env: {
type: "cloud" }, repos: [...] } })`, streams events, terminates on
`finished`
3. Trigger a second heartbeat → adapter calls `Agent.resume` or
`agent.send` follow-up depending on prior-run state, reusing
`cursorAgentId`
4. The Paperclip UI/CLI transcript reflects Cursor `status` / `thinking`
/ `assistant` events as they stream
5. Cancellation from Paperclip maps to `run.cancel()` or Cloud API v1
`cancelRun` for cross-heartbeat cancellation

A direct-SDK smoke run against a real repo (devinfoley/my_test_project @
main) confirmed: `Cursor.me()` ok → `Agent.create` → `agent.send` →
`run.stream()` (30 events) → terminal status `finished` in ~11s.

## Risks

- **New adapter, additive only.** No existing adapter or registry is
replaced; current `cursor` local-CLI adapter is untouched. Default
behavior of any existing agent is unchanged.
- **External dependency on `@cursor/sdk`.** Cursor's SDK is v1.0.x and
may evolve. Mocked unit tests cover the public surface used here; if the
SDK breaks compatibility we update the adapter independently.
- **Cost/budget.** `cursor_cloud` runs on Cursor's billed cloud VMs;
operators must understand they are spending money outside Paperclip's
budget controls when they enable this adapter. Same shape as other
API-billed adapters.
- **No webhook support in V1.** The SDK already provides
stream/wait/cancel/reattach, so V1 does not require a public callback
URL. If a future use case needs out-of-band wakes, we add a Cloud API v1
webhook bridge as a separate change. This is called out in the issue
plan document.
- **Lockfile.** Per repo policy, `pnpm-lock.yaml` is intentionally not
in this PR — CI's lockfile workflow will update it on merge given the
manifest changes.

## Model Used

- Provider: Anthropic Claude (via Claude Code / Paperclip `claude_local`
adapter)
- Model: `claude-opus-4-7` (Claude Opus 4.7), knowledge cutoff January
2026
- Mode: standard tool-use with extended reasoning
- Context: ~200k token window
- Capabilities used: code generation, multi-file edits, shell/test
execution, GitHub PR workflow

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (11/11 in
`packages/adapters/cursor-cloud`)
- [x] I have added or updated tests where applicable (4 new test files,
11 cases)
- [ ] If this change affects the UI, I have included before/after
screenshots (the only UI change is hiding the local-Cursor mode field on
the `cursor_cloud` adapter — happy to attach a screenshot if the
reviewer wants one)
- [x] I have updated relevant documentation to reflect my changes (issue
plan document supersedes the pre-SDK design; tracked in PAPA-203)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-10 17:21:04 -07:00
Dotta 0096b56a1c [codex] Add LLM Wiki plugin host support (#5597)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system needs host contracts and runtime support before
large plugins can integrate cleanly.
> - The source branch mixed the LLM Wiki package with supporting
host/runtime work, managed plugin skills, root-level storage spaces, and
a bookmarks reference plugin.
> - [PAP-9173](/PAP/issues/PAP-9173) asked for the current branch to be
split by file boundary: plugin package separately from everything else.
> - [PAP-9188](/PAP/issues/PAP-9188) clarified that LLM Wiki may have
plugin-local spaces, but Paperclip core should not reorganize top-level
local storage into spaces.
> - Follow-up review clarified that the bookmarks example should not
ship in this PR either.
> - This pull request contains the
non-`packages/plugins/plugin-llm-wiki/` host/runtime work, keeps runtime
state under the selected Paperclip instance root, and no longer includes
the bookmarks example.

## What Changed

- Added/updated plugin host contracts, SDK types, worker RPC plumbing,
managed plugin skill support, and related server tests.
- Removed the bookmarks example plugin package and its
bundled-example/workspace references.
- Removed the root-level local spaces CLI/migration surface and restored
instance-root runtime defaults for config, db, logs, storage, secrets,
workspaces, projects, and adapter homes.
- Replaced shared root `space-paths` helpers with `home-paths` helpers
for core runtime storage.
- Tightened stranded recovery unique-conflict detection so concurrent
recovery scans reuse the raced recovery issue when Postgres errors are
wrapped.
- Kept `packages/plugins/plugin-llm-wiki/` out of this PR diff;
plugin-local spaces remain in the stacked plugin-only PR.

## Verification

- `pnpm exec vitest run cli/src/__tests__/data-dir.test.ts
cli/src/__tests__/home-paths.test.ts cli/src/__tests__/onboard.test.ts
packages/shared/src/home-paths.test.ts
packages/db/src/runtime-config.test.ts
server/src/__tests__/agent-instructions-service.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts`
- `pnpm exec vitest run packages/db/src/runtime-config.test.ts`
- `pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts -t "reuses the
raced stranded recovery issue"` skipped locally because embedded
Postgres did not initialize on this macOS temp host; the code path was
typechecked and is covered by Linux CI.
- Boundary check: no core references remain for `PAPERCLIP_SPACE_ID`,
`spaces migrate-default`, `@paperclipai/shared/space-paths`,
`registerSpacesCommands`, or the removed bookmarks example.
- Previous PR head `4f23e034` had green GitHub checks: `verify`, all
four serialized server shards, `e2e`, `Canary Dry Run`, `policy`, Snyk,
and `Greptile Review`. Current head `582f466d` is re-running checks
after the bookmarks deletion.

## Risks

- Plugin host changes touch shared runtime paths, so regressions would
most likely appear in adapter startup, plugin loading, or local dev path
defaults.
- Removing the bookmarks example also removes one demonstration of
plugin database namespaces plus local-folder persistence; remaining
plugin examples still cover bundled example discovery and plugin host
flows.
- The plugin package itself is intentionally deferred to the stacked
plugin-only PR, where LLM Wiki plugin-local spaces live.
- Existing installs that tested the transient root-level spaces CLI
should stop using it; this PR intentionally removes that unsupported
migration surface before merge.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution
enabled; context window not exposed.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass, except where noted above
for host-specific embedded Postgres initialization
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Stacked follow-up: PR #5592 contains only
`packages/plugins/plugin-llm-wiki/` and targets this branch.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-10 07:34:12 -05:00
Devin Foley eb12c42009 Clarify sandbox provider messaging in company environments (#4902)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Company Environments is the operator-facing seam for choosing where
compatible adapters execute work.
> - Sandbox provider plugins such as E2B extend that seam, but they are
not agent adapters themselves.
> - The current Company Environments copy put adapter capability rows
and sandbox-provider enablement on the same page without clearly
distinguishing the two concepts.
> - That made it look like installing the E2B sandbox provider caused a
new adapter to appear under adapters.
> - This pull request clarifies the UI language so provider plugins are
described as backing the Sandbox driver rather than being adapter types.
> - The benefit is a more accurate mental model for operators
configuring environments and adapters.

## What Changed

- Added explicit Company Environments copy stating that installed
sandbox providers are not adapter types and instead back the Sandbox
driver for compatible adapters.
- Renamed the support-matrix column from `Sandbox` to `Sandbox via
plugin` to make the provider relationship visible in the table itself.
- Extended the existing environments UI test to assert the new
clarification text.

## Verification

- `pnpm test -- --run ui/src/pages/CompanySettings.test.tsx`
Result: could not complete cleanly in this worktree because the checkout
is missing its local workspace install links.
- Direct Vitest fallback against `ui/src/pages/CompanySettings.test.tsx`
Result: failed before test collection on local dependency resolution
(`react/jsx-dev-runtime`), so there is no passing automated signal from
this checkout.
- Manual review
Confirm the Company Environments page now says sandbox providers are not
adapter types and labels the table column as `Sandbox via plugin`.

## Risks

- Low risk. This is a copy-only UI clarification plus a matching test
assertion; the main risk is wording drift if the product later decides
sandbox providers should be surfaced differently.

## Model Used

- OpenAI Codex via the local `codex_local` Paperclip adapter. This run
used tool-assisted code editing and shell execution. The exact backend
model ID and context window are not exposed in the Paperclip run context
for this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-05-09 23:03:26 -07:00
Devin Foley a72731f118 fix: harden release registry verification against npm lag (#4816)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Its release automation publishes canary packages to npm and then
validates the published registry state before considering the release
healthy
> - The failing canary run `25139465018` showed that npm can expose a
newly published version through version-specific endpoints before the
root package document has fully converged
> - That made a successful canary publish look like a failed release
because the verifier trusted stale root metadata too early
> - This pull request hardens the registry verification path by
preferring version-specific manifest checks, retrying
convergence-sensitive failures, and distinguishing permanent failures
from propagation lag
> - While validating that change in CI, a separate teardown race in
`heartbeat-stale-queue-invalidation.test.ts` surfaced and was hardened
so the PR could pass reliably
> - The benefit is that transient npm propagation lag no longer fails a
successful canary publish, while genuine registry-state and
dependency-integrity failures still stop the release flow promptly

## What Changed

- Hardened `scripts/verify-release-registry-state.mjs` so it prefers
version-specific manifest resolution over stale root metadata, adds
bounded registry-fetch timeouts, and classifies failures as retriable vs
non-retriable.
- Updated `scripts/release-lib.sh` and `scripts/release.sh` so
post-publish registry verification retries only convergence-sensitive
failures and reports immediate permanent failures clearly.
- Expanded `scripts/verify-release-registry-state.test.mjs` with
regression coverage for stale root metadata, fetch timeout behavior,
peer dependency range handling, non-retriable canary-latest cases, and
related verifier edge cases.
- Hardened
`server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts`
teardown to tolerate the late-comment foreign-key race that CI exposed
while validating this branch.

## Verification

- `pnpm run test:release-registry`
- `node --check scripts/verify-release-registry-state.mjs`
- `bash -n scripts/release.sh && bash -n scripts/release-lib.sh`
- PR checks passed on head `5c422600fc12acac61f6b7c267a4dc915df622b1`:
`policy`, `verify`, `e2e`, `security/snyk`, and `Greptile Review`

## Risks

- Low risk. The main behavioral changes are limited to release
automation and verifier retry semantics, plus a test-only teardown
hardening for a CI race.

> I checked [`ROADMAP.md`](ROADMAP.md). This is a narrow release bugfix
and does not overlap planned core feature work.

## Model Used

- OpenAI Codex via Paperclip `codex_local` with tool use and local code
execution enabled. This agent session runs on a GPT-5-class coding
model; the exact backend model ID/context window is not exposed by the
local adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I have addressed all Greptile and reviewer comments before
requesting merge
2026-05-09 22:18:12 -07:00
github-actions[bot] a1b2875165 chore(lockfile): refresh pnpm-lock.yaml (#5610)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-09 23:40:25 -05:00
Devin Foley 2f72cb29ea chore: update drizzle-orm to 0.45.2 (#5589)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The server, DB package, and CLI all rely on the shared Drizzle ORM
dependency for core persistence flows.
> - A published install was still resolving nested `drizzle-orm@0.38.4`,
which left the production package graph behind the intended security
update.
> - The repo’s documented dependency policy says GitHub Actions owns
`pnpm-lock.yaml`, so the correct maintainer workflow is to update
dependency manifests in the feature PR and let the lockfile refresh
happen separately after merge.
> - This pull request therefore keeps the Drizzle upgrade to the package
manifests only and leaves lockfile regeneration to the existing `Refresh
Lockfile` automation.

## What Changed

- Updated `drizzle-orm` dependency declarations in `cli/package.json`,
`packages/db/package.json`, and `server/package.json` from `0.38.4` /
`^0.38.4` to `0.45.2` / `^0.45.2`.
- Re-verified the packed `@paperclipai/db` and `@paperclipai/server`
publish payloads to confirm their generated `package.json` files
advertise `drizzle-orm ^0.45.2`.
- Removed the temporary lockfile/CI follow-up commits so the branch now
matches the intended manifest-only protocol.

## Verification

- `pnpm list drizzle-orm -r --depth 0`
- `pnpm exec vitest run packages/db/src/client.test.ts
server/src/__tests__/issues-service.test.ts`
- `pnpm run test:release-registry`
- Packed `@paperclipai/db` and `@paperclipai/server` locally and
inspected the tarball `package.json` files to confirm they advertise
`drizzle-orm ^0.45.2`.

## Risks

- Low to moderate risk: the runtime code paths are unchanged, but
downstream lockfile refresh now depends on the existing post-merge
GitHub automation working as documented.
- A separate packaging/versioning issue around unpublished
`@paperclipai/plugin-sdk@1.0.0` showed up during a raw local tarball
install experiment; that is called out for reviewers but is not part of
this Drizzle bump.

## Model Used

- OpenAI Codex via the `codex_local` adapter, using a GPT-5-based coding
agent with terminal tool use and code execution. The adapter does not
expose a public exact model ID or context-window value in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-09 21:31:57 -07:00
Dotta e3af7aa489 Add shared sidebar section controls (#5585)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies.
> - The board UI sidebar is one of the main ways operators scan active
agents and projects.
> - Agents and projects had duplicated section header behavior, which
made collapse controls, add actions, and future section menus harder to
keep consistent.
> - Operators also need lightweight ways to switch between their curated
sidebar order and common scan orders like alphabetical or recent
activity.
> - This pull request introduces a shared sidebar section header and
uses it for the Agents and Projects sidebar sections.
> - The benefit is a more consistent sidebar surface with reusable
header controls and persisted sort modes without losing the existing
drag-ordered Top view.

## What Changed

- Added a reusable `SidebarSection` component that supports collapsible
content, header actions, and section dropdown menus.
- Updated the Agents sidebar section to use the shared header and add
persisted `Top`, `Alphabetical`, and `Recent` sort modes.
- Updated the Projects sidebar section to use the shared header and add
persisted `Top`, `Alphabetical`, and `Recent` sort modes.
- Added local-storage helpers and cross-tab update events for
agent/project sidebar sort preferences.
- Added focused component coverage for the shared section behavior and
the updated Agents/Projects sidebar ordering paths.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
ui/src/components/SidebarSection.test.tsx
ui/src/components/SidebarProjects.test.tsx
ui/src/components/SidebarAgents.test.tsx`
  - 3 test files passed
  - 18 tests passed

## Risks

- Low-to-moderate UI risk: this changes sidebar section header
interactions and adds persisted client-side sort preferences.
- Drag ordering is intentionally limited to `Top` mode; non-top modes
render sorted lists and do not persist drag order changes.
- No database migrations or API contract changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5-based model, tool-use enabled; exact
hosted model build/context-window identifier was not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 19:49:59 -05:00
Devin Foley 433dfed33d Enable CI publish for plugin-daytona (#5586)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The release pipeline gates new public packages behind a bootstrap
policy: `scripts/check-release-package-bootstrap.mjs` requires every
package marked `publishFromCi: true` in
`scripts/release-package-manifest.json` to already exist on npm
> - PR #5580 added the new Daytona sandbox provider plugin but had to
land with `publishFromCi: false` because the package had never been
published, so CI's release plan would have failed bootstrap validation
otherwise
> - Now that `@paperclipai/plugin-daytona` has been bootstrap-published
to npm by hand, the temporary `false` flag is the only thing keeping it
out of the standard CI publish flow
> - This pull request flips the Daytona entry to `publishFromCi: true`,
matching every other release-enabled package in the manifest
> - The benefit is that future tagged releases will publish the Daytona
plugin automatically alongside the rest of the monorepo's public
packages

## What Changed

- Single-line flip in `scripts/release-package-manifest.json`:
`@paperclipai/plugin-daytona` is now `publishFromCi: true`

## Verification

- `node ./scripts/release-package-map.mjs check` → `Release package
manifest OK: 19 enabled for CI publish, 0 disabled pending bootstrap`
(was 18 + 1)
- `node ./scripts/check-release-package-bootstrap.mjs
scripts/release-package-manifest.json` against `origin/master` →
`Release bootstrap OK for changed manifests:
@paperclipai/plugin-daytona`, confirming npm sees the
bootstrap-published package
- No code changes; no tests required beyond the existing manifest
validators

## Risks

- Low risk. Only effect is that the next release run will include
`@paperclipai/plugin-daytona` in its publish set
- If the npm bootstrap was incomplete, CI's bootstrap check will fail
loudly before any release tag goes out — same safety net the policy is
designed to provide

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use
enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable (N/A —
manifest-only flag flip, covered by existing validators)
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — release config)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 16:58:35 -07:00
Dotta 778e775c35 Add secrets provider vaults and remote import (#5429)
## Thinking Path

> - Paperclip orchestrates AI-agent companies and needs secrets handling
to work across local development, hosted operators, and governed agent
execution.
> - The affected subsystem is the company-scoped secrets control plane:
database schema, server services/routes, CLI workflows, and the Secrets
settings UI.
> - The gap was that secrets were local-only and operators could not
manage provider vaults or import existing remote references without
exposing plaintext.
> - This branch adds provider vault configuration plus an AWS Secrets
Manager remote-import path while preserving company boundaries, binding
context, and audit trails.
> - I kept the PR to a single branch PR, removed unrelated
lockfile/package drift, rebased the full branch onto the current
`public-gh/master`, and addressed fresh Greptile findings.
> - The benefit is a reviewable implementation of provider-backed
secrets with focused tests covering provider selection, import
conflicts, deleted secret reuse, rotation guards, and AWS signing
behavior.

## What Changed

- Added provider vault support for company secrets, including provider
config storage, default vault handling, health checks, binding usage,
access events, and remote import preview/commit.
- Added an AWS Secrets Manager provider using SigV4 request signing,
bounded request timeouts, namespace guardrails, cached runtime
credential resolution, and external-reference linking without plaintext
reads.
- Added Secrets UI surfaces for vault management and remote import, plus
CLI/API documentation for setup and operations.
- Stabilized routine webhook secret binding paths and SSH
environment-driver fixture bindings discovered during verification.
- Addressed Greptile and CI findings: no lockfile/package drift,
monotonic migration metadata, disabled-vault default races, soft-deleted
secret hiding/recreate behavior, remove behavior with disabled vaults,
soft-deleted external-reference re-import, non-active rotation guards,
managed-secret soft deletion through PATCH, and per-call AWS SDK
credential client churn.
- Rebased this branch onto `public-gh/master` at `0e1a5828` and
force-pushed with lease to keep this as the single PR for the branch.

## Verification

- `git fetch public-gh master`
- `git rebase public-gh/master`
- `git diff --name-only public-gh/master...HEAD | grep
'^pnpm-lock\.yaml$' || true` confirmed `pnpm-lock.yaml` is not in the PR
diff.
- Confirmed migration ordering: master ends at `0081_optimal_dormammu`;
this PR adds `0082_dry_vision` and
`0083_company_secret_provider_configs`.
- Inspected migrations for repeat safety: new tables/indexes use `IF NOT
EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column
additions use `ADD COLUMN IF NOT EXISTS`.
- `pnpm -r typecheck` passed before the Greptile follow-up commits.
- `pnpm test:run` ran the full stable Vitest path before the Greptile
follow-up commits; it completed with 3 timing-related failures under
parallel load: `codex-local-execute.test.ts`,
`cursor-local-execute.test.ts`, and `environment-service.test.ts`.
- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/codex-local-execute.test.ts
src/__tests__/cursor-local-execute.test.ts
src/__tests__/environment-service.test.ts` passed on targeted rerun
(`24/24`).
- `pnpm build` passed before the Greptile follow-up commits. Vite
reported existing chunk-size/dynamic-import warnings.
- After Greptile follow-up commits: `pnpm --filter @paperclipai/server
exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`).
- After Greptile follow-up commits: `pnpm --filter @paperclipai/server
exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts
src/__tests__/secrets-service.test.ts` passed (`39/39`).
- After Greptile follow-up commits: `pnpm --filter @paperclipai/server
typecheck` passed.
- Captured Storybook screenshots from `ui/storybook-static` for visual
review.
- Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites
1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review`
pass; aggregate `verify` is still registering the completed child
checks.
- Greptile review loop continued through the latest requested pass; all
Greptile review threads are resolved and the latest `Greptile Review`
check on `5ca3a5cf` passed with 0 comments added.

## Screenshots

Before: the provider-vault and remote-import surfaces did not exist on
`master`; these are after-state screenshots from the Storybook fixtures.

![Secrets
inventory](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secrets-inventory.png)

![Secret binding
picker](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secret-binding-picker.png)

![Environment editor with
secrets](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/env-editor-with-secrets.png)

## Risks

- Migration risk: this adds new secret provider tables and extends
existing secret rows. The migrations were checked for monotonic ordering
and idempotent guards, but reviewers should still inspect upgrade
behavior carefully.
- Provider risk: AWS support uses direct SigV4 requests. Automated tests
cover signing, request timeouts, vault-config selection, namespace
guardrails, pending-version archival, sanitized provider errors, and
service-level cleanup paths. A real-vault AWS smoke test remains
deployment validation for an operator with AWS credentials rather than
an unverified merge blocker in this local branch.
- UI risk: the Secrets page and import dialog are large new surfaces;
screenshots are included above for reviewer inspection.
- Verification risk: the full local stable test command hit
parallel-load timing failures, although the exact failed files passed
when rerun directly.
- Operational risk: remote import intentionally avoids plaintext reads;
operators must understand that imported external references resolve at
runtime and may fail if AWS permissions change.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with local shell/tool use in the
Paperclip worktree. Exact context-window size was not exposed by the
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 18:22:17 -05:00
Devin Foley 06e6ee25cd Add Daytona sandbox provider plugin (#5580)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents need isolated sandbox environments to execute work safely;
Paperclip already supports E2B as a sandbox provider plugin
> - Users want to use Daytona (https://www.daytona.io/) as an
alternative sandbox backend, but no plugin existed for it
> - Without a Daytona plugin, teams that prefer Daytona's
pricing/regions/runtime can't run Paperclip agents on it
> - This pull request adds a `@paperclip/sandbox-provider-daytona`
plugin that mirrors the existing E2B plugin shape and wires up Daytona's
`@daytonaio/sdk` for sandbox lifecycle, command execution, and shell
detection
> - The benefit is that operators can pick Daytona as a first-class
sandbox provider without touching core code, broadening Paperclip's
runtime options

## What Changed

- New plugin package `packages/plugins/sandbox-providers/daytona` with
manifest, worker entry, and provider implementation backed by
`@daytonaio/sdk`
- Implements sandbox create/destroy/exec/upload/download lifecycle,
shell command detection, and config/env wiring consistent with the E2B
plugin
- Adds unit tests under `src/plugin.test.ts` and a README documenting
setup and the `DAYTONA_API_KEY` requirement
- Minor adjustments in `scripts/paperclip-issue-update.sh`,
`packages/shared/src/issue-thread-interactions.test.ts`, and
`packages/shared/src/validators/issue.ts` to support the integration

## Verification

- Re-ran the full sandbox provider matrix on the QA Paperclip instance
using Daytona as the runtime — all 6 adapters executed inside the
Daytona sandbox with zero `environmentExecute` timeouts
- 5/6 adapters pass cleanly (or with informational warns); the only
failure is `codex_local`, which is an OpenAI quota/billing issue
unrelated to Daytona
- `pnpm --filter @paperclip/sandbox-provider-daytona test` runs the
plugin unit tests

## Risks

- New optional plugin; no behavior change for users who don't enable it
- Requires `DAYTONA_API_KEY` for runtime use — documented in the plugin
README
- Daytona SDK is a new external dependency; tracked in the plugin's own
package.json so it doesn't affect the core install footprint

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use
enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — backend plugin)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 11:50:12 -07:00
Devin Foley f784d8d90e Retry canary registry verification (#5579)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and the
release pipeline is part of keeping that control plane shippable.
> - The relevant subsystem here is the release automation in
`scripts/release.sh`, which publishes canary builds and then verifies
npm registry state.
> - The failing CI run showed a successful publish followed by an
immediate registry-state verification failure while npm dist-tag
metadata was still propagating.
> - That made the canary job flaky even when the publish itself had
succeeded, which is the wrong failure mode for release automation.
> - This pull request adds bounded retries around the post-publish
registry-state verification step instead of failing on the first stale
read.
> - The benefit is that canary releases tolerate transient npm
propagation lag while still failing clearly if registry metadata never
converges.

## What Changed

- Wrapped the post-publish `verify-release-registry-state.mjs` call in a
bounded retry loop inside `scripts/release.sh`.
- Reused the existing publish verification retry defaults and added
optional overrides via `NPM_REGISTRY_STATE_VERIFY_ATTEMPTS` and
`NPM_REGISTRY_STATE_VERIFY_DELAY_SECONDS` for dist-tag-specific tuning.

## Verification

- `bash -n scripts/release.sh`
- CI will also exercise the release path via the existing `Canary Dry
Run` workflow job in `.github/workflows/pr.yml`.

## Risks

- Low risk. The main behavioral change is that a genuinely broken
registry-state verification can now wait through the configured retry
window before failing.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex local agent, GPT-5-based Codex runtime in Paperclip with
tool use and shell execution. The exact backend model ID/context window
is not surfaced in this local heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-09 11:40:02 -07:00
Devin Foley 0e1a582831 Revert "Add experimental newest-first issue thread" (#5460)
This is actually bad. Glad it was under experiments.
2026-05-07 16:50:31 -07:00
Devin Foley a904effb96 Add experimental newest-first issue thread (#5455)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, so issue
threads are a core operator surface for reviewing work.
> - The issue detail page is the place where humans read agent messages,
user comments, and execution context together.
> - That thread originally rendered oldest-first, which made recent
activity harder to see during active review.
> - Reversing the thread order changes navigation expectations,
timestamp placement, and the "Jump to latest" affordance, so the UI
behavior needed to move as a coherent set.
> - Because this is a visible core-product behavior shift, it also
needed a safe rollout path instead of becoming the default immediately.
> - This pull request adds the newest-first issue thread behavior behind
an Experimental setting, updates the thread UI to match that mode, and
keeps the legacy oldest-first experience unchanged by default.
> - The benefit is that reviewers can opt into a more recent-first issue
workflow without forcing a global behavior change on every Paperclip
instance.

## What Changed

- Reversed issue thread rendering so the newest comments and messages
appear first when the experiment is enabled.
- Moved the plain comment timestamp into the card header in newest-first
mode and kept the legacy timestamp placement for oldest-first mode.
- Moved the `Jump to latest` control to the bottom of the thread in
newest-first mode while leaving the existing top placement for the
legacy mode.
- Added the `Enable Newest-First Issue Thread` experimental instance
setting and wired issue detail to read that toggle.
- Added regression coverage for thread order, timestamp placement,
jump-button placement, and the issue-detail experiment toggle behavior.

## Verification

- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Focused checks that also passed during issue review:
- `pnpm vitest run src/components/IssueChatThread.test.tsx
src/pages/IssueDetail.test.tsx` in `ui/`
- `pnpm vitest run src/__tests__/instance-settings-routes.test.ts` in
`server/`
- Manual review path:
- Enable `Instance Settings > Experimental > Enable Newest-First Issue
Thread`
- Open an issue with comments/messages and confirm newest activity
renders first, timestamps move into the header, and `Jump to latest`
sits below the thread
- Disable the experiment and confirm the legacy oldest-first behavior
returns

## Risks

- Low risk: the behavioral change is gated behind an instance-level
experimental toggle and defaults off.
- The main regression risk is thread navigation drift between the two
modes, especially around anchor scrolling and the `Jump to latest`
affordance.
- There is some UI coupling between issue-detail query state and
experimental settings fetches, so future changes in that area should
keep both modes covered.
- Screenshots are not attached in this PR body; verification is
described with automated coverage and manual steps instead.

> I checked [`ROADMAP.md`](ROADMAP.md). This is a scoped issue-thread UX
improvement and rollout gate, not a duplicate of a roadmap-level planned
core feature.

## Model Used

- OpenAI Codex via the local `codex_local` Paperclip adapter,
GPT-5-based coding agent with terminal tool use and local code execution
in this repository worktree.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-07 16:45:12 -07:00
Devin Foley 4269545b19 Stabilize Cursor sandbox runtime resolution (#5446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Cursor adapter spawns the Cursor CLI against local, SSH, and
sandbox execution targets; on a fresh sandbox lease, it has to resolve
where Cursor was installed
> - The previous resolver only looked for `~/.local/bin/cursor-agent`
even though the official installer (and the adapter's own
`SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as
`~/.local/bin/agent`, so a sandbox where the install ran successfully
would still fail to find the CLI
> - This pull request lets the resolver accept either basename and lets
the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't
pay the cost of a remote `printf $HOME` round-trip when the home
directory is already known
> - The benefit is sandboxed Cursor runs find the binary that the
install actually produced, and runtime probes are cheaper when the home
dir is already resolved

## What Changed

- `packages/adapters/cursor-local/src/server/remote-command.ts`: accept
either `agent` or `cursor-agent` as the preferred basename; new optional
`remoteSystemHomeDirHint` short-circuits the home-dir probe
- `packages/adapters/cursor-local/src/server/execute.ts`: thread the
home-dir hint through, prefer the resolved binary path, and shift the
effective execution cwd to the per-run managed subdirectory once the
runtime is prepared
- New `remote-command.test.ts` and `execute.test.ts` cover both
basenames, the hint short-circuit, and the cwd shift
- `packages/adapters/cursor-local/src/index.ts`: update doc string to
reflect the broader resolution
- `execute.remote.test.ts` updated to expect the managed-subdirectory
cwd shape introduced by the cwd shift

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-cursor-local` — 6/6 passing
- `pnpm typecheck` clean
- Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor
(binary lands as `~/.local/bin/agent`) now runs cleanly through the
adapter

## Risks

Low. Resolver is strictly broader (matches a superset of paths);
existing setups with `~/.local/bin/cursor-agent` continue to work. The
home-dir hint is opt-in; callers that don't pass it get the existing
probe behavior. Cursor's effective execution cwd now matches the rest of
the adapters (per-run managed subdirectory) — sessions previously rooted
at the workspace root will land in the new subdirectory.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
both basenames + hint short-circuit + cwd shift
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---

> **Stacked PR.** Sits on top of #5445 (which sits on #5444). Cumulative
diff against `master` includes both of those PRs' content; the files
touched by *this* PR's commit are listed under "What Changed" above.
Will rebase onto `master` and force-push once the prerequisite PRs
merge.
2026-05-07 15:00:28 -07:00
Devin Foley fe3904f434 Stabilize runtime probes and Codex env tests (#5445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters expose a Test action that probes the configured runtime —
install, resolvability, hello — to give operators a fast yes/no on
whether an environment is healthy
> - The Codex test path was running its hello probe directly without
going through the managed-runtime preparation that production runs use,
so a healthy production setup could still report a probe failure
> - The plugin worker manager wasn't surfacing terminated workers
cleanly, leaving the runtime probe waiting on a dead worker until the
request timed out
> - This pull request routes the Codex test probe through
`prepareAdapterExecutionTargetRuntime` (so it sees the same managed
Codex home production sees), exposes `commandCwd` on
`createCommandManagedRuntimeClient` so callers can target a per-probe
directory without leaking the workspace `remoteCwd`, and propagates
plugin-worker termination as a usable error instead of a hang
> - The benefit is the Codex Test action mirrors production behavior
end-to-end, and probes against a terminated plugin worker fail fast
instead of timing out

## What Changed

- `packages/adapter-utils/src/command-managed-runtime.ts`: rename the
`remoteCwd` knob to `commandCwd` so callers can target a per-probe
directory without inheriting the workspace cwd; matching test coverage
in `command-managed-runtime.test.ts`
- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
small fixes to keep callback bridge stop semantics deterministic
- `packages/adapters/codex-local/src/server/test.ts`: thread the Codex
hello probe through `prepareAdapterExecutionTargetRuntime` +
`prepareManagedCodexHome` so the probe sees the same managed home
production sees; new `test.remote.test.ts` covers the remote probe path
- `packages/adapters/cursor-local/src/server/execute.ts`: small
probe-side cleanup that aligns with the new commandCwd contract
- `server/src/services/plugin-worker-manager.ts`: surface plugin-worker
termination as a structured error so callers fail fast; new
`plugin-worker-terminated.cjs` fixture and
`plugin-worker-manager.test.ts` cases pin the behavior

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project @paperclipai/server` —
1749/1750 passing (1 unrelated skip)
- `pnpm typecheck` clean

## Risks

Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming
on an internal helper used only by adapter test/execute paths in this
repo. The plugin-worker-terminated path was previously a hang; failing
fast may surface latent timeouts as explicit termination errors in
callers that already expected them.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
commandCwd, plugin-worker termination, and Codex remote test path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---

> **Stacked PR.** Sits on top of #5444 which adds the per-run runtime
API surface this PR builds on. Cumulative diff against `master` includes
that PR's content; the files touched by *this* PR's commit are listed
under "What Changed" above. Will rebase onto `master` and force-push
once #5444 merges.
2026-05-07 14:52:31 -07:00
Devin Foley 12cb7b40fd Harden remote workspace sync and restore flows (#5444)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When an agent runs against a remote target, Paperclip syncs the
workspace out to the remote at run start and restores changes back to
the local workspace at run end
> - The previous restore flow naïvely overwrote local files with
whatever the remote returned, so files that the remote run never touched
but had timestamp/mode drift could be needlessly rewritten — and a
single static `refs/paperclip/ssh-sync/imported` ref made concurrent SSH
workspace exports race on the same git ref
> - This pull request adds a `workspace-restore-merge` module that diffs
a pre-run snapshot against the post-run remote state and only writes
back files the remote actually changed; SSH workspace exports now use a
per-import unique ref so concurrent runs can't trample each other
> - Every adapter's execute path threads the snapshot through
`prepareAdapterExecutionTargetRuntime` so the merge has the baseline it
needs
> - The benefit is workspace restores no longer churn untouched files,
and concurrent SSH runs no longer collide on the import ref

## What Changed

- `packages/adapter-utils/src/workspace-restore-merge.{ts,test.ts}`: new
module — directory snapshot (kind/mode/sha256/symlink target) plus
snapshot-aware merge that writes only the files the remote changed
- `packages/adapter-utils/src/ssh.ts`: SSH workspace export uses a
per-import unique ref (`refs/paperclip/ssh-sync/imported/<uuid>`);
restore goes through the new merge helper; `ssh-fixture.test.ts` covers
the unique-ref + merge paths
- `packages/adapter-utils/src/sandbox-managed-runtime.ts` +
`remote-managed-runtime.ts`: thread the snapshot/merge through the
sandbox and SSH paths
- `packages/adapter-utils/src/server-utils.{ts,test.ts}` +
`execution-target.ts`: helpers for capturing the pre-run snapshot;
`prepareAdapterExecutionTargetRuntime` gains required `runId` and
optional `workspaceRemoteDir`, and returns the realized
`workspaceRemoteDir`
- Each adapter's `execute.ts` (acpx, claude, codex, cursor, gemini,
opencode, pi) takes the snapshot at run start and passes it through to
the runtime restore
- Remote execute test mocks updated to match the new
`prepareWorkspaceForSshExecution` return shape and the per-run
`${managedRemoteWorkspace}` cwd subdirectory

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-acpx-local --project
@paperclipai/adapter-claude-local --project
@paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project
@paperclipai/adapter-gemini-local --project
@paperclipai/adapter-opencode-local --project
@paperclipai/adapter-pi-local` — 196/196 passing
- `pnpm typecheck` clean across the workspace

## Risks

Medium. The restore path now writes a strict subset of what it
previously did — files the remote did not touch are no longer rewritten.
If any flow was relying on a touch-without-content-change being copied
back (timestamp or permission propagation only), that behavior is now
skipped. Snapshot capture adds an O(N-files-in-workspace) hash pass at
run start; the cost is bounded by the existing exclude list. The `runId`
parameter on `prepareAdapterExecutionTargetRuntime` is now required —
every in-tree caller is updated; out-of-tree adapter authors need to
pass it.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new module +
every adapter execute path covered
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-07 14:44:45 -07:00
Dotta 824298f414 Route sidebar search icon directly to search (#5440)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators use the sidebar as their primary board navigation surface
> - The board now has a dedicated search page, so the header search icon
should behave as normal navigation instead of only dispatching a
command-palette shortcut
> - The Work nav also had a separate Search row, which duplicated the
always-visible header search affordance
> - This pull request keeps search one click away while making it a
direct `/search` link and reducing sidebar nav noise
> - The benefit is a smaller, clearer sidebar with search still
accessible from the top-level chrome

## What Changed

- Changed the sidebar header search icon into a direct `NavLink` to
`/search`.
- Removed the duplicate `Search` row from the Work navigation section.
- Added focused Sidebar coverage that asserts the header search link
target and confirms Search is not rendered in the Work nav.
- Refactored the Sidebar test setup helper to avoid repeating the React
Query wrapper across tests.

## Verification

- `pnpm install --frozen-lockfile` in the PR worktree so workspace
package symlinks existed for test execution. This completed with
existing plugin SDK bin warnings for missing built artifacts.
- `pnpm exec vitest run ui/src/components/Sidebar.test.tsx` — 3 passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.

## Risks

- Low: this changes a sidebar navigation affordance only. Users who
previously clicked the header icon now land on the full search page
instead of opening the command-palette shortcut path.
- Low: removing the Work nav Search row could affect users who expected
Search in that section, but the icon remains in the fixed sidebar header
and is covered by a targeted DOM test.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or equivalent focused UI verification
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 15:20:58 -05:00
Dotta e400315cbf Guard assigned backlog liveness (#5428)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The issue graph and liveness recovery system decide whether assigned
work is executable or parked
> - Assigned issues created without an explicit status could silently
land in backlog, making parents look blocked with no productive wake
path
> - The server, shared validators, recovery analysis, and UI all need to
agree on that execution semantic
> - This pull request makes assigned issue creation default to `todo`,
flags assigned backlog blockers, and surfaces the state in the board
> - The benefit is that parked assigned work becomes intentional and
visible instead of creating silent liveness stalls

## What Changed

- Adds contract tests for assigned issue creation defaults.
- Defaults assigned issue creation to `todo` when status is omitted
while preserving explicit `backlog` parking.
- Exposes `resolveCreateIssueStatusDefault` through shared validators.
- Teaches liveness/blocker attention paths to distinguish assigned
backlog blockers.
- Adds UI notices, row/header badges, and issue detail safeguards for
assigned backlog blockers.
- Adds Storybook fixtures and execution-semantics documentation for the
assigned-backlog behavior.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
packages/shared/src/validators/issue.test.ts
server/src/__tests__/issue-assigned-backlog-contract-routes.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
server/src/__tests__/issue-liveness.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
ui/src/components/IssueAssignedBacklogNotice.test.tsx
ui/src/components/IssueRow.test.tsx` — 50 passed, 23 skipped.
- Skipped tests were embedded Postgres suites on this host with the repo
skip message: `Postgres init script exited with code null. Please check
the logs for extra info. The data directory might already exist.`
- Pairwise merge check against the issue-controls PR branch completed
without conflicts via `git merge --no-commit --no-ff` in a temporary
worktree.
- Screenshots for assigned-backlog UI states:
[light](docs/pr-screenshots/pr-5428/assigned-backlog-light.png),
[dark](docs/pr-screenshots/pr-5428/assigned-backlog-dark.png).
- Follow-up checks: `pnpm --filter /ui typecheck`; `pnpm --filter
/mcp-server build`; `pnpm --filter /mcp-server test`; `pnpm exec vitest
run packages/shared/src/validators/issue.test.ts`; focused UI component
tests.
- Remote PR checks on head `6300b3c`: policy, verify, serialized server
shards 1/4-4/4, Canary Dry Run, e2e, Greptile Review, and Snyk all
passed.

## Risks

- Medium: changes status defaulting for assigned issue creation when the
caller omits status. Explicit `backlog` remains supported, and
server/shared tests cover both paths.
- Medium: liveness classification changes can affect blocker attention
labels; focused service and UI tests cover the new assigned-backlog
state.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 12:25:26 -05:00
Dotta 6f30003421 Polish operator UI task controls (#5427)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators spend most of their day scanning skills, routines, inbox
groups, and activity cards
> - Several small UI rough edges made those surfaces harder to scan or
easier to crash on real API payloads
> - These fixes are grouped together because they are low-risk operator
quality-of-life improvements rather than separate control-plane
contracts
> - This pull request polishes skills metadata, routine run-now access,
grouped issue creation defaults, monitor activity rendering, and
activity row identity layout
> - The benefit is a smoother board workflow with fewer small
interruptions while keeping the change set compact

## What Changed

- Improves company skill source display and the used-by agent list.
- Truncates long skill source paths and adds a copy affordance.
- Adds a row-level run-now button to the routines table.
- Adds grouped issue creation defaults for inbox issue groups and aligns
grouped add buttons to the right.
- Fixes `IssueMonitorActivityCard` when `monitorNextCheckAt` arrives as
an ISO string.
- Polishes activity row actor avatar/name layout by using the shared
avatar primitive.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
ui/src/pages/Routines.test.tsx ui/src/components/IssuesList.test.tsx
ui/src/lib/inbox.test.ts
ui/src/components/IssueMonitorActivityCard.test.tsx` — 91 passed.
- The routines test emitted the pre-existing Radix warning about missing
`DialogTitle`/description in dialog content; tests still passed.
- Pairwise merge checks against the other two PR branches reported no
textual conflicts.

## Risks

- Low: changes are UI-focused and covered by targeted component/lib
tests.
- Low-to-medium: activity row layout changes could affect dense feed
scanability; the implementation uses the shared avatar component and
keeps truncation behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 12:24:02 -05:00
Dotta 772fc92619 Add issue controls and retry-now recovery (#5426)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue operators need clear controls for execution settings, model
overrides, and recovery retries
> - Existing issue properties hid useful adapter override state and did
not expose a board-triggered retry for scheduled heartbeat recovery
> - Scheduled retries also need to respect the same safety gates as
normal execution instead of bypassing budget, review, pause, dependency,
or terminal-state checks
> - This pull request adds the issue property controls and retry-now
surfaces together because they share the issue details/properties UI
> - The benefit is that operators can inspect and adjust issue execution
settings and safely trigger pending scheduled recovery without hidden
control-plane behavior

## What Changed

- Adds editable issue assignee model override controls in
`IssueProperties`, with focused coverage.
- Removes the stale workspace tasks link from issue properties.
- Adds a scheduled retry `retry-now` backend path and shared response
types.
- Adds main-pane and properties-pane scheduled retry UI, backed by a
shared `useRetryNowMutation` hook.
- Adds suppression coverage for budget hard stops, review participant
changes, subtree pause holds, unresolved blockers, terminal issues, and
company scoping.
- Updates the `IssueProperties` test harness with toast actions required
by the retry-now hook.

## Verification

- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/IssueScheduledRetryCard.test.tsx` — 31 passed.
- `pnpm exec vitest run
server/src/__tests__/issue-scheduled-retry-routes.test.ts` — exited 0,
but this host skipped the embedded Postgres route tests with: `Postgres
init script exited with code null. Please check the logs for extra info.
The data directory might already exist.`
- Pairwise merge check against the assigned-backlog PR branch completed
without conflicts via `git merge --no-commit --no-ff` in a temporary
worktree.

### Visual verification screenshots

Storybook story: `Product/Issue Scheduled retry surfaces /
ScheduledRetrySurfaces`.

![Scheduled retry card and issue properties rows -
desktop](https://raw.githubusercontent.com/paperclipai/paperclip/62fb566f357312b43b9162af02252d0175530a8f/docs/assets/pr-5426/scheduled-retry-story-desktop.png)

![Scheduled retry card and issue properties rows -
mobile](https://raw.githubusercontent.com/paperclipai/paperclip/62fb566f357312b43b9162af02252d0175530a8f/docs/assets/pr-5426/scheduled-retry-story-mobile.png)

## Risks

- Medium: this touches issue execution/retry behavior, so CI should run
the embedded Postgres route tests on a host that can initialize
Postgres.
- Low-to-medium UI risk around duplicated retry-now entry points; both
surfaces share one mutation hook to keep behavior consistent.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 12:23:13 -05:00
Dotta d0e9cc76f2 Show workspace changes and stale notices in issue threads (#5356)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The issue thread is the operator's durable audit trail for what
changed and why
> - Workspace changes and stale disposition notices need to be visible
in that same timeline without noisy or misleading rendering
> - The local branch already contained backend activity details,
timeline conversion, and UI rendering work for those events
> - This pull request isolates the issue-thread activity work into a
standalone branch against `origin/master`
> - The benefit is a focused audit-trail PR that can merge independently
of the sidebar/operator UI polish branch

## What Changed

- Adds readable workspace-change activity details to issue update
activity events.
- Surfaces workspace-change events in issue chat/timeline rendering.
- Makes the existing issue comment migration idempotent.
- Folds and renders stale disposition notices inline so they match
activity-log styling and spacing.
- Adds focused route, timeline, and issue-thread system notice coverage.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/__tests__/issue-activity-events-routes.test.ts
ui/src/lib/issue-timeline-events.test.ts
ui/src/components/IssueChatThreadSystemNotice.test.tsx` — 3 files
passed, 22 tests passed.
- Confirmed the PR changes 9 files and does not include `pnpm-lock.yaml`
or `.github/workflows/*`.
- `pnpm exec vitest run
server/src/__tests__/issue-closed-workspace-routes.test.ts` — 1 file
passed, 4 tests passed.
- `pnpm exec vitest run
server/src/__tests__/issue-activity-events-routes.test.ts
ui/src/lib/issue-timeline-events.test.ts
ui/src/components/IssueChatThreadSystemNotice.test.tsx
server/src/services/recovery/successful-run-handoff.test.ts
packages/shared/src/validators/issue.test.ts` — 5 files passed, 54 tests
passed.
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`.
- `pnpm --filter @paperclipai/ui typecheck` after adding the Storybook
screenshot fixture.
- Captured Storybook screenshots for the new UI rendering paths:
- Collapsed stale notice + workspace-change row:
`docs/pr-screenshots/pr-5356/issue-thread-notices-collapsed.png`
- Expanded stale notice details:
`docs/pr-screenshots/pr-5356/issue-thread-notices-expanded.png`


### Screenshots

Collapsed stale notice with workspace-change row:

![Collapsed stale notice with workspace-change
row](docs/pr-screenshots/pr-5356/issue-thread-notices-collapsed.png)

Expanded stale notice details:

![Expanded stale notice
details](docs/pr-screenshots/pr-5356/issue-thread-notices-expanded.png)

## Risks

- Moderate risk: this touches issue activity serialization and
issue-thread rendering, both of which are central operator surfaces.
- Migration risk is low: the only migration change makes an existing
migration idempotent.
- No new migrations are introduced, so there is no cross-PR migration
ordering requirement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, shell/tool-use enabled, used to
split the existing branch, verify the isolated PR branch, and create
this PR.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 09:00:54 -05:00
Dotta 4103978578 Polish operator sidebar and issue property controls (#5355)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators use the board sidebar and issue properties panel to move
between companies and understand task metadata
> - Small UI regressions in these controls make repeated board operation
slower and less predictable
> - The local branch already contained targeted fixes for company
ordering, issue date display, and sidebar rail sizing
> - This pull request isolates those operator UI quality-of-life fixes
into a standalone branch against `origin/master`
> - The benefit is a focused, reviewable PR that can merge independently
of the issue-thread activity work

## What Changed

- Shows issue property timestamps with time, not just dates.
- Adds edit-mode support for ordering companies in the sidebar company
menu.
- Fixes a workspace switcher rail regression and keeps the account menu
aligned with the rail width.
- Includes focused component coverage for the touched controls.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/SidebarCompanyMenu.test.tsx
ui/src/components/Layout.test.tsx
ui/src/components/SidebarAccountMenu.test.tsx` — 4 files passed, 29
tests passed.
- `pnpm --filter /ui typecheck`
- PR checks on `a4030f7a` are green: policy, verify, serialized server
suites 1/4-4/4, e2e, Canary Dry Run, Greptile Review, and Snyk.
- Captured a local Storybook screenshot of `Product/Navigation & Layout`
after the sidebar polish:
`/tmp/pap-3659-screenshots/navigation-layout-after.png`.
- Confirmed the PR changes 8 files and does not include `pnpm-lock.yaml`
or `.github/workflows/*`.

## Risks

- Low to moderate UI risk: this touches shared sidebar components and
issue metadata rendering.
- The company ordering behavior depends on existing query/cache
behavior, so stale cache bugs would show up as ordering inconsistencies.
- No database, API, workflow, or lockfile changes are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, shell/tool-use enabled, used to
split the existing branch, verify the isolated PR branch, and create
this PR.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 08:59:39 -05:00
Dotta 68f69975a4 Harden control-plane safety and issue identifiers (#5292)
## Thinking Path

> - Paperclip relies on issue identifiers, execution policies, and agent
heartbeat rules to keep autonomous work auditable.
> - Safety checks need to reject ambiguous agent handoffs, and
identifier parsing needs to support Cloud tenant prefixes.
> - Agent instructions also need to make final-disposition rules
explicit so work does not stall in vague states.
> - This pull request isolates backend correctness and governance
hardening from the UI and recovery-system-notice branches.
> - The benefit is safer in-review transitions, better identifier
compatibility, and clearer agent operating contracts.

## What Changed

- Fixed run-aware confirmation ordering and interrupted-run state
cleanup.
- Added Cloud tenant identity bootstrap and alphanumeric issue
identifier support across shared parsing and server routes.
- Guarded agent-authored `in_review` updates unless a real review path
exists.
- Tightened heartbeat disposition instructions in adapter
utilities/default AGENTS/Paperclip skill.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-identifier-routes.test.ts
server/src/__tests__/issue-execution-policy-routes.test.ts
packages/adapter-utils/src/server-utils.test.ts` initially had the first
execution-policy test hit Vitest's 5s timeout under the parallel bundle
while the rest passed.
- `pnpm exec vitest run
server/src/__tests__/issue-execution-policy-routes.test.ts
--testTimeout=20000` passed with 10/10 tests.

- Follow-up: `pnpm run typecheck:build-gaps` passed.
- Follow-up: `pnpm --filter @paperclipai/ui typecheck` passed.
- Follow-up: `pnpm vitest run
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/company-portability.test.ts
server/src/__tests__/costs-service.test.ts` passed.
- Follow-up: `pnpm vitest run ui/src/context/LiveUpdatesProvider.test.ts
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/issue-reference.test.ts
ui/src/lib/issue-timeline-events.test.ts` passed.

## Risks

- Medium control-plane risk: in-review update validation changes agent
behavior. The error message is explicit and tests cover allowed review
paths.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 07:49:47 -05:00
Dotta a1b30c9f35 Add planning mode for issue work (#5353)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies.
> - Issues are the core unit of work, and issue comments are how board
users and agents coordinate execution.
> - Some issue conversations need to produce plans and approvals instead
of immediate implementation work.
> - The existing issue contract did not distinguish standard execution
comments from planning-oriented issue work.
> - This pull request adds an issue work-mode contract and board UI
affordances for standard vs planning mode.
> - The benefit is that planning-mode issues can be created, displayed,
discussed, and carried through agent heartbeat context without losing
the normal issue workflow.

## What Changed

- Added `standard` / `planning` issue work-mode contracts across DB,
shared validators/types, server issue flows, plugin protocol, and
adapter heartbeat payloads.
- Added an idempotent `0081_optimal_dormammu` migration for
`issues.work_mode`, ordered after current `public-gh/master` migrations.
- Updated heartbeat/context summaries and issue-thread interaction
behavior so planning work mode is preserved when creating suggested
follow-up issues.
- Added UI support for planning-mode issue creation, issue rows, detail
composer styling, and composer work-mode toggles.
- Added focused server/shared/UI tests plus a Playwright visual
verification spec for planning-mode surfaces.
- Rebased the branch onto current `public-gh/master` and added durable
planning-mode screenshots under `doc/assets/pap-3368/`.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/issue.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-context-summary.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/NewIssueDialog.test.tsx
ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config
tests/e2e/playwright.config.ts
tests/e2e/planning-mode-visual-verification.spec.ts`

## Screenshots

Desktop planning detail:

![Desktop planning
detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-detail.png)

Desktop planning row:

![Desktop planning
row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-row.png)

Desktop staged standard toggle:

![Desktop staged standard
toggle](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-standard-toggle.png)

Mobile planning detail:

![Mobile planning
detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-detail.png)

Mobile planning row:

![Mobile planning
row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-row.png)

## Risks

- Medium migration risk: this adds a non-null issue column. The
migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied
an older branch-local migration number can still apply the final
numbered migration safely.
- Medium contract risk: issue payloads, plugin payloads, and adapter
heartbeat payloads now include work mode; compatibility is handled by
defaulting missing values to `standard`.
- UI risk is moderate because composer controls changed; focused
component tests and visual e2e coverage exercise standard vs planning
display and toggle behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with
shell/tool use. Exact context-window size is not exposed in this
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 07:01:28 -05:00
Dotta 320fd5d23b Add full company search page (#5293)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators need to find work, documents, agents, projects, comments,
and activity across a company without jumping through separate surfaces.
> - The existing Command-K flow was useful for fast navigation but not
enough for deeper company-wide discovery.
> - Search also needs company-scoped backend contracts, query cost
controls, and indexed document matching so it stays safe as company data
grows.
> - This pull request adds a full company search API and a dedicated
board search page that Command-K can hand off to.
> - The benefit is a single searchable control-plane surface with richer
result context, recents, highlights, and test coverage across server and
UI behavior.

## What Changed

- Added a company-scoped search endpoint/service with query validation,
rate limiting, text matching, fuzzy title matching, and result typing
shared through `@paperclipai/shared`.
- Added idempotent search migrations for document search indexes and
fuzzy matching support.
- Added the full `/companies/:companyKey/search` UI, search result row
components, highlighted snippets, recent searches, and sidebar/Command-K
handoff.
- Added Storybook coverage for search surfaces and Vitest coverage for
server search behavior, rate limiting, route generation, Command-K
behavior, and the search page.
- Addressed Greptile findings by renaming the no-match SQL helper,
applying search pagination after cross-type merge sorting, and
lazy-initializing the default search service so unrelated route-test
mocks do not need to know about it.
- Merged current `public-gh/master` and renumbered the search migrations
behind upstream `0078_white_darwin`: search indexes are now
`0079_company_search_document_indexes` and fuzzy matching is
`0080_company_search_fuzzystrmatch`.

## Verification

- `git fetch public-gh master`
- `git diff --check public-gh/master...HEAD`
- `git diff --name-only public-gh/master...HEAD | rg '^pnpm-lock\.yaml$'
|| true` produced no output before opening the PR.
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/company-search-service.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts
ui/src/pages/Search.test.tsx ui/src/components/CommandPalette.test.tsx
ui/src/lib/company-routes.test.ts` passed: 5 files, 25 tests.
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck
&& pnpm --filter @paperclipai/ui typecheck` passed.
- `pnpm exec vitest run
server/src/__tests__/company-search-service.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm
--filter @paperclipai/server typecheck` passed after Greptile pagination
fixes.
- `pnpm exec vitest run
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts
server/src/__tests__/company-search-service.test.ts && pnpm --filter
@paperclipai/server typecheck` passed after the CI mock fix.
- After resolving the migration conflict with current
`public-gh/master`: `pnpm --filter @paperclipai/db typecheck && pnpm
exec vitest run server/src/__tests__/company-search-service.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm
--filter @paperclipai/server typecheck` passed.
- DB migration numbering check passed as part of `@paperclipai/db`
typecheck.
- UI states are covered by the added Storybook stories in
`ui/storybook/stories/search.stories.tsx`.
- GitHub reports the PR merge state as `CLEAN` on head `18e54fa8`.
- GitHub PR checks are green on head `18e54fa8`: policy, verify,
serialized server shards 1/4 through 4/4, e2e, canary dry run, Snyk, and
Greptile Review.

## Risks

- Search ranking and snippets are new user-facing behavior, so reviewers
should check whether result ordering feels right on real company data.
- Search touches broad company data, so company scoping and query
cost/rate-limit behavior should be reviewed carefully.
- The migrations add search indexes/extensions; they are idempotent with
`IF NOT EXISTS` for users who may have applied an earlier branch
migration number.

> ROADMAP.md checked. This PR adds a focused board search surface and
does not duplicate an open roadmap item.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub CLI
session with medium reasoning effort. Existing branch commits were
produced across prior agent sessions; this packaging pass verified,
opened the PR, addressed Greptile findings, resolved migration conflicts
after upstream PRs landed, and got PR checks green.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 06:32:37 -05:00
Dotta 424e81d087 Improve operator workflow QoL (#5291)
## Thinking Path

> - Paperclip is a control plane operators use repeatedly to supervise
agent companies.
> - Common operator workflows depend on fast scanning of inboxes, issue
sidebars, workspaces, cost totals, and runtime services.
> - Several small UI and service gaps made those workflows slower or
less clear.
> - This pull request groups the operator-facing QoL changes that can
stand alone from recovery and adapter work.
> - The benefit is a denser, clearer board experience for issue triage
and workspace operation.

## What Changed

- Added inbox assignee/project grouping and issue list token/runtime
totals.
- Improved issue properties with removable blocker chips and workspace
task links.
- Improved execution workspace layout, runtime controls, issues tab
default, and stopped-port reuse behavior.
- Added mobile markdown/routine dialog fixes, page title company names,
sidebar polish, and dashboard run task label cleanup.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/lib/inbox.test.ts
ui/src/components/IssueProperties.test.tsx
ui/src/components/WorkspaceRuntimeControls.test.tsx
server/src/__tests__/workspace-runtime.test.ts
server/src/__tests__/costs-service.test.ts`

## Risks

- Medium UI risk because this touches several operator surfaces. The
branch is intentionally grouped around workflow/QoL files and keeps the
file count below the Greptile limit.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 06:30:44 -05:00
Dotta 11ffd6f2c5 Improve ACPX adapter configuration (#5290)
## Thinking Path

> - Paperclip orchestrates AI agents across several adapter
implementations.
> - ACPX is a local adapter path that can proxy Claude and Codex-style
execution.
> - Its configuration needed stronger schema defaults, provider-aware
model handling, and better UI support.
> - Plugin authors also need clear docs for managed resources.
> - This pull request improves ACPX adapter configuration and documents
plugin-managed resources.
> - The benefit is a more predictable adapter setup path without
changing unrelated control-plane behavior.

## What Changed

- Improved ACPX config schema, execution config handling, UI build
config, and route coverage.
- Added ACPX model filtering support and tests.
- Updated the agent config form and storybook coverage for ACPX
model/provider behavior.
- Expanded plugin authoring documentation for managed resources.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/acpx-local-execute.test.ts
server/src/__tests__/adapter-routes.test.ts
ui/src/lib/acpx-model-filter.test.ts`

## Risks

- Low-to-medium risk: adapter configuration behavior changes can affect
ACPX users, but the change is isolated to ACPX/plugin-doc surfaces and
covered by targeted adapter tests.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 06:06:47 -05:00
Dotta 454edfe81e Add recovery handoff system notices (#5289)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Agent runs can end productively while the source issue still lacks a
durable final disposition.
> - That leaves the control plane unsure whether to resume, escalate, or
close the work.
> - Issue comments also need a presentation contract so system-authored
recovery notices can render as first-class thread messages without
overloading normal comments.
> - This pull request adds successful-run handoff recovery, comment
presentation metadata, and system notice rendering.
> - The benefit is stricter task liveness with clearer operator-facing
recovery state.

## What Changed

- Added successful-run handoff decisions, wake payloads, escalation
behavior, and recovery tests.
- Added issue comment presentation metadata with migration
`0078_white_darwin.sql` and shared/server/company portability support.
- Rendered recovery/system notices in issue chat with dedicated UI
components, fixtures, tests, and storybook/lab coverage.
- Included the current recovery model-profile hint patch so automatic
recovery follow-ups use the cheap profile.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/services/recovery/successful-run-handoff.test.ts
ui/src/components/SystemNotice.test.tsx
ui/src/lib/system-notice-comment.test.ts
ui/src/components/IssueChatThreadSystemNotice.test.tsx`

## Risks

- Migration-bearing PR: merge this before any other branch that might
later add a migration.
- The branch touches both recovery services and issue-thread rendering,
so review should pay attention to recovery wake idempotency and comment
metadata compatibility.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 06:05:58 -05:00
Devin Foley 50db8c01d2 Serialize sandbox callback bridge against concurrent heartbeats (#5326)
> **Stacked PR.** This PR's branch carries cumulative content from #5324
(bridge allowlist expand) and #5325 (env sanitization) — the
mutex/sha256 logic in this PR sits on top of both. Reviewers should
focus on the files this PR's commit touches:
`packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`,
`packages/adapter-utils/src/ssh.ts`, and
`packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto
`master` and force-push once both prerequisite PRs are merged.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent that runs in a sandbox or via SSH talks back to the
Paperclip server through a per-lease callback bridge whose entrypoint
script is uploaded to the remote
> - When two heartbeats target the same agent on the same machine
concurrently, both upload the bridge entrypoint and both write to the
same response files — producing torn-write races: `SyntaxError:
Identifier 'randomUUID' has already been declared` from a concatenated
upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and
0-byte commits from a truncated stdin
> - This pull request serializes those operations with a POSIX
`mkdir`-mutex (PID liveness check + atomic rename) at the bridge
entrypoint upload, applies the same lock to the bridge response writer,
forwards stdin into remote ssh commands so the entrypoint payload
arrives intact, and verifies a sha256 of the upload before promoting it
> - The benefit is concurrent heartbeats no longer corrupt each other's
bridge state

## What Changed

- `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize
entrypoint upload and response writes via POSIX `mkdir`-mutex with PID
liveness; sha256 the upload before promoting via `mv`; content-skip when
the existing entrypoint already matches
- `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh
commands through the SSH managed runtime so `cat > "$remote_upload"`
actually receives the base64-encoded entrypoint
- `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the
stdin-forwarded SSH path
- `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover
the mutex, content-skip, sha256-verify, and atomic-rename paths

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
- `pnpm typecheck` clean
- Manual: two parallel heartbeats targeting the same SSH agent no longer
race on the bridge entrypoint or response files

## Risks

Medium. Serializing previously-parallel operations adds latency on the
contended path (one heartbeat waits on another), bounded by the
entrypoint upload time. The mutex includes PID liveness so a crashed
heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear
"torn upload" failure mode instead of silent 0-byte commits.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — tests cover mutex
+ sha256-verify + stdin-forwarded ssh
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 20:01:04 -07:00
Devin Foley f6bad8f6bf Sanitize remote execution envs at the boundary (#5325)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters spawn CLIs against local, SSH, and sandbox targets,
threading a runtime env through `runAdapterExecutionTargetProcess` and
the SSH/sandbox runners
> - Host identity vars (HOME, TMPDIR, XDG_*, NVM_DIR, PATH) routinely
leak into the env we send to remote targets — sometimes via test probes,
sometimes via runtime config — and break sandboxed/SSH'd CLIs whose own
profiles set those values correctly
> - The sanitization logic existed but lived alongside other helpers in
`server-utils.ts` and was applied piecemeal at adapter callsites, so it
was easy to bypass
> - This pull request lifts the sanitization into a standalone
`remote-execution-env.ts`, applies it at the SSH and sandbox runtime
boundary so every remote spawn goes through it, and removes the
duplicated callsite-level filtering
> - The benefit is identity-bound host env stops leaking across
SSH/sandbox transports regardless of which adapter calls in

## What Changed

- `packages/adapter-utils/src/remote-execution-env.ts`: new module —
single source of truth for which env keys are identity-bound and how to
strip them when the value matches the host's value
- `packages/adapter-utils/src/server-utils.ts`: remove the inline
sanitization (now in `remote-execution-env.ts`)
- `packages/adapter-utils/src/execution-target.ts`: apply sanitization
at the sandbox runtime boundary
- `packages/adapter-utils/src/ssh.ts`: apply sanitization at the SSH
spawn boundary
- `packages/adapters/opencode-local/src/server/test.ts`: drop
now-redundant callsite filtering
- `packages/adapters/pi-local/src/server/test.ts`: drop now-redundant
callsite filtering
- New tests `execution-target.test.ts` and
`execution-target-sandbox.test.ts` cover the sanitizer flow at both
transports, including positive cases (host-shaped path stripped) and
explicit-override preservation

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-opencode-local --project
@paperclipai/adapter-pi-local`
- `pnpm typecheck` clean

## Risks

Low–medium. The sanitization is now applied at one layer (boundary)
instead of N (callsites), so behavior is more consistent. Any adapter
that previously relied on a leaked host var landing on the remote shell
would now see it stripped — but those reliances were what this change
exists to fix.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests at both
transports
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:14 -07:00
Devin Foley 36eaf9778f Expand sandbox callback bridge allowlist to cover the documented heartbeat surface (#5324)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When an agent runs in an e2b sandbox or other non-managed
environment, it talks back to the Paperclip server through a per-lease
callback bridge that proxies HTTP requests
> - The bridge has an allowlist of method/path patterns it will forward;
anything outside the list is rejected to keep the bridge tight
> - The allowlist had drifted behind what the heartbeat documentation
describes as the supported callback surface — several documented
endpoints (issue updates, agent-side log emit, work-status writes) were
being rejected at the bridge
> - This pull request expands the allowlist to cover the documented
heartbeat surface and adds tests that pin every newly-allowed pattern,
so the doc and the bridge stay in sync
> - The benefit is sandboxed runs no longer hit "method not allowed" /
"path not allowed" rejections on the documented set of callbacks

## What Changed

- `packages/adapter-utils/src/sandbox-callback-bridge.ts`: expand the
method/path allowlist to match the documented heartbeat callback surface
- `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: add
coverage for every newly-allowed pattern, plus negative cases for
patterns that should still be rejected

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
- `pnpm typecheck` clean
- Manual: previously-rejected callbacks from sandboxed runs now succeed
end-to-end

## Risks

Low. The allowlist only grows; nothing previously allowed is now
blocked. Tests pin both the new allowed patterns and that out-of-doc
patterns stay rejected.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
added patterns + still-rejected negatives
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:11 -07:00
Devin Foley 83e7ecc58e Preserve scope on manual heartbeat invokes (#5323)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The agent live-run route lets operators trigger a manual heartbeat
invocation so an agent can pick up a specific issue or step out of band
> - The current route flow drops the caller's scope (issue/run context)
when forwarding the manual invoke into the heartbeat service, so the
resulting run loses the targeting the operator specified
> - This pull request threads the operator-supplied scope through the
manual invoke path on both the server route and the UI client, with a
regression test that confirms the scope round-trips
> - The benefit is manual heartbeat invokes from the live-run UI
actually pick up the scoped issue/run instead of falling through to the
agent's default routine

## What Changed

- `server/src/routes/agents.ts`: forward the operator-supplied scope
into the manual invoke heartbeat service call
- `server/src/__tests__/agent-live-run-routes.test.ts`: new test
verifying the manual invoke path preserves scope
- `ui/src/api/agents.ts`: pass scope through the live-run client API

## Verification

- `pnpm vitest run --no-coverage
server/src/__tests__/agent-live-run-routes.test.ts`
- `pnpm typecheck` clean

## Risks

Low. The change is purely additive on the route surface — handlers that
did not previously pass scope continue to work; handlers that did pass
it now have it preserved instead of dropped.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new test covers
the preserved-scope path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (internal API change, no visible UI shift)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:08 -07:00
Devin Foley 9fb0c73e0a Raise gemini-local hello probe timeout to 60s for SSH and E2B targets (#5322)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Gemini adapter's environment Test surfaces a hello probe so
operators can confirm the CLI runs end-to-end on the configured target
> - On SSH and E2B sandbox targets the round-trip cost (login-shell
sourcing, network, model warm-up) routinely exceeds the existing 10s
probe timeout, so the probe spuriously fails on environments that are
actually healthy
> - This pull request raises the gemini-local hello probe timeout to
60s, matching the timeout we use for slower-bootstrapping adapters
> - The benefit is the Gemini Test action no longer reports false
negatives on remote targets that need a longer first-run window

## What Changed

- `packages/adapters/gemini-local/src/server/test.ts`: hello probe
timeout raised from 10s to 60s

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-gemini-local`
- Manual: SSH and E2B Gemini hello probes now complete cleanly without
spurious timeouts

## Risks

Low. A 60s ceiling on a non-blocking probe is consistent with sibling
adapters; the only behavior change is a longer worst-case wait when the
probe genuinely hangs.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — N/A (one-line
timeout change)
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:04 -07:00
Dotta d6d7a7cea6 Add routine revision history and restore flow (#5285)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Routines are the scheduled/recurring work surface that keeps a
company operating without manual kicks.
> - Operators need routine edits to be auditable and recoverable,
especially when routines control assignments, prompts, triggers, and
webhook secrets.
> - Documents already have revision-style safety, but routines did not
have equivalent history or restore semantics.
> - This pull request adds append-only routine revisions across the
database, shared contracts, server routes, and board UI.
> - The benefit is safer routine iteration: users can inspect history,
compare changes, restore older definitions, and avoid overwriting newer
edits.

## What Changed

- Added `routine_revisions` storage, latest revision pointers on
routines, shared types, validators, and API docs for routine revision
history.
- Added server service/route support for listing routine revisions,
conflict-aware routine saves, and append-only restore operations.
- Added a History tab on routine detail with revision preview,
structured change summaries, description line diffs, dirty-edit
blocking, restore confirmation, and restored webhook secret surfacing.
- Extracted the line diff helper from `DocumentDiffModal` into
`ui/src/lib/line-diff.ts` for reuse.
- Rebased the branch onto current `public-gh/master` and renumbered the
routine revision migration to `0077_unusual_karnak` after upstream
`0076_useful_elektra`.
- Made the `0077` routine revision migration idempotent so installs that
already applied the branch-local `0076_unusual_karnak` can safely
advance.
- Updated the plugin SDK test harness routine fixture with the new
revision fields required by the shared `Routine` contract.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations` passed.
- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/routine.test.ts` passed.
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/lib/line-diff.test.ts
ui/src/components/RoutineHistoryTab.test.tsx
ui/src/lib/workspace-routines.test.ts ui/src/pages/Routines.test.tsx`
passed.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/routines-service.test.ts --pool=forks
--poolOptions.forks.isolate=true` passed.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/routines-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true` passed.
- `pnpm --filter @paperclipai/plugin-sdk typecheck` passed after
updating the SDK test harness fixture.
- `pnpm --filter @paperclipai/plugin-sdk build` passed; this refreshed
local generated SDK output needed by plugin example typechecks.
- `pnpm -r typecheck` passed.

## Risks

- Medium migration risk: this adds routine revision storage and
backfills existing routines. The migration is ordered after upstream
`0076` and uses `IF NOT EXISTS` / duplicate-object guards to tolerate
earlier branch-local migration application.
- Restore behavior intentionally appends a new revision instead of
mutating history; callers expecting an in-place rollback need to follow
the new latest revision pointer.
- Restoring webhook triggers recreates webhook secret material, so users
must copy newly surfaced secrets after restore.
- Conflict-aware saves now reject stale routine edits when the client
sends an older `baseRevisionId`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, with shell/tool use in a local
git worktree. Exact context-window size is not exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshots: not attached in this draft PR; the new UI flow is covered
by component tests listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-05 11:54:52 -05:00
Devin Foley 9578dc3da7 Wire per-adapter sandbox install commands through test and execute paths (#5280)
> **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin
staging) and #5279 (honest-resolvability + login-profiles). The
cumulative diff against `master` includes both of those PRs' content;
the files touched by *this* PR's commit are the new
`maybeRunSandboxInstallCommand` helper in
`packages/adapter-utils/src/execution-target.ts` and the per-adapter
`index.ts`/`server/test.ts`/`server/execute.ts` wiring under
`packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The
honest resolvability check from #5279 is what gives this PR's install
command a meaningful "did it actually land on PATH" follow-up.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Sandbox execution targets are ephemeral — each fresh lease starts
from a template image that may or may not have the agent CLIs
preinstalled
> - When a CLI isn't preinstalled, the resolvability probe fails at
`command -v` and the hello probe never runs
> - There's no shared mechanism for "before you probe or provision,
install the CLI on this sandbox"
> - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per
adapter and a `maybeRunSandboxInstallCommand` helper that runs it via
the existing sandbox login shell, captures structured output, and never
throws (so the resolvability + hello probe still run after); each
adapter's `test()` and `execute()` share the constant so the two
callsites can't drift
> - The benefit is a fresh sandbox lease without a preinstalled CLI now
installs it once via `sh -lc` before the resolvability probe and before
managed-runtime provisioning, with a uniform
`<adapter>_install_command_run` check on the test report

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: add
`AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand`
(runs the install via existing sandbox shell, captures
exit/stdout/stderr, returns a structured info/warn check, never throws)
- Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()`
and `execute()` share a single source of truth
- Wire each of the 6 affected adapter `testEnvironment()`s to call
`maybeRunSandboxInstallCommand` before
`ensureAdapterExecutionTargetCommandResolvable`
- Pass `installCommand: SANDBOX_INSTALL_COMMAND` through
`prepareAdapterExecutionTargetRuntime` in each adapter's `execute()`
- Per-adapter install commands use npm globals where possible so
binaries land on a PATH segment the template already exports:
  - claude → `npm install -g @anthropic-ai/claude-code`
  - codex → `npm install -g @openai/codex`
  - cursor → `curl https://cursor.com/install -fsS | bash`
  - gemini → `npm install -g @google/gemini-cli`
  - opencode → `npm install -g opencode-ai`
  - pi → `npm install -g @mariozechner/pi-coding-agent`

SSH and local targets ignore `installCommand` (SSH runtime takes no such
param; local short-circuits before runtime prep), so this is a no-op for
non-sandbox environments.

## Verification

- `pnpm typecheck` clean
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
and per-adapter projects pass
- Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) —
each goes `install_command_run → resolvable → hello_probe_passed` (Codex
and Pi land on `hello_probe_auth_required`, which is the
configured-credentials problem, not an install issue)
- SSH no-regression: SSH Claude still passes; the helper short-circuits
on non-sandbox targets

## Risks

Medium — adds a network/CPU cost (npm install / curl) on every fresh
sandbox lease. Cost is bounded (one-time per lease, typically tens of
seconds for npm globals), and the helper never throws so a failing
install still lets the report run resolvability and hello probes. If a
sandbox image already has the CLI, the install is an idempotent
reinstall.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:29:28 -07:00
Devin Foley af9386f879 Run a real command-v probe and source login profiles before exec in e2b sandboxes (#5279)
> **Stacked PR.** Sits on top of #5278 (`e2b/stage-stdin-to-temp-file`)
which ships the stdin-staging fix this builds on. The cumulative diff
against `master` includes that PR's content; the files touched by *this*
PR's commit are `packages/adapter-utils/src/execution-target.ts`,
`packages/plugins/sandbox-providers/e2b/src/plugin.ts`, and
`packages/plugins/sandbox-providers/e2b/src/plugin.test.ts`.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The adapter Test flow does an "is the command resolvable?" probe
before running the hello probe so the report distinguishes "binary not
installed" from "binary errored"
> - For sandbox targets, that resolvability check was a no-op
early-return — every sandboxed adapter test reported "Command is
executable" regardless of whether the binary existed
> - That made the resolvability check disagree with the hello probe in a
way that looked like a PATH bug, when it was actually a missing CLI
> - Separately, the e2b spawn used `sandbox.commands.run` with a
non-login non-interactive shell whose PATH did not include npm-globals,
nvm shims, or anything else the template installs via
`.profile`/`.bashrc`
> - This pull request makes the resolvability check honest by running a
real `command -v` invocation through the sandbox runner, and aligns the
e2b spawn with SSH by sourcing login profiles before `exec env KEY=val
<cmd>`
> - The benefit is the e2b sandbox spawn agrees with the hello probe and
finds CLIs at template-installed paths

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: add
`ensureSandboxCommandResolvable` that runs `command -v <cli>` through
the sandbox runner; replace the early-return in
`ensureAdapterExecutionTargetCommandResolvable` for sandbox targets
- `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: replace
`buildCommandLine` with `buildLoginShellScript` (sources `/etc/profile`,
`~/.profile`, `~/.bash_profile`, `~/.bashrc`, `~/.zprofile`, and nvm.sh
before `exec env KEY=val <cmd>`); env vars are interpolated inline so
user-configured adapter env always wins over profile-exported values;
drop the now-unused `envs:` SDK option
- `plugin.test.ts` updated for the login-shell wrapping

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/sandbox-e2b` —
17/17 plugin tests pass
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
clean
- `pnpm typecheck` clean
- Manual: previously every sandboxed adapter said "Command is
executable" then the hello probe failed with "exec: not found". After
this change, missing CLIs surface honestly at the resolvability step.
SSH no-regression: SSH Claude probe still passes.

## Risks

Medium — sandbox adapter Test reports will start failing at the
resolvability step for environments where the CLI was never actually
installed. This was always the real state; the previous "Command is
executable" message was incorrect. Operators should expect
previously-green-but-broken sandbox environments to report accurately.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — `plugin.test.ts`
updated for the login-shell wrapping
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:21:37 -07:00
Devin Foley cb6af7c2cc Stage stdin to a temp file so the e2b sandbox executor delivers it reliably (#5278)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The e2b sandbox provider implements `onEnvironmentExecute` so
adapters can spawn CLIs in an e2b sandbox
> - For commands that need stdin (e.g. piping a hello prompt to a CLI),
the previous implementation awaited a foreground `commands.run({ stdin:
true, ... })` and then tried to call `sendStdin(pid)` on the now-dead
PID
> - That call resolves only after the process exits, so stdin was never
delivered and e2b raised "process not found"
> - This pull request stages stdin to `/tmp/paperclip-stdin-<uuid>`
inside the sandbox and shell-redirects it (`exec '<cmd>' '<args>' <
'<file>'`), making the command synchronous regardless of whether stdin
is supplied
> - The benefit is adapter Test probes that pipe a hello prompt to a CLI
inside an e2b sandbox now actually deliver the prompt

## What Changed

- `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: replace the
broken async `commands.run` + `sendStdin` flow with stdin-staging to a
sandbox temp file and shell-redirection
- Staged file is removed in a `finally` block; write failures propagate
after best-effort cleanup

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/sandbox-e2b` —
all 17 unit tests pass
- `pnpm typecheck` clean
- Manual: a sandboxed adapter Test probe that pipes a hello prompt now
receives the prompt

## Risks

Low risk — `plugin.test.ts` already encodes the temp-file design; the
change brings the implementation in line with the test.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — existing tests
already encode the new design
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:49 -07:00
Devin Foley 5c2f9aba9d Run explicit-environment adapter tests on the requested target instead of falling back to the host (#5277)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When a user clicks "Test" on a configured environment (SSH or
sandbox), the agent-test route exercises the adapter against that target
> - The route previously fell back to running the probe on the Paperclip
host whenever an explicit environment target couldn't be resolved, with
the test report still saying "passed"
> - That hid two real failure modes: misconfigured environments looked
green, and sandbox environments were never actually exercised
> - This pull request acquires an ad-hoc lease and realizes a workspace
for sandbox/plugin test environments, resolves a sandbox execution
target wired to the environment runtime, and returns synthesized
diagnostics instead of running a host probe when an explicit env target
can't be resolved
> - The benefit is the Test action surfaces the real environment state
and never silently exercises the wrong machine

## What Changed

- `server/routes/agents.ts`: acquire an ad-hoc lease and realize a
workspace for sandbox/plugin test environments; resolve a sandbox
execution target wired to the environment runtime
- Return synthesized diagnostics (no host fallback) when an explicit env
target can't be resolved
- `server/services/environment-runtime.ts`: small adjustments to support
the explicit-env-target case
- Clarify test-route messages so they no longer claim a host fallback in
explicit env flows
- New `agent-test-environment-routes.test.ts` covers the guard and
missing-environment path

## Verification

- `pnpm vitest run --no-coverage
server/src/__tests__/agent-test-environment-routes.test.ts`
- `pnpm typecheck` clean
- Manual: a deliberately misconfigured sandbox environment now reports
diagnostics instead of a misleading host-pass

## Risks

Medium — Test route behavior change. Explicit environments that
previously appeared to pass via host fallback will now report their real
state. This is the desired behavior, but operators should expect to see
new failures for environments that were never actually working.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
guard + missing-env paths
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:32 -07:00
Devin Foley 9042b8d042 Write apikey-mode auth.json so Codex CLI 0.122+ can authenticate via OPENAI_API_KEY (#5276)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Codex adapter spawns the OpenAI Codex CLI to drive the model
> - Codex CLI 0.122 changed how it reads credentials: it ignores
`OPENAI_API_KEY` from the environment and reads only
`$CODEX_HOME/auth.json`
> - Without auth.json, Codex 0.122+ returns 401 "Missing bearer or basic
authentication" on `/v1/responses` even when `OPENAI_API_KEY` is
forwarded into the sandbox or remote shell
> - This pull request materializes an apikey-mode `auth.json` in the
managed Codex home (or per-run for the test probe) when an
`OPENAI_API_KEY` is configured
> - The benefit is configured Codex API keys authenticate correctly with
current Codex CLI versions across local, SSH, and sandbox targets

## What Changed

- `codex-home.ts`: add `writeApiKeyAuthJson()` and let
`prepareManagedCodexHome` accept an `apiKey` override that replaces the
symlinked host auth.json with an apikey-mode file
- `execute.ts`: pass `envConfig.OPENAI_API_KEY` into
`prepareManagedCodexHome` so the managed (and synced-to-remote) Codex
home authenticates via the configured key
- `test.ts`: when `OPENAI_API_KEY` is available, wrap the hello probe
with a small shell that materializes a per-run `$CODEX_HOME/auth.json`
before exec'ing codex; key content rides through env to avoid leaking
into process listings
- Update the `codex_hello_probe_auth_required` hint to explain Codex CLI
does not read `OPENAI_API_KEY` from env

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-codex-local`
- `pnpm typecheck` clean
- Manual: Codex 0.122.0 with empty `CODEX_HOME` returns 401 with
env-only auth; with this change it authenticates cleanly

## Risks

Low risk — when no API key is configured, behavior is unchanged (no
auth.json written, existing chatgpt-mode flow preserved). Apikey-mode
`auth.json` is the upstream-supported format.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:27 -07:00
Devin Foley 44c365dea3 Stop leaking host process.env into the remote Pi SSH probe (#5275)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Pi adapter runs the pi-coding-agent CLI against local, SSH, and
sandbox execution targets
> - The Test path's hello probe spreads the host's `process.env` into
the remote process env, including the macOS PATH
> - The leaked Mac PATH overrides the nvm-sourced PATH set up by
`buildSshSpawnTarget`, so on a Linux SSH target `node` resolves to
system Node 18 instead of nvm's Node 20+
> - pi-coding-agent v0.68 / pi-tui then crashes at
`pi-tui/dist/utils.js:27` with `SyntaxError: Invalid regular expression
flags` on the `/v` unicode-sets regex (a Node 20+ feature)
> - This pull request stops the leak — same fix as the opencode SSH
probe — by passing only user-configured adapter env to the probe when
the target is remote
> - The benefit is the Pi hello probe now passes end-to-end against an
SSH target without the Node version mismatch

## What Changed

- `packages/adapters/pi-local/src/server/test.ts` passes only the
user-configured adapter env (`normalizeEnv(env)`) to
`runAdapterExecutionTargetProcess` when the target is remote
- Local probes still get the full `runtimeEnv` so headless permission
injection keeps working

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-pi-local`
- `pnpm typecheck` clean
- Manual: Pi hello probe goes from `pi_hello_probe_failed` (Node 18
regex error) to `pi_hello_probe_passed` against an SSH target

## Risks

Low risk — same pattern shipped for opencode-local and consistent with
claude-local / codex-local / gemini-local.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — pattern mirrors
sibling adapters
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:23 -07:00
Devin Foley 028c5aa00a Stop leaking host process.env into the remote OpenCode SSH probe (#5274)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The OpenCode adapter runs against local, SSH, and sandbox execution
targets
> - The Test path's hello probe spreads the Paperclip host's
`process.env` into the remote process env, which over SSH gets exported
on the remote shell
> - On a Linux SSH target, `HOME=/Users/...` and a host XDG_CONFIG_HOME
pointing at a macOS `/var/folders/...` temp dir cause OpenCode to walk a
host-only path and fail with `EACCES: permission denied, mkdir '/Users'`
> - This pull request stops the leak by passing only user-configured
adapter env to the probe when the target is remote, matching the pattern
already used by claude-local, codex-local, and gemini-local
> - The benefit is the OpenCode hello probe now passes end-to-end
against an SSH target without spurious filesystem errors

## What Changed

- `prepareOpenCodeRuntimeConfig` short-circuits when the target is
remote — the host-fs temp config dir is meaningless and harmful for a
remote target
- `test.ts` passes only the user-configured adapter env (no host
`process.env` spread) to `runAdapterExecutionTargetProcess` when
`targetIsRemote`
- Local probes still get the full `runtimeEnv` so headless permission
injection keeps working

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-opencode-local`
- `pnpm typecheck` clean
- Manual: SSH OpenCode hello probe goes from `EACCES … mkdir '/Users'`
to `opencode_hello_probe_passed`

## Risks

Low risk — local probe behavior is unchanged; the change only narrows
the env passed to remote targets, matching the pattern already shipped
in sibling adapters.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — pattern mirrors
existing sibling tests
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:19 -07:00
Devin Foley ea7f53fd7d Handle Gemini CLI v0.38 stream-json wire format across parser, UI, and CLI formatter (#5273)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent uses an adapter that drives a CLI (Claude, Gemini, Codex,
etc.)
> - The Gemini adapter parses a JSONL transcript stream the CLI emits to
learn what the model said
> - Gemini CLI v0.38 changed the transcript shape: assistant text now
comes through `type=message` with `role`/`content` and terminal status
comes through `type=status` / `type=stats`
> - The existing parser was written against the older `type=assistant` /
`type=result` shape, so post-v0.38 outputs left the parsed summary empty
and downgraded the SSH hello probe to "unexpected output"
> - This pull request updates every Gemini consumer (server parser, UI
parser, CLI formatter) to accept the v0.38 shape while keeping the
legacy shape working
> - The benefit is the Gemini adapter handles current upstream output
without losing backward compatibility, with explicit test coverage for
both shapes

## What Changed

- `packages/adapters/gemini-local/src/server/parse.ts` recognizes
`type=message` events with role/content and stops downgrading them
- `packages/adapters/gemini-local/src/ui/parse-stdout.ts` mirrors the
parser changes for the live UI transcript
- `packages/adapters/gemini-local/src/cli/format-event.ts` formats the
new event shape correctly for CLI output
- `parse.test.ts` and `parse-stdout.test.ts` add v0.38 coverage;
`gemini-local-adapter.test.ts` and `execute.remote.test.ts` switch
happy-path fixtures to the current real wire format and keep dedicated
tests for the older schema

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-gemini-local` — full suite passes including new
v0.38 cases and preserved legacy cases
- `pnpm typecheck` clean

## Risks

Low risk — additive event handling. Legacy event shape path is preserved
with its own tests, so existing fixtures continue to parse identically.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:14 -07:00
Dotta 3c73ed26b5 Expand plugin host surface (#5205)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The plugin system is the extension boundary for optional product
capabilities
> - Rich plugins need more than a worker entrypoint: they need scoped
database storage, local project folders, managed agents/routines, host
navigation, and reusable UI components
> - The LLM Wiki work exposed those missing host surfaces while keeping
plugin code outside the core control plane
> - This pull request expands the core plugin host, SDK, server APIs,
and UI bridge so plugins can declare and use those surfaces
> - The benefit is that future plugins can integrate with Paperclip
through documented, validated contracts instead of bespoke server or UI
imports

## What Changed

- Added plugin-managed database namespaces and migration tracking,
including Drizzle schema/migration files and SQL validation for
namespace isolation.
- Added server support for plugin local folders, managed agents, managed
routines, scoped plugin APIs, and plugin operation visibility.
- Expanded shared plugin manifest/types/validators and SDK
host/testing/UI exports for richer plugin surfaces.
- Added reusable UI pieces for file trees, managed routines, resizable
sidebars, route sidebars, and plugin bridge initialization.
- Updated plugin docs and example plugins to use the expanded host and
SDK surface.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
packages/shared/src/validators/plugin.test.ts
server/src/__tests__/plugin-database.test.ts
server/src/__tests__/plugin-local-folders.test.ts
server/src/__tests__/plugin-managed-agents.test.ts
server/src/__tests__/plugin-managed-routines.test.ts
server/src/__tests__/plugin-orchestration-apis.test.ts
ui/src/api/plugins.test.ts ui/src/components/FileTree.test.tsx
ui/src/components/ResizableSidebarPane.test.tsx
ui/src/pages/PluginPage.test.tsx ui/src/plugins/bridge.test.ts` passed:
11 files, 67 tests.
- Confirmed this PR changes 89 files and does not include
`pnpm-lock.yaml` or `.github/workflows/*`.

## Risks

- Medium: this expands plugin host contracts across db/shared/server/ui
and includes a new core migration (`0076_useful_elektra.sql`).
- The plugin database namespace validator is intentionally restrictive;
plugin authors may need follow-up affordances for SQL patterns that
remain blocked.
- Merge this before the LLM Wiki plugin PR so the plugin can resolve the
new SDK and host APIs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window size was not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-05 07:42:57 -05:00
Dotta d6bee62f02 Fix Cloud tenant issue identifier routes (#5196)
## Summary

- Allow Cloud tenant issue identifiers with alphanumeric prefixes, such
as `PC1897-1`, to normalize as issue references.
- Resolve those identifiers through issue detail/update routes, active
run/live run polling, activity, costs, and `issueService.getById`.
- Keep UI issue-link parsing aligned so tenant links normalize back to
`/issues/<IDENTIFIER>`.

## Root Cause

Cloud tenant issue prefixes include digits from the stack-id hash. The
app-side route normalization still accepted only all-letter prefixes, so
`/api/issues/PC1897-1` skipped identifier lookup and fell through as a
non-UUID id.

## Verification

- `pnpm exec vitest run packages/shared/src/issue-references.test.ts
ui/src/lib/issue-reference.test.ts
server/src/__tests__/issue-identifier-routes.test.ts
server/src/__tests__/activity-routes.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/agent-live-run-routes.test.ts
server/src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck`
- `git diff --check`

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-04 13:20:58 -05:00
Dotta edbb670c3b Merge pull request #5154 from paperclipai/pap-3474-docker-timeout
Raise Docker image build timeout
2026-05-03 23:01:46 -05:00
Dotta fd10404374 Raise Docker image build timeout
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-03 22:52:33 -05:00
Devin Foley 47920f9c47 Speed up PR CI critical path (#5147)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies, so
developer throughput on the control plane repo directly affects how fast
the product can evolve.
> - The PR workflow is part of that throughput surface because every
change waits on it before review and merge.
> - This branch started from measured evidence that the PR critical path
was dominated by work that was either serialized unnecessarily or placed
on the wrong part of the graph.
> - The biggest concrete problems were: the canary dry run living inside
`verify`, the server isolated suites running one-by-one in a single
lane, and duplicate CI work that the PR path was paying for without
increasing coverage proportionally.
> - This pull request restructures the PR workflow so those costs are
reduced without removing the important coverage that was already
protecting release and test quality.
> - Follow-up fixes on the branch hardened the new entrypoints so they
work on clean GitHub runners and so the reduced PR typecheck path stays
self-maintaining as workspace packages evolve.
> - The benefit is materially faster PR wall-clock time while keeping
canary packaging checks, serialized-suite isolation, plugin SDK
consumers, and explicit TypeScript coverage where builds do not already
provide it.

## What Changed

- Moved the PR canary dry run into its own `Canary Dry Run` job so it
still runs on PRs but no longer extends the `verify` critical path.
- Split the custom Vitest runner into `general`, `serialized`, and `all`
modes, and added shard support for the isolated server suites.
- Added `test:run:general` and `test:run:serialized` scripts, then
rewired PR CI to fan the serialized server suites out across a 4-way
matrix.
- Added the required `@paperclipai/plugin-sdk` build preflight before
the new reduced-scope typecheck and test entrypoints so they succeed on
clean CI runners.
- Replaced the hardcoded PR build-gap list with
`scripts/run-typecheck-build-gaps.mjs`, which discovers workspace
packages whose `build` scripts skip TypeScript and runs only their
explicit `typecheck` scripts.
- Removed the redundant `pnpm build` from the PR `e2e` job because the
Playwright onboarding path boots Paperclip from source.

## Verification

- `ruby -e "require 'yaml'; YAML.load_file('.github/workflows/pr.yml');
puts 'workflow ok'"`
- `node scripts/run-vitest-stable.mjs --mode general --dry-run`
- `node scripts/run-vitest-stable.mjs --mode serialized --shard-index 0
--shard-count 4 --dry-run`
- `pnpm run typecheck:build-gaps`
- `pnpm test:run:general`
- `pnpm test:run:serialized -- --shard-index 0 --shard-count 4`
- `pnpm build`
- `pnpm paperclipai onboard --yes --run`
- `curl http://127.0.0.1:3299/api/health`

## Risks

- Branch protection or required-check configuration may need to be
updated for the new standalone `Canary Dry Run` job and the
serialized-suite matrix job names.
- `scripts/run-typecheck-build-gaps.mjs` assumes packages that need
explicit PR-time typechecking are the ones whose `build` scripts omit
`tsc`; if build conventions change, that heuristic needs to stay
aligned.
- Serialized test sharding preserves per-suite isolation, but the first
few CI runs should still be watched for shard-balance or naming
assumptions in downstream tooling.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, using high reasoning
effort with shell, git, and file-edit tool use in a local worktree.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 20:20:14 -07:00
Dotta e01ffc18d3 Merge pull request #5148 from paperclipai/pap-3474-tenant-identity-deploy
Support Cloud tenant identity bootstrap
2026-05-03 21:57:28 -05:00
Dotta ae23e02526 Support Cloud tenant identity bootstrap
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-03 21:55:52 -05:00
Devin Foley 29401b231b fix(ci): gate new release packages on npm bootstrap (#5146)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
release automation is part of the core operator trust boundary.
> - The affected subsystem is npm/GitHub Actions release publishing for
the public monorepo packages.
> - The concrete failure was that a newly added package reached
`master`, the canary workflow attempted its first publish, and npm
trusted publishing was not yet bootstrapped for that package.
> - That means the problem is not just one broken run; it is a missing
pre-merge guard that lets release-ineligible packages land and only fail
once `publish_canary` runs.
> - This pull request makes release enrollment explicit, validates that
enrollment in CI, and adds a PR-time bootstrap check against npm for
changed release-enabled package manifests.
> - The result is that we keep trusted publishing, avoid teaching CI to
`npm adduser`, and move this class of failure from post-merge canary
time to pre-merge review time.

## What Changed

- Added `scripts/release-package-manifest.json` so release-managed
public packages are explicitly enrolled instead of being inferred from
every non-private workspace package.
- Hardened `scripts/release-package-map.mjs` to validate the manifest
before release workflows rewrite versions or assemble publish payloads.
- Added `scripts/check-release-package-bootstrap.mjs` and wired it into
`.github/workflows/pr.yml` so PRs that change a release-enabled package
manifest fail if that package does not already exist on npm.
- Added release-package manifest coverage tests to
`scripts/release-package-map.test.mjs` and included them in `pnpm run
test:release-registry`.
- Wired manifest validation into `.github/workflows/release.yml` and
documented the first-publish bootstrap policy in `doc/PUBLISHING.md` and
`doc/RELEASE-AUTOMATION-SETUP.md`.

## Verification

- `pnpm run test:release-registry`
- `./scripts/release.sh canary --skip-verify --dry-run`
- Confirmed the committed diff contains no obvious PII/secrets via
targeted pattern scan before pushing.

## Risks

- Low risk overall: this is CI/release-policy code, not product runtime
logic.
- The new PR bootstrap check depends on npm metadata availability, so a
transient npm outage could block a PR that changes a release-enabled
package manifest.
- The manifest introduces a new source of truth that must stay aligned
with public package additions, but that is intentional and now enforced.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter; GPT-5-based
coding agent with tool use, terminal execution, git, and GitHub CLI.
Exact served model ID/context window are not exposed by the local
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 19:31:28 -07:00
Devin Foley a5430f010d Handle Gemini assistant message events in JSONL parser (#5143)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, including
agents
>   running the Gemini CLI (`gemini-local` adapter)
> - The Gemini CLI emits a JSONL event stream during a run that the
adapter
> parses to extract the assistant's response text, tool results, and
usage
> - Recent versions of the Gemini CLI emit assistant responses as
> `{ "type": "message", "role": "assistant", "content": ... }` events in
>   addition to the previously-handled event shapes
> - The parser was not handling the new event type, so the assistant's
actual
> response text was being silently dropped from parsed output. Callers
ended
>   up with empty assistant messages even when Gemini had successfully
>   responded
> - This PR teaches the parser to recognize `{type: "message", role:
>   "assistant"}` events and extract their content text via the same
>   `collectMessageText` helper used for other message-shaped events
> - The benefit is that Gemini runs surface the assistant's real
response in
> downstream consumers (issue comments, run logs, downstream agent
context)
>   instead of vanishing

## What Changed

- `packages/adapters/gemini-local/src/server/parse.ts`: in
`parseGeminiJsonl(...)`, add a branch for `event.type === "message"`
with
  `role === "assistant"` that calls
  `messages.push(...collectMessageText(event.content))`.
- `packages/adapters/gemini-local/src/server/parse.test.ts`: ~19 lines
of
  coverage for the new branch.

## Verification

- `pnpm --filter @paperclipai/adapter-gemini-local test -- parse`
- Manual QA: run a Gemini agent on an issue, confirm the assistant's
response
appears as the issue comment / run output. Before this fix the comment
was
  empty even when the run completed successfully.

## Risks

- Tightly scoped: 8 lines of production code in one parser branch. No
effect
  on existing event shapes or other adapters.
- If the Gemini CLI changes its event schema again, this branch may need
to be
  revisited — but adding it is strictly additive over current behaviour.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:36:50 -07:00
Devin Foley 6c090f84a9 Strip inherited host shell env from SSH remote execution (#5142)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on remote SSH hosts receive an env map built from
the host
>   process's env plus per-run additions like `PAPERCLIP_API_KEY`,
>   `PAPERCLIP_RUN_ID`, etc.
> - The env map currently includes inherited host vars by default,
including
> identity-bound ones like `PATH`, `HOME`, `USER`, `NVM_DIR`, `XDG_*` —
> variables whose values are meaningful only on the host they came from
> - Sending the host's `PATH` (containing host-only directories like a
local
> nvm install path) to a remote SSH box overrides the remote's actual
`PATH`
> and breaks command resolution. Same hazard for `HOME` (commands
looking for
> config files end up in a non-existent dir), `USER` (writes go to the
wrong
>   path), etc.
> - This PR adds `sanitizeSshRemoteEnv()` that drops inherited
identity-bound
> vars when their value matches the host process's value. Explicitly-set
> values pass through untouched, so callers that genuinely want to
override
>   remote `PATH` etc. still can — but accidental leakage from
>   `process.env` is filtered.
> - The benefit is that SSH remote execution stops corrupting the remote
> shell's environment with host-shaped paths, so commands resolve
correctly
>   against the remote PATH and config files land in the remote `HOME`

## What Changed

- New `sanitizeSshRemoteEnv(env, inheritedEnv = process.env)` in
`packages/adapter-utils/src/server-utils.ts`. The identity-bound key set
is:
    - `PATH`, `HOME`, `PWD`, `SHELL`, `USER`, `LOGNAME`
    - `NVM_DIR`, `TMPDIR`, `TMP`, `TEMP`
    - `XDG_CONFIG_HOME`, `XDG_CACHE_HOME`, `XDG_DATA_HOME`,
      `XDG_STATE_HOME`, `XDG_RUNTIME_DIR`
For any key in this set, the entry is dropped iff the env value equals
the
  inherited (host process) value. Other keys pass through unchanged.
- `readEnvValueCaseInsensitive(...)` helper handles Windows-style
  case-insensitive env var lookups.
- Wired into `resolveSpawnTarget(...)` for the SSH transport. Sandbox
and local
  paths are unaffected.
- Tests added in `server-utils.test.ts` (~50 lines) covering: matching
keys
filtered, mismatched keys preserved, non-identity keys passed through,
case
  insensitivity.

## Verification

- `pnpm --filter @paperclipai/adapter-utils test -- server-utils`
- Manual QA: run any adapter against an SSH-backed environment, confirm
remote command resolution works (e.g. `node`, `npm`, the adapter's CLI)
and
config files land in the remote user's `HOME`. Compare to the prior
behaviour
by transiently re-introducing the inherited `PATH` and watching commands
  fail with `command not found`.

## Risks

- Behavioural shift: SSH remote execution previously passed inherited
host env
  vars verbatim. Code that relied on that (e.g. a remote command somehow
  expecting the host's `PATH`) will see different behaviour. None of the
  adapter code in this repo has such a dependency.
- Edge case: if a caller explicitly sets `PATH` to the same value as the
host's
`PATH` (literally — same exact string), the sanitizer drops it as a
leak.
  In practice no caller constructs the env this way.
- Windows host: case-insensitive lookup handles `Path` vs `PATH`
correctly.
  Tested.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:36:13 -07:00
Devin Foley 90631b09b3 Let adapters declare runtime command spec for remote provisioning (#5141)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, running
adapter
> commands like `claude`, `codex`, `pi` either locally or on remote
runtimes
>   (SSH hosts, sandboxes, etc.)
> - On a fresh remote runtime — particularly an ephemeral sandbox — the
> adapter's CLI may not be installed yet. Today operators handle this
via
> external configuration (e.g. a project-level `provisionCommand` shell
> script) that has to know about every adapter the operator might want
to use
> - This means every adapter has its own well-known npm package, but
operators
>   end up writing duplicate provision shell scripts that paste together
> `npm install -g @anthropic-ai/claude-code`, `npm install -g
@openai/codex`,
>   etc. — knowledge the adapter itself already has
> - This PR moves that knowledge into the adapter modules: each adapter
declares
> how its runtime command should be detected and (if applicable)
installed
> via `getRuntimeCommandSpec(config)`. The execution path runs the
adapter's
> own install command on remote sandbox targets before launching, so a
fresh
> sandbox bootstraps itself instead of requiring a hand-written
provision script
> - The benefit is fewer footguns for operators provisioning remote
runtimes,
>   and a clean place for new adapters to plug in their install recipe

## What Changed

- New types in `packages/adapter-utils/src/types.ts`:
    - `AdapterRuntimeCommandSpec` describing `command`, optional
      `detectCommand`, and optional `installCommand`
    - Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule`
- Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters
      receive the resolved spec at execute time
- New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)`
in
`packages/adapter-utils/src/execution-target.ts` that runs the install
command
on remote targets when `transport === "sandbox"`. SSH and local targets
are
  no-ops. Throws on timeout or non-zero exit so failures surface early.
- Each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`,
  `opencode-local`, `pi-local`'s `execute.ts` now reads
`ctx.runtimeCommandSpec?.installCommand` and calls the helper before
launching
  the adapter command.
- `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for
each
  adapter:
- claude/codex/gemini/opencode/pi-local: `npm install -g <package>`
recipe via
a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard
that
only auto-installs when the configured `command` matches the well-known
      fallback (custom binaries are left alone).
- cursor-local: declares `command` only; no auto-install (no public npm
      package), preserving the existing manual setup.
- `server/src/services/heartbeat.ts` resolves the spec via
`adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through
to
  `AdapterExecutionContext`.
- Tests added in `execution-target.test.ts` (~75 lines), e2b
`plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts`
  (~76 lines).

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-run-orchestrator`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: run an adapter (claude/codex/etc.) against a fresh
sandbox-backed
environment that does NOT have the adapter CLI pre-installed. Confirm
the
install runs once at the start of the agent run and the adapter then
launches
successfully. Re-run on the same sandbox; confirm the install command is
  idempotent and the second run starts faster.
- Confirm SSH and local execution paths are unaffected (gated by
  `transport === "sandbox"`).

## Risks

- Behavioural shift on sandbox runs: a new install step now runs at the
start
  of every sandbox agent run for adapters with `installCommand` set. The
install commands are idempotent (`if ! command -v X >/dev/null 2>&1;
then
npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold
  sandbox, the first run takes longer.
- Operators who used the legacy project-level `provisionCommand` to
install
adapter CLIs can drop that part of their script; the adapter handles it
now.
  Existing scripts continue to work — installs are idempotent.
- The cursor-local adapter has no auto-install (no public npm package).
  Behaviour for cursor-local on sandboxes is unchanged.
- New optional surface on `ServerAdapterModule`. Plugins that don't
implement
  `getRuntimeCommandSpec` retain previous behaviour (no auto-install).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:35:36 -07:00
Devin Foley 2dce81fbf6 Add optional bridge proxy request logging via PAPERCLIP_BRIDGE_DEBUG (#5140)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents on remote sandboxes call back into the Paperclip control
plane via a
> callback bridge — the host process running the bridge proxies HTTP
requests
>   from the sandbox to the Paperclip API
> - When something goes wrong end-to-end (sandbox can't reach Paperclip,
requests
> timing out, malformed responses), it's hard to tell whether the bridge
> processed the request, what URL/method it used, and what the upstream
>   responded with
> - There was no built-in way to log bridge proxy traffic without
modifying
>   adapter code or attaching a debugger
> - This PR adds opt-in stdout logging of every bridge proxy request and
response
> (method, path, query, status), gated behind `PAPERCLIP_BRIDGE_DEBUG`
so it
>   stays off by default
> - The benefit is that operators can flip a single env var to get full
visibility
> into bridge traffic when debugging remote runs, without changing code

## What Changed

- `packages/adapter-utils/src/execution-target.ts`:
`startAdapterExecutionTargetPaperclipBridge`'s `handleRequest` now logs
each
  proxied request and response when `PAPERCLIP_BRIDGE_DEBUG` is truthy:
    - `[paperclip] Bridge proxy <METHOD> <path>?<query>` before fetch
- `[paperclip] Bridge proxy response <status> for <METHOD>
<path>?<query>` after
- Logging is no-op when the env var is unset/`"0"`/`"false"`.

## Verification

- Set `PAPERCLIP_BRIDGE_DEBUG=1` in the host process env, run an agent
against
a sandbox-backed environment, confirm the bridge log lines appear in
stdout.
- Unset the env var and confirm no extra log lines appear during normal
runs.

## Risks

- Off-by-default, no observable change for shipping users.
- When enabled, the logging is verbose — every API call from the sandbox
  produces 2 stdout lines. Operators should only enable it during active
  debugging.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable — covered by
exercising the
      flag in dev; the underlying handleRequest behavior is unchanged
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:34:48 -07:00
Devin Foley 0e51fa2b0d Honor reuse-existing preference and assignee default environment in issue runs (#5139)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside execution workspaces (a per-issue cwd + env), and
an issue
> can prefer to reuse an existing workspace or get a fresh one each time
> - The heartbeat service was reading the existing workspace's config to
derive
> environment selection regardless of whether the issue actually wanted
to reuse
> it. So fresh-run issues were inheriting stale config from a workspace
that was
>   about to be discarded
> - Separately, when an issue is assigned to an agent, the issue's
execution
> workspace settings weren't picking up the agent's
`defaultEnvironmentId`,
>   even though the agent's choice is the natural default for that issue
> - This PR makes both selection paths honor the obvious source of
truth:
> workspace config flows only when the issue actually wants
`reuse_existing`,
> and the assignee agent's default environment is applied at assignment
time if
>   nothing else is set on the issue
> - The benefit is that re-running a flaky issue picks up the right
environment
> instead of inheriting the previous run's config, and assigning an
agent to an
>   issue does the obvious thing without operator intervention

## What Changed

- `server/src/services/heartbeat.ts`: introduce
`reusableExecutionWorkspaceConfig`
  that is non-null only when `shouldReuseExisting` is true. Both
  `resolveExecutionWorkspaceEnvironmentId(...)` and
`applyPersistedExecutionWorkspaceConfig(...)` now read from it instead
of
unconditionally consulting `existingExecutionWorkspace?.config`.
Fresh-run
issues no longer inherit stale environment config from an in-flight
workspace
  about to be discarded.
- `server/src/services/issues.ts`: when an issue update sets a new
  `assigneeAgentId` and isolated workspaces are enabled, populate
  `executionWorkspaceSettings.environmentId` from the assignee agent's
  `defaultEnvironmentId` if the issue doesn't have an explicit
  `environmentId` set yet.
- Tests added in `heartbeat-plugin-environment.test.ts` (~216 lines) and
  `issues-service.test.ts` (~85 lines) covering both paths.

## Verification

- `pnpm --filter @paperclipai/server test --
heartbeat-plugin-environment issues-service`
- Manual QA: assign an issue to an agent that has a non-default
`defaultEnvironmentId`, confirm the issue's workspace settings now
include that
environment id without operator intervention. Trigger a rerun on an
issue
whose existing workspace points at a stale environment, confirm the
rerun uses
  the freshly-resolved environment.

## Risks

- Behavioural shift on assignment: previously assigning an agent didn't
propagate the agent's default environment to the issue. Now it does.
Callers
that explicitly want the issue to keep its existing/null environment
must set
`executionWorkspaceSettings.environmentId` themselves; the new logic
only
  fires when no explicit value is set.
- Behavioural shift on rerun: stale workspace config is no longer
applied to
  fresh runs. Operators who relied on this implicit inheritance may see
different environment selection on the first rerun after deploy.
Mitigation:
the explicit isssue settings and project policy are still honored as
before.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:33:55 -07:00
Devin Foley 09eceb952a Avoid resuming stale remote sessions (Pi adapter) (#5120)
> **Stacked PR (part 7 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
  - PR #5118
  - PR #5119
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Pi adapter persists a session jsonl per agent so subsequent runs
resume
>   conversation context instead of starting cold
> - SSH testing reproduced a real failure: a verification issue reached
terminal
>   `done` and the agent claimed success, but the proof artifact
> `manual-qa/environment-matrix/ssh/pi_local.md` was missing from the
realized
>   SSH workspace on the QA target box
> - Root cause: the saved session header recorded a different cwd than
the new
> execution cwd, but the resume eligibility check only compared
session-params
> cwd via local-style `path.resolve` (which doesn't roundtrip on remote
POSIX
> paths). The stale session got resumed and writes landed in the wrong
cwd
> - This PR tightens resume eligibility for remote targets: it adds
remote-aware
> cwd normalisation, reads the first line of the session jsonl over SSH
(`head
>   -n 1`) to verify the saved header cwd, and only resumes when both
> session-params cwd *and* the on-disk header cwd match the realised
execution
>   cwd. Stale sessions are skipped silently and the run starts cold
> - The benefit is that Pi runs across cwd-changing environments stop
> accidentally resuming each other's sessions, and proof artifacts land
where
>   reviewers expect them

## What Changed

- Added `normalizeExecutionCwd`, `executionCwdsMatch`,
`readSessionHeaderCwd`,
  and `readSavedSessionCwd` helpers in `pi-local/src/server/execute.ts`
- `readSavedSessionCwd` reads the first line of the session jsonl —
locally via
`fs.readFile`, remotely via `runAdapterExecutionTargetShellCommand`
(`head -n 1`)
- Resume eligibility now requires:
  1. Saved session id is non-empty
  2. Execution target shape matches (existing check)
  3. Session-params cwd matches the realised execution cwd
4. Session-header cwd (from the on-disk jsonl) matches the realised
execution cwd
- Stale sessions are skipped silently (run starts cold) instead of
resumed
- `execute.remote.test.ts` extended with: matching header → resume;
mismatched
header → start fresh; missing/unreadable header → start fresh; remote
head
  command failure → start fresh

## Verification

- `pnpm --filter @paperclipai/adapter-pi-local test`
- `pnpm test -- pi-local`
- Manual QA: ran a Pi agent twice in two different remote cwds,
confirmed
the second run did not pick up the first run's session and that
subsequent
  runs in the original cwd still resumed correctly

## Risks

- Adds a `head -n 1` shell call per Pi run on remote targets. Negligible
  latency (single read of session jsonl), bounded by 15s timeout.
- If the `head` call fails for unrelated reasons (transient remote
unreachability), the run will start cold instead of resuming. This is
the
safe default but worth noting — operators may see one extra cold run if
a
  remote glitches mid-session.
- No data is deleted or migrated; stale sessions remain on disk for
manual
  inspection if desired.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:51:38 -07:00
Devin Foley d22e790bd4 Validate remote model probes on execution target (OpenCode) (#5119)
> **Stacked PR (part 6 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
  - PR #5118
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The OpenCode adapter validates that its configured model exists
before letting
>   a run start so misconfiguration fails fast with a clear error
> - SSH testing reproduced an OpenCode failure where issues stayed
`backlog`,
>   timed out, and produced no comments. The root cause was in
> `packages/adapters/opencode-local/src/server/execute.ts`: the local
model
> guard `ensureOpenCodeModelConfiguredAndAvailable(...)` only ran when
execution
> was *not* remote, so SSH OpenCode bypassed it and failed silently
later
> - Subsequent testing surfaced a related remote-only failure where the
probe
> (when wired up naively) hits `EACCES: permission denied, mkdir
'/var/folders'`
> on the SSH box because of how OpenCode's runtime config picks a
tempdir
> - This PR runs the model probe on the actual execution target —
`opencode
> models` via `runAdapterExecutionTargetProcess` — instead of the local
CLI,
> parses the output with the shared `parseOpenCodeModelsOutput` helper,
and
> reports a concrete error naming the offending model and a sample of
available
>   remote models when the configured model isn't present
> - The benefit is that mismatched OpenCode models surface as a clear
pre-flight
> error referencing the remote target instead of a silent run that never
leaves
>   `backlog`

## What Changed

- Added `ensureRemoteOpenCodeModelConfiguredAndAvailable` in
  `opencode-local/src/server/execute.ts` that runs `opencode models` via
`runAdapterExecutionTargetProcess` and validates the configured model is
in
  the parsed output
- `models.ts` now exports `parseOpenCodeModelsOutput` and
`requireOpenCodeModelId`
  so the remote path can reuse them
- `execute.ts` calls the remote variant when `executionTargetIsRemote`,
otherwise
  the existing local `ensureOpenCodeModelConfiguredAndAvailable`
- Errors include the offending model id and a sample of available remote
models
  so the operator knows exactly what's missing
- `execute.remote.test.ts` extended with cases for: probe timeout, probe
  non-zero exit, empty model list, and missing-model error

## Verification

- `pnpm --filter @paperclipai/adapter-opencode-local test`
- `pnpm test -- opencode-local`
- Manual QA: configured an OpenCode agent with a model that exists
locally but
not in the remote sandbox, and confirmed the new error fires before the
run
  starts and references the remote target

## Risks

- New behaviour: remote model validation adds a `~20s timeout` `opencode
models`
call on every remote run start. For most environments this is fast, but
a
network-slow sandbox could see startup latency rise. Timeout is bounded.
- If the remote CLI is missing or misconfigured, the new error replaces
the old
generic startup failure — clearer message, but the failure point shifts
earlier. Monitor for any QA flows that relied on the old failure shape.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:34:09 -07:00
Devin Foley 856c6cb192 Fix remote workspace environment shaping (#5118)
> **Stacked PR (part 5 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run with a Paperclip-shaped environment
(`PAPERCLIP_WORKSPACE_CWD`,
> worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can
locate the
>   correct project tree
> - SSH testing reproduced a real failure: a Codex SSH run wrote to
> `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the
realized
> remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...`
> because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into
the
>   remote env
> - Code review on the initial codex-only fix asked to roll the same
approach
> into every other SSH-capable adapter (claude, acpx, cursor, opencode,
gemini,
>   pi) via a shared helper rather than duplicating per-adapter
> - This PR adds `shapePaperclipWorkspaceEnvForExecution` in
adapter-utils that,
> when the execution target is remote: replaces local cwd with the
realized
> execution cwd, nulls out worktree path (which has no remote meaning),
and
> rewrites/strips `cwd` entries in workspace hints based on what was
actually
>   synced. Every adapter calls it before invoking the remote runner
> - The benefit is that remote runs see the realized remote workspace,
host-local
> paths stop leaking into remote env, and the rule is unit-tested in one
place

## What Changed

- Added `shapePaperclipWorkspaceEnvForExecution` to
  `packages/adapter-utils/src/server-utils.ts` with full unit coverage
  (`server-utils.test.ts`)
- Each of acpx-local, claude-local, codex-local, cursor-local,
gemini-local,
opencode-local, pi-local now calls the new shaper before issuing the
remote
  command and feeds the shaped values into `applyPaperclipWorkspaceEnv`
- Per-adapter `execute.remote.test.ts` files extended to cover the new
shaping
  behaviour: localhost paths replaced with remote cwd, foreign-cwd hints
  stripped, worktree path nulled out for remote targets
- `acpx-local/src/server/execute.test.ts` extended with shaping coverage

## Verification

- `pnpm test -- server-utils execute.remote`
- `pnpm --filter @paperclipai/adapter-acpx-local test`
- Manual QA reproducing the original failure:
  1. Provision an E2B sandbox environment for the Paperclip QA company
2. Assign an issue to a remote-targeted claude-local agent and confirm
the
run starts in the correct remote cwd (no `/Users/...` path leakage in
the
     run logs)
  3. Repeat for opencode-local and pi-local

## Risks

- Behavioural shift: hints whose `cwd` doesn't match the workspace cwd
are now
stripped on remote targets. If any adapter relied on a leaked local hint
cwd,
it will see a missing `cwd` instead. Reviewed all current callers — none
do.
- Adds a small per-run cost (path resolve + string normalisation) on
every remote
  execution. Negligible.
- Worktree path is now nulled out on remote (it has no meaning there).
Adapters
  that previously read the value defensively will continue to work.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:17:52 -07:00
Devin Foley bb7d040894 Switch OpenCode to explicit static/local-aware model selection (#5117)
> **Stacked PR (part 4 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When creating an OpenCode-local agent, Paperclip currently validates
> `adapterConfig.model` against the *Paperclip host's* `opencode models`
output
> - SSH testing surfaced that this blocks creating an OpenCode agent for
an SSH
> environment: the model that exists on the SSH target isn't visible to
the
> host, so creation fails with "OpenCode requires `adapterConfig.model`
in
> provider/model format" even when the operator picked a real remote
model
> - The initial direction was environment-aware model discovery; the
final
> decision was to keep OpenCode on the same explicit-model pattern as
other
> adapters (default + curated list + manual override) and stop blocking
>   creation on host-side discovery
> - This PR does both: the adapter-models endpoint now accepts
`environmentId` and
> probes against the target environment, and the create-time hard gate
is
> replaced by `requireOpenCodeModelId` which validates `provider/model`
*format*
> without requiring host-local discovery. Test/run-time still surfaces
real
>   auth/availability problems
> - The benefit is that operators can create OpenCode agents for remote
> environments without out-of-band setup, and the model picker in the UI
>   reflects the actually-targeted environment

## What Changed

- Added `requireOpenCodeModelId(input)` in
`opencode-local/src/server/models.ts`,
  exported it from the adapter index
- `ensureOpenCodeModelConfiguredAndAvailable` now delegates the format
check to
  `requireOpenCodeModelId`
- `agentsApi.adapterModels(companyId, adapterType, { environmentId })`
now accepts
  an environment ID and passes it as a query parameter
- `queryKeys.agents.adapterModels` now keys on `(companyId, adapterType,
environmentId)`
- `server/src/routes/agents.ts` reads and validates the new query
parameter,
  forwarding it to the adapter's model probe
- `AgentConfigForm.tsx` and `OnboardingWizard.tsx` build the model query
key from
the currently selected default environment ID and disable autodetect for
  `opencode_local` (model selection is explicit)
- `NewAgent.tsx` simplified — no longer special-cases OpenCode
autodetect
- `company-portability.ts` no longer needs OpenCode-specific autodetect
handling
- Tests added/updated:
  `adapter-model-refresh-routes.test.ts`, `adapter-models.test.ts`,
`agent-permissions-routes.test.ts`,
`opencode-local/src/server/models.test.ts`

## Verification

- `pnpm --filter @paperclipai/server test -- adapter-models
adapter-model-refresh agent-permissions`
- `pnpm --filter @paperclipai/adapter-opencode-local test`
- `pnpm --filter @paperclipai/ui test -- AgentConfigForm
OnboardingWizard NewAgent`
- Manual QA in browser:
1. Boot Paperclip on Tailscale-bound port (so it's reachable from
another
machine), create an OpenCode-local agent, switch the default environment
between two installed sandboxes, and confirm the model list refreshes
     per-environment
  2. Submit with a malformed `provider/model` string and verify the new
     `requireOpenCodeModelId` error surfaces
- Before/after screenshots attached for `AgentConfigForm` model picker

## Risks

- Behavioural shift: switching default environment now triggers a model
refetch.
Should be cheap but introduces a new UI loading state for OpenCode
users.
- Removing dynamic autodetect for OpenCode: if any user configured an
agent
without specifying `model` and relied on autodetect populating it, that
agent
will now fail at submit time. Mitigation: validation error is explicit
and
  actionable.
- New query string parameter on `/api/companies/:id/adapter-models` —
older
clients that omit it still work (parameter is optional and defaults to
null).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:01:34 -07:00
Devin Foley 076067865f Migrate SSH environment callback to bridge (#5116)
> **Stacked PR (part 3 of 7).** Depends on:
  - PR #5114
  - PR #5115
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on a remote SSH-backed environment need a way to
call back into
>   the Paperclip control plane (run events, log streaming, signals)
> - When the SSH host can't reach the Paperclip host (NAT, firewalls, or
simply not
> on the same network), the run silently fails or hangs — a recurring
class of
>   failure during SSH testing
> - In sandboxed environments we already solved this with a callback
bridge that
> tunnels back through the existing connection; SSH was the odd one out
> - This PR migrates SSH execution to use the same callback bridge, so
every
> adapter's remote run uses one consistent reverse-channel. Per-adapter
SSH glue
> is deleted in favour of a shared `CommandManagedRuntimeRunner` built
from the
>   SSH spec
> - The benefit is fewer SSH-specific failure modes, a smaller code
surface, and
>   one place to evolve the callback contract going forward

## What Changed

- Added `createSshCommandManagedRuntimeRunner` in
`packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a
generic
  command-managed-runtime runner (with cwd, env, and timeout handling)
- Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge
URL now flows
  through the shared runner
- Reworked `execution-target.ts` to use the SSH runner alongside sandbox
runners
  via a unified `CommandManagedRuntimeRunner` interface
- Simplified `remote-managed-runtime.ts` and
`sandbox-managed-runtime.ts` to consume
  the shared runner abstraction
- Deleted per-adapter SSH callback wiring from claude-local,
codex-local,
  cursor-local, gemini-local, opencode-local, pi-local execute.ts files
- Removed `environment-runtime-driver-contract.test.ts` (the contract is
now
  enforced by `environment-execution-target.test.ts`)
- Added/updated `execute.remote.test.ts` cases for each adapter to cover
the SSH
  runner path

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm test -- execute.remote` (covers all six local adapters' SSH
paths)
- Manual QA: ran a claude-local agent against an SSH-backed environment,
confirmed
the agent successfully called back to `/api/agent-callback/*` endpoints
during
  the run

## Risks

- Refactor touches all six local adapters. If any adapter had subtle
SSH-specific
behaviour that wasn't captured in tests, it could regress. Mitigation:
each
  adapter's `execute.remote.test.ts` was extended.
- `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking
type change
for any internal consumer. Verified no external plugins consume this
type.
- The new `CommandManagedRuntimeRunner` shape is a public surface in
`@paperclipai/adapter-utils`; downstream plugins implementing custom
runners may
  need updates, but no such plugins exist in this repo.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:43:52 -07:00
Chris Farhood 37e0aac971 ci: build prod image from .farhoodlabs/Dockerfile
Pulls the prod image up to the same toolset as the dev image (kubectl,
kubeseal, uv/uvx, forgejo CLIs, nano, vim) without diverging the upstream
root Dockerfile. Both build-dev.yml and build-prod.yml now share the same
fork-overlay Dockerfile; only the image tag and trigger branch differ.
2026-05-03 15:38:18 -04:00
Devin Foley a7b45938b7 Let sandbox providers declare shell defaults (#5114)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
>   providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
>   provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
>   provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
>   helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
>   preference

## What Changed

- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
  else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
  and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
  and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
  `shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
  startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
  (validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
  external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
  `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
  `environment-execution-target.test.ts`

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
  shell semantics flow through the callback bridge end-to-end)

## Risks

- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
  providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
  Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
Dotta 15eac43b43 [codex] Retry max-turn exhausted heartbeats (#5096)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies, and
heartbeat execution is the control-plane loop that keeps assigned work
moving.
> - Max-turn exhaustion is a recoverable local-adapter stop condition
for Claude and Gemini agents when a run needs another heartbeat to
continue safely.
> - The previous behavior could leave max-turn continuation details hard
to inspect, and duplicate/stale continuation wakes could keep running
after issue state changed.
> - The adapter layer also needed to avoid trusting arbitrary
stdout/stderr text as scheduler control metadata.
> - This pull request adds bounded max-turn continuation scheduling,
visible retry state, structured stop metadata handling, and
stale/duplicate continuation guards.
> - The benefit is safer automatic continuation after max-turn stops,
clearer operator visibility, and fewer duplicate or stale agent runs.

## What Changed

- Replaces closed PR #4952, whose head repository was deleted.
- Rebases the recovered max-turn continuation branch onto current
`paperclipai/paperclip:master`.
- Adds max-turn continuation scheduling and retry-state plumbing for
heartbeat runs.
- Adds stale/duplicate continuation suppression when issue status,
ownership, or execution locks change.
- Normalizes Claude/Gemini max-turn detection around structured stop
metadata instead of unstructured stdout/stderr text.
- Surfaces max-turn continuation settings and retry visibility in the
board UI.
- Adds focused server, adapter, and UI tests for max-turn stop metadata,
retry scheduling, stale queued-run invalidation, adapter
parsing/execution, run ledger display, and agent config patching.

## Verification

- `pnpm install --no-frozen-lockfile` to refresh local dependencies
after rebasing onto current `master`.
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/claude-local-adapter.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter.test.ts
server/src/__tests__/gemini-local-execute.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts ui/src/lib/runRetryState.test.ts
--testTimeout=20000`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck && pnpm
--filter @paperclipai/adapter-gemini-local typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`
- UI screenshot note: the UI changes are limited to config/ledger state
rendering rather than layout changes; component/unit coverage above
verifies the rendered behavior.

## Risks

- Medium behavior risk: heartbeat retry gating now suppresses max-turn
continuations when issue state or execution locks drift, so any callers
that relied on stale continuations running will now see cancellation
instead.
- Low adapter risk: Claude/Gemini unstructured text no longer triggers
max-turn scheduler metadata, so only structured stop signals and Gemini
exit code 53 are trusted.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5-class model, tool-enabled local
repository editing and command execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: state/default rendering only; covered by
component/unit tests)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no user-facing command or docs contract changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 11:30:48 -05:00
Chris Farhood cee1cd7f4e Merge branches 'feat/skills-gitops-complete' and 'feat/secrets-management-ui' into local 2026-05-03 11:02:47 -04:00
Chris Farhood b3e7dcaa83 Merge branches 'feat/skills-gitops-complete' and 'feat/secrets-management-ui' into dev 2026-05-03 11:02:25 -04:00
Chris Farhood 3ea3020a76 fix(secrets): include skill metadata references in usages, route, and delete dialog
secretsSvc.usages() previously only scanned agent env bindings. Skills can
reference a secret via metadata.sourceAuthSecretId (set by the PAT auth
feature), so removing a secret without checking those references could orphan
a sibling skill's PAT pointer.

- Extend usages() to return { agents, skills } with both reference kinds.
- Update remove() to block when either array is non-empty.
- /secrets/:id/usages now responds with { agents, skills }.
- The delete dialog displays both kinds inline so the operator knows which
  references to detach first.
2026-05-03 11:02:16 -04:00
Chris Farhood 27db0d3c67 fix(skills): prevent PAT pollution of bundled skills and orphan secrets on delete
Five connected gaps in the original PAT feature, all in company-skills.ts:

1. upsertImportedSkills only protected bundled rows from overwrite when the
   incoming source claimed to be paperclipai/paperclip. A SKILL.md from any
   other org/repo whose key resolves to paperclipai/paperclip/<slug> would
   hijack the bundled row and gain a sourceAuthSecretId. Broadened: any
   non-bundled incoming is rejected when existing is paperclip_bundled.

2. The metadata-build block preserved sourceAuthSecretId from existing
   indiscriminately, so any pollution of a bundled row was kept across every
   ensureBundledSkills re-upsert. Skip preservation when existing is bundled.

3. importFromSource's auth-token loop wrote sourceAuthSecretId for every
   imported skill including any bundled ones that snuck through. Defense in
   depth: skip skills with sourceKind === "paperclip_bundled".

4. updateSkillAuth had no guard, so the PATCH /skills/:id/auth route could
   attach a PAT to a bundled skill via direct API call. Reject explicitly.

5. deleteSkill removed the secret without checking whether any sibling skill
   still referenced it via metadata.sourceAuthSecretId. Re-imports preserve
   that reference, so two skills could share a secret and deleting one would
   orphan the other's reference. Now skip the remove if another skill in the
   same company still references the secret.
2026-05-03 11:00:02 -04:00
Chris Farhood 5c681340f3 revert: restore paperclip-dev skill (validation requires it for now) 2026-05-03 10:24:00 -04:00
Chris Farhood 85cbbc9263 revert: restore paperclip-dev skill (validation requires it for now)
The earlier fix/remove-paperclip-dev-skill removed the bundled skill,
but companies have stale company_skills rows that reference it as
required, breaking 'Invalid company skill selection' validation. Put
the file back to unblock; the underlying force-required-on-bundled bug
remains and should be fixed in code rather than by deleting the skill.
2026-05-03 10:23:54 -04:00
Dotta 57229d0f24 [codex] Add issue monitor liveness controls (#4988)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies where work
must stay observable, governable, and recoverable.
> - The task/heartbeat subsystem owns agent execution continuity, issue
state transitions, and visible recovery behavior.
> - Waiting on an external service is not the same as being blocked when
the assignee still owns a future check.
> - The gap was that agents had no first-class one-shot monitor state
for external-service waits, so recovery could look stalled or require ad
hoc comments.
> - This pull request adds bounded issue monitors that can wake the
owner, clear exhausted waits, and produce explicit recovery behavior.
> - It also surfaces monitor status in the board UI and documents when
to use monitors versus `blocked`.
> - The benefit is clearer liveness semantics for asynchronous waits
without weakening single-assignee task ownership.

## What Changed

- Added issue monitor fields, shared types, validators, constants, and
an idempotent `0075` migration for scheduled monitor state.
- Added server-side monitor scheduling, dispatch, recovery bounds,
activity logging, and external-ref redaction.
- Added board/agent route coverage for monitor permissions and child
monitor scheduling.
- Added issue detail/property UI for monitor state, a monitor activity
card, and Storybook stories for review surfaces.
- Documented monitor semantics and recovery policy behavior in
`doc/execution-semantics.md`.
- Addressed Greptile review feedback by preserving monitor state in
skipped-stage builders and making board monitor saves send `scheduledBy:
"board"`.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issue-execution-policy-routes.test.ts
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/lib/activity-format.test.ts`
- First run passed 5 files and failed to collect 2 server suites because
the worktree was missing the optional `acpx/runtime` dependency.
- After `pnpm install --frozen-lockfile`, reran the 2 failed suites
successfully.
- `pnpm exec vitest run
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck
&& pnpm --filter @paperclipai/ui typecheck`
- `pnpm exec vitest run
server/src/__tests__/issue-execution-policy.test.ts
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/server typecheck && pnpm --filter
@paperclipai/ui typecheck`
- `pnpm exec vitest run
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- Storybook screenshot captured from
`http://127.0.0.1:6006/iframe.html?viewMode=story&id=product-issue-monitor-surfaces--monitor-surfaces`
with Playwright.

## Screenshots

![Issue monitor Storybook
surfaces](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2945-when-a-task-is-waiting-for-an-_external-service_-what-state-should-it-be-in-and-what-recovery-method-could-it-h/docs/pr-screenshots/pap-2945/monitor-surfaces.png)

## Risks

- Medium: this changes heartbeat recovery behavior for scheduled
external-service waits, so regressions could affect wake timing or
recovery issue creation.
- Migration risk is reduced by using `IF NOT EXISTS` for the new issue
monitor columns and index.
- External monitor references are treated as secret-adjacent and are
intentionally omitted from visible activity/wake payloads.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and terminal
execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or Storybook review surfaces
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 08:58:53 -05:00
Chris Farhood acbfcb7d00 Merge branch 'feat/secrets-management-ui' into local 2026-05-03 09:45:59 -04:00
Chris Farhood 5f3cd831a1 Merge branch 'feat/secrets-management-ui' into dev 2026-05-03 08:00:26 -04:00
Chris Farhood 6cb333b986 fix(ui): order Secrets sidebar item right after Environments 2026-05-03 08:00:14 -04:00
Chris Farhood 18f550b946 fix(ci): make Docker Hub login non-blocking on dev build
The self-hosted runner has been hitting context-deadline timeouts to
docker.io. The actual image push goes to GHCR, so the Docker Hub login
is only there to avoid pull rate limits. Mark it continue-on-error so
transient docker.io connectivity issues don't fail the whole build —
base image pulls fall back to anonymous and proceed.
2026-05-02 17:30:04 -04:00
Chris Farhood 191491a57f feat(secrets): company secrets management UI
New /company/settings/secrets page with create, rotate, edit, and delete
flows. Adds a Secrets entry to the company settings sidebar (and tab nav).
Each row shows name, description, version, and time since last rotation;
per-row actions open dialogs for rotate (textarea for new value), edit
(name + description), and delete (confirmation).

Server-side adds secretService.usages() to enumerate agents that reference
a secret via env bindings, and rejects deletion when any usage exists.
The delete dialog reads the blocking usage list from the error body and
renders it inline so the user knows which agents to detach first.
2026-05-02 17:21:10 -04:00
Dotta 76f09c8eb6 [PAP-3180] Move workspace switcher into sidebar (#4981)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - The board UI needs a clear persistent way to move between company
workspaces.
> - The previous layout kept company switching in a separate left rail,
which made the sidebar feel split between workspace selection and
navigation.
> - The workspace switcher belongs in the sidebar header so navigation
and workspace context stay together.
> - This pull request removes the separate company rail from the layout
and turns the sidebar company menu into the primary workspace switcher.
> - The benefit is a cleaner sidebar structure that keeps workspace
identity, switching, company actions, and navigation in one place.

## What Changed

- Removed the standalone `CompanyRail` from the main layout.
- Added the company/workspace switcher to the default, company settings,
and instance settings sidebars.
- Expanded `SidebarCompanyMenu` to list active workspaces, indicate the
current workspace, navigate out of instance settings when switching, and
expose add-company onboarding.
- Updated focused component tests for the new workspace-switcher
behavior.

## Verification

- `pnpm --filter @paperclipai/ui exec vitest run
src/components/SidebarCompanyMenu.test.tsx
src/components/CompanySettingsSidebar.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `git diff --check`
- Visual smoke attempted against the managed dev server at
`http://127.0.0.1:57385`; a fresh browser context reached the
authenticated sign-in screen, so I could not capture an authenticated
sidebar screenshot from this heartbeat.

## Risks

- Low-to-medium UI risk: this changes the primary sidebar structure and
workspace-switching entry point.
- The instance-settings switch behavior now routes back to the selected
company dashboard when a workspace is selected.
- No migrations, API contracts, or lockfile changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled, medium reasoning mode.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-02 08:13:53 -05:00
Chris Farhood 3bbd632355 Merge branch 'feat/env-var-multiline-input' into local 2026-05-02 08:18:31 -04:00
Chris Farhood 7e2517935c feat(env): collapse env value textarea to single line when unfocused 2026-05-02 07:36:11 -04:00
Chris Farhood 441bbd5b9a feat(env): allow multi-line env var values via auto-growing textarea
Replace the plain-value <input> with the shadcn Textarea (which has
field-sizing: content baked in). Starts at rows={1}, grows as content
needs more vertical space. Storage and runtime already preserve newlines
so this is purely a UI capability change.
2026-05-02 07:15:08 -04:00
Chris Farhood bf1abb1492 chore(plugin-rpc): raise MAX_RPC_TIMEOUT_MS cap to 60 minutes 2026-05-01 21:00:12 -04:00
Chris Farhood e37180d3e3 chore(plugin-rpc): raise MAX_RPC_TIMEOUT_MS cap to 60 minutes 2026-05-01 21:00:08 -04:00
Chris Farhood 1678160c49 fix(ci): add acpx-local + sandbox plugins to fork Dockerfile deps stage 2026-05-01 19:31:44 -04:00
Chris Farhood c08c72e917 chore(ci): restore fork CI overlay 2026-05-01 19:27:04 -04:00
Chris Farhood fe43fbe2fd Merge branches 'feat/skills-gitops-complete', 'feat/company-portability-complete', 'feat/board-approval-markdown' and 'fix/remove-paperclip-dev-skill' into local 2026-05-01 19:26:56 -04:00
Dotta 685ee84e4a [codex] Document terminal bench dispatch config (#4961)
## Thinking Path

> - Paperclip agents rely on skills for repeatable operating procedures
> - The Terminal-Bench loop skill needs to preserve enough dispatch
configuration to reproduce real heartbeat behavior
> - A bare benchmark command can create unassigned work with no
heartbeat-enabled agent, which is a harness setup failure rather than
product evidence
> - The Paperclip heartbeat skill also needs to keep escalation biased
toward agent-owned follow-through
> - This pull request documents dispatch runner config requirements and
strengthens the agent follow-through rule
> - The benefit is fewer misleading benchmark loops and clearer agent
operating guidance

## What Changed

- Documented `PAPERCLIP_HARBOR_RUNNER_CONFIG` / runner dispatch config
as required Terminal-Bench loop input.
- Updated the Terminal-Bench loop smoke check to require the dispatch
config mention.
- Added stronger Paperclip skill guidance to avoid asking humans for
work an agent can perform.

## Verification

- `pnpm smoke:terminal-bench-loop-skill`

## Risks

- Low risk: documentation and smoke expectation changes only. The
stricter smoke assertion is intentional so future edits do not drop the
dispatch config requirement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 12:00:47 -05:00
Dotta d7719423e9 [codex] Harden non-system database backup schemas (#4960)
## Thinking Path

> - Paperclip is a control plane whose database is the durable audit and
work record
> - Database backup needs to include operator/plugin schemas while
excluding PostgreSQL-owned internals
> - PostgreSQL reserves the `pg_` schema prefix for system schemas,
including temp and toast variants
> - A single escaped `pg_` prefix predicate is less brittle than
enumerating individual `pg_toast` and `pg_temp` forms
> - This pull request tightens non-system schema discovery for logical
backups without changing the normal user/plugin schema path

## What Changed

- Replaced narrow `pg_toast` and `pg_temp` schema exclusions with an
escaped `pg_` reserved-prefix exclusion.
- Kept `information_schema` excluded from logical backup metadata
discovery.
- Addressed Greptile feedback by removing redundant no-op additions from
the prior iteration.

## Verification

- `pnpm exec vitest run packages/db/src/backup-lib.test.ts`
- PR checks on the latest pushed head: policy, verify, e2e, Greptile
Review, and Snyk

## Risks

- Low risk: PostgreSQL reserves `pg_` schema names for system use, so
this should only exclude database-owned internals that should not be
restored from Paperclip logical backups.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:59:53 -05:00
Dotta fe401b7fa9 [codex] Polish inbox nested issue UI (#4959)
## Thinking Path

> - Paperclip orchestrates AI agents through issue lists and
issue-thread interactions
> - The inbox must preserve nested issue visibility and keyboard
navigation as work decomposes into deeper sub-issues
> - Some UI polish issues made nested rows harder to scan and pending
question cancellation less covered
> - The issue list also had a small test indentation regression around
load-more behavior
> - This pull request tightens nested inbox rendering and related
issue-thread/list polish
> - The benefit is a more reliable operator inbox for multi-level work
trees

## What Changed

- Included nested grandchild issues in inbox keyboard navigation and
recursive row rendering.
- Sort parent rows by descendant activity so active subtrees remain
visible.
- Removed extra inbox card background styling in favor of the page
surface.
- Added regression coverage for pending question cancellation.
- Cleaned up the issue-list load-more test indentation.

## Verification

- `pnpm exec vitest run ui/src/lib/inbox.test.ts
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx`
- Screenshots were not captured in this PR split; the visible flow is
covered by focused component/helper tests and should get browser QA in
the follow-up issue.

## Risks

- Medium risk: nested inbox rendering and keyboard navigation are
user-visible. The changes are localized to inbox grouping/rendering
helpers and covered by targeted tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:58:53 -05:00
Dotta 2d72292ad6 [codex] Add workspace routine run tab (#4958)
## Thinking Path

> - Paperclip orchestrates AI agents through reusable execution
workspaces and routines
> - Operators need a fast way to run workspace-aware routines against a
specific execution workspace
> - The existing workspace detail surface showed configuration, runtime
logs, and linked issues, but not routines that depend on workspace
variables
> - Routine runs also needed to prefill the selected execution workspace
so branch variables resolve correctly
> - This pull request adds a workspace routines tab and prefilled
routine-run dialog support
> - The benefit is a tighter workflow for rerunning reviews, smoke
checks, and other workspace-specific routines

## What Changed

- Added an execution workspace `Routines` tab and company-prefixed
routes.
- Listed routines that declare or reference workspace-specific
variables.
- Added `Run now` support that preselects the current execution
workspace in `RoutineRunVariablesDialog`.
- Centralized reusable execution workspace ordering/deduplication for
issue creation and workspace cards.
- Added focused UI helper and dialog regression tests.

## Verification

- `pnpm exec vitest run ui/src/lib/reusable-execution-workspaces.test.ts
ui/src/lib/workspace-routines.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx
ui/src/lib/company-routes.test.ts`
- Screenshots were not captured in this PR split; the visible flow is
covered by focused component/helper tests and should get browser QA in
the follow-up issue.

## Risks

- Medium risk: this adds a new workspace detail tab and routine-run
path. It is isolated to workspace-scoped routines and uses existing
routine run APIs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:58:15 -05:00
Dotta 570a4206da [codex] Recover productive terminal continuations (#4956)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-scoped heartbeat runs
> - Recovery logic decides whether in-progress work still has a live
path after a terminal run
> - A productive terminal continuation can still leave an issue stranded
when no active run or wake remains
> - Treating that state as healthy leaves work stuck despite evidence
that more action is needed
> - This pull request re-enqueues recovery for productive terminal
continuations that left no live path
> - The benefit is fewer silently stranded in-progress issues after
agents make partial progress

## What Changed

- Reclassified successful-but-productive terminal continuations as
recoverable when no live path remains.
- Enqueue a follow-up recovery wake with the original run id and
continuation metadata.
- Added regression tests covering productive terminal continuation
recovery and advanced liveness handoff.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/run-continuations.test.ts`

## Risks

- Medium risk: recovery may schedule one more follow-up where Paperclip
previously considered the work observed. The existing uniqueness,
budget, and escalation checks still constrain retry loops.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:57:23 -05:00
Dotta 3cd26a78fc [codex] Surface live run comment context (#4957)
## Thinking Path

> - Paperclip orchestrates AI agents through issue comments and
heartbeat runs
> - The board UI needs to distinguish a comment that triggered a live
run from comments queued after that run started
> - The run payload already stores comment context, but active-run API
responses did not expose the ids the UI needs
> - Without those ids, the triggering comment can flash as queued while
the agent is already responding to it
> - This pull request exposes live-run comment context and teaches the
optimistic comment helper to ignore the trigger comment
> - The benefit is clearer issue-chat state during comment-triggered
agent interruptions

## What Changed

- Added `contextCommentId` and `contextWakeCommentId` to active/live run
payloads.
- Threaded those ids through server routes, heartbeat summaries, UI API
types, and issue detail rendering.
- Updated optimistic comment classification to avoid marking the
triggering comment as queued.
- Added server and UI regression coverage.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-live-run-routes.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low-to-medium risk: adds optional fields to existing run payloads.
Existing consumers should ignore unknown fields, and UI handling is
null-safe.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:44:11 -05:00
Dotta e8275318ba [codex] Raise agent heartbeat concurrency default (#4954)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agent heartbeat settings control how much parallel work one employee
can run
> - The previous default of 5 concurrent runs was too restrictive for
active local agent teams
> - The shared default, heartbeat clamp, docs, and route/import/UI
expectations need to agree
> - This pull request raises the default heartbeat concurrency to 20
while keeping explicit headroom up to 50 for power users
> - The benefit is higher throughput for agent teams without each new
agent needing manual runtime config edits

## What Changed

- Raised `AGENT_DEFAULT_MAX_CONCURRENT_RUNS` from 5 to 20.
- Raised the heartbeat service max clamp from 10 to 50, keeping the new
default below the ceiling.
- Updated V1 implementation docs and tests that assert default
imported/exported runtime config.
- Updated the new-agent UI runtime config test to assert the shared
default constant instead of duplicating the numeric value.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/company-portability.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`

## Risks

- Medium risk: new agents can consume more local execution capacity by
default. The heartbeat scheduler still respects configured max
concurrency and budget/pause controls, and operators can lower or raise
the per-agent cap within the `1..50` clamp.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:42:56 -05:00
Dotta e273d621fc [PAP-3154] Stop padding /live-runs by default (#4963)
## Summary
- Fix [PAP-3154](/PAP/issues/PAP-3154): the Sidebar's "Dashboard NN
live" badge showed a constant 50 in every company because `GET
/api/companies/:companyId/live-runs` was padding its response with up to
50 recent (non-live) heartbeat runs whenever the caller did not pass
`minCount`.
- Regression introduced by
[#4875](https://github.com/paperclipai/paperclip/pull/4875) (commit
`6445bef9`), which capped both `minCount` and `limit` at 50 with a
fallback of 50 for omitted values. The cap is correct for `limit` (real
unboundedness guard); for `minCount` it conflates "no padding" with "pad
to the cap".
- Default `minCount` to 0 so callers asking for "live runs" only get
actually-live runs unless they explicitly request padding
(`ActiveAgentsPanel` is the only caller that does). Keep `limit` capped
at 50 by default.

## Test plan
- [x] `pnpm exec vitest run
server/src/__tests__/agent-live-run-routes.test.ts` — 7/7 pass,
including new tests for the no-pad default and explicit padding.
- [x] `pnpm exec vitest run ui/src/components/Sidebar.test.tsx
ui/src/components/ActiveAgentsPanel.test.tsx
ui/src/api/heartbeats.test.ts` — 6/6 pass.
- [ ] Verify in dev: with ~8 truly-live runs in a company, the sidebar
Dashboard badge shows the real count (not 50).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:33:13 -05:00
Dotta 42a299fb9d [codex] Bound productivity review recovery loops (#4948)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat/productivity review subsystem detects when assigned
work is likely stuck or churning.
> - Productivity reviews are useful, but repeated reconciliation can
create noisy refresh comments or repeated review issues around the same
source issue.
> - That makes manager follow-up harder because the signal can get
buried under duplicate review activity.
> - This pull request bounds productivity review refreshes and creation
loops while preserving the existing escalation path.
> - The benefit is a quieter recovery loop that still surfaces stuck or
high-churn work for manager attention.

## What Changed

- Added refresh throttling for open productivity review issues,
including a one-hour default interval and a maximum of three refresh
comments per open review.
- Added a rolling 24-hour creation cap so completed/closed reviews
cannot immediately recreate review issues indefinitely for the same
source issue.
- Excluded cancelled productivity reviews from the creation cap so
manager cancellations do not silently suppress future legitimate
reviews.
- Preserved productivity review timestamps in deterministic test paths
and added targeted coverage for immediate refresh suppression, refresh
caps, creation caps, and cancelled-review exclusion.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- `pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- Greptile Review: 5/5 on commit
`bcf25832d0ffae25890b2ee7eed112d1c2d114fe` with review threads resolved.
- GitHub PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`Greptile Review`, and `security/snyk (cryppadotta)`.
- Verified the branch is rebased onto `public-gh/master` with no
conflicts.
- Verified the diff does not include `pnpm-lock.yaml`, database schema
changes, or migrations.

## Risks

- Low-to-medium risk: this changes automation cadence for productivity
reviews. A truly stuck issue may receive fewer repeated refresh
comments, but the original review issue remains open and assigned for
manager action.
- No migration risk: this is server logic and tests only.

> Checked [`ROADMAP.md`](ROADMAP.md) for overlapping planned core work;
this is a targeted recovery-loop fix and does not add a new roadmap
feature.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family, tool-using software
engineering mode. Exact context window is not exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable; server-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable; no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 08:32:04 -05:00
Chris Farhood 2131ede7b8 feat(board): render approval summary/recommendedAction/nextActionOnApproval as markdown
Replaces plain <p> tags in BoardApprovalPayloadContent with MarkdownBody
(softBreaks enabled) so agent-authored markdown in these three fields —
headers, bullets, and newlines — renders correctly in the Board UI instead
of collapsing into a single unstyled paragraph.  No schema change; the
fields remain plain strings in the approval payload, only the renderer
changed.  Matches how comments, issue documents, and interaction cards
already render markdown via MarkdownBody.

Test coverage added for ## header → <h2>, bullet list → <ul><li>, and
plain-prose regression (no markup injected for single-line inputs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-01 08:20:14 -04:00
Chris Farhood e8579d5c66 feat(import-export): complete company portability — secrets export/import and env round-tripping
Adds opt-in secret export/import: secret values are resolved (and optionally
decrypted) into the portability manifest, and re-created with conflict
handling on import. Fixes env round-tripping so both secret_ref and plain
bindings survive export/import cycles.
2026-05-01 08:18:54 -04:00
Chris Farhood 9e30b72b27 test(skills): cover PAT import, skill auth, and scan-projects routes
- importFromSource is invoked with the PAT when one is supplied in the body
- PATCH /skills/:id/auth updates and clears tokens, with matching activity log entries
- POST /skills/scan-projects is reachable to agents that hold canCreateAgents
- Update existing import-permission assertion to include the new authToken arg
2026-05-01 07:58:51 -04:00
Chris Farhood 7b12d907cc feat(skills): scan re-scans existing GitHub/sks_sh sources for new skills
When the project workspace scan runs, also iterate the source locators of
all accepted GitHub and sks_sh skills, re-fetch each source, and upsert any
skills that have appeared since the last import. Per-source failures are
collected as warnings instead of aborting the whole scan.
2026-05-01 07:43:02 -04:00
Chris Farhood d1d592d793 fix(security): use manual redirects when PAT is attached
Token-free requests follow redirects normally to support renamed/transferred
GitHub repos. Manual redirect policy is only needed when a PAT is attached,
to prevent the bearer token from being forwarded to attacker-controlled
redirect targets.
2026-05-01 07:41:57 -04:00
Chris Farhood 3dfb859676 feat(skills): GitHub PAT support for private skill repos
- Add optional authToken to skill import for GitHub private repos
- Store PAT as encrypted company secret (skill-pat:{skillId})
- Thread auth token through ghFetch and GitHub resolution helpers
- Add PATCH /companies/:companyId/skills/:skillId/auth for managing PAT per skill
- Preserve sourceAuthSecretId across skill re-imports/updates
- Delete PAT secret on PAT clear and on skill deletion to prevent orphans
- UI: Add PAT input field in import form for GitHub URLs
- UI: Add SkillAuthSection with ShieldCheck icon for viewing/updating/removing PAT
2026-05-01 07:41:48 -04:00
Devin Foley d2dd759caa plugins: make e2b template default explicit (#4901)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Remote execution environments are part of that control plane,
including sandbox-provider plugins like E2B
> - The E2B provider already normalizes config and runtime behavior
around a `base` template default
> - But the manifest still presented `template` as required, which
forces redundant operator input and makes the UI contract stricter than
runtime behavior
> - That mismatch showed up while building a repeatable QA workflow for
sandbox testing
> - This pull request makes the manifest and validation contract line up
with the existing `base` default
> - The benefit is a simpler and more accurate E2B environment setup
experience

## What Changed

- Removed the E2B manifest's `required: ["template"]` requirement so the
config schema matches runtime behavior
- Clarified the manifest description to say the template defaults to
`base` when omitted
- Added a focused unit test proving that validation normalizes a missing
template to `base`

## Verification

- Ran the focused E2B plugin test for the new behavior:
- `cd packages/plugins/sandbox-providers/e2b && pnpm test --
--testNamePattern "defaults a missing template to base"`

## Risks

- Low risk. This only loosens the schema to match the plugin's existing
runtime normalization and adds a test for that path.
- The broader E2B plugin suite currently has unrelated existing failures
outside this change; this PR does not modify those paths.

## Model Used

- OpenAI Codex, GPT-5 Codex via Codex CLI agent tooling, large-context
coding workflow with terminal tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 22:43:24 -07:00
Devin Foley b02e67cea5 fix(ci): diff PR workflow paths from merge base (#4903)
## Thinking Path

> - Paperclip’s PR workflow is part of the control-plane safety surface
because it decides whether a branch is allowed to merge.
> - This issue started in that workflow: the lockfile and manifest
policy checks were diffing `base.sha..head.sha`, which incorrectly
treated unrelated `master` commits as if they belonged to the PR branch.
> - The right fix there is to diff from the PR merge base
(`base...head`) so policy checks only evaluate files introduced by the
branch itself.
> - Once that workflow fix was in place, `/checkpr` exposed a second
blocker on the PR merge ref: `verify` was failing in newer `master`-side
tests that were not part of the original branch diff.
> - The actionable repeated failure came from the ACPX local adapter
test suite, where a test hard-coded the managed Codex home under
`instances/default` even though the stable Vitest runner sets a
non-default `PAPERCLIP_INSTANCE_ID`.
> - This pull request now includes both the original CI diff-scope fix
and the targeted ACPX test fix so the PR’s actual checks align with
current base-branch execution.
> - The benefit is that the original false-positive lockfile failure is
removed, and the merge-ref verify path is hardened against the
instance-id isolation used in CI.

## What Changed

- Updated `.github/workflows/pr.yml` so the lockfile policy and manifest
policy steps diff `pull_request.base.sha...pull_request.head.sha` from
the merge base instead of using a two-dot base/head diff.
- Added an inline workflow comment explaining why the three-dot diff is
required for PR-scoped file detection.
- Updated `packages/adapters/acpx-local/src/server/execute.test.ts` so
the managed Codex home assertion uses a test-specific
`PAPERCLIP_INSTANCE_ID` instead of hard-coding `default`.
- Restored `PAPERCLIP_INSTANCE_ID` after that ACPX test finishes so the
test remains isolated and does not leak process env changes.

## Verification

- Reproduced the original false positive locally by comparing PR heads
`#4901` and `#4902` with the old `base..head` logic; both incorrectly
included `pnpm-lock.yaml` from unrelated `master` commits.
- Verified the new `base...head` logic reduces those PRs to only their
actual changed files and excludes `pnpm-lock.yaml`.
- Verified a real manifest-changing PR (`#4893`) still reports
`package.json` changes under the new logic.
- Ran `pnpm -r typecheck` successfully.
- Ran `pnpm vitest run
packages/adapters/acpx-local/src/server/execute.test.ts` successfully
after the ACPX test fix.
- Ran `pnpm vitest run packages/db/src/backup-lib.test.ts` successfully
against the merge-ref-related DB failure path observed during
`/checkpr`.
- Pushed commit `9520a976` and allowed PR `#4903` checks to rerun on the
updated branch.

## Risks

- Low risk: the workflow change only affects how PR policy checks
determine the changed file set.
- Low risk: the ACPX change is test-only and aligns the test with the
instance-isolation behavior already used by
`scripts/run-vitest-stable.mjs` in CI.
- The remaining operational risk is limited to other unrelated
merge-ref-only failures that were not reproduced in the targeted local
verification above.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5-codex`, via the Codex local adapter in Paperclip.
- Tool-using coding model with shell execution, git, GitHub CLI, and
repository inspection in a local worktree.
- Context included the current repo, the Paperclip task thread, PR check
output, and the isolated execution workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 21:22:40 -07:00
github-actions[bot] 6a7cca95ef chore(lockfile): refresh pnpm-lock.yaml (#4899)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-04-30 20:00:07 -05:00
Dotta 4272c1604d Add ACPX local adapter runtime (#4893)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a control plane
that can start, supervise, and recover agent runs.
> - Local adapters are the bridge between Paperclip issues and concrete
agent runtimes such as Claude, Codex, and other ACP-compatible tools.
> - The roadmap calls out broader “bring your own agent” and claw-style
agent support, and ACPX gives Paperclip one path to normalize multiple
ACP agents behind a single adapter.
> - The branch needed to become one reviewable PR against current
`paperclipai/paperclip:master`, without carrying stale base conflicts or
generated lockfile churn.
> - This pull request adds an experimental built-in `acpx_local`
adapter, integrates it through the server/CLI/UI adapter surfaces, and
adds regression coverage for runtime execution, skill sync, stream
parsing, diagnostics, and log redaction.
> - The benefit is that Paperclip can run Claude/Codex/custom ACP agents
through ACPX while keeping operator configuration, skills, logging, and
transcript rendering inside the existing adapter model.

## What Changed

- Added `@paperclipai/adapter-acpx-local` with server execution, config
schema, ACPX session handling, CLI formatting, UI config helpers, and
stdout parsing.
- Registered `acpx_local` across CLI, server, shared constants, UI
adapter metadata, adapter capabilities, and agent creation/editing
surfaces.
- Added ACPX runtime execution support with persistent sessions,
local-agent JWT environment handling, skill snapshots, runtime skill
materialization, and isolation/security regressions.
- Added ACPX adapter diagnostics and marked the adapter experimental in
the UI.
- Added command/env secret redaction for resolved command metadata in
adapter-utils, server event storage, and the Agent Detail invocation UI.
- Added Storybook coverage for ACPX config, transcript rendering, and
skill states, plus PR screenshots under `docs/pr-screenshots/pap-2944/`.
- Rebased the branch onto current `public-gh/master`; `pnpm-lock.yaml`
is intentionally not included and there are no migration/schema changes.

## Verification

- `pnpm exec vitest run
packages/adapters/acpx-local/src/server/execute.test.ts
packages/adapters/acpx-local/src/server/test.test.ts
packages/adapters/acpx-local/src/cli/format-event.test.ts
packages/adapters/acpx-local/src/ui/parse-stdout.test.ts
packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/redaction.test.ts
server/src/__tests__/acpx-local-execute.test.ts
server/src/__tests__/acpx-local-skill-sync.test.ts
server/src/__tests__/acpx-local-adapter-environment.test.ts
server/src/__tests__/adapter-routes.test.ts
server/src/__tests__/agent-skills-routes.test.ts
ui/src/adapters/metadata.test.ts` — 12 files, 87 tests passed.
- `pnpm --filter @paperclipai/adapter-acpx-local typecheck` — passed.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Confirmed PR diff does not include `pnpm-lock.yaml`, database schema
files, or migrations.

Screenshots:

![ACPX Claude skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-light.png?raw=true)
![ACPX Claude skills
dark](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-dark.png?raw=true)
![ACPX custom skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-custom-light.png?raw=true)

## Risks

- Medium risk: this introduces a new built-in adapter package and
touches runtime execution, adapter registration, agent config, skills,
and transcript rendering.
- ACPX and ACP agent behavior can vary by installed tool versions; the
adapter is marked experimental to set operator expectations.
- `pnpm-lock.yaml` is excluded per repository PR policy, so dependency
lock refresh must be handled by the repo’s automation or maintainers.
- No database migration risk: no schema or migration files changed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with repository tool use,
shell execution, git operations, and local verification. Exact hosted
context window was not exposed in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 19:57:05 -05:00
Dotta ad5432fece [codex] Harden issue recovery reliability (#4875)
## Thinking Path

> - Paperclip is the control plane for autonomous agent companies, so
non-terminal issue state must always have a clear live, waiting, or
recovery owner.
> - This change stays inside the server reliability and liveness
subsystem for assigned issue recovery, blocker attention, and live-run
polling.
> - Closed PR #4860 mixed this reliability work with separate
mutation-boundary policy changes, which made review and merge risk too
broad.
> - [PAP-2981](/PAP/issues/PAP-2981) asked for a replacement PR
containing only the remaining reliability slice and explicitly excluding
user-assignment and execution-policy restrictions.
> - Follow-up review also split `advanced` run-liveness continuation
behavior out of this PR so it can be reviewed separately.
> - The implementation hardens repeated recovery escalation, expands
blocker-attention coverage for explicit waiting and recovery paths, and
caps company live-run polling defaults.
> - The benefit is a smaller reliability PR that improves liveness
behavior without changing agent/user mutation authorization boundaries
or `advanced` continuation semantics.

## What Changed

- Avoid repeated liveness escalation updates when the source issue is
already blocked by the same open escalation.
- Treat open liveness escalation recovery issues, their source issues,
and their leaf blockers as covered waiting paths in blocker attention.
- Cap default company live-run polling at 50 rows for both `minCount`
and `limit`, including explicit zero values, to avoid unbounded
responses.
- Preserve the existing behavior where succeeded `advanced` runs are
considered productive/healthy for stranded-work recovery and are not
actionable bounded run-liveness continuations.
- Added focused server coverage for recovery dedupe, blocker attention,
liveness escalation, run continuations, and live-run polling.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/agent-live-run-routes.test.ts`
- Result: 5 files passed, 63 tests passed.
- `pnpm --filter @paperclipai/server typecheck`
- Result: passed.
- No UI changes; screenshots are not applicable.

## Risks

- Recovery and blocker-attention classification changes can affect which
blocked chains are shown as covered versus needing attention.
- Live-run polling now treats omitted, invalid, or non-positive `limit`
/ `minCount` values as the capped default of 50.
- `advanced` run-liveness continuation behavior is intentionally
excluded from this PR and split for separate review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 16:44:28 -05:00
Dotta a3de1d764d Add cheap model profiles for local adapters (#4881)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where
adapters are the boundary between the board, agents, and execution
runtimes.
> - Local adapters currently expose a primary runtime configuration, but
operators often need a cheaper model lane for routine or low-risk work.
> - That cheap lane has to stay adapter-owned: runtime profile settings
should not mutate the primary adapter config or bypass existing
auth/secret mediation.
> - Issue creation also needs an ergonomic way to request primary,
cheap, or custom model behavior for a selected assignee.
> - This pull request adds a first-class `cheap` model profile contract
across adapter capabilities, heartbeat config resolution, agent
configuration, and issue creation.
> - The benefit is cheaper task execution can be configured and
requested explicitly while preserving adapter boundaries, secret
handling, and audit visibility.

## What Changed

- Added adapter model-profile capability metadata and a `cheap` profile
contract for supported local adapters.
- Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during
heartbeat config resolution, including requested/applied/fallback run
metadata.
- Added agent configuration UI for cheap model profile settings without
writing those settings into primary `adapterConfig`.
- Added New Issue assignee model lane controls for Primary / Cheap /
Custom and request payload handling.
- Added run ledger profile badges and Storybook stories for the new
cheap-lane UI states.
- Added tests for validators, heartbeat model profile application,
permission/secret mediation, UI payload helpers, and run ledger
rendering.
- Added committed UI verification screenshots under
`docs/pr-screenshots/pap-2837/`.
- Addressed Greptile review feedback around cheap-profile defaults,
shared profile types, and fallback test data.

## Verification

Local:

- `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/heartbeat-model-profile.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts
ui/src/lib/issue-assignee-overrides.test.ts
ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103
tests.
- `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts
ui/src/components/IssueRunLedger.test.tsx` — passed after
Greptile/rebase follow-up, 2 files / 17 tests.
- `pnpm --filter @paperclipai/ui typecheck` — passed after
Greptile/rebase follow-up.
- `pnpm -r typecheck` — passed.
- `pnpm build` — passed.
- `pnpm test:run` — did not complete successfully in this local
worktree: it stopped in pre-existing `@paperclipai/adapter-utils`
sandbox/SSH fixture suites outside this PR diff. Failures were 5s local
timeouts plus `git init -b` unsupported by this machine's Git 2.21.0.
The branch-specific targeted suites above passed.
- Branch was fetched/rebased onto `public-gh/master`; `git rev-list
--left-right --count public-gh/master...HEAD` reports `0 9`.

Remote PR checks on latest head
`e30bf399146451c86cee98ed528d51d33fa5af5a`:

- `policy` — passed.
- `verify` — passed.
- `e2e` — passed.
- `Greptile Review` — passed, confidence score 5/5; Greptile review
threads resolved.
- `security/snyk (cryppadotta)` — passed.

Screenshots:

- [New issue cheap lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png)
- [New issue custom lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png)
- [New issue unsupported adapter
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png)
- [Run ledger model profile badges
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png)
- Mobile variants are also in `docs/pr-screenshots/pap-2837/`.

## Risks

- Medium: heartbeat config mediation now merges runtime model profiles
into adapter configs, so adapter secret normalization and host-command
restrictions must keep covering nested config paths.
- Medium: the UI adds another issue creation choice; unsupported
adapters must keep hiding the cheap lane and preserve primary behavior.
- Low migration risk: no database migration is included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use
and command execution. Exact served model/context window was not exposed
by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 15:32:04 -05:00
Dotta 1fe1067361 Polish board settings and skills workflow (#4863)
## Thinking Path

> - Paperclip's board UI and bundled skills are the operator layer for
configuring agents, routines, issue workflows, and local troubleshooting
loops.
> - The prior rollup mixed this operator polish with database backups,
backend reliability, thread scale, and cost/workflow primitives.
> - This pull request isolates the remaining board QoL, settings,
issue-detail integration, adapter config cleanup, and skills smoke
tooling.
> - It includes some integration-level overlap with the thread and
workflow slices so this branch can run from `origin/master` while still
preserving the full original work.
> - Preferred merge order is the narrower primitives first, then this
integration PR last.
> - The benefit is that reviewers can inspect the user-facing
board/settings/skills layer separately from backend infrastructure
changes.

## What Changed

- Added board/settings polish for agents, routines, company settings,
project workspace detail, and issue detail controls.
- Added agent/routine UI regression tests and New Issue dialog coverage.
- Integrated issue-detail activity/cost/interaction surfaces and leaf
work pause/resume controls.
- Cleaned bundled adapter UI config defaults and onboarding copy.
- Added terminal-bench loop and work-stoppage diagnosis skills plus a
smoke test script.
- Updated attachment type handling and Paperclip skill/API guidance.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/pages/Agents.test.tsx
ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx
ui/src/pages/IssueDetail.test.tsx
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts`
- Result: 7 test files passed, 54 tests passed.
- `pnpm run smoke:terminal-bench-loop-skill`
- Result: JSON output included `"ok": true` and `"cleanup": true`.
- UI screenshots not included because verification is focused
component/page coverage for the changed board surfaces.

## Risks

- This is the integration-heavy PR in the split and intentionally
overlaps some component/API primitives with the issue-thread and
workflow PRs so it can run from `origin/master`.
- Preferred merge order: #4859, #4860, #4861, #4862, then this PR last.
If earlier branches merge first, this PR may need a straightforward
conflict refresh in shared UI files.
- The terminal-bench smoke script creates temporary mock issues and
relies on cleanup; the verified run returned `cleanup: true`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 15:28:11 -05:00
Dotta c4269bab59 Add workflow interaction cancellation and issue cost summaries (#4862)
## Thinking Path

> - Paperclip coordinates work through issue-thread interactions, run
history, and cost telemetry.
> - Operators need workflow prompts to be cancellable and costs to be
visible at the issue level.
> - The earlier rollup mixed this workflow/cost work with database
backups, reliability recovery, thread scaling, and settings polish.
> - This pull request isolates the interaction and cost surfaces into a
reviewable slice.
> - The backend now supports cancelling pending question interactions
and summarizing issue-tree costs.
> - The UI component layer can render cancelled questions and interleave
activity with run ledger rows.

## What Changed

- Added `cancelled` as an issue-thread interaction status and result
shape for question interactions.
- Added the board-only `POST
/issues/:id/interactions/:interactionId/cancel` route and service
implementation.
- Added issue-tree cost summary support in the cost service and
`/issues/:id/cost-summary` API route.
- Extended shared cost exports and UI API/query keys for issue cost
summaries.
- Updated `IssueThreadInteractionCard` and `IssueRunLedger` components
for cancelled questions, issue cost surfaces, and activity/run
interleaving.
- Added focused server and component regression coverage.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- Result: 4 test files passed, 45 tests passed.
- UI screenshots not included because this PR updates reusable
components and API surfaces without wiring a new page-level layout.

## Risks

- Adds a new interaction terminal status; clients that switch
exhaustively on interaction status may need to handle `cancelled`.
- Issue-tree cost summaries use recursive issue traversal and should be
watched on unusually large issue trees.
- Page-level issue detail wiring is intentionally left to the board
QoL/issue-detail branch to keep this PR narrow.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:57:25 -05:00
Dotta 87f19cd9a6 Improve issue thread scale and markdown polish (#4861)
## Thinking Path

> - Paperclip's board UI is the operator surface for supervising
AI-agent companies.
> - Issue threads are where operators read progress, respond to agents,
inspect markdown, and jump through long histories.
> - Large threads and rich markdown had become difficult to navigate and
expensive to render.
> - The previous rollup mixed these UI scale fixes with unrelated
backend recovery, costs, backups, and settings changes.
> - This pull request isolates the issue-thread scale and markdown
polish work.
> - The benefit is a reviewable UI slice that can merge independently of
the backend reliability, database backup, workflow, and board QoL PRs.

## What Changed

- Virtualized long issue chat threads and stabilized
anchor/jump-to-latest behavior for large histories.
- Added incremental issue-list row loading and tests for
scroll-triggered pagination behavior.
- Hardened markdown body rendering and markdown editor behavior around
HTML tags, image drops, code-copy UI, and escaped newline handling.
- Added a long-thread measurement harness at
`scripts/measure-issue-chat-long-thread.mjs` plus
`perf:issue-chat-long-thread`.
- Added focused UI/lib regression coverage for thread rendering,
markdown, optimistic comments, and message building.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx
ui/src/components/MarkdownBody.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`
- Result: 6 test files passed, 170 tests passed.
- UI screenshots not included because this PR is covered by targeted
component tests and does not introduce a new page layout.

## Risks

- Virtualization changes can affect scroll anchoring in edge cases on
very long threads.
- Markdown/editor hardening changes are intentionally defensive, but
malformed content may render differently than before.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:18:01 -05:00
Dotta cd606563f6 Expand database backups to non-system schemas (#4859)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Reliable backups are part of operating that control plane safely.
> - The previous backup path was public-schema oriented and did not
clearly cover plugin-owned schemas or migration history.
> - Paperclip now has plugin database namespaces and Drizzle migration
state that must survive backup/restore.
> - This pull request expands logical database backups to non-system
schemas and documents the backup boundary.
> - The benefit is safer restore behavior for core and plugin-owned
database state without implying full filesystem disaster recovery.

## What Changed

- Include non-system database schemas in JavaScript and pg_dump backup
paths.
- Preserve enum, table, sequence, index, constraint, migration, and
plugin-schema objects across backup/restore.
- Add restore coverage for plugin-owned schemas and Drizzle migration
history.
- Clarify docs that DB backups are logical database backups, not full
instance filesystem backups.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/db/src/backup-lib.test.ts`
- Result: 1 test file passed, 4 tests passed.
- Confirmed this PR does not include `pnpm-lock.yaml` or
`.github/workflows/*` changes.

## Risks

- Medium: backup generation touches schema discovery and restore
ordering, so unusual database objects may need additional coverage
later.
- No migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use enabled, medium reasoning
effort. Exact hosted context-window details are not exposed in this
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: no UI changes are included in this PR, so screenshots are not
applicable.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 12:54:35 -05:00
Devin Foley c0ce35d1fb Improve E2B plugin configuration UX and fix execution timeouts (#4802)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - E2B is a sandbox provider plugin that runs agent code in isolated
cloud environments
> - Operators configure E2B through the plugin settings page
> - But the E2B API key configuration was unclear — the settings field
description didn't explain that pasted keys are auto-saved as company
secrets, and the fallback to the host `E2B_API_KEY` variable wasn't
documented
> - Additionally, long-running E2B sandbox commands were timing out
because the plugin environment RPC driver used a fixed timeout, and
environment commands competed for the single foreground command slot
> - This PR clarifies the E2B configuration UX, fixes RPC timeouts for
plugin environment execution, and runs E2B environment commands in
background mode to avoid blocking the foreground slot
> - The benefit is clearer E2B setup for operators and more reliable
sandbox command execution

## What Changed

- Updated E2B plugin manifest and settings UI to clarify API key
configuration — field description now explains that pasted keys are
saved as company secrets and documents the `E2B_API_KEY` host fallback
- Added test coverage for the plugin settings page rendering
- Fixed `plugin-environment-driver.ts` to pass the configured timeout
through to RPC calls instead of using a hardcoded default
- Updated `environment-runtime.ts` to propagate timeout from the
environment lease to the plugin driver
- Changed E2B sandbox command execution to use background handles so
long-running agent commands don't block the foreground slot needed by
the callback bridge

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to plugin settings, verify E2B API key field shows
the updated description text
- Manual: run an E2B-backed agent task with a long-running command,
verify it completes without RPC timeout

## Risks

- Low risk. Configuration UX change is cosmetic. The timeout fix passes
an existing value through instead of dropping it. Background command
execution is a behavioral change but only affects E2B sandbox commands —
the foreground slot is still available for bridge health checks.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 17:12:30 -07:00
Devin Foley a4ac6ff133 Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end

## What Changed

- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export

## Verification

- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge

## Risks

- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
Devin Foley 4cf612a92d Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments that are leased, and the server
manages runtime state, workspace configuration, and plugin lifecycle
> - Several edge cases caused failures during concurrent operations: a
race condition in runtime state insertion could produce duplicate-key
errors, reused workspaces didn't sync their configuration when the
parent issue was updated, sandbox provider plugins could be queried
before registration completed, and orphaned environment leases from
failed runs were never released
> - This PR fixes these four runtime/environment issues
> - The benefit is more reliable concurrent agent execution and proper
resource cleanup

## What Changed

- `services/heartbeat.ts`: Fixed a race condition where concurrent
runtime state inserts could fail with a duplicate-key error by using an
upsert pattern
- `services/issues.ts`: Sync reused workspace configuration when an
issue is updated, so the workspace reflects the latest issue state
- `services/environment-runtime.ts`: Fixed a startup race where sandbox
provider plugins could be queried before registration completed, by
awaiting plugin readiness before resolving environment drivers
- `services/heartbeat.ts`: Release environment leases for orphaned runs
that lost their process without cleanup

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
runtime state upsert and process recovery lease cleanup
- `pnpm typecheck` — clean
- Manual: trigger concurrent agent runs to verify no duplicate-key
failures; verify orphaned leases are released after process loss

## Risks

- Low risk. The runtime state upsert changes insert-to-upsert behavior,
which could mask a legitimate duplicate if two different runs produce
the same key — but this is prevented by the run ID being part of the
key. The plugin startup await is bounded by the existing registration
timeout.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:10 -07:00
Devin Foley f9cf1d2f6a Add cursor sandbox support and fix SSH workspace sync (#4803)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, or on remote
hosts via SSH
> - The cursor adapter needs to resolve `cursor-agent` inside sandbox
environments where it's installed in `~/.local/bin`
> - But when using the default `agent` command on a sandbox target, the
adapter didn't know to look in `~/.local/bin/cursor-agent`, causing
"command not found" failures
> - Additionally, repeated SSH runs failed because `git checkout` during
workspace sync conflicted with leftover `.paperclip-runtime` files from
previous runs
> - This PR adds sandbox-aware command resolution for cursor and fixes
the SSH workspace sync conflict
> - The benefit is cursor works in E2B sandboxes out of the box, and
repeated SSH runs don't fail on workspace sync

## What Changed

- `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox
targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and
prefers `~/.local/bin/cursor-agent` when the default command is
requested; tightened the sandbox command probe to validate the binary
exists before launching; preserves explicit custom command overrides
- `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH
workspace sync to handle `.paperclip-runtime` untracked file conflicts
from previous runs

## Verification

- `pnpm test` — all existing and new tests pass, including cursor
sandbox probe, sandbox execution, and custom command override tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run a cursor-local task, verify
it resolves cursor-agent from the sandbox install path

## Risks

- Low-medium. The `--force` flag on git checkout could discard
uncommitted changes in the remote workspace, but the workspace is
managed by Paperclip and should not contain user edits.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:12:06 -07:00
Devin Foley a0f5cbffd7 Harden release flow with registry verification and dist-tag checks (#4800)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Paperclip is distributed as npm packages, including plugins like
`plugin-e2b`
> - The release process publishes canary and stable builds via npm
dist-tags
> - But there was no automated verification that published packages
actually landed with the correct dist-tags, and broken canary publishes
could silently ship to users
> - This PR adds a registry verification script that checks published
packages match their expected dist-tags, and wires it into PR CI so
regressions are caught before merge
> - The benefit is release integrity is verified automatically, and
broken dist-tag states are caught early

## What Changed

- Added `scripts/verify-release-registry-state.mjs` — verifies that
published npm packages have correct dist-tag assignments and detects
orphaned or mispointed tags
- Added `scripts/verify-release-registry-state.test.mjs` — test coverage
for the verification logic
- Updated `scripts/release.sh` to include canary dist-tag safety checks
before publishing
- Updated `.github/workflows/pr.yml` to run registry verification as a
CI step
- Updated `doc/PUBLISHING.md` and `doc/RELEASING.md` with the new
verification workflow

## Verification

- `pnpm test` — all tests pass including new verification script tests
- `node scripts/verify-release-registry-state.mjs` — runs against the
live npm registry and reports current state
- CI: the new PR workflow step runs on every PR push

## Risks

- Low risk. This is additive CI and tooling — no runtime code changes.
The registry verification is read-only (queries npm, does not publish).
The release script changes add safety checks that abort before
publishing if state is unexpected.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:20 -07:00
Devin Foley 367d4cab72 Fix SSH callback URL selection for LAN and private networks (#4799)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run on remote hosts via SSH environments
> - When a remote agent needs to call back to the Paperclip API, it
needs a reachable URL
> - But the runtime API URL candidate builder did not account for
private network topologies where the server is only reachable via LAN or
VPN addresses
> - Agents on SSH hosts were failing to connect because the callback URL
pointed to localhost or an unreachable address
> - This PR fixes callback URL selection to honor `PAPERCLIP_API_URL`,
prefer LAN-reachable candidates, filter unreachable link-local
addresses, and include interface hosts in onboarding invite URLs
> - The benefit is SSH-based agents can reliably reach the Paperclip API
on private networks without manual URL configuration

## What Changed

- `runtime-api.ts`: Added `PAPERCLIP_API_URL` as a first-priority
candidate in `buildRuntimeApiCandidateUrls`; extracted
`collectReachableInterfaceHosts` to enumerate non-loopback,
non-link-local network interface IPs with IPv4 preference
- `server/src/index.ts`: Export `PAPERCLIP_API_URL` from the server
environment so it is available to callback candidate resolution
- `server/src/routes/access.ts`: Include LAN interface hosts in
onboarding invite connection candidates
- `server/src/config.ts`: Attempted auto-allowing LAN interface hosts,
then reverted to the per-instance allowlist approach (both commits
included for history clarity)

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
LAN candidate ordering and link-local filtering
- `pnpm typecheck` — clean
- Manual: start a Paperclip server on a machine with a LAN IP, create an
SSH environment pointing to another host on the same LAN, verify the
agent's callback URL uses the LAN IP rather than localhost

## Risks

- Low-medium. The candidate list now includes more addresses (all
non-loopback LAN interfaces). These are candidates for the agent to try,
not an allowlist — the server's allowed hostnames still gate which
origins are accepted. Ordering change (LAN preferred over loopback)
could affect existing setups where localhost was intentionally
preferred.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:17 -07:00
Devin Foley 9b99d30330 Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use

## What Changed

- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment

## Risks

- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
921 changed files with 206175 additions and 6744 deletions
+77
View File
@@ -0,0 +1,77 @@
name: "Build: Dev"
on:
push:
branches: [dev]
workflow_dispatch:
permissions:
contents: read
packages: write
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 30
outputs:
image-tag: ${{ steps.tag.outputs.sha }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set image tag
id: tag
run: echo "sha=$(echo ${{ github.sha }} | cut -c1-7)" >> $GITHUB_OUTPUT
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Gitea Registry
uses: docker/login-action@v3
with:
registry: git.farh.net
username: ${{ gitea.repository_owner }}
password: ${{ secrets.REGISTRY_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: git.farh.net/farhoodlabs/paperclip-dev
tags: |
type=raw,value=latest
type=sha,prefix=
type=semver,pattern={{version}}
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: .farhoodlabs/Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
no-cache: true
update-infra:
needs: build
runs-on: ubuntu-latest
steps:
- name: Update dev image tag in infra repo
run: |
SHA="${{ needs.build.outputs.image-tag }}"
FILE="overlays/dev/kustomization.yaml"
response=$(curl -sS \
-H "Authorization: token ${{ secrets.REGISTRY_TOKEN }}" \
"https://git.farh.net/api/v1/repos/farhoodlabs/paperclip-infra/contents/$FILE")
file_sha=$(echo "$response" | jq -r '.sha')
content=$(echo "$response" | jq -r '.content' | base64 -d)
new_content=$(echo "$content" | sed "s/newTag: \".*\"/newTag: \"$SHA\"/")
encoded=$(printf '%s' "$new_content" | base64 -w 0)
curl -sS -X PUT \
-H "Authorization: token ${{ secrets.REGISTRY_TOKEN }}" \
"https://git.farh.net/api/v1/repos/farhoodlabs/paperclip-infra/contents/$FILE" \
-d "{\"message\":\"chore(cd): update paperclip-dev to $SHA\",\"content\":\"$encoded\",\"sha\":\"$file_sha\"}"
+48
View File
@@ -0,0 +1,48 @@
name: "Build: Production"
on:
push:
branches: [local]
workflow_dispatch:
permissions:
contents: read
packages: write
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Gitea Registry
uses: docker/login-action@v3
with:
registry: git.farh.net
username: ${{ gitea.repository_owner }}
password: ${{ secrets.REGISTRY_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: git.farh.net/farhoodlabs/paperclip
tags: |
type=raw,value=latest
type=sha,prefix=
type=semver,pattern={{version}}
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: .farhoodlabs/Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
no-cache: true
+79
View File
@@ -0,0 +1,79 @@
# Paperclip Fork — Project Context
This is a fork of [paperclipai/paperclip](https://github.com/paperclipai/paperclip).
Fork repo: https://git.farh.net/farhoodlabs/paperclip
## Branch Model
| Branch | Purpose |
|---|---|
| `master` | Mirrors `upstream/master` exactly + `.farhoodlabs/` overlay directory + `assemble-local.yml` action. Never commit application code here. |
| `local` | **Default branch.** Assembled automatically by `assemble-local.yml` on every `master` push. Contains: upstream + fork Dockerfile/workflows + all pending upstream PR cherry-picks. Builds `git.farh.net/farhoodlabs/paperclip`. |
| `dev` | Development branch based on upstream/master. Builds `git.farh.net/farhoodlabs/paperclip-dev` on every push. |
| PR branches | `skill-pat-feature`, `skill-scan-refresh`, `feat/company-portability-complete` — open PRs to upstream, never rebase onto master/local. |
**Never commit directly to `local`** — it is fully regenerated by the assemble action and any direct commits will be overwritten.
## Fork Overlay (`.farhoodlabs/`)
Files committed to `master` that get copied into position on `local` by the assemble action:
```
.farhoodlabs/
CLAUDE.md → CLAUDE.md (repo root)
Dockerfile → Dockerfile
.github/workflows/build-prod.yml → .github/workflows/build-prod.yml
.github/workflows/build-dev.yml → .github/workflows/build-dev.yml
```
The fork's Dockerfile production stage additions over upstream: `kubectl`, `kubeseal`, `uv`/`uvx`, `forgejo-cli` (`fj`, `fj-ex`, `fgj`), `nano`, `vim`.
To modify fork-specific files, edit them in `.farhoodlabs/` on `master` and push — the assemble action will apply them to `local` automatically.
## Pending Upstream PRs (included in `local`)
These are cherry-picked/squashed onto `local` by the assemble action. When upstream merges one, remove its entry from `assemble-local.yml`.
| PR | Branch | Method | Notes |
|---|---|---|---|
| #3237 | `skill-pat-feature` | cherry-pick | GitHub PAT support for private skill repos |
| #3351 | `skill-scan-refresh` | cherry-pick (exclude: skill-pat-feature) | Rebased onto skill-pat-feature |
| #3987 | `feat/company-portability-complete` | squash | Secrets export/import; squashed to bypass intra-PR merge commits |
## Common Tasks
### Sync upstream into master
```bash
git fetch upstream
git push origin upstream/master:master --force-with-lease
# assemble-local.yml triggers automatically and rebuilds local
```
### Add a new pending PR to local
Edit `.github/workflows/assemble-local.yml` on `master`:
- Simple PR (clean commits, no merge commits): add to `PR_CHERRY_PICK`
- Complex PR (has merge commits mixed in): add to `PR_SQUASH`
- If the branch was rebased onto another PR branch: use `exclude:base-branch`
### Remove a PR after upstream merges it
Delete its entry from `PR_CHERRY_PICK` or `PR_SQUASH` in `assemble-local.yml` on `master`.
### Submit a new PR to upstream
Branch from `upstream/master` (not from `local` or `master`):
```bash
git fetch upstream
git checkout -b feat/my-feature upstream/master
```
### Modify the fork Dockerfile
Edit `.farhoodlabs/Dockerfile` on `master`. Only modify the production stage — keep base/deps/build stages identical to upstream so diffs are minimal and upstream changes apply cleanly.
## Deployment
Paperclip runs in Kubernetes, not locally. Use `kubectl` to access it. The production image is `git.farh.net/farhoodlabs/paperclip:latest`.
## Key Files
- `.github/workflows/assemble-local.yml` — assembles `local` branch; edit this to manage pending PRs
- `.farhoodlabs/` — fork overlay; all fork-specific files live here on `master`
- `server/package.json` — has an adapter-utils workspace vs canary hack that needs fixing eventually
+101
View File
@@ -0,0 +1,101 @@
# syntax=docker/dockerfile:1.20
FROM node:lts-trixie-slim AS base
ARG USER_UID=1000
ARG USER_GID=1000
RUN apt-get update \
&& apt-get install -y --no-install-recommends ca-certificates gosu curl gh git wget ripgrep python3 \
&& rm -rf /var/lib/apt/lists/* \
&& corepack enable
# Modify the existing node user/group to have the specified UID/GID to match host user
RUN usermod -u $USER_UID --non-unique node \
&& groupmod -g $USER_GID --non-unique node \
&& usermod -g $USER_GID -d /paperclip node
FROM base AS deps
WORKDIR /app
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml .npmrc ./
COPY cli/package.json cli/
COPY server/package.json server/
COPY ui/package.json ui/
COPY packages/shared/package.json packages/shared/
COPY packages/db/package.json packages/db/
COPY packages/adapter-utils/package.json packages/adapter-utils/
COPY packages/mcp-server/package.json packages/mcp-server/
COPY packages/adapters/acpx-local/package.json packages/adapters/acpx-local/
COPY packages/adapters/claude-local/package.json packages/adapters/claude-local/
COPY packages/adapters/codex-local/package.json packages/adapters/codex-local/
COPY packages/adapters/cursor-cloud/package.json packages/adapters/cursor-cloud/
COPY packages/adapters/cursor-local/package.json packages/adapters/cursor-local/
COPY packages/adapters/gemini-local/package.json packages/adapters/gemini-local/
COPY packages/adapters/openclaw-gateway/package.json packages/adapters/openclaw-gateway/
COPY packages/adapters/opencode-local/package.json packages/adapters/opencode-local/
COPY packages/adapters/pi-local/package.json packages/adapters/pi-local/
COPY packages/plugins/sdk/package.json packages/plugins/sdk/
COPY --parents packages/plugins/sandbox-providers/./*/package.json packages/plugins/sandbox-providers/
COPY packages/plugins/paperclip-plugin-fake-sandbox/package.json packages/plugins/paperclip-plugin-fake-sandbox/
COPY patches/ patches/
RUN pnpm install --frozen-lockfile
FROM base AS build
WORKDIR /app
COPY --from=deps /app /app
COPY . .
RUN pnpm --filter @paperclipai/ui build
RUN pnpm --filter @paperclipai/plugin-sdk build
RUN pnpm --filter @paperclipai/server build
RUN test -f server/dist/index.js || (echo "ERROR: server build output missing" && exit 1)
FROM base AS production
ARG USER_UID=1000
ARG USER_GID=1000
WORKDIR /app
COPY --chown=node:node --from=build /app /app
# Fork additions: kubectl, kubeseal, uv, forgejo CLIs, gitea tea CLI, editor tools
# Upstream installs: claude-code, codex, opencode-ai, openssh-client, jq
RUN apt-get update \
&& apt-get install -y --no-install-recommends openssh-client jq nano vim \
&& rm -rf /var/lib/apt/lists/* \
&& curl -fsSL https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl \
&& chmod +x /usr/local/bin/kubectl \
&& curl -fsSL https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.36.6/kubeseal-0.36.6-linux-amd64.tar.gz | tar -xzf - -C /tmp \
&& mv /tmp/kubeseal /usr/local/bin/kubeseal \
&& rm -rf /tmp/kubeseal /tmp/LICENSE /tmp/README.md \
&& curl -LsSf https://astral.sh/uv/install.sh | sh \
&& mv /root/.local/bin/uv /usr/local/bin/uv \
&& mv /root/.local/bin/uvx /usr/local/bin/uvx \
&& curl -fsSL https://codeberg.org/forgejo-contrib/forgejo-cli/releases/download/v0.4.1/forgejo-cli-linux.tar.gz | tar -xzf - -C /usr/local/bin \
&& chmod +x /usr/local/bin/fj \
&& curl -fsSL https://github.com/JKamsker/forgejo-cli-ex/releases/download/v0.1.7/fj-ex-linux-x86_64.tar.gz | tar -xzf - -C /usr/local/bin \
&& chmod +x /usr/local/bin/fj-ex \
&& curl -fsSL https://codeberg.org/romaintb/fgj/releases/download/v0.3.0/fgj_linux_amd64 -o /usr/local/bin/fgj \
&& chmod +x /usr/local/bin/fgj \
&& curl -fsSL https://dl.gitea.com/tea/0.14.0/tea-0.14.0-linux-amd64 -o /usr/local/bin/tea \
&& chmod +x /usr/local/bin/tea \
&& npm install --global --omit=dev @anthropic-ai/claude-code@latest @openai/codex@latest opencode-ai \
&& mkdir -p /paperclip \
&& chown node:node /paperclip
COPY scripts/docker-entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
ENV NODE_ENV=production \
HOME=/paperclip \
HOST=0.0.0.0 \
PORT=3100 \
SERVE_UI=true \
PAPERCLIP_HOME=/paperclip \
PAPERCLIP_INSTANCE_ID=default \
USER_UID=${USER_UID} \
USER_GID=${USER_GID} \
PAPERCLIP_CONFIG=/paperclip/instances/default/config.json \
PAPERCLIP_DEPLOYMENT_MODE=authenticated \
PAPERCLIP_DEPLOYMENT_EXPOSURE=private \
OPENCODE_ALLOW_ALL_MODELS=true
VOLUME ["/paperclip"]
EXPOSE 3100
ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["node", "--import", "./server/node_modules/tsx/dist/loader.mjs", "server/dist/index.js"]
+77
View File
@@ -0,0 +1,77 @@
name: "Build: Dev"
on:
push:
branches: [dev]
workflow_dispatch:
permissions:
contents: read
packages: write
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 30
outputs:
image-tag: ${{ steps.tag.outputs.sha }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set image tag
id: tag
run: echo "sha=$(echo ${{ github.sha }} | cut -c1-7)" >> $GITHUB_OUTPUT
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Gitea Registry
uses: docker/login-action@v3
with:
registry: git.farh.net
username: ${{ gitea.repository_owner }}
password: ${{ secrets.REGISTRY_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: git.farh.net/farhoodlabs/paperclip-dev
tags: |
type=raw,value=latest
type=sha,prefix=
type=semver,pattern={{version}}
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: .farhoodlabs/Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
no-cache: true
update-infra:
needs: build
runs-on: ubuntu-latest
steps:
- name: Update dev image tag in infra repo
run: |
SHA="${{ needs.build.outputs.image-tag }}"
FILE="overlays/dev/kustomization.yaml"
response=$(curl -sS \
-H "Authorization: token ${{ secrets.REGISTRY_TOKEN }}" \
"https://git.farh.net/api/v1/repos/farhoodlabs/paperclip-infra/contents/$FILE")
file_sha=$(echo "$response" | jq -r '.sha')
content=$(echo "$response" | jq -r '.content' | base64 -d)
new_content=$(echo "$content" | sed "s/newTag: \".*\"/newTag: \"$SHA\"/")
encoded=$(printf '%s' "$new_content" | base64 -w 0)
curl -sS -X PUT \
-H "Authorization: token ${{ secrets.REGISTRY_TOKEN }}" \
"https://git.farh.net/api/v1/repos/farhoodlabs/paperclip-infra/contents/$FILE" \
-d "{\"message\":\"chore(cd): update paperclip-dev to $SHA\",\"content\":\"$encoded\",\"sha\":\"$file_sha\"}"
+48
View File
@@ -0,0 +1,48 @@
name: "Build: Production"
on:
push:
branches: [local]
workflow_dispatch:
permissions:
contents: read
packages: write
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Gitea Registry
uses: docker/login-action@v3
with:
registry: git.farh.net
username: ${{ gitea.repository_owner }}
password: ${{ secrets.REGISTRY_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: git.farh.net/farhoodlabs/paperclip
tags: |
type=raw,value=latest
type=sha,prefix=
type=semver,pattern={{version}}
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: .farhoodlabs/Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
no-cache: true
+14 -50
View File
@@ -1,55 +1,19 @@
# Disabled in fork — Docker builds are handled by the fork overlay:
# build-prod.yml triggers on `local` branch → ghcr.io/farhoodlabs/paperclip
# build-dev.yml triggers on `dev` branch → ghcr.io/farhoodlabs/paperclip-dev
# See .farhoodlabs/.github/workflows/ and .github/workflows/assemble-local.yml
#
# NOTE: upstream may overwrite this file when master is synced. Re-apply if that happens,
# or use the sync-upstream.yml action which re-applies these overrides automatically.
name: Docker
on:
push:
branches:
- "master"
tags:
- "v*"
permissions:
contents: read
packages: write
workflow_dispatch:
inputs:
note:
description: "Disabled in fork. Use build-prod.yml (local) or build-dev.yml (dev)."
required: false
jobs:
build-and-push:
disabled:
runs-on: ubuntu-latest
timeout-minutes: 30
concurrency:
group: docker-${{ github.ref }}
cancel-in-progress: true
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: ghcr.io/${{ github.repository }}
tags: |
type=raw,value=latest,enable={{is_default_branch}}
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=sha
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
cache-from: type=gha
cache-to: type=gha,mode=max
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
- run: echo "Disabled. See build-prod.yml and build-dev.yml."
+90 -10
View File
@@ -23,7 +23,9 @@ jobs:
- name: Block manual lockfile edits
if: github.head_ref != 'chore/refresh-lockfile'
run: |
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}")"
# Diff the PR branch against its merge base so recent base-branch commits
# do not masquerade as changes made by the PR itself.
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")"
if printf '%s\n' "$changed" | grep -qx 'pnpm-lock.yaml'; then
echo "Do not commit pnpm-lock.yaml in pull requests. CI owns lockfile updates."
exit 1
@@ -43,9 +45,18 @@ jobs:
- name: Validate Dockerfile deps stage
run: node ./scripts/check-docker-deps-stage.mjs
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Verify release package bootstrap for changed manifests
run: |
mapfile -t changed_paths < <(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")
PAPERCLIP_RELEASE_BOOTSTRAP_BASE_SHA="${{ github.event.pull_request.base.sha }}" \
node ./scripts/check-release-package-bootstrap.mjs "${changed_paths[@]}"
- name: Validate dependency resolution when manifests change
run: |
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}")"
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")"
manifest_pattern='(^|/)package\.json$|^pnpm-workspace\.yaml$|^\.npmrc$|^pnpmfile\.(cjs|js|mjs)$'
if printf '%s\n' "$changed" | grep -Eq "$manifest_pattern"; then
pnpm install --lockfile-only --ignore-scripts --no-frozen-lockfile
@@ -74,16 +85,88 @@ jobs:
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Typecheck
run: pnpm -r typecheck
- name: Typecheck workspaces whose build scripts skip TypeScript
run: pnpm run typecheck:build-gaps
- name: Run tests
run: pnpm test:run
- name: Run general test suites
run: pnpm test:run:general
- name: Verify release registry test coverage
run: pnpm run test:release-registry
- name: Build
run: pnpm build
- name: Release canary dry run
verify_serialized_server:
name: Verify serialized server suites (${{ matrix.shard_label }})
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
strategy:
fail-fast: false
matrix:
include:
- shard_index: 0
shard_count: 4
shard_label: 1/4
- shard_index: 1
shard_count: 4
shard_label: 2/4
- shard_index: 2
shard_count: 4
shard_label: 3/4
- shard_index: 3
shard_count: 4
shard_label: 4/4
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Run serialized server test shard
run: pnpm test:run:serialized -- --shard-index ${{ matrix.shard_index }} --shard-count ${{ matrix.shard_count }}
canary_dry_run:
name: Canary Dry Run
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
# `release.sh` always executes its Step 2/7 workspace build, even when
# `--skip-verify` bypasses the initial verification gate.
- name: Release canary dry run via release.sh internal build
run: |
git checkout -B master HEAD
git checkout -- pnpm-lock.yaml
@@ -112,9 +195,6 @@ jobs:
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Build
run: pnpm build
- name: Install Playwright
run: npx playwright install --with-deps chromium
+10 -90
View File
@@ -1,96 +1,16 @@
# Disabled in fork — `gh` CLI and GitHub-specific commands are not available on Gitea.
# Lockfile refreshes are managed directly in development workflows.
#
# NOTE: upstream may overwrite this file when master is synced. Re-apply if that happens.
name: Refresh Lockfile
on:
push:
branches:
- master
workflow_dispatch:
concurrency:
group: refresh-lockfile-master
cancel-in-progress: false
inputs:
note:
description: "Disabled in fork. Uses GitHub-specific gh CLI."
required: false
jobs:
refresh:
disabled:
runs-on: ubuntu-latest
timeout-minutes: 10
permissions:
contents: write
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
run_install: false
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: pnpm
- name: Refresh pnpm lockfile
run: pnpm install --lockfile-only --ignore-scripts --no-frozen-lockfile
- name: Fail on unexpected file changes
run: |
changed="$(git status --porcelain)"
if [ -z "$changed" ]; then
echo "Lockfile is already up to date."
exit 0
fi
if printf '%s\n' "$changed" | grep -Fvq ' pnpm-lock.yaml'; then
echo "Unexpected files changed during lockfile refresh:"
echo "$changed"
exit 1
fi
- name: Create or update pull request
id: upsert-pr
env:
GH_TOKEN: ${{ github.token }}
REPO_OWNER: ${{ github.repository_owner }}
run: |
if git diff --quiet -- pnpm-lock.yaml; then
echo "Lockfile unchanged, nothing to do."
echo "pr_url=" >> "$GITHUB_OUTPUT"
exit 0
fi
BRANCH="chore/refresh-lockfile"
git config user.name "lockfile-bot"
git config user.email "lockfile-bot@users.noreply.github.com"
git checkout -B "$BRANCH"
git add pnpm-lock.yaml
git commit -m "chore(lockfile): refresh pnpm-lock.yaml"
git push --force origin "$BRANCH"
# Only reuse an open PR from this repository owner, not a fork with the same branch name.
pr_url="$(
gh pr list --state open --head "$BRANCH" --json url,headRepositoryOwner \
--jq ".[] | select(.headRepositoryOwner.login == \"$REPO_OWNER\") | .url" |
head -n 1
)"
if [ -z "$pr_url" ]; then
pr_url="$(gh pr create \
--head "$BRANCH" \
--title "chore(lockfile): refresh pnpm-lock.yaml" \
--body "Auto-generated lockfile refresh after dependencies changed on master. This PR only updates pnpm-lock.yaml.")"
echo "Created new PR: $pr_url"
else
echo "PR already exists: $pr_url"
fi
echo "pr_url=$pr_url" >> "$GITHUB_OUTPUT"
- name: Enable auto-merge for lockfile PR
if: steps.upsert-pr.outputs.pr_url != ''
env:
GH_TOKEN: ${{ github.token }}
run: |
gh pr merge --auto --squash --delete-branch "${{ steps.upsert-pr.outputs.pr_url }}"
- run: echo "Disabled. Lockfile management requires GitHub-specific tooling."
+8 -253
View File
@@ -1,261 +1,16 @@
# Disabled in fork — package publishing is not applicable to this fork.
#
# NOTE: upstream may overwrite this file when master is synced. Re-apply if that happens,
# or use the sync-upstream.yml action which re-applies these overrides automatically.
name: Release
on:
push:
branches:
- master
workflow_dispatch:
inputs:
source_ref:
description: Commit SHA, branch, or tag to publish as stable
required: true
type: string
default: master
stable_date:
description: Enter a UTC date in YYYY-MM-DD format, for example 2026-03-18. Do not enter a version string. The workflow will resolve that date to a stable version such as 2026.318.0, then 2026.318.1 for the next same-day stable.
note:
description: "Disabled in fork."
required: false
type: string
dry_run:
description: Preview the stable release without publishing
required: true
type: boolean
default: false
concurrency:
group: release-${{ github.event_name }}-${{ github.ref }}
cancel-in-progress: false
jobs:
verify_canary:
if: github.event_name == 'push'
disabled:
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
contents: read
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
- name: Typecheck
run: pnpm -r typecheck
- name: Run tests
run: pnpm test:run
- name: Build
run: pnpm build
publish_canary:
if: github.event_name == 'push'
needs: verify_canary
runs-on: ubuntu-latest
timeout-minutes: 45
environment: npm-canary
permissions:
contents: write
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
- name: Restore tracked install-time changes
run: git checkout -- pnpm-lock.yaml
- name: Configure git author
run: |
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
- name: Publish canary
env:
GITHUB_ACTIONS: "true"
run: ./scripts/release.sh canary --skip-verify
- name: Push canary tag
run: |
tag="$(git tag --points-at HEAD | grep '^canary/v' | head -1)"
if [ -z "$tag" ]; then
echo "Error: no canary tag points at HEAD after release." >&2
exit 1
fi
git push origin "refs/tags/${tag}"
verify_stable:
if: github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
contents: read
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ inputs.source_ref }}
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
- name: Typecheck
run: pnpm -r typecheck
- name: Run tests
run: pnpm test:run
- name: Build
run: pnpm build
preview_stable:
if: github.event_name == 'workflow_dispatch' && inputs.dry_run
needs: verify_stable
runs-on: ubuntu-latest
timeout-minutes: 45
permissions:
contents: read
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ inputs.source_ref }}
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
- name: Dry-run stable release
env:
GITHUB_ACTIONS: "true"
run: |
args=(stable --skip-verify --dry-run)
if [ -n "${{ inputs.stable_date }}" ]; then
args+=(--date "${{ inputs.stable_date }}")
fi
./scripts/release.sh "${args[@]}"
publish_stable:
if: github.event_name == 'workflow_dispatch' && !inputs.dry_run
needs: verify_stable
runs-on: ubuntu-latest
timeout-minutes: 45
environment: npm-stable
permissions:
contents: write
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ inputs.source_ref }}
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
- name: Restore tracked install-time changes
run: git checkout -- pnpm-lock.yaml
- name: Configure git author
run: |
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
- name: Publish stable
env:
GITHUB_ACTIONS: "true"
run: |
args=(stable --skip-verify)
if [ -n "${{ inputs.stable_date }}" ]; then
args+=(--date "${{ inputs.stable_date }}")
fi
./scripts/release.sh "${args[@]}"
- name: Push stable tag
run: |
tag="$(git tag --points-at HEAD | grep '^v' | head -1)"
if [ -z "$tag" ]; then
echo "Error: no stable tag points at HEAD after release." >&2
exit 1
fi
git push origin "refs/tags/${tag}"
- name: Create GitHub Release
env:
GH_TOKEN: ${{ github.token }}
PUBLISH_REMOTE: origin
run: |
version="$(git tag --points-at HEAD | grep '^v' | head -1 | sed 's/^v//')"
if [ -z "$version" ]; then
echo "Error: no v* tag points at HEAD after stable release." >&2
exit 1
fi
./scripts/create-github-release.sh "$version"
- run: echo "Disabled in fork."
+70
View File
@@ -0,0 +1,70 @@
name: Sync upstream
# Syncs upstream/master into this fork's master, then re-applies fork overrides
# for any upstream workflow files that should not run in the fork (docker.yml, release.yml).
# Triggers assemble-local.yml automatically via the master push.
#
# Run manually or on a schedule to keep master current with upstream.
on:
schedule:
- cron: '0 7 * * *' # daily at 2am EST (UTC-5)
workflow_dispatch:
permissions:
contents: write
jobs:
sync:
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout master
uses: actions/checkout@v4
with:
fetch-depth: 0
token: ${{ secrets.GITHUB_TOKEN }}
- name: Configure git
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
- name: Fetch upstream
run: |
git remote add upstream https://github.com/paperclipai/paperclip.git 2>/dev/null || true
git fetch upstream
- name: Fast-forward master to upstream
run: |
git merge --ff-only upstream/master || {
echo "::error::Cannot fast-forward master to upstream/master — diverged history"
echo "::error::Resolve manually: git fetch upstream && git rebase upstream/master"
exit 1
}
- name: Re-apply fork workflow overrides
run: |
# These files are overridden in the fork to prevent upstream workflows from
# running on fork pushes. Re-apply after each upstream sync.
OVERRIDE_FILES=(
".github/workflows/docker.yml"
".github/workflows/release.yml"
)
changed=false
for f in "${OVERRIDE_FILES[@]}"; do
fork_version=$(git show origin/master:"$f" 2>/dev/null || true)
current=$(cat "$f" 2>/dev/null || true)
if [ "$fork_version" != "$current" ]; then
echo "Re-applying fork override: $f"
git checkout origin/master -- "$f"
changed=true
fi
done
if [ "$changed" = true ]; then
git add "${OVERRIDE_FILES[@]}"
git commit -m "chore(ci): re-apply fork workflow overrides after upstream sync"
fi
- name: Push master
run: git push origin master
+79
View File
@@ -0,0 +1,79 @@
# Paperclip Fork — Project Context
This is a fork of [paperclipai/paperclip](https://github.com/paperclipai/paperclip).
Fork repo: https://git.farh.net/farhoodlabs/paperclip
## Branch Model
| Branch | Purpose |
|---|---|
| `master` | Mirrors `upstream/master` exactly + `.farhoodlabs/` overlay directory + `assemble-local.yml` action. Never commit application code here. |
| `local` | **Default branch.** Assembled automatically by `assemble-local.yml` on every `master` push. Contains: upstream + fork Dockerfile/workflows + all pending upstream PR cherry-picks. Builds `git.farh.net/farhoodlabs/paperclip`. |
| `dev` | Development branch based on upstream/master. Builds `git.farh.net/farhoodlabs/paperclip-dev` on every push. |
| PR branches | `skill-pat-feature`, `skill-scan-refresh`, `feat/company-portability-complete` — open PRs to upstream, never rebase onto master/local. |
**Never commit directly to `local`** — it is fully regenerated by the assemble action and any direct commits will be overwritten.
## Fork Overlay (`.farhoodlabs/`)
Files committed to `master` that get copied into position on `local` by the assemble action:
```
.farhoodlabs/
CLAUDE.md → CLAUDE.md (repo root)
Dockerfile → Dockerfile
.github/workflows/build-prod.yml → .github/workflows/build-prod.yml
.github/workflows/build-dev.yml → .github/workflows/build-dev.yml
```
The fork's Dockerfile production stage additions over upstream: `kubectl`, `kubeseal`, `uv`/`uvx`, `forgejo-cli` (`fj`, `fj-ex`, `fgj`), `nano`, `vim`.
To modify fork-specific files, edit them in `.farhoodlabs/` on `master` and push — the assemble action will apply them to `local` automatically.
## Pending Upstream PRs (included in `local`)
These are cherry-picked/squashed onto `local` by the assemble action. When upstream merges one, remove its entry from `assemble-local.yml`.
| PR | Branch | Method | Notes |
|---|---|---|---|
| #3237 | `skill-pat-feature` | cherry-pick | GitHub PAT support for private skill repos |
| #3351 | `skill-scan-refresh` | cherry-pick (exclude: skill-pat-feature) | Rebased onto skill-pat-feature |
| #3987 | `feat/company-portability-complete` | squash | Secrets export/import; squashed to bypass intra-PR merge commits |
## Common Tasks
### Sync upstream into master
```bash
git fetch upstream
git push origin upstream/master:master --force-with-lease
# assemble-local.yml triggers automatically and rebuilds local
```
### Add a new pending PR to local
Edit `.github/workflows/assemble-local.yml` on `master`:
- Simple PR (clean commits, no merge commits): add to `PR_CHERRY_PICK`
- Complex PR (has merge commits mixed in): add to `PR_SQUASH`
- If the branch was rebased onto another PR branch: use `exclude:base-branch`
### Remove a PR after upstream merges it
Delete its entry from `PR_CHERRY_PICK` or `PR_SQUASH` in `assemble-local.yml` on `master`.
### Submit a new PR to upstream
Branch from `upstream/master` (not from `local` or `master`):
```bash
git fetch upstream
git checkout -b feat/my-feature upstream/master
```
### Modify the fork Dockerfile
Edit `.farhoodlabs/Dockerfile` on `master`. Only modify the production stage — keep base/deps/build stages identical to upstream so diffs are minimal and upstream changes apply cleanly.
## Deployment
Paperclip runs in Kubernetes, not locally. Use `kubectl` to access it. The production image is `git.farh.net/farhoodlabs/paperclip:latest`.
## Key Files
- `.github/workflows/assemble-local.yml` — assembles `local` branch; edit this to manage pending PRs
- `.farhoodlabs/` — fork overlay; all fork-specific files live here on `master`
- `server/package.json` — has an adapter-utils workspace vs canary hack that needs fixing eventually
+3
View File
@@ -22,8 +22,10 @@ COPY packages/shared/package.json packages/shared/
COPY packages/db/package.json packages/db/
COPY packages/adapter-utils/package.json packages/adapter-utils/
COPY packages/mcp-server/package.json packages/mcp-server/
COPY packages/adapters/acpx-local/package.json packages/adapters/acpx-local/
COPY packages/adapters/claude-local/package.json packages/adapters/claude-local/
COPY packages/adapters/codex-local/package.json packages/adapters/codex-local/
COPY packages/adapters/cursor-cloud/package.json packages/adapters/cursor-cloud/
COPY packages/adapters/cursor-local/package.json packages/adapters/cursor-local/
COPY packages/adapters/gemini-local/package.json packages/adapters/gemini-local/
COPY packages/adapters/openclaw-gateway/package.json packages/adapters/openclaw-gateway/
@@ -32,6 +34,7 @@ COPY packages/adapters/pi-local/package.json packages/adapters/pi-local/
COPY packages/plugins/sdk/package.json packages/plugins/sdk/
COPY --parents packages/plugins/sandbox-providers/./*/package.json packages/plugins/sandbox-providers/
COPY packages/plugins/paperclip-plugin-fake-sandbox/package.json packages/plugins/paperclip-plugin-fake-sandbox/
COPY packages/plugins/plugin-llm-wiki/package.json packages/plugins/plugin-llm-wiki/
COPY patches/ patches/
RUN pnpm install --frozen-lockfile
+3 -1
View File
@@ -37,8 +37,10 @@
},
"dependencies": {
"@clack/prompts": "^0.10.0",
"@paperclipai/adapter-acpx-local": "workspace:*",
"@paperclipai/adapter-claude-local": "workspace:*",
"@paperclipai/adapter-codex-local": "workspace:*",
"@paperclipai/adapter-cursor-cloud": "workspace:*",
"@paperclipai/adapter-cursor-local": "workspace:*",
"@paperclipai/adapter-gemini-local": "workspace:*",
"@paperclipai/adapter-opencode-local": "workspace:*",
@@ -48,7 +50,7 @@
"@paperclipai/db": "workspace:*",
"@paperclipai/server": "workspace:*",
"@paperclipai/shared": "workspace:*",
"drizzle-orm": "0.38.4",
"drizzle-orm": "0.45.2",
"dotenv": "^17.0.1",
"commander": "^13.1.0",
"embedded-postgres": "^18.1.0-beta.16",
+2
View File
@@ -244,6 +244,7 @@ describe("renderCompanyImportPreview", () => {
billingCode: null,
executionWorkspaceSettings: null,
assigneeAdapterOverrides: null,
comments: [],
metadata: null,
},
],
@@ -460,6 +461,7 @@ describe("import selection catalog", () => {
billingCode: null,
executionWorkspaceSettings: null,
assigneeAdapterOverrides: null,
comments: [],
metadata: null,
},
],
+6 -4
View File
@@ -1,3 +1,4 @@
import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it } from "vitest";
@@ -16,13 +17,14 @@ describe("home path resolution", () => {
});
it("defaults to ~/.paperclip and default instance", () => {
delete process.env.PAPERCLIP_HOME;
const home = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-home-paths-"));
process.env.PAPERCLIP_HOME = home;
delete process.env.PAPERCLIP_INSTANCE_ID;
const paths = describeLocalInstancePaths();
expect(paths.homeDir).toBe(path.resolve(os.homedir(), ".paperclip"));
expect(paths.homeDir).toBe(home);
expect(paths.instanceId).toBe("default");
expect(paths.configPath).toBe(path.resolve(os.homedir(), ".paperclip", "instances", "default", "config.json"));
expect(paths.configPath).toBe(path.resolve(home, "instances", "default", "config.json"));
});
it("supports PAPERCLIP_HOME and explicit instance ids", () => {
@@ -34,7 +36,7 @@ describe("home path resolution", () => {
});
it("rejects invalid instance ids", () => {
expect(() => resolvePaperclipInstanceId("bad/id")).toThrow(/Invalid instance id/);
expect(() => resolvePaperclipInstanceId("bad/id")).toThrow(/Invalid PAPERCLIP_INSTANCE_ID/);
});
it("expands ~ prefixes", () => {
+30
View File
@@ -6,6 +6,7 @@ import { onboard } from "../commands/onboard.js";
import type { PaperclipConfig } from "../config/schema.js";
const ORIGINAL_ENV = { ...process.env };
const ORIGINAL_CWD = process.cwd();
function createExistingConfigFixture() {
const root = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-onboard-"));
@@ -85,10 +86,18 @@ describe("onboard", () => {
delete process.env.PAPERCLIP_AGENT_JWT_SECRET;
delete process.env.PAPERCLIP_SECRETS_MASTER_KEY;
delete process.env.PAPERCLIP_SECRETS_MASTER_KEY_FILE;
delete process.env.PAPERCLIP_HOME;
delete process.env.PAPERCLIP_CONFIG;
delete process.env.PAPERCLIP_INSTANCE_ID;
delete process.env.PAPERCLIP_BIND;
delete process.env.PAPERCLIP_BIND_HOST;
delete process.env.PAPERCLIP_TAILNET_BIND_HOST;
delete process.env.HOST;
});
afterEach(() => {
process.env = { ...ORIGINAL_ENV };
process.chdir(ORIGINAL_CWD);
});
it("preserves an existing config when rerun without flags", async () => {
@@ -125,6 +134,27 @@ describe("onboard", () => {
expect(raw.server.host).toBe("127.0.0.1");
});
it("creates instance-root config and data paths for a fresh PAPERCLIP_HOME", async () => {
const home = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-onboard-home-"));
const cwd = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-onboard-cwd-"));
process.chdir(cwd);
process.env.PAPERCLIP_HOME = home;
await onboard({ yes: true, invokedByRun: true });
const instanceRoot = path.join(home, "instances", "default");
const configPath = path.join(instanceRoot, "config.json");
const raw = JSON.parse(fs.readFileSync(configPath, "utf8")) as PaperclipConfig;
expect(raw.database.embeddedPostgresDataDir).toBe(path.join(instanceRoot, "db"));
expect(raw.database.backup.dir).toBe(path.join(instanceRoot, "data", "backups"));
expect(raw.logging.logDir).toBe(path.join(instanceRoot, "logs"));
expect(raw.storage.localDisk.baseDir).toBe(path.join(instanceRoot, "data", "storage"));
expect(raw.secrets.localEncrypted.keyFilePath).toBe(path.join(instanceRoot, "secrets", "master.key"));
expect(fs.existsSync(path.join(instanceRoot, ".env"))).toBe(true);
expect(fs.existsSync(path.join(instanceRoot, "secrets", "master.key"))).toBe(true);
});
it("supports authenticated/private quickstart bind presets", async () => {
const configPath = createFreshConfigPath();
process.env.PAPERCLIP_TAILNET_BIND_HOST = "100.64.0.8";
+164
View File
@@ -0,0 +1,164 @@
import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import { Command } from "commander";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
const mocks = vi.hoisted(() => ({
scaffoldPluginProject: vi.fn((options: { outputDir: string }) => options.outputDir),
}));
vi.mock("../../../packages/plugins/create-paperclip-plugin/src/index.js", async () => {
const actual =
await vi.importActual<typeof import("../../../packages/plugins/create-paperclip-plugin/src/index.js")>(
"../../../packages/plugins/create-paperclip-plugin/src/index.js",
);
return {
...actual,
scaffoldPluginProject: mocks.scaffoldPluginProject,
};
});
import {
buildPluginInstallRequest,
buildPluginInitNextCommands,
buildPluginInitScaffoldOptions,
registerPluginCommands,
} from "../commands/client/plugin.js";
const tempDirs: string[] = [];
function makeTempDir(): string {
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-cli-plugin-"));
tempDirs.push(dir);
return dir;
}
afterEach(() => {
while (tempDirs.length > 0) {
const dir = tempDirs.pop();
if (dir) fs.rmSync(dir, { recursive: true, force: true });
}
});
describe("plugin init", () => {
beforeEach(() => {
mocks.scaffoldPluginProject.mockClear();
});
it("maps package name and flags to scaffolder options", () => {
const cwd = path.resolve("/tmp/paperclip-cli-test");
const options = buildPluginInitScaffoldOptions(
"@acme/plugin-linear",
{
output: "plugins",
template: "connector",
category: "automation",
displayName: "Linear Bridge",
description: "Syncs Linear issues",
author: "Acme",
sdkPath: "../paperclip/packages/plugins/sdk",
},
cwd,
);
expect(options).toEqual({
pluginName: "@acme/plugin-linear",
outputDir: path.resolve(cwd, "plugins", "plugin-linear"),
template: "connector",
category: "automation",
displayName: "Linear Bridge",
description: "Syncs Linear issues",
author: "Acme",
sdkPath: "../paperclip/packages/plugins/sdk",
});
});
it("builds exact next commands using the scaffold path", () => {
expect(buildPluginInitNextCommands("/tmp/acme plugin")).toEqual([
"cd '/tmp/acme plugin'",
"pnpm install",
"pnpm dev",
"paperclipai plugin install '/tmp/acme plugin'",
]);
});
it("registers the CLI wrapper and invokes the existing scaffolder", async () => {
const program = new Command();
program.exitOverride();
program.configureOutput({ writeOut: () => {}, writeErr: () => {} });
registerPluginCommands(program);
await program.parseAsync(
[
"plugin",
"init",
"demo-plugin",
"--output",
"/tmp/paperclip-init-output",
"--template",
"workspace",
"--category",
"workspace",
"--display-name",
"Demo Plugin",
"--description",
"Demo description",
"--author",
"Paperclip",
"--sdk-path",
"/repo/packages/plugins/sdk",
],
{ from: "user" },
);
expect(mocks.scaffoldPluginProject).toHaveBeenCalledTimes(1);
expect(mocks.scaffoldPluginProject).toHaveBeenCalledWith({
pluginName: "demo-plugin",
outputDir: path.resolve("/tmp/paperclip-init-output", "demo-plugin"),
template: "workspace",
category: "workspace",
displayName: "Demo Plugin",
description: "Demo description",
author: "Paperclip",
sdkPath: "/repo/packages/plugins/sdk",
});
});
});
describe("plugin install", () => {
it("resolves an existing relative local path to an absolute local install request", () => {
const cwd = makeTempDir();
const pluginDir = path.join(cwd, "demo-plugin");
fs.mkdirSync(pluginDir);
expect(buildPluginInstallRequest("demo-plugin", {}, { cwd })).toEqual({
packageName: pluginDir,
version: undefined,
isLocalPath: true,
});
});
it("keeps an absolute local path absolute and marks it as local", () => {
const pluginDir = path.join(makeTempDir(), "demo-plugin");
fs.mkdirSync(pluginDir);
expect(buildPluginInstallRequest(pluginDir, {}, { cwd: "/" })).toEqual({
packageName: pluginDir,
version: undefined,
isLocalPath: true,
});
});
it("preserves npm package installs when no local path exists", () => {
expect(
buildPluginInstallRequest("@acme/plugin-linear", { version: "1.2.3" }, {
cwd: makeTempDir(),
}),
).toEqual({
packageName: "@acme/plugin-linear",
version: "1.2.3",
isLocalPath: false,
});
});
});
+257
View File
@@ -0,0 +1,257 @@
import { afterEach, beforeEach, describe, expect, it } from "vitest";
import type { Agent, CompanySecret } from "@paperclipai/shared";
import type { PaperclipConfig } from "../config/schema.js";
import { secretsCheck } from "../checks/secrets-check.js";
import {
buildInlineMigrationSecretName,
buildMigratedAgentEnv,
collectInlineSecretMigrationCandidates,
parseSecretsInclude,
toPlainEnvValue,
} from "../commands/client/secrets.js";
function agent(partial: Partial<Agent>): Agent {
return {
id: "agent-12345678",
companyId: "company-1",
name: "Coder",
urlKey: "coder",
role: "engineer",
title: null,
icon: null,
status: "idle",
reportsTo: null,
capabilities: null,
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: {},
budgetMonthlyCents: 0,
spentMonthlyCents: 0,
pauseReason: null,
pausedAt: null,
permissions: {
canCreateAgents: false,
},
lastHeartbeatAt: null,
metadata: null,
createdAt: new Date("2026-04-26T00:00:00.000Z"),
updatedAt: new Date("2026-04-26T00:00:00.000Z"),
...partial,
};
}
function secret(partial: Partial<CompanySecret>): CompanySecret {
return {
id: "secret-1",
companyId: "company-1",
key: "agent_agent-12_anthropic_api_key",
name: "agent_agent-12_anthropic_api_key",
provider: "local_encrypted",
status: "active",
managedMode: "paperclip_managed",
externalRef: null,
providerConfigId: null,
providerMetadata: null,
latestVersion: 1,
description: null,
lastResolvedAt: null,
lastRotatedAt: null,
deletedAt: null,
createdByAgentId: null,
createdByUserId: null,
createdAt: new Date("2026-04-26T00:00:00.000Z"),
updatedAt: new Date("2026-04-26T00:00:00.000Z"),
...partial,
};
}
function configWithSecretsProvider(provider: PaperclipConfig["secrets"]["provider"]): PaperclipConfig {
return {
$meta: {
version: 1,
updatedAt: "2026-05-02T00:00:00.000Z",
source: "configure",
},
database: {
mode: "embedded-postgres",
embeddedPostgresDataDir: "/tmp/paperclip/db",
embeddedPostgresPort: 55432,
backup: {
enabled: true,
intervalMinutes: 60,
retentionDays: 30,
dir: "/tmp/paperclip/backups",
},
},
logging: {
mode: "file",
logDir: "/tmp/paperclip/logs",
},
server: {
deploymentMode: "local_trusted",
exposure: "private",
host: "127.0.0.1",
port: 3100,
allowedHostnames: [],
serveUi: true,
},
auth: {
baseUrlMode: "auto",
disableSignUp: false,
},
telemetry: {
enabled: true,
},
storage: {
provider: "local_disk",
localDisk: {
baseDir: "/tmp/paperclip/storage",
},
s3: {
bucket: "paperclip",
region: "us-east-1",
prefix: "",
forcePathStyle: false,
},
},
secrets: {
provider,
strictMode: true,
localEncrypted: {
keyFilePath: "/tmp/paperclip/secrets/master.key",
},
},
};
}
describe("secrets CLI helpers", () => {
const originalEnv = { ...process.env };
beforeEach(() => {
process.env = { ...originalEnv };
delete process.env.PAPERCLIP_SECRETS_AWS_REGION;
delete process.env.AWS_REGION;
delete process.env.AWS_DEFAULT_REGION;
delete process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID;
delete process.env.PAPERCLIP_SECRETS_AWS_KMS_KEY_ID;
});
afterEach(() => {
process.env = { ...originalEnv };
});
it("parses declaration include filters", () => {
expect(parseSecretsInclude("agents,projects,tasks")).toEqual({
company: false,
agents: true,
projects: true,
issues: true,
skills: false,
});
});
it("detects inline sensitive env values that need migration", () => {
const rows = collectInlineSecretMigrationCandidates(
[
agent({
id: "agent-12345678",
adapterConfig: {
env: {
ANTHROPIC_API_KEY: "sk-ant-test",
GH_TOKEN: {
type: "plain",
value: "ghp-test",
},
PATH: {
type: "plain",
value: "/usr/bin",
},
OPENAI_API_KEY: {
type: "secret_ref",
secretId: "secret-existing",
},
},
},
}),
],
[
secret({
id: "secret-gh-token",
name: buildInlineMigrationSecretName("agent-12345678", "GH_TOKEN"),
}),
],
);
expect(rows).toEqual([
{
agentId: "agent-12345678",
agentName: "Coder",
envKey: "ANTHROPIC_API_KEY",
secretName: "agent_agent-12_anthropic_api_key",
existingSecretId: null,
},
{
agentId: "agent-12345678",
agentName: "Coder",
envKey: "GH_TOKEN",
secretName: "agent_agent-12_gh_token",
existingSecretId: "secret-gh-token",
},
]);
});
it("builds migrated env bindings without preserving secret values", () => {
const next = buildMigratedAgentEnv(
{
ANTHROPIC_API_KEY: "sk-ant-test",
NODE_ENV: {
type: "plain",
value: "development",
},
},
new Map([["ANTHROPIC_API_KEY", "secret-1"]]),
);
expect(next).toEqual({
ANTHROPIC_API_KEY: {
type: "secret_ref",
secretId: "secret-1",
version: "latest",
},
NODE_ENV: {
type: "plain",
value: "development",
},
});
expect(JSON.stringify(next)).not.toContain("sk-ant-test");
});
it("reads only explicit plain env values", () => {
expect(toPlainEnvValue("plain-value")).toBe("plain-value");
expect(toPlainEnvValue({ type: "plain", value: "wrapped" })).toBe("wrapped");
expect(toPlainEnvValue({ type: "secret_ref", secretId: "secret-1" })).toBeNull();
});
it("reports the AWS bootstrap config required by doctor", () => {
const result = secretsCheck(configWithSecretsProvider("aws_secrets_manager"));
expect(result.status).toBe("fail");
expect(result.message).toContain("PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID");
expect(result.repairHint).toContain("AWS SDK default credential chain");
expect(result.repairHint).toContain("Do not store AWS root credentials");
});
it("passes AWS doctor checks when non-secret provider config is present", () => {
process.env.PAPERCLIP_SECRETS_AWS_REGION = "us-east-1";
process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID = "prod-us-1";
process.env.PAPERCLIP_SECRETS_AWS_KMS_KEY_ID =
"arn:aws:kms:us-east-1:123456789012:key/test";
process.env.AWS_PROFILE = "paperclip-prod";
const result = secretsCheck(configWithSecretsProvider("aws_secrets_manager"));
expect(result.status).toBe("pass");
expect(result.message).toContain("prod-us-1");
expect(result.message).toContain("AWS_PROFILE/shared config");
});
});
+14
View File
@@ -1,7 +1,9 @@
import type { CLIAdapterModule } from "@paperclipai/adapter-utils";
import { printAcpxStreamEvent } from "@paperclipai/adapter-acpx-local/cli";
import { printClaudeStreamEvent } from "@paperclipai/adapter-claude-local/cli";
import { printCodexStreamEvent } from "@paperclipai/adapter-codex-local/cli";
import { printCursorStreamEvent } from "@paperclipai/adapter-cursor-local/cli";
import { printCursorCloudEvent } from "@paperclipai/adapter-cursor-cloud/cli";
import { printGeminiStreamEvent } from "@paperclipai/adapter-gemini-local/cli";
import { printOpenCodeStreamEvent } from "@paperclipai/adapter-opencode-local/cli";
import { printPiStreamEvent } from "@paperclipai/adapter-pi-local/cli";
@@ -14,6 +16,11 @@ const claudeLocalCLIAdapter: CLIAdapterModule = {
formatStdoutEvent: printClaudeStreamEvent,
};
const acpxLocalCLIAdapter: CLIAdapterModule = {
type: "acpx_local",
formatStdoutEvent: printAcpxStreamEvent,
};
const codexLocalCLIAdapter: CLIAdapterModule = {
type: "codex_local",
formatStdoutEvent: printCodexStreamEvent,
@@ -34,6 +41,11 @@ const cursorLocalCLIAdapter: CLIAdapterModule = {
formatStdoutEvent: printCursorStreamEvent,
};
const cursorCloudCLIAdapter: CLIAdapterModule = {
type: "cursor_cloud",
formatStdoutEvent: printCursorCloudEvent,
};
const geminiLocalCLIAdapter: CLIAdapterModule = {
type: "gemini_local",
formatStdoutEvent: printGeminiStreamEvent,
@@ -46,11 +58,13 @@ const openclawGatewayCLIAdapter: CLIAdapterModule = {
const adaptersByType = new Map<string, CLIAdapterModule>(
[
acpxLocalCLIAdapter,
claudeLocalCLIAdapter,
codexLocalCLIAdapter,
openCodeLocalCLIAdapter,
piLocalCLIAdapter,
cursorLocalCLIAdapter,
cursorCloudCLIAdapter,
geminiLocalCLIAdapter,
openclawGatewayCLIAdapter,
processCLIAdapter,
+98 -4
View File
@@ -5,6 +5,9 @@ import type { PaperclipConfig } from "../config/schema.js";
import type { CheckResult } from "./index.js";
import { resolveRuntimeLikePath } from "./path-resolver.js";
const AWS_CREDENTIAL_SOURCE_HINT =
"Provide AWS runtime credentials through the AWS SDK default credential chain: IAM role/workload identity, AWS_PROFILE/SSO/shared credentials, web identity, container/instance metadata, or short-lived shell credentials";
function decodeMasterKey(raw: string): Buffer | null {
const trimmed = raw.trim();
if (!trimmed) return null;
@@ -47,13 +50,16 @@ function withStrictModeNote(
export function secretsCheck(config: PaperclipConfig, configPath?: string): CheckResult {
const provider = config.secrets.provider;
if (provider === "aws_secrets_manager") {
return withStrictModeNote(awsSecretsManagerCheck(), config);
}
if (provider !== "local_encrypted") {
return {
name: "Secrets adapter",
status: "fail",
message: `${provider} is configured, but this build only supports local_encrypted`,
message: `${provider} is configured, but this build only supports local_encrypted and aws_secrets_manager`,
canRepair: false,
repairHint: "Run `paperclipai configure --section secrets` and set provider to local_encrypted",
repairHint: "Run `paperclipai configure --section secrets` and choose local_encrypted or aws_secrets_manager",
};
}
@@ -135,12 +141,100 @@ export function secretsCheck(config: PaperclipConfig, configPath?: string): Chec
};
}
const keyMode = fs.statSync(keyFilePath).mode & 0o777;
const permissionWarning =
(keyMode & 0o077) !== 0
? `; key file permissions are ${keyMode.toString(8)} (run chmod 600 ${keyFilePath})`
: "";
return withStrictModeNote(
{
name: "Secrets adapter",
status: "pass",
message: `Local encrypted provider configured with key file ${keyFilePath}`,
status: permissionWarning ? "warn" : "pass",
message: `Local encrypted provider configured with key file ${keyFilePath}${permissionWarning}`,
repairHint: permissionWarning
? "Restrict the local encrypted secrets key file to owner read/write permissions"
: undefined,
},
config,
);
}
function awsSecretsManagerCheck(): CheckResult {
const missingConfig = missingAwsSecretsManagerConfig();
if (missingConfig.length > 0) {
return {
name: "Secrets adapter",
status: "fail",
message: `AWS Secrets Manager provider is missing non-secret config: ${missingConfig.join(", ")}`,
canRepair: false,
repairHint:
`Set ${missingConfig.join(", ")} in the Paperclip server runtime. ${AWS_CREDENTIAL_SOURCE_HINT}. Do not store AWS root credentials or long-lived IAM user keys in Paperclip secrets.`,
};
}
const staticEnvCredentials =
process.env.AWS_ACCESS_KEY_ID?.trim() && process.env.AWS_SECRET_ACCESS_KEY?.trim();
const credentialSource = detectedAwsCredentialSources().join(", ");
const message =
`AWS Secrets Manager provider configured for deployment ${process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID}; ` +
`runtime credentials source: ${credentialSource || "AWS SDK default credential chain"}`;
if (staticEnvCredentials) {
return {
name: "Secrets adapter",
status: "warn",
message,
canRepair: false,
repairHint:
"AWS static environment credentials are visible. Use only short-lived shell credentials locally; prefer IAM role/workload identity for hosted deployments and never store AWS access keys in Paperclip company secrets.",
};
}
return {
name: "Secrets adapter",
status: "pass",
message,
};
}
function missingAwsSecretsManagerConfig(): string[] {
const missing: string[] = [];
if (
!(
process.env.PAPERCLIP_SECRETS_AWS_REGION?.trim() ||
process.env.AWS_REGION?.trim() ||
process.env.AWS_DEFAULT_REGION?.trim()
)
) {
missing.push("PAPERCLIP_SECRETS_AWS_REGION or AWS_REGION/AWS_DEFAULT_REGION");
}
if (!process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID?.trim()) {
missing.push("PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID");
}
if (!process.env.PAPERCLIP_SECRETS_AWS_KMS_KEY_ID?.trim()) {
missing.push("PAPERCLIP_SECRETS_AWS_KMS_KEY_ID");
}
return missing;
}
function detectedAwsCredentialSources(): string[] {
const sources: string[] = [];
if (process.env.AWS_PROFILE?.trim()) sources.push("AWS_PROFILE/shared config");
if (process.env.AWS_ACCESS_KEY_ID?.trim() && process.env.AWS_SECRET_ACCESS_KEY?.trim()) {
sources.push("temporary AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY environment credentials");
}
if (process.env.AWS_WEB_IDENTITY_TOKEN_FILE?.trim() && process.env.AWS_ROLE_ARN?.trim()) {
sources.push("AWS web identity token");
}
if (
process.env.AWS_CONTAINER_CREDENTIALS_RELATIVE_URI?.trim() ||
process.env.AWS_CONTAINER_CREDENTIALS_FULL_URI?.trim()
) {
sources.push("AWS container credentials endpoint");
}
if (process.env.AWS_SHARED_CREDENTIALS_FILE?.trim() || process.env.AWS_CONFIG_FILE?.trim()) {
sources.push("custom AWS shared credentials/config file");
}
return sources;
}
+185 -25
View File
@@ -1,5 +1,11 @@
import path from "node:path";
import { Command } from "commander";
import { existsSync } from "node:fs";
import { Command, Option } from "commander";
import {
scaffoldPluginProject,
shellQuote,
type ScaffoldPluginOptions,
} from "../../../../packages/plugins/create-paperclip-plugin/src/index.js";
import pc from "picocolors";
import {
addCommonClientOptions,
@@ -39,28 +45,101 @@ interface PluginInstallOptions extends BaseClientOptions {
version?: string;
}
interface PluginInstallRequest {
packageName: string;
version?: string;
isLocalPath: boolean;
}
interface PluginUninstallOptions extends BaseClientOptions {
force?: boolean;
}
interface PluginInitOptions extends BaseClientOptions {
output?: string;
template?: ScaffoldPluginOptions["template"];
category?: ScaffoldPluginOptions["category"];
displayName?: string;
description?: string;
author?: string;
sdkPath?: string;
}
interface PluginInitResult {
outputDir: string;
nextCommands: string[];
}
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function expandHomePath(packageArg: string): string {
if (!packageArg.startsWith("~")) return packageArg;
const home = process.env.HOME ?? process.env.USERPROFILE ?? "";
return path.resolve(home, packageArg.slice(1).replace(/^[\\/]/, ""));
}
function hasLocalPathSyntax(packageArg: string): boolean {
return (
path.isAbsolute(packageArg) ||
packageArg.startsWith("./") ||
packageArg.startsWith("../") ||
packageArg.startsWith("~") ||
packageArg.startsWith(".\\") ||
packageArg.startsWith("..\\")
);
}
function isExistingRelativePath(
packageArg: string,
cwd: string,
pathExists: (targetPath: string) => boolean,
): boolean {
if (packageArg.trim() === "") return false;
if (hasLocalPathSyntax(packageArg)) return false;
return pathExists(path.resolve(cwd, packageArg));
}
/**
* Resolve a local path argument to an absolute path so the server can find the
* plugin on disk regardless of where the user ran the CLI.
*/
function resolvePackageArg(packageArg: string, isLocal: boolean): string {
function resolvePackageArg(packageArg: string, isLocal: boolean, cwd = process.cwd()): string {
if (!isLocal) return packageArg;
// Already absolute
if (path.isAbsolute(packageArg)) return packageArg;
// Expand leading ~ to home directory
if (packageArg.startsWith("~")) {
const home = process.env.HOME ?? process.env.USERPROFILE ?? "";
return path.resolve(home, packageArg.slice(1).replace(/^[\\/]/, ""));
if (packageArg.startsWith("~")) return expandHomePath(packageArg);
return path.resolve(cwd, packageArg);
}
export function buildPluginInstallRequest(
packageArg: string,
opts: Pick<PluginInstallOptions, "local" | "version"> = {},
deps: { cwd?: string; existsSync?: (targetPath: string) => boolean } = {},
): PluginInstallRequest {
const cwd = deps.cwd ?? process.cwd();
const pathExists = deps.existsSync ?? existsSync;
const isLocal =
opts.local ||
hasLocalPathSyntax(packageArg) ||
(opts.version ? false : isExistingRelativePath(packageArg, cwd, pathExists));
if (isLocal && opts.version) {
throw new Error("--version is only supported for npm package installs, not local plugin paths.");
}
return path.resolve(process.cwd(), packageArg);
return {
packageName: resolvePackageArg(packageArg, Boolean(isLocal), cwd),
version: opts.version,
isLocalPath: Boolean(isLocal),
};
}
export function renderLocalPluginInstallHint(packagePath: string): string {
return [
pc.dim("Local plugin installs run trusted local code from your machine."),
pc.dim(`Keep ${pc.cyan("pnpm dev")} running in ${packagePath}; Paperclip watches rebuilt dist output and reloads the plugin worker.`),
].join("\n");
}
function formatPlugin(p: PluginRecord): string {
@@ -87,6 +166,58 @@ function formatPlugin(p: PluginRecord): string {
return parts.join(" ");
}
function packageToDirName(pluginName: string): string {
return pluginName.replace(/^@[^/]+\//, "");
}
export function buildPluginInitScaffoldOptions(
packageName: string,
opts: PluginInitOptions,
cwd = process.cwd(),
): ScaffoldPluginOptions {
const outputRoot = path.resolve(cwd, opts.output ?? ".");
const outputDir = path.resolve(outputRoot, packageToDirName(packageName));
return {
pluginName: packageName,
outputDir,
template: opts.template,
category: opts.category,
displayName: opts.displayName,
description: opts.description,
author: opts.author,
sdkPath: opts.sdkPath,
};
}
export function buildPluginInitNextCommands(outputDir: string): string[] {
const quotedOutputDir = shellQuote(outputDir);
return [
`cd ${quotedOutputDir}`,
"pnpm install",
"pnpm dev",
`paperclipai plugin install ${quotedOutputDir}`,
];
}
export function renderPluginInitSuccess(result: PluginInitResult): string {
return [
pc.green(`✓ Created plugin scaffold at ${result.outputDir}`),
"",
"Next commands:",
...result.nextCommands.map((command) => ` ${pc.cyan(command)}`),
].join("\n");
}
export function runPluginInitCommand(packageName: string, opts: PluginInitOptions): PluginInitResult {
const scaffoldOptions = buildPluginInitScaffoldOptions(packageName, opts);
const outputDir = scaffoldPluginProject(scaffoldOptions);
return {
outputDir,
nextCommands: buildPluginInitNextCommands(outputDir),
};
}
// ---------------------------------------------------------------------------
// Command registration
// ---------------------------------------------------------------------------
@@ -94,6 +225,43 @@ function formatPlugin(p: PluginRecord): string {
export function registerPluginCommands(program: Command): void {
const plugin = program.command("plugin").description("Plugin lifecycle management");
// -------------------------------------------------------------------------
// plugin init <package-name>
// -------------------------------------------------------------------------
addCommonClientOptions(
plugin
.command("init <packageName>")
.description("Scaffold a local Paperclip plugin project")
.option("--output <dir>", "Directory to create the plugin folder in")
.addOption(
new Option("--template <template>", "Starter template")
.choices(["default", "connector", "workspace", "environment"])
.default("default"),
)
.addOption(
new Option("--category <category>", "Manifest category")
.choices(["connector", "workspace", "automation", "ui", "environment"]),
)
.option("--display-name <name>", "Manifest display name")
.option("--description <description>", "Manifest description")
.option("--author <author>", "Manifest author")
.option("--sdk-path <path>", "Local @paperclipai/plugin-sdk package path")
.action((packageName: string, opts: PluginInitOptions) => {
try {
const result = runPluginInitCommand(packageName, opts);
if (opts.json) {
printOutput(result, { json: true });
return;
}
console.log(renderPluginInitSuccess(result));
} catch (err) {
handleCommandError(err);
}
}),
);
// -------------------------------------------------------------------------
// plugin list
// -------------------------------------------------------------------------
@@ -147,31 +315,19 @@ export function registerPluginCommands(program: Command): void {
try {
const ctx = resolveCommandContext(opts);
// Auto-detect local paths: starts with . or / or ~ or is an absolute path
const isLocal =
opts.local ||
packageArg.startsWith("./") ||
packageArg.startsWith("../") ||
packageArg.startsWith("/") ||
packageArg.startsWith("~");
const resolvedPackage = resolvePackageArg(packageArg, isLocal);
const installRequest = buildPluginInstallRequest(packageArg, opts);
if (!ctx.json) {
console.log(
pc.dim(
isLocal
? `Installing plugin from local path: ${resolvedPackage}`
: `Installing plugin: ${resolvedPackage}${opts.version ? `@${opts.version}` : ""}`,
installRequest.isLocalPath
? `Installing plugin from local path: ${installRequest.packageName}`
: `Installing plugin: ${installRequest.packageName}${opts.version ? `@${opts.version}` : ""}`,
),
);
}
const installedPlugin = await ctx.api.post<PluginRecord>("/api/plugins/install", {
packageName: resolvedPackage,
version: opts.version,
isLocalPath: isLocal,
});
const installedPlugin = await ctx.api.post<PluginRecord>("/api/plugins/install", installRequest);
if (ctx.json) {
printOutput(installedPlugin, { json: true });
@@ -192,6 +348,10 @@ export function registerPluginCommands(program: Command): void {
if (installedPlugin.lastError) {
console.log(pc.red(` Warning: ${installedPlugin.lastError}`));
}
if (installRequest.isLocalPath) {
console.log(renderLocalPluginInstallHint(installRequest.packageName));
}
} catch (err) {
handleCommandError(err);
}
+501
View File
@@ -0,0 +1,501 @@
import { Command } from "commander";
import pc from "picocolors";
import type {
Agent,
AgentEnvConfig,
CompanyPortabilityEnvInput,
CompanyPortabilityExportPreviewResult,
CompanyPortabilityInclude,
CompanySecret,
EnvBinding,
SecretProvider,
SecretProviderDescriptor,
} from "@paperclipai/shared";
import {
addCommonClientOptions,
formatInlineRecord,
handleCommandError,
printOutput,
resolveCommandContext,
type BaseClientOptions,
} from "./common.js";
interface SecretListOptions extends BaseClientOptions {
companyId?: string;
}
interface SecretDeclarationsOptions extends BaseClientOptions {
companyId?: string;
include?: string;
kind?: "all" | "secret" | "plain";
}
interface SecretCreateOptions extends BaseClientOptions {
companyId?: string;
name?: string;
key?: string;
provider?: SecretProvider;
value?: string;
valueEnv?: string;
description?: string;
}
interface SecretLinkOptions extends BaseClientOptions {
companyId?: string;
name?: string;
key?: string;
provider?: SecretProvider;
externalRef?: string;
providerVersionRef?: string;
description?: string;
}
interface SecretDoctorOptions extends BaseClientOptions {
companyId?: string;
}
interface SecretMigrateInlineEnvOptions extends BaseClientOptions {
companyId?: string;
apply?: boolean;
}
interface SecretProviderHealth {
provider: SecretProvider;
status: "ok" | "warn" | "error";
message: string;
warnings?: string[];
backupGuidance?: string[];
details?: Record<string, unknown>;
}
interface SecretProviderHealthResponse {
providers: SecretProviderHealth[];
}
export interface InlineSecretMigrationCandidate {
agentId: string;
agentName: string;
envKey: string;
secretName: string;
existingSecretId: string | null;
}
const SENSITIVE_ENV_KEY_RE =
/(^token$|[-_]?token$|api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i;
const DEFAULT_DECLARATION_INCLUDE: CompanyPortabilityInclude = {
company: true,
agents: true,
projects: true,
issues: false,
skills: false,
};
export function parseSecretsInclude(input: string | undefined): CompanyPortabilityInclude {
if (!input?.trim()) return { ...DEFAULT_DECLARATION_INCLUDE };
const values = input.split(",").map((part) => part.trim().toLowerCase()).filter(Boolean);
const include = {
company: values.includes("company"),
agents: values.includes("agents"),
projects: values.includes("projects"),
issues: values.includes("issues") || values.includes("tasks"),
skills: values.includes("skills"),
};
if (!Object.values(include).some(Boolean)) {
throw new Error("Invalid --include value. Use one or more of: company,agents,projects,issues,tasks,skills");
}
return include;
}
export function isSensitiveEnvKey(key: string): boolean {
return SENSITIVE_ENV_KEY_RE.test(key);
}
export function toPlainEnvValue(binding: unknown): string | null {
if (typeof binding === "string") return binding;
if (typeof binding !== "object" || binding === null || Array.isArray(binding)) return null;
const record = binding as Record<string, unknown>;
if (record.type === "plain" && typeof record.value === "string") return record.value;
return null;
}
export function buildInlineMigrationSecretName(agentId: string, key: string): string {
return `agent_${agentId.slice(0, 8)}_${key.toLowerCase()}`;
}
export function collectInlineSecretMigrationCandidates(
agents: Agent[],
existingSecrets: CompanySecret[],
): InlineSecretMigrationCandidate[] {
const secretByName = new Map(existingSecrets.map((secret) => [secret.name, secret]));
const candidates: InlineSecretMigrationCandidate[] = [];
for (const agent of agents) {
const env = asRecord(agent.adapterConfig.env);
if (!env) continue;
for (const [envKey, binding] of Object.entries(env)) {
if (!isSensitiveEnvKey(envKey)) continue;
const plain = toPlainEnvValue(binding);
if (plain === null || plain.trim().length === 0) continue;
const secretName = buildInlineMigrationSecretName(agent.id, envKey);
candidates.push({
agentId: agent.id,
agentName: agent.name,
envKey,
secretName,
existingSecretId: secretByName.get(secretName)?.id ?? null,
});
}
}
return candidates;
}
export function buildMigratedAgentEnv(
env: Record<string, unknown>,
secretIdByEnvKey: Map<string, string>,
): AgentEnvConfig {
const next: AgentEnvConfig = { ...(env as Record<string, EnvBinding>) };
for (const [envKey, secretId] of secretIdByEnvKey) {
next[envKey] = {
type: "secret_ref",
secretId,
version: "latest",
};
}
return next;
}
function asRecord(value: unknown): Record<string, unknown> | null {
if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
return value as Record<string, unknown>;
}
function readValueFromOptions(opts: SecretCreateOptions): string {
if (opts.value !== undefined && opts.valueEnv !== undefined) {
throw new Error("Use only one of --value or --value-env.");
}
if (opts.valueEnv !== undefined) {
const value = process.env[opts.valueEnv];
if (!value) throw new Error(`Environment variable ${opts.valueEnv} is empty or unset.`);
return value;
}
if (opts.value !== undefined) return opts.value;
throw new Error("Secret value is required. Pass --value or --value-env.");
}
function renderDeclaration(input: CompanyPortabilityEnvInput): Record<string, unknown> {
const scope = input.agentSlug
? `agent:${input.agentSlug}`
: input.projectSlug
? `project:${input.projectSlug}`
: "company";
return {
key: input.key,
scope,
kind: input.kind,
requirement: input.requirement,
portability: input.portability,
hasDefault: input.defaultValue !== null && input.defaultValue.length > 0,
description: input.description,
};
}
function renderSecret(secret: CompanySecret): Record<string, unknown> {
return {
id: secret.id,
name: secret.name,
key: secret.key,
provider: secret.provider,
status: secret.status,
managedMode: secret.managedMode,
latestVersion: secret.latestVersion,
externalRef: secret.externalRef ? "yes" : "no",
};
}
function printProviderHealth(rows: SecretProviderHealth[], json: boolean): void {
if (json) {
printOutput(rows, { json: true });
return;
}
if (rows.length === 0) {
printOutput([], { json: false });
return;
}
for (const row of rows) {
console.log(
formatInlineRecord({
id: row.provider,
status: row.status,
message: row.message,
}),
);
for (const warning of row.warnings ?? []) {
console.log(pc.yellow(`warning=${warning}`));
}
const missingConfig = asStringArray(row.details?.missingConfig);
if (missingConfig.length > 0) {
console.log(pc.dim(`missingConfig=${missingConfig.join(",")}`));
}
const credentialSource = typeof row.details?.credentialSource === "string"
? row.details.credentialSource
: null;
if (credentialSource) {
console.log(pc.dim(`credentialSource=${credentialSource}`));
}
const detectedCredentialSources = asStringArray(row.details?.detectedCredentialSources);
if (detectedCredentialSources.length > 0) {
console.log(pc.dim(`detectedCredentialSources=${detectedCredentialSources.join(",")}`));
}
for (const guidance of row.backupGuidance ?? []) {
console.log(pc.dim(`backup=${guidance}`));
}
}
}
function asStringArray(value: unknown): string[] {
return Array.isArray(value)
? value.filter((entry): entry is string => typeof entry === "string" && entry.length > 0)
: [];
}
async function migrateInlineEnv(opts: SecretMigrateInlineEnvOptions): Promise<void> {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const companyId = ctx.companyId!;
const agents = (await ctx.api.get<Agent[]>(`/api/companies/${companyId}/agents`)) ?? [];
const secrets = (await ctx.api.get<CompanySecret[]>(`/api/companies/${companyId}/secrets`)) ?? [];
const candidates = collectInlineSecretMigrationCandidates(agents, secrets);
if (!opts.apply) {
printOutput(
{
apply: false,
agentsToUpdate: new Set(candidates.map((candidate) => candidate.agentId)).size,
secretsToCreate: candidates.filter((candidate) => !candidate.existingSecretId).length,
secretsToRotate: candidates.filter((candidate) => candidate.existingSecretId).length,
candidates,
},
{ json: ctx.json },
);
if (!ctx.json) {
console.log(pc.dim("Re-run with --apply to create/rotate secrets and update agent env bindings."));
}
return;
}
const createdOrRotated = new Map<string, string>();
let createdSecrets = 0;
let rotatedSecrets = 0;
for (const candidate of candidates) {
const agent = agents.find((row) => row.id === candidate.agentId);
const env = asRecord(agent?.adapterConfig.env);
const value = env ? toPlainEnvValue(env[candidate.envKey]) : null;
if (!value) continue;
if (candidate.existingSecretId) {
await ctx.api.post(`/api/secrets/${candidate.existingSecretId}/rotate`, { value });
createdOrRotated.set(`${candidate.agentId}:${candidate.envKey}`, candidate.existingSecretId);
rotatedSecrets += 1;
continue;
}
const created = await ctx.api.post<CompanySecret>(`/api/companies/${companyId}/secrets`, {
name: candidate.secretName,
provider: "local_encrypted",
value,
description: `Migrated from agent ${candidate.agentId} env ${candidate.envKey}`,
});
if (!created) throw new Error(`Secret create returned no data for ${candidate.secretName}`);
createdOrRotated.set(`${candidate.agentId}:${candidate.envKey}`, created.id);
createdSecrets += 1;
}
let updatedAgents = 0;
for (const agent of agents) {
const env = asRecord(agent.adapterConfig.env);
if (!env) continue;
const secretIdByEnvKey = new Map<string, string>();
for (const [key] of Object.entries(env)) {
const secretId = createdOrRotated.get(`${agent.id}:${key}`);
if (secretId) secretIdByEnvKey.set(key, secretId);
}
if (secretIdByEnvKey.size === 0) continue;
const adapterConfig = {
...agent.adapterConfig,
env: buildMigratedAgentEnv(env, secretIdByEnvKey),
};
await ctx.api.patch(`/api/agents/${agent.id}`, {
adapterConfig,
replaceAdapterConfig: true,
});
updatedAgents += 1;
}
printOutput(
{
apply: true,
updatedAgents,
createdSecrets,
rotatedSecrets,
},
{ json: ctx.json },
);
}
export function registerSecretCommands(program: Command): void {
const secrets = program.command("secrets").description("Secret declaration and provider operations");
addCommonClientOptions(
secrets
.command("list")
.description("List secret metadata for a company")
.requiredOption("-C, --company-id <id>", "Company ID")
.action(async (opts: SecretListOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const rows = (await ctx.api.get<CompanySecret[]>(`/api/companies/${ctx.companyId}/secrets`)) ?? [];
printOutput(ctx.json ? rows : rows.map(renderSecret), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("declarations")
.description("List portable env declarations emitted by company export")
.requiredOption("-C, --company-id <id>", "Company ID")
.option("--include <values>", "Comma-separated include set: company,agents,projects,issues,tasks,skills", "company,agents,projects")
.option("--kind <kind>", "Filter declarations: all | secret | plain", "all")
.action(async (opts: SecretDeclarationsOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const kind = opts.kind ?? "all";
if (!["all", "secret", "plain"].includes(kind)) {
throw new Error("Invalid --kind value. Use: all, secret, plain");
}
const preview = await ctx.api.post<CompanyPortabilityExportPreviewResult>(
`/api/companies/${ctx.companyId}/exports/preview`,
{ include: parseSecretsInclude(opts.include) },
);
const declarations = (preview?.manifest.envInputs ?? [])
.filter((entry) => kind === "all" || entry.kind === kind);
printOutput(ctx.json ? declarations : declarations.map(renderDeclaration), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("create")
.description("Create a Paperclip-managed secret")
.requiredOption("-C, --company-id <id>", "Company ID")
.requiredOption("--name <name>", "Secret display name")
.option("--key <key>", "Portable secret key")
.option("--provider <provider>", "Secret provider id")
.option("--value <value>", "Secret value")
.option("--value-env <name>", "Read secret value from an environment variable")
.option("--description <text>", "Description")
.action(async (opts: SecretCreateOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const created = await ctx.api.post<CompanySecret>(`/api/companies/${ctx.companyId}/secrets`, {
name: opts.name,
key: opts.key,
provider: opts.provider,
value: readValueFromOptions(opts),
description: opts.description,
});
printOutput(ctx.json ? created : renderSecret(created!), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("link")
.description("Link an external provider-owned secret without storing its value in Paperclip")
.requiredOption("-C, --company-id <id>", "Company ID")
.requiredOption("--name <name>", "Secret display name")
.requiredOption("--provider <provider>", "Secret provider id")
.requiredOption("--external-ref <ref>", "Provider secret ARN/name/path/reference")
.option("--key <key>", "Portable secret key")
.option("--provider-version-ref <ref>", "Provider version id or label")
.option("--description <text>", "Description")
.action(async (opts: SecretLinkOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const created = await ctx.api.post<CompanySecret>(`/api/companies/${ctx.companyId}/secrets`, {
name: opts.name,
key: opts.key,
provider: opts.provider,
managedMode: "external_reference",
externalRef: opts.externalRef,
providerVersionRef: opts.providerVersionRef,
description: opts.description,
});
printOutput(ctx.json ? created : renderSecret(created!), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("doctor")
.description("Run secret provider health checks through the Paperclip API")
.requiredOption("-C, --company-id <id>", "Company ID")
.action(async (opts: SecretDoctorOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const health = await ctx.api.get<SecretProviderHealthResponse>(
`/api/companies/${ctx.companyId}/secret-providers/health`,
);
printProviderHealth(health?.providers ?? [], ctx.json);
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("providers")
.description("List configured secret provider descriptors")
.requiredOption("-C, --company-id <id>", "Company ID")
.action(async (opts: SecretDoctorOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const rows = (await ctx.api.get<SecretProviderDescriptor[]>(
`/api/companies/${ctx.companyId}/secret-providers`,
)) ?? [];
printOutput(rows, { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("migrate-inline-env")
.description("Migrate inline sensitive agent env values into secret references")
.requiredOption("-C, --company-id <id>", "Company ID")
.option("--apply", "Persist changes; default is a dry run", false)
.action(async (opts: SecretMigrateInlineEnvOptions) => {
try {
await migrateInlineEnv(opts);
} catch (err) {
handleCommandError(err);
}
}),
);
}
+26 -33
View File
@@ -1,32 +1,31 @@
import os from "node:os";
import path from "node:path";
import {
expandHomePrefix,
resolveDefaultBackupDir as resolveSharedDefaultBackupDir,
resolveDefaultEmbeddedPostgresDir as resolveSharedDefaultEmbeddedPostgresDir,
resolveDefaultLogsDir as resolveSharedDefaultLogsDir,
resolveDefaultSecretsKeyFilePath as resolveSharedDefaultSecretsKeyFilePath,
resolveDefaultStorageDir as resolveSharedDefaultStorageDir,
resolveHomeAwarePath,
resolvePaperclipConfigPathForInstance,
resolvePaperclipHomeDir,
resolvePaperclipInstanceId,
resolvePaperclipInstanceRoot as resolveSharedPaperclipInstanceRoot,
} from "@paperclipai/shared/home-paths";
const DEFAULT_INSTANCE_ID = "default";
const INSTANCE_ID_RE = /^[a-zA-Z0-9_-]+$/;
export function resolvePaperclipHomeDir(): string {
const envHome = process.env.PAPERCLIP_HOME?.trim();
if (envHome) return path.resolve(expandHomePrefix(envHome));
return path.resolve(os.homedir(), ".paperclip");
}
export function resolvePaperclipInstanceId(override?: string): string {
const raw = override?.trim() || process.env.PAPERCLIP_INSTANCE_ID?.trim() || DEFAULT_INSTANCE_ID;
if (!INSTANCE_ID_RE.test(raw)) {
throw new Error(
`Invalid instance id '${raw}'. Allowed characters: letters, numbers, '_' and '-'.`,
);
}
return raw;
}
export {
expandHomePrefix,
resolveHomeAwarePath,
resolvePaperclipHomeDir,
resolvePaperclipInstanceId,
};
export function resolvePaperclipInstanceRoot(instanceId?: string): string {
const id = resolvePaperclipInstanceId(instanceId);
return path.resolve(resolvePaperclipHomeDir(), "instances", id);
return resolveSharedPaperclipInstanceRoot({ instanceId });
}
export function resolveDefaultConfigPath(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "config.json");
return resolvePaperclipConfigPathForInstance({ instanceId });
}
export function resolveDefaultContextPath(): string {
@@ -38,29 +37,23 @@ export function resolveDefaultCliAuthPath(): string {
}
export function resolveDefaultEmbeddedPostgresDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "db");
return resolveSharedDefaultEmbeddedPostgresDir({ instanceId });
}
export function resolveDefaultLogsDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "logs");
return resolveSharedDefaultLogsDir({ instanceId });
}
export function resolveDefaultSecretsKeyFilePath(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "secrets", "master.key");
return resolveSharedDefaultSecretsKeyFilePath({ instanceId });
}
export function resolveDefaultStorageDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "data", "storage");
return resolveSharedDefaultStorageDir({ instanceId });
}
export function resolveDefaultBackupDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "data", "backups");
}
export function expandHomePrefix(value: string): string {
if (value === "~") return os.homedir();
if (value.startsWith("~/")) return path.resolve(os.homedir(), value.slice(2));
return value;
return resolveSharedDefaultBackupDir({ instanceId });
}
export function describeLocalInstancePaths(instanceId?: string) {
+2
View File
@@ -18,6 +18,7 @@ import { registerActivityCommands } from "./commands/client/activity.js";
import { registerDashboardCommands } from "./commands/client/dashboard.js";
import { registerRoutineCommands } from "./commands/routines.js";
import { registerFeedbackCommands } from "./commands/client/feedback.js";
import { registerSecretCommands } from "./commands/client/secrets.js";
import { applyDataDirOverride, type DataDirOptionLike } from "./config/data-dir.js";
import { loadPaperclipEnvFile } from "./config/env.js";
import { initTelemetryFromConfigFile, flushTelemetry } from "./telemetry.js";
@@ -147,6 +148,7 @@ registerActivityCommands(program);
registerDashboardCommands(program);
registerRoutineCommands(program);
registerFeedbackCommands(program);
registerSecretCommands(program);
registerWorktreeCommands(program);
registerEnvLabCommands(program);
registerPluginCommands(program);
+4 -2
View File
@@ -32,7 +32,7 @@ export async function promptSecrets(current?: SecretsConfig): Promise<SecretsCon
{
value: "aws_secrets_manager" as const,
label: "AWS Secrets Manager",
hint: "requires external adapter integration",
hint: "requires runtime AWS credentials and provider env config",
},
{
value: "gcp_secret_manager" as const,
@@ -84,7 +84,9 @@ export async function promptSecrets(current?: SecretsConfig): Promise<SecretsCon
if (provider !== "local_encrypted") {
p.note(
`${provider} is not fully wired in this build yet. Keep local_encrypted unless you are actively implementing that adapter.`,
provider === "aws_secrets_manager"
? "AWS credentials must come from the Paperclip server runtime (IAM role/workload identity, AWS_PROFILE/SSO/shared credentials, or short-lived shell env), not from Paperclip company secrets."
: `${provider} is not fully wired in this build yet. Keep local_encrypted unless you are actively implementing that adapter.`,
"Heads up",
);
}
+1 -1
View File
@@ -4,5 +4,5 @@
"outDir": "dist",
"rootDir": ".."
},
"include": ["src", "../packages/shared/src"]
"include": ["src", "../packages/shared/src", "../packages/plugins/create-paperclip-plugin/src"]
}
+48 -1
View File
@@ -143,6 +143,32 @@ pnpm paperclipai agent local-cli codexcoder --company-id <company-id>
pnpm paperclipai agent local-cli claudecoder --company-id <company-id>
```
## Secrets Commands
```sh
pnpm paperclipai secrets list --company-id <company-id>
pnpm paperclipai secrets declarations --company-id <company-id> [--include agents,projects] [--kind secret]
pnpm paperclipai secrets create --company-id <company-id> --name anthropic-api-key --value-env ANTHROPIC_API_KEY
pnpm paperclipai secrets link --company-id <company-id> --name prod-stripe-key --provider aws_secrets_manager --external-ref <provider-ref>
pnpm paperclipai secrets doctor --company-id <company-id>
pnpm paperclipai secrets migrate-inline-env --company-id <company-id> [--apply]
```
Secret listing and declarations never print secret values. `create` accepts
`--value-env` so shell history does not capture the value. `link` records
provider-owned references without copying the secret value into Paperclip.
For AWS-backed secrets, `secrets doctor` reports missing non-secret provider
env and the expected AWS SDK runtime credential source; do not store AWS
bootstrap credentials in Paperclip secrets.
Per-company provider vaults (multiple vault instances per provider, default
vault selection, coming-soon GCP/Vault) are configured from the board UI under
`Company Settings → Secrets → Provider vaults` or through
`/api/companies/{companyId}/secret-provider-configs`. There is no CLI surface
for vault management today. See the
[secrets deploy guide](../docs/deploy/secrets.md#provider-vaults) and
[API reference](../docs/api/secrets.md#provider-vaults) for the contract.
## Approval Commands
```sh
@@ -178,7 +204,28 @@ pnpm paperclipai heartbeat run --agent-id <agent-id> [--api-base http://localhos
## Local Storage Defaults
Default local instance root is `~/.paperclip/instances/default`:
Local Paperclip data lives under the selected instance root. `PAPERCLIP_HOME` chooses the home directory and `PAPERCLIP_INSTANCE_ID` chooses the instance.
```text
~/.paperclip/ # PAPERCLIP_HOME
└── instances/
└── default/ # instance root (PAPERCLIP_INSTANCE_ID)
├── config.json # runtime config
├── .env # instance env file
├── db/ # embedded PostgreSQL data
├── data/
│ ├── storage/ # local_disk uploads
│ └── backups/ # automatic DB backups
├── logs/
├── secrets/
│ └── master.key # local_encrypted master key
├── workspaces/ # default agent workspaces
├── projects/ # project execution workspaces
├── companies/ # per-company adapter homes (e.g. codex-home)
└── codex-home/ # per-instance codex home (when not company-scoped)
```
Default paths for the canonical install:
- config: `~/.paperclip/instances/default/config.json`
- embedded db: `~/.paperclip/instances/default/db`
+16 -1
View File
@@ -149,7 +149,15 @@ The plugin runtime tracks plugin-owned database namespaces and migrations in `pl
## Backups
Paperclip supports automatic and manual database backups. See `doc/DEVELOPING.md` for the current `paperclipai db:backup` / `pnpm db:backup` commands and backup retention configuration.
Paperclip supports automatic and manual logical database backups. These dumps include
non-system database schemas such as `public`, the Drizzle migration journal, and
plugin-owned database schemas. See `doc/DEVELOPING.md` for the current
`paperclipai db:backup` / `pnpm db:backup` commands and backup retention
configuration.
Database backups do not include non-database instance files such as local-disk
uploads, workspace files, or the local encrypted secrets master key. Back those paths
up separately when you need full instance disaster recovery.
## Secret storage
@@ -163,6 +171,8 @@ For local/default installs, the active provider is `local_encrypted`:
- Secret material is encrypted at rest with a local master key.
- Default key file: `~/.paperclip/instances/default/secrets/master.key` (auto-created if missing).
- CLI config location: `~/.paperclip/instances/default/config.json` under `secrets.localEncrypted.keyFilePath`.
- Backup/restore requires both the database metadata and the local master key file; either artifact alone is insufficient.
- The server best-effort enforces `0600` key file permissions and provider health reports permission warnings.
Optional overrides:
@@ -184,5 +194,10 @@ pnpm paperclipai configure --section secrets
Inline secret migration command:
```sh
pnpm paperclipai secrets migrate-inline-env --company-id <company-id> --apply
# direct database maintenance fallback
pnpm secrets:migrate-inline-env --apply
```
Hosted AWS provider notes live in [SECRETS-AWS-PROVIDER.md](./SECRETS-AWS-PROVIDER.md).
+40 -4
View File
@@ -157,6 +157,27 @@ See `doc/DOCKER.md` for API key wiring (`OPENAI_API_KEY` / `ANTHROPIC_API_KEY`)
For a separate review-oriented container that keeps `codex`/`claude` login state in Docker volumes and checks out PRs into an isolated scratch workspace, see `doc/UNTRUSTED-PR-REVIEW.md`.
## Local Instance Layout
Every local install keeps runtime state directly under the selected instance root:
```text
~/.paperclip/instances/default/ # instance root
config.json # runtime config
.env # instance env file
db/ # embedded PostgreSQL data
data/
storage/ # local_disk uploads
backups/ # automatic DB backups
logs/
secrets/master.key # local_encrypted master key
workspaces/<agent-id>/ # default agent workspaces
projects/ # project execution workspaces
companies/<company-id>/codex-home/ # per-company codex_local home
```
`PAPERCLIP_HOME` and `PAPERCLIP_INSTANCE_ID` override the home root and instance id respectively. `paperclipai onboard` echoes the resolved values in its banner (`Local home: <home> | instance: <id> | config: <path>`) so you can confirm where state will land before continuing.
## Database in Dev (Auto-Handled)
For local development, leave `DATABASE_URL` unset.
@@ -164,7 +185,7 @@ The server will automatically use embedded PostgreSQL and persist data at:
- `~/.paperclip/instances/default/db`
Override home and instance:
Override home or instance:
```sh
PAPERCLIP_HOME=/custom/path PAPERCLIP_INSTANCE_ID=dev pnpm paperclipai run
@@ -280,7 +301,7 @@ paperclipai worktree init --from-data-dir ~/.paperclip
paperclipai worktree init --force
```
Repair an already-created repo-managed worktree and reseed its isolated instance from the main default install:
Repair an already-created repo-managed worktree and reseed its isolated instance from the main default install. Point `--from-config` at the instance config:
```sh
cd /path/to/paperclip/.paperclip/worktrees/PAP-884-ai-commits-component
@@ -421,7 +442,9 @@ If you set `DATABASE_URL`, the server will use that instead of embedded PostgreS
## Automatic DB Backups
Paperclip can run automatic DB backups on a timer. Defaults:
Paperclip can run automatic logical database backups on a timer. These backups cover
non-system database schemas, including migration history and plugin-owned database
schemas. Defaults:
- enabled
- every 60 minutes
@@ -449,6 +472,10 @@ Environment overrides:
- `PAPERCLIP_DB_BACKUP_RETENTION_DAYS=<days>`
- `PAPERCLIP_DB_BACKUP_DIR=/absolute/or/~/path`
DB backups are not full instance filesystem backups. For full local disaster
recovery, also back up local storage files and the local encrypted secrets key if
those providers are enabled.
## Secrets in Dev
Agent env vars now support secret references. By default, secret values are stored with local encryption and only secret refs are persisted in agent config.
@@ -456,6 +483,7 @@ Agent env vars now support secret references. By default, secret values are stor
- Default local key path: `~/.paperclip/instances/default/secrets/master.key`
- Override key material directly: `PAPERCLIP_SECRETS_MASTER_KEY`
- Override key file path: `PAPERCLIP_SECRETS_MASTER_KEY_FILE`
- Back up the key file and database together; either one alone is not enough to restore local encrypted secrets.
Strict mode (recommended outside local trusted machines):
@@ -464,12 +492,20 @@ PAPERCLIP_SECRETS_STRICT_MODE=true
```
When strict mode is enabled, sensitive env keys (for example `*_API_KEY`, `*_TOKEN`, `*_SECRET`) must use secret references instead of inline plain values.
Authenticated deployments default strict mode on unless explicitly overridden.
CLI configuration support:
- `pnpm paperclipai onboard` writes a default `secrets` config section (`local_encrypted`, strict mode off, key file path set) and creates a local key file when needed.
- `pnpm paperclipai configure --section secrets` lets you update provider/strict mode/key path and creates the local key file when needed.
- `pnpm paperclipai doctor` validates secrets adapter configuration and can create a missing local key file with `--repair`.
- `pnpm paperclipai doctor` validates secrets adapter configuration, can create a missing local key file with `--repair`, and reports missing AWS Secrets Manager bootstrap env when that provider is selected.
- Provider health is available at `GET /api/companies/:companyId/secret-providers/health` and reports local key permission warnings plus backup guidance.
Per-company provider vaults are configured in the board UI under
`Company Settings → Secrets → Provider vaults`, backed by
`/api/companies/{companyId}/secret-provider-configs`. The CLI does not own
vault lifecycle today. See `docs/deploy/secrets.md` (`Provider Vaults` section)
for the operator model.
Migration helper for existing inline env secrets:
+59
View File
@@ -143,6 +143,13 @@ This keeps the default install path unchanged while allowing explicit installs w
npx paperclipai@canary onboard
```
The release script now verifies two things after a canary publish:
- the `canary` dist-tag resolves to the version that was just published
- every published internal `@paperclipai/*` dependency referenced by that manifest exists on npm
It also treats `latest -> canary` as a failure by default, because npm metadata can otherwise leave the default install path pointing at an unreleased canary dependency graph. Only pass `./scripts/release.sh canary --allow-canary-latest` when that `latest` behavior is explicitly intended.
### Stable
Stable publishes use the npm dist-tag `latest`.
@@ -169,6 +176,58 @@ That means:
See [doc/RELEASE-AUTOMATION-SETUP.md](RELEASE-AUTOMATION-SETUP.md) for the GitHub/npm setup steps.
## Release enrollment for new public packages
Paperclip does not auto-publish every non-private workspace package anymore.
CI publishing is controlled by [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json).
When you add a new public package:
1. add it to the manifest and decide whether CI should publish it immediately
2. if CI should publish it, bootstrap the package on npm before merge
3. if CI should not publish it yet, keep `"publishFromCi": false`
4. only enable `"publishFromCi": true` after npm trusted publishing is configured for that package
PR CI now checks changed release-enabled package manifests against npm. That catches a missing first-publish bootstrap before the change reaches `master`.
### One-time bootstrap sequence for a new package
The first publish of a brand-new package still needs one human maintainer with npm write access.
After that, trusted publishing can take over.
Example for `@paperclipai/adapter-acpx-local` from the repo root:
```bash
# safe preview
pnpm run release:bootstrap-package -- @paperclipai/adapter-acpx-local
# one-time first publish from an authenticated maintainer machine
pnpm run release:bootstrap-package -- @paperclipai/adapter-acpx-local --publish --otp 123456
```
The helper script:
- checks that the package does not already exist on npm
- builds the target package unless `--skip-build` is passed
- runs `npm pack --dry-run` in the package directory
- only runs the real `npm publish --access public` when `--publish --otp <code>` is provided
For the real `--publish` step, the maintainer machine must already be authenticated to npm.
If `npm whoami` returns `401`, first run `npm logout --registry=https://registry.npmjs.org/` to clear any stale local auth, then run `npm login` or `npm adduser` locally as an npm org member, and finally rerun the helper.
That local human auth is fine for the one-time bootstrap publish; we just do not want the same auth model inside CI.
The helper now requires `--otp <code>` up front for `--publish`, so it fails before the real publish attempt if the one-time password is missing.
After that first publish succeeds:
1. open `https://www.npmjs.com/package/@paperclipai/adapter-acpx-local`
2. go to `Settings``Trusted publishing`
3. add repository `paperclipai/paperclip`
4. set workflow filename to `release.yml`
5. optionally go to `Settings``Publishing access` and enable `Require two-factor authentication and disallow tokens`
6. keep `publishFromCi: true` in [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json)
Once those steps are done, future canary and stable publishes for that package are automated through GitHub OIDC. The manual step is only the first package creation on npm.
## Rollback model
Rollback does not unpublish anything.
+21
View File
@@ -67,6 +67,27 @@ Why:
- the single `release.yml` workflow handles both canary and stable publishing
- GitHub environments `npm-canary` and `npm-stable` still enforce different approval rules on the GitHub side
### 2.2.1. Newly added public packages need a bootstrap phase
Trusted publishing is configured on the npm package itself, not at the repo scope.
That means a brand-new public package must not be auto-enrolled into CI publishing until its npm package exists and its trusted publisher has been configured.
Repo policy:
1. add every non-private package to [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json)
2. set `"publishFromCi": true` only when CI is expected to publish that package
3. if the package is not ready for CI publishing yet, keep `"publishFromCi": false`
4. complete the package bootstrap before merging any PR that changes a release-enabled new package
Bootstrap sequence for a new package:
1. publish the package once from a trusted maintainer machine using normal npm auth
2. open that package on npm and add the `paperclipai/paperclip` trusted publisher for `.github/workflows/release.yml`
3. rerun or dry-run the release flow as needed to confirm CI publishing now works
4. only then enable `"publishFromCi": true`
PR CI enforces this by checking changed release-enabled package manifests against npm. That keeps `master` canary publishing healthy while preserving the no-long-lived-token model for normal CI releases.
### 2.3. Verify trusted publishing before removing old auth
After the workflows are live:
+2
View File
@@ -63,6 +63,8 @@ It:
- verifies the pushed commit
- computes the canary version for the current UTC date
- publishes under npm dist-tag `canary`
- verifies that `canary` resolves to the just-published version and that published internal dependencies exist on npm
- fails by default if npm leaves `latest` pointing at a canary; use `--allow-canary-latest` only when that state is intentional
- creates a git tag `canary/vYYYY.MDD.P-canary.N`
Users install canaries with:
+368
View File
@@ -0,0 +1,368 @@
# AWS Secrets Manager Provider
Operational contract for the hosted `aws_secrets_manager` secret provider used by Paperclip Cloud.
## Scope
- Hosted provider for Paperclip-managed secrets when Paperclip Cloud runs on AWS.
- Source of truth for secret values is AWS Secrets Manager, not Postgres.
- Paperclip stores only metadata needed for ownership, bindings, version selection, audit, and runtime resolution.
- AWS provider bootstrap credentials are deployment/runtime credentials, not Paperclip-managed company secrets.
- Remote import for existing AWS secrets is metadata-only. Preview/import uses
AWS inventory metadata and creates Paperclip external references; it does not
copy plaintext into Paperclip.
- Per-company AWS provider vaults (named instances of `aws_secrets_manager`
with their own region, namespace, prefix, KMS key id, and tags) are managed
in the board UI under `Company Settings → Secrets → Provider vaults`. See
[Provider Vaults](../docs/deploy/secrets.md#provider-vaults) for the operator
model and [Provider Vaults API](../docs/api/secrets.md#provider-vaults) for
the routes. The bootstrap trust model in this document still applies — vault
config carries non-sensitive routing metadata only, never AWS credentials.
## Bootstrap Trust Model
The AWS provider has a chicken-and-egg boundary: Paperclip cannot use
`company_secrets` to unlock the AWS provider that stores those secrets. The
initial AWS trust must exist before the Paperclip server starts.
Allowed bootstrap locations:
- Infrastructure IAM or workload identity attached to the Paperclip server
runtime.
- Process environment or orchestrator secret store used to start the Paperclip
server.
- Local AWS SDK sources such as `AWS_PROFILE`, AWS SSO/shared config, web
identity, container metadata, or instance metadata.
- Short-lived shell credentials for local development only.
Do not ask operators to paste AWS root credentials or long-lived IAM user access
keys into the Paperclip board UI. Do not store those bootstrap keys in
`company_secrets`.
## Paperclip Cloud Bootstrap
Paperclip Cloud must provision the AWS backing resources before any board user
can create AWS-backed company secrets:
1. Create or select the deployment KMS key.
2. Create the Paperclip server runtime role for the deployment.
3. Attach a minimum IAM policy scoped to the deployment Secrets Manager prefix
and the configured KMS key.
4. Configure the server runtime with the non-secret provider environment
variables below.
5. Run `paperclipai doctor` or the provider health endpoint from the deployed
runtime and confirm that the provider reports the expected region, prefix,
deployment id, KMS setting, and AWS SDK credential source.
Once this is in place, the board UI can create Paperclip-managed AWS secrets and
Paperclip will write them under the deployment/company namespace.
## Self-Hosted And Local Bootstrap
Self-hosted AWS deployments should use the AWS SDK default credential provider
chain. Preferred sources are role-based:
- EC2 instance profile.
- ECS task role.
- EKS IRSA or another OIDC web identity role.
- AWS SSO/shared config via `AWS_PROFILE`.
Local development can use:
```sh
aws sso login --profile paperclip-dev
AWS_PROFILE=paperclip-dev \
PAPERCLIP_SECRETS_PROVIDER=aws_secrets_manager \
PAPERCLIP_SECRETS_AWS_REGION=us-east-1 \
PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID=dev-local \
PAPERCLIP_SECRETS_AWS_KMS_KEY_ID=arn:aws:kms:us-east-1:123456789012:key/abcd-... \
pnpm dev
```
Temporary `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` environment credentials
are acceptable only as a local break-glass or short-lived test source. They
should not be written to Paperclip config, committed to `.env` files, stored in
`company_secrets`, or used as the default Paperclip Cloud bootstrap path.
## Deployment Config
Required environment variables:
```sh
PAPERCLIP_SECRETS_PROVIDER=aws_secrets_manager
PAPERCLIP_SECRETS_AWS_REGION=us-east-1
PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID=prod-us-1
PAPERCLIP_SECRETS_AWS_KMS_KEY_ID=arn:aws:kms:us-east-1:123456789012:key/abcd-...
```
Optional environment variables:
```sh
PAPERCLIP_SECRETS_AWS_PREFIX=paperclip
PAPERCLIP_SECRETS_AWS_ENVIRONMENT=production
PAPERCLIP_SECRETS_AWS_PROVIDER_OWNER=paperclip
PAPERCLIP_SECRETS_AWS_ENDPOINT=
PAPERCLIP_SECRETS_AWS_DELETE_RECOVERY_DAYS=30
```
Naming convention for Paperclip-managed secrets:
```text
paperclip/{deploymentId}/{companyId}/{secretKey}
```
Tag set for Paperclip-managed secrets:
- `paperclip:managed-by=paperclip`
- `paperclip:provider-owner=<owner tag>`
- `paperclip:deployment-id=<deployment id>`
- `paperclip:company-id=<company id>`
- `paperclip:secret-key=<secret key>`
- `paperclip:environment=<environment tag>`
## IAM And KMS Assumptions
Launch posture:
- One Paperclip app role per deployment.
- One deployment-scoped KMS key per deployment at launch.
- Future per-company KMS keys remain compatible because Paperclip stores provider refs and version metadata separately from values.
Minimum IAM boundary:
- Allow `secretsmanager:CreateSecret`, `PutSecretValue`, `GetSecretValue`, and `DeleteSecret`.
- Scope resources to the deployment prefix:
```text
arn:aws:secretsmanager:<region>:<account-id>:secret:paperclip/<deployment-id>/*
```
- Allow `kms:Encrypt`, `kms:Decrypt`, `kms:GenerateDataKey`, and `kms:DescribeKey` for the configured deployment CMK.
- Deny wildcard access outside the deployment prefix.
- Prefer workload identity / role-based auth. Do not store AWS credentials inline in Paperclip config.
Example minimum policy shape:
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PaperclipDeploymentSecrets",
"Effect": "Allow",
"Action": [
"secretsmanager:CreateSecret",
"secretsmanager:PutSecretValue",
"secretsmanager:GetSecretValue",
"secretsmanager:DeleteSecret"
],
"Resource": "arn:aws:secretsmanager:<region>:<account-id>:secret:paperclip/<deployment-id>/*"
},
{
"Sid": "PaperclipDeploymentKms",
"Effect": "Allow",
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:GenerateDataKey",
"kms:DescribeKey"
],
"Resource": "arn:aws:kms:<region>:<account-id>:key/<key-id>"
}
]
}
```
Operational expectation:
- Paperclip-managed secrets may be deleted only by Paperclip or an operator with equivalent break-glass access.
- External references may resolve through Paperclip runtime, but Paperclip should not delete the external secret resource.
## Remote Import Inventory IAM
Remote import preview needs one additional AWS permission:
```json
{
"Sid": "PaperclipRemoteSecretInventory",
"Effect": "Allow",
"Action": "secretsmanager:ListSecrets",
"Resource": "*"
}
```
This is intentionally separate from the managed create/rotate/delete policy.
AWS treats `ListSecrets` as an account/Region inventory action; do not document
secret ARNs, names, tags, or AWS request filters as an IAM boundary for it. Use
`Resource: "*"` and decide whether inventory exposure is acceptable for the AWS
account and Region behind each provider vault.
Remote import preview/import must not call:
- `secretsmanager:GetSecretValue`
- `secretsmanager:BatchGetSecretValue`
- `kms:Decrypt`
Those permissions are only needed later when a bound runtime resolves an
imported external reference. For imported refs, scope read permissions to the
operator-approved external prefixes that Paperclip is allowed to consume:
```json
{
"Sid": "PaperclipResolveImportedExternalReferences",
"Effect": "Allow",
"Action": "secretsmanager:GetSecretValue",
"Resource": [
"arn:aws:secretsmanager:<region>:<account-id>:secret:<approved-external-prefix>/*"
]
}
```
If selected external secrets use customer-managed KMS keys, also grant
`kms:Decrypt` and `kms:DescribeKey` on those keys. Keep managed write/delete
permissions scoped to `paperclip/<deployment-id>/*`; do not broaden them for
remote import.
Safe scoping guidance:
- Prefer one Paperclip runtime role per environment/account.
- Point provider vaults at the intended AWS account and Region instead of a
broad central admin role.
- Enable `ListSecrets` only in accounts where inventory exposure is acceptable.
- Keep preview/import board-only; agent API keys must not call these routes.
- Treat AWS tag/name filters as search UX only, not permission enforcement.
Paperclip also blocks importing refs under its own managed namespace as
external references. Use the Paperclip-managed flow for
`paperclip/{deploymentId}/{companyId}/{secretKey}` resources.
## Existing AWS Secrets
V1 keeps existing AWS Secrets Manager entries as **linked external references**, not adopted
Paperclip-managed resources.
Use the Paperclip-managed flow when Paperclip should create and rotate the value. The AWS
secret name is derived from deployment and company scope:
```text
paperclip/{deploymentId}/{companyId}/{secretKey}
```
Use the external-reference flow when the secret already exists at an operator-owned path such
as:
```text
/paperclip-bench/anthropic_api_key
```
In that mode Paperclip stores only the path or ARN, resolves it at runtime, and records
redacted access events. Operators rotate the actual value in AWS. Update the Paperclip
reference only when the AWS path, ARN, or pinned provider version changes.
Paperclip does not currently offer an "adopt existing AWS secret" flow that takes over future
`PutSecretValue` writes for an arbitrary existing secret. Adding that later requires explicit
confirmation UX, scope validation, expected Paperclip tags, and security/cloud-ops review.
## Data Custody
- Paperclip stores `externalRef`, `providerVersionRef`, provider id, fingerprint hash, status, and binding metadata.
- Paperclip does not store AWS secret plaintext in `company_secret_versions.material`.
- Runtime resolution fetches the value from AWS only when a bound consumer needs it.
## Rotation Runbook
Manual Paperclip-managed rotation:
1. Write the new value through the Paperclip secret rotate flow.
2. Paperclip creates a new AWS secret version with `PutSecretValue`.
3. Paperclip records the new `providerVersionRef` in `company_secret_versions`.
4. Re-run or restart affected workloads that consume `latest`, or pin consumers to a specific Paperclip version before rollout when you need staged release safety.
Guidance:
- Prefer pinned Paperclip secret versions for risky rollouts.
- Treat provider-native automatic rotation as a later enhancement; current V1 flow is explicit create-new-version plus controlled rollout.
## Backup And Restore Runbook
What must survive:
- Paperclip database metadata for secret ownership, bindings, status, and provider version refs.
- AWS Secrets Manager namespace under the configured deployment prefix.
- The configured KMS key and its decrypt permissions.
Restore checklist:
1. Restore Paperclip database metadata.
2. Confirm the same AWS Secrets Manager namespace still exists.
3. Confirm the Paperclip runtime role can call `GetSecretValue` on the restored prefix.
4. Confirm the role still has decrypt access to the CMK referenced by `PAPERCLIP_SECRETS_AWS_KMS_KEY_ID`.
5. Run the live smoke below or a targeted runtime secret resolution test.
## Provider Outage Runbook
Symptoms:
- Secret create/rotate/resolve operations fail with AWS provider errors.
- Agent runs fail before adapter invocation on required secret resolution.
- Remote import preview fails to list AWS inventory.
Immediate actions:
1. Confirm AWS regional health and Secrets Manager availability.
2. Confirm the runtime role still has `GetSecretValue` and KMS decrypt permissions.
3. Check for accidental prefix, region, deployment id, or KMS key config drift.
4. Retry a single resolution after AWS service health is green.
5. If outage persists, pause high-risk runs that require secret access rather than churning retries.
Remote import-specific actions:
- Missing list permission: add `secretsmanager:ListSecrets` with
`Resource: "*"` only when inventory import is approved for that vault's
AWS account and Region.
- Throttling: narrow the search, wait briefly, and retry with backoff. Avoid
full-account enumeration.
- Invalid or stale cursor: refresh the preview and discard the old
`NextToken`.
- Large account: load pages intentionally, keep one in-flight preview request
per vault/search, and do not run background full-account crawls.
- Runtime read failure after import: verify `GetSecretValue` and KMS decrypt
on the selected external secret. Visibility in `ListSecrets` does not prove
read permission.
## Incident Response Runbook
Potential incidents:
- Cross-company access caused by IAM scoping drift.
- KMS policy drift causing decrypt failures or over-broad access.
- Suspected secret exposure in logs, transcripts, or downstream agent output.
Response steps:
1. Stop or pause affected Paperclip runs.
2. Audit recent Paperclip secret access events for impacted secret ids and consumers.
3. Audit AWS CloudTrail for `ListSecrets`, `GetSecretValue`,
`PutSecretValue`, and `DeleteSecret` calls on the relevant vault account,
Region, deployment prefix, and approved external prefixes.
4. Rotate impacted secrets in AWS through Paperclip-managed versioning.
5. Re-scope IAM and KMS policies before resuming normal traffic.
6. If a value may have reached an agent transcript or external system, treat it as exposed and rotate immediately.
## Optional Live Smoke
This is safe to skip locally. Run it only against a dedicated AWS test namespace.
Prerequisites:
- AWS credentials or workload identity with the deployment-scoped IAM permissions above.
- `PAPERCLIP_SECRETS_PROVIDER=aws_secrets_manager`
- The required `PAPERCLIP_SECRETS_AWS_*` environment variables set.
Suggested smoke:
1. Create a test secret through the Paperclip board or API under a throwaway company.
2. Confirm the resulting AWS secret name matches `paperclip/{deploymentId}/{companyId}/{secretKey}`.
3. Rotate the secret once and confirm a new `providerVersionRef` appears in Paperclip metadata.
4. Resolve the secret through a bound runtime path, not by adding a general-purpose reveal endpoint.
5. Delete the throwaway secret and confirm AWS schedules deletion with the configured recovery window.
+5 -4
View File
@@ -37,7 +37,7 @@ These decisions close open questions from `SPEC.md` for V1.
| Visibility | Full visibility to board and all agents in same company |
| Communication | Tasks + comments only (no separate chat system) |
| Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise create visible recovery issues or require human escalation (see `doc/execution-semantics.md`) |
| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise open visible source-scoped recovery actions by default, use issue-backed recovery only for independent repair work, or require human escalation (see `doc/execution-semantics.md`) |
| Agent adapters | Built-in `process`, `http`, local CLI/session adapters, and OpenClaw gateway support; external adapters can also be loaded through the adapter plugin flow |
| Plugin framework | Local/self-hosted early plugin runtime is in scope; cloud marketplace and packaged public distribution remain out of scope |
| Auth | Mode-dependent human auth (`local_trusted` implicit board in current code; authenticated mode uses sessions), API keys for agents |
@@ -150,7 +150,7 @@ Invariant: every business record belongs to exactly one company.
- `capabilities` text null
- `adapter_type` text; built-ins include `process`, `http`, `claude_local`, `codex_local`, `gemini_local`, `opencode_local`, `pi_local`, `cursor`, and `openclaw_gateway`
- `adapter_config` jsonb not null
- `runtime_config` jsonb not null default `{}`
- `runtime_config` jsonb not null default `{}`; may include Paperclip runtime policy such as `modelProfiles.cheap.adapterConfig` for an optional low-cost model lane that does not change the primary adapter config
- `default_environment_id` uuid fk `environments.id` null
- `context_mode` enum: `thin | fat` default `thin`
- `budget_monthly_cents` int not null default 0
@@ -434,9 +434,10 @@ Side effects:
V1 non-terminal liveness rule:
- agent-owned `todo`, `in_progress`, `in_review`, and `blocked` issues must have a live execution path, an explicit waiting path, or an explicit recovery path
- `in_review` is healthy only when a typed execution participant, pending issue-thread interaction or approval, user owner, active run, queued wake, or explicit recovery issue owns the next action
- `in_review` is healthy only when a typed execution participant, pending issue-thread interaction or approval, user owner, active run, queued wake, or explicit recovery action owns the next action
- a blocked chain is covered only when each unresolved leaf issue is live or explicitly waiting
- when Paperclip cannot safely infer the next action, it surfaces the problem through visible blocked/recovery work instead of silently completing or reassigning work
- explicit recovery actions are the liveness primitive; source-scoped actions are the default form, issue-backed recovery is a fallback for independent repair work or safety boundaries, and comments alone are evidence rather than a healthy liveness path
Detailed ownership, execution, blocker, active-run watchdog, crash-recovery, and non-terminal liveness semantics are documented in `doc/execution-semantics.md`.
@@ -676,7 +677,7 @@ Per-agent schedule fields in `adapter_config`:
- `enabled` boolean
- `intervalSec` integer (minimum 30)
- `maxConcurrentRuns` integer; new agents default to `5`
- `maxConcurrentRuns` integer; new agents default to `20`; scheduler clamps configured values to `1..50`
Scheduler must skip invocation when:
Binary file not shown.

After

Width:  |  Height:  |  Size: 140 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 137 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 61 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 29 KiB

+83 -19
View File
@@ -67,13 +67,15 @@ This is the right state for:
- waiting on another issue
- waiting on a human decision
- waiting on an external dependency or system
- waiting on an external dependency or system when Paperclip does not own a scheduled re-check
- work that automatic recovery could not safely continue
### `in_review`
Execution work is paused because the next move belongs to a reviewer or approver, not the current executor.
An external review service can also be a valid review path when the issue keeps an agent assignee and has an active one-shot monitor that will wake that assignee to check the service later.
### `done`
The work is complete and terminal.
@@ -154,7 +156,7 @@ If a parent is truly waiting on a child, model that with blockers. Do not rely o
For agent-owned, non-terminal issues, Paperclip should never leave work in a state where nobody is responsible for the next move and nothing will wake or surface it.
This is a visibility contract, not an auto-completion contract. If Paperclip cannot safely infer the next action, it should surface the ambiguity with a blocked state, a visible comment, or an explicit recovery issue. It must not silently mark work done from prose comments or guess that a dependency is complete.
This is a visibility contract, not an auto-completion contract. If Paperclip cannot safely infer the next action, it should surface the ambiguity with a blocked state, a visible notice, or an explicit recovery action. It must not silently mark work done from prose comments or guess that a dependency is complete.
An issue is healthy when the product can answer "what moves this forward next?" without requiring a human to reconstruct intent from the whole thread. An issue is stalled when it is non-terminal but has no live execution path, no explicit waiting path, and no recovery path.
@@ -164,9 +166,35 @@ The valid action-path primitives are:
- a queued wake or continuation that can be delivered to the responsible agent
- a typed execution-policy participant, such as `executionState.currentParticipant`
- a pending issue-thread interaction or linked approval that is waiting for a specific responder
- a one-shot issue monitor (`executionPolicy.monitor.nextCheckAt`) that will wake the assignee for a future check
- a human owner via `assigneeUserId`
- a first-class blocker chain whose unresolved leaf issues are themselves healthy
- an open explicit recovery issue that names the owner and action needed to restore liveness
- an open explicit recovery action that names the owner and action needed to restore liveness
### Explicit recovery actions
An explicit recovery action is a typed liveness repair path for a source issue. It is the recovery primitive; the action can be rendered directly on the source issue or backed by a separate recovery issue when the repair needs its own work item.
A valid recovery action must name:
- the source issue and company
- the recovery kind and idempotency fingerprint
- the recovery owner, plus previous or return owner when ownership may temporarily shift
- the cause, bounded evidence, and next action
- the wake, monitor, timeout, retry, or escalation policy that will move the action forward
- the resolution outcome when closed, such as restored, delegated, false positive, blocked, escalated, or cancelled
A source-scoped recovery action is the default form. Use it when the next safe move is to repair the source issue's liveness directly: restore a wake path, clarify disposition, re-establish a monitor, record a false positive, or delegate real follow-up work from the source issue.
Use an issue-backed recovery action only when the recovery is genuinely independent work or when source-scoped handling would be unsafe or unclear. Examples include:
- long or cross-agent repair work with its own assignee, subtasks, or blockers
- real delegated follow-up that should block the source issue as a first-class dependency
- active-run watchdog work that must observe a still-running source process without interfering with it
- recovery that needs separate review, approval, security handling, or escalation ownership
- cases where source issue ownership cannot be changed or restored safely
A comment or system notice can be evidence for a recovery action, but it is not a recovery action by itself. Comment-only recovery is not a healthy liveness path because it does not define a typed owner, wake or monitor policy, retry bound, timeout, escalation path, or resolution outcome.
### Agent-assigned `todo`
@@ -180,6 +208,16 @@ A healthy dispatch state means at least one of these is true:
An assigned `todo` issue is stalled when dispatch was interrupted, no wake remains queued or running, and no recovery path has been opened.
### Agent-assigned `backlog`
This is parked state, not dispatch state.
Assigning an issue normally implies executable intent. When create APIs receive an assignee and no explicit status, Paperclip defaults the issue to `todo` so the assignee has a wake path instead of silently inheriting the unassigned `backlog` default.
An explicit assigned `backlog` issue remains valid when the creator is deliberately parking the work. It must not wake the assignee just because it has an assignee. Paperclip should make that choice visible in activity and UI so operators can distinguish intentional parking from a missed handoff.
An assigned `backlog` issue becomes a liveness problem when another issue is blocked on it and there is no explicit waiting path such as a human owner, active run, queued wake, pending interaction or approval, monitor, or open recovery action. In that case the blocked parent should surface "blocked by parked work" rather than treating the dependency chain as healthy.
### Agent-assigned `in_progress`
This is active-work state.
@@ -188,7 +226,8 @@ A healthy active-work state means at least one of these is true:
- there is an active run for the issue
- there is already a queued continuation wake
- there is an open explicit recovery issue for the lost execution path
- there is an active one-shot monitor that will wake the assignee for a future check
- there is an open explicit recovery action for the lost execution path
An agent-owned `in_progress` issue is stalled when it has no active run, no queued continuation, and no explicit recovery surface. A still-running but silent process is not automatically stalled; it is handled by the active-run watchdog contract.
@@ -202,11 +241,34 @@ A healthy `in_review` issue has at least one valid action path:
- a pending issue-thread interaction or linked approval waiting for a named responder
- a human owner via `assigneeUserId`
- an active run or queued wake that is expected to process the review state
- an open explicit recovery issue for an ambiguous review handoff
- an active one-shot monitor for an external service or async review loop that the assignee owns
- an open explicit recovery action for an ambiguous review handoff
Agent-assigned `in_review` with no typed participant is only healthy when one of the other paths exists. Assignment to the same agent that produced the handoff is not, by itself, a review path.
An `in_review` issue is stalled when it has no typed participant, no pending interaction or approval, no user owner, no active run, no queued wake, and no explicit recovery issue. Paperclip should surface that state as recovery work rather than silently completing the issue or leaving blocker chains parked indefinitely.
An `in_review` issue is stalled when it has no typed participant, no pending interaction or approval, no user owner, no active monitor, no active run, no queued wake, and no explicit recovery action. Paperclip should surface that state as recovery work rather than silently completing the issue or leaving blocker chains parked indefinitely.
### Issue monitors
An issue monitor is a one-shot deferred action path for agent-owned issues in `in_progress` or `in_review`.
Use a monitor when the current assignee owns a future check against an async system or external service. Examples include Greptile review loops, GitHub checks, Vercel deployments, or provider jobs where the agent should come back later and decide what happens next.
Monitor policy lives under `executionPolicy.monitor` and includes:
- `nextCheckAt`: when Paperclip should wake the assignee
- `notes`: non-secret instructions for what the assignee should check
- `serviceName`: optional non-secret external-service context
- `externalRef`: optional external-service reference input; Paperclip treats it as secret-adjacent, redacts it before persistence/visibility, and omits it from activity and wake payloads
- `timeoutAt`, `maxAttempts`, and `recoveryPolicy`: optional recovery hints for bounded waits
Monitors are not recurring intervals. When a monitor fires, Paperclip clears the scheduled monitor and queues an `issue_monitor_due` wake for the assignee. If the external service is still pending, the assignee must explicitly re-arm the monitor with a new `nextCheckAt`. If the issue moves to `done`, `cancelled`, an invalid status, or a human/unassigned owner, the monitor is cleared.
Because `serviceName` and `notes` remain visible in issue activity and wake context, operators should keep them short and non-secret. Put enough context for the assignee to know what to inspect, but do not include signed URLs, bearer tokens, customer secrets, tenant-private identifiers, or provider links with embedded credentials.
Monitor bounds are enforced. Paperclip rejects attempts to re-arm a monitor whose `timeoutAt` or `maxAttempts` is already exhausted. When a scheduled monitor reaches an exhausted bound at trigger time, Paperclip clears it and follows `recoveryPolicy`: `wake_owner` queues a bounded recovery wake for the assignee, `create_recovery_issue` opens visible issue-backed recovery work, and `escalate_to_board` records a board-visible escalation comment/activity.
Use `blocked` instead of a monitor when no Paperclip assignee owns a responsible polling path. In that case, name the external owner/action or create first-class recovery/blocker work.
### `blocked`
@@ -215,12 +277,12 @@ This is explicit waiting state.
A healthy `blocked` issue has an explicit waiting path:
- first-class blockers exist, and each unresolved leaf has a valid action path under this contract
- the issue is blocked on an explicit recovery issue that itself has a live or waiting path
- the issue has an explicit recovery action that itself has a live or waiting path
- the issue is waiting on a pending interaction, linked approval, human owner, or clearly named external owner/action
A blocker chain is covered only when its unresolved leaf is live or explicitly waiting. An intermediate `blocked` issue does not make the chain healthy by itself.
A `blocked` issue is stalled when the unresolved blocker leaf has no active run, queued wake, typed participant, pending interaction or approval, user owner, external owner/action, or recovery issue. In that case the parent should show the first stalled leaf instead of presenting the dependency as calmly covered.
A `blocked` issue is stalled when the unresolved blocker leaf has no active run, queued wake, typed participant, pending interaction or approval, user owner, external owner/action, or recovery action. In that case the parent should show the first stalled leaf instead of presenting the dependency as calmly covered.
## 8. Crash and Restart Recovery
@@ -240,7 +302,7 @@ Example:
Recovery rule:
- if the latest issue-linked run failed/timed out/cancelled and no live execution path remains, Paperclip queues one automatic assignment recovery wake
- if that recovery wake also finishes and the issue is still stranded, Paperclip moves the issue to `blocked` and posts a visible comment
- if that recovery wake also finishes and the issue is still stranded, Paperclip moves the issue to `blocked` and opens or updates an explicit recovery action when a bounded owner/action is known; the visible comment is evidence, not the recovery path by itself
This is a dispatch recovery, not a continuation recovery.
@@ -256,7 +318,7 @@ Example:
Recovery rule:
- Paperclip queues one automatic continuation wake
- if that continuation wake also finishes and the issue is still stranded, Paperclip moves the issue to `blocked` and posts a visible comment
- if that continuation wake also finishes and the issue is still stranded, Paperclip moves the issue to `blocked` and opens or updates an explicit recovery action when a bounded owner/action is known; the visible comment is evidence, not the recovery path by itself
This is an active-work continuity recovery.
@@ -269,7 +331,7 @@ On startup and on the periodic recovery loop, Paperclip now does four things in
1. reap orphaned `running` runs
2. resume persisted `queued` runs
3. reconcile stranded assigned work
4. scan silent active runs and create or update explicit watchdog review issues
4. scan silent active runs and create or update explicit watchdog recovery actions
The stranded-work pass closes the gap where issue state survives a crash but the wake/run path does not. The silent-run scan covers the separate case where a live process exists but has stopped producing observable output.
@@ -282,11 +344,11 @@ The recovery service owns this contract:
- classify active-run output silence as `ok`, `suspicious`, `critical`, `snoozed`, or `not_applicable`
- collect bounded evidence from run logs, recent run events, child issues, and blockers
- preserve redaction and truncation before evidence is written to issue descriptions
- create at most one open `stale_active_run_evaluation` issue per run
- create at most one open watchdog recovery action per run; issue-backed implementations use `stale_active_run_evaluation` issues
- honor active snooze decisions before creating more review work
- build the `outputSilence` summary shown by live-run and active-run API responses
Suspicious silence creates a medium-priority review issue for the selected recovery owner. Critical silence raises that review issue to high priority and blocks the source issue on the explicit evaluation task without cancelling the active process.
Suspicious silence creates a medium-priority watchdog recovery action for the selected recovery owner. Critical silence raises that recovery action to high priority and, when issue-backed evaluation is needed for correctness, blocks the source issue on the explicit evaluation task without cancelling the active process.
Watchdog decisions are explicit operator/recovery-owner decisions:
@@ -296,7 +358,7 @@ Watchdog decisions are explicit operator/recovery-owner decisions:
Operators should prefer `snooze` for known time-bounded quiet periods. `continue` is only a short acknowledgement of the current evidence; if the run remains silent after the re-arm window, the periodic watchdog scan can create or update review work again.
The board can record watchdog decisions. The assigned owner of the watchdog evaluation issue can also record them. Other agents cannot.
The board can record watchdog decisions. The assigned owner of an issue-backed watchdog evaluation can also record them. Other agents cannot.
## 11. Auto-Recover vs Explicit Recovery vs Human Escalation
@@ -314,9 +376,9 @@ Examples:
Auto-recovery preserves the existing owner. It does not choose a replacement agent.
### Explicit Recovery Issue
### Explicit Recovery Action
Paperclip creates an explicit recovery issue when the system can identify a problem but cannot safely complete the work itself.
Paperclip opens an explicit recovery action when the system can identify a problem but cannot safely complete the work itself.
Examples:
@@ -324,9 +386,11 @@ Examples:
- a dependency graph has an invalid/uninvokable owner, unassigned blocker, or invalid review participant
- an active run is silent past the watchdog threshold
The source issue remains visible and blocked on the recovery issue when blocking is necessary for correctness. The recovery owner must restore a live path, resolve the source issue manually, or record the reason it is a false positive.
The recovery action stays source-scoped by default. The source issue should show the recovery owner, cause, evidence, next action, and wake or monitor policy in its own thread/detail surface.
Instance-level issue-graph liveness auto-recovery is disabled by default. When enabled, its lookback window means "dependency paths updated within the last N hours"; older findings remain advisory and are counted as outside the configured lookback instead of creating recovery issues automatically. This is an operator noise control, not the older staleness delay for determining whether a chain is old enough to surface.
Create an issue-backed recovery action only when a separate issue is the right execution object. In that fallback form, the source issue remains visible and is blocked on the recovery issue when blocking is necessary for correctness. The recovery owner must restore a live path, resolve the source issue manually, delegate real follow-up work, or record the reason the signal is a false positive.
Instance-level issue-graph liveness auto-recovery is disabled by default. When enabled, its lookback window means "dependency paths updated within the last N hours"; older findings remain advisory and are counted as outside the configured lookback instead of creating recovery actions automatically. This is an operator noise control, not the older staleness delay for determining whether a chain is old enough to surface.
### Human Escalation
@@ -354,7 +418,7 @@ The recovery model is intentionally conservative:
- preserve ownership
- retry once when the control plane lost execution continuity
- create explicit recovery work when the system can identify a bounded recovery owner/action
- open an explicit recovery action when the system can identify a bounded recovery owner/action
- escalate visibly when the system cannot safely keep going
## 13. Practical Interpretation
@@ -0,0 +1,86 @@
# Plugin Secret Refs: Company Scope Reintroduction Plan
Date: 2026-04-26
Status: follow-up after fail-closed mitigation
Related issue: PAP-2394
## Current state
`PAP-2394` now fails closed:
- `POST /api/plugins/:pluginId/config` rejects any config containing plugin secret refs.
- `ctx.secrets.resolve()` is disabled for plugin workers.
This removes the release-blocking cross-company exposure path, but it also disables plugin secret-ref support until the runtime carries company scope end to end.
## Vulnerability summary
The original design mixed an instance-global config store with company-scoped secret bindings:
- [server/src/routes/plugins.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/server/src/routes/plugins.ts:1898) saved one global plugin config row, then wrote bindings into `company_secret_bindings` grouped by each referenced secret's owning company.
- [packages/db/src/schema/plugin_config.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/packages/db/src/schema/plugin_config.ts:15) stored one config row per plugin, with no company dimension.
- [packages/db/src/schema/company_secret_bindings.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/packages/db/src/schema/company_secret_bindings.ts:5) already modeled bindings as company-scoped.
- [server/src/services/plugin-secrets-handler.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/server/src/services/plugin-secrets-handler.ts:212) resolved by `pluginId` + secret UUID, with no active company context from the bridge call.
- [packages/plugins/sdk/src/worker-rpc-host.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/packages/plugins/sdk/src/worker-rpc-host.ts:384) exposed `ctx.config.get()` and `ctx.secrets.resolve()` without a company parameter.
This violated Least Privilege, Complete Mediation, and Secure Defaults.
## Recommended end state
Re-enable plugin secret refs only after both of these are true:
1. Plugin config reads/writes are company-scoped.
2. Runtime secret resolution carries explicit company context and enforces it at resolution time.
## Implementation plan
### 1. Make plugin config company-scoped
- Add `company_id` to `plugin_config`, with a unique index on `(plugin_id, company_id)`.
- Update registry helpers to require `companyId` for `getConfig`, `upsertConfig`, `patchConfig`, and `deleteConfig`.
- Update plugin config routes to require `companyId` and call `assertCompanyAccess(req, companyId)`.
- Keep instance-global plugin lifecycle state separate from company-scoped plugin config.
### 2. Propagate company context through the worker runtime
- Extend the SDK so `ctx.config.get()` and `ctx.secrets.resolve()` can receive or derive `companyId`.
- Introduce worker request context storage for handlers that already run with company scope:
- `getData`
- `performAction`
- scoped API routes
- tool executions
- environment driver calls
- Fail closed when plugin code tries to read company-scoped config or secrets outside an active company context.
### 3. Rebind secrets by `(companyId, pluginId, configPath)`
- On config save, validate every referenced secret belongs to the authorized company.
- Store bindings only for that company.
- Resolve secrets only by the current company-scoped binding, never by bare plugin ID plus UUID.
- Treat stale bindings as invalid and remove them on config replacement.
### 4. Prevent cross-company config disclosure
- When returning config to the UI, only materialize the selected company's secret refs.
- Never expose another company's secret UUIDs through the global plugin config surface.
## Required regression coverage
- Company A board user cannot save plugin config that references a Company B secret.
- Company A plugin execution cannot resolve a Company B secret even if the same plugin is configured for Company B.
- Company-scoped config reads only return the selected company's secret bindings.
- Config replacement removes stale bindings for the same `(companyId, pluginId)` target.
- Runtime calls without company context fail closed.
## Migration notes
- Existing `plugin_config` rows need a migration strategy before re-enable.
- Safest default: do not auto-assume a company for historical secret refs.
- Prefer one of:
- explicit admin migration per company, or
- import existing rows as non-secret config only and require re-entry of secret refs.
## Release posture
- Keep plugin secret refs disabled until all steps above land.
- Do not restore the feature behind a soft warning; the insecure path must remain unavailable by default.
@@ -0,0 +1,135 @@
# LLM Wiki Paperclip Asset And Work-Product Security Gate
Status: accepted Phase 5 policy
Date: 2026-05-06
Owner: Security engineering
Scope: Paperclip-derived ingestion into the LLM Wiki before any asset or work-product content indexing ships
## Decision
Phase 5 remains **fail-closed** for Paperclip assets and work products.
- Paperclip-derived **text extraction is allowed only** for issue titles/descriptions, issue comments, and issue documents.
- Paperclip **assets/attachments** and **issue work products** are **metadata-only** in Phase 5.
- **Linked summaries** and **content extraction** for assets/work products are **not approved** in Phase 5.
- No implementation may fetch `/api/assets/:id/content`, dereference a work-product `url`, scrape preview pages, or embed binary/blob content into source bundles or source snapshots.
This keeps the secure path easier than the insecure one and avoids broadening the wiki into a second content-distribution channel.
## Allowed Source Kinds
These source kinds may contribute body text to Paperclip-derived source bundles:
| Source kind | Allowed body fields | Reason |
| --- | --- | --- |
| Issue | `title`, `description`, identifier/status metadata | First-party Paperclip text under company ACL |
| Comment | `body` | First-party Paperclip text under company ACL |
| Document | `body`, `title`, `key`, revision metadata | First-party Paperclip text under company ACL |
## Assets And Work Products
### Assets / attachments
Allowed in Phase 5:
- metadata-only references built from allowlisted structured fields already stored in Paperclip
- recommended fields: `issueId`, `issueCommentId`, `attachmentId`, `assetId`, `originalFilename`, `contentType`, `byteSize`, `sha256`, `createdAt`, `createdByAgentId`, `createdByUserId`
Disallowed in Phase 5:
- fetching asset bytes from `/api/assets/:id/content`
- parsing any blob body, including `text/plain`, `text/markdown`, `application/json`, images, SVG, PDFs, archives, or office formats
- storing `contentPath` in wiki source bundles or source snapshots
- model summarization of attachment bodies
### Work products
Allowed in Phase 5:
- metadata-only references built from allowlisted structured fields already stored in Paperclip
- recommended fields: `issueId`, `workProductId`, `type`, `provider`, `title`, `status`, `reviewState`, `healthStatus`, `externalId`, `isPrimary`, `createdAt`, `updatedAt`
- optional boolean/derived metadata such as `hasUrl: true`
Disallowed in Phase 5:
- fetching or crawling the work-product `url`
- scraping preview pages, artifacts, pull requests, branches, commits, or custom provider targets through the wiki ingestion path
- storing raw `url` values in wiki source bundles or source snapshots
- model-authored linked summaries derived from off-record content
## MIME Allowlists And Size Caps
No MIME allowlist is approved for asset content extraction in Phase 5 because **no asset body extraction is approved at all**.
- Every asset MIME type is treated as opaque for Paperclip-derived indexing.
- Existing upload limits remain storage concerns, not ingestion approvals.
- Work-product destinations are also opaque regardless of MIME type or size.
Any future issue that wants blob parsing must define:
- a positive MIME allowlist
- per-type parser strategy
- per-source size caps
- sandbox/isolation requirements
- prompt-injection handling
- regression tests for refusal paths
## Redaction Rules
Metadata-only means **structured facts only**, not capability-bearing links.
- Do not persist `contentPath` for assets.
- Do not persist raw work-product `url` values.
- Do not persist query strings, fragments, signed URL tokens, or userinfo.
- Prefer stable identifiers (`assetId`, `workProductId`, `externalId`) over links.
This addresses Sensitive Information Disclosure, Unsafe Consumption of APIs, and Insecure Output Handling risks.
## Provenance Rules
Every metadata-only reference must preserve enough provenance to explain where it came from without reading the underlying content:
- `companyId`
- `issueId`
- attachment/work-product id
- producer identity when available
- timestamps
- an explicit `metadata_only` marker in any future reference/snapshot schema
## Review-Required Behavior
Human review is **not** required for plain metadata-only references that stay inside the allowlisted fields above.
Human review **is required**, with a separate security sign-off issue, before enabling any of the following:
- asset body extraction
- work-product URL fetching
- linked summaries generated from asset/work-product content
- storing raw blob links or raw remote URLs in wiki source material
- non-default-space routing for Paperclip-derived asset/work-product references
## Security Rationale
This gate exists because the current host surfaces have different trust properties:
- issue/comment/document text is first-party Paperclip content already exposed through company-scoped issue/document APIs
- asset content is a blob download surface (`/api/assets/:id/content`) and can carry prompt-injection or parser-risk payloads
- work products can point at arbitrary destinations through `url`, which reintroduces SSRF, token leakage, and prompt-injection risk if dereferenced automatically
Relevant threat classes:
- OWASP LLM Top 10: Prompt Injection, Sensitive Information Disclosure, Insecure Output Handling, Excessive Agency
- OWASP API Top 10: SSRF, Unsafe Consumption of APIs, Broken Object Property Level Authorization
- Saltzer & Schroeder: Least Privilege, Fail Securely, Complete Mediation, Secure Defaults
## Follow-Up Implementation Scope
A follow-up implementation issue is justified only for **metadata-only references**.
That implementation must:
- keep assets/work products out of source-bundle body text
- never fetch blob bytes or remote URLs
- redact capability-bearing link fields
- mark references as `metadata_only`
- ship tests proving source bundles/snapshots never contain `contentPath` or raw work-product `url` fields
+136
View File
@@ -0,0 +1,136 @@
# Local Plugin Development
This is the short happy-path guide for developing a Paperclip plugin from a folder on your machine. You will scaffold a plugin, run it in watch mode, install it into a running Paperclip instance from an absolute local path, and edit code with the plugin worker reloading after each rebuild.
For the full alpha surface — manifest fields, capabilities, managed agents/projects/routines, UI slots, scoped API routes — see [`PLUGIN_AUTHORING_GUIDE.md`](./PLUGIN_AUTHORING_GUIDE.md).
## Prerequisites
- Node.js 22+ and `pnpm`.
- A local Paperclip checkout you can run from source. Local plugin installs read source from disk, so the running server must be able to see the path you give it.
## The five steps
```bash
# 1. Start Paperclip locally
pnpm paperclipai run
# 2. Scaffold a plugin outside the Paperclip repo
paperclipai plugin init @acme/hello-plugin --output ~/dev/paperclip-plugins
# 3. Install dependencies and start the watch build
cd ~/dev/paperclip-plugins/hello-plugin
pnpm install
pnpm dev
# 4. In another terminal, install the plugin from its absolute path
paperclipai plugin install ~/dev/paperclip-plugins/hello-plugin
# 5. Confirm it loaded
paperclipai plugin list
paperclipai plugin inspect acme.hello-plugin
```
That's the loop. The rest of this page explains what each step does and what to expect when you edit code.
### 1. Start Paperclip
```bash
pnpm paperclipai run
```
Paperclip listens on `http://127.0.0.1:3100` by default. The CLI talks to that server, so leave it running.
### 2. Scaffold the plugin
```bash
paperclipai plugin init @acme/hello-plugin --output ~/dev/paperclip-plugins
```
This creates `~/dev/paperclip-plugins/hello-plugin/` with `src/manifest.ts`, `src/worker.ts`, `src/ui/index.tsx`, an esbuild watch config, a Vitest config, and a snapshot of `@paperclipai/plugin-sdk` from your local Paperclip checkout. You can run the package and tests without publishing anything to npm.
Useful flags:
- `--template <default|connector|workspace|environment>` — starter shape.
- `--category <connector|workspace|automation|ui|environment>` — manifest category.
- `--display-name`, `--description`, `--author` — manifest metadata.
- `--sdk-path <absolute-path>` — point at a specific `packages/plugins/sdk` checkout if you have more than one.
When `plugin init` finishes, it prints the next four commands literally. You can copy them.
### 3. Install dependencies and run the watch build
```bash
cd ~/dev/paperclip-plugins/hello-plugin
pnpm install
pnpm dev
```
`pnpm dev` runs `esbuild --watch` against the plugin source and emits `dist/manifest.js`, `dist/worker.js`, and `dist/ui/`. Leave it running. Every time you save, esbuild rebuilds the affected output file.
If your plugin has UI and you want a browser-side dev server with hot module replacement during local UI iteration, run `pnpm dev:ui` in a second terminal. It serves `dist/ui/` on `http://127.0.0.1:4177`. This is optional; Paperclip can load the built UI directly from `dist/ui/` without it.
### 4. Install from the absolute path
```bash
paperclipai plugin install ~/dev/paperclip-plugins/hello-plugin
```
The CLI auto-detects local paths (anything that looks absolute, starts with `./`, `../`, or `~`, or resolves to an existing folder relative to the current directory) and sends `{ isLocalPath: true }` to `POST /api/plugins/install` with the resolved absolute path. If you want to be explicit, pass `--local`.
You will see a confirmation like:
```
Installing plugin from local path: /Users/you/dev/paperclip-plugins/hello-plugin
✓ Installed acme.hello-plugin v0.1.0 (ready)
Local plugin installs run trusted local code from your machine.
Keep `pnpm dev` running in /Users/you/dev/paperclip-plugins/hello-plugin;
Paperclip watches rebuilt dist output and reloads the plugin worker.
```
Relative paths are resolved against the current working directory, so `paperclipai plugin install .` from inside the plugin folder works too.
### 5. Inspect
```bash
paperclipai plugin list
paperclipai plugin inspect acme.hello-plugin
```
`list` shows plugin key, status, version, and short error. `inspect` prints the same record with the full last error if there is one. Both accept `--json` if you want to script against them.
## Reload semantics, honestly
Paperclip watches the on-disk plugin package after a local install. The watcher targets the runtime entrypoints declared in the package's `paperclipPlugin` field (`dist/manifest.js`, `dist/worker.js`, `dist/ui/`).
What that means in practice:
- **Worker code:** save a `.ts` file → esbuild rewrites `dist/worker.js` → Paperclip debounces ~500ms and restarts the plugin worker. The next worker call uses the new code. There is no in-process hot module replacement for worker code; it is a worker restart.
- **Manifest:** save `src/manifest.ts``dist/manifest.js` rewrites → the worker restarts and the host re-reads the manifest.
- **Plugin UI:** save a `.tsx` file → esbuild rewrites `dist/ui/` → Paperclip reloads the UI bundle on its next mount. To get HMR during UI iteration, run `pnpm dev:ui` and point at the dev server with `devUiUrl` in your manifest while developing.
- **Without `pnpm dev`:** the watcher only fires on `dist/*` changes. If you stop the watch build, source edits do not reach Paperclip. Restart `pnpm dev` (or run `pnpm build` once) before expecting changes.
- **`node_modules`, `.git`, `.paperclip-sdk`, and other dotfolders are ignored.** Adding a dependency requires the new code to actually be imported and rebuilt before the worker sees it.
The server never compiles plugin source for you. The package's own build scripts own that step.
## Local path plugins vs npm packages
Both go through the same install endpoint, but they mean different things:
- **Local path plugins are trusted local code.** Paperclip executes worker code from disk under the same trust boundary as the rest of the running instance. This is meant for developing or operating a plugin against a checkout you control. There is no signature check, no sandboxing of worker code, and no provenance metadata beyond the path. Do not install local-path plugins you did not write.
- **npm packages are the deployable artifact.** `paperclipai plugin install @acme/plugin-foo` (optionally `--version 1.2.3`) installs from your configured npm registry, version-pins, and produces an install record that other operators can reproduce. Ship plugins this way.
When you are done iterating locally, publish the package and reinstall the npm-package form so the install reflects what you will ship.
## Common things to do next
- **Restart cleanly:** `paperclipai plugin disable <key>` pauses the plugin without removing it. `paperclipai plugin enable <key>` brings it back. `paperclipai plugin uninstall <key>` removes the install record; add `--force` to also purge plugin state and settings.
- **Browse examples:** `paperclipai plugin examples` lists the bundled example plugins that ship with the repo, each with a ready-to-run `paperclipai plugin install <path>` line.
- **Go deeper:** [`PLUGIN_AUTHORING_GUIDE.md`](./PLUGIN_AUTHORING_GUIDE.md) covers worker capabilities, managed agents/projects/routines, plugin database namespaces, scoped API routes, and the shared UI components in `@paperclipai/plugin-sdk/ui`. [`PLUGIN_SPEC.md`](./PLUGIN_SPEC.md) is the longer-form specification, including future ideas that are not yet implemented.
## Troubleshooting
- **`Plugin install returned no plugin record` or `error` status.** Run `paperclipai plugin inspect <key>` for the last error. The most common causes are (1) the plugin has not built yet — run `pnpm dev` or `pnpm build` first, (2) the `paperclipPlugin` entries in `package.json` point at files that do not exist on disk, or (3) the manifest failed validation. The Paperclip server log has the full validation error.
- **Edits do not seem to reload.** Confirm `pnpm dev` is still running and writing to `dist/`. If you renamed entry files, update the `paperclipPlugin.manifest` / `paperclipPlugin.worker` / `paperclipPlugin.ui` fields in `package.json` so the watcher targets them.
- **Worker restarts but UI is stale.** Hard-reload the page. If you want HMR, run `pnpm dev:ui` and set `devUiUrl` in your manifest to `http://127.0.0.1:4177` during development.
- **Path arguments fail on Windows.** Quote paths that contain spaces, and prefer absolute paths over `~`-prefixed paths in non-bash shells.
+352 -29
View File
@@ -4,6 +4,8 @@ This guide describes the current, implemented way to create a Paperclip plugin i
It is intentionally narrower than [PLUGIN_SPEC.md](./PLUGIN_SPEC.md). The spec includes future ideas; this guide only covers the alpha surface that exists now.
> **New to plugins?** Start with the short [Local Plugin Development guide](./LOCAL_PLUGIN_DEVELOPMENT.md) — it walks the CLI happy path (`plugin init` → `pnpm dev` → `plugin install <path>`) end to end. Come back here for the full manifest surface, worker capabilities, and UI components.
## Current reality
- Treat plugin workers and plugin UI as trusted code.
@@ -13,28 +15,20 @@ It is intentionally narrower than [PLUGIN_SPEC.md](./PLUGIN_SPEC.md). The spec i
- Plugin database migrations are restricted to a host-derived plugin namespace.
- Plugin-owned JSON API routes must be declared in the manifest and are mounted
only under `/api/plugins/:pluginId/api/*`.
- There is no host-provided shared React component kit for plugins yet.
- The host provides a small shared React component kit through
`@paperclipai/plugin-sdk/ui`; use it for common Paperclip controls before
building custom versions.
- `ctx.assets` is not supported in the current runtime.
## Scaffold a plugin
Use the scaffold package:
Use the CLI scaffold command:
```bash
pnpm --filter @paperclipai/create-paperclip-plugin build
node packages/plugins/create-paperclip-plugin/dist/index.js @yourscope/plugin-name --output ./packages/plugins/examples
paperclipai plugin init @yourscope/plugin-name --output /absolute/path/to/plugin-repos
```
For a plugin that lives outside the Paperclip repo:
```bash
pnpm --filter @paperclipai/create-paperclip-plugin build
node packages/plugins/create-paperclip-plugin/dist/index.js @yourscope/plugin-name \
--output /absolute/path/to/plugin-repos \
--sdk-path /absolute/path/to/paperclip/packages/plugins/sdk
```
That creates a package with:
That creates `<output>/plugin-name/` with:
- `src/manifest.ts`
- `src/worker.ts`
@@ -45,11 +39,13 @@ That creates a package with:
Inside this monorepo, the scaffold uses `workspace:*` for `@paperclipai/plugin-sdk`.
Outside this monorepo, the scaffold snapshots `@paperclipai/plugin-sdk` from the local Paperclip checkout into a `.paperclip-sdk/` tarball so you can build and test a plugin without publishing anything to npm first.
Outside this monorepo, the scaffold snapshots `@paperclipai/plugin-sdk` from the local Paperclip checkout into a `.paperclip-sdk/` tarball so you can build and test a plugin without publishing anything to npm first. Pass `--sdk-path /absolute/path/to/paperclip/packages/plugins/sdk` if you have more than one Paperclip checkout.
## Recommended local workflow
## Local development workflow
From the generated plugin folder:
See the short [Local Plugin Development guide](./LOCAL_PLUGIN_DEVELOPMENT.md) for the full happy path (`pnpm dev``paperclipai plugin install <absolute-path>``paperclipai plugin list`) and reload semantics.
Minimum verification from the generated plugin folder:
```bash
pnpm install
@@ -58,16 +54,6 @@ pnpm test
pnpm build
```
For local development, install it into Paperclip from an absolute local path through the plugin manager or API. The server supports local filesystem installs and watches local-path plugins for file changes so worker restarts happen automatically after rebuilds.
Example:
```bash
curl -X POST http://127.0.0.1:3100/api/plugins/install \
-H "Content-Type: application/json" \
-d '{"packageName":"/absolute/path/to/your-plugin","isLocalPath":true}'
```
## Supported alpha surface
Worker:
@@ -83,10 +69,11 @@ Worker:
- database namespace via `ctx.db`
- scoped JSON API routes declared with `apiRoutes`
- entities
- projects and project workspaces
- projects, project workspaces, and plugin-managed projects
- companies
- issues, comments, namespaced `plugin:<pluginKey>` origins, blocker relations, checkout assertions, assignment wakeups, and orchestration summaries
- agents and agent sessions
- agents, plugin-managed agents, and agent sessions
- plugin-managed routines
- goals
- data/actions
- streams
@@ -143,6 +130,161 @@ handler. The worker receives sanitized headers, route params, query, parsed JSON
body, actor context, and company id. Do not use plugin routes to claim core
paths; they always remain under `/api/plugins/:pluginId/api/*`.
## Managed Paperclip resources
Plugins that provide durable Paperclip business objects should declare them in
the manifest and let the host create or relink the actual records per company.
Do this for plugin-owned agents, plugin-owned projects, and recurring automation.
Do not hide long-lived work behind private plugin state when it should be visible
to the board, scoped to a company, audited, budgeted, and assigned like normal
Paperclip work.
Use these surfaces:
- Managed agents: declare top-level `agents[]` and require
`agents.managed`. Use this when the plugin provides a named worker the board
should see in the org, budget, pause, invoke, and inspect. Managed agents are
normal Paperclip agents with plugin ownership metadata, not background plugin
workers.
- Managed projects: declare top-level `projects[]` and require
`projects.managed`. Use this when the plugin needs a stable company-scoped
project for its issues, routines, or workspace-oriented UI. Keep plugin work
in a project instead of scattering generated issues across unrelated projects.
- Managed routines: declare top-level `routines[]` and require
`routines.managed`. Use this for scheduled, webhook, or manually triggered
jobs that should create visible Paperclip issues. Prefer managed routines over
plugin `jobs[]` for recurring business work; plugin jobs are for plugin
runtime maintenance that does not need a board-visible task trail.
Managed resources are resolved by stable plugin keys, not hardcoded database
ids. In a worker action or data handler, call `ctx.agents.managed.reconcile()`,
`ctx.projects.managed.reconcile()`, and `ctx.routines.managed.reconcile()` for
the current `companyId`. `reconcile()` creates the missing resource, relinks a
recoverable binding, or returns the existing resource. `reset()` reapplies the
manifest defaults when the operator wants to restore the plugin's suggested
configuration.
Declare dependencies between managed resources with refs. A routine can point
at a managed agent through `assigneeRef` and at a managed project through
`projectRef`. Reconcile the referenced agent and project before reconciling the
routine; if a ref is still missing, the routine resolution reports
`missing_refs` instead of guessing.
```ts
import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
const manifest: PaperclipPluginManifestV1 = {
id: "example.research-plugin",
apiVersion: 1,
version: "0.1.0",
displayName: "Research Plugin",
description: "Creates a managed research agent and scheduled research routine.",
author: "Example",
categories: ["automation"],
capabilities: [
"agents.managed",
"projects.managed",
"routines.managed",
"instance.settings.register",
],
entrypoints: {
worker: "./dist/worker.js",
ui: "./dist/ui",
},
agents: [
{
agentKey: "researcher",
displayName: "Researcher",
role: "research",
title: "Research Agent",
capabilities: "Runs recurring research briefs for this company.",
adapterPreference: ["codex_local", "claude_local", "process"],
instructions: {
content: "Follow the Paperclip heartbeat and produce concise research briefs.",
},
},
],
projects: [
{
projectKey: "research",
displayName: "Research",
description: "Recurring research work created by the Research Plugin.",
status: "in_progress",
},
],
routines: [
{
routineKey: "weekly-brief",
title: "Weekly research brief",
description: "Create a short research brief for the board.",
assigneeRef: { resourceKind: "agent", resourceKey: "researcher" },
projectRef: { resourceKind: "project", resourceKey: "research" },
priority: "medium",
triggers: [
{
kind: "schedule",
label: "Monday morning",
cronExpression: "0 9 * * 1",
timezone: "America/Chicago",
enabled: false,
},
],
},
],
ui: {
slots: [
{
type: "settingsPage",
id: "settings",
displayName: "Research",
exportName: "SettingsPage",
},
],
},
};
export default manifest;
```
In the worker, expose a small setup action or settings-page action that
reconciles the resources for the selected company:
```ts
import { definePlugin } from "@paperclipai/plugin-sdk";
export default definePlugin({
setup(ctx) {
ctx.actions.register("setup-company", async (params) => {
const companyId = String(params.companyId ?? "");
if (!companyId) throw new Error("companyId is required");
const project = await ctx.projects.managed.reconcile("research", companyId);
const agent = await ctx.agents.managed.reconcile("researcher", companyId);
const routine = await ctx.routines.managed.reconcile("weekly-brief", companyId);
return { project, agent, routine };
});
},
});
```
Authoring rules:
- Keep keys stable once published. Renaming `agentKey`, `projectKey`, or
`routineKey` creates a new managed resource from the host's point of view.
- Use managed agents for plugin-provided labor. Use `ctx.agents.invoke()` or
`ctx.agents.sessions` only after you have a real agent id, either selected by
the operator or resolved from `ctx.agents.managed`.
- Use managed routines for recurring or externally triggered work that should
produce tasks. Schedule, webhook, and API triggers are visible routine
triggers, and each run has the normal Paperclip issue/audit trail.
- Use managed projects to keep plugin-generated work organized and to give
project-scoped plugin UI a stable home. For filesystem access inside a
project, still resolve project workspaces through `ctx.projects`.
- Keep defaults conservative. Managed declarations are suggestions owned by the
plugin, but the resulting resources are normal Paperclip records that the
operator can inspect, pause, and adjust.
UI:
- `usePluginData`
@@ -168,6 +310,187 @@ Mount surfaces currently wired in the host include:
- `commentAnnotation`
- `commentContextMenuItem`
## Shared host components
Use shared components from `@paperclipai/plugin-sdk/ui` when the plugin needs a
Paperclip-native control. The host owns the implementation, so plugins inherit
the board's current styling, ordering, recent selections, and dark-mode behavior
without importing `ui/src` internals.
Currently exposed components include:
- `MarkdownBlock` and `MarkdownEditor` for rendered and editable markdown.
- `FileTree` for serializable file and directory trees.
- `IssuesList` for a native company-scoped issue table.
- `AssigneePicker` for the same agent/user selector used in the new issue pane.
Use the controlled `value` format `agent:<id>`, `user:<id>`, or `""`.
- `ProjectPicker` for the same project selector used in the new issue pane.
Use the controlled project id value, or `""` for no project.
- `ManagedRoutinesList` for plugin-owned routine settings pages.
```tsx
import { AssigneePicker, ProjectPicker } from "@paperclipai/plugin-sdk/ui";
export function PluginAssignmentControls({ companyId }: { companyId: string }) {
const [assignee, setAssignee] = useState("");
const [projectId, setProjectId] = useState("");
return (
<>
<AssigneePicker
companyId={companyId}
value={assignee}
onChange={(value) => setAssignee(value)}
/>
<ProjectPicker
companyId={companyId}
value={projectId}
onChange={setProjectId}
/>
</>
);
}
```
## File and path UI
Plugin UI often needs to render a file tree, accept a folder path, or browse a
project workspace. There are three different surfaces for that, and they map to
different trust and data-flow boundaries. Pick the surface that matches the
data the plugin actually has.
### When to use the shared `FileTree`
Use `FileTree` from `@paperclipai/plugin-sdk/ui` whenever the plugin only needs
to render a serializable file/directory list and react to selection or
expand/collapse. The host owns the implementation, so plugin UI inherits the
board's icons, indent, focus ring, and dark-mode styling without importing host
internals.
```tsx
import {
FileTree,
type FileTreeNode,
} from "@paperclipai/plugin-sdk/ui";
const nodes: FileTreeNode[] = [
{ name: "AGENTS.md", path: "AGENTS.md", kind: "file", children: [] },
{
name: "wiki",
path: "wiki",
kind: "dir",
children: [
{ name: "index.md", path: "wiki/index.md", kind: "file", children: [] },
],
},
];
export function WikiTree() {
const [expanded, setExpanded] = useState<Set<string>>(() => new Set(["wiki"]));
const [selected, setSelected] = useState<string | null>(null);
return (
<FileTree
nodes={nodes}
selectedFile={selected}
expandedPaths={expanded}
onSelectFile={(path) => setSelected(path)}
onToggleDir={(path) =>
setExpanded((current) => {
const next = new Set(current);
next.has(path) ? next.delete(path) : next.add(path);
return next;
})
}
/>
);
}
```
Good fits:
- LLM Wiki page navigation in `packages/plugins/plugin-llm-wiki` builds a
`FileTreeNode[]` from worker query results and renders it through `FileTree`.
- The example `plugin-file-browser-example` lazily fetches a directory's
children through a `loadFileList` action when `onToggleDir` fires, then
merges the children into the local tree state — letting the shared component
handle rendering and selection.
Boundary rules:
- Keep the prop surface serializable (`nodes`, `expandedPaths`, `checkedPaths`,
`fileBadges`, `fileTones`). Do not pass arbitrary render functions across the
plugin/host boundary in v1; the supported escape hatches are
`fileBadges` (status pill keyed by path) and `fileTones` (row tone keyed by
path).
- Do not import the host's `FileTree.tsx` or any `ui/src/*` module. The SDK
declaration is the only supported import path for plugin UI.
- The shared `FileTree` is for rendering and selection. Plugin-specific editors,
ingest flows, query forms, and lint runs stay inside the plugin and do not
belong as `FileTree` props.
### When to declare `localFolders`
When the plugin needs operator-configured filesystem roots — typically for
trusted local plugins like wiki tooling — declare `localFolders[]` on the
manifest and add the `local.folders` capability. The host renders a settings
surface for the operator to set the absolute path, validates the path
server-side (containment, symlinks, required files/directories), and exposes
`ctx.localFolders.readText()` and `ctx.localFolders.writeTextAtomic()` in the
worker.
```ts
export const manifest = {
capabilities: ["local.folders"],
localFolders: [
{
folderKey: "content-root",
displayName: "Content root",
access: "readWrite",
requiredDirectories: ["sources", "pages"],
requiredFiles: ["schema.md"],
},
],
};
```
Use this when:
- The data lives outside any project workspace.
- Reads and writes need company-scoped configuration.
- The operator picks the path once in plugin settings and the worker resolves
files relative to that root.
Do not use `localFolders` to grant the UI direct browser-side access to the
filesystem — there is no such capability. The browser still goes through the
worker via `getData` / `performAction`, and the worker only exposes paths it
chose to expose.
### When to keep worker-mediated project workspace browsing
When the data lives inside an existing project workspace, keep the browsing
flow worker-mediated:
- The worker uses `ctx.projects.listWorkspaces()` to resolve the workspace
path, then reads its filesystem with normal Node APIs.
- The plugin UI calls a `getData` handler for the root listing and an action
for lazy children, then renders them through `FileTree`.
- The worker is the only side that touches the disk. The browser receives a
serializable tree and never sees raw absolute paths it can replay.
The example `plugin-file-browser-example` is the reference for this pattern:
the worker registers `fileList` (data) and `loadFileList` (action) over the
same handler, and the UI uses the action for on-toggle directory loading so the
shared `FileTree` stays the rendering surface.
### Mixing surfaces
A single plugin can use more than one of these. The LLM Wiki uses
`localFolders` for its content root, then renders the resulting page list
through `FileTree`. The file browser example uses `ctx.projects.listWorkspaces`
to pick a workspace and renders its on-disk tree through `FileTree` with lazy
loading. Pick the boundary per data source, not per plugin.
## Company routes
Plugins may declare a `page` slot with `routePath` to own a company route like:
+17 -2
View File
@@ -27,7 +27,7 @@ Current limitations to keep in mind:
- Published npm packages are the intended install artifact for deployed plugins.
- The repo example plugins under `packages/plugins/examples/` are development conveniences. They work from a source checkout and should not be assumed to exist in a generic published build unless they are explicitly shipped with that build.
- Dynamic plugin install is not yet cloud-ready for horizontally scaled or ephemeral deployments. There is no shared artifact store, install coordination, or cross-node distribution layer yet.
- The current runtime does not yet ship a real host-provided plugin UI component kit, and it does not support plugin asset uploads/reads. Treat those as future-scope ideas in this spec, not current implementation promises.
- The current runtime ships a small host-provided plugin UI component kit through `@paperclipai/plugin-sdk/ui`, but does not support plugin asset uploads/reads yet. Treat plugin asset APIs as future-scope ideas, not current implementation promises.
- Scoped plugin API routes are JSON-only and must be declared in `apiRoutes`.
They mount under `/api/plugins/:pluginId/api/*`; plugins cannot shadow core
API routes.
@@ -976,13 +976,23 @@ export function DashboardWidget({ context }: PluginWidgetProps) {
The SDK includes a `ui` subpath export that plugin frontends import. This subpath provides:
- **Bridge hooks**: `usePluginData(key, params)`, `usePluginAction(key)`, `useHostContext()`
- **Bridge hooks**: `usePluginData(key, params)`, `usePluginAction(key)`, `useHostContext()`, `useHostNavigation()`
- **Design tokens**: colors, spacing, typography, shadows matching the host theme
- **Shared components**: `MetricCard`, `StatusBadge`, `DataTable`, `LogView`, `ActionBar`, `Spinner`, etc.
- **Type definitions**: `PluginPageProps`, `PluginWidgetProps`, `PluginDetailTabProps`
Plugins are encouraged but not required to use the shared components. A plugin may render entirely custom UI as long as it communicates through the bridge.
`useHostNavigation()` is the supported way for plugin UI to navigate to
Paperclip-internal pages. It exposes `resolveHref(to)`, `navigate(to,
options?)`, and `linkProps(to, options?)`. Plugin links should prefer
`linkProps()` so anchors keep real `href` values for copy-link, modifier-click,
middle-click, and open-in-new-tab behavior while plain left-clicks route through
the host SPA router. The host resolves company-scoped paths against the active
company prefix without double-prefixing already-prefixed paths. Plugin UI should
not use raw same-origin `href`s or `window.location.assign()` for internal
Paperclip navigation because those can force a full document reload.
### 19.0.2 Bundle Isolation
Plugin UI bundles are loaded as standard ES modules, not iframed. This gives plugins full rendering performance and access to the host's design tokens.
@@ -1062,6 +1072,11 @@ The host SDK ships shared components that plugins can import to quickly build UI
| `LogView` | Scrollable log output with timestamps | Webhook deliveries, job output, process logs |
| `JsonTree` | Collapsible JSON tree for debugging | Raw API responses, plugin state inspection |
| `Spinner` | Loading indicator | Data fetch states |
| `FileTree` | Host-styled file/directory tree | Wiki pages, workspace files, import previews |
| `IssuesList` | Host issue list | Plugin pages that need a native issue view |
| `AssigneePicker` | Host assignee picker for agents and board users | Creating issues, assigning routines, filtering work |
| `ProjectPicker` | Host project picker | Creating issues, scoping dashboards, filtering work |
| `ManagedRoutinesList` | Host routine list | Plugin settings pages that manage routines |
Plugins may also use entirely custom components. The shared components exist to reduce boilerplate and keep visual consistency, not to limit what plugins can render.
Binary file not shown.

After

Width:  |  Height:  |  Size: 62 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 65 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 53 KiB

+19 -2
View File
@@ -75,11 +75,28 @@ Fields:
```
PATCH /api/routines/{routineId}
{
"status": "paused"
"status": "paused",
"baseRevisionId": "{latestRevisionId}"
}
```
All fields from create are updatable. **Agents can only update routines assigned to themselves and cannot reassign a routine to another agent.**
All fields from create are updatable. `baseRevisionId` is optional for backward compatibility; when provided, stale values return `409 Conflict` with the current revision id. **Agents can only update routines assigned to themselves and cannot reassign a routine to another agent.**
## List Revisions
```
GET /api/routines/{routineId}/revisions
```
Returns append-only routine definition revisions newest first. Snapshots include routine fields and safe trigger metadata only; webhook secret values and `secretId` are never returned.
## Restore Revision
```
POST /api/routines/{routineId}/revisions/{revisionId}/restore
```
Restores a historical routine definition by creating a new latest revision copied from the selected revision. Historical revision rows, routine run history, and activity history are preserved. If restoring a deleted webhook trigger requires recreating it, the response can include one-time replacement secret material for that trigger.
## Add Trigger
+133
View File
@@ -0,0 +1,133 @@
---
title: Secrets Remote Import
summary: AWS Secrets Manager metadata-only remote import API
---
Remote import lets the board link existing AWS Secrets Manager entries as
Paperclip `external_reference` secrets without copying plaintext into
Paperclip.
Both routes are board-only and company-scoped. The selected provider vault must
belong to the company, use `aws_secrets_manager`, and have a selectable status
(`ready` or `warning`). Disabled, coming-soon, or cross-company vaults are
rejected.
Remote import is an inventory and metadata workflow. Preview calls AWS
`ListSecrets` only and import stores a Paperclip external reference plus
fingerprint/version metadata. Neither route calls `GetSecretValue` or
`BatchGetSecretValue`, requests `SecretString`, requires KMS decrypt, logs raw
remote metadata, or copies secret plaintext into Paperclip.
## Preview Remote AWS Secrets
```
POST /api/companies/{companyId}/secrets/remote-import/preview
{
"providerConfigId": "<aws-vault-uuid>",
"query": "stripe",
"nextToken": "optional-provider-page-token",
"pageSize": 50
}
```
`query` is optional and is sent to AWS as an inventory filter. Treat it as
non-secret metadata because AWS may record list request parameters in
CloudTrail. `nextToken` is an opaque AWS cursor; pass it back unchanged.
`pageSize` is capped at 100.
Response:
```json
{
"providerConfigId": "<aws-vault-uuid>",
"provider": "aws_secrets_manager",
"nextToken": null,
"candidates": [
{
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/stripe",
"remoteName": "prod/stripe",
"name": "prod/stripe",
"key": "prod-stripe",
"providerVersionRef": null,
"providerMetadata": {
"lastChangedDate": "2026-05-06T00:00:00.000Z",
"hasDescription": true
},
"status": "ready",
"importable": true,
"conflicts": []
}
]
}
```
Candidate `status` values:
- `ready`: no existing exact external reference and no name/key collision.
- `duplicate`: an existing secret already has the exact provider `externalRef`.
- `conflict`: the suggested Paperclip `name` or `key` is already in use.
Conflict `type` values are `exact_reference`, `name`, `key`, and
`provider_guardrail`. AWS refs under Paperclip's own managed namespace are
blocked as external references so one company cannot import another company's
Paperclip-managed AWS secret through a broad runtime role.
## Import Remote AWS Secret References
```
POST /api/companies/{companyId}/secrets/remote-import
{
"providerConfigId": "<aws-vault-uuid>",
"secrets": [
{
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/stripe",
"name": "Stripe production key",
"key": "stripe-production-key",
"description": "Stripe key used by production checkout",
"providerVersionRef": null,
"providerMetadata": {
"lastChangedDate": "2026-05-06T00:00:00.000Z",
"hasDescription": true
}
}
]
}
```
The import response is row-level. Ready rows become active
`external_reference` secrets with version metadata only. Exact-reference
duplicates and name/key conflicts are skipped without failing the whole request.
The `secrets` array accepts 1-100 rows, and the backend re-checks duplicates and
conflicts at submit time.
Each row may include an optional Paperclip `description` entered during review;
blank descriptions are stored as `null`. AWS provider descriptions are not
copied into this field.
```json
{
"providerConfigId": "<aws-vault-uuid>",
"provider": "aws_secrets_manager",
"importedCount": 1,
"skippedCount": 1,
"errorCount": 0,
"results": [
{
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/stripe",
"name": "Stripe production key",
"key": "stripe-production-key",
"status": "imported",
"reason": null,
"secretId": "<paperclip-secret-id>",
"conflicts": []
}
]
}
```
Activity logs record aggregate counts and provider/vault ids only, not remote
secret names, ARNs, tags, or values.
Imported references may still fail during a future bound runtime resolution if
the Paperclip runtime role can list the AWS secret but lacks
`secretsmanager:GetSecretValue` or required KMS decrypt permission for that
specific secret.
+361 -4
View File
@@ -25,16 +25,357 @@ POST /api/companies/{companyId}/secrets
The value is encrypted at rest. Only the secret ID and metadata are returned.
## Update Secret
To link a provider-owned secret without copying the value into Paperclip, create
an external-reference secret:
```json
{
"name": "prod-stripe-key",
"provider": "aws_secrets_manager",
"managedMode": "external_reference",
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:paperclip/prod/stripe",
"providerVersionRef": "version-id-or-label"
}
```
Paperclip stores the provider reference and a non-sensitive fingerprint only.
The value is resolved, when the provider is configured, through the server
runtime path that enforces binding context and records access events.
## Provider Health
```
PATCH /api/secrets/{secretId}
GET /api/companies/{companyId}/secret-providers/health
```
Returns provider setup diagnostics, warnings, and local backup guidance. Health
responses must not include secret values or provider credentials.
For `aws_secrets_manager`, an unready health response names the missing
non-secret provider environment variables, the AWS SDK default credential source
expected by the server runtime, and the custody rule that AWS bootstrap
credentials must not be stored in Paperclip `company_secrets`.
The equivalent CLI check is:
```sh
pnpm paperclipai secrets doctor --company-id {companyId}
```
## Provider Vaults
Provider vaults are named, company-scoped configurations that route secret
material to one of the supported provider backends. See the
[secrets deploy guide](/deploy/secrets#provider-vaults) for the operator model
and custody rules.
All routes below require board auth and company access. Mutating routes emit
`secret_provider_config.*` activity-log entries. No route in this surface
returns provider credential values; submitting credential-shaped fields in
`config` is rejected at validation time.
### List Vaults
```
GET /api/companies/{companyId}/secret-provider-configs
```
Returns every vault for the company (including disabled rows for audit), each
with id, provider, displayName, status, isDefault, non-sensitive `config`,
latest health snapshot (`healthStatus`, `healthCheckedAt`, `healthMessage`,
`healthDetails`), `disabledAt`, and audit columns.
### Create Vault
```
POST /api/companies/{companyId}/secret-provider-configs
{
"provider": "aws_secrets_manager",
"displayName": "Prod US-East",
"isDefault": true,
"config": {
"region": "us-east-1",
"namespace": "paperclip",
"secretNamePrefix": "paperclip",
"kmsKeyId": "arn:aws:kms:us-east-1:123456789012:key/abcd-...",
"environmentTag": "production"
}
}
```
Per-provider `config` shapes:
- `local_encrypted`: optional `backupReminderAcknowledged: boolean`.
- `aws_secrets_manager`: required `region`; optional `namespace`,
`secretNamePrefix`, `kmsKeyId`, `ownerTag`, `environmentTag`.
- `gcp_secret_manager` (coming soon): optional `projectId`, `location`,
`namespace`, `secretNamePrefix`.
- `vault` (coming soon): optional origin-only HTTPS `address`, `namespace`,
`mountPath`, `secretPathPrefix`. `address` values with embedded credentials,
paths, query strings, or fragments are rejected.
`status` defaults to `ready` for `local_encrypted` and `aws_secrets_manager`,
and to `coming_soon` for `gcp_secret_manager` and `vault`. Coming-soon and
disabled vaults cannot be marked `isDefault`. Setting `isDefault: true` clears
the previous default for the same provider in the same transaction.
### Get Vault
```
GET /api/secret-provider-configs/{id}
```
### Update Vault
```
PATCH /api/secret-provider-configs/{id}
{
"displayName": "Prod US-East-2",
"config": {
"region": "us-east-2",
"kmsKeyId": "arn:aws:kms:us-east-2:123456789012:key/abcd-..."
}
}
```
`config` is replaced wholesale on update — pass the full provider config
payload, not a partial diff. Status transitions for `gcp_secret_manager` and
`vault` are constrained to `coming_soon` and `disabled` until their runtime
modules ship.
### Disable Vault
```
DELETE /api/secret-provider-configs/{id}
```
Soft-deletes the vault: status flips to `disabled`, `isDefault` clears, and
`disabledAt` is stamped. Disabled vaults remain in `GET` results for audit
purposes but are no longer offered in the secret create/rotate flow.
### Set Default
```
POST /api/secret-provider-configs/{id}/default
```
Marks the target vault as the default for its provider family and clears the
previous default. Returns 422 when the target is `coming_soon` or `disabled`.
### Run Health Check
```
POST /api/secret-provider-configs/{id}/health
```
Runs a provider-specific health probe and persists the result on the vault.
Response shape:
```json
{
"configId": "<uuid>",
"provider": "aws_secrets_manager",
"status": "ready" | "warning" | "error" | "coming_soon" | "disabled",
"message": "Provider vault is ready to handle managed writes",
"details": {
"code": "provider_ready",
"message": "...",
"guidance": ["..."]
},
"checkedAt": "2026-05-06T14:00:00.000Z"
}
```
Health responses never include provider credentials or secret values. For AWS
vaults, `details.guidance` may include missing non-secret env names and the
expected AWS SDK credential source; coming-soon vaults always return
`status: "coming_soon"` with `code: "runtime_locked"` and never call into
provider modules.
### Selecting A Vault When Creating Or Rotating Secrets
`POST /api/companies/{companyId}/secrets` and
`POST /api/secrets/{secretId}/rotate` both accept an optional
`providerConfigId` field that pins the secret to a specific vault. When
omitted (or null), the operation runs through the deployment-level provider
configuration — the same path existing installs already use. The board UI
preselects the company's default vault for the chosen provider before
submitting, so callers should usually send an explicit `providerConfigId`.
Coming-soon and disabled vaults are rejected with a 422; a vault that does not
match the secret's provider is rejected the same way.
```json
POST /api/companies/{companyId}/secrets
{
"name": "prod-stripe-key",
"provider": "aws_secrets_manager",
"providerConfigId": "<vault-uuid>",
"managedMode": "external_reference",
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:paperclip/prod/stripe"
}
```
### Response Redaction Rules
Every route in this surface enforces the same redaction contract:
- Secret values are never returned. The board UI never has a "reveal value"
affordance; resolution happens server-side at runtime under a binding.
- Provider credential values are never accepted, stored, returned, logged, or
echoed in error messages. Submitting credential-shaped fields fails
validation with a non-leaking error.
- Activity log entries record vault id, provider, displayName, status, and
isDefault transitions — never `config` payloads or health detail bodies.
## Remote Import From AWS Secrets Manager
Remote import links existing AWS Secrets Manager entries into Paperclip as
`external_reference` secrets. Import stores provider reference metadata only; it
does not copy the remote secret plaintext into Paperclip.
The routes are board-only and company-scoped. `providerConfigId` must point to
a same-company AWS provider vault with status `ready` or `warning`. Disabled,
coming-soon, non-AWS, and cross-company vaults are rejected. Imported secrets
resolve later through the selected vault, so runtime reads still need
`secretsmanager:GetSecretValue` and any required KMS decrypt permission on the
selected external secret.
### Preview Remote Import Candidates
```
POST /api/companies/{companyId}/secrets/remote-import/preview
{
"providerConfigId": "<aws-vault-uuid>",
"query": "stripe",
"nextToken": "opaque-provider-token",
"pageSize": 50
}
```
`query` is optional and is passed to AWS Secrets Manager inventory filtering.
Treat it as non-secret metadata because AWS may record list request parameters
in CloudTrail. `nextToken` is an opaque AWS cursor; callers must pass it back
unchanged and must not synthesize offsets. `pageSize` is optional, defaults to
50 in the UI, and is capped at 100.
Preview uses AWS `ListSecrets` only. It must not call `GetSecretValue` or
`BatchGetSecretValue`, must not request `SecretString`, and must not require KMS
decrypt. The response contains sanitized metadata for display and conflict
decisions:
```json
{
"providerConfigId": "<aws-vault-uuid>",
"provider": "aws_secrets_manager",
"nextToken": null,
"candidates": [
{
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/stripe",
"remoteName": "prod/stripe",
"name": "prod/stripe",
"key": "prod-stripe",
"providerVersionRef": null,
"providerMetadata": {
"createdDate": "2026-05-06T00:00:00.000Z",
"lastChangedDate": "2026-05-06T00:00:00.000Z",
"hasDescription": true,
"hasKmsKey": true,
"tagCount": 3
},
"status": "ready",
"importable": true,
"conflicts": []
}
]
}
```
Candidate statuses:
- `ready`: the row can be selected for import.
- `duplicate`: a Paperclip secret already links the same canonical provider
reference for the same provider vault.
- `conflict`: the row has a name/key collision or provider guardrail failure.
Conflict types are `exact_reference`, `name`, `key`, and
`provider_guardrail`. AWS refs under Paperclip's own managed namespace are
blocked as external references; use the Paperclip-managed secret flow for those
resources instead.
### Import Selected Remote References
```
POST /api/companies/{companyId}/secrets/remote-import
{
"providerConfigId": "<aws-vault-uuid>",
"secrets": [
{
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/stripe",
"name": "Stripe production key",
"key": "stripe-production-key",
"description": "Stripe key used by production checkout",
"providerVersionRef": null,
"providerMetadata": {
"createdDate": "2026-05-06T00:00:00.000Z"
}
}
]
}
```
The `secrets` array accepts 1-100 rows. Each row may override the suggested
Paperclip `name`, `key`, optional Paperclip `description`,
`providerVersionRef`, and sanitized `providerMetadata`. Blank descriptions are
stored as `null`; AWS provider descriptions are not copied into Paperclip
descriptions. The backend re-checks duplicate refs and name/key conflicts at
submit time; a stale preview does not bypass those checks.
The import response is row-level:
```json
{
"providerConfigId": "<aws-vault-uuid>",
"provider": "aws_secrets_manager",
"importedCount": 1,
"skippedCount": 1,
"errorCount": 0,
"results": [
{
"externalRef": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/stripe",
"name": "Stripe production key",
"key": "stripe-production-key",
"status": "imported",
"reason": null,
"secretId": "<paperclip-secret-id>",
"conflicts": []
}
]
}
```
Row statuses:
- `imported`: Paperclip created an active `external_reference` secret and one
metadata-only version row.
- `skipped`: the row had an exact-reference duplicate or name/key conflict.
- `error`: the provider rejected the reference or the row failed validation.
Activity logs for preview/import store aggregate counts, provider id, and vault
id only. They must not store remote secret names, ARNs, descriptions, tags,
plaintext values, provider credentials, or raw AWS error blobs.
## Rotate Secret
```
POST /api/secrets/{secretId}/rotate
{
"value": "sk-ant-new-value..."
}
```
Creates a new version of the secret. Agents referencing `"version": "latest"` automatically get the new value on next heartbeat.
Creates a new version of the secret. Agents referencing `"version": "latest"`
automatically get the new value on next heartbeat. Pin to a specific version
when a bad `latest` rollout would affect many agents at once.
## Using Secrets in Agent Config
@@ -52,4 +393,20 @@ Reference secrets in agent adapter config instead of inline values:
}
```
The server resolves and decrypts secret references at runtime, injecting the real value into the agent process environment.
The server resolves and decrypts secret references at runtime, injecting the
real value into the agent process environment. Paperclip's custody guarantees
end at injection: the agent process can read, log, or forward the value, so
treat any secret bound to an agent as exposed to that agent. See the custody
boundaries note in the [secrets deploy guide](/deploy/secrets#custody-boundaries).
## Portability
Company export/import APIs represent agent and project environment requirements
as declarations in the package manifest. Exports omit secret values, secret IDs,
provider references, and encrypted provider material. Use:
```sh
pnpm paperclipai secrets declarations --company-id {companyId}
```
to inspect the declarations that an export would emit before moving a package.
Binary file not shown.

After

Width:  |  Height:  |  Size: 258 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 321 KiB

+10
View File
@@ -57,6 +57,16 @@ pnpm paperclipai context set --api-key-env-var-name PAPERCLIP_API_KEY
export PAPERCLIP_API_KEY=...
```
Secret operations are available under `paperclipai secrets`:
```sh
pnpm paperclipai secrets declarations --company-id <company-id> --kind secret
pnpm paperclipai secrets create --company-id <company-id> --name anthropic-api-key --value-env ANTHROPIC_API_KEY
pnpm paperclipai secrets link --company-id <company-id> --name prod-stripe-key --provider aws_secrets_manager --external-ref <provider-ref>
pnpm paperclipai secrets doctor --company-id <company-id>
pnpm paperclipai secrets migrate-inline-env --company-id <company-id> --apply
```
Context is stored at `~/.paperclip/context.json`.
## Command Categories
+9 -1
View File
@@ -67,7 +67,8 @@ Validates:
- Server configuration
- Database connectivity
- Secrets adapter configuration
- Secrets adapter configuration, including AWS Secrets Manager non-secret env
config when selected
- Storage configuration
- Missing key files
@@ -81,6 +82,13 @@ pnpm paperclipai configure --section secrets
pnpm paperclipai configure --section storage
```
`--section secrets` updates the deployment-level provider used as the fallback
for secrets that do not target a specific company vault. Per-company provider
vaults (named instances, default vault selection, multiple vaults per provider,
coming-soon GCP/Vault) live in the board UI under
`Company Settings → Secrets → Provider vaults` and the
`/api/companies/{companyId}/secret-provider-configs` API.
## `paperclipai env`
Show resolved environment configuration:
+318
View File
@@ -5,6 +5,52 @@ summary: Master key, encryption, and strict mode
Paperclip encrypts secrets at rest using a local master key. Agent environment variables that contain sensitive values (API keys, tokens) are stored as encrypted secret references.
## Custody Boundaries
Paperclip protects secret values up to the moment they are handed to an agent
or workload:
- Storage: values are encrypted at rest by the active provider. The local
provider keeps them encrypted with a key that never leaves the host.
- Transport: values are decrypted server-side and injected into the agent
process environment, SSH command env, sandbox driver, or HTTP request
immediately before the call. Paperclip does not return decrypted values to
the board UI.
- Audit: each resolution records a non-sensitive event (secret id, version,
provider id, consumer, outcome) without the value or provider credentials.
Once a value reaches the consuming process, Paperclip can no longer guarantee
secrecy. The agent (or sandbox, or remote host) can read the value, write it to
its own logs or transcript, or pass it to downstream tools. Treat any secret
you bind to an agent as exposed to that agent. Limit blast radius with bindings
(only bind what each agent needs), short-lived provider credentials where the
provider supports them, and rotation when an agent transcript or downstream
system might have captured a value.
## Using Secrets In Runs
Creating a company secret does not automatically create an environment variable.
You use a secret by binding it into an agent, project, environment, or plugin
configuration field that supports secret references.
For agent and project environment variables:
1. Create or link the secret in `Company Settings > Secrets`.
2. Open the agent's `Environment variables` field, or the project's `Env`
field.
3. Add the environment variable key the process expects, such as `GH_TOKEN` or
`OPENAI_API_KEY`.
4. Set the row source to `Secret`, select the stored secret, and choose either
`latest` or a pinned version.
At runtime, Paperclip resolves the selected secret server-side and injects the
resolved value under the env key from the binding row. The stored secret name
can be human-readable; the binding key is what the agent process receives.
Project env applies to every issue run in that project. When a project env key
matches an agent env key, the project value wins before Paperclip injects its
own `PAPERCLIP_*` runtime variables.
## Default Provider: `local_encrypted`
Secrets are encrypted with a local master key stored at:
@@ -14,6 +60,13 @@ Secrets are encrypted with a local master key stored at:
```
This key is auto-created during onboarding. The key never leaves your machine.
Paperclip best-effort enforces `0600` permissions when it creates or loads the
key file. `paperclipai doctor` and the provider health API warn when the file is
readable by group or other users.
Back up the key file together with database backups. A database backup without
the key cannot decrypt local secrets, and a key backup without the database
metadata is not enough to restore named secret versions.
## Configuration
@@ -35,6 +88,7 @@ Validate secrets config:
```sh
pnpm paperclipai doctor
pnpm paperclipai secrets doctor --company-id <company-id>
```
### Environment Overrides
@@ -55,15 +109,279 @@ PAPERCLIP_SECRETS_STRICT_MODE=true
Recommended for any deployment beyond local trusted.
Authenticated deployments default strict mode on unless explicitly overridden by
configuration or `PAPERCLIP_SECRETS_STRICT_MODE=false`.
## External References
Provider-owned secrets can be linked without copying values into Paperclip by
using `managedMode: "external_reference"` plus a provider `externalRef`.
Paperclip stores metadata and a non-sensitive fingerprint, never the value.
Runtime resolution remains server-side and binding-enforced.
The built-in AWS, GCP, and Vault provider IDs currently accept external
reference metadata, but runtime resolution requires provider configuration in the
deployment. Their provider health check reports this as a warning until
configured.
For hosted Paperclip Cloud on AWS, see the AWS Secrets Manager operational
contract — required env vars, IAM/KMS scoping, naming and tag conventions, and
backup/rotation/incident runbooks — in `doc/SECRETS-AWS-PROVIDER.md`.
## Provider Vaults
A *provider vault* is a named, company-scoped configuration that points secret
material at one of the supported provider backends. Each company can configure
multiple vaults, including more than one vault per provider family, and pick a
default vault per family for new secret operations. Existing secrets created
before any vault was configured continue to resolve through the deployment-level
default provider — no migration is required.
### Where to configure
Open `Company Settings → Secrets` in the board UI and switch to the
`Provider vaults` tab. From there you can:
- Create a vault for any supported provider family.
- Edit the non-secret config of an existing vault.
- Set one ready vault per provider family as the company default.
- Disable a vault (a soft delete that keeps audit history).
- Run a health check against a vault and read the latest result inline.
The same operations are exposed under
`/api/companies/{companyId}/secret-provider-configs` for automation. See the
[secrets API reference](/api/secrets#provider-vaults) for the full route table.
### Custody Of Provider Credentials
Provider vaults intentionally store only **non-sensitive** configuration:
region, project id, namespace, prefix, KMS key id, mount path, address, and
similar routing metadata. The API, UI, and activity log never accept, return,
or display provider credential values. Submitting fields with names like
`accessKeyId`, `secretAccessKey`, `token`, `password`, `serviceAccountJson`,
`privateKey`, `keyFile`, `unsealKey`, or any common credential alias is rejected
at validation time.
That keeps the bootstrap rule from the AWS provider applicable to every
provider family: **provider credentials live in deployment infrastructure
identity, not in Paperclip company secrets**. Allowed credential sources are
workload identity attached to the Paperclip server (instance profile, IRSA, ECS
task role), `AWS_PROFILE` / SSO / shared config for local runs, an orchestrator
secret store that boots the server, or short-lived shell credentials for local
development. Do not paste long-lived API keys into the vault config.
### Vault Status
Each vault carries a status that drives what the runtime can do with it:
| Status | Meaning |
|---------------|-----------------------------------------------------------------------------------------------|
| `ready` | Selectable for create/rotate/resolve. Eligible to be the default. |
| `warning` | Saved config exists but health needs attention (for example missing AWS env). Still selectable. |
| `coming_soon` | Visible and editable as draft metadata, but locked out of all runtime operations. |
| `disabled` | Soft-deleted. Hidden from the secret create/rotate flow. |
`gcp_secret_manager` and `vault` are pinned to `coming_soon` until their
runtime modules ship. The settings UI lets you save draft configuration for
those providers (and surfaces them on the vault list), but secret create,
rotate, and resolve calls that target a coming-soon vault fail with a clear
runtime-locked error.
### Default Vault Behavior
A company can mark **one** ready (or warning) vault per provider family as the
default. The secret create and rotate dialogs preselect the default vault for
the chosen provider so operators don't have to remember which vault to pick.
Coming-soon and disabled vaults cannot be marked default; attempting to do so
returns a validation error. Setting a new default automatically clears the
previous default for that provider.
If a secret is created without any `providerConfigId` (no vaults exist yet, or
the operator clears the selector), runtime resolution falls back to the
deployment-level provider configuration — the same path existing installs use.
This keeps secrets created before any provider vault was configured working
without migration. Picking the default in the UI is an explicit selection, not
a runtime fallback: the create call still sends an explicit `providerConfigId`.
### Multiple Vaults Per Provider
Multiple vaults from the same provider family are first-class. Common patterns:
- Two AWS vaults pointing at different regions or KMS keys for environment
separation.
- A staging Vault address alongside a production address.
- A dedicated GCP project for a single product line while the rest of the
company uses another.
Each vault has its own display name, status, default flag, and health record.
Operators choose the vault explicitly when creating or rotating a secret; the
default vault is preselected to avoid accidental routing to the wrong account.
### Per-Vault Health Checks
`POST /api/secret-provider-configs/{id}/health` runs a provider-specific health
probe and stores the result on the vault row. The settings UI exposes the same
action and renders the result inline. Health responses include a status,
operator-facing message, and structured guidance (such as missing env var
names, expected credential sources, and backup reminders). They never include
provider credentials or secret values. Coming-soon vaults always return a
`runtime_locked` health code and never call into provider modules.
### Provider-Specific Notes
**Local encrypted vaults** wrap the existing `local_encrypted` provider. The
master key path and rotation guidance described above still applies. A local
vault config is mostly bookkeeping plus an explicit acknowledgement that the
key file is backed up alongside the database.
**AWS Secrets Manager vaults** read the per-vault `region`, `namespace`,
`secretNamePrefix`, `kmsKeyId`, `ownerTag`, and `environmentTag` to route
managed writes and external-reference reads. The vault config supplements (and
can override) the deployment-level `PAPERCLIP_SECRETS_AWS_*` env. Bootstrap
credentials still come from the AWS SDK default credential chain — see
`doc/SECRETS-AWS-PROVIDER.md` for the full IAM and KMS contract.
**GCP Secret Manager** and **HashiCorp Vault** vaults are coming soon. You can
save draft `projectId`, `location`, `namespace`, `address`, and `mountPath`
metadata so the company is ready to flip them on when the provider modules
ship. Vault `address` values must be origin-only `http(s)://host[:port]` URLs;
addresses with embedded credentials, paths, query strings, or fragments are
rejected.
### Remote Import From AWS Vaults
AWS provider vaults can import existing AWS Secrets Manager entries as
Paperclip `external_reference` secrets. This is a metadata-only link: Paperclip
stores the AWS ARN/path, a fingerprint/version reference, and binding metadata.
It does not read, copy, store, log, or display the remote plaintext secret
value during preview or import.
Operator flow in the board UI:
1. Open `Company Settings -> Secrets`.
2. Confirm at least one AWS provider vault is `ready` or `warning`.
3. In the `Secrets` tab, choose `Import from vault`.
4. Select an AWS vault, search the remote inventory, and load more pages as
needed.
5. Check the rows to import, review/edit the Paperclip name and key, then
submit.
6. Review the result summary for created, skipped, and failed rows.
The preview list is intentionally paged and search-first. AWS accounts can have
large per-Region inventories, and `ListSecrets` returns opaque `NextToken`
cursors. Do not expect Paperclip to crawl a whole account in the background;
load pages deliberately and retry throttled requests with backoff.
Remote import exposes AWS secret metadata visible to the Paperclip runtime
role, including names/ARNs and safe derived fields such as dates, whether a
description or KMS key exists, and tag count. Treat names, ARNs, tags, and
search text as operational metadata that may be sensitive. The API and activity
log must not store raw descriptions, tags, plaintext values, provider
credentials, or raw AWS error blobs.
Required AWS posture:
- Preview needs optional `secretsmanager:ListSecrets` permission on
`Resource: "*"`. AWS does not support constraining `ListSecrets` to
individual secret ARNs or tags as an IAM boundary.
- Preview/import must not call `secretsmanager:GetSecretValue`,
`secretsmanager:BatchGetSecretValue`, or KMS decrypt.
- Runtime resolution of an imported reference still needs
`secretsmanager:GetSecretValue` on the selected external ARN/path and KMS
decrypt when that secret uses a customer-managed key.
- Keep managed create/rotate/delete permissions scoped to the Paperclip
deployment prefix. Do not broaden managed write/delete permissions just
because import inventory is enabled.
Safe scoping comes from deployment posture rather than AWS list filtering:
dedicated Paperclip runtime roles per environment/account, AWS vaults pointed at
the intended account and Region, import-enabled roles only where inventory
exposure is acceptable, and board-only access to the import routes. Tags and
name filters are search aids, not a permission model.
If import preview fails:
- `AccessDenied` or `not authorized`: the runtime role is missing
`secretsmanager:ListSecrets`; add the optional inventory statement only if
remote import should be enabled for that vault.
- Throttling: retry after a short delay and narrow the search before loading
more pages.
- Invalid cursor: refresh the preview; AWS `NextToken` values are opaque and
can expire or become stale.
- Runtime resolution failure after import: verify `GetSecretValue` and KMS
decrypt scope for the selected external secret. Being visible in inventory is
not proof that the runtime role can read the value.
### Backup And Restore
Each provider family has a different backup story:
- `local_encrypted`: back up the local master key file and the Paperclip
database together. Either alone is not enough to restore the encrypted
values, and the vault row only records the path and acknowledgement, not the
key bytes.
- `aws_secrets_manager`: back up Paperclip's database for vault metadata
(vault id, region, prefix, KMS key id, default flag, bindings, version
pointers). The actual secret values live in AWS Secrets Manager under the
configured prefix; restore by pointing the same Paperclip company at the
same AWS namespace and confirming the runtime role still has
`GetSecretValue` plus KMS decrypt. The full restore checklist lives in
`doc/SECRETS-AWS-PROVIDER.md`.
- `gcp_secret_manager` and `vault`: while these are coming soon, only the
draft vault config exists in Paperclip. Database backups capture it. There
is nothing to restore on the provider side until runtime support lands.
### AWS Provider Bootstrap Boundary
The AWS Secrets Manager provider cannot bootstrap itself from Paperclip
`company_secrets`. Its initial AWS access must be present before the server can
create or resolve AWS-backed company secrets, regardless of whether you use the
deployment-level default or a per-company vault.
For Paperclip Cloud, provision the server runtime IAM role/workload identity,
KMS key, deployment prefix, and non-secret `PAPERCLIP_SECRETS_AWS_*` environment
configuration before enabling AWS-backed secrets in the board UI. For
self-hosted and local runs, use the AWS SDK default credential chain: instance
profile, ECS task role, EKS IRSA/OIDC web identity, AWS SSO/shared config via
`AWS_PROFILE`, or short-lived shell credentials for local development.
Do not store AWS root credentials or long-lived IAM user access keys in
Paperclip secrets. Bootstrap material belongs in infrastructure IAM/workload
identity, the process environment, an AWS profile, or the orchestrator secret
store.
## Migrating Inline Secrets
If you have existing agents with inline API keys in their config, migrate them to encrypted secret refs:
```sh
pnpm paperclipai secrets migrate-inline-env --company-id <company-id>
pnpm paperclipai secrets migrate-inline-env --company-id <company-id> --apply
# low-level script for direct database maintenance
pnpm secrets:migrate-inline-env # dry run
pnpm secrets:migrate-inline-env --apply # apply migration
```
Use the CLI command for normal operations because it goes through the Paperclip
API, creates or rotates secret records, and updates agent env bindings with
audit logging.
## Portable Declarations
Company exports include only environment declarations. They do not include
secret IDs, provider references, encrypted material, or plaintext values.
```sh
pnpm paperclipai secrets declarations --company-id <company-id> --kind secret
```
Before importing a package into another instance, use those declarations to
create local values or link hosted provider references in the target deployment.
For hosted providers such as AWS Secrets Manager, the hosted provider remains
the value custodian; Paperclip stores metadata and provider version references,
not provider credentials or plaintext secret values.
## Secret References in Agent Config
Agent environment variables use secret references:
Binary file not shown.

After

Width:  |  Height:  |  Size: 182 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 108 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 191 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 121 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 183 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 105 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 188 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 106 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 335 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 151 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 88 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 87 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 41 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 40 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 36 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 36 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 180 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 701 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 316 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 694 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 546 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 701 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 316 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 694 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 62 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 118 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 118 KiB

+8 -1
View File
@@ -15,9 +15,12 @@
"build-storybook": "pnpm --filter @paperclipai/ui build-storybook",
"build": "pnpm run preflight:workspace-links && pnpm -r build",
"typecheck": "pnpm run preflight:workspace-links && pnpm -r typecheck",
"typecheck:build-gaps": "pnpm run preflight:workspace-links && node scripts/run-typecheck-build-gaps.mjs",
"test": "pnpm run test:run",
"test:watch": "pnpm run preflight:workspace-links && vitest",
"test:run": "pnpm run preflight:workspace-links && node scripts/run-vitest-stable.mjs",
"test:run:general": "pnpm run preflight:workspace-links && pnpm --filter @paperclipai/plugin-sdk build && node scripts/run-vitest-stable.mjs --mode general",
"test:run:serialized": "pnpm run preflight:workspace-links && pnpm --filter @paperclipai/plugin-sdk build && node scripts/run-vitest-stable.mjs --mode serialized",
"db:generate": "pnpm --filter @paperclipai/db generate",
"db:migrate": "pnpm --filter @paperclipai/db migrate",
"issue-references:backfill": "pnpm run preflight:workspace-links && tsx scripts/backfill-issue-reference-mentions.ts",
@@ -30,18 +33,22 @@
"release:stable": "./scripts/release.sh stable",
"release:github": "./scripts/create-github-release.sh",
"release:rollback": "./scripts/rollback-latest.sh",
"release:bootstrap-package": "node scripts/bootstrap-npm-package.mjs",
"check:tokens": "node scripts/check-forbidden-tokens.mjs",
"docs:dev": "cd docs && npx mintlify dev",
"smoke:openclaw-join": "./scripts/smoke/openclaw-join.sh",
"smoke:openclaw-docker-ui": "./scripts/smoke/openclaw-docker-ui.sh",
"smoke:openclaw-sse-standalone": "./scripts/smoke/openclaw-sse-standalone.sh",
"smoke:terminal-bench-loop-skill": "node scripts/smoke/terminal-bench-loop-skill-smoke.mjs",
"test:release-registry": "node --test scripts/verify-release-registry-state.test.mjs scripts/release-package-map.test.mjs scripts/check-release-package-bootstrap.test.mjs",
"test:e2e": "npx playwright test --config tests/e2e/playwright.config.ts",
"test:e2e:headed": "npx playwright test --config tests/e2e/playwright.config.ts --headed",
"test:e2e:multiuser-authenticated": "npx playwright test --config tests/e2e/playwright-multiuser-authenticated.config.ts",
"evals:smoke": "cd evals/promptfoo && npx promptfoo@0.103.3 eval",
"test:release-smoke": "npx playwright test --config tests/release-smoke/playwright.config.ts",
"test:release-smoke:headed": "npx playwright test --config tests/release-smoke/playwright.config.ts --headed",
"metrics:paperclip-commits": "tsx scripts/paperclip-commit-metrics.ts"
"metrics:paperclip-commits": "tsx scripts/paperclip-commit-metrics.ts",
"perf:issue-chat-long-thread": "node scripts/measure-issue-chat-long-thread.mjs"
},
"devDependencies": {
"@playwright/test": "^1.58.2",
@@ -0,0 +1,220 @@
import { mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { execFile as execFileCallback } from "node:child_process";
import { promisify } from "node:util";
import { afterEach, describe, expect, it } from "vitest";
import { prepareCommandManagedRuntime } from "./command-managed-runtime.js";
import type { RunProcessResult } from "./server-utils.js";
const execFile = promisify(execFileCallback);
describe("command managed runtime", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("keeps the runtime overlay out of sandbox workspace sync by default", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-command-runtime-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(path.join(localWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "local workspace\n", "utf8");
await writeFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "{\"keep\":true}\n", "utf8");
const calls: Array<{
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}> = [];
const runner = {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}): Promise<RunProcessResult> => {
calls.push({ ...input });
const startedAt = new Date().toISOString();
const env = {
...process.env,
...input.env,
};
const command =
input.command === "sh" ? "/bin/sh" : input.command === "bash" ? "/bin/bash" : input.command;
const args = [...(input.args ?? [])];
if (
input.stdin != null &&
(input.command === "sh" || input.command === "bash") &&
(args[0] === "-c" || args[0] === "-lc") &&
typeof args[1] === "string"
) {
env.PAPERCLIP_TEST_STDIN = input.stdin;
args[1] = `printf '%s' \"$PAPERCLIP_TEST_STDIN\" | (${args[1]})`;
}
try {
const result = await execFile(command, args, {
cwd: input.cwd,
env,
maxBuffer: 32 * 1024 * 1024,
timeout: input.timeoutMs,
});
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const err = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
code?: string | number | null;
signal?: NodeJS.Signals | null;
killed?: boolean;
};
return {
exitCode: typeof err.code === "number" ? err.code : null,
signal: err.signal ?? null,
timedOut: Boolean(err.killed && input.timeoutMs),
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
pid: null,
startedAt,
};
}
},
};
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "claude",
workspaceLocalDir: localWorkspaceDir,
});
await expect(readFile(path.join(remoteWorkspaceDir, "README.md"), "utf8")).resolves.toBe("local workspace\n");
await expect(readFile(path.join(remoteWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).rejects
.toMatchObject({ code: "ENOENT" });
expect(calls.every((call) => call.stdin == null)).toBe(true);
await mkdir(path.join(remoteWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await writeFile(path.join(remoteWorkspaceDir, "README.md"), "remote workspace\n", "utf8");
await writeFile(path.join(remoteWorkspaceDir, ".paperclip-runtime", "remote-state.json"), "{\"remote\":true}\n", "utf8");
await prepared.restoreWorkspace();
await expect(readFile(path.join(localWorkspaceDir, "README.md"), "utf8")).resolves.toBe("remote workspace\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).resolves
.toBe("{\"keep\":true}\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "remote-state.json"), "utf8")).rejects
.toMatchObject({ code: "ENOENT" });
expect(calls.every((call) => call.stdin == null)).toBe(true);
});
it("runs setup commands from a stable root cwd when staging into a nested remote workspace dir", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-command-runtime-nested-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteBaseDir = path.join(rootDir, "remote-base");
const remoteWorkspaceDir = path.join(remoteBaseDir, ".paperclip-runtime", "runs", "test", "workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteBaseDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "local workspace\n", "utf8");
const calls: Array<{
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}> = [];
const runner = {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}): Promise<RunProcessResult> => {
calls.push({ ...input });
const startedAt = new Date().toISOString();
try {
const result = await execFile(input.command === "sh" ? "/bin/sh" : input.command, input.args ?? [], {
cwd: input.cwd,
env: {
...process.env,
...input.env,
},
maxBuffer: 32 * 1024 * 1024,
timeout: input.timeoutMs,
});
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const err = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
code?: string | number | null;
signal?: NodeJS.Signals | null;
killed?: boolean;
};
return {
exitCode: typeof err.code === "number" ? err.code : null,
signal: err.signal ?? null,
timedOut: Boolean(err.killed && input.timeoutMs),
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
pid: null,
startedAt,
};
}
},
};
await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteBaseDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
workspaceRemoteDir: remoteWorkspaceDir,
});
expect(calls.length).toBeGreaterThan(0);
expect(calls.every((call) => call.cwd === "/")).toBe(true);
await expect(readFile(path.join(remoteWorkspaceDir, "README.md"), "utf8")).resolves.toBe("local workspace\n");
});
});
@@ -6,6 +6,7 @@ import {
type SandboxManagedRuntimeClient,
type SandboxRemoteExecutionSpec,
} from "./sandbox-managed-runtime.js";
import { preferredShellForSandbox, shellCommandArgs } from "./sandbox-shell.js";
import type { RunProcessResult } from "./server-utils.js";
export interface CommandManagedRuntimeRunner {
@@ -23,10 +24,10 @@ export interface CommandManagedRuntimeRunner {
export interface CommandManagedRuntimeSpec {
providerKey?: string | null;
shellCommand?: "bash" | "sh" | null;
leaseId?: string | null;
remoteCwd: string;
timeoutMs?: number | null;
paperclipApiUrl?: string | null;
}
export type CommandManagedRuntimeAsset = SandboxManagedRuntimeAsset;
@@ -35,6 +36,12 @@ function shellQuote(value: string) {
return `'${value.replace(/'/g, `'"'"'`)}'`;
}
function mergeRuntimeExcludes(entries: string[] | undefined): string[] {
return [...new Set([".paperclip-runtime", ...(entries ?? [])])];
}
const REMOTE_WRITE_BASE64_CHUNK_SIZE = 32 * 1024;
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
@@ -48,16 +55,18 @@ function requireSuccessfulResult(result: RunProcessResult, action: string): void
throw new Error(`${action} failed with exit code ${result.exitCode ?? "null"}${detail}`);
}
function createCommandManagedRuntimeClient(input: {
export function createCommandManagedRuntimeClient(input: {
runner: CommandManagedRuntimeRunner;
remoteCwd: string;
commandCwd: string;
timeoutMs: number;
shellCommand?: "bash" | "sh" | null;
}): SandboxManagedRuntimeClient {
const shellCommand = preferredShellForSandbox(input.shellCommand);
const runShell = async (script: string, opts: { stdin?: string; timeoutMs?: number } = {}) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", script],
cwd: input.remoteCwd,
command: shellCommand,
args: shellCommandArgs(script),
cwd: input.commandCwd,
stdin: opts.stdin,
timeoutMs: opts.timeoutMs ?? input.timeoutMs,
});
@@ -71,29 +80,53 @@ function createCommandManagedRuntimeClient(input: {
},
writeFile: async (remotePath, bytes) => {
const body = toBuffer(bytes).toString("base64");
const remoteDir = path.posix.dirname(remotePath);
const remoteTempPath = `${remotePath}.paperclip-upload.b64`;
await runShell(
`mkdir -p ${shellQuote(path.posix.dirname(remotePath))} && base64 -d > ${shellQuote(remotePath)}`,
{ stdin: body },
`mkdir -p ${shellQuote(remoteDir)} && rm -f ${shellQuote(remoteTempPath)} && : > ${shellQuote(remoteTempPath)}`,
);
for (let offset = 0; offset < body.length; offset += REMOTE_WRITE_BASE64_CHUNK_SIZE) {
const chunk = body.slice(offset, offset + REMOTE_WRITE_BASE64_CHUNK_SIZE);
await runShell(`printf '%s' ${shellQuote(chunk)} >> ${shellQuote(remoteTempPath)}`);
}
await runShell(
`base64 -d < ${shellQuote(remoteTempPath)} > ${shellQuote(remotePath)} && rm -f ${shellQuote(remoteTempPath)}`,
);
},
readFile: async (remotePath) => {
const result = await runShell(`base64 < ${shellQuote(remotePath)}`);
return Buffer.from(result.stdout.replace(/\s+/g, ""), "base64");
},
listFiles: async (remotePath) => {
const result = await runShell(
`if [ -d ${shellQuote(remotePath)} ]; then ` +
`for entry in ${shellQuote(remotePath)}/*; do ` +
`[ -f "$entry" ] || continue; ` +
`basename "$entry"; ` +
`done; ` +
`fi`,
);
return result.stdout
.split(/\r?\n/)
.map((entry) => entry.trim())
.filter((entry) => entry.length > 0)
.sort((left, right) => left.localeCompare(right));
},
remove: async (remotePath) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", `rm -rf ${shellQuote(remotePath)}`],
cwd: input.remoteCwd,
command: shellCommand,
args: shellCommandArgs(`rm -rf ${shellQuote(remotePath)}`),
cwd: input.commandCwd,
timeoutMs: input.timeoutMs,
});
requireSuccessfulResult(result, `remove ${remotePath}`);
},
run: async (command, options) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", command],
cwd: input.remoteCwd,
command: shellCommand,
args: shellCommandArgs(command),
cwd: input.commandCwd,
timeoutMs: options.timeoutMs,
});
requireSuccessfulResult(result, command);
@@ -111,9 +144,15 @@ export async function prepareCommandManagedRuntime(input: {
preserveAbsentOnRestore?: string[];
assets?: CommandManagedRuntimeAsset[];
installCommand?: string | null;
/** When provided alongside `installCommand`, skip the install if `command -v <detectCommand>` succeeds. */
detectCommand?: string | null;
}): Promise<PreparedSandboxManagedRuntime> {
const timeoutMs = input.spec.timeoutMs && input.spec.timeoutMs > 0 ? input.spec.timeoutMs : 300_000;
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
// Managed-runtime sync/restore scripts use absolute paths throughout, so
// run them from a stable cwd. The target workspace itself may be removed or
// recreated during a run, which breaks shell startup if we chdir into it.
const commandCwd = "/";
const runtimeSpec: SandboxRemoteExecutionSpec = {
transport: "sandbox",
provider: input.spec.providerKey ?? "sandbox",
@@ -121,22 +160,62 @@ export async function prepareCommandManagedRuntime(input: {
remoteCwd: workspaceRemoteDir,
timeoutMs,
apiKey: null,
paperclipApiUrl: input.spec.paperclipApiUrl ?? null,
};
const client = createCommandManagedRuntimeClient({
runner: input.runner,
remoteCwd: workspaceRemoteDir,
commandCwd,
timeoutMs,
shellCommand: input.spec.shellCommand,
});
const shellCommand = preferredShellForSandbox(input.spec.shellCommand);
if (input.installCommand?.trim()) {
const installCommand = input.installCommand.trim();
const detectCommand = input.detectCommand?.trim();
// Skip the install when the binary is already on PATH. Without this
// probe the install runs unconditionally on every execute() call (and
// also runs a second time after `ensureAdapterExecutionTargetCommandResolvable`
// has already installed it during the resolvability gate).
if (detectCommand) {
const probe = await input.runner.execute({
command: shellCommand,
args: shellCommandArgs(`command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`),
cwd: commandCwd,
timeoutMs,
});
if (!probe.timedOut && (probe.exitCode ?? 1) === 0) {
return await prepareSandboxManagedRuntime({
spec: runtimeSpec,
client,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
workspaceExclude: mergeRuntimeExcludes(input.workspaceExclude),
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
});
}
}
const result = await input.runner.execute({
command: "sh",
args: ["-lc", input.installCommand.trim()],
cwd: workspaceRemoteDir,
command: shellCommand,
args: shellCommandArgs(installCommand),
cwd: commandCwd,
timeoutMs,
});
requireSuccessfulResult(result, input.installCommand.trim());
// A failed install is not always fatal: the CLI may already be on PATH
// from a previous lease, the template image, or another path entry. Log
// and continue rather than aborting the agent run; downstream code that
// exec's the CLI will surface a clear "command not found" if it is in
// fact missing. The test path's `maybeRunSandboxInstallCommand` already
// honors this contract — keep them consistent.
if (result.timedOut || (result.exitCode ?? 0) !== 0) {
const tail = (text: string) =>
text.split(/\r?\n/).filter((line) => line.trim().length > 0).slice(-3).join(" | ").slice(0, 480);
const reason = result.timedOut ? "timed out" : `exited ${result.exitCode ?? "?"}`;
console.warn(
`[paperclip] managed-runtime install command ${reason}: ${installCommand} :: ${tail(result.stderr || result.stdout)}`,
);
}
}
return await prepareSandboxManagedRuntime({
@@ -145,7 +224,7 @@ export async function prepareCommandManagedRuntime(input: {
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
workspaceExclude: input.workspaceExclude,
workspaceExclude: mergeRuntimeExcludes(input.workspaceExclude),
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
});
@@ -0,0 +1,21 @@
export const REDACTED_COMMAND_TEXT_VALUE = "***REDACTED***";
const COMMAND_CLI_SECRET_OPTION_RE =
/(\B-{1,2}(?:api[-_]?key|(?:access[-_]?|auth[-_]?)?token|token|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)(?:\s+|=)(["']?))[^\s"'`]+(\2)/gi;
const COMMAND_ENV_SECRET_ASSIGNMENT_RE =
/(\b[A-Za-z0-9_]*(?:TOKEN|KEY|SECRET|PASSWORD|PASSWD|AUTHORIZATION|JWT)[A-Za-z0-9_]*\s*=\s*)[^\s"'`]+/gi;
const COMMAND_AUTHORIZATION_BEARER_RE = /(\bAuthorization\s*:\s*Bearer\s+)[^\s"'`]+/gi;
const COMMAND_OPENAI_KEY_RE = /\bsk-[A-Za-z0-9_-]{12,}\b/g;
const COMMAND_GITHUB_TOKEN_RE = /\bgh[pousr]_[A-Za-z0-9_]{20,}\b/g;
const COMMAND_JWT_RE =
/\b[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}(?:\.[A-Za-z0-9_-]{8,})?\b/g;
export function redactCommandText(command: string, redactedValue = REDACTED_COMMAND_TEXT_VALUE): string {
return command
.replace(COMMAND_AUTHORIZATION_BEARER_RE, `$1${redactedValue}`)
.replace(COMMAND_CLI_SECRET_OPTION_RE, `$1${redactedValue}$3`)
.replace(COMMAND_ENV_SECRET_ASSIGNMENT_RE, `$1${redactedValue}`)
.replace(COMMAND_OPENAI_KEY_RE, redactedValue)
.replace(COMMAND_GITHUB_TOKEN_RE, redactedValue)
.replace(COMMAND_JWT_RE, redactedValue);
}
@@ -1,14 +1,65 @@
import { describe, expect, it, vi } from "vitest";
import { createServer } from "node:http";
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import {
DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetToRemoteSpec,
adapterExecutionTargetUsesPaperclipBridge,
ensureAdapterExecutionTargetCommandResolvable,
resolveAdapterExecutionTargetTimeoutSec,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
startAdapterExecutionTargetPaperclipBridge,
type AdapterSandboxExecutionTarget,
} from "./execution-target.js";
import { runChildProcess } from "./server-utils.js";
describe("sandbox adapter execution targets", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.unstubAllEnvs();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
function createLocalSandboxRunner() {
let counter = 0;
return {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; startedAt: string }) => Promise<void>;
}) => {
counter += 1;
const command = input.command === "bash" ? "/bin/bash" : input.command;
return runChildProcess(`sandbox-run-${counter}`, command, input.args ?? [], {
cwd: input.cwd ?? process.cwd(),
env: input.env ?? {},
stdin: input.stdin,
timeoutSec: Math.max(1, Math.ceil((input.timeoutMs ?? 30_000) / 1000)),
graceSec: 5,
onLog: input.onLog ?? (async () => {}),
onSpawn: input.onSpawn
? async (meta) => input.onSpawn?.({ pid: meta.pid, startedAt: meta.startedAt })
: undefined,
});
},
};
}
it("executes through the provider-neutral runner without a remote spec", async () => {
const runner = {
execute: vi.fn(async () => ({
@@ -61,6 +112,89 @@ describe("sandbox adapter execution targets", () => {
});
});
it("applies the remote sandbox fallback when adapter timeoutSec is unset", () => {
const sandboxTarget: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
remoteCwd: "/workspace",
runner: createLocalSandboxRunner(),
};
expect(resolveAdapterExecutionTargetTimeoutSec(sandboxTarget, 0)).toBe(
DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC,
);
expect(resolveAdapterExecutionTargetTimeoutSec(sandboxTarget, 90)).toBe(90);
expect(resolveAdapterExecutionTargetTimeoutSec({
kind: "remote",
transport: "ssh",
remoteCwd: "/workspace",
spec: {
host: "127.0.0.1",
port: 22,
username: "fixture",
remoteWorkspacePath: "/workspace",
remoteCwd: "/workspace",
privateKey: "KEY",
knownHosts: "host key",
strictHostKeyChecking: true,
},
}, 0)).toBe(0);
});
it("uses the caller timeout override when installing a missing sandbox command", async () => {
const runner = {
execute: vi.fn()
.mockResolvedValueOnce({
exitCode: 1,
signal: null,
timedOut: false,
stdout: "",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})
.mockResolvedValueOnce({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})
.mockResolvedValueOnce({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "/usr/bin/opencode\n",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
}),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
remoteCwd: "/workspace",
timeoutMs: 300_000,
runner,
};
await ensureAdapterExecutionTargetCommandResolvable(
"opencode",
target,
"/local/workspace",
{},
{ installCommand: "npm install -g opencode", timeoutSec: 1800 },
);
expect(runner.execute).toHaveBeenNthCalledWith(2, expect.objectContaining({
command: "sh",
args: ["-c", "npm install -g opencode"],
timeoutMs: 1_800_000,
}));
});
it("runs shell commands through the same runner", async () => {
const runner = {
execute: vi.fn(async () => ({
@@ -88,9 +222,359 @@ describe("sandbox adapter execution targets", () => {
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "sh",
args: ["-lc", 'printf %s "$HOME"'],
args: ["-c", 'printf %s "$HOME"'],
cwd: "/workspace",
timeoutMs: 7000,
}));
});
it("strips inherited host identity env before sandbox execution", async () => {
vi.stubEnv("PATH", "/host/bin:/usr/bin");
vi.stubEnv("HOME", "/Users/local");
vi.stubEnv("TMPDIR", "/var/folders/local/T");
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "ok\n",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
remoteCwd: "/workspace",
runner,
};
await runAdapterExecutionTargetProcess("run-1b", target, "agent-cli", ["--json"], {
cwd: "/local/workspace",
env: {
PATH: "/host/bin:/usr/bin",
HOME: "/Users/local",
TMPDIR: "/var/folders/local/T",
SAFE_VALUE: "visible",
},
timeoutSec: 5,
graceSec: 1,
onLog: async () => {},
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
env: {
SAFE_VALUE: "visible",
},
}));
});
it("preserves explicit remote identity env overrides for sandbox execution", async () => {
vi.stubEnv("PATH", "/host/bin:/usr/bin");
vi.stubEnv("HOME", "/Users/local");
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "ok\n",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
remoteCwd: "/workspace",
runner,
};
await runAdapterExecutionTargetProcess("run-1c", target, "agent-cli", ["--json"], {
cwd: "/local/workspace",
env: {
PATH: "/custom/remote/bin:/usr/bin",
HOME: "/home/sandbox",
SAFE_VALUE: "visible",
},
timeoutSec: 5,
graceSec: 1,
onLog: async () => {},
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
env: {
PATH: "/custom/remote/bin:/usr/bin",
HOME: "/home/sandbox",
SAFE_VALUE: "visible",
},
}));
});
it("treats SSH targets as bridge-only", () => {
const target = {
kind: "remote" as const,
transport: "ssh" as const,
remoteCwd: "/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "paperclip",
remoteWorkspacePath: "/workspace",
remoteCwd: "/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
};
expect(adapterExecutionTargetUsesPaperclipBridge(target)).toBe(true);
expect(adapterExecutionTargetSessionIdentity(target)).toEqual({
transport: "ssh",
host: "ssh.example.test",
port: 22,
username: "paperclip",
remoteCwd: "/workspace",
});
});
it("uses the provider-declared shell for sandbox helper commands", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "/home/sandbox",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "custom-provider",
shellCommand: "bash",
remoteCwd: "/workspace",
runner,
};
await runAdapterExecutionTargetShellCommand("run-2b", target, 'printf %s "$HOME"', {
cwd: "/local/workspace",
env: {},
timeoutSec: 7,
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "bash",
args: ["-c", 'printf %s "$HOME"'],
cwd: "/workspace",
timeoutMs: 7000,
}));
});
it("starts a localhost Paperclip bridge for sandbox targets in bridge mode", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-execution-target-bridge-"));
cleanupDirs.push(rootDir);
const remoteCwd = path.join(rootDir, "workspace");
const runtimeRootDir = path.join(remoteCwd, ".paperclip-runtime", "codex");
await mkdir(runtimeRootDir, { recursive: true });
const requests: Array<{ method: string; url: string; auth: string | null; runId: string | null }> = [];
const apiServer = createServer((req, res) => {
requests.push({
method: req.method ?? "GET",
url: req.url ?? "/",
auth: req.headers.authorization ?? null,
runId: typeof req.headers["x-paperclip-run-id"] === "string" ? req.headers["x-paperclip-run-id"] : null,
});
res.writeHead(200, { "content-type": "application/json" });
res.end(JSON.stringify({ ok: true }));
});
await new Promise<void>((resolve, reject) => {
apiServer.once("error", reject);
apiServer.listen(0, "127.0.0.1", () => resolve());
});
const address = apiServer.address();
if (!address || typeof address === "string") {
throw new Error("Expected the bridge test API server to listen on a TCP port.");
}
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd,
runner: createLocalSandboxRunner(),
timeoutMs: 30_000,
};
const bridge = await startAdapterExecutionTargetPaperclipBridge({
runId: "run-bridge",
target,
runtimeRootDir,
adapterKey: "codex",
hostApiToken: "real-run-jwt",
hostApiUrl: `http://127.0.0.1:${address.port}`,
});
try {
expect(bridge).not.toBeNull();
expect(bridge?.env.PAPERCLIP_API_URL).toMatch(/^http:\/\/127\.0\.0\.1:\d+$/);
expect(bridge?.env.PAPERCLIP_API_KEY).not.toBe("real-run-jwt");
expect(bridge?.env.PAPERCLIP_API_BRIDGE_MODE).toBe("queue_v1");
const response = await fetch(`${bridge!.env.PAPERCLIP_API_URL}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridge!.env.PAPERCLIP_API_KEY}`,
accept: "application/json",
},
});
expect(response.status).toBe(200);
expect(await response.json()).toEqual({ ok: true });
expect(requests).toEqual([{
method: "GET",
url: "/api/agents/me",
auth: "Bearer real-run-jwt",
runId: "run-bridge",
}]);
} finally {
await bridge?.stop();
await new Promise<void>((resolve) => apiServer.close(() => resolve()));
}
});
it("uses the effective adapter timeout when starting the sandbox callback bridge", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-execution-target-bridge-timeout-"));
cleanupDirs.push(rootDir);
const remoteCwd = path.join(rootDir, "workspace");
const runtimeRootDir = path.join(remoteCwd, ".paperclip-runtime", "codex");
await mkdir(runtimeRootDir, { recursive: true });
const delegateRunner = createLocalSandboxRunner();
const runner = {
execute: vi.fn(async (input: Parameters<typeof delegateRunner.execute>[0]) => delegateRunner.execute(input)),
};
const apiServer = createServer((req, res) => {
res.writeHead(200, { "content-type": "application/json" });
res.end(JSON.stringify({ ok: true }));
});
await new Promise<void>((resolve, reject) => {
apiServer.once("error", reject);
apiServer.listen(0, "127.0.0.1", () => resolve());
});
const address = apiServer.address();
if (!address || typeof address === "string") {
throw new Error("Expected the bridge timeout test API server to listen on a TCP port.");
}
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "cloudflare",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd,
runner,
timeoutMs: 30_000,
};
const bridge = await startAdapterExecutionTargetPaperclipBridge({
runId: "run-bridge-timeout",
target,
runtimeRootDir,
adapterKey: "codex",
timeoutSec: DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC,
hostApiToken: "real-run-jwt",
hostApiUrl: `http://127.0.0.1:${address.port}`,
});
try {
expect(bridge).not.toBeNull();
expect(runner.execute).toHaveBeenCalled();
expect(runner.execute.mock.calls.some(([input]) => input.timeoutMs === 1_800_000)).toBe(true);
} finally {
await bridge?.stop();
await new Promise<void>((resolve) => apiServer.close(() => resolve()));
}
});
it("fails oversized host responses with a 502 before returning them to the sandbox client", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-execution-target-bridge-limit-"));
cleanupDirs.push(rootDir);
const remoteCwd = path.join(rootDir, "workspace");
const runtimeRootDir = path.join(remoteCwd, ".paperclip-runtime", "codex");
await mkdir(runtimeRootDir, { recursive: true });
const requests: Array<{ method: string; url: string; auth: string | null; runId: string | null }> = [];
const largeBody = "x".repeat(64);
const apiServer = createServer((req, res) => {
requests.push({
method: req.method ?? "GET",
url: req.url ?? "/",
auth: req.headers.authorization ?? null,
runId: typeof req.headers["x-paperclip-run-id"] === "string" ? req.headers["x-paperclip-run-id"] : null,
});
res.writeHead(200, {
"content-type": "application/json",
"content-length": String(Buffer.byteLength(largeBody, "utf8")),
});
res.end(largeBody);
});
await new Promise<void>((resolve, reject) => {
apiServer.once("error", reject);
apiServer.listen(0, "127.0.0.1", () => resolve());
});
const address = apiServer.address();
if (!address || typeof address === "string") {
throw new Error("Expected the bridge test API server to listen on a TCP port.");
}
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd,
runner: createLocalSandboxRunner(),
timeoutMs: 30_000,
};
const bridge = await startAdapterExecutionTargetPaperclipBridge({
runId: "run-bridge-limit",
target,
runtimeRootDir,
adapterKey: "codex",
hostApiToken: "real-run-jwt",
hostApiUrl: `http://127.0.0.1:${address.port}`,
maxBodyBytes: 32,
});
try {
const response = await fetch(`${bridge!.env.PAPERCLIP_API_URL}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridge!.env.PAPERCLIP_API_KEY}`,
accept: "application/json",
},
});
expect(response.status).toBe(502);
await expect(response.json()).resolves.toEqual({
error: "Bridge response body exceeded the configured size limit of 32 bytes.",
});
expect(requests).toEqual([{
method: "GET",
url: "/api/agents/me",
auth: "Bearer real-run-jwt",
runId: "run-bridge-limit",
}]);
} finally {
await bridge?.stop();
await new Promise<void>((resolve) => apiServer.close(() => resolve()));
}
});
});
@@ -1,13 +1,18 @@
import { afterEach, describe, expect, it, vi } from "vitest";
import * as ssh from "./ssh.js";
import * as serverUtils from "./server-utils.js";
import {
adapterExecutionTargetUsesManagedHome,
ensureAdapterExecutionTargetRuntimeCommandInstalled,
resolveAdapterExecutionTargetCwd,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
} from "./execution-target.js";
describe("runAdapterExecutionTargetShellCommand", () => {
afterEach(() => {
vi.restoreAllMocks();
vi.unstubAllEnvs();
});
it("quotes remote shell commands with the shared SSH quoting helper", async () => {
@@ -40,16 +45,68 @@ describe("runAdapterExecutionTargetShellCommand", () => {
},
);
// runSshCommand owns profile sourcing and the outer shell wrapper —
// the caller passes the raw command string. Wrapping it here would
// double-nest the login shell and re-source profiles after the explicit
// env override, silently undoing identity-var preservation.
expect(runSshCommandSpy).toHaveBeenCalledWith(
expect.objectContaining({
host: "ssh.example.test",
username: "ssh-user",
}),
`sh -lc ${ssh.shellQuote(`printf '%s\\n' "$HOME" && echo "it's ok"`)}`,
`printf '%s\\n' "$HOME" && echo "it's ok"`,
expect.any(Object),
);
});
it("sanitizes inherited host env before SSH shell execution", async () => {
vi.stubEnv("PATH", "/host/bin:/usr/bin");
vi.stubEnv("HOME", "/Users/local");
const runSshCommandSpy = vi.spyOn(ssh, "runSshCommand").mockResolvedValue({
stdout: "",
stderr: "",
});
await runAdapterExecutionTargetShellCommand(
"run-1b",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
"env",
{
cwd: "/tmp/local",
env: {
PATH: "/host/bin:/usr/bin",
HOME: "/Users/local",
SAFE_VALUE: "visible",
},
},
);
expect(runSshCommandSpy).toHaveBeenCalledWith(
expect.any(Object),
expect.any(String),
expect.objectContaining({
env: {
SAFE_VALUE: "visible",
},
}),
);
});
it("returns a timedOut result when the SSH shell command times out", async () => {
vi.spyOn(ssh, "runSshCommand").mockRejectedValue(Object.assign(new Error("timed out"), {
code: "ETIMEDOUT",
@@ -159,3 +216,188 @@ describe("runAdapterExecutionTargetShellCommand", () => {
})).toBe(false);
});
});
describe("runAdapterExecutionTargetProcess", () => {
afterEach(() => {
vi.restoreAllMocks();
vi.unstubAllEnvs();
});
it("sanitizes inherited host env before SSH process execution", async () => {
vi.stubEnv("PATH", "/host/bin:/usr/bin");
vi.stubEnv("HOME", "/Users/local");
const runChildProcessSpy = vi.spyOn(serverUtils, "runChildProcess").mockResolvedValue({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
});
await runAdapterExecutionTargetProcess(
"run-ssh-process",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
"agent-cli",
["--json"],
{
cwd: "/tmp/local",
env: {
PATH: "/host/bin:/usr/bin",
HOME: "/Users/local",
SAFE_VALUE: "visible",
},
timeoutSec: 5,
graceSec: 1,
onLog: async () => {},
},
);
expect(runChildProcessSpy).toHaveBeenCalledWith(
"run-ssh-process",
"agent-cli",
["--json"],
expect.objectContaining({
env: {
SAFE_VALUE: "visible",
},
}),
);
});
});
describe("ensureAdapterExecutionTargetRuntimeCommandInstalled", () => {
afterEach(() => {
vi.restoreAllMocks();
});
it("runs install commands for sandbox targets", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
await ensureAdapterExecutionTargetRuntimeCommandInstalled({
runId: "run-install",
target: {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
remoteCwd: "/remote/workspace",
runner,
},
installCommand: "npm install -g @google/gemini-cli",
cwd: "/local/workspace",
env: { PATH: "/usr/bin" },
timeoutSec: 30,
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "sh",
args: ["-c", "npm install -g @google/gemini-cli"],
cwd: "/remote/workspace",
env: { PATH: "/usr/bin" },
timeoutMs: 30_000,
}));
});
it("skips install commands for SSH targets", async () => {
const runSshCommandSpy = vi.spyOn(ssh, "runSshCommand").mockResolvedValue({
stdout: "",
stderr: "",
});
await ensureAdapterExecutionTargetRuntimeCommandInstalled({
runId: "run-skip",
target: {
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
installCommand: "npm install -g @google/gemini-cli",
cwd: "/tmp/local",
env: {},
});
expect(runSshCommandSpy).not.toHaveBeenCalled();
});
});
describe("resolveAdapterExecutionTargetCwd", () => {
const sshTarget = {
kind: "remote" as const,
transport: "ssh" as const,
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
};
it("falls back to the remote cwd when no adapter cwd is configured", () => {
expect(resolveAdapterExecutionTargetCwd(sshTarget, "", "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
expect(resolveAdapterExecutionTargetCwd(sshTarget, " ", "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
expect(resolveAdapterExecutionTargetCwd(sshTarget, null, "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
});
it("preserves an explicit adapter cwd when one is configured", () => {
expect(
resolveAdapterExecutionTargetCwd(
sshTarget,
"/srv/paperclip/custom-agent-dir",
"/Users/host/repo/server",
),
).toBe("/srv/paperclip/custom-agent-dir");
});
it("keeps the local fallback cwd for local targets", () => {
expect(resolveAdapterExecutionTargetCwd(null, "", "/Users/host/repo/server")).toBe(
"/Users/host/repo/server",
);
});
});
+702 -22
View File
@@ -10,7 +10,15 @@ import {
remoteExecutionSessionMatches,
type RemoteManagedRuntimeAsset,
} from "./remote-managed-runtime.js";
import { parseSshRemoteExecutionSpec, runSshCommand, shellQuote } from "./ssh.js";
import {
createCommandManagedSandboxCallbackBridgeQueueClient,
createSandboxCallbackBridgeAsset,
createSandboxCallbackBridgeToken,
DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES,
startSandboxCallbackBridgeServer,
startSandboxCallbackBridgeWorker,
} from "./sandbox-callback-bridge.js";
import { createSshCommandManagedRuntimeRunner, parseSshRemoteExecutionSpec, runSshCommand, shellQuote } from "./ssh.js";
import {
ensureCommandResolvable,
resolveCommandForLogs,
@@ -18,6 +26,8 @@ import {
type RunProcessResult,
type TerminalResultCleanupOptions,
} from "./server-utils.js";
import { sanitizeRemoteExecutionEnv } from "./remote-execution-env.js";
import { preferredShellForSandbox, shellCommandArgs } from "./sandbox-shell.js";
export interface AdapterLocalExecutionTarget {
kind: "local";
@@ -31,7 +41,6 @@ export interface AdapterSshExecutionTarget {
environmentId?: string | null;
leaseId?: string | null;
remoteCwd: string;
paperclipApiUrl?: string | null;
spec: SshRemoteExecutionSpec;
}
@@ -39,10 +48,10 @@ export interface AdapterSandboxExecutionTarget {
kind: "remote";
transport: "sandbox";
providerKey?: string | null;
shellCommand?: "bash" | "sh" | null;
environmentId?: string | null;
leaseId?: string | null;
remoteCwd: string;
paperclipApiUrl?: string | null;
timeoutMs?: number | null;
runner?: CommandManagedRuntimeRunner;
}
@@ -58,6 +67,7 @@ export type AdapterManagedRuntimeAsset = RemoteManagedRuntimeAsset;
export interface PreparedAdapterExecutionTargetRuntime {
target: AdapterExecutionTarget;
workspaceRemoteDir: string | null;
runtimeRootDir: string | null;
assetDirs: Record<string, string>;
restoreWorkspace(): Promise<void>;
@@ -82,6 +92,15 @@ export interface AdapterExecutionTargetShellOptions {
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
}
export interface AdapterExecutionTargetPaperclipBridgeHandle {
env: Record<string, string>;
stop(): Promise<void>;
}
export { sanitizeRemoteExecutionEnv } from "./remote-execution-env.js";
export const DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1_800;
function parseObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
@@ -96,6 +115,27 @@ function readStringMeta(parsed: Record<string, unknown>, key: string): string |
return readString(parsed[key]);
}
function resolveHostForUrl(rawHost: string): string {
const host = rawHost.trim();
if (!host || host === "0.0.0.0" || host === "::") return "localhost";
if (host.includes(":") && !host.startsWith("[") && !host.endsWith("]")) return `[${host}]`;
return host;
}
function resolveDefaultPaperclipApiUrl(): string {
const runtimeHost = resolveHostForUrl(
process.env.PAPERCLIP_LISTEN_HOST ?? process.env.HOST ?? "localhost",
);
// 3100 matches the default Paperclip dev server port when the runtime does not provide one.
const runtimePort = process.env.PAPERCLIP_LISTEN_PORT ?? process.env.PORT ?? "3100";
return `http://${runtimeHost}:${runtimePort}`;
}
function isBridgeDebugEnabled(env: NodeJS.ProcessEnv): boolean {
const value = env.PAPERCLIP_BRIDGE_DEBUG?.trim().toLowerCase();
return value === "1" || value === "true" || value === "yes";
}
function isAdapterExecutionTargetInstance(value: unknown): value is AdapterExecutionTarget {
const parsed = parseObject(value);
if (parsed.kind === "local") return true;
@@ -130,12 +170,48 @@ export function adapterExecutionTargetRemoteCwd(
return target?.kind === "remote" ? target.remoteCwd : localCwd;
}
export function adapterExecutionTargetPaperclipApiUrl(
export function overrideAdapterExecutionTargetRemoteCwd(
target: AdapterExecutionTarget | null | undefined,
): string | null {
if (target?.kind !== "remote") return null;
if (target.transport === "ssh") return target.paperclipApiUrl ?? target.spec.paperclipApiUrl ?? null;
return target.paperclipApiUrl ?? null;
remoteCwd: string | null | undefined,
): AdapterExecutionTarget | null | undefined {
const nextRemoteCwd = remoteCwd?.trim();
if (!target || target.kind !== "remote" || !nextRemoteCwd) {
return target;
}
if (target.remoteCwd === nextRemoteCwd) {
return target;
}
if (target.transport === "ssh") {
return {
...target,
remoteCwd: nextRemoteCwd,
spec: {
...target.spec,
remoteCwd: nextRemoteCwd,
},
};
}
return {
...target,
remoteCwd: nextRemoteCwd,
};
}
export function resolveAdapterExecutionTargetCwd(
target: AdapterExecutionTarget | null | undefined,
configuredCwd: string | null | undefined,
localFallbackCwd: string,
): string {
if (typeof configuredCwd === "string" && configuredCwd.trim().length > 0) {
return configuredCwd;
}
return adapterExecutionTargetRemoteCwd(target, localFallbackCwd);
}
export function adapterExecutionTargetUsesPaperclipBridge(
target: AdapterExecutionTarget | null | undefined,
): boolean {
return target?.kind === "remote";
}
export function describeAdapterExecutionTarget(
@@ -148,6 +224,26 @@ export function describeAdapterExecutionTarget(
return `sandbox environment${target.providerKey ? ` (${target.providerKey})` : ""}`;
}
export function resolveAdapterExecutionTargetTimeoutSec(
target: AdapterExecutionTarget | null | undefined,
configuredTimeoutSec: number | null | undefined,
): number {
const normalizedConfiguredTimeoutSec =
typeof configuredTimeoutSec === "number" && Number.isFinite(configuredTimeoutSec) && configuredTimeoutSec > 0
? Math.floor(configuredTimeoutSec)
: 0;
if (normalizedConfiguredTimeoutSec > 0) return normalizedConfiguredTimeoutSec;
// Local and SSH adapters preserve the historical "0 means no adapter
// timeout" behavior. Sandbox-backed runs execute through provider RPCs
// that usually apply their own shorter command defaults, so request an
// explicit longer timeout for full adapter runs when the adapter leaves
// timeoutSec unset.
if (target?.kind === "remote" && target.transport === "sandbox") {
return DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC;
}
return 0;
}
function requireSandboxRunner(target: AdapterSandboxExecutionTarget): CommandManagedRuntimeRunner {
if (target.runner) return target.runner;
throw new Error(
@@ -155,13 +251,47 @@ function requireSandboxRunner(target: AdapterSandboxExecutionTarget): CommandMan
);
}
function preferredSandboxShell(target: AdapterSandboxExecutionTarget): "bash" | "sh" {
return preferredShellForSandbox(target.shellCommand);
}
type AdapterCommandCapableExecutionTarget = AdapterSshExecutionTarget | AdapterSandboxExecutionTarget;
function adapterExecutionTargetCommandRunner(target: AdapterCommandCapableExecutionTarget): CommandManagedRuntimeRunner {
if (target.transport === "ssh") {
return createSshCommandManagedRuntimeRunner({
spec: target.spec,
defaultCwd: target.remoteCwd,
maxBufferBytes: DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES * 4,
});
}
return requireSandboxRunner(target);
}
function adapterExecutionTargetShellCommand(target: AdapterCommandCapableExecutionTarget): "bash" | "sh" {
return target.transport === "ssh" ? "sh" : preferredSandboxShell(target);
}
function adapterExecutionTargetTimeoutMs(
target: AdapterCommandCapableExecutionTarget,
): number | null | undefined {
return target.transport === "sandbox" ? target.timeoutMs : undefined;
}
export async function ensureAdapterExecutionTargetCommandResolvable(
command: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
env: NodeJS.ProcessEnv,
options: { installCommand?: string | null; timeoutSec?: number | null } = {},
) {
if (target?.kind === "remote" && target.transport === "sandbox") {
await ensureSandboxCommandResolvable(
command,
target,
options.installCommand?.trim() || null,
options.timeoutSec,
);
return;
}
await ensureCommandResolvable(command, cwd, env, {
@@ -169,6 +299,87 @@ export async function ensureAdapterExecutionTargetCommandResolvable(
});
}
async function probeSandboxCommandResolvable(
command: string,
target: AdapterSandboxExecutionTarget,
): Promise<{ resolved: boolean; timedOut: boolean; stderr: string }> {
const runner = requireSandboxRunner(target);
const probeScript = `command -v ${shellQuote(command)}`;
const result = await runner.execute({
command: "sh",
args: ["-c", probeScript],
cwd: target.remoteCwd,
timeoutMs: target.timeoutMs ?? 15_000,
});
return {
resolved: !result.timedOut && (result.exitCode ?? 1) === 0,
timedOut: result.timedOut,
stderr: result.stderr.trim(),
};
}
async function ensureSandboxCommandResolvable(
command: string,
target: AdapterSandboxExecutionTarget,
installCommand: string | null,
timeoutSec?: number | null,
): Promise<void> {
// Probe whether the binary is resolvable inside the sandbox. We previously
// short-circuited this for sandbox targets, which let the caller report a
// success message even when the CLI was missing from the image. Now we run
// a real `command -v` through the same runner the hello probe will use, so
// the first step honestly reflects whether the binary is on PATH. The
// sandbox provider is responsible for sourcing login profiles (e2b mirrors
// SSH's buildSshSpawnTarget) so this and the hello probe agree on PATH.
let probe = await probeSandboxCommandResolvable(command, target);
if (probe.resolved) return;
if (probe.timedOut) {
throw new Error(`Timed out checking command "${command}" on sandbox target.`);
}
// If the caller supplied an install command, attempt the install once via
// the sandbox runner (which the sandbox provider wraps in a login shell)
// and re-probe before reporting failure. This lets fresh sandbox leases
// bring up the CLI before the resolvability gate, mirroring the test path.
let installFailureDetail: string | null = null;
if (installCommand) {
const runner = requireSandboxRunner(target);
const installTimeoutMs =
typeof timeoutSec === "number" && Number.isFinite(timeoutSec) && timeoutSec > 0
? Math.floor(timeoutSec * 1000)
: target.timeoutMs ?? 300_000;
try {
const installResult = await runner.execute({
command: "sh",
args: shellCommandArgs(installCommand),
cwd: target.remoteCwd,
timeoutMs: installTimeoutMs,
});
if (installResult.timedOut) {
installFailureDetail = `install command timed out: ${installCommand}`;
} else if ((installResult.exitCode ?? 0) !== 0) {
const tail = (text: string) =>
text.split(/\r?\n/).filter((line) => line.trim().length > 0).slice(-2).join(" | ").slice(0, 240);
const reason = tail(installResult.stderr || installResult.stdout) || `exit ${installResult.exitCode ?? "?"}`;
installFailureDetail = `install command exited ${installResult.exitCode ?? "?"}: ${reason}`;
}
} catch (err) {
installFailureDetail = `install command threw: ${err instanceof Error ? err.message : String(err)}`;
}
probe = await probeSandboxCommandResolvable(command, target);
if (probe.resolved) return;
if (probe.timedOut) {
throw new Error(`Timed out checking command "${command}" on sandbox target.`);
}
}
const probeStderr = probe.stderr.length > 0 ? ` probe stderr: ${probe.stderr}` : "";
const installDetail = installFailureDetail ? `; ${installFailureDetail}` : "";
throw new Error(
`Command "${command}" is not installed or not on PATH in the sandbox environment${installDetail}.${probeStderr}`,
);
}
export async function resolveAdapterExecutionTargetCommandForLogs(
command: string,
target: AdapterExecutionTarget | null | undefined,
@@ -192,11 +403,12 @@ export async function runAdapterExecutionTargetProcess(
): Promise<RunProcessResult> {
if (target?.kind === "remote" && target.transport === "sandbox") {
const runner = requireSandboxRunner(target);
const env = sanitizeRemoteExecutionEnv(options.env);
return await runner.execute({
command,
args,
cwd: target.remoteCwd,
env: options.env,
env,
stdin: options.stdin,
timeoutMs: options.timeoutSec > 0 ? options.timeoutSec * 1000 : target.timeoutMs ?? undefined,
onLog: options.onLog,
@@ -206,9 +418,14 @@ export async function runAdapterExecutionTargetProcess(
});
}
const env =
target?.kind === "remote" && target.transport === "ssh"
? sanitizeRemoteExecutionEnv(options.env)
: options.env;
return await runChildProcess(runId, command, args, {
cwd: options.cwd,
env: options.env,
env,
stdin: options.stdin,
timeoutSec: options.timeoutSec,
graceSec: options.graceSec,
@@ -228,9 +445,16 @@ export async function runAdapterExecutionTargetShellCommand(
const onLog = options.onLog ?? (async () => {});
if (target?.kind === "remote") {
const startedAt = new Date().toISOString();
const env = sanitizeRemoteExecutionEnv(options.env);
if (target.transport === "ssh") {
try {
const result = await runSshCommand(target.spec, `sh -lc ${shellQuote(command)}`, {
// Pass the raw command — `runSshCommand` owns profile sourcing and
// the outer shell wrapper. Wrapping again here would nest a second
// shell after the explicit `env KEY=VAL` overrides, re-sourcing
// login profiles AFTER the override and silently undoing any
// identity var (NVM_DIR / PATH / etc.) that a profile re-exports.
const result = await runSshCommand(target.spec, command, {
env,
timeoutMs: (options.timeoutSec ?? 15) * 1000,
});
if (result.stdout) await onLog("stdout", result.stdout);
@@ -282,11 +506,12 @@ export async function runAdapterExecutionTargetShellCommand(
}
}
const shellCommand = preferredSandboxShell(target);
return await requireSandboxRunner(target).execute({
command: "sh",
args: ["-lc", command],
command: shellCommand,
args: shellCommandArgs(command),
cwd: target.remoteCwd,
env: options.env,
env,
timeoutMs: (options.timeoutSec ?? 15) * 1000,
onLog,
});
@@ -307,6 +532,111 @@ export async function runAdapterExecutionTargetShellCommand(
);
}
export interface AdapterSandboxInstallCommandCheck {
code: string;
level: "info" | "warn" | "error";
message: string;
detail?: string;
hint?: string;
}
// Best-effort run of an adapter-supplied install command on a sandbox target
// before the resolvability + hello probe. Returns null for non-sandbox
// targets so callers can no-op. Returns a structured check otherwise — never
// throws — so the rest of the test still runs and reports the post-install
// state honestly. Caller pushes the check into its result array; the test
// report shows whether install was attempted and what came back.
export async function maybeRunSandboxInstallCommand(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
adapterKey: string;
installCommand: string;
/** When provided, skip the install if `command -v <detectCommand>` succeeds. */
detectCommand?: string | null;
env?: Record<string, string>;
timeoutSec?: number;
}): Promise<AdapterSandboxInstallCommandCheck | null> {
const { target, adapterKey, installCommand } = input;
if (!target || target.kind !== "remote" || target.transport !== "sandbox") {
return null;
}
const trimmed = installCommand.trim();
if (trimmed.length === 0) return null;
const code = `${adapterKey}_install_command_run`;
// Skip install when the binary is already on PATH. Avoids running
// network-dependent installers (e.g. `curl ... | bash`) on every test
// probe when the CLI is preinstalled on the lease/template.
const detectCommand = input.detectCommand?.trim();
if (detectCommand) {
try {
const probe = await runAdapterExecutionTargetShellCommand(
input.runId,
target,
`command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`,
{
cwd: target.remoteCwd,
env: input.env ?? {},
timeoutSec: 30,
graceSec: 5,
},
);
if (!probe.timedOut && probe.exitCode === 0) {
return {
code,
level: "info",
message: `${detectCommand} already on PATH; skipped install.`,
};
}
} catch {
// Fall through to actually running the install — failure to probe
// is not a reason to skip the install gate.
}
}
let result;
try {
result = await runAdapterExecutionTargetShellCommand(input.runId, target, trimmed, {
cwd: target.remoteCwd,
env: input.env ?? {},
timeoutSec: input.timeoutSec ?? 240,
graceSec: 10,
});
} catch (err) {
return {
code,
level: "warn",
message: "Install command threw before completion.",
detail: err instanceof Error ? err.message : String(err),
};
}
const tail = (text: string) =>
text.split(/\r?\n/).filter((line) => line.trim().length > 0).slice(-3).join(" | ").slice(0, 480);
if (result.timedOut) {
return {
code,
level: "warn",
message: `Install command timed out: ${trimmed}`,
detail: tail(result.stderr || result.stdout),
};
}
if ((result.exitCode ?? 1) === 0) {
return {
code,
level: "info",
message: `Install command ran: ${trimmed}`,
...(tail(result.stdout) ? { detail: tail(result.stdout) } : {}),
};
}
return {
code,
level: "warn",
message: `Install command exited ${result.exitCode}: ${trimmed}`,
detail: tail(result.stderr || result.stdout),
};
}
export async function readAdapterExecutionTargetHomeDir(
runId: string,
target: AdapterExecutionTarget | null | undefined,
@@ -322,6 +652,91 @@ export async function readAdapterExecutionTargetHomeDir(
return homeDir.length > 0 ? homeDir : null;
}
export async function ensureAdapterExecutionTargetRuntimeCommandInstalled(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
installCommand?: string | null;
detectCommand?: string | null;
cwd: string;
env: Record<string, string>;
timeoutSec?: number;
graceSec?: number;
onLog?: AdapterExecutionTargetShellOptions["onLog"];
}): Promise<void> {
const installCommand = input.installCommand?.trim();
if (!installCommand || input.target?.kind !== "remote" || input.target.transport !== "sandbox") {
return;
}
const detectCommand = input.detectCommand?.trim();
if (detectCommand) {
const probe = await runAdapterExecutionTargetShellCommand(
input.runId,
input.target,
`command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`,
{
cwd: input.cwd,
env: input.env,
timeoutSec: input.timeoutSec,
graceSec: input.graceSec,
},
);
if (!probe.timedOut && probe.exitCode === 0) {
return;
}
}
const result = await runAdapterExecutionTargetShellCommand(
input.runId,
input.target,
installCommand,
{
cwd: input.cwd,
env: input.env,
timeoutSec: input.timeoutSec,
graceSec: input.graceSec,
onLog: input.onLog,
},
);
// A failed or timed-out install is not necessarily fatal: the CLI may already
// be on PATH from a previous lease's install, the template image, or another
// path entry. Re-run the detect probe (when one is configured) so a transient
// install failure does not abort the agent run when the binary is reachable.
const installFailed = result.timedOut || (result.exitCode ?? 0) !== 0;
if (!installFailed) {
return;
}
if (detectCommand) {
const recheck = await runAdapterExecutionTargetShellCommand(
input.runId,
input.target,
`command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`,
{
cwd: input.cwd,
env: input.env,
timeoutSec: input.timeoutSec,
graceSec: input.graceSec,
},
);
if (!recheck.timedOut && recheck.exitCode === 0) {
if (input.onLog) {
const reason = result.timedOut ? "timed out" : `exited ${result.exitCode ?? "?"}`;
await input.onLog(
"stderr",
`[paperclip] Install command ${reason} (${installCommand}) but ${detectCommand} is on PATH; continuing.\n`,
);
}
return;
}
}
if (result.timedOut) {
throw new Error(`Timed out while installing the adapter runtime command via: ${installCommand}`);
}
throw new Error(`Failed to install the adapter runtime command via: ${installCommand}`);
}
export async function ensureAdapterExecutionTargetFile(
runId: string,
target: AdapterExecutionTarget | null | undefined,
@@ -336,6 +751,64 @@ export async function ensureAdapterExecutionTargetFile(
);
}
/**
* Ensure a working directory exists (and is a directory) on the execution target.
*
* For local targets this delegates to the local `ensureAbsoluteDirectory` helper
* (Node fs). For remote (SSH/sandbox) targets it shells out and runs
* `mkdir -p` (when allowed) followed by a `[ -d ]` check so the result reflects
* the directory state inside the environment, not on the Paperclip host.
*
* Throws an Error with a human-readable message on failure.
*/
export async function ensureAdapterExecutionTargetDirectory(
runId: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
options: AdapterExecutionTargetShellOptions & { createIfMissing?: boolean },
): Promise<void> {
const createIfMissing = options.createIfMissing ?? false;
if (!target || target.kind === "local") {
const { ensureAbsoluteDirectory } = await import("./server-utils.js");
await ensureAbsoluteDirectory(cwd, { createIfMissing });
return;
}
// Remote (SSH or sandbox): both expect POSIX absolute paths inside the env.
if (!cwd.startsWith("/")) {
throw new Error(`Working directory must be an absolute POSIX path on the remote target: "${cwd}"`);
}
const quoted = shellQuote(cwd);
const script = createIfMissing
? `mkdir -p ${quoted} && [ -d ${quoted} ]`
: `[ -d ${quoted} ]`;
const result = await runAdapterExecutionTargetShellCommand(runId, target, script, {
cwd: target.kind === "remote" ? target.remoteCwd : cwd,
env: options.env,
timeoutSec: options.timeoutSec ?? 15,
graceSec: options.graceSec ?? 5,
onLog: options.onLog,
});
if (result.timedOut) {
throw new Error(`Timed out checking working directory on remote target: "${cwd}"`);
}
if ((result.exitCode ?? 1) !== 0) {
const detail = (result.stderr || result.stdout || "").trim();
if (createIfMissing) {
throw new Error(
`Could not create working directory "${cwd}" on remote target${detail ? `: ${detail}` : "."}`,
);
}
throw new Error(
`Working directory does not exist on remote target: "${cwd}"${detail ? ` (${detail})` : ""}`,
);
}
}
export function adapterExecutionTargetSessionIdentity(
target: AdapterExecutionTarget | null | undefined,
): Record<string, unknown> | null {
@@ -347,7 +820,6 @@ export function adapterExecutionTargetSessionIdentity(
environmentId: target.environmentId ?? null,
leaseId: target.leaseId ?? null,
remoteCwd: target.remoteCwd,
...(target.paperclipApiUrl ? { paperclipApiUrl: target.paperclipApiUrl } : {}),
};
}
@@ -366,8 +838,7 @@ export function adapterExecutionTargetSessionMatches(
readStringMeta(parsedSaved, "providerKey") === current?.providerKey &&
readStringMeta(parsedSaved, "environmentId") === current?.environmentId &&
readStringMeta(parsedSaved, "leaseId") === current?.leaseId &&
readStringMeta(parsedSaved, "remoteCwd") === current?.remoteCwd &&
readStringMeta(parsedSaved, "paperclipApiUrl") === (current?.paperclipApiUrl ?? null)
readStringMeta(parsedSaved, "remoteCwd") === current?.remoteCwd
);
}
@@ -392,7 +863,6 @@ export function parseAdapterExecutionTarget(value: unknown): AdapterExecutionTar
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
remoteCwd: spec.remoteCwd,
paperclipApiUrl: readStringMeta(parsed, "paperclipApiUrl") ?? spec.paperclipApiUrl ?? null,
spec,
};
}
@@ -407,7 +877,6 @@ export function parseAdapterExecutionTarget(value: unknown): AdapterExecutionTar
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
remoteCwd,
paperclipApiUrl: readStringMeta(parsed, "paperclipApiUrl"),
timeoutMs: typeof parsed.timeoutMs === "number" ? parsed.timeoutMs : null,
};
}
@@ -428,7 +897,6 @@ export function adapterExecutionTargetFromRemoteExecution(
environmentId: metadata.environmentId ?? null,
leaseId: metadata.leaseId ?? null,
remoteCwd: ssh.remoteCwd,
paperclipApiUrl: ssh.paperclipApiUrl ?? null,
spec: ssh,
};
}
@@ -450,18 +918,24 @@ export function readAdapterExecutionTarget(input: {
}
export async function prepareAdapterExecutionTargetRuntime(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
adapterKey: string;
workspaceLocalDir: string;
timeoutSec?: number;
workspaceRemoteDir?: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: AdapterManagedRuntimeAsset[];
installCommand?: string | null;
/** When provided alongside `installCommand`, skip the install if the binary is already on PATH. */
detectCommand?: string | null;
}): Promise<PreparedAdapterExecutionTargetRuntime> {
const target = input.target ?? { kind: "local" as const };
if (target.kind === "local") {
return {
target,
workspaceRemoteDir: null,
runtimeRootDir: null,
assetDirs: {},
restoreWorkspace: async () => {},
@@ -471,12 +945,15 @@ export async function prepareAdapterExecutionTargetRuntime(input: {
if (target.transport === "ssh") {
const prepared = await prepareRemoteManagedRuntime({
spec: target.spec,
runId: input.runId,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir: input.workspaceRemoteDir,
assets: input.assets,
});
return {
target,
workspaceRemoteDir: prepared.workspaceRemoteDir,
runtimeRootDir: prepared.runtimeRootDir,
assetDirs: prepared.assetDirs,
restoreWorkspace: prepared.restoreWorkspace,
@@ -487,20 +964,26 @@ export async function prepareAdapterExecutionTargetRuntime(input: {
runner: requireSandboxRunner(target),
spec: {
providerKey: target.providerKey,
shellCommand: target.shellCommand,
leaseId: target.leaseId,
remoteCwd: target.remoteCwd,
timeoutMs: target.timeoutMs,
paperclipApiUrl: target.paperclipApiUrl,
timeoutMs:
input.timeoutSec && input.timeoutSec > 0
? input.timeoutSec * 1000
: target.timeoutMs,
},
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir: input.workspaceRemoteDir,
workspaceExclude: input.workspaceExclude,
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
installCommand: input.installCommand,
detectCommand: input.detectCommand,
});
return {
target,
workspaceRemoteDir: prepared.workspaceRemoteDir,
runtimeRootDir: prepared.runtimeRootDir,
assetDirs: prepared.assetDirs,
restoreWorkspace: prepared.restoreWorkspace,
@@ -514,3 +997,200 @@ export function runtimeAssetDir(
): string {
return prepared.assetDirs[key] ?? path.posix.join(fallbackRemoteCwd, ".paperclip-runtime", key);
}
function buildBridgeResponseHeaders(response: Response): Record<string, string> {
const out: Record<string, string> = {};
for (const key of ["content-type", "etag", "last-modified"]) {
const value = response.headers.get(key);
if (value && value.trim().length > 0) out[key] = value.trim();
}
return out;
}
function buildBridgeForwardUrl(baseUrl: string, request: { path: string; query: string }): URL {
const url = new URL(request.path, baseUrl);
const query = request.query.trim();
url.search = query.startsWith("?") ? query.slice(1) : query;
return url;
}
function bridgeResponseBodyLimitError(maxBodyBytes: number): Error {
return new Error(`Bridge response body exceeded the configured size limit of ${maxBodyBytes} bytes.`);
}
async function readBridgeForwardResponseBody(response: Response, maxBodyBytes: number): Promise<string> {
const rawContentLength = response.headers.get("content-length");
if (rawContentLength) {
const contentLength = Number.parseInt(rawContentLength, 10);
if (Number.isFinite(contentLength) && contentLength > maxBodyBytes) {
throw bridgeResponseBodyLimitError(maxBodyBytes);
}
}
if (!response.body) {
return "";
}
const reader = response.body.getReader();
const chunks: Buffer[] = [];
let totalBytes = 0;
while (true) {
const { done, value } = await reader.read();
if (done) break;
if (!value) continue;
totalBytes += value.byteLength;
if (totalBytes > maxBodyBytes) {
await reader.cancel().catch(() => undefined);
throw bridgeResponseBodyLimitError(maxBodyBytes);
}
chunks.push(Buffer.from(value));
}
return Buffer.concat(chunks, totalBytes).toString("utf8");
}
export async function startAdapterExecutionTargetPaperclipBridge(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
runtimeRootDir: string | null | undefined;
adapterKey: string;
timeoutSec?: number | null;
hostApiToken: string | null | undefined;
hostApiUrl?: string | null;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
maxBodyBytes?: number | null;
}): Promise<AdapterExecutionTargetPaperclipBridgeHandle | null> {
if (!adapterExecutionTargetUsesPaperclipBridge(input.target)) {
return null;
}
if (!input.target || input.target.kind !== "remote") {
return null;
}
const target = input.target;
const onLog = input.onLog ?? (async () => {});
const hostApiToken = input.hostApiToken?.trim() ?? "";
if (hostApiToken.length === 0) {
throw new Error("Sandbox bridge mode requires a host-side Paperclip API token.");
}
const runtimeRootDir =
input.runtimeRootDir?.trim().length
? input.runtimeRootDir.trim()
: path.posix.join(target.remoteCwd, ".paperclip-runtime", input.adapterKey);
const bridgeRuntimeDir = path.posix.join(runtimeRootDir, "paperclip-bridge");
const queueDir = path.posix.join(bridgeRuntimeDir, "queue");
const assetRemoteDir = path.posix.join(bridgeRuntimeDir, "server");
const bridgeToken = createSandboxCallbackBridgeToken();
const maxBodyBytes =
typeof input.maxBodyBytes === "number" && Number.isFinite(input.maxBodyBytes) && input.maxBodyBytes > 0
? Math.trunc(input.maxBodyBytes)
: DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES;
const hostApiUrl =
input.hostApiUrl?.trim() ||
process.env.PAPERCLIP_RUNTIME_API_URL?.trim() ||
process.env.PAPERCLIP_API_URL?.trim() ||
resolveDefaultPaperclipApiUrl();
const shellCommand = adapterExecutionTargetShellCommand(target);
const runner = adapterExecutionTargetCommandRunner(target);
const bridgeTimeoutMs =
typeof input.timeoutSec === "number" && Number.isFinite(input.timeoutSec) && input.timeoutSec > 0
? Math.trunc(input.timeoutSec * 1000)
: adapterExecutionTargetTimeoutMs(target);
await onLog(
"stdout",
`[paperclip] Starting sandbox callback bridge for ${input.adapterKey} in ${bridgeRuntimeDir}.\n`,
);
const bridgeAsset = await createSandboxCallbackBridgeAsset();
let server: Awaited<ReturnType<typeof startSandboxCallbackBridgeServer>> | null = null;
let worker: Awaited<ReturnType<typeof startSandboxCallbackBridgeWorker>> | null = null;
try {
const client = createCommandManagedSandboxCallbackBridgeQueueClient({
runner,
remoteCwd: target.remoteCwd,
timeoutMs: bridgeTimeoutMs,
shellCommand,
});
// PAPERCLIP_BRIDGE_DEBUG opts into verbose stdout logs of every bridge
// proxy request/response. The query string is logged verbatim, so callers
// who pass auth tokens or other sensitive values as query parameters
// should be aware those values appear in the host process's stdout when
// this flag is enabled. Only intended for active debugging in trusted
// environments.
const bridgeDebugEnabled = isBridgeDebugEnabled(process.env);
worker = await startSandboxCallbackBridgeWorker({
client,
queueDir,
maxBodyBytes,
handleRequest: async (request) => {
const method = request.method.trim().toUpperCase() || "GET";
if (bridgeDebugEnabled) {
await onLog(
"stdout",
`[paperclip] Bridge proxy ${method} ${request.path}${request.query ? `?${request.query}` : ""}\n`,
);
}
const headers = new Headers();
for (const [key, value] of Object.entries(request.headers)) {
if (value.trim().length === 0) continue;
headers.set(key, value);
}
headers.set("authorization", `Bearer ${hostApiToken}`);
headers.set("x-paperclip-run-id", input.runId);
const response = await fetch(buildBridgeForwardUrl(hostApiUrl, request), {
method,
headers,
...(method === "GET" || method === "HEAD" ? {} : { body: request.body }),
signal: AbortSignal.timeout(30_000),
});
if (bridgeDebugEnabled) {
await onLog(
"stdout",
`[paperclip] Bridge proxy response ${response.status} for ${method} ${request.path}${request.query ? `?${request.query}` : ""}\n`,
);
}
return {
status: response.status,
headers: buildBridgeResponseHeaders(response),
body: await readBridgeForwardResponseBody(response, maxBodyBytes),
};
},
});
server = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: target.remoteCwd,
assetRemoteDir,
queueDir,
bridgeToken,
bridgeAsset,
timeoutMs: bridgeTimeoutMs,
maxBodyBytes,
shellCommand,
});
} catch (error) {
await Promise.allSettled([
server?.stop(),
worker?.stop(),
bridgeAsset.cleanup(),
]);
throw error;
}
return {
env: {
PAPERCLIP_API_URL: server.baseUrl,
PAPERCLIP_API_KEY: bridgeToken,
PAPERCLIP_API_BRIDGE_MODE: "queue_v1",
},
stop: async () => {
await Promise.allSettled([
server?.stop(),
]);
await Promise.allSettled([
worker?.stop(),
bridgeAsset.cleanup(),
]);
},
};
}
+20
View File
@@ -20,11 +20,14 @@ export type {
AdapterSkillContext,
AdapterSessionCodec,
AdapterModel,
AdapterModelProfileKey,
AdapterModelProfileDefinition,
HireApprovedPayload,
HireApprovedHookResult,
ConfigFieldOption,
ConfigFieldSchema,
AdapterConfigSchema,
AdapterRuntimeCommandSpec,
ServerAdapterModule,
QuotaWindow,
ProviderQuotaResult,
@@ -53,4 +56,21 @@ export {
redactHomePathUserSegmentsInValue,
redactTranscriptEntryPaths,
} from "./log-redaction.js";
export {
REDACTED_COMMAND_TEXT_VALUE,
redactCommandText,
} from "./command-redaction.js";
export { buildSandboxNpmInstallCommand } from "./sandbox-install-command.js";
export { inferOpenAiCompatibleBiller } from "./billing.js";
// Keep the root adapter-utils entry browser-safe because the UI imports it.
// The sandbox callback bridge stays available via its dedicated subpath export.
export type {
SandboxCallbackBridgeRequest,
SandboxCallbackBridgeResponse,
SandboxCallbackBridgeAsset,
SandboxCallbackBridgeDirectories,
SandboxCallbackBridgeRouteRule,
SandboxCallbackBridgeQueueClient,
SandboxCallbackBridgeWorkerHandle,
StartedSandboxCallbackBridgeServer,
} from "./sandbox-callback-bridge.js";
@@ -0,0 +1,49 @@
const REMOTE_EXECUTION_ENV_IDENTITY_KEYS = new Set([
"PATH",
"HOME",
"PWD",
"SHELL",
"USER",
"LOGNAME",
"NVM_DIR",
"TMPDIR",
"TMP",
"TEMP",
"XDG_CONFIG_HOME",
"XDG_CACHE_HOME",
"XDG_DATA_HOME",
"XDG_STATE_HOME",
"XDG_RUNTIME_DIR",
]);
function readEnvValueCaseInsensitive(env: NodeJS.ProcessEnv, key: string): string | undefined {
const direct = env[key];
if (typeof direct === "string") return direct;
const upper = key.toUpperCase();
for (const [candidateKey, candidateValue] of Object.entries(env)) {
if (candidateKey.toUpperCase() === upper && typeof candidateValue === "string") {
return candidateValue;
}
}
return undefined;
}
export function sanitizeRemoteExecutionEnv(
env: Record<string, string>,
inheritedEnv: NodeJS.ProcessEnv = process.env,
): Record<string, string> {
const sanitized: Record<string, string> = {};
for (const [key, value] of Object.entries(env)) {
const normalizedKey = key.toUpperCase();
if (!REMOTE_EXECUTION_ENV_IDENTITY_KEYS.has(normalizedKey)) {
sanitized[key] = value;
continue;
}
const inheritedValue = readEnvValueCaseInsensitive(inheritedEnv, key);
if (typeof inheritedValue === "string" && inheritedValue === value) {
continue;
}
sanitized[key] = value;
}
return sanitized;
}
@@ -5,6 +5,7 @@ import {
restoreWorkspaceFromSshExecution,
syncDirectoryToSsh,
} from "./ssh.js";
import { captureDirectorySnapshot } from "./workspace-restore-merge.js";
export interface RemoteManagedRuntimeAsset {
key: string;
@@ -44,7 +45,6 @@ export function buildRemoteExecutionSessionIdentity(spec: SshRemoteExecutionSpec
port: spec.port,
username: spec.username,
remoteCwd: spec.remoteCwd,
...(spec.paperclipApiUrl ? { paperclipApiUrl: spec.paperclipApiUrl } : {}),
} as const;
}
@@ -58,26 +58,37 @@ export function remoteExecutionSessionMatches(saved: unknown, current: SshRemote
asString(parsedSaved.host) === currentIdentity.host &&
asNumber(parsedSaved.port) === currentIdentity.port &&
asString(parsedSaved.username) === currentIdentity.username &&
asString(parsedSaved.remoteCwd) === currentIdentity.remoteCwd &&
asString(parsedSaved.paperclipApiUrl) === asString(currentIdentity.paperclipApiUrl)
asString(parsedSaved.remoteCwd) === currentIdentity.remoteCwd
);
}
export async function prepareRemoteManagedRuntime(input: {
spec: SshRemoteExecutionSpec;
runId: string;
adapterKey: string;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
assets?: RemoteManagedRuntimeAsset[];
}): Promise<PreparedRemoteManagedRuntime> {
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const baseWorkspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const workspaceRemoteDir = path.posix.join(
baseWorkspaceRemoteDir,
".paperclip-runtime",
"runs",
input.runId,
"workspace",
);
const runtimeRootDir = path.posix.join(workspaceRemoteDir, ".paperclip-runtime", input.adapterKey);
await prepareWorkspaceForSshExecution({
const preparedWorkspace = await prepareWorkspaceForSshExecution({
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
});
const restoreExclude = preparedWorkspace.gitBacked ? [".git", ".paperclip-runtime"] : [".paperclip-runtime"];
const baselineSnapshot = await captureDirectorySnapshot(input.workspaceLocalDir, {
exclude: restoreExclude,
});
const assetDirs: Record<string, string> = {};
try {
@@ -97,6 +108,8 @@ export async function prepareRemoteManagedRuntime(input: {
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
baselineSnapshot,
restoreGitHistory: preparedWorkspace.gitBacked,
});
throw error;
}
@@ -112,6 +125,8 @@ export async function prepareRemoteManagedRuntime(input: {
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
baselineSnapshot,
restoreGitHistory: preparedWorkspace.gitBacked,
});
},
};
@@ -0,0 +1,983 @@
import { execFile as execFileCallback } from "node:child_process";
import { mkdir, mkdtemp, readFile, readdir, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { promisify } from "node:util";
import { afterEach, describe, expect, it, vi } from "vitest";
import { prepareCommandManagedRuntime } from "./command-managed-runtime.js";
import {
authorizeSandboxCallbackBridgeRequestWithRoutes,
createCommandManagedSandboxCallbackBridgeQueueClient,
createFileSystemSandboxCallbackBridgeQueueClient,
createSandboxCallbackBridgeAsset,
createSandboxCallbackBridgeToken,
sandboxCallbackBridgeDirectories,
syncSandboxCallbackBridgeEntrypoint,
startSandboxCallbackBridgeServer,
startSandboxCallbackBridgeWorker,
} from "./sandbox-callback-bridge.js";
import type { RunProcessResult } from "./server-utils.js";
const execFile = promisify(execFileCallback);
describe("sandbox callback bridge", () => {
const cleanupDirs: string[] = [];
const cleanupFns: Array<() => Promise<void>> = [];
function createExecRunner() {
return {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}): Promise<RunProcessResult> => {
const startedAt = new Date().toISOString();
const env = {
...process.env,
...input.env,
};
const command =
input.command === "sh" ? "/bin/sh" : input.command === "bash" ? "/bin/bash" : input.command;
const args = [...(input.args ?? [])];
if (
input.stdin != null &&
(input.command === "sh" || input.command === "bash") &&
(args[0] === "-c" || args[0] === "-lc") &&
typeof args[1] === "string"
) {
env.PAPERCLIP_TEST_STDIN = input.stdin;
args[1] = `printf '%s' \"$PAPERCLIP_TEST_STDIN\" | (${args[1]})`;
}
try {
const result = await execFile(command, args, {
cwd: input.cwd,
env,
maxBuffer: 32 * 1024 * 1024,
timeout: input.timeoutMs,
});
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const err = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
code?: string | number | null;
signal?: NodeJS.Signals | null;
killed?: boolean;
};
return {
exitCode: typeof err.code === "number" ? err.code : null,
signal: err.signal ?? null,
timedOut: Boolean(err.killed && input.timeoutMs),
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
pid: null,
startedAt,
};
}
},
};
}
async function waitForJsonFile(directory: string, timeoutMs = 2_000): Promise<string> {
const deadline = Date.now() + timeoutMs;
while (Date.now() < deadline) {
const entries = await readdir(directory).catch(() => []);
const match = entries.find((entry) => entry.endsWith(".json"));
if (match) return match;
await new Promise((resolve) => setTimeout(resolve, 10));
}
throw new Error(`Timed out waiting for a JSON file in ${directory}.`);
}
afterEach(async () => {
while (cleanupFns.length > 0) {
const cleanup = cleanupFns.pop();
if (!cleanup) continue;
await cleanup().catch(() => undefined);
}
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("round-trips localhost bridge requests over the sandbox queue without forwarding the bridge token", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-runtime-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [
{
key: "bridge",
localDir: bridgeAsset.localDir,
},
],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const seenRequests: Array<{
method: string;
path: string;
query: string;
headers: Record<string, string>;
body: string;
}> = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async (request) =>
request.path === "/api/agents/me" ? null : `Route not allowed: ${request.method} ${request.path}`,
handleRequest: async (request) => {
seenRequests.push({
method: request.method,
path: request.path,
query: request.query,
headers: request.headers,
body: request.body,
});
return {
status: 200,
headers: {
"content-type": "application/json",
etag: '"bridge-rev-1"',
"last-modified": "Tue, 01 Apr 2025 00:00:00 GMT",
},
body: JSON.stringify({
ok: true,
method: request.method,
path: request.path,
}),
};
},
});
cleanupFns.push(async () => {
await worker.stop();
});
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const okResponse = await fetch(`${bridge.baseUrl}/api/agents/me?view=compact`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
accept: "application/json",
"if-none-match": '"client-cache-key"',
"x-paperclip-run-id": "run-bridge-1",
"x-bridge-debug": "drop-me",
},
});
expect(okResponse.status).toBe(200);
expect(okResponse.headers.get("content-type")).toContain("application/json");
expect(okResponse.headers.get("etag")).toBe('"bridge-rev-1"');
expect(okResponse.headers.get("last-modified")).toBe("Tue, 01 Apr 2025 00:00:00 GMT");
await expect(okResponse.json()).resolves.toMatchObject({
ok: true,
method: "GET",
path: "/api/agents/me",
});
const deniedResponse = await fetch(`${bridge.baseUrl}/api/issues/issue-1`, {
method: "PATCH",
headers: {
authorization: `Bearer ${bridgeToken}`,
"content-type": "application/json",
},
body: JSON.stringify({ status: "in_progress" }),
});
expect(deniedResponse.status).toBe(403);
await expect(deniedResponse.json()).resolves.toMatchObject({
error: "Route not allowed: PATCH /api/issues/issue-1",
});
const unauthorizedResponse = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: "Bearer wrong-token",
},
});
expect(unauthorizedResponse.status).toBe(401);
await expect(unauthorizedResponse.json()).resolves.toMatchObject({
error: "Invalid bridge token.",
});
expect(seenRequests).toHaveLength(1);
expect(seenRequests[0]).toMatchObject({
method: "GET",
path: "/api/agents/me",
query: "?view=compact",
body: "",
headers: {
accept: "application/json",
"if-none-match": '"client-cache-key"',
},
});
expect(seenRequests[0]?.headers.authorization).toBeUndefined();
expect(seenRequests[0]?.headers["x-paperclip-run-id"]).toBeUndefined();
});
it("denies non-allowlisted requests by default", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-default-policy-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
let handled = 0;
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
handleRequest: async () => {
handled += 1;
return {
status: 200,
body: "should not happen",
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-1.json"),
`${JSON.stringify({
id: "req-1",
method: "DELETE",
path: "/api/secrets",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await worker.stop({ drainTimeoutMs: 1_000 });
const response = JSON.parse(
await readFile(path.posix.join(directories.responsesDir, "req-1.json"), "utf8"),
) as { status: number; body: string };
expect(handled).toBe(0);
expect(response.status).toBe(403);
expect(JSON.parse(response.body)).toEqual({
error: "Route not allowed: DELETE /api/secrets",
});
});
it("drains already-queued requests on stop", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-drain-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const processed: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
processed.push(request.id);
await new Promise((resolve) => setTimeout(resolve, 25));
return {
status: 200,
body: request.id,
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-a.json"),
`${JSON.stringify({
id: "req-a",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await writeFile(
path.posix.join(directories.requestsDir, "req-b.json"),
`${JSON.stringify({
id: "req-b",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await worker.stop({ drainTimeoutMs: 1_000 });
expect(processed).toEqual(["req-a", "req-b"]);
await expect(readFile(path.posix.join(directories.responsesDir, "req-a.json"), "utf8")).resolves.toContain("\"req-a\"");
await expect(readFile(path.posix.join(directories.responsesDir, "req-b.json"), "utf8")).resolves.toContain("\"req-b\"");
});
it("writes fast 503 responses for queued requests that miss the drain deadline", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-drain-timeout-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const processed: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
processed.push(request.id);
await new Promise((resolve) => setTimeout(resolve, 100));
return {
status: 200,
body: request.id,
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-a.json"),
`${JSON.stringify({
id: "req-a",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await writeFile(
path.posix.join(directories.requestsDir, "req-b.json"),
`${JSON.stringify({
id: "req-b",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
for (let attempt = 0; attempt < 50 && processed.length === 0; attempt += 1) {
await new Promise((resolve) => setTimeout(resolve, 5));
}
await worker.stop({ drainTimeoutMs: 10 });
expect(processed).toEqual(["req-a"]);
await expect(readFile(path.posix.join(directories.responsesDir, "req-a.json"), "utf8")).resolves.toContain("\"req-a\"");
await expect(readFile(path.posix.join(directories.responsesDir, "req-b.json"), "utf8")).resolves.toContain(
"Bridge worker stopped before request could be handled.",
);
});
it("handles SSH queue polling failures without emitting an unhandled rejection", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-ssh-failure-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const unhandled: unknown[] = [];
const onUnhandledRejection = (reason: unknown) => {
unhandled.push(reason);
};
process.on("unhandledRejection", onUnhandledRejection);
try {
const worker = await startSandboxCallbackBridgeWorker({
client: {
makeDir: async () => {},
listJsonFiles: async () => {
throw new Error(
"list /remote/.paperclip-runtime/gemini/paperclip-bridge/queue/requests failed with exit code 255: kex_exchange_identification: read: Connection reset by peer",
);
},
readTextFile: async () => {
throw new Error("unexpected readTextFile");
},
writeTextFile: async () => {
throw new Error("unexpected writeTextFile");
},
rename: async () => {
throw new Error("unexpected rename");
},
remove: async () => {},
},
queueDir,
authorizeRequest: async () => null,
handleRequest: async () => ({
status: 200,
body: "ok",
}),
});
await new Promise((resolve) => setTimeout(resolve, 50));
await worker.stop();
expect(unhandled).toEqual([]);
} finally {
process.off("unhandledRejection", onUnhandledRejection);
}
});
it("serializes remote response writes so stop does not recreate a late orphaned response", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-response-lock-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge response lock test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const seenRequestIds: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createCommandManagedSandboxCallbackBridgeQueueClient({
runner,
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
}),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
seenRequestIds.push(request.id);
await new Promise((resolve) => setTimeout(resolve, 250));
return {
status: 200,
headers: { "content-type": "application/json" },
body: JSON.stringify({ ok: true, id: request.id }),
};
},
});
cleanupFns.push(async () => {
await worker.stop();
});
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const responsePromise = fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
for (let attempt = 0; attempt < 50 && seenRequestIds.length === 0; attempt += 1) {
await new Promise((resolve) => setTimeout(resolve, 5));
}
expect(seenRequestIds).toHaveLength(1);
await worker.stop({ drainTimeoutMs: 10 });
const response = await responsePromise;
expect(response.status).toBe(503);
await expect(response.json()).resolves.toEqual({
error: "Bridge worker stopped before request could be handled.",
});
await new Promise((resolve) => setTimeout(resolve, 300));
await expect(readdir(directories.responsesDir)).resolves.toEqual([]);
await expect(
readdir(directories.responsesDir).then((entries) =>
entries.filter((entry) => entry.endsWith(".tmp") || entry.includes(".paperclip-write.lock")),
),
).resolves.toEqual([]);
});
it("rejects non-JSON request bodies and full queues at the bridge server", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-server-guards-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge guard test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
maxQueueDepth: 1,
});
cleanupFns.push(async () => {
await bridge.stop();
});
await writeFile(
path.posix.join(directories.requestsDir, "existing.json"),
`${JSON.stringify({
id: "existing",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
const queueFullResponse = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
expect(queueFullResponse.status).toBe(503);
await expect(queueFullResponse.json()).resolves.toEqual({
error: "Bridge request queue is full.",
});
await rm(path.posix.join(directories.requestsDir, "existing.json"), { force: true });
const nonJsonResponse = await fetch(`${bridge.baseUrl}/api/issues/issue-1/comments`, {
method: "POST",
headers: {
authorization: `Bearer ${bridgeToken}`,
"content-type": "text/plain",
},
body: "not json",
});
expect(nonJsonResponse.status).toBe(415);
await expect(nonJsonResponse.json()).resolves.toEqual({
error: "Bridge only accepts JSON request bodies.",
});
});
it("returns a 502 when the host response times out", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-timeout-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge timeout test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
pollIntervalMs: 10,
responseTimeoutMs: 75,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const response = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
expect(response.status).toBe(502);
await expect(response.json()).resolves.toEqual({
error: "Timed out waiting for host bridge response.",
});
});
it("returns a 502 for malformed host response files", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-malformed-response-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge malformed response test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
pollIntervalMs: 10,
responseTimeoutMs: 1_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const responsePromise = fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
const requestFile = await waitForJsonFile(directories.requestsDir);
await writeFile(
path.posix.join(directories.responsesDir, requestFile),
'{"status":200,"headers":{"content-type":"application/json"},"body"',
"utf8",
);
const response = await responsePromise;
expect(response.status).toBe(502);
await expect(response.json()).resolves.toMatchObject({
error: expect.stringMatching(/JSON|Unexpected|Unterminated/i),
});
});
it("reuses an already-uploaded bridge entrypoint when the remote file hash matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-sync-"));
cleanupDirs.push(rootDir);
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
const remoteAssetDir = path.posix.join(
remoteWorkspaceDir,
".paperclip-runtime",
"codex",
"paperclip-bridge",
"server",
);
await mkdir(remoteWorkspaceDir, { recursive: true });
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const originalSource = await readFile(bridgeAsset.entrypoint, "utf8");
const expandedSource = `${originalSource}\n// bridge payload padding\n`;
await writeFile(bridgeAsset.entrypoint, expandedSource, "utf8");
const runner = createExecRunner();
const first = await syncSandboxCallbackBridgeEntrypoint({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: remoteAssetDir,
bridgeAsset,
timeoutMs: 30_000,
});
const second = await syncSandboxCallbackBridgeEntrypoint({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: remoteAssetDir,
bridgeAsset,
timeoutMs: 30_000,
});
expect(first.uploaded).toBe(true);
expect(second.uploaded).toBe(false);
await expect(readFile(path.posix.join(remoteAssetDir, "paperclip-bridge-server.mjs"), "utf8")).resolves.toBe(expandedSource);
await expect(
readdir(remoteAssetDir).then((entries) =>
entries.filter(
(entry) =>
entry.endsWith(".paperclip-upload.b64") ||
entry.endsWith(".partial") ||
entry === ".paperclip-bridge-upload.lock",
),
),
).resolves.toEqual([]);
});
it("rejects a corrupted bridge entrypoint upload without committing a torn remote file", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-sync-corrupt-"));
cleanupDirs.push(rootDir);
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
const remoteAssetDir = path.posix.join(
remoteWorkspaceDir,
".paperclip-runtime",
"codex",
"paperclip-bridge",
"server",
);
await mkdir(remoteWorkspaceDir, { recursive: true });
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const runner = {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}) =>
await createExecRunner().execute({
...input,
stdin: input.stdin != null ? "" : input.stdin,
}),
};
await expect(
syncSandboxCallbackBridgeEntrypoint({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: remoteAssetDir,
bridgeAsset,
timeoutMs: 30_000,
}),
).rejects.toThrow(/sha mismatch/i);
await expect(readFile(path.posix.join(remoteAssetDir, "paperclip-bridge-server.mjs"), "utf8")).rejects.toThrow();
await expect(
readdir(remoteAssetDir).then((entries) =>
entries.filter(
(entry) =>
entry.endsWith(".paperclip-upload.b64") ||
entry.endsWith(".partial") ||
entry === ".paperclip-bridge-upload.lock",
),
),
).resolves.toEqual([]);
});
it("permits the documented heartbeat surface and denies unrelated routes", () => {
const allowed: Array<{ method: string; path: string }> = [
{ method: "GET", path: "/api/agents/me" },
{ method: "GET", path: "/api/agents/me/inbox-lite" },
{ method: "GET", path: "/api/agents/me/inbox/mine" },
{ method: "GET", path: "/api/agents/agent-1" },
{ method: "GET", path: "/api/agents/agent-1/skills" },
{ method: "POST", path: "/api/agents/agent-1/skills/sync" },
{ method: "PATCH", path: "/api/agents/agent-1/instructions-path" },
{ method: "GET", path: "/api/companies/co-1" },
{ method: "GET", path: "/api/companies/co-1/dashboard" },
{ method: "GET", path: "/api/companies/co-1/agents" },
{ method: "GET", path: "/api/companies/co-1/issues" },
{ method: "GET", path: "/api/companies/co-1/projects" },
{ method: "GET", path: "/api/companies/co-1/goals" },
{ method: "GET", path: "/api/companies/co-1/org" },
{ method: "GET", path: "/api/companies/co-1/approvals" },
{ method: "GET", path: "/api/companies/co-1/routines" },
{ method: "GET", path: "/api/companies/co-1/skills" },
{ method: "GET", path: "/api/projects/proj-1" },
{ method: "GET", path: "/api/goals/goal-1" },
{ method: "GET", path: "/api/issues/issue-1" },
{ method: "GET", path: "/api/issues/issue-1/heartbeat-context" },
{ method: "GET", path: "/api/issues/issue-1/comments" },
{ method: "GET", path: "/api/issues/issue-1/comments/c-1" },
{ method: "POST", path: "/api/issues/issue-1/comments" },
{ method: "GET", path: "/api/issues/issue-1/documents" },
{ method: "GET", path: "/api/issues/issue-1/documents/plan" },
{ method: "GET", path: "/api/issues/issue-1/documents/plan/revisions" },
{ method: "PUT", path: "/api/issues/issue-1/documents/plan" },
{ method: "POST", path: "/api/issues/issue-1/checkout" },
{ method: "POST", path: "/api/issues/issue-1/release" },
{ method: "PATCH", path: "/api/issues/issue-1" },
{ method: "GET", path: "/api/issues/issue-1/approvals" },
{ method: "GET", path: "/api/issues/issue-1/interactions" },
{ method: "GET", path: "/api/issues/issue-1/interactions/inter-1" },
{ method: "POST", path: "/api/issues/issue-1/interactions" },
{ method: "POST", path: "/api/issues/issue-1/interactions/inter-1/accept" },
{ method: "POST", path: "/api/issues/issue-1/interactions/inter-1/reject" },
{ method: "POST", path: "/api/issues/issue-1/interactions/inter-1/respond" },
{ method: "POST", path: "/api/companies/co-1/issues" },
{ method: "GET", path: "/api/approvals/ap-1" },
{ method: "GET", path: "/api/approvals/ap-1/issues" },
{ method: "GET", path: "/api/approvals/ap-1/comments" },
{ method: "POST", path: "/api/approvals/ap-1/comments" },
{ method: "POST", path: "/api/companies/co-1/approvals" },
{ method: "GET", path: "/api/execution-workspaces/ws-1" },
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/start" },
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/stop" },
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/restart" },
{ method: "GET", path: "/api/routines/r-1" },
{ method: "GET", path: "/api/routines/r-1/runs" },
{ method: "POST", path: "/api/companies/co-1/routines" },
{ method: "PATCH", path: "/api/routines/r-1" },
{ method: "POST", path: "/api/routines/r-1/run" },
{ method: "POST", path: "/api/routines/r-1/triggers" },
{ method: "PATCH", path: "/api/routine-triggers/t-1" },
{ method: "DELETE", path: "/api/routine-triggers/t-1" },
];
for (const request of allowed) {
expect(authorizeSandboxCallbackBridgeRequestWithRoutes(request)).toBeNull();
}
const denied: Array<{ method: string; path: string }> = [
{ method: "DELETE", path: "/api/secrets" },
// Pin the runtime-services regex to start/stop/restart only — anything
// else (delete, reset, wipe, etc.) must stay denied even if the API
// grows new actions later.
{ method: "POST", path: "/api/execution-workspaces/ws-1/runtime-services/delete" },
{ method: "POST", path: "/api/companies/co-1/agents" },
{ method: "POST", path: "/api/agents/agent-1/pause" },
{ method: "POST", path: "/api/agents/agent-1/terminate" },
{ method: "POST", path: "/api/agents/agent-1/keys" },
{ method: "POST", path: "/api/companies/co-1/exports" },
{ method: "POST", path: "/api/companies/co-1/imports/apply" },
{ method: "POST", path: "/api/companies/co-1/archive" },
{ method: "DELETE", path: "/api/issues/issue-1/documents/plan" },
{ method: "DELETE", path: "/api/issues/issue-1/approvals/ap-1" },
{ method: "POST", path: "/api/approvals/ap-1/approve" },
{ method: "POST", path: "/api/approvals/ap-1/reject" },
{ method: "POST", path: "/api/companies/co-1/logo" },
{ method: "GET", path: "/api/companies/co-1/secrets" },
{ method: "PATCH", path: "/api/secrets/secret-1" },
];
for (const request of denied) {
expect(authorizeSandboxCallbackBridgeRequestWithRoutes(request)).toBe(
`Route not allowed: ${request.method} ${request.path}`,
);
}
});
it("marks command-managed bridge operations with the bridge execution channel", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const client = createCommandManagedSandboxCallbackBridgeQueueClient({
runner,
remoteCwd: "/workspace",
timeoutMs: 30_000,
});
await client.makeDir("/workspace/.paperclip-runtime/codex/paperclip-bridge/queue");
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
env: {
PAPERCLIP_SANDBOX_EXEC_CHANNEL: "bridge",
},
}));
});
});

Some files were not shown because too many files have changed in this diff Show More