349 Commits

Author SHA1 Message Date
Chris Farhood 9f3f71a199 fork: add CLAUDE.md describing fork model and don'ts
Build: Dev / build (push) Failing after 9s
Build: Production / build (push) Successful in 3m17s
Build: Dev / update-infra (push) Has been skipped
Captures branch model (master/dev/local), the 3-file fork delta, upstream
sync procedure, and the post-reset rule against re-introducing fork code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 08:35:03 -04:00
Chris Farhood 1a1c57461f fork: production Dockerfile additions + Gitea registry build workflows
Build: Dev / update-infra (push) Successful in 1s
Build: Production / build (push) Successful in 4m6s
Build: Dev / build (push) Successful in 4m15s
Only fork divergence from upstream/master. Adds to the production stage:
  - kubectl, kubeseal (Kubernetes ops in deployed pods)
  - uv, uvx (Python tooling)
  - forgejo-cli (fj, fj-ex, fgj)
  - gitea tea CLI
  - mmx-cli
  - nano, vim

Workflows push to git.farh.net/farhoodlabs/paperclip{,-dev}.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 08:11:45 -04:00
Devin Foley 911a1e8b0d Fix continuation recovery retry streaks by failure cause (#7031)
Release / verify_stable (push) Has been skipped
Release / preview_stable (push) Has been skipped
Release / verify_canary (push) Failing after 3m52s
Docker / build-and-push (push) Failing after 7s
Refresh Lockfile / refresh (push) Failing after 18s
Release / publish_stable (push) Has been skipped
Release / publish_canary (push) Has been skipped
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The recovery subsystem is responsible for keeping assigned work
moving when a live heartbeat run disappears or fails.
> - `continuation_recovery` is the path that re-enqueues stranded
`in_progress` issues after an interrupted continuation attempt.
> - That path recently gained cause-aware retry classes and transient
retry caps, but the streak counter was still aggregating mixed failure
causes into one retry history.
> - That meant a sequence like `timeout -> timeout -> adapter_failed ->
adapter_failed` could escalate as a false `3x adapter_failed` streak
even though the latest cause had only happened twice.
> - This pull request makes continuation retry streaks count only
consecutive failures whose `errorCode` matches the latest run and adds a
regression test for the mixed-cause case.
> - The benefit is that transient retry backoff and escalation now match
the actual current failure cause instead of inheriting stale budget from
unrelated failures.

## What Changed

- Updated `summarizeRecentContinuationRetries(...)` to stop counting as
soon as the continuation failure cause no longer matches the latest
run's `errorCode`.
- Wired the continuation recovery escalation/backoff path to pass the
latest classified `errorCode` into the retry streak summarizer.
- Added a regression test proving mixed-cause continuation failures do
not consume the transient retry cap for a new failure cause.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- Low risk. The behavioral change is intentionally narrow, but any
future continuation retry modes that rely on `errorCode = null` will now
be counted as a separate streak bucket and should be kept in mind when
adding new retry classifications.

## Model Used

- OpenAI Codex via Paperclip `codex_local` (GPT-5-based Codex coding
agent; exact backend revision is not surfaced in the runtime), with tool
use, shell execution, and patch application in the local repository.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-29 19:48:59 -07:00
Devin Foley aea35fe695 exe.dev config UX: advanced-options disclosure, form-default fix, SSH key handling (PAPA-407) (#7025)
## Thinking Path

> - Paperclip orchestrates AI agents and provisions sandboxed execution
environments for them; one of those provisioners is the exe.dev plugin,
which runs each agent inside a long-lived VM reached over SSH.
> - The instance-config form for that plugin is rendered generically by
`JsonSchemaForm` from the plugin's `instanceConfigSchema`, so any UX
problem with the form is split between the shared form component and the
plugin's schema/runtime code.
> - Users coming in cold hit a 12-field flat config they couldn't reason
about (PAPA-407), a form that silently submitted `cpu: 0` for untouched
optional fields (PAPA-407 root cause), a `sshPrivateKey` textarea that
truncated RSA-4096 keys at 4096 chars (PAPA-449), a save flow that
accepted clearly-malformed keys and only blew up at lease time with raw
SSH stderr (PAPA-450, PAPA-451), and a manifest that didn't distinguish
"essential" from "advanced" knobs (PAPA-410 / PAPA-411 — duplicate
sub-issues with identical scope; PAPA-418 reconciliation kept PAPA-410
canonical).
> - These problems all point at the same surface (exe.dev sandbox
config) and are tightly coupled in code — PAPA-449/450/451 patch fields
that PAPA-410/411 introduce — so they get reviewed together.
> - This pull request lands the shared-form changes (advanced-options
disclosure, optional-scalar defaults) and the exe.dev-specific changes
(manifest restructure, longer `maxLength`, stderr translation, save-time
key validation) as five focused commits stacked on `master`.
> - The benefit is a config form that defaults to the two fields a new
user actually needs (API key + SSH private key) with a collapsible
disclosure for the rest, no silent truncation or zero-default
submissions, and SSH key problems surfaced at save time with actionable
messages instead of cryptic post-provision failures.

## What Changed

- **JsonSchemaForm advanced-options disclosure** (PAPA-410, PAPA-411 —
same scope, see note above): adds `x-paperclip-advanced` /
`x-paperclip-group` schema annotations and renders flagged fields behind
a collapsible "Advanced options" disclosure that auto-opens when a
hidden field has a validation error. Exe.dev manifest is restructured to
use the new annotations, so essentials (`apiKey`, `sshPrivateKey`) show
by default while the long tail of optional knobs is grouped under "SSH
access" / "VM resources" / "More options" headings.
- **Omit optional scalar defaults** (PAPA-407): `getDefaultForSchema` no
longer materialises `0` / `""` for optional
`number`/`integer`/`string`/`secret-ref` fields without an explicit
`default`. Object recursion drops properties whose default is
`undefined`. Fields that declare a `default` (e.g. `sshPort: 22`) still
round-trip. Adds a regression test against `getDefaultValues`.
- **Raise `sshPrivateKey` `maxLength`** (PAPA-449): bumps the exe.dev
manifest cap from 4096 to 8192 so RSA-4096 OpenSSH private keys (which
can exceed 4 KB with comments/metadata) aren't silently truncated at
submit.
- **Translate `invalid format` SSH stderr** (PAPA-450):
`formatSshFailure` now recognises `Load key … invalid format` in
combined stderr/stdout and returns a specific message naming the
key-format problem ("isn't an OpenSSH/PEM private key — confirm the
secret starts with `-----BEGIN … PRIVATE KEY-----` and isn't the `.pub`
or a PuTTY `.ppk` export") instead of dumping the raw stderr.
- **Save-time SSH key validation** (PAPA-451):
`onEnvironmentValidateConfig` inline-parses `sshPrivateKey` and rejects
common failure modes — pasted public keys, PuTTY `.ppk` format, missing
`-----END-----` footer, non-base64 body — so the form surfaces an inline
error before any VM is provisioned. Secret-ref bindings (UUIDs) are
still passed through unchanged.

## Verification

CI gates (`pnpm typecheck`, `pnpm test`, the targeted vitest suites
below) all pass.

Run locally:

```bash
# Shared form
pnpm --filter @paperclipai/ui exec vitest run src/components/JsonSchemaForm
# 9 tests pass — includes the new "omits optional scalar fields" regression
# and the three advanced-options-disclosure tests.

# exe.dev plugin
cd packages/plugins/sandbox-providers/exe-dev && pnpm test
# 32 tests pass — includes the new sshPrivateKey-validation cases
# and the new "invalid format" stderr-translation case.
```

Manual smoke (after reinstalling the plugin so the DB manifest
refreshes):

1. Open the exe.dev environment config page. **Default view shows API
Key + SSH Private Key only**, with an "Advanced options" disclosure for
everything else (PAPA-410 / PAPA-411).
2. Paste a `.pub` file's contents into SSH Private Key, click Save.
**Inline error** rejecting the wrong-format key (PAPA-451).
3. Re-paste a valid OpenSSH/PEM private key longer than 4096 bytes —
saves cleanly (PAPA-449).
4. Save the form with everything optional left blank — server no longer
rejects with `"cpu must be greater than 0 when provided"` (PAPA-407).
5. Force a bad key through via a stored secret-ref binding and lease a
VM — failure message names the key-format problem instead of dumping raw
SSH stderr (PAPA-450).

## Risks

- **PAPA-410 / PAPA-411 manifest restructure** is the largest surface
here. Schemas using `x-paperclip-*` extensions are forward-compatible
with stricter JSON Schema validators (extensions are ignored by
default), and the form gracefully renders a flat layout when no field
opts in.
- **PAPA-407** changes form-default behaviour: optional scalar fields
that previously round-tripped as `""` / `0` will now be `undefined` and
absent from the submitted payload. Downstream consumers that expected
the empty-string/zero shape need to treat the field as optional.
Spot-checked the existing exe.dev driver — it already uses
`parseOptionalString` / `parseOptionalInteger`, which treat missing
fields as `null` rather than `0`/`""`.
- **PAPA-451** adds a save-time check, so a
previously-saved-but-malformed `sshPrivateKey` raw value will now fail
to re-save. Bound secret-refs are unaffected, matching how the user
reaches the bad-key state today (via the secrets picker).
- **PAPA-449** simply raises a cap; no semantic risk.
- **PAPA-450** only kicks in on the "invalid format" code path; existing
onboarding-marker branch is untouched.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (`claude-opus-4-7`)
- Capabilities used: code reading, code editing, test execution, git/PR
mechanics, Paperclip API for issue coordination

## Checklist

- [x] PR body sections present (Thinking Path, What Changed,
Verification, Risks, Model Used, Checklist)
- [x] Unit tests added for the new behaviours (JsonSchemaForm
default-value omission + advanced disclosure; exe.dev plugin validation
+ stderr translation)
- [x] Existing tests still pass locally (`vitest run` on both packages)
- [x] No raw secrets, IP addresses, or machine-local config in commits
or PR body
- [x] Commits are atomic per linked issue (PAPA-410 / PAPA-411,
PAPA-407, PAPA-449, PAPA-450, PAPA-451)
- [x] Branch is up-to-date with `origin/master`

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-29 18:19:37 -07:00
Dotta 8014445b23 Add v2026.529.0 release changelog (#6999)
## Release changelog: v2026.529.0

Stable changelog for the **v2026.529.0** release (released 2026-05-29),
generated with the `release-changelog` skill.

- Range: `v2026.525.0..origin/master` — 11 squash-merged PRs
- Adds `releases/v2026.529.0.md`
- **No breaking changes** — migrations are additive (`CREATE TABLE IF
NOT EXISTS`); the only `DROP CONSTRAINT` lines are FK adjustments, not
data loss
- **No external contributors** this cycle — all PR authors are Paperclip
founders, who are excluded from the Contributors section per the skill,
so that section is omitted

### Highlights
- Inline document annotations and comments (#6733)
- Company skills CLI and catalog management (#6782)
- Hide projects and agents from your sidebar (#6677)
- First-admin claim flow for fresh self-hosted deployments (#6755)
- Live Claude model discovery (#6953)

### Improvements
- Bundled plugins now appear in the plugin manager (#6734)
- Tighter workspace lifecycle guarantees (#6969)

### Fixes
- Accepted plans decompose exactly once (#6831)

Docs-only (README brand/license #6810, #6804) and CI-only (#6967)
changes were excluded as not materially user-facing.

Issue: PAP-10155

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-29 07:27:55 -10:00
Dotta 5153b01ada [codex] Add Claude model refresh (#6953)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through adapter-backed
local and external runtimes.
> - The agent configuration UI lets operators choose adapter models and
refresh model lists when adapters support live discovery.
> - Codex already had a live refresh path, but Claude Local only exposed
static fallback models and the UI hid the refresh action for Claude.
> - A newly available Claude Opus model should not require a code
release every time the model catalog changes.
> - This pull request adds Anthropic model discovery for Claude Local,
keeps the static fallback current with Claude Opus 4.8, and exposes the
existing refresh button in the Claude Local dropdown.
> - The benefit is that operators can refresh Claude models from the
same model selector flow they already use for Codex.

## What Changed

- Added `claude-opus-4-8` to the Claude Local fallback model list.
- Added Claude model discovery through Anthropic-compatible `GET
/v1/models` when `ANTHROPIC_API_KEY` is available.
- Added normal cache reuse, forced refresh support, a SHA-256-based
API-key fingerprint for cache keys, and warning logging for discovery
errors before fallback.
- Wired `claude_local.refreshModels` into the server adapter registry.
- Enabled the existing `Refresh models` dropdown action for
`claude_local` in `AgentConfigForm`.
- Added tests for Claude fallback, live discovery, API-failure fallback,
forced refresh, and the UI refresh-button gate.

## Verification

- `pnpm exec vitest run server/src/__tests__/adapter-models.test.ts`
- `pnpm exec vitest run ui/src/components/AgentConfigForm.test.ts`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- Greptile review reached Confidence Score: 5/5 on commit `b796cf4f1`
with addressed threads resolved.

UI note: the visible change is a conditional action row inside the
existing model dropdown; the regression test covers that `claude_local`
now receives the refresh action.

## Risks

- Low risk. Without `ANTHROPIC_API_KEY`, Claude Local still uses the
static fallback list.
- If Anthropic model discovery fails or times out, Paperclip falls back
to the existing cached or static list.
- Bedrock environments remain on Bedrock-native model IDs.

## Model Used

OpenAI GPT-5 via Codex local coding agent, with repository file access,
shell command execution, git operations, and targeted test/typecheck
verification. Exact context window is not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-29 07:03:07 -10:00
Devin Foley 1f70fd9a22 PAPA-430: workspace finalize gates + no-remote-git enforcement (#6969)
## Thinking Path

> - Paperclip orchestrates AI agents across isolated execution
workspaces; the local cwd is the only persistence boundary between runs.
> - Workspace lifecycle (worktree_prepare → execute →
workspace_finalize) and the wake/accept flow are what guarantee that
dependent issues see a consistent worktree.
> - PAPA-380 / PAPA-431 / PAPA-432 / PAPA-440 surfaced three holes in
that contract: silent env reuse across assignees, dependent wakes firing
before finalize, and `issue.interaction.accept` advancing before
finalize landed.
> - PAPA-441 / PAPA-442 then needed to document the "no remote git"
contract and prevent future adapter/runtime code from quietly
reintroducing `git push` as a backdoor sync.
> - This pull request lands those server fixes, the static
`check-no-git-push` enforcement, the AUTHORING.md cross-link, and the
Cody-review follow-ups on the PAPA-430 thread.
> - The benefit is that finalize is a real barrier — board accepts,
dependent wakes, and operator-set env all respect it — and adapter code
can't bypass it via raw `git push`.

## What Changed

- **server (PAPA-380, PAPA-431):** `execution-workspace-policy` refuses
silent env reuse when the assignee's resolved env disagrees with the
workspace it would inherit. The inheritance protection is now scoped to
the actual inheritance signal — explicit issue-level `environmentId` is
honored even when the agent's default env is `null`.
- **server (PAPA-432):** `heartbeat.ts` gates dependent wakes on
`listUnfinalizedExecutionWorkspaceIds`, and writes a
`workspace_finalize` row on the succeeded path. Write failures now
surface instead of being swallowed so dependents aren't silently
stranded behind a missing row.
- **server (PAPA-440):** `issue-thread-interactions.acceptInteraction`
adds a workspace_finalize precondition for `request_confirmation` (not
`suggest_tasks`). Accept returns 409 if finalize hasn't succeeded for
the latest workspace operation.
- **ci (PAPA-442):** new `scripts/check-no-git-push.mjs` static check
scans `packages/adapters/`, `packages/adapter-utils/`, `server/src/`,
and `cli/src/` for any `git push` invocation (string or args-array).
Wired into the `policy` PR job and `test:release-registry`. Operators
can opt in per-call with `// paperclip:allow-git-push: <reason>`.
Release scripts are out of scope by design.
- **docs (PAPA-441):** `AUTHORING.md` documents the no-remote-git
contract and cross-links the static check so adapter authors learn the
rule and the enforcement together.
- **review follow-up (PAPA-430, Cody):** three fixes — env resolver bug,
accept-gate scope (request_confirmation only), and finalize record write
on the succeeded path.

## Verification

- `pnpm exec vitest run
server/src/__tests__/execution-workspace-policy.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts` → 33/33
pass
- `node scripts/check-no-git-push.test.mjs` → check covers string form,
args-array form, comment exclusions, and per-line allow-comment.
- Manual: server compiles; the policy job runs the check in <1s before
heavier jobs.

## Risks

- **Behavioral shift in accept:** boards accepting
`request_confirmation` while finalize is in-flight now get 409s. This is
intentional — they can retry — but it changes timing on a hot path.
`suggest_tasks` is unaffected.
- **Workspace policy:** the env-reuse refusal is a new error path.
Issues that previously silently reused an env from a different-assignee
workspace will now fail-loud; the resolver still honors explicit
issue-level `executionWorkspaceSettings.environmentId`.
- **CI rule:** any future legitimate `git push` in scoped dirs must be
marked with the allow-comment, which is the intended ergonomic.

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`, extended thinking), via Claude
Code in the Paperclip executor adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — server/CI/docs only)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes related issues: PAPA-430, PAPA-380, PAPA-431, PAPA-432, PAPA-440,
PAPA-441, PAPA-442

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-29 08:25:29 -07:00
Devin Foley 524e18b060 ci: use runner Chrome for headless workflows (#6967)
## Thinking Path

> - Paperclip relies on CI browser suites to protect control-plane
workflows, so a stalled browser bootstrap is a release blocker even when
app code is unchanged.
> - The failing signal on [PAPA-457](/PAP/issues/PAPA-457) was specific
to the PR e2e lane timing out before tests started, which pointed at
environment setup rather than assertions.
> - The first shell-only Chromium attempt reduced download size, but the
GitHub Actions log showed Playwright still hanging inside its install
step after the headless shell download finished.
> - That means the real problem is the Playwright browser-install path
itself on the hosted Ubuntu runner, not just the size of the downloaded
artifact.
> - GitHub's Ubuntu runners already ship Google Chrome, and Playwright
can target that binary through the `chrome` channel without downloading
its own Chromium bundle.
> - The safer workflow fix is therefore to remove the Playwright install
step from the affected headless jobs and make the Playwright configs
optionally use runner Chrome only when CI opts into it.
> - This keeps local defaults unchanged, removes the failing
browser-download dependency from CI, and preserves headless coverage for
PR, standalone e2e, and release-smoke workflows.

## What Changed

- Updated `.github/workflows/pr.yml`, `.github/workflows/e2e.yml`, and
`.github/workflows/release-smoke.yml` to stop downloading Playwright
browsers and instead verify the runner's preinstalled `google-chrome`.
- Passed `PAPERCLIP_PLAYWRIGHT_CHANNEL=chrome` into the headless PR,
standalone e2e, and release-smoke test steps so those jobs explicitly
use runner Chrome.
- Updated `tests/e2e/playwright.config.ts` and
`tests/release-smoke/playwright.config.ts` to honor
`PAPERCLIP_PLAYWRIGHT_CHANNEL` while keeping the default
local/browser-bundle behavior unchanged when the env var is absent.

## Verification

- Investigated the failed PR run log and confirmed the prior `Install
Playwright` step stalled after `chromium-headless-shell` reached 100%
download.
- `PLAYWRIGHT_BROWSERS_PATH="$(mktemp -d)"
PAPERCLIP_PLAYWRIGHT_CHANNEL=chrome PAPERCLIP_E2E_SKIP_LLM=true pnpm run
test:e2e`
Result: `7 passed (21.1s)` with an empty temporary Playwright browser
cache, proving the e2e suite runs without any Playwright browser
download when the `chrome` channel is selected.
- `git diff --check`

## Risks

- This assumes GitHub's Ubuntu runner continues to ship `google-chrome`;
if that image contract changes, these workflows would need a dedicated
Chrome install step.
- The `chrome` channel can differ slightly from Playwright-managed
Chromium, so the config gate is intentionally env-scoped to CI workflows
that need the hosted-runner path.

## Model Used

- OpenAI Codex, GPT-5-based coding agent running through Paperclip's
`codex_local` adapter with tool use, shell execution, and repository
editing enabled. The exact internal snapshot/version string is not
exposed in-session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-29 00:18:52 -07:00
Devin Foley d9f91576a0 Add accepted-plan decomposition exact-once guards and UI state (#6831)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, so
planning approvals and child-issue fan-out are part of the core
control-plane loop.
> - Accepted plans are supposed to be a safe bridge from planning into
execution, especially when agents wake from review decisions and reuse
isolated workspaces.
> - The duplicate-subtask incident showed that an accepted plan revision
could be interpreted more than once across overlapping runs, which broke
the single-source-of-truth model for issue decomposition.
> - Fixing that required tightening the backend contract first:
accepted-plan decomposition needs an exact-once fingerprint, durable
claim state, and retry-safe child creation.
> - Once that backend behavior existed, the board still needed
visibility into what happened, so the issue detail view needed a
dedicated decomposition section instead of forcing operators to
reconstruct child creation from raw activity.
> - This pull request adds the exact-once decomposition primitive,
hardens wake routing and regressions around the incident, and surfaces
decomposition state in the UI so future incidents are both prevented and
easier to inspect.

## What Changed

- Added accepted-plan decomposition semantics to
`doc/execution-semantics.md`, including the exact-once fingerprint,
durable claim/result expectations, and retry/resume behavior.
- Added persistent accepted-plan decomposition claims in the backend,
including schema, shared types/validators, service logic, and issue
routes for creating and listing decomposition state.
- Hardened heartbeat routing so an accepted-plan continuation stays
scoped to the relevant planning issue instead of opportunistically
re-decomposing another accepted issue on the same assignee.
- Added regression coverage for the original failure modes: concurrent
same-parent retries, cross-issue accepted-plan isolation, and partial
child recreation under the same fingerprint.
- Added the `Plan decomposition` issue-detail section plus supporting
API/query-key/activity formatting updates so operators can see revision
status, owner, child counts, and the linked child issues directly in the
UI.
- Included the small follow-up UI fix so the decomposition section still
renders when the issue work mode is no longer `planning`.

## Verification

- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/db typecheck`
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts`
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t
"lists persisted decompositions with child issue summaries"`
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t
"accepted plan decomposition"
server/src/__tests__/heartbeat-accepted-plan-workspace-refresh.test.ts
server/src/__tests__/heartbeat-context-summary.test.ts`
- Manual UI path: create a planning issue without an isolated execution
workspace, add a `plan` document, accept the `request_confirmation`, let
Paperclip create child issues, then reopen the parent issue detail page
and confirm the `Plan decomposition` section shows the accepted
revision, status, idempotent-claim badge, and child links.
- Separate follow-up bug noted during manual UI validation: accepting a
plan on an issue whose run never records `workspace_finalize` is tracked
in `PAPA-445` and is not part of this PR’s fix scope.

## Risks

- This adds a new migration and a large Drizzle snapshot update;
reviewers should confirm the schema shape and generated metadata match
the intended decomposition table.
- The exact-once claim changes sit on the accepted-plan fan-out path, so
regressions there could block legitimate child creation or mis-handle
retries if the claim state machine is wrong.
- The new UI only appears when decomposition records exist; reviewers
should use the manual verification path above rather than expecting
existing issues on a stale local instance to show the section
automatically.
- `PAPA-445` remains an open follow-up for the `workspace_finalize`
accept gate when a planning handoff never records finalize; that bug can
interfere with reproducing the UI flow on isolated workspaces but does
not change the correctness of the exact-once decomposition feature
itself.

> Checked `ROADMAP.md`: this PR is a bug fix / control-plane hardening
change for accepted-plan decomposition, not a new uncoordinated roadmap
feature.

## Model Used

- OpenAI Codex via Paperclip `codex_local` (GPT-5-based coding agent;
exact backend model ID/context window not exposed in the run context),
with repository tool use, shell execution, and code-editing
capabilities.

<img width="806" height="1069" alt="Screenshot 2026-05-27 at 11 05
48 PM"
src="https://github.com/user-attachments/assets/5b00b670-96cd-4470-b0a3-581743bcae28"
/>


## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-28 23:30:18 -07:00
Dotta 9eac727cf1 [codex] Add skills CLI and catalog management (#6782)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies through
company-scoped control-plane workflows.
> - Agents need reusable, inspectable skills that can be installed,
reset, audited, exported, and assigned without bespoke local setup.
> - The existing skill truth model needed cleanup so bundled skills,
optional catalog skills, runtime skills, and adapter-provided skills
have clear provenance.
> - Operators also need a practical CLI and board UI for discovering and
managing company skills.
> - This pull request adds the skills CLI, packaged skills catalog,
company skills APIs, and catalog-aware board UI.
> - The benefit is a more reusable Paperclip company setup where skills
are portable, auditable, and easier for operators and agents to manage.

## What Changed

- Added `paperclipai skills` CLI commands and coverage for catalog
listing, installing, resetting, and inspecting company skills.
- Added a packaged `@paperclipai/skills-catalog` workspace with bundled
and optional skill content plus validation/build tests.
- Added shared company-skill types and validators used across CLI,
server, and UI contracts.
- Added server catalog APIs/services for company skill catalog
operations, reset semantics, audit behavior, and portability provenance.
- Updated adapter skill handling so runtime/catalog provenance remains
explicit across local adapters.
- Added board UI support for browsing and managing catalog-backed
company skills.
- Updated docs for the skills CLI/catalog flow and the company skills
Paperclip skill reference.
- Rebased the branch onto current `paperclipai/paperclip:master`; no
`pnpm-lock.yaml`, `.github/workflows`, or migration files are included
in the final PR diff.

## Verification

- Passed: `pnpm run preflight:workspace-links && pnpm exec vitest run
cli/src/__tests__/skills.test.ts
packages/skills-catalog/src/catalog-builder.test.ts
packages/skills-catalog/src/shipped-catalog.test.ts
packages/shared/src/validators/company-skill.test.ts
packages/adapter-utils/src/server-utils.test.ts
packages/plugins/create-paperclip-plugin/src/entrypoints.test.ts
server/src/__tests__/company-skills-catalog-service.test.ts
server/src/__tests__/company-skills-routes.test.ts
server/src/__tests__/company-portability.test.ts`.
- Passed: `pnpm exec vitest run
server/src/__tests__/workspace-runtime.test.ts -t "default
branch|origin/master|symbolic-ref"`.
- Attempted: full `server/src/__tests__/workspace-runtime.test.ts`. Four
provisioning tests failed while seeding an isolated worktree database
from the local Paperclip instance because the local plugin schema dump
contains a duplicate-column foreign key
(`plugin_content_machine_18a7bc327b.content_case_signals`). The
default-branch tests touched by the rebase conflict passed in the
focused run above.
- Checked final diff: no `pnpm-lock.yaml`, no `.github/workflows`, and
no migration-file changes relative to `master`.

## Risks

- Medium: this is a broad skills/catalog change touching CLI, server
APIs, shared contracts, adapter skill sync, and UI.
- Catalog validation and reset semantics need careful reviewer attention
because they affect reusable company setup and portability.
- No database migrations are included in this PR, so there is no
migration ordering/idempotency risk in the final diff.
- No lockfile is included by design; dependency resolution will be
handled by the repository lockfile workflow.

## Model Used

- OpenAI Codex coding agent based on GPT-5, running in Paperclip via the
`codex_local` adapter with shell, git, GitHub CLI, and code-editing tool
access. Exact hosted model build/context-window metadata is not exposed
in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run targeted tests locally and documented the local
workspace-runtime seed failure above
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, screenshots were intentionally
omitted per PAP-10124 instructions; UI behavior is covered by tests and
reviewer inspection
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-28 07:33:51 -10:00
Dotta 8da50dbcf8 [codex] Add private browser first-admin claim flow (#6755)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Fresh self-hosted deployments need an operator path before any
invite exists.
> - Umbrel installs are private LAN deployments, so a one-time browser
claim is appropriate only when the deployment is private and unclaimed.
> - Public deployments and installs with active invites must keep the
existing invite-only model so admin creation is not exposed broadly.
> - GitHub PR #2927 established the useful direction, but it needed to
be adapted onto current `master` rather than merged as-is.
> - This pull request adds that adapted private-only claim flow across
server, UI, docs, and regression coverage.
> - The benefit is that a fresh private Umbrel-style install can be
claimed from the browser without weakening public deployment access.

## What Changed

- Added a first-admin claim service and access route support for
one-time admin claim eligibility on private unclaimed deployments.
- Updated the bootstrap/access UI so eligible private installs show a
setup claim path, while public and invited deployments keep invite-first
behavior.
- Added a bootstrap-pending setup UX lab covering claim, invite, public,
and signed-in access states.
- Updated deployment and local development docs for authenticated
private/public behavior and the Umbrel-style claim path.
- Added server and UI regression tests for private claim, public
no-claim, active invite fallback, existing board/no-access flows, and
health exposure reporting.
- Stabilized PR handoff verification by serializing the aggregate server
Vitest workspace run, forcing `NODE_ENV=test`, and relaxing the
heartbeat batching test around legitimate recovery follow-up runs.

## Verification

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm vitest --run
server/src/__tests__/heartbeat-comment-wake-batching.test.ts`
- `pnpm vitest --run
server/src/__tests__/health-dev-server-token.test.ts`
- `pnpm test:run`
- QA validation: PAP-10115 passed browser validation with screenshots
for private fresh install claim, active invite versus claim conflict,
public invite-only/claim-absent behavior, existing invite fallback, and
normal board/no-access flows.
- GitHub closeout: issue #2579 and PR #2927 were updated with the
accepted direction: adapt the implementation, do not direct-merge #2927
as-is.

## Risks

- The claim endpoint must remain private-only and one-time; a regression
here could expose admin creation on public deployments.
- Existing invite behavior must remain intact for public deployments and
installs that already have an active invite.
- The stable Vitest harness now serializes the aggregate server
workspace group; this is slower, but it avoids DB-backed suite
collisions under root workspace mode.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.
>
> ROADMAP.md checked: this is a scoped deployment bootstrap/access fix
and does not duplicate a listed roadmap project.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` for product
engineering, implementation, and verification, with tool-enabled local
code execution. Paperclip QA browser validation was performed in
PAP-10115 by the assigned QA agent; exact adapter model metadata for
that QA run is not exposed in this PR context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-27 21:15:01 -10:00
Devin Foley de36743583 docs(readme): align README with brand guidelines (PAPA-439) (#6810)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The README is the first impression for developers and operators
landing on the repo, so it has to reflect the current brand voice and
visual identity
> - The existing README leads with an outdated hero ("Open-source
orchestration for zero-human companies"), keeps a board-centric tagline
that no longer matches the positioning, advertises a removed COMING SOON
teaser, and still uses an old header image and an unnecessary footer
image
> - Out-of-date positioning at the top of the README undercuts the rest
of the doc and the brand guidelines refresh at
https://paperclip.ing/brand
> - This pull request swaps the README header image for the new brand
banner, updates the hero copy and tagline, and trims stale callouts so
the README matches the new brand guidelines
> - The benefit is a README that leads with the current positioning
("Paperclip is the app people use to manage AI agents for work.") and
current visual identity, with no stale teasers or extraneous footer
image

## What Changed

- Added new brand banner at \`doc/assets/banner.jpg\` and pointed the
README header \`<img>\` at it (alt text updated to the new tagline)
- Replaced the \`## What is Paperclip?\` + \`# Open-source orchestration
for zero-human companies\` heading pair with a single H1: \`# Paperclip
is the app people use to manage AI agents for work.\`
- Tightened the opening paragraphs ("Open-source orchestration for teams
of AI agents.", trimmed dashboard sentence, "Under the hood:" line,
period on the OpenClaw/Paperclip tagline)
- Removed the \`COMING SOON: Clipmart\` callout
- Softened the Governance copy by dropping "You're the board." in both
the Features grid and the Systems table
- Fixed typo: "solo-entreprenuer" → "solo entrepreneur"
- Removed the README footer image block entirely
- Updated the closing subline: "Built for people who want to run
companies, not babysit agents." → "Built for people who want to get work
done, not babysit agents."
- Left existing assets untouched on disk: \`doc/assets/header.png\` and
\`doc/assets/footer.jpg\` are unchanged from master (only the README
references changed)

## Verification

- \`git diff master..HEAD --stat\` → only \`README.md\` (10+/18-) and
the new \`doc/assets/banner.jpg\`
- Rendered the README locally and confirmed:
  - The header banner shows the new brand image
- The H1 reads "Paperclip is the app people use to manage AI agents for
work."
  - No COMING SOON Clipmart callout
- No footer image; closing subline reads "Built for people who want to
get work done, not babysit agents."
- No code paths changed; no test suite applies

## Risks

- Low risk. Docs-only change. \`cli/README.md\` still references the
on-master URL for \`doc/assets/header.png\`, which is intentionally left
in place so that link does not break.

## Model Used

- Claude (Anthropic), model id \`claude-opus-4-7\` ("Opus 4.7"), running
under Claude Code via the Paperclip claude_local adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (n/a — docs-only)
- [x] I have added or updated tests where applicable (n/a — docs-only)
- [x] If this change affects the UI, I have included before/after
screenshots (n/a — README-only; rendered review described in
Verification)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes PAPA-439
2026-05-27 18:18:00 -07:00
Devin Foley a49afe5ea1 docs: update README license to Paperclip Labs, Inc (PAPA-437) (#6804)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The repository README closes with a license/copyright line that
downstream readers use to identify the legal entity behind the project
> - The line currently reads "MIT © 2026 Paperclip", which omits the
formal corporate name
> - The legal entity is "Paperclip Labs, Inc"; the README should reflect
that for accuracy
> - This pull request updates the README footer to "MIT © 2026 Paperclip
Labs, Inc"
> - The benefit is correct attribution of the MIT license to the actual
legal entity

## What Changed

- Updated `README.md` license line from "MIT © 2026 Paperclip" to "MIT ©
2026 Paperclip Labs, Inc"

## Verification

- Open `README.md` and confirm the final line reads `MIT © 2026
Paperclip Labs, Inc`
- No code paths affected; no tests required

## Risks

- Low risk — single-line documentation change, no runtime impact

## Model Used

- Provider: Anthropic Claude
- Model ID: claude-opus-4-7
- Capabilities: tool use, code execution via Claude Code CLI

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (N/A — docs-only change)
- [x] I have added or updated tests where applicable (N/A)
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — no UI change)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-27 15:07:02 -07:00
Dotta b7545823be [codex] Add document annotations and comments (#6733)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through issues, documents,
runs, and durable company-scoped state.
> - Issue documents are where agents and operators capture plans,
handoffs, and work products.
> - Before this change, document collaboration could only happen through
whole-document edits and detached issue comments.
> - Inline document annotations need stable anchors, revision-aware
persistence, and UI affordances that do not break existing document
editing.
> - This pull request adds company-scoped document annotation threads,
comments, anchor snapshots, API routes, and board UI.
> - The benefit is that operators and agents can discuss specific
document passages without losing context as documents evolve.

## What Changed

- Added document annotation tables, schema exports, shared types,
validators, anchor hashing, and text-anchor helpers.
- Added server-side document annotation services and issue routes for
listing, creating, commenting, resolving, and reopening annotation
threads.
- Included annotation summaries in relevant issue document reads and
backup/recovery document workspace behavior.
- Added React UI for inline document highlights, comment panels, mobile
sheet behavior, deep-link focus, and resolved/open filtering.
- Added annotation design artifacts, Storybook coverage, screenshots,
and a screenshot helper script.
- Rebased the branch onto current `paperclipai/paperclip` `master` and
renumbered the annotation migration from `0085_old_swarm` to
`0091_old_swarm`; the SQL uses `IF NOT EXISTS` guards so environments
that previously applied the old migration number can safely apply the
new one.
- Adjusted the new annotation UI tests to use a local async flush helper
because this workspace's React 19.2.4 export does not expose
`React.act`.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
packages/shared/src/document-anchors.test.ts
server/src/__tests__/document-annotation-routes.test.ts
server/src/__tests__/document-annotations-service.test.ts
ui/src/components/DocumentAnnotationLayer.test.tsx
ui/src/components/IssueDocumentAnnotations.test.tsx
ui/src/lib/document-annotation-hash.test.ts
ui/src/lib/document-annotation-selection.test.ts`
- Confirmed `git diff --check` passes.
- Confirmed no `pnpm-lock.yaml` or `.github/workflows/*` files are
included in the PR diff.

## Risks

- Medium risk: this adds new persisted annotation tables and routes
across db/shared/server/ui.
- Migration risk is reduced by moving the branch migration to
`0091_old_swarm` after upstream `0090_resource_memberships` and keeping
the SQL idempotent for old `0085_old_swarm` adopters.
- UI risk is mostly around text range anchoring and panel positioning
across long documents, folded content, and mobile layouts; the PR
includes focused unit coverage and design screenshots.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-using software engineering
mode. Context window size is not exposed in this Paperclip runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-26 06:41:23 -07:00
Dotta f0ddd24d61 [codex] Show bundled plugins in plugin manager (#6734)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system is how Paperclip exposes optional capabilities and
integrations without bloating the control plane.
> - Operators need the Instance Settings plugin manager to show both
installed external plugins and bundled built-in plugins.
> - Bundled plugins were available in the server/UI surface but were not
represented consistently in the plugin manager list.
> - Workspace runtime reuse also needed to stay pinned to the current
branch/base so the plugin manager can be validated from the intended
checkout.
> - This pull request shows bundled plugins in the manager, marks
experimental bundled plugins clearly, and tightens runtime/worktree
reuse guards.
> - The benefit is that operators can discover bundled plugins from the
same management screen as installed plugins without stale workspace
sessions hiding the latest branch state.

## What Changed

- Lists bundled monorepo plugin packages through the plugin routes API,
including plugin status and install metadata needed by the UI.
- Updates the plugin manager UI/API client to render bundled plugins and
display experimental badges based on installed plugin records.
- Adds server authorization coverage around plugin routes so board and
agent access stay company-scoped.
- Guards execution workspace/runtime reuse against stale base refs and
defaults new worktrees to the fetched target base.
- Expands workspace runtime tests for service reuse, stale workspace
prevention, and controlled runtime stops.
- Addressed Greptile feedback by respecting `origin/HEAD`, using async
cached bundled-plugin discovery, and avoiding duplicated UI experimental
plugin lists.

## Verification

- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/workspace-runtime.test.ts
server/src/__tests__/heartbeat-workspace-session.test.ts`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/plugin-sdk build && pnpm --filter
@paperclipai/server typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `gh pr checks 6734 --repo paperclipai/paperclip` reports all checks
passing on `10e1ba9e0f505637cd913713fb28c2c99ae92011`.
- Greptile Review reports 5/5 on
`10e1ba9e0f505637cd913713fb28c2c99ae92011`.
- Confirmed the branch is rebased onto `public-gh/master` and the PR
diff does not include `pnpm-lock.yaml` or `.github/workflows` changes.
- UI screenshots were not captured in this PR-creation pass because the
available local board runtime is authenticated; the visible UI path is
covered by the plugin manager code changes and server/API tests above.

## Risks

- Medium risk: this touches shared plugin listing behavior and workspace
runtime reuse, so regressions could affect plugin manager visibility or
service reuse across execution workspaces.
- No database migrations.
- No lockfile or GitHub workflow changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex, coding-agent workflow with shell/tool use in a
local Paperclip worktree. Context window not surfaced by the runtime;
reasoning mode not externally reported.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-26 07:32:45 -06:00
Dotta 9aea3e3d35 [codex] Add resource membership controls (#6677)
Release / publish_stable (push) Has been skipped
Release / verify_stable (push) Has been skipped
Release / preview_stable (push) Has been skipped
Refresh Lockfile / refresh (push) Successful in 48s
Docker / build-and-push (push) Failing after 2m20s
Release / verify_canary (push) Failing after 6m5s
Release / publish_canary (push) Has been skipped
## Thinking Path

> - Paperclip orchestrates AI-agent companies through company-scoped
issues, projects, agents, and board-visible workflows.
> - The board sidebar and project list are the daily navigation surface
for that control plane.
> - Users need to keep all projects and agents accessible while hiding
resources they have intentionally left from their own sidebar.
> - That requires user-scoped resource membership state backed by
company-scoped API and database contracts.
> - The branch also needed to preserve HTTP worktree login sessions and
keep the project list easier to scan after membership grouping.
> - This pull request adds resource membership controls, sidebar leave
actions, grouped/sortable project listings, and focused tests.
> - The benefit is a cleaner personal workspace view without weakening
company-scoped access to the underlying project or agent detail pages.

## What Changed

- Added `project_memberships` and `agent_memberships` tables with
API/shared/server contracts for current-user join/leave state.
- Renumbered the membership migration to `0090_resource_memberships`
after rebasing onto current `master`, and made it idempotent for anyone
who had applied the old branch-local `0087` migration.
- Added project and agent sidebar leave actions, plus list filtering
that waits for membership state before hiding resources.
- Added grouped project listing, project sorting controls, and reserved
row subtitle height for cleaner scanning.
- Fixed HTTP auth cookie security handling so HTTP worktree sessions can
persist.
- Updated focused server and UI tests for the new membership, sidebar,
project list, and auth behavior.

## Verification

- `pnpm exec vitest run server/src/__tests__/better-auth.test.ts
server/src/__tests__/resource-memberships-routes.test.ts
ui/src/pages/Projects.test.tsx
ui/src/components/SidebarProjects.test.tsx
ui/src/components/SidebarAgents.test.tsx
ui/src/components/MembershipAction.test.tsx
ui/src/components/EntityRow.test.tsx`
- Confirmed the branch is rebased on current `origin/master`.
- Confirmed the PR diff does not include `pnpm-lock.yaml` or
`.github/workflows` changes.

## Risks

- Migration safety: low to medium. The migration now uses `IF NOT
EXISTS` / guarded constraints and is numbered after current master
migrations, but it should still get CI coverage against fresh databases.
- UI behavior: low. Left resources are hidden from sidebar only after
membership state loads; direct detail access remains available.
- Auth behavior: low. Cookie security is relaxed only for HTTP/private
local-style origins where secure cookies would prevent login
persistence.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex coding agent, tool-enabled shell/git workflow,
context window not exposed by runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshot note: no browser screenshots were captured in this heartbeat;
the UI changes are covered by focused component tests above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-25 13:12:41 -05:00
Dotta 60efa38f86 docs: add v2026.525.0 release changelog (#6667)
## Summary

- Adds `releases/v2026.525.0.md` for the upcoming stable release
covering 32 commits since `v2026.517.0` (2026-05-17 → 2026-05-25).
- Highlights: Modal sandbox provider plugin, first-party workspace diff
viewer, routine env secrets, local Cloud Upstream sync, and ACPX-Claude
adapter polish.
- Categorized into Highlights / Improvements / Fixes with inline PR +
contributor attribution per the release-changelog skill.

## Source range

`v2026.517.0..origin/master` (ECE8A51E22 at time of cut).

## Verification

- Manual scan of `git log v2026.517.0..HEAD --no-merges` and merged-PR
titles since 2026-05-17.
- Confirmed no `BREAKING CHANGE:` / `feat!:` markers in the range (no
Breaking Changes / Upgrade Guide sections needed).
- DB migrations in range (`0086_routine_env_runtime_contract`,
`0087_backfill_environment_manage_human_defaults`,
`0088_backfill_principal_access_compatibility`, `0089_cloud_upstreams`)
are additive/backfill-only — none flagged as breaking.
- Contributor list excludes Paperclip founders (cryppadotta, devinfoley)
and bot accounts per skill rules.

## Test plan

- [ ] Reviewer eyeballs the changelog against the recent PR list.
- [ ] Reviewer confirms the highlight set is the right shipped subset.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-25 09:12:23 -07:00
Dotta ece8a51e22 [codex] Bundle local branch fixes from PAP-10032 (#6604)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - This branch accumulated multiple already-tested control-plane,
adapter runtime, invite, workspace, plugin, and UI quality fixes on the
primary Paperclip checkout.
> - `origin/master` advanced while those commits were still local, so
the branch needed to be preserved and reconciled before review.
> - Splitting the branch commit-by-commit against the new base produced
overlapping conflicts with recently merged upstream PRs.
> - This pull request keeps the remaining branch as one standalone PR
because the final diff is 38 files after removing screenshot artifacts,
under Greptile's 100-file cap, and can be merged independently after
review.
> - The benefit is that none of the local work is lost, the branch is
now based on current `origin/master`, and reviewers can evaluate the
reconciled changes in one place.

## What Changed

- Merged the local accumulated branch with current `origin/master` and
resolved the invite-flow overlaps from the newer upstream companies
query helper.
- Preserved the local fixes for invite existing-member behavior, invite
link copy fallback, reusable workspace selection, worktree auth, static
SPA fallback, markdown wrapping, plugin slot registration, cloud
upstream UX/server polish, project sorting, and related tests.
- Removed screenshot artifacts from the PR per review request.
- Kept the PR under the requested file limit: 38 files changed, with no
`pnpm-lock.yaml` or `.github/workflows/*` changes.

## Verification

- `NODE_ENV=test pnpm exec vitest run
ui/src/pages/CompanyInvites.test.tsx ui/src/pages/InviteLanding.test.tsx
ui/src/pages/Projects.test.tsx ui/src/plugins/slots.test.ts
ui/src/components/MarkdownBody.test.tsx
server/src/__tests__/invite-accept-existing-member.test.ts
server/src/__tests__/static-index-html.test.ts
server/src/__tests__/execution-workspaces-service.test.ts
server/src/__tests__/better-auth.test.ts
server/src/__tests__/worktree-config.test.ts`
- `NODE_ENV=test pnpm --filter @paperclipai/ui typecheck`
- `NODE_ENV=test pnpm --filter @paperclipai/server typecheck`
- Confirmed `git diff --name-only origin/master...HEAD | wc -l` is `38`.
- Confirmed no PR diff entries match `pnpm-lock.yaml`,
`.github/workflows/*`, or `screenshots/*`.

## Risks

- Medium review risk because this is a bundled rescue PR rather than
several narrow feature PRs.
- Invite flow and company cache behavior overlapped with newer upstream
changes; the merge resolution intentionally keeps the shared
`companiesListQueryOptions` helper while preserving local
existing-member invite behavior.
- Visual review evidence is no longer attached in-repo because
screenshots were removed from this PR per review request.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, with repository tool access,
terminal execution, and git/GitHub CLI operations.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] UI screenshots were intentionally removed from this PR per review
request
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: CodexCoder <codexcoder@paperclip.local>
2026-05-25 07:25:26 -05:00
Devin Foley 96f0279e08 Make ACPX-Claude adapter work seamlessly (PAPA-388) (#6590)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, so when
an adapter fails, the platform must surface enough detail for the next
agent (or human reviewer) to act
> - The `acpx_local` adapter wraps `claude-agent-acp`, which in turn
drives the Claude Code SDK — three layers, three different permission
and error-handling models
> - A user created a `Claude Local ACPX` agent in PAPA-387 and it failed
instantly with the generic `acpx.error / "Internal error"` log,
stranding the work and triggering an opaque `stranded_assigned_issue`
recovery to the CTO
> - Once the diagnostic blackbox was opened, the underlying cause turned
out to be two SDK-level mismatches: a model-name allowlist that rejects
bare IDs like `claude-opus-4-7`, and a Claude Code
permission/Read-sandbox configuration that silently denies every
non-allowlisted tool when the user's `~/.claude/settings.json` has
`defaultMode: "dontAsk"`
> - This pull request fixes both classes of failure in the adapter
itself so new ACPX agents work seamlessly without per-host
configuration, and widens the diagnostic surface so the *next* failure
of any kind is actionable
> - The benefit is that ACPX-Claude can join the regular agent roster —
verified end to end on PAPA-401, where the agent successfully reached
the Paperclip API, opened a worktree, surveyed existing notification
PRs, and posted a structured plan

## What Changed

- Widen ACPX failure diagnostics
(`packages/adapters/acpx-local/src/server/execute.ts`):
- Capture `err.name`, ACP code, `cause.message`, retryable flag, and a
5-frame stack preview into `errorMeta`.
- Promote phase-specific error codes: `ensure_session →
acpx_session_init_failed`, `configure_session →
acpx_session_config_failed`, `turn → acpx_turn_failed`, plus mapping for
`ACP_BACKEND_MISSING` / `ACP_BACKEND_UNAVAILABLE`.
- Set `verbose: true` on the ACPX runtime so its session-event log flows
through `ctx.onLog`.
- Capture child-process stderr via a wrapper-script tee into
`<stateDir>/run-stderr/<runId>.log`, inline the tail into the
`acpx.error` payload as `childStderrTail`, and forward it through
`ctx.onLog("stderr", …)` so it lands in the heartbeat `stderrExcerpt`
column (existing redaction applies).
- Set the model via `ANTHROPIC_MODEL` env for the `claude` agent instead
of `set_config_option(model, …)`. The ACP server's `set_config_option`
handler validates against an internal allowlist and rejects bare IDs
like `claude-opus-4-7`. `ANTHROPIC_MODEL` is read during initialization
and bypasses that check.
- Seed `<worktree>/.claude/settings.local.json` before spawning
`claude-agent-acp` (the seamless-API fix). Since `claude-agent-acp`
hard-codes `settingSources: ["user", "project", "local"]` and "local"
has the highest precedence:
- Set `permissions.defaultMode: "default"`, but **only** if the user's
value is missing or `"dontAsk"` (the broken case). Other modes like
`acceptEdits`/`plan` are preserved.
- Pre-allow Paperclip's Bash surface (`Bash(curl:*)`, `Bash(env:*)`,
`Bash(<cwd>/scripts/paperclip-issue-update.sh:*)`,
`Bash(<cwd>/scripts/paperclip:*)`).
- Widen `permissions.additionalDirectories` to include `stateDir`,
`agentHome`, and the per-company instance root
(`~/.paperclip/instances/<id>/companies/<companyId>`). Scoped to this
company only — does not expose other tenants.
- Existing user entries are merged, not replaced. The resolved roots are
folded into the session fingerprint so warm-session handles invalidate
when they change.
- Sync the existing server-side integration test
(`server/src/__tests__/acpx-local-execute.test.ts`) to assert
`acpx_session_init_failed` instead of the now-removed
`acpx_protocol_error` for `ACP_SESSION_INIT_FAILED` (a follow-up to
commit 1).

## Verification

- `pnpm --filter "@paperclipai/adapter-acpx-local" run typecheck` —
passes.
- `pnpm vitest run` in `packages/adapters/acpx-local` — 35/35 pass,
includes 4 new tests covering the settings.local.json write path (claude
only, merge with pre-existing content, `dontAsk` override, codex no-op).
- `pnpm vitest run src/__tests__/acpx-local-execute.test.ts` in
`server/` — 15/15 pass after the test-sync commit.
- End-to-end manual verification (PAPA-401): the `Claude Local ACPX`
agent that previously hit "restricted environment" now successfully
reaches the Paperclip API, opens its worktree, posts structured plan
comments, and flips the issue to `in_review` without any external
configuration.

## Risks

- **Low**, scoped to the `acpx_local` adapter. The settings.local.json
write is per-worktree (worktrees live under
`.paperclip/worktrees/<issue>/`) and only triggers when `acpxAgent ===
"claude"`. Existing user content is merged with `[...existing,
...paperclip]` and deduped — nothing is overwritten outright.
- The `defaultMode` override is intentionally narrow: it only flips
`"dontAsk"` (which silently denies every tool and is the root cause) to
`"default"`. Users who explicitly picked `acceptEdits`, `plan`, or any
other mode keep their choice.
- Stderr capture goes through the existing `log-redaction` pass before
persisting, so `PAPERCLIP_API_KEY` and similar secrets in the wrapper
env don't leak into heartbeat logs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), running in the `claude_local`
adapter via Paperclip's harness. Extended thinking enabled, tool use
enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (adapter-only)
- [ ] I have updated relevant documentation to reflect my changes — no
user-facing docs changed; internal commentary in the code change
explains the SDK constraints
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-23 13:01:27 -07:00
Aron Prins 897cc322c7 Improve external agent invite flow (#6183)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Agent creation can happen through local runtimes, managed runtimes,
and external agents that onboard through invites.
> - The old OpenClaw-oriented invite UX lived under company
settings/invites and made a gateway-specific path look like a company
access setting.
> - That hid the broader bring-your-own-agent flow and forced operators
to leave the add-agent modal when adding an external agent.
> - This pull request moves external agent invite generation into the
add-agent modal and makes the copy agent-oriented instead of
OpenClaw-only.
> - The benefit is a clearer agent-first onboarding path while company
invites stay focused on human access.

## What Changed

- Added an external-agent invite branch to the add-agent modal,
including a dedicated prompt result view with Back navigation.
- Added a shared agent onboarding prompt builder and focused modal
coverage for prompt replacement/back navigation.
- Removed the agent invite prompt UI from Company Settings and Company
Invites, leaving Company Invites focused on human access links and
invite history.
- Updated the hidden OpenClaw Gateway runtime hint to direct operators
to the add-agent invite flow instead of presenting it as a blocked
runtime card.
- Updated invite/onboarding docs, storybook coverage, and server-side
onboarding copy toward generic agent language while preserving existing
gateway compatibility.

## Verification

- `pnpm -r typecheck`
- `pnpm build`
- `FAKE_BIN="$(mktemp -d)/bin"; mkdir -p "$FAKE_BIN"; printf
'#!/bin/sh\nexit 1\n' > "$FAKE_BIN/tailscale"; chmod +x
"$FAKE_BIN/tailscale"; PATH="$FAKE_BIN:$PATH" pnpm test:run`
- `pnpm test:run` without the fake `tailscale` shim was also attempted;
it failed only in two pre-existing CLI tailnet fallback tests because
this host has a real Tailscale address (`100.125.202.3`) where those
tests expect no Tailscale.
- Focused confirmation for that host-env issue: `FAKE_BIN=...
PATH="$FAKE_BIN:$PATH" pnpm exec vitest run --project paperclipai
cli/src/__tests__/network-bind.test.ts
cli/src/__tests__/onboard.test.ts`
- Manual UI verification: served UI locally in light mode, opened
add-agent modal, generated external agent prompt, verified the generated
prompt replaces the form and Back returns to the form.

### Screenshots

![Add agent
modal](https://raw.githubusercontent.com/aronprins/paperclip/pr-assets/6183-agent-invites/.github/pr-screenshots/6183/add-agent-modal-light.png)

![External agent invite
form](https://raw.githubusercontent.com/aronprins/paperclip/pr-assets/6183-agent-invites/.github/pr-screenshots/6183/external-agent-invite-form-light.png)

![Generated onboarding prompt replacement
view](https://raw.githubusercontent.com/aronprins/paperclip/pr-assets/6183-agent-invites/.github/pr-screenshots/6183/onboarding-prompt-result-light.png)

## Risks

- Existing OpenClaw gateway compatibility remains, but operators now
discover external agent onboarding from the add-agent modal instead of
company settings.
- Agent invites still appear in the invite history table, so that page
may show agent-scoped invite rows even though it no longer creates agent
onboarding prompts.
- Low migration risk: no schema changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent in Codex desktop; tool-enabled
repository, shell, browser, and GitHub workflow. Context window size was
not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-23 09:09:40 -05:00
Devin Foley e3c875c1c7 fix(sandbox): prevent E2B workspace upload + lease idle failures (PAPA-382) (#6560)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Heartbeats run inside managed sandboxes (E2B, Cloudflare Sandbox),
and each run begins by uploading the agent's workspace as a tar archive
> - PAPA-381's E2B runs were failing at 5 and 11 minutes — two distinct
failure modes were entangled: workspace tar extraction errors on Linux,
and sandbox idle/lease timeouts during normal heartbeat gaps
> - Workspace tar extraction failed because macOS bsdtar embeds
`LIBARCHIVE.xattr.*` PAX headers that GNU tar on Linux rejects with
"This does not look like a tar archive"; the existing
`COPYFILE_DISABLE=1` only suppresses AppleDouble `._*` sidecars, not
inline PAX xattr entries
> - E2B sandboxes also expired between heartbeats because `timeoutMs`
defaulted to a short window and was never refreshed per execute, and
Cloudflare sandboxes idled out because `sleepAfter` defaulted to 10
minutes
> - This pull request adds `--no-xattrs` to the workspace tar
invocation, refreshes the E2B sandbox lifetime on each execute and bumps
the default `timeoutMs` to 1h, and raises the Cloudflare `sleepAfter`
default to 1h
> - The benefit is that long-running heartbeat-driven runs (Claude,
Codex, etc.) survive across both their initial workspace upload and the
natural idle gaps between executes on both E2B and Cloudflare

## What Changed

- `packages/adapter-utils/src/sandbox-managed-runtime.ts`: added
`--no-xattrs` to `createTarballFromDirectory` so macOS bsdtar produces a
clean POSIX tar that GNU tar on Linux can extract, with an inline
comment explaining why `COPYFILE_DISABLE=1` alone is insufficient.
- `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: refresh the
sandbox lifetime on every execute (so long runs don't expire mid-job)
and raised the default `timeoutMs` to 1h.
- `packages/plugins/sandbox-providers/e2b/src/manifest.ts` and
`plugin.test.ts`: updated manifest defaults and added regression
coverage for the new behavior.
- `packages/plugins/sandbox-providers/cloudflare/src/config.ts`,
`manifest.ts`, `plugin.test.ts`: raised default `sleepAfter` from 10m to
1h, mirroring the E2B 1h default, and added a regression test asserting
the acquire-lease request body sends `sleepAfter: "1h"` when not
overridden.

## Verification

- `pnpm --filter @paperclipai/plugin-e2b test`
- `pnpm --filter @paperclipai/plugin-cloudflare-sandbox test`
- Locally cherry-picked the `--no-xattrs` fix onto master and confirmed
end-to-end via a real PAPA-381-style heartbeat-driven E2B run that the
workspace upload now extracts cleanly on Linux. The user (board
operator) tested this on master and reported "Ok, that worked."
- Manual reviewer steps: trigger an E2B heartbeat from a macOS host
(this is where the bsdtar xattr headers come from), confirm the
workspace tar extracts on the Linux sandbox side; run a long (>15 min)
Cloudflare sandbox flow and confirm no lost-lease/idle errors between
executes.

## Risks

- Low risk overall.
- `--no-xattrs` is widely supported by both macOS bsdtar and GNU tar
(Linux). Worst case it silently no-ops on a future host that doesn't
support it; in that case the existing failure mode reappears, it doesn't
introduce a new one.
- Raising default `timeoutMs` (E2B) and `sleepAfter` (Cloudflare) from
short values to 1h means sandboxes stay alive longer between executes by
default. This is the intended behavior — operators that want a tighter
idle window can still override via plugin config.
- E2B per-execute sandbox lifetime refresh adds a small API call per
execute; it is bounded by the same client that already handles execute
traffic, so no new dependencies or retry semantics.

## Model Used

- Claude (Anthropic), `claude-opus-4-7`, extended thinking enabled, tool
use enabled (file/grep/git tools and Paperclip control-plane API). Used
to diagnose the dual failure mode (workspace tar PAX xattr headers +
sandbox lifetime), write the fixes and tests, and drive the verification
loop with the board operator.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — no UI changes)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-22 13:34:11 -07:00
Aron Prins e85ff094ec fix(ui): invite page goes blank from companies query-key collision (#6433)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies; humans
operate the board through the React UI.
> - The board gates company access via `CompanyProvider`
(CompanyContext) and onboards new humans through the invite landing page
at `/invite/:token`.
> - Reported symptom: opening an invite link and signing in works, but
the page then renders completely blank (black in dark mode).
> - End-to-end browser testing reproduced a client-side crash:
`companiesQuery.data?.some is not a function` and `Cannot read
properties of undefined (reading 'filter')`.
> - Root cause: `CompanyProvider` and `InviteLandingPage` both use the
React Query key `["companies"]` but return **different shapes** — `{
companies, unauthorized }` vs a bare `Company[]` — so they silently
corrupt the shared cache entry; whichever component reads the other's
shape calls `.some()`/`.filter()` on the wrong type and throws,
unmounting the tree.
> - Owners never hit it (they never mount the invite page); only
invitees landing on `/invite/:token` crash.
> - This PR unifies the `["companies"]` query into a single shared
definition so the cache entry always has one shape and the two consumers
can't drift apart again.
> - The benefit is a working invite/onboarding flow and removal of a
whole class of cache-shape bugs on this key.

## What Changed

- Add `ui/src/api/companies-query.ts` exporting a single shared
`companiesListQueryOptions` (and `CompanyListResult`) — one `queryKey` +
one `queryFn` that always returns the wrapped `{ companies, unauthorized
}` shape, documented with the shared-cache contract.
- `ui/src/context/CompanyContext.tsx` now uses
`useQuery(companiesListQueryOptions)` instead of an inline copy of that
query.
- `ui/src/pages/InviteLanding.tsx` uses the same
`companiesListQueryOptions` (with its own `enabled` gate), reads
`companiesQuery.data?.companies` for the membership checks, and uses
`queryClient.fetchQuery(companiesListQueryOptions)` in the post-auth
path — so it reads and writes the identical shape.

## Verification

- `pnpm --filter @paperclipai/ui typecheck` — clean.
- `vitest run src/pages/InviteLanding.test.tsx
src/context/CompanyContext.test.tsx` — 17/17 pass, unchanged.
- Manual end-to-end via a real browser against a LAN-exposed
authenticated instance:
  - Owner creates an Owner-role invite.
- New user opens the link and registers — **the "awaiting approval"
screen renders** (previously blank), `POST /api/invites/:token/accept`
returns `202`, no console errors.
- Owner approves at Company Settings → Access (`200`); invitee becomes
an active member.
- Invitee signs in — full board loads; smoke test of dashboard / issues
/ inbox / routines / goals / company settings — all render, zero
`pageerror`s.
- Before: invite page `#root` empty after sign-in (blank/black). After:
awaiting-approval panel renders. (Screenshots available on request.)

## Risks

- Low. `CompanyProvider`'s query behavior is unchanged (same `queryFn`
logic, just extracted into a shared module). `InviteLandingPage` now
reads the same shape it writes. No API, schema, or migration changes.
Existing tests pass unchanged.

## Model Used

- Claude (Anthropic), model ID `claude-opus-4-7` (Opus 4.7), 1M-context,
extended thinking + tool use; driving Claude Code with browser
automation for end-to-end reproduction and verification.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable (existing
InviteLanding/CompanyContext tests cover the touched code and pass; a
cross-provider regression test that mounts both consumers is a sensible
follow-up)
- [ ] If this change affects the UI, I have included before/after
screenshots (described textually above; this is a crash/blank-page fix,
screenshots available on request)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 15:28:49 -05:00
Aron Prins 4811d8dd33 Fix wrapped company issue prefix conflicts (#6423)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Company creation is the first control-plane object operators create,
and the generated issue prefix becomes part of task identity.
> - The company service already retries when a generated issue prefix
collides with the `companies_issue_prefix_idx` unique constraint.
> - Drizzle 0.45.x wraps PostgreSQL errors in `DrizzleQueryError`,
leaving the real `23505` constraint error on the `.cause` chain.
> - The existing retry detector only inspected the top-level error, so
wrapped prefix collisions surfaced as 500s instead of retrying.
> - This pull request walks the error cause chain for the exact prefix
constraint and verifies the retry path against embedded Postgres.
> - The benefit is company creation no longer fails when generated
prefixes collide under Drizzle 0.45.x wrappers.

## What Changed

- Walk the error `.cause` chain when detecting
`companies_issue_prefix_idx` unique violations, with a cycle guard and
support for `constraint` / `constraint_name` fields.
- Added an embedded Postgres regression test that seeds `ARO`, creates
`Aron & Sharon`, and verifies the retry produces `AROA`.
- Stabilized existing async tests touched by full verification: instance
sidebar plugin rendering now waits for React Query results, and
Tailscale-unavailable CLI tests explicitly hide host `tailscale`
detection.

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/companies-service.test.ts`
- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/heartbeat-stale-queue-invalidation.test.ts`
- `pnpm --filter @paperclipai/ui exec vitest run
src/components/InstanceSidebar.test.tsx`
- `pnpm --filter paperclipai exec vitest run
src/__tests__/network-bind.test.ts src/__tests__/onboard.test.ts`
- `pnpm test:run`
- `pnpm -r typecheck`
- `pnpm build`

## Risks

- Low runtime risk: the retry behavior only expands detection for the
existing exact company issue-prefix unique constraint.
- The cause-chain walk is bounded by visited objects to avoid cycles.
- The sidebar and CLI changes are test-only stabilization and do not
change production behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent in Codex desktop, with local
shell/tool execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI behavior change)
- [x] I have updated relevant documentation to reflect my changes (N/A:
bug fix with no user-facing docs change)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes #6350
2026-05-22 15:27:54 -05:00
Dotta 90117827eb [codex] Polish board UI mobile flows (#6550)
## Thinking Path

> - Paperclip is the board UI and control plane for supervising AI-agent
companies.
> - Operators repeatedly use mobile navigation, issue creation, inbox
scanning, and markdown reading surfaces.
> - Small layout and interaction rough edges add friction to those
high-frequency workflows.
> - The branch included a set of related board UI polish changes that
were too small to review as many separate PRs.
> - This pull request groups the remaining mobile/navigation/markdown
polish into one standalone branch.
> - The benefit is smoother board operation without mixing in unrelated
backend feature work.

## What Changed

- Tightened company settings navigation behavior on mobile.
- Fixed mobile new issue dialog height and moved issue priority into the
overflow controls on small screens.
- Restored browser controls for home-screen app mode.
- Fixed plugin-route sidebar selection on nested page loads.
- Added markdown preformatted-block wrapping controls and coverage.
- Kept updated issue list pages sorted by updated time in the board UI.

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `NODE_ENV=test pnpm exec vitest run ui/src/components/Layout.test.tsx
ui/src/components/MarkdownBody.test.tsx
ui/src/components/MarkdownBody.wrap.test.tsx
ui/src/components/NewIssueDialog.test.tsx
ui/src/components/access/CompanySettingsNav.test.tsx
ui/src/lib/pwa-install-mode.test.ts ui/src/pages/Inbox.test.tsx`

The targeted UI tests passed. React emitted existing act-wrapping
warnings in a few test files, but there were no test failures.

## Risks

- Medium-low: changes span several UI surfaces, but they are mostly
layout/interaction polish with targeted component tests.
- Visual screenshots are not newly captured in this split PR; follow-up
review should include browser/visual QA before marking ready.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session;
exact context window not exposed by this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-22 10:13:47 -05:00
Dotta ad6effa65c [codex] Improve runtime and import reliability (#6549)
## Thinking Path

> - Paperclip coordinates autonomous company work through local and
hosted runtime surfaces.
> - Local embedded Postgres and tenant import/export paths are
foundational reliability pieces.
> - A runtime failure in either path can stop agents or imports before
useful work begins.
> - The branch included remaining fixes for embedded native library
bootstrap and async tenant import handling.
> - This pull request groups those runtime/import reliability changes
into one standalone PR.
> - The benefit is a more robust local runtime and safer cloud tenant
import behavior.

## What Changed

- Prepared embedded Postgres native runtime before startup in
CLI/server/test entrypoints.
- Added embedded Postgres native bootstrap coverage.
- Added async tenant import job handling and deferred validation
coverage.
- Kept the runtime/import changes based directly on current
`origin/master` after related upstream PRs had already merged.

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `NODE_ENV=test pnpm exec vitest run
packages/db/src/embedded-postgres-native.test.ts
server/src/__tests__/company-portability-routes.test.ts`

## Risks

- Medium-low: this touches startup/import paths, but the branch is small
and covered by targeted tests.
- The embedded Postgres change depends on platform-specific
native-library behavior, so CI and follow-up checks should still verify
supported runners.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session;
exact context window not exposed by this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-22 09:57:22 -05:00
Dotta e43b392a79 [codex] Add local Cloud Upstream sync (#6548)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies.
> - Operators need a path to move local company state toward Paperclip
Cloud without losing local-first control.
> - The Cloud Upstream flow needs API, persistence, CLI, and board UI
surfaces that agree on the same manifest/run model.
> - The existing branch had the feature work plus UX and error-handling
follow-ups.
> - This pull request packages the remaining Cloud Upstream sync work
into one standalone branch.
> - The benefit is an inspectable local-to-cloud sync workflow with
preview, conflicts, activation, and captured UX review states.

## What Changed

- Added Cloud Upstream shared types, server routes/services, and
persisted run schema/migration.
- Added Paperclip Cloud CLI sync helpers and local connection storage.
- Added the Cloud Upstream board UI, settings entry points, query keys,
and UX lab page.
- Added preview/activation checklist behavior, redirect handling,
manifest-only preview support, friendly errors, in-flight hints, and
entity count summaries.

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `NODE_ENV=test pnpm exec vitest run cli/src/__tests__/cloud.test.ts
server/src/__tests__/instance-settings-routes.test.ts
server/src/__tests__/instance-settings-service.test.ts
ui/src/pages/CloudUpstream.test.tsx
ui/src/components/CompanySettingsSidebar.test.tsx`
- `NODE_ENV=test pnpm exec vitest run
server/src/__tests__/cloud-upstreams.test.ts`

Worktree setup note: the isolated worktree install skipped native sqlite
build scripts, so I copied the already-built local sqlite binding from
the main checkout before running
`server/src/__tests__/cloud-upstreams.test.ts`. The test then passed.

## Risks

- Medium: this adds a database migration and a broad feature path across
CLI/server/UI.
- Merge order: this is the only PR in this split with a DB migration;
merge it before any future Cloud Upstream migration follow-up.
- Mitigation: the PR is based directly on current `origin/master`, has
targeted route/service/UI tests, and keeps the feature behind existing
experimental Cloud Sync settings.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session;
exact context window not exposed by this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, screenshot artifacts are
intentionally omitted per reviewer request
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-22 09:56:22 -05:00
Dotta a1835cfa5e [codex] Harden plugin runtime invocation scope (#6547)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a company-scoped
control plane.
> - Plugins extend that control plane, but plugin workers still call
back into host APIs.
> - Those worker-to-host calls need the same company boundary guarantees
as normal API routes.
> - Plugin action handlers also need authenticated actor context from
the host instead of trusting caller-supplied params.
> - This pull request hardens plugin bridge/action scope and keeps
plugin operation issues out of normal issue surfaces.
> - The benefit is safer plugin execution with clearer authorization
boundaries and better test coverage.

## What Changed

- Added host-owned invocation context plumbing for nested plugin worker
calls.
- Added actor context to plugin `performAction` calls and test harness
helpers.
- Enforced company invocation scope on worker-to-host calls and filtered
company lists to the active invocation scope.
- Extended plugin action route tests for board and agent actor context,
spoofed company params, and cross-company rejection.
- Extended plugin worker manager coverage for invocation-scope
propagation.
- Filtered typed and legacy plugin operation issue origins from default
issue/inbox lists.

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `NODE_ENV=test pnpm exec vitest run
packages/plugins/sdk/tests/host-client-factory.test.ts
packages/plugins/sdk/tests/testing-actions.test.ts
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/plugin-worker-manager.test.ts
server/src/__tests__/issues-service.test.ts`

Note: embedded Postgres issue-service tests reported host-level Postgres
init skip for 47 tests; the non-embedded targeted tests passed.

## Risks

- Medium: plugin host authorization paths are sensitive, and external
plugins may rely on previously loose company params.
- Mitigation: the change only tightens calls when the host attached a
company invocation scope and includes explicit tests for board, agent,
and nested worker calls.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session;
exact context window not exposed by this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-22 09:16:24 -05:00
Dotta 38c185fb8b [codex] Add agent permissions and controls plan (#6386)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies by keeping
task ownership, approvals, and operator control inside one control
plane.
> - Agent permissions and plugin-hosted company settings sit on the
boundary between autonomy and governance.
> - V1 needs scoped task assignment rules, plugin extension points, and
clearer company access surfaces without weakening company boundaries.
> - The branch builds the core authorization service, plugin SDK/host
APIs, and UI simplifications needed to support those controls.
> - Paperclip EE plugin surfaces were intentionally moved out of this
core PR per review direction, so this PR now carries only the public
core/plugin infrastructure work.
> - The latest updates preserve the PAP-9937 branch changes that belong
in this PR, remove the `design/` artifacts, and exclude the experimental
`plugin-briefs` package.
> - Greptile feedback was applied through the authorization/audit paths
and the final cleanup commit was re-reviewed at 5/5 with no unresolved
Greptile threads.
> - The benefit is safer assignment control with extension hooks for
richer permission products while preserving simple defaults for normal
operators.

## What Changed

- Added scoped task-assignment authorization decisions and routed
issue/agent assignment mutations through the authorization service.
- Added plugin SDK and host APIs for company settings slots,
authorization policy/grant management, assignment previews, and bridge
invocation scope propagation.
- Simplified core company access UI and moved advanced controls behind
plugin-provided settings surfaces.
- Added retry-now affordances for blocked issue next-step notices.
- Added protected-assignment enforcement for persisted
agent/project/issue policies, including explicit-grant fallback
behavior.
- Added incremental principal-access compatibility backfill for active
agent memberships and role-default human permission grants.
- Added the Markdown code block wrap action fix from the latest branch
changes.
- Removed `design/` artifacts from the PR and removed
`packages/plugins/plugin-briefs` from the final diff.
- Addressed Greptile feedback for plugin actor sanitization, legacy
membership handling, audit pagination, unknown grant-scope metadata, and
startup test mocks.

## Verification

- `pnpm exec vitest run server/src/__tests__/access-service.test.ts
server/src/__tests__/company-portability.test.ts` -> 2 files passed, 54
tests passed.
- `pnpm exec vitest run
server/src/__tests__/server-startup-feedback-export.test.ts
server/src/__tests__/access-service.test.ts
server/src/__tests__/company-portability.test.ts` -> 3 files passed, 62
tests passed.
- `pnpm exec vitest run
server/src/__tests__/authorization-service.test.ts
server/src/__tests__/plugin-access-authorization-host-services.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts` -> 3 files
passed, 28 tests passed.
- `pnpm --filter @paperclipai/server typecheck` -> passed.
- `git diff --check` -> passed.
- `node ./scripts/check-docker-deps-stage.mjs` -> passed.
- `CI=true pnpm install --frozen-lockfile --ignore-scripts` -> passed
with no lockfile update.
- `pnpm exec vitest run
ui/src/components/MarkdownBody.interaction.test.tsx` -> 1 test passed.
- `git ls-files design packages/plugins/plugin-briefs | wc -l` -> 0.
- GitHub CI on `40cd83b53` -> all checks passed, merge state `CLEAN`.
- Greptile on `40cd83b53` -> 5/5, 102 files reviewed, 0
comments/annotations added, 0 unresolved review threads.
- Confirmed the PR diff contains no `design/`,
`packages/plugins/plugin-briefs`, `pnpm-lock.yaml`, or
`.github/workflows` changes.

## Risks

- Medium: task assignment authorization paths are behaviorally stricter
for protected/private policy data, so existing plugin-authored policies
may block assignment until explicit grants or approval flows are
configured.
- Medium: plugin-host authorization APIs expand the surface area
available to trusted plugins and need careful review for company
scoping.
- Low: startup now performs a principal-access compatibility backfill,
but the migration and runtime backfill use conflict-tolerant inserts.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled workflow with shell,
git, and GitHub CLI access.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-22 08:12:52 -05:00
Dotta c91a062326 [codex] Runtime control-plane fixes (#6380)
## Thinking Path

> - Paperclip orchestrates AI agents through a server-side control plane
> - That control plane depends on reliable issue state transitions,
plugin lifecycle behavior, import limits, and startup/shutdown handling
> - Several small runtime fixes had accumulated on the working branch
and were mixed with larger feature work
> - Keeping them separate makes the correctness fixes reviewable and
mergeable without waiting for cloud-sync UI work
> - This pull request groups the server/runtime control-plane fixes into
one standalone branch
> - The benefit is a tighter, safer runtime baseline for retries,
imports, plugin migrations, feedback flushing, and trusted cloud import
handling

## What Changed

- Fixed updated issue list pagination sorting and scheduled retry
comment handling.
- Re-applied pending plugin migrations during hot reload and fixed
plugin-schema worktree seed restore.
- Hardened public tenant DB startup, portable import body limits,
trusted cloud import errors, and trusted cloud tenant import mutation
access.
- Expired stale request confirmations after user comments.
- Added feedback export shutdown hardening so database-unavailable flush
loops stop cleanly.
- Guarded plugin worker `error` event emission when no listener is
registered.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm --filter @paperclipai/plugin-sdk build`
- `npm run install --prefix
node_modules/.pnpm/sqlite3@5.1.7/node_modules/sqlite3`
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/plugin-lifecycle-restart.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/body-limits.test.ts
server/src/__tests__/feedback-flush-controller.test.ts
server/src/__tests__/error-handler.test.ts
server/src/__tests__/board-mutation-guard.test.ts
packages/db/src/backup-lib.test.ts` initially exposed local setup issues
and two 5s test timeouts.
- Rerun after local prereq build: `pnpm exec vitest run --testTimeout
15000 server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/feedback-flush-controller.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts` passed.
- Some embedded Postgres-backed tests skipped on this host because local
Postgres init was unavailable.

## Risks

- Runtime-touching branch: startup/shutdown and issue interaction
behavior should be reviewed carefully.
- The feedback export change disables repeated flush attempts only for
database connection-refused failures; other upload failures still log
normally.
- The plugin worker error guard avoids process crashes from unhandled
EventEmitter errors but may hide errors from code paths that expected an
emitted listener.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent with local shell/git/tool use.
Exact hosted model ID and context-window size are not exposed by the
local Paperclip adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-20 10:37:11 -05:00
Dotta f257530537 [codex] UI and dev ops quality-of-life (#6384)
## Thinking Path

> - Paperclip operators spend most of their time scanning the board,
inbox, sidebar, and local dev status surfaces
> - Small UI and dev-ops frictions make repeated operator workflows feel
slower than they need to be
> - The working branch contained several independent quality-of-life
improvements mixed with larger cloud work
> - Grouping these smaller UI/dev-ops changes together keeps review
overhead reasonable without merging them into feature PRs
> - This pull request collects the operator-facing QoL polish into one
standalone branch
> - The benefit is a cleaner board navigation and local dev recovery
experience without depending on cloud upstream sync

## What Changed

- Relaxed forced 44px touch targets for small inline widgets.
- Fixed mobile mention menu scrolling and sidebar spacing on
touch/mobile layouts.
- Synced inbox hover state with j/k selection.
- Moved plugin sidebar entries into the Work section.
- Added manual dev-server restart action/banner behavior.
- Logged plugin bridge 502 causes for better diagnosis.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm --filter @paperclipai/plugin-sdk build`
- `pnpm exec vitest run ui/src/components/MarkdownEditor.test.tsx
ui/src/components/Sidebar.test.tsx
ui/src/components/SidebarProjects.test.tsx ui/src/pages/Inbox.test.tsx
ui/src/components/DevRestartBanner.test.tsx
server/src/__tests__/dev-server-status.test.ts
server/src/__tests__/health-dev-server-token.test.ts
server/src/__tests__/plugin-routes-authz.test.ts` initially failed only
because plugin SDK `dist` was not built in the fresh worktree.
- Rerun after build: `pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts` passed.
- The remaining targeted UI/dev-server tests passed on the first
post-install run.

## Visual Evidence

- Sidebar layout and plugin Work section: ![Sidebar
desktop](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-ui-dev-qol/docs/pr-screenshots/pr-6384/sidebar-desktop.png)
- Inbox/task row selection and hover-state surface: ![Inbox rows
desktop](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-ui-dev-qol/docs/pr-screenshots/pr-6384/inbox-rows-desktop.png)
- Dev restart banner desktop: ![Dev restart banner
desktop](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-ui-dev-qol/docs/pr-screenshots/pr-6384/dev-restart-banner-desktop.png)
- Dev restart banner mobile: ![Dev restart banner
mobile](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-ui-dev-qol/docs/pr-screenshots/pr-6384/dev-restart-banner-mobile.png)

## Risks

- Mostly UI/dev ergonomics with low data risk.
- Sidebar and inbox changes touch frequently used navigation surfaces,
so visual review on desktop/mobile is still useful.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent with local shell/git/tool use.
Exact hosted model ID and context-window size are not exposed by the
local Paperclip adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-19 15:52:39 -05:00
Dotta 43c5bb81b6 [codex] Workspace diff polish (#6383)
## Thinking Path

> - Paperclip gives operators a workspace diff plugin so they can
inspect agent changes before review
> - The diff view needs reliable base-ref defaults and controls that
stay usable while scrolling large diffs
> - The working branch mixed those plugin improvements with unrelated
server and cloud work
> - Keeping the workspace diff plugin changes isolated makes them easy
to test and review
> - This pull request polishes the workspace diff plugin controls,
base-ref behavior, and sticky headers
> - The benefit is a more predictable diff review surface for agent
workspaces

## What Changed

- Fixed workspace diff default base-ref resolution.
- Improved split/unified and working-tree/against-ref pane controls.
- Made workspace diff headers stay sticky while scrolling.
- Added a review screenshot at
`screenshots/PAP-9841-workspace-diff.png`.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm --filter @paperclipai/plugin-sdk build`
- `pnpm --filter @paperclipai/plugin-workspace-diff exec vitest run
tests/plugin.spec.ts`
- Result: 9 tests passed.

## Risks

- UI-only plugin branch with low data risk.
- The default base-ref inference should be reviewed against unusual
worktree/upstream combinations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent with local shell/git/tool use.
Exact hosted model ID and context-window size are not exposed by the
local Paperclip adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-19 15:51:13 -05:00
Dotta d67347be77 [codex] Provider vault secrets UX (#6381)
## Thinking Path

> - Paperclip orchestrates AI agents that need scoped, auditable access
to secrets
> - Hosted and external deployments need provider vault configuration
without exposing secret values in Paperclip metadata
> - AWS Secrets Manager vault setup previously required too much manual
operator knowledge
> - Provider vault discovery and removal belong together as an
independent secrets-management improvement
> - This pull request adds AWS provider vault discovery/prefill plus
vault removal flows
> - The benefit is a safer operator path for configuring external secret
storage before higher-level cloud workflows depend on it

## What Changed

- Added shared validators/types for AWS provider vault discovery
payloads and safe provider metadata.
- Implemented AWS provider vault discovery preview on the server.
- Added provider vault removal service/route behavior.
- Added Secrets page UI for discovery prefill, removal messaging, and
related rendering coverage.
- Added Storybook provider-vault fixtures and captured screenshots for
the new UX states.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run packages/shared/src/validators/secret.test.ts
server/src/__tests__/aws-secrets-manager-provider.test.ts
server/src/__tests__/secrets-routes.test.ts
server/src/__tests__/secrets-service.test.ts
ui/src/pages/Secrets.render.test.tsx`
- Result: 4 files passed, 1 embedded Postgres-backed file skipped on
this host because local Postgres init was unavailable.
- `pnpm --filter @paperclipai/ui exec vitest run
src/pages/Secrets.render.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- Storybook screenshot capture against `Product/Secrets` on
`http://127.0.0.1:60381/iframe.html?id=product-secrets--secrets-inventory&viewMode=story&globals=theme:dark`

## Screenshots

Provider vaults tab after this change:

![Provider vaults
tab](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-provider-vault-secrets/doc/screenshots/pr-6381/provider-vaults-tab.png)

AWS discovery candidate flow:

![AWS discovery candidate
flow](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-provider-vault-secrets/doc/screenshots/pr-6381/aws-discovery-candidates.png)

Provider vault removal confirmation:

![Provider vault removal
confirmation](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-provider-vault-secrets/doc/screenshots/pr-6381/remove-provider-vault-confirmation.png)

## Risks

- Secret provider metadata handling must remain non-sensitive;
validators reject credential-bearing Vault URLs and sensitive AWS
discovery keys.
- AWS discovery depends on deployment credentials being configured
correctly outside Paperclip-managed company secrets.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent with local shell/git/tool use.
Exact hosted model ID and context-window size are not exposed by the
local Paperclip adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-19 15:50:23 -05:00
Dotta 9c29394f4d [codex] Allow cloud tenant import mutations without browser origin (#6378)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Paperclip Cloud imports local company data into tenant Paperclip
stacks through trusted server-to-server calls.
> - Tenant imports authenticate as board actors with `source:
"cloud_tenant"` because they act on behalf of an authorized stack user.
> - The board mutation guard correctly protects browser session
mutations with trusted `Origin`/`Referer` checks.
> - But the guard treated trusted Cloud tenant calls like browser
session mutations, so server-to-server imports without a browser origin
failed with `403 Board mutation requires trusted browser origin`.
> - This pull request exempts trusted Cloud tenant actors from
browser-origin enforcement while preserving the session-backed browser
guard.
> - The benefit is that authorized Cloud imports can persist into tenant
Paperclip storage without weakening browser CSRF protections.

## What Changed

- Allow `req.actor.source === "cloud_tenant"` through
`boardMutationGuard` without requiring browser `Origin` or `Referer`
headers.
- Add a focused regression test for Cloud tenant POST mutations without
an origin.
- Preserve the existing session-backed rejection test for board
mutations that lack a trusted browser origin.

## Verification

- `pnpm exec vitest run
server/src/__tests__/board-mutation-guard.test.ts`
- Result: 10 tests passed.

## Risks

- Low risk: this only expands the existing non-browser exemption list to
trusted Cloud tenant actors that have already passed tenant-server-token
authentication.
- The browser-session path remains covered by the existing rejection
test, so missing-origin browser mutations still fail.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local repository
editing and shell verification.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-19 14:25:58 -05:00
Dotta bfe6369ef5 Guard cheap recovery model usage (#6371)
## Thinking Path

> - Paperclip is the control plane that coordinates AI-agent work
through issues, heartbeats, comments, approvals, and auditable recovery
paths.
> - The affected subsystem is heartbeat/recovery orchestration,
especially the optional cheap model profile used for operational
recovery overhead.
> - Cheap recovery should repair status and liveness, but it must not
become the worker lane that writes deliverables, continues source work,
or propagates cheap execution hints into downstream retries.
> - The gap was that cheap-profile hints could follow recovery wake
contexts and assignment overrides farther than intended, making real
work eligible to run on the cheap model.
> - This pull request separates status-only cheap recovery from normal
source-work continuations, adds route guards for deliverable mutations
during cheap status-only runs, and documents the invariant.
> - The benefit is safer retry/recovery behavior: cheap runs can clean
up control-plane state, while any remaining source work resumes through
a normal/original model path.

## What Changed

- Added recovery model-profile work classes so status-only recovery
carries explicit guard context and normal-model continuations scrub
cheap hints.
- Updated heartbeat, productivity review, liveness continuation, and
recovery service wakeups to request cheap only for bounded status-only
recovery work.
- Blocked cheap status-only recovery runs from writing issue documents,
plans, attachments, work products, or assigning downstream work back to
`modelProfile: "cheap"`.
- Added/updated server tests for cheap profile propagation,
artifact/document guards, route authorization, retry scheduling, and
successful-run handoff behavior.
- Documented the recovery model-profile lane in
`doc/SPEC-implementation.md` and `doc/execution-semantics.md`.
- After rebasing onto current `public-gh/master`, stabilized the new
`InstanceSidebar` plugin-filter tests so the PR check lane stays green.

## Verification

- Local: `pnpm exec vitest run --config vitest.config.ts
src/services/recovery/model-profile-hint.test.ts
src/__tests__/issue-agent-mutation-ownership-routes.test.ts
src/__tests__/issue-document-restore-routes.test.ts` from `server/` - 3
files, 37 tests passed after final edits.
- Local: `pnpm exec vitest run --config vitest.config.ts
src/__tests__/heartbeat-process-recovery.test.ts` from `server/` - 44
tests passed after rerunning the cleanup-sensitive file alone.
- Local: `pnpm --filter @paperclipai/ui exec vitest run
src/components/InstanceSidebar.test.tsx` - 4 tests passed.
- Local: `pnpm --filter @paperclipai/server typecheck` - passed.
- Local: `pnpm --filter @paperclipai/ui typecheck` - passed.
- PR checks on latest head `6f8c3b1380f5bd872c6f49f6f7188ecf3bb6d263` -
all green, including `verify`, build, typecheck,
server/general/serialized tests, e2e, Snyk, and policy.
- Greptile: pass 3 returned Confidence Score 5/5 with zero unresolved
Greptile review threads.

## Risks

- Medium risk: recovery behavior is intentionally stricter, so any path
that incorrectly relies on cheap recovery to keep doing source work will
now need to hand back to a normal-model run.
- Low migration risk: no schema changes.
- No product UI changes; the UI file touched is a test-only
stabilization after rebasing onto current `master`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool use and
local code execution enabled; context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no product UI changes)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-19 13:46:02 -05:00
Devin Foley 24748de421 fix(ui): hide sandbox-provider plugins from Instance Settings sidebar (#6341)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Plugins extend Paperclip with new capabilities; the Instance
Settings sidebar exposes per-plugin settings pages under a "Plugins"
group
> - Some plugins contribute only sandbox-provider drivers (E2B, exe.dev,
Modal). They have no per-plugin settings UI — `PluginSettings` already
redirects sandbox-provider-only plugins to the Environments page
> - As a result, listing them as their own sidebar rows produced
confusing entries that visually nested below the "Adapters" group and
only lead to a stub redirect — there was no value to the user
> - This pull request hides sandbox-provider-only plugins from the
Instance Settings sidebar, and reorders the indented plugin list so it
sits directly under the "Plugins" group it actually belongs to
> - The benefit is a cleaner sidebar that only surfaces plugins with
real per-plugin settings, and removes the visual mis-nesting under
Adapters

## What Changed

- `ui/src/components/InstanceSidebar.tsx`: filter out plugins whose only
contribution is `sandboxProviders` (hybrid plugins that contribute
sandbox providers *plus* something else still get a sidebar entry). Move
the indented plugin list so it renders between the "Plugins" row and the
"Adapters" row instead of after Adapters.
- `ui/src/components/InstanceSidebar.test.tsx`: new test file with 4
cases — sandbox-only plugins hidden, hybrid plugins shown, ordering
(plugin list appears under Plugins and before Adapters), and the
existing non-plugin sidebar items still render.

## Verification

- `pnpm -C ui vitest run src/components/InstanceSidebar.test.tsx` → 4/4
pass.
- `pnpm typecheck` clean on the changed files.
- Manual: visit `/instance/settings/plugins` — "E2B Sandbox Provider"
and "exe.dev Sandbox Provider" rows no longer appear in the sidebar;
remaining plugins are listed directly under the "Plugins" group, not
below Adapters.

**Before:** see the screenshot embedded in the linked issue (`PAPA-375`)
— sandbox-provider plugins show as sidebar rows visually nested under
"Adapters".

**After:** sandbox-provider-only rows are gone; plugin list sits
directly under the "Plugins" group. (A live runtime screenshot was not
captured for this PR because the local server requires an authenticated
browser session not currently available to the agent — happy to add one
on request.)

## Risks

- Low risk. Pure UI filter + reorder, scoped to `InstanceSidebar.tsx`.
No backend or plugin-loader changes. Hybrid plugins that legitimately
need a settings entry are explicitly preserved by the filter. Covered by
4 new unit tests.

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), Anthropic, via Claude Code in a
Paperclip heartbeat. Standard tool-use mode (no extended thinking). 200K
context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — before is in the linked issue; after pending live capture
(see Verification)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes PAPA-375.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-19 09:06:24 -07:00
Devin Foley c0c5a8263d feat(ui): wire SecretBindingPicker into JsonSchemaForm secret-ref fields (#6339)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Plugin authors expose configuration via JSON schemas, including
secret fields marked `format: "secret-ref"`
> - At the same time, Paperclip already has a first-class secrets store,
and `SecretBindingPicker` is the canonical UI for binding to one of
those stored secrets
> - But `JsonSchemaForm`'s `SecretField` rendered only a plain password
input, so configuring an E2B (or Modal / Cloudflare / Daytona) sandbox
required leaving the form, copying a secret UUID, and pasting it back
> - This pull request wires `SecretBindingPicker` into `SecretField` so
every plugin secret-ref field gets the picker plus an optional raw-value
fallback
> - The benefit is that secret reuse becomes one click instead of a tab
switch, and the raw-paste path still works for one-off keys or long
SSH-style secrets

## What Changed

- `ui/src/components/JsonSchemaForm.tsx` `SecretField` now renders
`SecretBindingPicker` above the existing password/textarea input.
UUID-shaped values are treated as bound refs (no raw input shown).
Non-UUID values keep the password/textarea visible (auto-opened) for SSH
keys and other long secrets. Empty fields show the picker plus a small
"Or paste a raw value" toggle.
- Selecting a secret writes the secret UUID to the form value — the
server-side resolution in `server/src/services/environment-config.ts`
(`resolveConfigSecretRefsForRuntime` / `collectEnvironmentSecretRefs`)
is unchanged. The version selector on the picker is suppressed
(`allowVersionSelector={false}`) because plugin secret refs always
resolve at `"latest"`.
- `ui/src/components/JsonSchemaForm.test.tsx` mocks the picker (which
requires `CompanyContext` + `QueryClient` providers) and adds coverage
for: picker render, UUID-bound state hides the raw input, picker
selection writes the UUID through `onChange`, raw text keeps the
password fallback. The original multiline (SSH key) case still asserts a
textarea + no password input.

## Verification

- `pnpm --filter @paperclipai/ui test
src/components/JsonSchemaForm.test.tsx` → 4/4 passing
- `pnpm --filter @paperclipai/ui test src/pages/PluginSettings.test.tsx`
→ 5/5 passing (existing consumer of `JsonSchemaForm`)
- `pnpm --filter @paperclipai/ui exec tsc --noEmit` → clean
- Manual: in the company Environments page, edit an environment with a
sandbox driver that exposes a `secret-ref` field (e.g., E2B `apiKey`).
The field should render the secret dropdown above the raw-value toggle;
selecting an active secret persists its UUID, and saving the form
continues to resolve the secret at runtime.

Before/after screenshots: deferred — change was validated by
[@devinfoley](https://github.com/devinfoley) on the main Paperclip
instance before this PR was opened. Happy to add screenshots if a
reviewer wants them.

## Risks

- Low risk. The change is additive in the SecretField: the raw-value
password/textarea path is preserved and auto-opens whenever the stored
value is not a UUID, so existing SSH-key entries and unsaved raw values
are untouched.
- The new heuristic is "if `value` is a UUID, treat it as a bound
secret". A user who somehow pasted a UUID as a literal value (not as a
secret ref) would now see it rendered as a bound (possibly missing)
secret in the picker. The previous UI already treated UUID values as
opaque secret refs at save time (server converts UUIDs straight
through), so the runtime behavior is unchanged.
- Picker pulls company secrets via the existing `secretsApi.list` query.
No new endpoints, no migrations.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (`claude-opus-4-7`)
- Capabilities: tool use, extended reasoning
- Surfaced through: Claude Code via Paperclip heartbeat (issue PAPA-377)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — deferred; user validated locally before opening the PR.
Will add if requested.
- [x] I have updated relevant documentation to reflect my changes (no
docs needed — internal behavior of an existing form field)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-18 21:17:41 -07:00
Devin Foley f343bae119 fix(ci): copy link-plugin-dev-sdk.mjs into Docker deps stage (#6338)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Releases ship via a Docker image built in the `build-and-push` CI
workflow
> - A recent change added `plugin-workspace-diff` to the pnpm workspace;
its `postinstall` hook calls `scripts/link-plugin-dev-sdk.mjs`
> - The Dockerfile's `deps` stage runs `pnpm install` before the full
repo is copied, so the script was missing and `pnpm install` failed with
`Cannot find module`
> - Sandbox-provider plugins have the same hook but are excluded from
the pnpm workspace, so they were unaffected — this was specific to
`plugin-workspace-diff`
> - This pull request copies `scripts/link-plugin-dev-sdk.mjs` into the
`deps` stage alongside the package.json files
> - The benefit is restoring the `build-and-push` CI workflow with a
minimal one-line change

## What Changed

- Add `COPY scripts/link-plugin-dev-sdk.mjs scripts/` to the
Dockerfile's `deps` stage so the `plugin-workspace-diff` postinstall
hook succeeds during `pnpm install`.

## Verification

- Reproduces the original failure on `master` by running `docker build
--target deps .` — fails at `pnpm install` with `Cannot find module
'/app/scripts/link-plugin-dev-sdk.mjs'`.
- With this patch, `docker build --target deps .` completes successfully
through the `deps` stage.
- CI `build-and-push` job (previously failing on
https://github.com/paperclipai/paperclip/actions/runs/26055610103/job/76602841176)
should now pass.

## Risks

- Low risk. One-line addition that copies a single script earlier in the
Docker build. No runtime behavior changes, no app code changes, no
schema changes.

## Model Used

- Claude (Anthropic), model ID `claude-opus-4-7`, extended thinking
enabled, 200K context. Used via Claude Code CLI with tool use (Bash,
Read, Edit, Grep) running inside the Paperclip agent harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-18 20:55:34 -07:00
Dotta a07e6cef7b [codex] Fix new issue autocomplete pointer selection (#6311)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Human operators create and edit issues through modal-heavy board UI
workflows.
> - The new-issue dialog embeds markdown editors that render
autocomplete menus through body-level portals.
> - Radix Dialog treats those portal clicks as outside-dialog pointer
events and prevents their default behavior.
> - That prevention made completion items hard or impossible to select
from inside the new-issue dialog.
> - This pull request marks the markdown editor floating autocomplete
menu as allowed dialog-external UI and extends the dialog
outside-pointer handler to preserve those interactions.
> - The benefit is that users can click/tap autocomplete completions
while keeping the existing modal behavior intact.

## What Changed

- Added a stable `data-paperclip-floating-ui` marker and explicit
pointer event handling to the markdown editor mention/autocomplete
portal.
- Updated the new issue dialog outside-pointer guard so editor
autocomplete portals are handled like Radix popover portals.
- Added regression coverage for markdown editor portal markup and new
issue dialog completion selection behavior.

## Verification

- `pnpm exec vitest run ui/src/components/MarkdownEditor.test.tsx
ui/src/components/NewIssueDialog.test.tsx` passed: 2 files, 38 tests.
- Confirmed the branch is rebased onto current `public-gh/master` before
opening this PR.
- Confirmed the diff does not include `pnpm-lock.yaml` or
`.github/workflows` changes.

## Risks

- Low risk. The change is scoped to allowing pointer events from known
body-level UI portals while keeping other outside-dialog pointer events
under Radix Dialog control.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and local
command execution. Exact hosted context window is not surfaced in this
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshot note: this is an interaction/event-handling fix with no
visible UI change; verification is covered by the focused regression
tests above.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-18 14:28:49 -05:00
Devin Foley 988689947a fix(release): publish modal plugin from ci (#6290)
## Thinking Path

> - Paperclip keeps its core release process declarative through
`scripts/release-package-manifest.json`, which decides which packages CI
is allowed to publish.
> - The Modal sandbox provider now exists as a first-party plugin
package under `packages/plugins/sandbox-providers/modal`.
> - The original Modal PR intentionally left `publishFromCi` disabled
until the package had been published and the registry bootstrap concern
was cleared.
> - The latest reviewer comment confirms that bootstrap step is
complete, so the remaining gap is only release automation configuration.
> - This pull request flips the Modal manifest entry to `publishFromCi:
true` so future CI-driven releases can publish
`@paperclipai/plugin-modal` the same way the other releasable packages
do.
> - The benefit is that Modal releases no longer require a manual
exception in the release pipeline.

## What Changed

- Updated the `@paperclipai/plugin-modal` entry in
`scripts/release-package-manifest.json` to set `publishFromCi` to
`true`.

## Verification

- Ran `node -e 'const
m=require("./scripts/release-package-manifest.json"); const
e=m.find(x=>x.name==="@paperclipai/plugin-modal");
if(!e||e.publishFromCi!==true){throw new Error("modal publishFromCi not
true")}; console.log(JSON.stringify(e))'`

## Risks

- Low risk. This only changes release-manifest metadata; the main
failure mode is CI attempting to publish the Modal package before
registry credentials or release conditions are ready.

## Model Used

- OpenAI Codex local agent, GPT-5-based coding model in the Codex
runtime (exact deployment model ID not exposed in this workspace), with
tool use and shell execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-18 09:42:01 -07:00
Devin Foley e5a0f5debd fix(plugin): raise environmentProbe RPC timeout to 120s for cold-start sandboxes (#6289)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Companies provision execution environments via sandbox provider
plugins (Modal, Daytona, E2B, etc.)
> - At provision time, the server probes each plugin's environment /
sandbox-provider driver over a worker RPC to validate config
> - `workerManager.call()` defaults to a 30s timeout, but cold-start
sandboxes — Modal in particular — take ~31s to boot
> - Result: every fresh Modal environment probe fails with a worker RPC
timeout, blocking environment provisioning end-to-end
> - This PR passes `timeoutMs=120_000` to the two probe call sites
(`probePluginEnvironmentDriver`, `probePluginSandboxProviderDriver`)
> - The benefit is Modal — and any future provider with similar
cold-start latency — can be successfully probed without false-negative
timeout failures

## What Changed

- Pass `timeoutMs=120_000` to `workerManager.call()` in
`probePluginEnvironmentDriver`
(`server/src/services/plugin-environment-driver.ts`)
- Pass `timeoutMs=120_000` to `workerManager.call()` in
`probePluginSandboxProviderDriver` (same file)

## Verification

- Targeted unit tests:
  ```
  pnpm --filter @paperclipai/server exec vitest run \
    src/__tests__/plugin-environment-driver-seam.test.ts \
    src/__tests__/heartbeat-plugin-environment.test.ts
  ```
  5/5 tests pass.
- Manual: provision a fresh Modal sandbox environment from the UI.
Previously failed with a worker RPC timeout at ~30s; now succeeds.

## Risks

- Low risk. The change only raises a per-call timeout (default 30s →
explicit 120s) on two probe call sites. Fast providers are unaffected
since probe completes well below either bound. Worst case: a genuinely
hung worker now blocks the probe for 120s instead of 30s before giving
up — still bounded, and only on the provision-time probe path (not the
heartbeat/run path).

## Model Used

- Provider: Anthropic
- Model: `claude-opus-4-7` (Claude Opus 4.7, 1M context window)
- Capabilities: extended thinking, tool use, code execution
- Scope of AI assistance: the underlying 4-line code change was
human-authored by the committer; this PR (verification commands, message
structuring, and submission) was prepared with Claude per the
`paperclip-dev` skill.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable — n/a, this is a
per-call timeout configuration bump; existing tests cover the probe call
path
- [x] If this change affects the UI, I have included before/after
screenshots — n/a, no UI change
- [ ] I have updated relevant documentation to reflect my changes — n/a,
the timeout is an internal worker-RPC tuning value with no documented
contract
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-18 09:32:12 -07:00
Devin Foley 4b1e92a588 feat(plugins): add Modal sandbox provider plugin (#6245)
## Thinking Path

> - Paperclip orchestrates AI agents through company-scoped
control-plane workflows and extensible runtime integrations.
> - Sandbox providers are part of that extension surface because they
let agents execute isolated work without baking each provider into the
core server.
> - Modal already offers managed sandboxes with filesystem, process,
timeout, and networking controls that map onto Paperclip's sandbox
provider contract.
> - The repo did not have a Modal provider plugin, so teams wanting
Modal-backed sandboxes had no first-party integration path.
> - This pull request adds a standalone
`packages/plugins/sandbox-providers/modal` plugin that implements the
provider contract, worker entrypoint, docs, and tests.
> - The benefit is that Modal can now be installed as a provider plugin
without expanding the core control-plane surface area.

## What Changed

- Added a new `packages/plugins/sandbox-providers/modal` package with
the plugin manifest, worker entrypoint, and exported plugin surface.
- Implemented Modal-backed sandbox lifecycle support for creation,
command execution, file operations, networking options, termination, and
metadata translation.
- Added focused Vitest coverage for config validation, env handling,
lifecycle flows, networking behavior, and error mapping.
- Documented installation, configuration, and usage requirements in the
plugin README.
- Removed misleading `MODAL_TOKEN_*` fallback behavior so authentication
relies on supported Modal credentials only.

## Verification

- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- `cd packages/plugins/sandbox-providers/modal && pnpm test`

## Risks

- Low to medium risk: this is isolated to a new plugin package, but
runtime behavior still depends on live Modal account credentials and
service-side sandbox semantics.
- Modal's current docs target a newer Node baseline than the repo
default, so the first live install should confirm credential loading and
sandbox startup behavior in a real Modal workspace.
- No UI or schema changes are included in this PR.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agent (GPT-5-class Codex
coding model; exact backend model ID is not exposed by the runtime),
with tool use, shell execution, and code-editing capabilities enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-18 08:36:34 -07:00
Dotta 8f45d12447 docs: update plugin authoring guide for managed resources (#6261)
## Thinking Path

<!--
Required. Trace your reasoning from the top of the project down to this
  specific change. Start with what Paperclip is, then narrow through the
  subsystem, the problem, and why this PR exists. Use blockquote style.
  Aim for 5-8 steps. See CONTRIBUTING.md for full examples.
-->

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system is how optional capabilities extend the control
plane without adding hidden core behavior.
> - Plugin authors need accurate guidance for the current managed
capabilities model.
> - The existing docs under-described managed skills and the
routine-first pattern for durable plugin automation.
> - Content-oriented plugins such as LLM Wiki should model recurring
work with visible managed agents, projects, routines, and skills.
> - This pull request aligns the authoring guide, local development
guide, and longer plugin spec with that model.
> - The benefit is clearer plugin guidance that preserves Paperclip
visibility, budgets, pause controls, and audit trails.

## What Changed

<!-- Bullet list of concrete changes. One bullet per logical unit. -->

- Documented plugin-managed skills alongside managed agents, projects,
and routines.
- Added guidance for content-oriented plugins to use managed projects,
agents, skills, and routines instead of private daemon-like state.
- Updated the manifest/spec examples and capability lists for current
plugin-managed surfaces.
- Clarified when to use managed routines instead of plugin runtime jobs
for board-visible recurring work.
- Added a short local plugin development note pointing authors toward
routine-first automation.
- Addressed Greptile docs feedback by marking top-level `launchers` as
legacy and removing a redundant `slug` from the managed skill example.

## Verification

<!--
  How can a reviewer confirm this works? Include test commands, manual
  steps, or both. For UI changes, include before/after screenshots.
-->

- `git diff --check public-gh/master...HEAD`
- Reviewed `ROADMAP.md`; this is docs alignment for the completed plugin
system milestone and does not add roadmap-level core feature work.
- Greptile Review: success on the latest head; `3 files reviewed, 0
comments added` after follow-up fixes.
- GitHub PR checks are green on the latest head, including Build,
Typecheck + Release Registry, General tests, serialized server suites,
e2e, Canary Dry Run, policy, Snyk, and aggregate `verify`.

## Risks

<!--
  What could go wrong? Mention migration safety, breaking changes,
  behavioral shifts, or "Low risk" if genuinely minor.
-->

- Low risk: documentation-only changes.
- Main risk is documentation drift if the plugin API changes again
before these docs are reviewed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

<!--
  Required. Specify which AI model was used to produce or assist with
  this change. Be as descriptive as possible - include:
    • Provider and model name (e.g., Claude, GPT, Gemini, Codex)
• Exact model ID or version (e.g., claude-opus-4-6,
gpt-4-turbo-2024-04-09)
    • Context window size if relevant (e.g., 1M context)
• Reasoning/thinking mode if applicable (e.g., extended thinking,
chain-of-thought)
• Any other relevant capability details (e.g., tool use, code execution)
  If no AI model was used, write "None — human-authored".
-->

- OpenAI Codex, GPT-5 coding agent with shell and GitHub connector tool
use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-18 10:14:27 -05:00
Dotta 32605b71ad Remove planning badge from inbox issue rows (PAP-9691) (#6269)
## Summary

- Removes the amber "Planning" pill from inbox / issue-list rows in
`IssueRow`
- Updates the focused IssueRow test to assert the badge is no longer
rendered
- Per [PAP-9691](https://paperclip.ing/PAP/issues/PAP-9691): user just
doesn't want to see the badge in list rows

The underlying `issue.workMode === "planning"` data, the issue detail
composer toggle, and the server/plugin/heartbeat work-mode contract
introduced in #5353 are all untouched. Planning mode still functions;
the list-row indicator is just gone.

## Test plan

- [x] `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueRow.test.tsx` (11 passed)
- [ ] Visual: open `/PAP/inbox` with a planning-mode issue assigned and
confirm no amber Planning pill on the row

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 09:27:05 -05:00
github-actions[bot] 85510f0e5b chore(lockfile): refresh pnpm-lock.yaml (#6263)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-18 08:52:09 -05:00
Dotta 5071c4c776 [codex] Add workspace diff viewer plugin (#6071)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators need to inspect what agents changed inside execution and
project workspaces.
> - The existing workspace detail views did not provide a first-party
rich diff surface for staged, unstaged, head, renamed, binary,
oversized, and untracked changes.
> - The plugin system is the intended extension point for optional rich
UI surfaces.
> - This pull request adds a workspace diff plugin plus host services
and shared contracts so Changes tabs can render workspace diffs through
plugin slots.
> - The diff-renderer dependency should stay owned by the plugin package
rather than the core UI app.
> - The dependency surface must stay aligned with repository PR policy,
including intentionally omitting `pnpm-lock.yaml` from the PR.
> - The benefit is a more reviewable workspace surface without
hard-coding the renderer into every page.

## What Changed

- Added `@paperclipai/plugin-workspace-diff`, including diff
normalization, plugin manifest/worker/UI entrypoints, and focused plugin
tests.
- Kept `@pierre/diffs` scoped to `@paperclipai/plugin-workspace-diff`;
removed the core UI lab diff-renderer surface and direct UI package
dependency.
- Added shared workspace diff types and validators, plus plugin SDK
surface for workspace diff host services.
- Added server workspace diff service support and route coverage for
execution/project workspace diff flows.
- Wired Execution Workspace and Project Workspace Changes tabs to load
the diff plugin, including loading/error fallback behavior.
- Added UI tests and fixtures for the Changes tabs and plugin bridge
behavior.
- Added the new plugin package manifest to the Docker deps stage so PR
policy can validate dependency coverage.
- Addressed review hardening around empty untracked patches, workspace
path exposure, project workspace read capability checks, and default
base refs.

## Verification

- `pnpm --filter @paperclipai/plugin-workspace-diff test`
- `pnpm exec vitest run
packages/shared/src/validators/workspace-diff.test.ts
server/src/__tests__/workspace-diff-service.test.ts
ui/src/pages/ProjectWorkspaceDetail.test.tsx
ui/src/pages/ExecutionWorkspaceDetail.test.tsx`
- `pnpm exec vitest run ui/src/plugins/bridge.test.ts
server/src/__tests__/workspace-runtime-routes-authz.test.ts`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/plugin-workspace-diff typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `node ./scripts/check-docker-deps-stage.mjs`
- Browser screenshot captured from the local worktree dev server:
https://files.catbox.moe/ofdpsp.png
- Confirmed branch is rebased onto `public-gh/master`,
`.github/workflows/pr.yml` is not included in the PR diff,
`ui/package.json` is not included in the PR diff, and `pnpm-lock.yaml`
is not included in the PR diff.

## Risks

- Medium UI integration risk: the Changes tab depends on the plugin slot
and host diff service path.
- Medium dependency risk: this adds `@pierre/diffs` in the plugin
package, but `pnpm-lock.yaml` is intentionally omitted per packaging
instructions because repository automation manages lockfile updates.
- Current CI blocker: downstream frozen installs fail until the
repository policy path for new plugin package dependencies is chosen.
- Diff rendering edge cases are covered for common working-tree and head
diff states, but very large repositories may still expose performance
limits.
- No migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 class coding model, tool-enabled local execution
environment. Exact context window was not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-18 08:50:06 -05:00
Devin Foley 242a2c2f2b fix(cli): stop worktree init --force from wiping repo worktrees/ (#6240)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each working tree gets an isolated Paperclip instance; the CLI's
`paperclip worktree init` is what bootstraps that instance and writes
`<repo>/.paperclip/config.json` + `.env`
> - When `--force` was passed, the init path tried to "start clean" by
recursively removing the entire `<repo>/.paperclip/` directory before
rewriting those two files
> - But `<repo>/.paperclip/` also holds `worktrees/`, which contains
every repo-managed worktree checkout (70+ on this machine). The
recursive rm silently nuked all of them.
> - This PR narrows the `--force` reset so it only deletes the two files
it's about to rewrite (`config.json`, `.env`), instead of wiping the
whole `repoConfigDir`
> - It also adds a regression test that drops a sentinel file into
`<repoConfigDir>/worktrees/` and asserts it survives a `--force` init
> - The benefit is that `worktree init --force` becomes safe to run from
inside the main repo without destroying every sibling worktree checkout

## What Changed

- `cli/src/commands/worktree.ts`: in the `--force` branch of
`runWorktreeInit`, replace `rmSync(paths.repoConfigDir, { recursive:
true, force: true })` with targeted removals of `paths.configPath` and
`paths.envPath`. `paths.instanceRoot` removal is unchanged — that path
is per-instance and safe to wipe.
- `cli/src/__tests__/worktree.test.ts`: new regression test that seeds a
fake `worktrees/<name>/` checkout inside the repo's `.paperclip/` and
verifies `runWorktreeInit({ force: true, ... })` does not delete it.

## Verification

- `pnpm --filter @paperclip/cli test -- worktree` — the new regression
test fails on the old code and passes on the fix
- Manual: from a repo checkout, `npx paperclipai worktree init --force
…` no longer removes `<repo>/.paperclip/worktrees/`; only `config.json`
and `.env` are rewritten

## Risks

- Low. The change strictly narrows what `--force` removes. Any caller
that depended on `--force` also wiping unrelated files under
`<repo>/.paperclip/` (there shouldn't be any — it's documented as just
config + env) would see those files persist. `instanceRoot` cleanup is
unchanged.

## Model Used

- Claude (Anthropic), model `claude-opus-4-7`, ~200K context,
extended-thinking + tool-use enabled, run via the Paperclip
`claude_local` adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — CLI-only fix)
- [x] I have updated relevant documentation to reflect my changes (no
doc surface affected)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-17 22:12:56 -07:00
Devin Foley 734385102c Fix new secret form textarea overflow (PAPA-348) (#6222)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators manage per-company secrets through the Secrets page in the
web UI
> - A long secret value pasted into the "New secret" textarea blew out
the form's width, which pushed the Create/Cancel buttons off-screen and
made the form unusable
> - Root cause: the shadcn `Textarea` primitive sets `w-full` but does
not constrain `min-width`, so a flex parent honors the textarea's
intrinsic content width when a long unbreakable string is present
> - This pull request adds `min-w-0 max-w-full` to the shared `Textarea`
primitive and `min-w-0 overflow-x-hidden break-all` on the secret-value
usage so a long token wraps inside the form bounds
> - The benefit is the Create/Cancel buttons stay reachable regardless
of pasted token length, and every other `Textarea` consumer also gets
the flexbox-friendly width constraint

## What Changed

- `ui/src/components/ui/textarea.tsx`: added `min-w-0 max-w-full` to the
base shadcn `Textarea` so it cannot exceed its flex parent
- `ui/src/pages/Secrets.tsx`: added `min-w-0 overflow-x-hidden
break-all` on the new-secret value `Textarea` so long opaque tokens wrap
instead of pushing the form
- `ui/src/pages/Secrets.render.test.tsx`: new regression test that opens
the New Secret dialog and asserts the value textarea carries the
width-constraint classes

## Verification

- `cd ui && npx vitest run src/pages/Secrets.render.test.tsx` — 3/3 pass
- Manual: open the Secrets page, click "New secret", paste a long
unbroken string (e.g. a 500-char token) into the value field. The form
stays within its dialog and the Create/Cancel buttons remain in view.

Before:

<img width="1772" height="1432" alt="image"
src="https://github.com/user-attachments/assets/cb31a290-f82a-41dc-9346-91d18cbb5911"
/>

After:

<img width="672" height="734" alt="Screenshot 2026-05-17 at 5 39 38 PM"
src="https://github.com/user-attachments/assets/a08800c2-b09b-43be-b0e8-114d9149b8f5"
/>

After: the value field wraps with `break-all` inside the dialog;
Create/Cancel stay clickable. Covered by the new render test which
asserts `min-w-0`, `overflow-x-hidden`, and `break-all` are present on
`#new-secret-value`.

## Risks

- Low risk. The base `Textarea` change adds `min-w-0 max-w-full`, which
only affects layouts where a textarea was previously allowed to grow
past its parent — those cases were already buggy. `break-all` on the
secret-value textarea is the right behavior for opaque tokens; it would
be wrong for prose, but this field is explicitly a secret token.

## Model Used

- Provider: Anthropic Claude
- Model: claude-opus-4-7 (Opus 4.7)
- Mode: standard Claude Code agent, tool use enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-17 17:54:15 -07:00
Dotta d734bd43d1 [codex] Roll up May 17 branch changes (#6210)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so agent
work needs visible ownership, recovery, and operator controls.
> - This local branch had accumulated several related control-plane
reliability and operator-experience fixes across recovery actions,
watchdog folding, model-profile defaults, mentions, markdown editing,
plugin launchers, and small UI polish.
> - The branch needed to be converted into a PR against the current
`origin/master` without losing dirty work or including lockfile/workflow
churn.
> - The safest standalone shape is a single rollup PR because the
recovery/server/UI files overlap heavily across the local commits and
splitting would create avoidable conflicts.
> - This pull request replays the local branch onto latest
`origin/master`, preserves the uncommitted work as logical commits, and
adds a Zod 4 validator compatibility fix found during verification.
> - The benefit is that the May 17 local branch can be reviewed and
merged as one coherent, conflict-free branch under the 100-file Greptile
limit.

## What Changed

- Rebased the local May 17 branch work onto current `origin/master` in a
dedicated worktree.
- Preserved and committed previously dirty changes for recovery retry
handling, plugin/sidebar launcher polish, and `.herenow` ignores.
- Added recovery-action behavior for returning source issues to `todo`
when retrying source-scoped recovery.
- Included the existing local recovery/liveness/watchdog fold, Codex
cheap-profile, markdown/mention, duplicate-agent, and UI polish commits
from the branch.
- Normalized shared validator `z.record(...)` schemas to explicit
string-key records for Zod 4 compatibility.
- Confirmed the PR has no `pnpm-lock.yaml` or `.github/workflows/*`
changes and stays below the 100-file Greptile limit.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `npm run install` in
`node_modules/.pnpm/sqlite3@5.1.7/node_modules/sqlite3` to build the
local native sqlite3 binding after installing with scripts disabled
- `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
packages/shared/src/project-mentions.test.ts
packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/heartbeat-model-profile.test.ts
server/src/__tests__/issue-recovery-actions.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/heartbeat-active-run-output-watchdog.test.ts
server/src/__tests__/plugin-local-folders.test.ts
ui/src/components/IssueRecoveryActionCard.test.tsx
ui/src/components/Sidebar.test.tsx
ui/src/components/SidebarAccountMenu.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/components/MarkdownBody.test.tsx
ui/src/lib/duplicate-agent-payload.test.ts
ui/src/pages/Routines.test.tsx`
- First pass: 13 files passed with 201 passing tests; 3 server files
failed before sqlite3 native binding was built.
- After rebuilding sqlite3:
`server/src/__tests__/heartbeat-model-profile.test.ts`,
`server/src/__tests__/issue-recovery-actions.test.ts`, and
`server/src/__tests__/heartbeat-active-run-output-watchdog.test.ts`
passed/loaded; embedded Postgres tests were skipped by the local host
guard.
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/adapter-utils typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Medium risk: this is a broad rollup PR across recovery semantics,
server tests, shared validators, and UI surfaces.
- Some embedded Postgres tests skipped locally due the host guard, so CI
should provide the stronger database-backed signal.
- UI changes were covered by component tests, but no browser screenshot
was captured in this PR creation pass.
- This branch may overlap with existing recovery/liveness PR work; merge
this PR independently or restack/close overlapping branches rather than
merging duplicate implementations together.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled local repository
and GitHub workflow, medium reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-17 17:15:06 -05:00
Dotta 705c1b8d81 [codex] Add routine env secrets support (#6212)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Scheduled routines are the control-plane path for recurring agent
work.
> - Routines already had dispatch/history, but their runtime environment
did not carry routine-owned secret bindings through execution.
> - Operators need routine-specific secrets that can override
project/agent env without exposing secret values in history, logs, or
access events.
> - This pull request adds the routine env runtime contract, wires it
into execution, and makes the routine UI/history surfaces show safe
secret metadata.
> - The benefit is that routine executions can use scoped secret refs
predictably while preserving company boundaries and auditability.

## What Changed

- Added routine env persistence/runtime support, including
`routines.env`, `routine_runs.routine_revision_id`, revision snapshots,
and idempotent migration `0086_routine_env_runtime_contract`.
- Resolved routine env during heartbeat adapter config assembly with
precedence `agent < project < routine` and secret access events recorded
against the routine consumer.
- Added secret binding synchronization for routine create/update/restore
flows and guarded cross-company, missing, disabled, and deleted secret
cases.
- Added a Secrets tab to routine detail, env/secret history diff
rendering, and Storybook coverage for the new UI states.
- Added server/UI regression tests, including an embedded-Postgres QA
path for routine secret execution and restore behavior.
- Updated implementation/database docs for routine env and
secret-binding behavior.

## Verification

- `pnpm install --frozen-lockfile` after rebasing onto
`public-gh/master` to refresh workspace links for the newly-added
upstream Grok adapter package.
- `pnpm exec vitest run
server/src/__tests__/heartbeat-project-env.test.ts
server/src/__tests__/routines-service.test.ts
server/src/__tests__/secrets-service.test.ts
server/src/__tests__/qa-routine-secrets-e2e.test.ts
ui/src/components/RoutineHistoryTab.test.tsx` passed: 5 files, 92 tests.
- `pnpm -r typecheck` passed across the workspace.
- `pnpm build` passed. Vite emitted the existing
large-chunk/dynamic-import warnings.
- UI screenshots were captured locally during QA in
`artifacts/pap-9521/` and `artifacts/pap-9522/`; generated screenshots
are not committed to avoid adding binary artifacts to the repo.

## Risks

- Migration risk is limited by `IF NOT EXISTS` guards for the new
columns, FK, and index, and the migration is ordered as `0086`
immediately after upstream `0085`.
- Runtime behavior changes env precedence for routine executions by
adding routine env as the highest-precedence layer; tests cover
agent/project/routine precedence.
- Secret handling is security-sensitive; tests cover value-free
manifests/events/errors, disabled/missing/deleted secrets, and
cross-company rejection.
- UI history now renders routine env/secret diffs; tests and Storybook
stories cover the main rendering paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell/tool use and
medium reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-17 16:30:34 -05:00
Dotta 3e6610fb93 docs(skills): add release-changelog-discord-message skill (#6152)
## Summary

Adds `.agents/skills/release-changelog-discord-message/SKILL.md` — the
companion to `.agents/skills/release-changelog/`. The changelog skill
produces `releases/vYYYY.MDD.P.md`; this one turns that into the single
copy-pasteable Discord post in dotta's voice and attaches it as the
`discord_announcement` document on the release issue.

Includes:

- dotta's instructions near-verbatim from PAP-3687 ("This is for discord
— try to follow my format. If I have a section where I think about the
future, pull from recent issues we're working on etc.")
- The three previous Discord announcements (v2026.403.0, v2026.416.0,
v2026.427.0) **verbatim** as canonical voice references
- Format template + language tips (ALL CAPS for excitement,
emoji-shortcode-per-highlight, first-person voice, opener/closer brand
bookends)
- Workflow tied to the existing release issue + `discord_announcement`
document
- Review checklist (version match, contributor list dedup, real "what's
next" themes, no invented work)

Resolves PAP-9524.

## Test plan

- [ ] dotta eyeballs voice + structure against the three prior posts
- [ ] On the next release, run this skill end-to-end and confirm the
`discord_announcement` document on the release issue matches the format

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-16 20:36:22 -05:00
Dotta 7bbdfb69df [codex] Enable Grok adapter canary publishing (#6154)
## Thinking Path

> - Paperclip publishes its CLI, server, UI, and adapter packages
through the shared release workflow.
> - Canary releases are driven by the GitHub release workflow on pushes
to `master`.
> - The release workflow does not publish every public package
automatically; it uses `scripts/release-package-manifest.json` as the CI
enrollment source of truth.
> - The Grok adapter is public and already present in the manifest, but
its `publishFromCi` flag was still disabled.
> - Because of that flag, normal canary publishes skipped
`@paperclipai/adapter-grok-local` even when the main package received a
canary.
> - This pull request enables Grok in the release manifest so future
canary runs include it.
> - The benefit is that Grok adapter canaries stay aligned with the rest
of the package release set.

## What Changed

- Set `packages/adapters/grok-local` / `@paperclipai/adapter-grok-local`
to `publishFromCi: true` in `scripts/release-package-manifest.json`.

## Verification

- `node ./scripts/release-package-map.mjs check`
- `node ./scripts/release-package-map.mjs list | grep
'@paperclipai/adapter-grok-local'`
- `pnpm test:release-registry`

## Risks

- Low risk: this is a release manifest-only change.
- Future canary releases will attempt to publish
`@paperclipai/adapter-grok-local`; this assumes the package remains
publishable and trusted publishing/package access are correctly
configured for the existing npm package.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent with shell, git, and GitHub
tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-16 20:34:28 -05:00
Dotta 93cd933f79 docs: add v2026.517.0 release changelog (#6150)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Stable releases need a reviewer-readable changelog artifact that
summarizes what changed for operators and contributors.
> - The last stable release tag is `v2026.513.0`, and the requested
release date is 2026-05-17.
> - The release changelog skill maps that date to stable version
`2026.517.0` and requires the range `v2026.513.0..origin/master`.
> - I reviewed the commits, PR metadata, migration/API surfaces, and
contributor attribution in that range.
> - This pull request adds the `releases/v2026.517.0.md` changelog for
human release sign-off.
> - The benefit is a stable release note artifact that is ready to
review before publishing the release.

## What Changed

- Added `releases/v2026.517.0.md` for the 2026-05-17 stable release.
- Summarized user-facing highlights, improvements, fixes, and
contributor attribution from `v2026.513.0..origin/master`.
- Omitted Breaking Changes and Upgrade Guide sections after checking for
destructive migrations, removed API surfaces, and breaking-change commit
signals.

## Verification

- `./scripts/release.sh stable --date 2026-05-17 --print-version` ->
`2026.517.0`.
- `git tag --list 'v*' --sort=-version:refname | head -10` confirmed
`v2026.513.0` is the latest stable tag.
- `git log v2026.513.0..origin/master --oneline --no-merges` reviewed
the 16 release-range commits.
- `git diff --name-only v2026.513.0..origin/master -- .changeset`
returned no changeset files.
- `git log v2026.513.0..origin/master --format='%s' | rg -n 'BREAKING
CHANGE|BREAKING:|^[a-z]+!:' || true` returned no breaking-change
signals.
- `git diff --name-only v2026.513.0..origin/master --
packages/db/src/migrations packages/db/src/schema server/src/routes
server/src/api server/src` reviewed the DB/API/server touchpoints; only
an additive document-lock migration appeared in the DB schema/migration
path.
- `test -f releases/v2026.517.0.md` passed.
- `rg -n -- '-canary|canary/' releases/v2026.517.0.md || true` returned
no canary filename/title language.
- `git diff --cached --check` passed before commit.

## Risks

- Low risk: docs-only release changelog addition.
- Changelog grouping is editorial; reviewers may want wording or
prioritization changes before release publication.
- Contributor attribution intentionally excludes bot accounts and
Paperclip founders from the Contributors section per the release
changelog skill.

> Checked [`ROADMAP.md`](ROADMAP.md); this is release documentation, not
new roadmap-level core feature work.

## Model Used

- OpenAI Codex, GPT-5 coding agent via Paperclip `codex_local`, with
shell, git, GitHub CLI, and repository file editing enabled. Exact
backend sub-version and context window were not surfaced by the
Paperclip runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-16 19:32:24 -05:00
Devin Foley 573e9ec909 fix(grok-local): restore turn boundaries in streaming reasoning text (#6142)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The `grok-local` adapter streams reasoning text to the issue
"Working..." panel as the grok CLI runs
> - The `grok` CLI's `--output-format streaming-json` mode silently
drops the `\n` separator between reasoning turns around tool calls
> - Consecutive `thought` chunks (e.g. `` "`" `` followed by `"The"`)
arrive with no intervening whitespace event, so the UI's `delta: true`
concatenator merged them into run-on text like `"…planningGreat, now I
have the issue descriptionThe only co"`
> - This PR adds a small turn-boundary helper that detects sentence
boundaries in the upstream `thought` stream and inserts a single `\n`
only when the previous chunk ended with sentence punctuation (or a
balanced closing backtick) AND the next chunk begins a new uppercase
sentence
> - The benefit is readable streaming reasoning in the UI without
changing how completed messages are stored

## What Changed

- Added `packages/adapters/grok-local/src/shared/turn-boundary.ts` with
per-stream state (last chunk + backtick parity) and a
`restoreTurnBoundary()` helper that inserts `\n` only between balanced,
sentence-terminated `thought` chunks
- Wired the helper into `parseGrokJsonl` (server) and added a new
`createGrokStdoutParser` factory used by `grokLocalUIAdapter` for the
live "Working..." panel
- Added focused tests in `shared/turn-boundary.test.ts`, plus regression
assertions in `server/parse.test.ts` and `ui/parse-stdout.test.ts`

## Verification

- `pnpm --filter @paperclip/grok-local test` — 23/23 adapter tests pass
- `pnpm --filter @paperclip/grok-local typecheck` and UI typecheck —
clean
- Replayed an actual broken `grok 0.1.210` stream from the report;
previously-merged boundaries (`` `ls`The ``, `returned:Confirmed`) now
render with a separating newline; chunks inside un-closed backtick spans
are left alone

## Risks

- Low risk. Boundary insertion only fires when prev ends with
`.`/`!`/`?`/balanced `` ` `` and next begins with an uppercase ≥2-char
word, with no whitespace on either side. Worst case: a rare missed split
or a misplaced newline inside reasoning — both purely cosmetic and
confined to the live streaming panel.

## Model Used

- Claude Opus 4.7 (claude-opus-4-7), Anthropic, extended thinking + tool
use via Claude Code

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-16 11:48:51 -07:00
Devin Foley 81d18f2d77 ci: speed up PR verify workflow (#6137)
## Thinking Path

> - Paperclip orchestrates AI agents through a control-plane repo that
relies on GitHub Actions as part of its release and verification safety
net.
> - The PR workflow in `.github/workflows/pr.yml` is the core CI path
protecting pull requests before merge.
> - Baseline measurement work in [PAPA-335](/PAPA/issues/PAPA-335)
showed the old single `verify` job was the critical-path bottleneck,
with general tests and build serialized together.
> - Follow-up implementation in [PAPA-338](/PAPA/issues/PAPA-338) and
[PAPA-339](/PAPA/issues/PAPA-339) split that work into parallel lanes
and removed redundant clean-runner prebuild work.
> - [PAPA-340](/PAPA/issues/PAPA-340) now needs real post-change PR
workflow evidence, not local inference, to compare against the May 15,
2026 baseline and decide whether phase-2 work is still justified.
> - This pull request publishes the already-implemented CI speedup
branch so GitHub can run the actual `PR` workflow against it.
> - The benefit is that CI timing decisions are based on measured runs
from the exact workflow shape we intend to ship.

## What Changed

- Split the PR workflow so `policy` fans out into separate `Typecheck +
Release Registry`, grouped `General tests`, and `Build` jobs.
- Kept the serialized server matrix, canary dry run, and e2e jobs intact
while removing the old monolithic `verify` bottleneck.
- Reworked grouped general-test execution in
`scripts/run-vitest-stable.mjs` so the workflow can run balanced
non-serialized lanes.
- Replaced redundant clean-runner prebuild gates with the idempotent
`ensure-build-deps` path used by the relevant CI entrypoints.

## Verification

- `ruby -e "require 'yaml'; YAML.load_file('.github/workflows/pr.yml');
puts 'yaml-ok'"`
- `node scripts/run-vitest-stable.mjs --mode general --dry-run`
- `node scripts/run-vitest-stable.mjs --mode general --group
general-server --dry-run`
- `node scripts/run-vitest-stable.mjs --mode general --group
general-workspaces-a --dry-run`
- `node scripts/run-vitest-stable.mjs --mode general --group
general-workspaces-b --dry-run`
- `pnpm test:run:general -- --group general-workspaces-b`
- `pnpm test:run:general -- --group general-workspaces-a`
- `pnpm test:run:general -- --group general-server`
- `pnpm run typecheck:build-gaps`
- `pnpm --filter @paperclipai/plugin-hello-world-example typecheck`

## Risks

- Required-check and branch-protection settings may still reference the
old single `verify` job name.
- Parallel CI lanes can expose hidden ordering assumptions or
clean-runner bootstrap gaps that local grouped dry-runs did not surface.
- Because the branch is behind current `master`, merge conflicts or
unrelated upstream drift could affect the measured runtime until the
branch is rebased.

> Checked `ROADMAP.md`; this work is CI throughput maintenance for the
existing PR verification path, not duplicate feature work.

## Model Used

- OpenAI Codex via Paperclip `codex_local`, GPT-5-class coding agent
with repository read/write, shell execution, and GitHub CLI/tool use.
The runtime does not expose a more specific backend model ID in-session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-16 11:28:25 -07:00
github-actions[bot] 9b6d2e6b79 chore(lockfile): refresh pnpm-lock.yaml (#6136)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-16 10:13:22 -07:00
Devin Foley ab8b471685 Add built-in grok_local adapter (#6087)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, so
adapter quality directly affects what runtimes the control plane can
supervise.
> - Local CLI adapters are one of the core execution surfaces because
they turn real coding tools into Paperclip-managed employees with
heartbeats, transcripts, and reviewability.
> - Grok Build was installed on the Paperclip host, but Paperclip had no
built-in `grok_local` adapter, so the runtime could not be configured
through the normal server/UI/CLI adapter path.
> - That gap needed to be closed with the same built-in registry,
environment diagnostics, transcript parsing, and skill/instructions
behavior that the other local adapters already rely on.
> - After the initial adapter landed, a real follow-up run showed that
Grok streaming text was being rendered one fragment per line, which made
transcripts harder to read even though the runtime itself was working.
> - This pull request adds the built-in `grok_local` adapter end-to-end
and then fixes the transcript parser so streamed Grok output is
coalesced into readable assistant/thinking blocks.
> - The benefit is that Grok Build becomes a first-class Paperclip
runtime with a usable operator experience instead of a partially wired
runtime with noisy transcript output.

## What Changed

- Added a new built-in `@paperclipai/adapter-grok-local` package with
server, UI, and CLI entrypoints.
- Implemented Grok execution, session handling, environment diagnostics,
config building, skill syncing, and parser coverage inside the new
adapter package.
- Registered `grok_local` across the built-in adapter inventories and
capability/display metadata in server, UI, CLI, and shared constants.
- Added adapter route coverage for the new built-in type.
- Fixed Grok transcript readability by emitting streamed `text` and
`thought` fragments as deltas so the shared transcript builder coalesces
them into readable message blocks.
- Added regression tests for the Grok parser and transcript coalescing
behavior.

## Verification

- `pnpm vitest run
packages/adapters/grok-local/src/ui/parse-stdout.test.ts
ui/src/adapters/transcript.test.ts`
- `pnpm --filter @paperclipai/adapter-grok-local build`
- Manual runtime verification on the Paperclip host during
implementation and follow-up review:
  - confirmed the Grok CLI was installed and authenticated
- confirmed the worktree dev server could be restarted cleanly and
health-checked after the parser follow-up
- No screenshots attached. This change is primarily adapter plumbing
plus transcript formatting behavior; reviewers can verify via the
Grok-backed run surfaces directly.

## Risks

- This adds a new built-in adapter, so any missed registration surface
could create inconsistencies between server, UI, and CLI behavior.
- The adapter depends on Grok Build's current event/output shape; if
upstream Grok streaming JSON changes, transcript parsing or session
extraction may need follow-up updates.
- The transcript readability fix intentionally changes how Grok
fragments are grouped, so any downstream code that implicitly expected
one entry per fragment would behave differently.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agent runtime.
- GPT-5-class coding model with tool use, shell execution, file editing,
and repo inspection enabled.
- Exact backend model ID/context window were not surfaced to the agent
in this Paperclip session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-16 09:51:09 -07:00
Dotta 63821bfe4c [codex] Add full locale catalog (#6070)
## Thinking Path

> - Paperclip orchestrates AI agents through a board-facing control
plane.
> - The UI is the operator surface where onboarding and company creation
flows need to remain understandable across languages.
> - The i18next foundation now supports locale resource loading and
validation, but only English was present on `master`.
> - The branch exists to populate that foundation with the supported
language catalog without changing routing, data contracts, or runtime
behavior.
> - This pull request adds locale JSON resources for the current
non-English language set.
> - The benefit is that future locale selection work has validated
message catalogs ready for the first translated onboarding strings.

## What Changed

- Added 39 localized message catalogs under `ui/src/i18n/locales/` for
the existing no-companies onboarding strings.
- Kept the PR rebased onto current `master` so it only contains the
all-languages layer on top of the already-merged i18next foundation.

## Verification

- `pnpm exec vitest run ui/src/i18n/locale-validation.test.ts`
- Checked `ROADMAP.md`; this is scoped locale catalog follow-up work and
does not duplicate a listed roadmap feature.
- No before/after screenshots included because this PR only adds
resource JSON files and does not change rendered layout or visible
default-English UI behavior.

## Risks

- Low risk: static JSON resource additions validated against the English
reference schema.
- Translation quality may need native-speaker review before enabling
visible locale switching broadly.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex CLI, GPT-5 family coding agent, tool-enabled repository
access, medium reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-16 08:24:31 -05:00
Dotta 4c47eb46c3 [codex] Add multilingual issue preservation coverage (#6069)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies.
> - Agents and board operators coordinate through company-scoped issues,
comments, documents, and heartbeat wake payloads.
> - Chinese, Japanese, and Hindi text needs to survive the full issue
lifecycle without normalization or prompt serialization damage.
> - The riskiest paths are board issue creation, server
issue/comment/document round-tripping, and scoped wake prompt rendering.
> - This pull request adds focused regression coverage across those
surfaces.
> - The benefit is higher confidence that multilingual operators and
agents can create, search, comment on, complete, and wake on issues
using non-Latin text.

## What Changed

- Added adapter-utils wake payload and prompt rendering coverage for
Chinese, Japanese, and Hindi issue/comment text.
- Added UI New Issue dialog coverage proving multilingual title and
description text is submitted unchanged.
- Added server route coverage that round-trips multilingual issue text
through create, search, comments, documents, completion comments, and
heartbeat context.
- Addressed Greptile feedback by using a typed storage mock and
splitting the server route integration path into smaller ordered
assertions.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
ui/src/components/NewIssueDialog.test.tsx
server/src/__tests__/multilingual-issues-routes.test.ts`
- Result: 3 test files passed, 51 tests passed.

## Risks

- Low risk: this PR adds regression coverage only and does not change
runtime behavior.
- The new server test uses embedded Postgres support and skips on
unsupported hosts using the existing helper pattern.
- No migrations are included.
- No `pnpm-lock.yaml` changes are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 based coding agent, with shell, git, Vitest, and
GitHub connector/CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-15 12:49:57 -05:00
Dotta e2d7263b07 [codex] Add minimal i18next i18n foundation (#5943)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a web control
plane.
> - The UI currently renders operator-facing copy directly from React
components.
> - Internationalization needs a smallest-possible starting point before
broader locale work can proceed.
> - The package declarations for `i18next` and `react-i18next` landed
separately, so this PR can stay focused on the implementation slice.
> - The implementation keeps the first surface English-only and
deliberately tiny while using the established `i18next` +
`react-i18next` runtime.
> - Future language contributions should be able to add a single locale
JSON file, with validation guarding key shape, interpolation parity,
suspicious payloads, and string length.
> - Locale strings must remain display-only UI copy and must not flow
into prompts, agent instructions, tool calls, shell commands, issue
content, approvals, adapter config, or other LLM-visible control paths.

## What Changed

- Initialized `i18next` behind the existing `@/i18n` boundary with fixed
English resources, fallback English, no detector plugin, no backend
plugin, no language picker, and no rich-text translation component.
- Kept `ui/src/i18n/locales/en.json` as the English source locale and
converted the validated JSON locale registry into i18next resources
before app rendering.
- Routed the no-companies start page title, description, and button
through `t(key, { defaultValue })` while preserving unchanged rendered
English copy.
- Added locale validation and focused Vitest coverage for missing/extra
keys, non-string leaves, interpolation parity, suspicious
executable/link payloads, and length caps.
- Addressed Greptile i18n review feedback: case-insensitive
event-handler detection, multi-violation diagnostics,
future-locale-friendly registration test, surfaced i18next init errors,
and removed the redundant side-effect import.
- Rebasing note: rebased onto current `public-gh/master` after the
package-only PR landed; this PR no longer changes `ui/package.json` or
`pnpm-lock.yaml`.

## Verification

- `pnpm install --no-lockfile --ignore-scripts` to install local
dependencies without reading or writing `pnpm-lock.yaml`.
- `pnpm --filter @paperclipai/ui exec vitest run
src/i18n/locale-validation.test.ts` -> passed, 7 tests.
- `pnpm --filter @paperclipai/ui typecheck` -> passed.
- `git diff --name-only public-gh/master...HEAD` shows only i18n
implementation files and the touched App copy call site; no package
manifest or lockfile changes remain in this PR.
- Visual impact is intentionally unchanged for the touched no-companies
copy because the English translations match the previous literal
strings.

## Risks

- Locale validation reduces prompt-injection risk, but the main safety
invariant is architectural: locale strings must remain display-only and
must never be used as LLM-visible control text.
- This intentionally does not add non-English locales, a language
picker, browser detection, HTTP/backend locale loading, server
localization, adapter localization, broad copy migration, or new package
scripts.
- Repository-wide CI may still depend on the separate lockfile-refresh
workflow for the already-merged package declaration, but this PR no
longer introduces package manifest or lockfile changes itself.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5, tool-enabled coding agent in medium reasoning
mode.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why screenshots are not applicable because
there is no intended visual change
- [x] I have updated relevant documentation to reflect my changes, or
confirmed no docs changed because behavior/commands did not change
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-15 11:11:02 -05:00
Emad Ibrahim afb73ba553 Scale issue kanban board for high-volume columns (#5309)
## Thinking Path

> - Paperclip is a control plane for autonomous AI-agent companies, and
the board UI needs to keep operator visibility clear as company work
scales.
> - The involved subsystem is the Issues page board mode, specifically
the Kanban rendering path for issue status columns.
> - The current board keeps the classic Kanban model, but high-volume
columns can become tall, slow, and hard to scan when hundreds of issues
are loaded.
> - We explored alternatives and chose the conservative Scaled Kanban
direction: preserve status lanes and drag/drop, but bound visible cards
and collapse low-signal lanes.
> - This pull request adds UI-only density controls and high-volume
defaults rather than introducing schema or API changes.
> - The benefit is a board that remains usable with large issue
inventories while keeping active workflow lanes visible.

## What Changed

- Added scaled Kanban behavior with compact cards, collapsed cold-lane
rails, per-column visible-card limits, and per-column "show more" reveal
controls.
- Added persisted board density preferences to the Issues page view
state, scoped through the existing company-specific localStorage path.
- Added board toolbar controls for compact cards, collapsed cold lanes,
cards-per-column page size (`10`, `25`, `50`), and density reset.
- Added a design spec and implementation plan under `doc/plans/`.
- Added focused Vitest coverage for `KanbanBoard` and `IssuesList`
high-volume board behavior.

## Verification

- `pnpm exec vitest run ui/src/components/IssuesList.test.tsx
ui/src/components/KanbanBoard.test.tsx` — pass, 35 tests.
- `pnpm -r typecheck` — pass.
- `pnpm build` — pass before the upstream merge; not rerun after
docs/assets cleanup.
- `curl -fsS http://127.0.0.1:3100/api/health` — pass against restarted
local dev server after applying pending migration
`0078_white_darwin.sql`.
- `pnpm test:run` — previously failed in unrelated Cursor remote-sandbox
server tests:
- `server/src/__tests__/cursor-local-adapter-environment.test.ts`:
expected probe status `pass`, received `fail`.
- `server/src/__tests__/cursor-local-execute.test.ts`: two remote
sandbox execution cases exited `127` instead of `0`.

Local dev server for manual UI inspection: `http://127.0.0.1:3100`.

Screenshots were captured for review and attached in the PR thread
rather than committed to source.

## Risks

- Low schema/API risk: this is UI-only and uses the existing issue data
path.
- Board users may need to notice the new density controls if they want
to override high-volume defaults.
- Collapsed cold lanes remain valid drop targets, so status moves can
happen without expanding the destination lane.
- Very large remote columns can still hit the existing 200-item
per-column query cap; this PR improves rendering, not server-side board
pagination.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with repository tool use,
shell execution, local test/build execution, and inline implementation
planning. No subagents were used.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-15 10:53:09 -05:00
github-actions[bot] 7e1a27c8ec chore(lockfile): refresh pnpm-lock.yaml (#6062)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-15 10:32:10 -05:00
Dotta d5ba3348a9 [codex] Add UI i18n runtime packages (#6058)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a web control
plane.
> - The UI i18n slice needs `i18next` and `react-i18next` available as
runtime packages before the implementation PR can stay focused on code
changes.
> - The implementation PR should not mix package declaration work with
Greptile-driven i18n code feedback.
> - This pull request isolates only the package manifest additions
requested by the maintainer.
> - The benefit is a tiny dependency-declaration PR that can be reviewed
and merged independently before rebasing the i18n implementation PR.

## What Changed

- Added `i18next` to `ui/package.json` dependencies.
- Added `react-i18next` to `ui/package.json` dependencies.
- Intentionally did not change `pnpm-lock.yaml`, matching the repository
policy that PRs do not commit lockfile changes.

## Verification

- `node -e
"JSON.parse(require('fs').readFileSync('ui/package.json','utf8'));
console.log('ui/package.json valid JSON')"`
- `git diff --name-only public-gh/master...HEAD` shows only
`ui/package.json`.
- `npm view i18next version` -> `26.2.0`.
- `npm view i18next@26.1.0 version` -> `26.1.0`.
- `npm view react-i18next version` -> `17.0.8`.
- `npm view react-i18next@17.0.7 version` -> `17.0.7`.
- Did not run `pnpm install --frozen-lockfile` because this PR
intentionally changes only `ui/package.json` and leaves lockfile
handling to the repository's separate lockfile workflow.

## Risks

- CI jobs that run `pnpm install --frozen-lockfile` may fail until the
repository lockfile workflow handles these package declarations.
- Low behavioral risk: this PR does not import or execute the packages
and changes no runtime code.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5, tool-enabled coding agent in medium reasoning
mode.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run the applicable local validation for this manifest-only
change
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why screenshots are not applicable because
there is no runtime UI change
- [x] I have updated relevant documentation to reflect my changes, or
confirmed no docs changed because behavior/commands did not change
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-15 10:31:01 -05:00
Dotta eb38b226c2 Fix LLM Wiki package and migration validation (#6010)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Plugins extend the control plane with optional capabilities such as
LLM Wiki.
> - LLM Wiki needs its package assets and plugin-owned database
migrations to work when installed from the packaged plugin.
> - The bundled spaces migration used validation-hostile dynamic SQL,
and the packaged plugin could omit non-dist runtime assets.
> - This pull request makes the LLM Wiki package include its required
assets and cuts the spaces migration over to explicit, idempotent SQL
that passes the production plugin database validator.
> - The benefit is a simpler plugin install path that validates and
applies the bundled LLM Wiki migrations without adding plugin-specific
legacy handling to Paperclip core.

## What Changed

- Added the LLM Wiki package asset allowlist so agents, migrations,
skills, templates, dist output, and README are included when packaged.
- Renamed the bootstrap `.gitignore` template to `gitignore.template`
and updated the runtime lookup so package tooling does not drop the
hidden template file.
- Relaxed plugin migration validation to allow namespace-scoped
`INSERT`/`UPDATE` backfills and `CREATE INDEX` statements while
continuing to reject destructive or cross-namespace SQL.
- Replaced the LLM Wiki spaces migration's dynamic constraint-drop DO
block with explicit `DROP CONSTRAINT IF EXISTS` statements.
- Replaced fragile regex-source dispatch in SQL reference extraction
with explicit capture-group descriptors.
- Added regression coverage that applies the bundled LLM Wiki migrations
through the production validator and checks the expected constraints.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/plugin-database.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/plugin-llm-wiki build`
- `git diff --check`
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.

## Risks

- Low migration risk for current users: LLM Wiki spaces are new, so this
intentionally cuts over the plugin migration instead of adding legacy
handling in core.
- Validator behavior is broader than before, but still requires fully
qualified plugin namespace targets, blocks deletes/destructive DDL, and
keeps public table access read-only and allowlisted.

> Checked [`ROADMAP.md`](ROADMAP.md); this is a targeted plugin
packaging/migration fix and does not duplicate planned core feature
work. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 based coding agent, tool-enabled local repo
access, reasoning mode managed by the Paperclip/Codex runtime. Exact
context window was not surfaced in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-15 10:20:02 -05:00
Dotta dfcebf082b [codex] Refresh issue documents from live updates (#6005)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The board UI keeps issue pages responsive by subscribing to live
activity events and invalidating TanStack Query caches.
> - Issue documents are first-class issue artifacts, but document
activity events were not refreshing the document list, active document,
or revision caches.
> - That meant a user could update a document on an issue and another
open board would keep showing stale document content until a page
reload.
> - This pull request routes issue document activity events through the
same live invalidation path used for issue and comment updates.
> - The benefit is that issue document changes become visible
automatically on active issue pages without forcing operators to reload
the board.

## What Changed

- Added live-update cache invalidation for `issue.document_created`,
`issue.document_updated`, `issue.document_restored`, and
`issue.document_deleted` activity events.
- Invalidated the issue document list, the active document cache, and
document revisions for both issue id and identifier references when the
activity payload includes a document key.
- Added regression coverage for document activity events so active issue
pages refetch document caches without inactive-only behavior.
- Simplified the document invalidation test mock after Greptile feedback
so the test only models the cache reads it actually uses.

## Verification

- `git rebase public-gh/master` reported the branch was up to date after
fetching `public-gh/master`.
- `pnpm run preflight:workspace-links` passed.
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/context/LiveUpdatesProvider.test.ts` passed: 1 file, 16 tests.
- `pnpm --filter @paperclipai/ui typecheck` passed.
- PR checks passed on `eecd19f7b0355490f17314c94bffa06aada8f9e3`:
`policy`, `verify`, `e2e`, all 4 serialized server shards, `Canary Dry
Run`, `security/snyk`, and `Greptile Review`.
- Greptile completed with `5/5` confidence and no unresolved review
threads.

## Risks

- Low risk. This expands cache invalidation for existing live activity
events and does not change API contracts, database schema, migrations,
or document persistence behavior.
- No migrations or `pnpm-lock.yaml` changes are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local repository
workflow.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] No visible UI layout changed; screenshots are not applicable for
live cache invalidation behavior
- [x] No documentation changes were needed for this internal UI cache
refresh fix
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-15 08:55:54 -05:00
Dotta 03ad5c5bea [codex] Add issue document locking (#6009)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through company-scoped
issues, comments, and issue documents.
> - Issue documents are the durable place where plans, handoffs, and
other work artifacts are revised over time.
> - Some documents need to be preserved as operator-approved snapshots
while agents continue working on the same issue.
> - Without document locking, a later board or agent write can overwrite
the document key that reviewers expected to remain stable.
> - This pull request adds board-managed issue document locks and makes
agent writes to locked keys create a derived document instead of
mutating the locked document.
> - The benefit is safer document handoffs: approved or frozen issue
documents stay immutable until the board explicitly unlocks them.

## What Changed

- Added `locked_at`, `locked_by_agent_id`, and `locked_by_user_id`
document fields plus migration `0085_tranquil_the_executioner.sql`.
- Added document lock/unlock service behavior, route endpoints, activity
events, and locked-document write protections.
- Made agent document writes to locked keys create a new derived key
such as `plan-2` rather than overwriting the locked document.
- Surfaced lock state through shared issue document types, UI API
methods, document header lock controls, and activity formatting.
- Added server and UI tests for lock/unlock behavior, locked document
immutability, and UI action visibility.
- Updated `doc/SPEC-implementation.md` with the V1 document lock
contract and endpoints.

## Verification

- `git rebase public-gh/master` completed cleanly after committing the
branch changes.
- `git diff --check` passed before commit.
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/documents-service.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
ui/src/components/IssueDocumentsSection.test.tsx
ui/src/components/IssueContinuationHandoff.test.tsx
ui/src/lib/document-revisions.test.ts` passed: 5 files, 32 tests.

## Risks

- Medium risk because this changes the document persistence contract and
adds a migration.
- The migration uses `ADD COLUMN IF NOT EXISTS` and guarded foreign-key
creation so it remains safe for users who may have already applied an
earlier copy of the migration.
- Locked documents intentionally reject board edits/deletes/restores
until unlocked; any existing workflows that expected direct overwrite
need to unlock first.
- Agent writes to locked keys now create derived documents, which may
create extra issue documents when agents retry locked writes.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with tool use and local code
execution in the Paperclip worktree.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-15 08:54:55 -05:00
Devin Foley 901c088e14 fix: propagate projectId into wakeup context and support identifier lookup (#6026)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The server's heartbeat/wakeup pipeline resolves which project
workspace an agent run should bind to
> - `enqueueWakeup` resolves an issue (and therefore a project) before
scheduling a run, but the resolved `projectId` was never written back
into `enrichedContextSnapshot.projectId`, so `resolveWorkspaceForRun`
always saw `contextProjectId === null`
> - When the `issueProjectRef` DB lookup also returned null (e.g.
identifier-style id like `ENV-13`, not a UUID), workspace resolution
fell through to the `agent_home` fallback instead of the correct project
workspace
> - Surfaced while running the QA matrix on sandbox/SSH — runs were
ending up in the wrong workspace
> - This pull request stores the resolved `projectId` back into context
and replaces the raw UUID-only DB query with `issuesSvc.getById`, which
accepts both UUIDs and identifiers and canonicalizes `context.issueId` /
`context.taskId` to the UUID on identifier hits
> - The benefit is that wakeups triggered with identifier-style ids
correctly bind to their project workspace instead of silently degrading
to `agent_home`

## What Changed

- In `enqueueWakeup`, after the issue resolves, write `projectId` back
into `enrichedContextSnapshot.projectId` so downstream workspace
resolution can use it.
- Replace the raw UUID-only DB query for the issue with
`issuesSvc.getById`, which handles both UUIDs and identifiers (e.g.
`ENV-13`).
- On an identifier hit, canonicalize `context.issueId` and
`context.taskId` to the resolved UUID.

## Verification

- Trigger a wakeup with an identifier-style id (`ENV-13`) on the dev
instance and confirm the run binds to the correct project workspace
instead of `agent_home`.
- Confirm UUID-style wakeups still resolve to the same project workspace
as before.

## Risks

- Low risk. Scope is a single function in
`server/src/services/heartbeat.ts` (+20/-7). Failure mode if regressed
is the prior behavior (fallback to `agent_home`).

## Model Used

- Claude (Anthropic), `claude-opus-4-7`, via Claude Code / Paperclip
`claude_local` adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-14 22:09:16 -07:00
Dotta 333a16b035 Fix company export with missing run logs (#5960)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Company export/import lets operators move company state, including
issue threads and agent execution context, between Paperclip instances.
> - Issue comments can be enriched by nearby heartbeat run logs so
exported threads preserve useful agent/run attribution metadata.
> - Some local instances can have heartbeat run database rows whose
local log files were deleted or never copied into the current workspace.
> - The export path should still include the original user comments
instead of failing because optional run-log metadata is unavailable.
> - This pull request makes comment run-log metadata derivation tolerate
missing local log files, logs the missing-file condition for operators,
and adds a regression test.
> - The benefit is safer company exports for real instances with
incomplete local run-log storage.

## What Changed

- Treat missing local heartbeat run logs as absent optional metadata
while listing issue comments.
- Emit a structured warning with `runId` and `logRef` when optional
comment-attribution log content is missing.
- Preserve the existing error behavior for non-404 run-log read
failures.
- Added a regression test proving user comments still list when a
candidate attribution run has a missing local log reference.

## Verification

- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t
"candidate attribution run log is missing"` passed: 1 selected test
passed, 47 skipped.
- `pnpm --filter @paperclipai/server typecheck` passed.
- Greptile Review passed with Confidence Score 5/5 and zero unresolved
threads on commit `f68cac02bf98d7d31e7831e5bdfa95cffa85e254`.
- GitHub PR workflow run succeeded: `policy`, `verify`, four serialized
server suites, `e2e`, and `Canary Dry Run` all passed.
- `security/snyk (cryppadotta)` passed.
- Confirmed this branch is on top of `public-gh/master` and
`pnpm-lock.yaml` is not in the PR diff.

## Risks

- Low risk. The change only softens optional comment metadata derivation
for 404/missing local log files; other log read errors still throw.
- Exported comments in this edge case may lack derived run metadata, but
they remain visible/exportable instead of failing the request.
- Operators may see new warnings when historical run-log references
point to missing local files; those warnings indicate degraded optional
metadata, not data loss.

## Model Used

- OpenAI Codex, GPT-5 coding agent in this Paperclip heartbeat, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-14 08:37:04 -05:00
Devin Foley 1bd44c8a0d Harden Cloudflare sandbox execution (#5967)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Remote-managed adapters need sandbox/environment execution to behave
like real agent runs, not just local host probes.
> - The Cloudflare sandbox path was the weakest leg in the SSH +
Cloudflare QA matrix because bridge execution could truncate output,
time out long-running installs, and under-provision the worker instance.
> - That made several adapters fail for reasons unrelated to their
actual business logic, which blocks confidence in Paperclip's non-local
environment model.
> - This pull request hardens the Cloudflare bridge/runtime path and
adjusts sandbox probe budgets so adapter verification matches the
measured behavior of the fixed environment.
> - It also corrects the Pi sandbox install command so the QA matrix
exercises a real, supported install path.
> - The benefit is a materially more reliable SSH + Cloudflare adapter
matrix with fewer false negatives and clearer failure boundaries.

## What Changed

- Switched the Cloudflare bridge worker instance type to `standard-2`
for the QA-matrix execution path.
- Raised Cloudflare bridge/plugin-worker timeout budgets and added SSE
keepalives so long-running install/exec calls can complete instead of
dying at the transport layer.
- Fixed Cloudflare bridge-channel command handling to avoid dropped
final stdout chunks on short-lived execs.
- Made Claude, OpenCode, and Cursor sandbox probe timeouts
configurable/sandbox-aware, then tightened the defaults to the measured
post-fix range.
- Updated the Pi sandbox install command to use the package currently
installed by the official `pi.dev` installer, pinned to a specific npm
version.
- Added/updated tests around Cloudflare bridge behavior and adapter
sandbox probe paths.

## Verification

- `pnpm --filter @paperclipai/adapter-claude-local typecheck`
- `pnpm --filter @paperclipai/adapter-opencode-local typecheck`
- `pnpm --filter @paperclipai/adapter-cursor-local typecheck`
- `pnpm vitest run packages/adapters/cursor-local
packages/adapters/claude-local packages/adapters/opencode-local
packages/adapters/pi-local packages/plugins/sandbox-providers/cloudflare
server/src/services/__tests__/plugin-worker-manager.test.ts`
- Manual QA on the dedicated dev instance using the SSH + Cloudflare
environment matrix (`ENV-29` through `ENV-40`). Clean end-to-end passes:
SSH `claude_local`, `codex_local`, `cursor`, `gemini_local`; Cloudflare
`claude_local`, `codex_local`, `cursor`, `gemini_local`.

## Risks

- Cloudflare sandbox cost increases because the bridge worker now runs
on `standard-2` instead of `lite`.
- Higher timeout ceilings can delay surfacing truly hung Cloudflare
bridge calls, even though they remove transport-level false negatives.
- The manual heartbeat matrix still exposed follow-on
execution/sync/disposition bugs in `opencode_local` and `pi_local`;
those are not fixed by this PR.

## Model Used

- OpenAI `gpt-5.4` via Paperclip `codex_local`, reasoning effort `high`,
tool use enabled, repo search enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable)
- [x] I have updated relevant documentation to reflect my changes (not
applicable)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 22:00:10 -07:00
Dotta f4bed4a70f Release changelog v2026.513.0 (#5944)
## Summary

- Add `releases/v2026.513.0.md` covering the stable release range
`v2026.512.0..origin/master` (6 PRs).
- Includes one new DB migration (`0084_issue_recovery_actions`) under
the Upgrade Guide.
- No breaking changes detected; all PRs are core-maintainer commits so
the Contributors section is omitted.

## Highlights captured

- Source-scoped recovery actions
([#5599](https://github.com/paperclipai/paperclip/pull/5599))
- Blocked Inbox attention view
([#5603](https://github.com/paperclipai/paperclip/pull/5603))
- Local plugin development workflow
([#5821](https://github.com/paperclipai/paperclip/pull/5821))

## Test plan

- [ ] Reviewer confirms the highlight/improvement/fix categorization
matches release intent
- [ ] Reviewer confirms `0084_issue_recovery_actions` upgrade note is
accurate
- [ ] Reviewer signs off on `releases/v2026.513.0.md` for the stable
release cut

Generated under [PAP-9378](/PAP/issues/PAP-9378) via the
`release-changelog` skill.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 16:56:19 -05:00
Dotta 4142559c37 [codex] Add blocked inbox attention view (#5603)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies through
company-scoped issues, comments, approvals, and execution workspaces.
> - Operators need the Inbox to show not only active work, but also
blocked work that may need human or agent attention.
> - The existing inbox experience did not have a dedicated blocked-work
surface, so blocked tasks were harder to triage and resume deliberately.
> - Backend consumers also needed a compact attention signal that
distinguishes actionable blockers from covered or waiting blocker
states.
> - This pull request adds a Blocked Inbox tab backed by issue
blocker-attention metadata, shared validators, and UI helpers.
> - The benefit is a clearer triage path for stalled or blocked
Paperclip work without exposing external wait internals in the
operator-facing UI.

## What Changed

- Added shared issue blocker-attention types, validators, and exports
for the API/UI contract.
- Added backend blocker-attention computation and issue route support
for blocked inbox data.
- Added the Blocked Inbox tab, blocked reason chips, filtering/search
UI, responsive layouts, and Storybook stories.
- Updated inbox helpers and page behavior so toolbar controls only
appear where they apply.
- Added coverage for shared validators, server blocker-attention
behavior, blocked inbox UI helpers/components, and the Inbox page.
- Added a screenshot helper script for the blocked inbox Storybook
stories.
- Addressed Greptile feedback by making urgency sorting deterministic
for null stop times, avoiding full blocked-inbox list enrichment for
counts, and hardening the screenshot helper.

## Verification

- Rebased the branch cleanly onto `public-gh/master`.
- Confirmed the diff does not include `pnpm-lock.yaml`.
- Confirmed the diff does not include database migration files.
- Ran `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
ui/src/components/BlockedInboxView.test.tsx
ui/src/components/BlockedReasonChip.test.tsx
ui/src/lib/blockedInbox.test.ts ui/src/lib/inbox.test.ts
ui/src/pages/Inbox.test.tsx`.
- Ran `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`.
- Checked `ROADMAP.md`; this is scoped inbox/operator triage work and
does not duplicate a listed roadmap feature.
- Greptile Review is green on the latest head and all four Greptile
review threads are resolved.
- GitHub PR checks are green on the latest head: policy, security/snyk,
e2e, verify, Canary Dry Run, Greptile Review, and serialized server
suites 1/4 through 4/4.

## Risks

- Medium review surface because this touches the shared issue contract,
server issue services, and the Inbox UI together.
- Blocker-attention classification may need product tuning after
operators use it on real blocked queues.
- UI screenshots were not attached in this PR-opening pass; the branch
includes `scripts/screenshot-blocked-inbox.mjs` and Storybook stories
for visual capture.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex, GPT-5-based coding agent with shell, git, GitHub CLI,
GitHub connector, and Paperclip API tool use. Reasoning mode: medium.
Context window: not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 16:41:36 -05:00
Dotta d1a8c873b2 fix(remote-sandbox): harden host workspace resumes (#5922)
## Thinking Path

> - Paperclip orchestrates AI agents through a control plane while
adapters execute work in local, remote, or sandboxed runtimes.
> - Remote sandbox execution depends on a strict host-versus-remote
workspace boundary: the host prepares/restores files, while the adapter
command runs inside the sandbox cwd.
> - Jannes' PR #5823 identified host-side failure modes that were not
covered by replacement PR #5822.
> - Persisting a remote pod cwd in session params could poison the next
host heartbeat resume and make Paperclip inspect or upload system temp
roots.
> - Plugin sandbox providers also need a narrow way to receive
model-provider API keys without exposing the full server environment to
every plugin worker.
> - This pull request ports the host-side fixes from #5823 in the
current codebase style, with focused regression coverage.
> - The benefit is safer remote sandbox resumes and plugin worker
environment handling without broadening core plugin privileges.

## What Changed

- Persist host workspace cwd, not remote sandbox cwd, in `claude_local`
session params while retaining remote execution identity metadata.
- Reject saved session cwds that point at system roots before heartbeat
falls back to agent home workspace.
- Skip sockets, FIFOs, devices, and other non-file entries during
workspace restore snapshot capture/comparison.
- Pass a small model-provider API-key allowlist only to plugins
declaring `environment.drivers.register`.
- Added focused regression tests for remote Claude session params,
unsafe session cwd detection, plugin worker env filtering, and non-file
snapshot entries.

Credits: ports host-side fixes from Jannes' #5823.

## Verification

- `pnpm vitest run
packages/adapter-utils/src/workspace-restore-merge.test.ts
server/src/services/session-workspace-cwd.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/plugin-database.test.ts` (25 passed, 7 skipped by
existing embedded-Postgres host guard)
- `pnpm --filter @paperclipai/adapter-utils typecheck`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck`
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Low risk: changes are scoped to remote sandbox/session metadata,
workspace snapshot filtering, and plugin worker env setup.
- Sandbox-provider plugins now receive only the explicit model-provider
key allowlist; any provider needing another key name will need a
deliberate allowlist update.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled local code
execution and repository editing.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 16:23:04 -05:00
Dotta 012a738729 Add ordered sub-issue navigation (#5938)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through company-scoped
issues, comments, and execution context.
> - The issue detail page is the board surface where operators and
agents inspect a task in its parent/child workflow.
> - Ordered sub-issues need a low-friction way to move through work
without returning to the parent list after every issue.
> - Existing issue detail navigation only covered sibling transitions
and did not continue into a parent issue's first ordered child.
> - This pull request adds ordered previous/next navigation for issue
detail views and extends it to continue from a parent or last sibling
into the first direct child.
> - The benefit is a smoother review/execution path through hierarchical
work while preserving hidden issue filtering and dependency-aware
ordering.

## What Changed

- Added `IssueSiblingNavigation` and route-state handling so issue
detail footers can link to previous/next ordered issues.
- Extended sub-issue ordering helpers to build navigation from siblings
plus direct children, including root-parent and
last-sibling-to-first-child cases.
- Added page, component, and library tests for ordered sibling
navigation, child fallback navigation, hidden issues, and link
rendering.
- Fixed the quicklook blur/click race Greptile found by deferring close
until after portaled link clicks can complete, with a regression test.
- Polished the navigation landmark label so it remains accurate when the
next target is a direct child rather than a sibling.

## Verification

- `pnpm exec vitest run src/components/IssueLinkQuicklook.test.tsx
src/lib/issue-detail-subissues.test.ts
src/components/IssueSiblingNavigation.test.tsx
src/pages/IssueDetail.test.tsx --config vitest.config.ts` from `ui/` -
31 tests passed.
- `pnpm --filter @paperclipai/ui typecheck` - passed.
- `git diff --check` - passed.
- GitHub PR checks on latest head `34046be2` - passed: Greptile Review,
verify, e2e, Canary Dry Run, policy, Snyk, and serialized server shards.
- Screenshots: not captured in this heartbeat; this PR is a draft and
the changed states are covered by focused component/page tests.

## Risks

- Low risk; this is a UI navigation addition with no database or API
contract changes.
- The main behavioral risk is navigation ordering drift if
`workflowSort` expectations change later.
- The IssueDetail navigation now waits for child issue loading, which
avoids stale child fallback links but can delay footer navigation
briefly while data loads.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and shell
execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 15:43:51 -05:00
Dotta eb452fba30 Fix comment date binding regression (#5919)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and
issue comments are the primary durable communication surface between
operators and agents.
> - Commit `c445e592` (`fix(ui): fix message attribution for
agent-posted comments with user author IDs (#5780)`) added server-side
derived attribution for historical comments by scanning heartbeat runs
near comment timestamps.
> - That scan accidentally bound JavaScript `Date` objects directly into
postgres-js SQL fragments for the run timestamp window.
> - On real Postgres, that can fail while listing issue comments with
`ERR_INVALID_ARG_TYPE`, which makes comments disappear from issue pages
such as `PAP-9284`.
> - This pull request keeps the attribution behavior intact while
changing only the broken timestamp binding path.
> - The benefit is that comments load again without weakening the
conservative attribution recovery introduced by `c445e592`.

## What Changed

- Convert the derived-attribution heartbeat-run window bounds to ISO
timestamp strings before binding them into SQL, with explicit
`::timestamptz` casts.
- Add an embedded Postgres regression that inserts a heartbeat run and
user-authored comment, then verifies `issueService.listComments()`
returns the comment while the attribution scan runs.
- Delete `heartbeat_runs` during the issue service test cleanup before
deleting agents so the new test data does not leak across cases.

## Verification

- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t
"lists user comments when derived run attribution scans a timestamp
window"`
- `pnpm --filter @paperclipai/server typecheck`
- `git diff --check`

## Risks

- Low risk. The change is limited to how timestamp parameters are bound
for an existing query.
- The derived attribution logic remains conservative and still requires
exact run-log proof before relabeling a comment.
- The regression uses embedded Postgres so it covers the postgres-js
binding path that failed in production-like local runs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter; GPT-5
coding-agent family with local terminal, file-editing, and git/GitHub
CLI tool use. Exact hosted model deployment ID is not exposed by this
local adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: server-side comment API bugfix)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no documented behavior or command changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-13 12:56:51 -05:00
Dotta b947a7d76c [codex] Improve local plugin development workflow (#5821)
## Thinking Path

> - Paperclip is the control plane for autonomous AI-agent companies.
> - Plugins are the extension point for adding capabilities without
expanding the core product surface.
> - Local plugin development needed a tighter CLI-first loop so plugin
authors can scaffold, run, install, inspect, and reload plugins without
reaching into internal package paths.
> - The server plugin install path also needed local-path handling that
keeps plugin identity, dashboard routes, and development watchers
coherent.
> - This pull request adds the CLI scaffold/install workflow, fixes the
server and SDK edge cases that blocked that loop, and updates the
agent-facing plugin creation skill and docs.
> - The benefit is that contributors can develop plugins from local
folders with a documented, repeatable happy path.

## What Changed

- Added `paperclipai plugin init` coverage and CLI wiring for local
plugin scaffolding.
- Improved local plugin install handling, plugin key route resolution,
dashboard capability behavior, and dev watcher startup/reload behavior.
- Fixed plugin SDK worker entrypoint validation for symlinked package
layouts.
- Added targeted tests for plugin init, server plugin authz/watcher
behavior, SDK worker host validation, and the authoring smoke example.
- Added a short local plugin development guide and refreshed the plugin
authoring guide plus `paperclip-create-plugin` skill instructions.

## Verification

- `pnpm run preflight:workspace-links && pnpm --filter
@paperclipai/plugin-sdk build && pnpm --filter
@paperclipai/create-paperclip-plugin typecheck && pnpm --filter
paperclipai typecheck && pnpm --filter @paperclipai/plugin-sdk typecheck
&& pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project paperclipai
cli/src/__tests__/plugin-init.test.ts`
- `pnpm exec vitest run --project @paperclipai/plugin-sdk
packages/plugins/sdk/tests/worker-rpc-host.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/plugin-dev-watcher.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/plugin-routes-authz.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --dir packages/plugins/examples/plugin-authoring-smoke-example
test`
- Confirmed `pnpm-lock.yaml` is not included in the PR diff.

## Risks

- Medium risk: this touches plugin install routing, CLI command
behavior, and the local development watcher.
- Local path plugin installs execute trusted local code by design; the
new docs call out that trust boundary.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local shell and git
workflow, medium reasoning effort. Context window details were not
exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

UI screenshots: not applicable; this PR changes CLI/server/plugin docs
and tests, not board UI rendering.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-12 17:38:24 -05:00
Dotta 0808b388ee [codex] Add source-scoped recovery actions (#5599)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where work
must end with a clear disposition rather than ambiguous agent liveness.
> - Recovery currently detects stalled or missing-next-step issues, but
source issue recovery can become split across child recovery issues,
blockers, and comments.
> - That makes it harder for operators and agents to see who owns
recovery and what exact action is needed on the original issue.
> - Source-scoped recovery actions give the original issue a first-class
active recovery state with owner, evidence, wake policy, and resolution
outcome.
> - This pull request adds the recovery-action data model, backend
reconciliation and resolution APIs, and board UI indicators/actions.
> - The benefit is clearer stalled-work recovery without losing source
issue context or relying on comments as the liveness path.

## What Changed

- Added the `issue_recovery_actions` schema, shared
types/constants/validators, and an idempotent
`0084_issue_recovery_actions` migration ordered after current `master`
migrations.
- Updated stranded/missing-disposition recovery to create source-scoped
recovery actions, wake the recovery owner on the source issue, and avoid
locking the source issue for recovery-action wakes.
- Added API support for reading active recovery actions on issue
detail/list surfaces and resolving them with restored, blocked,
cancelled, or false-positive outcomes.
- Require blocked recovery resolutions to have an unresolved first-class
blocker, and removed the UI shortcut that could mark recovery blocked
without a blocker selection path.
- Surfaced recovery indicators/actions in the issue UI, blocker notices,
active run panels, issue rows, and Storybook coverage.
- Updated docs and focused tests for recovery semantics, ownership,
races, stale comments, and UI behavior.

## Verification

- `pnpm exec vitest run
server/src/__tests__/issue-recovery-actions.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
ui/src/components/IssueRecoveryActionCard.test.tsx
ui/src/components/IssueBlockedNotice.test.tsx ui/src/api/issues.test.ts`
— 5 files, 72 tests passed.
- `pnpm --filter @paperclipai/shared typecheck` — passed.
- `pnpm --filter @paperclipai/db typecheck` — passed, including
migration numbering check.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Follow-up verification after blocker-resolution guard: `pnpm exec
vitest run server/src/__tests__/issue-recovery-actions.test.ts
ui/src/components/IssueRecoveryActionCard.test.tsx
ui/src/api/issues.test.ts` — 3 files, 27 tests passed.
- Follow-up `pnpm --filter @paperclipai/server typecheck` — passed.
- Follow-up `pnpm --filter @paperclipai/ui typecheck` — passed.
- UI states are available in
`ui/storybook/stories/source-issue-recovery.stories.tsx`; screenshot
capture helper is `scripts/screenshot-recovery-card.cjs`.

## Risks

- Medium: recovery behavior changes from child recovery issue ownership
toward source-scoped actions, so operators may see stalled-work state in
new places.
- Migration risk is mitigated by using the next migration slot after
`master` and making the table/constraints/index creation idempotent for
anyone who previously applied the old branch-local
`0082_dizzy_master_mold` migration.
- Existing child recovery issue paths are still guarded for
already-created recovery issues, but new source-scoped flows should be
watched in CI and Greptile review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use enabled for shell, Git,
GitHub, and local test execution. Context window not exposed by the
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-12 09:37:15 -05:00
Devin Foley c445e59256 fix(ui): fix message attribution for agent-posted comments with user author IDs (#5780)
## Thinking Path

> - Paperclip’s issue chat is an audit surface: reviewers need to trust
who actually authored a message.
> - Some historical agent comments were persisted with `authorUserId`
and no surviving `createdByRunId`, so the UI rendered real agent output
as if it came from the board user.
> - A pure timestamp-window fallback is too risky because human
reviewers can comment while agents are running.
> - The safe recovery path is to derive attribution only when the server
can prove it from same-issue run logs that include the exact posted
comment id, then let the chat renderer prefer that recovered agent
attribution.
> - This keeps historical threads trustworthy without mutating old
database rows or guessing in ambiguous cases.

## What Changed

- Added shared `IssueComment` fields for derived attribution so server
and UI can carry recovered `derivedAuthorAgentId`,
`derivedCreatedByRunId`, and `derivedAuthorSource` consistently.
- Added server-side attribution recovery in
`server/src/services/issues.ts` that reads same-issue run logs and only
derives agent authorship when a run log contains the exact `comment id:
...` emitted during posting.
- Updated issue chat rendering in `ui/src/lib/issue-chat-messages.ts` to
prefer direct agent authorship, then activity-log `runAgentId`, then the
server-derived attribution.
- Removed the unsafe UI-only run-window fallback from
`ui/src/pages/IssueDetail.tsx` so human comments posted during an active
run are not silently relabeled as agent output.
- Added regression coverage for both the run-log derivation path and the
chat-rendering fallback behavior.
- Bounded server-side run-log enrichment to 8 concurrent reads per
request and removed the unused `issueCommentSchema` declaration during
PR cleanup.

## Verification

- `pnpm exec vitest run ui/src/lib/issue-chat-messages.test.ts
server/src/__tests__/issues-service.test.ts`
- `pnpm test:run:general`
- Live validation on May 12, 2026 in `PAPA-322`: confirmed the
previously misattributed historical comments on `PAPA-316` now render as
Claude-authored on `http://goldie.gerbil-company.ts.net:3100`.
- Reviewer check: open `PAPA-316` in the running instance and confirm
historical comments such as `## Investigation: exe.dev 422 + codex
re-test` render under Claude instead of the board user.

## Risks

- Low risk. The change is scoped to comment attribution recovery and
rendering.
- Derived attribution is intentionally conservative: if there is no
exact run-log proof, the comment remains user-authored instead of
guessing.
- Run-log recovery depends on retained same-issue logs, so older
comments without that evidence remain unchanged.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter (GPT-5-class
coding agent with tool use in the local Paperclip runtime; the exact
deployment/model ID is not surfaced by this workspace).

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-12 01:20:49 -07:00
Dotta 9746dab4e8 Bump release changelog to v2026.512.0 (#5764)
## Summary

PR [#5366](https://github.com/paperclipai/paperclip/pull/5366) already
merged the v2026.511.0 changelog. This follow-up bumps the artifact to
the actual cut date and drops the pre-alpha sandbox work per maintainer
feedback.

- **Rename** `releases/v2026.511.0.md` → `releases/v2026.512.0.md`
- **Bump header / date** to `# v2026.512.0` / `> Released: 2026-05-12`
- **Drop new sandbox content** (pre-alpha, not yet ready):
- Daytona sandbox provider plugin highlight
([#5580](https://github.com/paperclipai/paperclip/pull/5580),
[#5586](https://github.com/paperclipai/paperclip/pull/5586))
- Cursor sandbox support improvement
([#4803](https://github.com/paperclipai/paperclip/pull/4803))
- Cursor sandbox runtime resolution fix
([#5446](https://github.com/paperclipai/paperclip/pull/5446))
- Sandbox provider messaging polish
([#4902](https://github.com/paperclipai/paperclip/pull/4902))
- **Add LLM Wiki plugin package highlight**
([#5716](https://github.com/paperclipai/paperclip/pull/5716)) — the
package itself landed on master after #5366 merged.
- **Update Upgrade Guide closer** to mention only the `cursor_cloud`
adapter as opt-in.

The `cursor_cloud` adapter is kept in (adapter, not sandbox). The
exe.dev and Cloudflare sandbox provider plugins that landed since the
merge are also excluded as pre-alpha.

No breaking changes; the nine new migrations (`0075`–`0083`) carry over
unchanged from the merged 511 file.

## Test plan

- [ ] Maintainer review of the dropped entries — confirm I caught
everything sandbox-related you wanted out
- [ ] Confirm Cursor Cloud adapter staying in is intentional (flag for
removal if not)
- [ ] Confirm LLM Wiki plugin package highlight phrasing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 22:06:43 -05:00
Dotta 563413ecd4 Fix LLM wiki type contracts (#5758)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and
plugins extend that control plane without bloating core.
> - The LLM Wiki plugin adds a knowledge surface through the plugin
runtime and shared plugin UI components.
> - After the LLM Wiki work merged to `master`, CI exposed TypeScript
contract drift between plugin code, SDK component types, and update
settings types.
> - The ingestion settings update path intentionally accepts partial
source toggles, but its type intersected with the full settings shape
and required every source key.
> - The LLM Wiki UI also passes managed routine default-drift metadata
through the shared routine list item shape, but that metadata was
missing from the public item type.
> - This pull request narrows those type contracts to match the existing
runtime behavior.
> - The benefit is restoring typecheck on `master` with a small,
non-behavioral follow-up.

## What Changed

- Added a `WikiEventIngestionSettingsUpdate` type that permits partial
source updates without weakening normalized stored settings.
- Added managed routine default-drift metadata to the plugin SDK
`ManagedRoutinesListItem` type.
- Mirrored that managed routine default-drift type in the host UI
component item type.

## Verification

- `pnpm --filter @paperclipai/plugin-llm-wiki typecheck`
- `pnpm --filter @paperclipai/plugin-sdk typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `git diff --check`

## Risks

- Low risk. This is a TypeScript type-contract fix only; no runtime
behavior or database schema changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled local repository
editing and command execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Notes on checklist applicability: no screenshots are included because
the UI change is a shared type-only contract update with no visual
behavior change; no docs were required because no behavior or commands
changed.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 21:07:06 -05:00
github-actions[bot] 94ce7af715 chore(lockfile): refresh pnpm-lock.yaml (#5756)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-11 20:46:58 -05:00
Dotta 508355b8fc [codex] Add LLM Wiki plugin package to master (#5716)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system is the extension surface for optional product
capabilities without baking every workflow into core.
> - The LLM Wiki plugin package was reviewed in stacked PR #5592, which
targeted `pap-9173-llm-wiki-rest`.
> - The stack base PR #5597 merged to `master` before #5592 was merged
into that branch, so the plugin package never reached `master`.
> - A direct PR from `pap-9173-llm-wiki-rest` back to `master` would be
noisy because that branch has diverged from current `master`.
> - This pull request reapplies the reviewed
`packages/plugins/plugin-llm-wiki/` package onto current `master` and
updates Docker deps-stage manifest coverage.
> - The branch intentionally no longer changes `pnpm-workspace.yaml`
after maintainer feedback; because the new package is now a root
workspace importer, the remaining integration question is how
maintainers want the root lockfile handled under the current PR policy.

## What Changed

- Added the LLM Wiki plugin package under
`packages/plugins/plugin-llm-wiki/` from the merged PR #5592 head.
- Preserved the post-review cleanup from #5592: generated
design/screenshot artifacts are not committed, and `src/ui/index.tsx` /
`src/wiki.ts` are small public entrypoints.
- Added the new plugin package manifest to the Docker deps stage so
policy can validate package manifest coverage.
- Removed the earlier `pnpm-workspace.yaml` exclusion per maintainer
request, so the plugin is included by the existing `packages/plugins/*`
workspace glob.

## Verification

Current head:
- PGlite migration harness: ran migrations 001-003, verified old
non-space distillation unique constraints were removed, inserted
duplicate cursor and work-item keys in a second space, then reran
migration 003 successfully
- `node ./scripts/check-docker-deps-stage.mjs`
- `git diff --check`

Known current-head install result after removing the workspace
exclusion:
- `pnpm install --frozen-lockfile` fails because `pnpm-lock.yaml` has no
importer for `packages/plugins/plugin-llm-wiki/package.json`.

Previously verified on the same plugin source before the
workspace-exclusion removal:
- `pnpm --filter @paperclipai/plugin-sdk build`
- `cd packages/plugins/plugin-llm-wiki && pnpm install --lockfile=false
&& pnpm test`

## Risks

- The branch now includes `packages/plugins/plugin-llm-wiki` in the root
workspace but does not update `pnpm-lock.yaml`. Root frozen install will
fail until maintainers choose a lockfile path that fits repo policy.
- Committing `pnpm-lock.yaml` directly on this PR conflicts with the
current PR policy check, while excluding the package from
`pnpm-workspace.yaml` was rejected in maintainer feedback.
- The package includes UI code already reviewed in #5592; generated
screenshot/design artifacts were intentionally removed per maintainer
request, so visual review should regenerate screenshots locally if
needed.
- The package depends on plugin host support from #5597, which is
already merged to `master`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution
enabled; context window not exposed.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run the targeted checks listed above
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Stack context: #5592 was merged into `pap-9173-llm-wiki-rest` after
#5597 had already merged that branch to `master`, so this follow-up PR
is needed to carry the plugin package itself into `master`.

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 20:45:41 -05:00
Devin Foley ad0bb57350 Fix exe.dev sandbox installs for gemini/opencode local adapters (#5737)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, including
running adapter CLIs inside remote sandboxes
> - The QA matrix in PAPA-316 spins up local-runtime adapters
(claude/gemini/opencode) against both SSH and the new exe.dev sandbox
provider, and "Test" exercises the same install + probe path the real
runtime uses
> - On exe.dev the QA matrix failed at three different points:
SSH/sandbox secret refs would not resolve, gemini-local could not find
npm, and opencode-local installed a binary that was not on the
probe-shell PATH
> - These are all environment-shape issues the runtime should handle,
not regressions in any individual adapter, so they need to be fixed in
the shared install/resolve layer before the matrix can pass
> - This pull request wires the environment id through to secret-ref
resolution, bootstraps npm from a portable Node tarball when the sandbox
image lacks Node, and symlinks the opencode binary into a directory that
non-login shells see
> - The benefit is that the QA matrix passes end-to-end on exe.dev, and
any future sandbox provider that ships without Node or relies on rc-file
PATH wiring gets the same fixes for free

## What Changed

- `server/src/services/environment-execution-target.ts`: pass the
environment `id` into `resolveEnvironmentDriverConfigForRuntime` for
both the sandbox and SSH branches, so `privateKeySecretRef` /
sandbox-provider secret refs (e.g. exe.dev `apiKey`) can resolve against
the secret store at runtime instead of throwing `Runtime secret
resolution requires an environment id`.
- `packages/adapter-utils/src/sandbox-install-command.ts`: extend
`buildSandboxNpmInstallCommand` with an `ENSURE_NPM_PREAMBLE` that, when
`npm` is missing, downloads a portable Node v22 tarball into
`$HOME/.local` and sets `PAPERCLIP_NPM_BOOTSTRAPPED=1` so the install
step skips sudo (sudo's `secure_path` would lose the freshly-installed
`npm` in `$HOME/.local/bin`). Distro-packaged Node from apt-get is
intentionally avoided because it tends to be too old to parse modern JS
syntax used by `@google/gemini-cli`.
- `packages/adapters/gemini-local/src/index.ts`: switch the hardcoded
`npm install -g @google/gemini-cli` to `buildSandboxNpmInstallCommand`,
so gemini-local picks up the same sudo-aware + npm-bootstrap behavior as
the other local adapters.
- `packages/adapters/opencode-local/src/index.ts`: append a step to the
install command that symlinks `$HOME/.opencode/bin/opencode` into
`$HOME/.local/bin`. The upstream installer only adds `~/.opencode/bin`
to PATH via `~/.bashrc`, which non-login `sh -c` probe invocations do
not source.
- `packages/adapter-utils/src/sandbox-install-command.test.ts`: cover
the new preamble plus the unchanged root/sudo/user-prefix branches.

## Verification

- `cd packages/adapter-utils && npm test -- sandbox-install-command`
(passes; new "bootstraps npm from a portable Node tarball when missing"
case is included).
- Manual: ran the in-app `Test` action against the QA matrix dev
instance for `QA exe.dev Claude`, `QA exe.dev Gemini`, and `QA exe.dev
OpenCode` — all three now report `status=pass` including the hello
probe. `QA SSH Claude` also passes; without the environment-id fix, SSH
resolution threw before the wrapper / install fixes could run.
- Suggested reviewer check: re-run the matrix on a fresh exe.dev
environment and confirm the install step no longer hits `npm: command
not found` for gemini and the opencode probe no longer hits `opencode:
command not found`.

## Risks

- Low/medium. The npm bootstrap pins Node `v22.11.0` from
`nodejs.org/dist`; if that URL becomes unreachable the install will fail
with a clear `curl` error rather than corrupting state. The bootstrap
path is only taken when `npm` is genuinely missing, so existing sandbox
images that ship with Node are unaffected.
- The opencode symlink uses `ln -sf` into `$HOME/.local/bin`, which is
created with `mkdir -p`; idempotent on re-install.
- The `id` change is a strict additive: callers previously got
`undefined` and only the secret-ref code paths actually read it. No
behavior change for environments without secret refs.

## Model Used

- Claude (Anthropic), `claude-opus-4-7`, with extended thinking and tool
use enabled. Iterated through the Paperclip QA matrix harness; no other
model assisted.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (n/a — runtime/install path only)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 14:28:22 -07:00
Devin Foley eaa80cf88b Enable CI publishing for cursor-cloud, cloudflare, and exe.dev release packages (#5728)
## Thinking Path

> - Paperclip orchestrates AI agents, and its release flow depends on
explicit package enrollment for automated publishing.
> - The release registry tooling uses
`scripts/release-package-manifest.json` as the source of truth for which
public packages CI is allowed to publish.
> - The cursor cloud adapter plus the Cloudflare and exe.dev sandbox
plugins are public packages that now need to ship through the normal CI
release path.
> - Leaving those entries at `publishFromCi: false` keeps release
automation and registry validation out of sync with the intended package
set.
> - This pull request updates only those three manifest entries and
leaves the release tooling itself unchanged.
> - The benefit is that CI release enrollment now matches the packages
we intend to publish, with the existing manifest checks continuing to
guard correctness.

## What Changed

- Enabled CI publishing for `@paperclipai/adapter-cursor-cloud` in
`scripts/release-package-manifest.json`.
- Enabled CI publishing for `@paperclipai/plugin-cloudflare-sandbox` in
`scripts/release-package-manifest.json`.
- Enabled CI publishing for `@paperclipai/plugin-exe-dev` in
`scripts/release-package-manifest.json`.

## Verification

- `node ./scripts/release-package-map.mjs check`
- `pnpm test:release-registry`

## Risks

- Low risk. This is a manifest-only change, but a wrong enrollment flag
would affect release automation, so the release-registry checks are the
main guardrail.

## Model Used

- OpenAI GPT-5.4 via Paperclip `codex_local` (`adapterConfig.model:
gpt-5.4`), high reasoning effort, with tool use and shell/code
execution. The adapter does not expose a separate context-window value
in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-11 11:47:07 -07:00
Dotta 8af38fb054 Revert "fix(ui): prevent lossy cron rewrites + redesign routine triggers tab" (#5725)
## Thinking Path

> - Paperclip orchestrates AI agents through visible, governable task
and routine workflows.
> - Routines are the recurring-work surface where operators configure
schedules, runs, and activity.
> - PR #3569 moved routine operational tabs into the right-hand
properties panel while also redesigning the routine trigger editor.
> - The current product request is to remove that routine properties
right-tab change for now and come back to it later.
> - The cleanest way to do that is a direct revert of #3569 on top of
current `master`, which already includes the #5703 revert.
> - This pull request restores the pre-#3569 routine trigger/detail
behavior and removes the right-tab properties-panel routine layout.
> - The benefit is a simple, reviewable rollback with no schema or API
changes.

## What Changed

- Reverted #3569: `fix(ui): prevent lossy cron rewrites + redesign
routine triggers tab`.
- Restored the previous `RoutineDetail` inline tabs and trigger editing
flow.
- Restored the earlier `ScheduleEditor` implementation.
- Removed the UI components and tests introduced by #3569:
`ConfirmDialog`, `TriggerDialog`, `TriggerListCard`, and
`ScheduleEditor.test.ts`.

## Verification

- `git diff --check origin/master..HEAD`
- `pnpm vitest run ui/src/pages/Routines.test.tsx
ui/src/components/RoutineHistoryTab.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

Notes:

- `pnpm install --frozen-lockfile` was run in the clean worktree before
verification. It completed with known workspace bin-link warnings for
`paperclip-plugin-dev-server` because the plugin SDK `dist/dev-cli.js`
has not been built in that fresh worktree.
- `Routines.test.tsx` emitted existing Radix dialog accessibility
warnings during the test run; the tests passed.

### Screenshots

This is a direct revert of #3569. The visual state after this PR
corresponds to the old screenshot from #3569, and the state being
removed corresponds to the new/right-panel screenshots from #3569.

| Before this revert | After this revert |
| --- | --- |
| <img width="1410" height="1325"
alt="routine-triggers-before-this-revert"
src="https://github.com/user-attachments/assets/d70dd35b-e72f-4fc6-bb21-be9b0d92b3b1"
/> | <img width="721" height="707"
alt="routine-triggers-after-this-revert"
src="https://github.com/user-attachments/assets/260bb682-32cb-4dff-b038-d55e45824b04"
/> |

Right-hand properties panel state removed by this revert:

<img width="1409" height="830" alt="routine-properties-panel-removed"
src="https://github.com/user-attachments/assets/f1d42f07-7cd3-4614-8e93-5b585affd4bf"
/>

## Risks

- Low technical risk: this is a clean Git revert of a UI-only PR.
- Product risk: #3569 also fixed lossy cron editing and added broader
schedule presets, so this rollback intentionally removes those
improvements along with the right-tab routine layout.
- Follow-up risk: if we want only the schedule-editor fixes back later,
they should be reintroduced separately from the routine properties-panel
layout.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with local shell and
GitHub CLI access. Context window size was not exposed in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 13:24:48 -05:00
Dotta 0c6f9bdcf8 Revert "fix(ui): improve routine properties panel and history UX" (#5723)
## Thinking Path

> - Paperclip orchestrates AI agents through visible, governable task
and routine workflows.
> - The routines UI includes the routine detail page, properties panel,
history tab, and shared sidebar components.
> - PR #5703 changed that workflow by widening the routine properties
panel and moving revision inspection/comparison into dialogs.
> - The product direction for that change is being paused for now, so
the safest path is a direct revert instead of partial edits.
> - This pull request reverts merge commit
`74cb560c41305ac3283067d1ec8d3060ffdc28cb` from #5703.
> - The benefit is restoring the prior routines UI behavior while
keeping the revert easy to review and re-apply later if needed.

## What Changed

- Reverted #5703: `fix(ui): improve routine properties panel and history
UX`.
- Restored the previous routine properties panel sizing, panel context
API, routine detail layout, and routine history rendering behavior.
- Removed the reverted sidebar pane test additions and restored the
previous focused routine history test expectations.

## Verification

- `git diff --check origin/master..HEAD`
- `pnpm vitest run ui/src/components/RoutineHistoryTab.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

### Screenshots

This is a direct revert of #5703. The visual state after this PR
corresponds to the "Before" screenshots from #5703, and the state being
removed corresponds to the "After" screenshots from #5703.

#### Trigger Panel Width

| Before this revert | After this revert |
| --- | --- |
| <img width="1742" height="1288" alt="triggers-before-this-revert"
src="https://github.com/user-attachments/assets/9e818978-283c-49a3-9401-879be550c67b"
/> | <img width="1741" height="1289" alt="triggers-after-this-revert"
src="https://github.com/user-attachments/assets/2a391769-c355-4219-8da3-d1ea18698430"
/> |

#### History Panel

| Before this revert | After this revert |
| --- | --- |
| <img width="1741" height="1290" alt="history-before-this-revert"
src="https://github.com/user-attachments/assets/4c139238-8494-4438-89e1-4277d05bc3aa"
/> | <img width="1739" height="1289" alt="history-after-this-revert"
src="https://github.com/user-attachments/assets/eaea4f3d-bb65-4af6-b67f-3ba3026fe0c9"
/> |

## Risks

- Low technical risk: this is a clean Git revert of a recently merged
UI-only PR.
- Product risk: the routine properties panel and revision history return
to the older, narrower workflow that #5703 was improving.
- Re-application risk: future work that wants the #5703 behavior back
should re-apply it deliberately rather than cherry-picking around this
revert.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with local shell and
GitHub CLI access. Context window size was not exposed in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 13:10:40 -05:00
Dotta 21404e8a34 [codex] Fix Docker build without LLM wiki plugin package (#5714)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and its
Docker image needs to build from the checked-in core repository.
> - The Docker `deps` stage copies workspace package manifests before
running `pnpm install --frozen-lockfile` so dependency installation can
be cached.
> - Current `master` copied
`packages/plugins/plugin-llm-wiki/package.json`, but that plugin package
has not been merged into core yet.
> - Docker fails before install with a missing build-context path, so
the release image cannot build from the current repository state.
> - This pull request removes the premature plugin manifest copy while
leaving the plugin SDK and existing sandbox plugin package copies
intact.
> - The benefit is that the Docker build no longer depends on an
unmerged plugin package.

## What Changed

- Removed the `packages/plugins/plugin-llm-wiki/package.json` copy from
the Dockerfile `deps` stage.

## Verification

- `git diff --check`
- Static Dockerfile source validation: parsed non-stage `COPY` sources
and confirmed every source exists in the build context.
- Attempted `docker build --target deps --progress=plain -t
paperclip-pap-9235-deps-check .`, but Docker is unavailable in this
execution environment: `Cannot connect to the Docker daemon at
unix:///Users/dotta/.docker/run/docker.sock`.

## Risks

- Low risk. The removed path points to a package that is absent from the
repository, so retaining it is what breaks the build. The plugin can add
its manifest copy back when the package itself lands.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex using GPT-5, tool-enabled coding agent in a local
repository workspace. Exact context-window metadata is not exposed in
this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 10:19:08 -05:00
Devin Foley 5a64cf52a1 Add exe.dev sandbox provider plugin (#5688)
> _Stacked on top of #5685#5686#5687. Diff against master includes
commits from earlier PRs in the stack — review focuses on the two new
commits (`Add long-secret textarea variant to JsonSchemaForm
SecretField` + `Add exe.dev sandbox provider plugin`)._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs in a sandbox environment, and operators choose the
provider — today E2B, Daytona, and (in this stack) Cloudflare
> - exe.dev offers per-VM sandboxes via a small CLI / HTTP API — useful
for operators who want full Linux VMs (vs container/runtime-only
sandboxes)
> - The plugin shape mirrors the e2b plugin: lifecycle hooks (`new`,
`ls`, `rm`) drive exe.dev's CLI; SSH plumbing handles direct VM access
for adapters that need it
> - exe.dev VMs come up bare — `node` is not preinstalled, so the
Paperclip sandbox callback bridge (a Node script) needs Node 20
installed at VM init via `--setup-script`. The plugin defaults the setup
script to a Nodesource install
> - The auth field accepts long SSH private keys, which need a textarea
variant of the existing `SecretField` in `JsonSchemaForm` — added behind
a `maxLength > THRESHOLD` opt-in so other secret fields are unaffected
> - The benefit is that operators get exe.dev as a fully working sandbox
provider out of the box, with no manual VM provisioning required

## What Changed

**Shared UI support (`Add long-secret textarea variant to JsonSchemaForm
SecretField`):**

- `ui/src/components/JsonSchemaForm.tsx` + new
`JsonSchemaForm.test.tsx`: when a secret-formatted field declares
`maxLength` larger than the existing single-line threshold, render a
monospace textarea instead of the masked input. Short secrets (API keys,
tokens) keep the existing masked-input + show/hide toggle behavior.

**The exe.dev plugin (`Add exe.dev sandbox provider plugin`):**

- `packages/plugins/sandbox-providers/exe-dev/`: plugin entry, manifest,
plugin runtime, README, and 19-test Vitest suite.
- Manifest fields: API token (with `secret-ref` + `/exec` permission
notes — needs `new`, `ls`, `rm`), API URL override, optional SSH
username, optional SSH private key (uses the new `JsonSchemaForm`
textarea variant via `maxLength: 4096`), optional SSH identity-file
path, optional setup script.
- Default `--setup-script` is a Nodesource Node 20 install. exe.dev VMs
come up bare and the Paperclip sandbox callback bridge is a Node script,
so without Node preinstalled the bridge can't start. Operators can
override by supplying their own setup script.
- `runLifecycleCommand` redacts env values from the executed command
before surfacing it in error messages, so secrets passed via
`--env=KEY=VALUE` don't leak into operator-visible failures.
- The plugin distinguishes exe.dev's SSH onboarding failures (`Please
complete registration by running: ssh exe.dev`) from general SSH
failures and surfaces a clear remediation message.
- `scripts/release-package-manifest.json`: register the new plugin for
CI publish alongside the existing daytona / e2b providers.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
ui/src/components/JsonSchemaForm.test.tsx`
- `(cd packages/plugins/sandbox-providers/exe-dev && pnpm test)` — 19
passing

For an operator-side smoke test:

1. Get an exe.dev API token with `/exec` permission for `new`, `ls`,
`rm`.
2. Register the plugin in your Paperclip instance, configure an
environment with the token.
3. Create a sandbox env whose provider is `exe-dev`, then run a Codex or
Claude job against it. The default Node 20 setup script should bring the
VM up automatically.

## Risks

- Adds a new sandbox provider plugin that follows the existing daytona /
e2b shape; behavior on existing providers is unchanged.
- The `JsonSchemaForm` textarea variant only engages for fields that opt
in via `maxLength` larger than the existing threshold. All existing
secret fields (which don't declare a `maxLength`) keep their current
rendering. Test coverage pins both paths.
- The redaction in `runLifecycleCommand` is a defense-in-depth measure;
the test suite exercises the redaction path. If the redaction misses a
future env-arg shape, the worst case is restored behavior (secrets in
error messages), which is what the existing daytona / e2b plugins also
do today.
- Default setup script downloads from `deb.nodesource.com` over HTTPS at
VM init. Operators on air-gapped networks or with a different package
strategy can override the setup script.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — UI change is a textarea variant of an existing secret
field; will attach screenshots before requesting merge
- [x] I have updated relevant documentation to reflect my changes
(plugin README, manifest descriptions)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:42:18 -07:00
Aron Prins 74cb560c41 fix(ui): improve routine properties panel and history UX (#5703)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Routines are the recurring-work surface where operators configure
schedules, executions, activity, and revision history.
> - The routine detail view uses a contextual right properties panel for
triggers, runs, activity, and history.
> - That panel was too cramped for routine workflows: the routine header
could collapse at constrained widths, and revision previews/comparisons
were trying to live inside the same narrow panel.
> - This pull request makes the routine properties panel wider and
responsive without changing the default panel behavior for other pages.
> - It also moves routine revision viewing and comparison into focused
dialogs so history stays usable instead of rendering dense revision
content inside the right panel.
> - The benefit is a cleaner routine workflow: triggers remain
scannable, the main routine stays readable, and revisions can be
inspected, compared, and restored without fighting the sidebar width.

## What Changed

- Added optional per-panel layout options for storage key, default
width, min/max width, and compact viewport behavior.
- Set the routine properties panel to use its own 400px default width
and persistence key, while compacting to 320px on narrower viewports.
- Made the shared resizable sidebar support right-side panes, custom
width bounds, compact max width, and keyboard resizing.
- Fixed the routine detail header so title text and action controls
remain readable beside the properties panel at constrained widths.
- Reworked routine history so selecting a revision opens a read-only
snapshot dialog instead of trying to render the whole revision inside
the right panel.
- Added a side-by-side current-vs-selected revision comparison dialog
with clearer diff markers for structured fields, triggers, and
variables.
- Added focused tests for the resizable pane and routine history
behavior.

## Verification

- `pnpm vitest run ui/src/components/RoutineHistoryTab.test.tsx
ui/src/components/ResizableSidebarPane.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm -r typecheck`
- `git diff --check`
- Browser E2E in TestCo at `http://localhost:3100/TES/dashboard`:
  - created and edited a routine
  - added, edited, toggled, and deleted schedule triggers
  - paused automation
  - ran the routine and stopped the live run
- verified runs, activity, history, snapshot dialog, compare mode,
restore confirmation, routine list, recent runs, row actions, panel
close/reopen, and constrained-width layout

### Screenshots

#### Trigger Panel Width

| Before | After |
| --- | --- |
| <img width="1741" height="1289" alt="triggers-before"
src="https://github.com/user-attachments/assets/2a391769-c355-4219-8da3-d1ea18698430"
/> | <img width="1742" height="1288" alt="triggers-after"
src="https://github.com/user-attachments/assets/9e818978-283c-49a3-9401-879be550c67b"
/> |

#### History Panel

Before, selecting a revision attempted to show dense revision content
inside the already narrow right panel. After, history remains a compact
list and revision details open separately.

| Before | After |
| --- | --- |
| <img width="1739" height="1289" alt="history-before"
src="https://github.com/user-attachments/assets/eaea4f3d-bb65-4af6-b67f-3ba3026fe0c9"
/> | <img width="1741" height="1290" alt="history-after"
src="https://github.com/user-attachments/assets/4c139238-8494-4438-89e1-4277d05bc3aa"
/> |

#### Revision Snapshot

The selected revision now opens in a dedicated read-only dialog instead
of crowding the properties panel.

<img width="1740" height="1289" alt="revision-single"
src="https://github.com/user-attachments/assets/f930f50f-7016-434b-bd81-d8d97304c528"
/>

#### Revision Compare

Historical revisions can be compared side-by-side with the current
revision, including changed structured fields and trigger differences.

<img width="1740" height="1287" alt="revision-compare"
src="https://github.com/user-attachments/assets/5640201e-de4f-446b-8941-1b0f140c56d7"
/>

## Risks

- Low to moderate UI risk: the shared resizable pane API gained optional
layout parameters, but existing callers keep the previous defaults.
- Routine history now uses dialogs for revision viewing and comparison,
so reviewers should confirm the new workflow feels right for restore and
compare.
- Routine panel width now persists under a routine-specific key, so
previous global properties panel width preferences do not carry into
routines.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent in Codex Desktop, tool-enabled with
local shell, git, and in-app browser automation. Context window size was
not exposed in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-11 07:37:30 -07:00
Devin Foley 486fb88a15 Add Cloudflare sandbox provider plugin (#5687)
> _Stacked on top of #5685#5686. Diff against master includes commits
from earlier PRs in the stack — review focuses on the two new commits
(`Extend sandbox callback bridge for Worker-hosted plugins` + `Add
Cloudflare sandbox provider plugin`)._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs in a sandbox environment, and operators choose which
provider backs that sandbox — today E2B and Daytona are bundled with the
platform
> - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a
credible new option: globally distributed, cheap idle, and
operator-deployable as a single Worker
> - To plug it in, Paperclip needs (a) a provider plugin that speaks the
`PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed
Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the
Cloudflare Sandbox SDK
> - The plugin extends the existing sandbox-callback-bridge with a
`bridge.transport: "worker"` discriminator so the platform routes
runtime RPCs through the Worker bridge instead of the in-process runner
> - This pull request adds the plugin, the bridge Worker template, and
the supporting adapter-utils + server hooks the new transport needs
> - The benefit is that operators can run sandboxes on Cloudflare's edge
with no new platform code beyond installing the plugin and deploying the
Worker

## What Changed

**Shared support (`Extend sandbox callback bridge for Worker-hosted
plugins`):**

- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
expose `expectedHostHeader` so plugin-side bridge clients can verify the
canonical request envelope before forwarding.
- `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`:
relax the always-fresh runner construction so callers can re-use a
runner across exec calls (Worker-hosted bridges hold the runner inside a
Durable Object).
- `server/src/services/environment-runtime.ts` +
`environment-runtime.test.ts`: route Worker-hosted bridges through the
same env-shaping path as E2B and pin the `requestEnv` contract.
- `server/src/services/plugin-environment-driver.ts`: thread an optional
`issueId` through the runtime descriptor so bridges can scope leases to
the originating issue (used by Cloudflare to map a sandbox to the
issue/workflow for billing and audit).
- `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to
`PluginEnvironmentDriverBaseParams` and the new `bridge.transport:
"worker"` discriminator that the new plugin declares.
- `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the
heartbeat path against the new runtime descriptor.

**The Cloudflare plugin itself (`Add Cloudflare sandbox provider
plugin`):**

- `packages/plugins/sandbox-providers/cloudflare/`: plugin entry,
manifest, plugin runtime (lifecycle + bridge client), config parsing,
and Vitest coverage. Manifest declares `bridge.transport: "worker"` so
the platform routes runtime RPCs through the bridge client.
- `bridge-template/`: a Worker template the operator deploys with
`wrangler`. Owns Durable Object-backed sessions (`sessions.ts`),
exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer
(`auth.ts`) that pins the `Host` header surface. Includes the
SDK-contract-correct exec implementation, lease recovery, and chunked
stdout/stderr streaming.
- Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`,
`routes.test.ts`), bridge client request shaping
(`src/bridge-client.test.ts`), and end-to-end plugin behavior
(`src/plugin.test.ts`) including streamed exec output. 27 tests in
total.
- `README.md` walks the operator through deploying the bridge Worker,
registering the plugin, and configuring the runtime.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/sandbox-callback-bridge.test.ts
packages/adapter-utils/src/command-managed-runtime.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/heartbeat-plugin-environment.test.ts`
- `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27
passing

For an operator-side smoke test:

1. Deploy the bridge: `cd
packages/plugins/sandbox-providers/cloudflare/bridge-template &&
wrangler deploy`
2. Register the plugin in your Paperclip instance, point its bridge URL
at the deployed Worker, set the HMAC shared secret.
3. Create a sandbox environment whose provider is `cloudflare`, then run
a Codex or Claude job against it.

## Risks

- Adds a new `bridge.transport: "worker"` code path, but the existing
E2B / Daytona transports go through the same shaped helpers and have
explicit test coverage that pins their behavior unchanged.
- The Worker bridge stores session state in a Durable Object; operator
instances must be aware of the corresponding Cloudflare costs (DO
requests, storage). Documented in the README.
- The `issueId` plumbing is optional throughout — existing plugins that
don't supply it continue to work.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
(plugin README, bridge-template README)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 07:33:13 -07:00
Dotta 4ad1c83b84 Write release changelog for v2026.511.0 (#5366)
## Summary

- Adds the user-facing stable release changelog at
`releases/v2026.511.0.md`, generated per
[`.agents/skills/release-changelog/SKILL.md`](../blob/master/.agents/skills/release-changelog/SKILL.md)
from the 89 commits between `v2026.428.0` and `origin/master`.
- Surfaces nine headline themes: planning mode, full company search,
routine revision history, recovery system notices, expanded plugin host,
**secrets provider vaults + remote import**, **Cursor cloud adapter**,
**Daytona sandbox provider plugin**, and the ACPX local adapter — plus
categorized improvements/fixes.
- Documents nine new additive/idempotent migrations (`0075`–`0083`) in
the Upgrade Guide, including the `fuzzystrmatch` extension requirement
for company search and the `provider_config_id` text→uuid retype on
`company_secrets`.

Branch retains its `release/v2026.506.0` name to avoid PR churn; only
the artifact filename and content track the actual release date. Branch
was rebased onto current `origin/master` and force-pushed with a single
squashed commit.

No `BREAKING:` / `BREAKING CHANGE:` / `feat!:` signals across the full
range.

## Test plan

- [ ] Maintainer review of the draft changelog content for tone/scope
- [ ] Confirm the headline grouping reflects intended release messaging
- [ ] Confirm the migration list and upgrade guide match runtime
behavior
- [ ] Confirm the `0083` `provider_config_id` retype guidance is
acceptable for the release notes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 09:28:02 -05:00
Aron Prins c0c58d6b01 fix(ui): prevent lossy cron rewrites + redesign routine triggers tab (#3569)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Humans configure when those agents run via **routines**, which are
driven by cron-backed triggers
> - The routine detail page exposed triggers through an always-visible
inline add form and per-row inline editor, with a ScheduleEditor that
only understood a narrow set of cron shapes
> - That combination was actively lossy: pasting `0 9,13,17 * * *`
silently collapsed to `0 10 * * *` on save, and common shapes
(every-N-minutes within a window, multiple times per day, monthly on
several dates) had no first-class UI
> - This pull request rebuilds the triggers tab around a list of cards +
add/edit modal, teaches ScheduleEditor the cron shapes users actually
want, and prevents cron round-trips from dropping data
> - It also *optionally* tucks the Triggers/Runs/Activity tabs into the
shared right-hand PropertiesPanel (same pattern as Issues and Goals) so
they stay in view alongside the routine instead of being hidden below
the main content
> - The benefit is that routine scheduling becomes non-destructive and
legible — operators can see, describe, and edit real-world schedules
without dropping into raw cron and without fear that saving will
silently rewrite their trigger

## What Changed

**Core fixes + redesign (required):**
- **ScheduleEditor correctness** — `parseCronToPreset` now detects comma
lists, ranges, steps, and unknown tokens across every cron field and
routes anything it can't round-trip losslessly to the `custom` preset
(except `dow === "1-5"` → `weekdays`). Fixes the `0 9,13,17 * * *` → `0
10 * * *` regression.
- **ScheduleEditor presets** — adds first-class support for
every-N-minutes (with optional hour window + weekdays-only),
every-N-hours, hourly at minute offset, daily with multiple times/day,
selected-days-of-week with multiple times, and monthly on multiple
dates. `describeSchedule` unfolds multi-value hour/day lists into
readable sentences.
- **ScheduleEditor polish** — swaps raw `<input type=\"checkbox\">` for
the shadcn `Checkbox` primitive so hour-window and weekdays-only toggles
match the rest of the app.
- **Triggers tab redesign** — replaces the inline add form + inline
editor with a header + \"Add trigger\" button, compact `TriggerListCard`
entries, and a `TriggerDialog` add/edit modal. Enable/disable is now a
single-click switch on each card; delete goes through a `ConfirmDialog`.
- **Webhook trigger gating** — webhook kind is visible but disabled with
\"— COMING SOON\" in the add dialog, matching the old inline form's
production behaviour. Editing existing webhook triggers still works.
- **Tests** — adds `ScheduleEditor.test.ts` covering the regression cron
strings (`0 9,13,17 * * *`, `0 */4 * * *`, `0 10,16 * * *`) plus
existing preset patterns as regression guards in the other direction.

**Optional layout change (commit `145a86b5` — can be dropped without
affecting the rest):**
- Moves Triggers/Runs/Activity into the shared right-hand
`PropertiesPanel` (persisted open/close, header toggle button),
mirroring `IssueDetail` and `GoalDetail`. The reasoning: these tabs are
the primary way a human *operates* a routine, and keeping them docked on
the right means they're always in view next to the routine content
rather than hidden below the fold. Mobile parity is preserved by
rendering the same tabs inline below `md`. Trigger cards and
run/activity rows were restructured into vertical stacks so they fit the
320px panel without overflow, and the last-result badge became a
wrapping inline chip so long error strings no longer fill the card
width.
- **If reviewers prefer to keep the tabs inline below the routine, this
commit can be reverted cleanly without touching any of the fixes
above.**

## Screenshots:

Old:
<img width="721" height="707" alt="triggers-old"
src="https://github.com/user-attachments/assets/260bb682-32cb-4dff-b038-d55e45824b04"
/>

New: 
<img width="1410" height="1325" alt="Screenshot 2026-04-13 at 12 25 00"
src="https://github.com/user-attachments/assets/d70dd35b-e72f-4fc6-bb21-be9b0d92b3b1"
/>

New Add Trigger modal:
<img width="1408" height="1321" alt="Screenshot 2026-04-13 at 12 25 07"
src="https://github.com/user-attachments/assets/0f23a83d-ba2c-47ed-9efa-829e777dcdf5"
/>

Commit 145a86b5 Properties panel:
<img width="1409" height="830"
alt="commit-145a86b51265e326160cb8c48e0874cb36d86f37"
src="https://github.com/user-attachments/assets/f1d42f07-7cd3-4614-8e93-5b585affd4bf"
/>

## Verification

- `cd ui && npm test -- ScheduleEditor` — new cron parser/describer
cases pass.
- Full UI test suite + typecheck green locally.
- Manual:
1. Open a routine → Triggers tab → verify cards render with enable
switch, edit, and delete (confirm dialog).
2. Create a schedule trigger with each preset (every-N-min with window,
every-N-hours, hourly@offset, daily multi-time, weekly multi-time,
monthly multi-date) → save → reopen → preset + values round-trip intact.
3. Paste `0 9,13,17 * * *` into an existing trigger → editor routes to
Custom with the raw cron preserved → save → value unchanged.
4. Try to add a webhook trigger → kind option shows \"— COMING SOON\"
and is disabled; edit an existing webhook trigger still works.
5. Toggle the properties panel via header button → state persists across
reload. Resize below `md` → tabs render inline.
- **Before/after screenshots:** attached in PR description (inline
triggers tab → list+modal; raw-cron save hazard → custom preset
preservation; bottom-of-page tabs → right-hand PropertiesPanel).

## Risks

- **Medium-low.** UI-only change; no API, schema, or migration impact.
- `parseCronToPreset` / `describeSchedule` signatures are preserved, but
their *behaviour* shifts: more cron strings now resolve to `custom` than
before. Any external caller relying on the old (lossy) classification
would see different preset tags — none known in-repo.
- PropertiesPanel reuse (optional commit) depends on the existing
localStorage key behaviour; if two routes ever write conflicting
open/close state under the same key, one could clobber the other.
Mirrors the established `IssueDetail`/`GoalDetail` pattern, so risk is
bounded. Reverting `145a86b5` removes this risk entirely while keeping
the fixes.
- Webhook kind is disabled in the add dialog only; existing webhook
triggers remain editable, so no data is stranded.

## Model Used

- **Authoring / PR drafting:** Anthropic Claude — `claude-opus-4-6` (1M
context window), via Claude Code CLI. Used for diff review and PR
description drafting. Code authored by @aronprins.
- **Post-hoc audit:** OpenAI Codex — `gpt-5.4` (high reasoning). Audited
the completed work after implementation; found no issues.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified with version + capability details
- [x] Tests run locally and pass
- [x] Added/updated tests (`ScheduleEditor.test.ts`)
- [x] Before/after screenshots attached
- [ ] Documentation updated — none required (internal UI only)
- [x] Risks documented
- [x] Will address all Greptile + reviewer comments before merge
2026-05-11 00:53:10 -07:00
Devin Foley 0fe39a2d5c fix(cursor-local): resolve sandbox agent installs from cursor bin (#5686)
> _Stacked on top of #5685 (Harden remote sandbox runtime). Diff against
master includes commits from earlier PRs in the stack — review focuses
on the new commit only._

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The cursor-local adapter wraps the Cursor Agent CLI so a Paperclip
workflow can drive it inside a sandbox
> - When the adapter runs in a remote sandbox, the Cursor Agent CLI
installs under `$HOME/.local/bin/cursor-agent` (or wherever
`$XDG_BIN_HOME` points), not on the global PATH
> - The existing post-install resolution assumed `cursor-agent` would
resolve via the sandbox's login shell PATH after `npm install -g`, which
fails on sandboxes where the install lands in a user-prefixed directory
that isn't on PATH at probe time
> - This pull request resolves the agent CLI from the cursor binary's
own directory (`dirname "$(command -v cursor)"`) so the install probe
and execute path agree on a real binary location
> - The benefit is that cursor-local works correctly on any sandbox
provider where `npm install` lands in a user-prefixed directory

## What Changed

- `packages/adapters/cursor-local/src/server/remote-command.ts`: resolve
the cursor-agent binary from the cursor bin directory after install,
instead of relying on PATH.
- `packages/adapters/cursor-local/src/server/test.ts`: corresponding
probe tweak.
- `packages/adapters/cursor-local/src/server/test.test.ts` (new) +
`remote-command.test.ts`: focused coverage that exercises the install +
resolve path against a sandbox runner that places the binary in a
user-prefixed directory.

## Verification

- `pnpm exec vitest run --no-coverage
packages/adapters/cursor-local/src/server/test.test.ts
packages/adapters/cursor-local/src/server/remote-command.test.ts
packages/adapters/cursor-local/src/server/execute.test.ts`

All passing locally.

## Risks

- Local cursor-local runs are unaffected — the resolution change only
kicks in for the sandbox install path.
- Low risk; isolated to one adapter.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: tool use (Read/Edit/Bash), no code execution beyond
local repo commands

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 00:41:20 -07:00
Devin Foley b24c6909e8 Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs inside a sandbox environment so its CLI is isolated
from the host
> - Sandbox-backed adapter runs go through a small set of shared helpers
— `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback
bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings
> - When standing up new sandbox provider plugins, the existing helpers
timed out, missed install fallbacks, or leaned on assumptions that only
held for E2B
> - Local adapters (`claude-local`, `codex-local`, `gemini-local`,
`opencode-local`) needed slightly hardened probes so they could install
themselves and validate inside *any* remote sandbox transport, not just
E2B
> - This pull request bundles those runtime fixes so future sandbox
provider plugins inherit a working baseline
> - The benefit is that adding a new sandbox provider plugin no longer
requires touching adapter-utils or each local-adapter probe — the
supporting infra is already correct

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: introduce
`DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and
`resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters
keep the historical "0 means no adapter timeout" behavior;
sandbox-backed runs without an explicit `timeoutSec` get an explicit
30-minute default so remote installs and warm-up don't time out at the
per-RPC default. Plumbed `timeoutSec` through
`ensureAdapterExecutionTargetCommandResolvable` so install probes inside
a sandbox honor adapter-level overrides instead of the bridge's 5-minute
default.
- `packages/adapters/opencode-local/src/index.ts`: switch
`SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl
-fsSL https://opencode.ai/install | bash`. The npm package reifies four
large prebuilt-binary subpackages in parallel even though only one
matches the host arch; on bandwidth-constrained sandboxes that blew
through the 240s install budget. The official installer fetches one
arch-specific binary and adds `$HOME/.opencode/bin` to PATH via
`~/.bashrc`, which the sandbox-callback-bridge login-shell script
already sources.
- `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden
remote-target probes — pass `--skip-git-repo-check` for Codex when
probing outside a repo, normalize permission flags for Claude, and add
`*.remote.test.ts` coverage that exercises the remote-sandbox path
explicitly for each adapter.
- `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}`
(new): add `buildSandboxNpmInstallCommand` helper.
`server/src/adapters/registry.ts` + new
`server/src/__tests__/adapter-registry.test.ts`: wire adapter install
commands so they fall back to a writable `$HOME/.local` prefix when
global install isn't available.
- `server/src/__tests__/plugin-worker-manager.test.ts` + new
`server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call
timeout overrides so plugin worker exec calls honor the caller's timeout
instead of the worker's default.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/execution-target-sandbox.test.ts
packages/adapter-utils/src/sandbox-install-command.test.ts`
- `pnpm exec vitest run --no-coverage
server/src/__tests__/plugin-worker-manager.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/claude-local-adapter-environment.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter-environment.test.ts`
- `pnpm exec vitest run --no-coverage
packages/adapters/codex-local/src/server/test.remote.test.ts
packages/adapters/opencode-local/src/server/test.remote.test.ts
packages/adapters/codex-local/src/server/codex-args.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts`

All passing locally.

## Risks

- Touches shared `adapter-utils` and several `*-local` adapters. The
30-minute default applies only when both (a) the target is
`remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH
paths are unchanged. New test coverage was added alongside each behavior
change to pin the contracts.
- Switching OpenCode's install command to the official installer is a
behavior change for any operator running OpenCode inside a remote
sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND`
only runs when an adapter is being installed inside a sandbox).
- Low risk overall — no migrations, no API surface change.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep),
no code execution beyond local repo commands

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 00:31:54 -07:00
github-actions[bot] 6e4fa78d86 chore(lockfile): refresh pnpm-lock.yaml (#5668)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-10 17:30:05 -07:00
Devin Foley 534aee66ae Add cursor_cloud adapter for Cursor SDK + Cloud Agents API v1 (#5664)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - There are many adapter types, one per agent-runtime product (Claude,
Codex, OpenCode, Cursor local CLI, etc.)
> - Cursor shipped a public TypeScript SDK on 2026-04-29 that exposes
Cursor's full hosted-agent platform (cloud VMs, harness, MCP, skills,
hooks)
> - Paperclip had no first-class adapter for this — agents that wanted
to use Cursor's managed cloud runtime had to fall back to the local CLI
adapter, which loses the cloud session, streaming, and durable run model
> - This PR adds a new `cursor_cloud` adapter built directly on
`@cursor/sdk`, with Paperclip's heartbeat mapped to Cursor's
durable-agent + per-run model
> - The benefit is that any Paperclip agent can now drive a Cursor cloud
agent across heartbeats with native session reuse, streaming, and
cancellation, while Paperclip remains the source of truth for issue/task
state

## What Changed

- New built-in adapter package `packages/adapters/cursor-cloud` (15
files, ~1.7k LOC) backed by `@cursor/sdk` ^1.0.12
- `src/server/execute.ts` — SDK-first lifecycle: `Agent.create` /
`Agent.resume` / `Agent.getRun` / `agent.send` / `run.stream` /
`run.wait`, with session reuse keyed on the (runtime env type, env name,
repo set) tuple
- `src/server/session.ts` — codec for `cursorAgentId` + `latestRunId` +
repo metadata, persisted in `runtime.sessionParams`
- `src/server/test.ts` — environment probe via `Cursor.me()` and
optional model validation via `Cursor.models.list()`
- `src/ui/parse-stdout.ts` + `src/cli/format-event.ts` — normalize
Cursor SDK message types (`status`, `thinking`, `assistant`, `user`,
`tool_call`, `tool_result`, `result`) into Paperclip transcript events
for the UI and CLI
- Registrations: `packages/shared/src/constants.ts`,
`packages/adapter-utils/src/session-compaction.ts`,
`server/src/adapters/{registry,builtin-adapter-types}.ts`,
`ui/src/adapters/{registry,adapter-display-registry}.ts` +
`ui/src/adapters/cursor-cloud/index.ts`, `cli/src/adapters/registry.ts`,
plus workspace deps in `cli`/`server`/`ui` `package.json`
- `ui/src/components/AgentConfigForm.tsx` — hide local-Cursor
`mode`/thinking-effort field for `cursor_cloud` (different config
surface)
- 11 vitest tests covering execute paths (fresh create, matching-resume,
active-run reattach, non-finished result), session codec round-trip,
transcript parsing, and config building

## Verification

Reviewer steps:

```bash
pnpm install
pnpm --filter @paperclipai/adapter-cursor-cloud typecheck   # → clean
pnpm vitest run packages/adapters/cursor-cloud              # → 11/11 passing
```

End-to-end check against a real Cursor cloud agent (requires
`CURSOR_API_KEY` and Cursor GitHub-app install on the target repo):

1. Create a `cursor_cloud` agent in Paperclip with `repoUrl` set to the
test repo, `repoStartingRef: main`, and `env.CURSOR_API_KEY` set
2. Trigger a heartbeat → adapter calls `Agent.create({ cloud: { env: {
type: "cloud" }, repos: [...] } })`, streams events, terminates on
`finished`
3. Trigger a second heartbeat → adapter calls `Agent.resume` or
`agent.send` follow-up depending on prior-run state, reusing
`cursorAgentId`
4. The Paperclip UI/CLI transcript reflects Cursor `status` / `thinking`
/ `assistant` events as they stream
5. Cancellation from Paperclip maps to `run.cancel()` or Cloud API v1
`cancelRun` for cross-heartbeat cancellation

A direct-SDK smoke run against a real repo (devinfoley/my_test_project @
main) confirmed: `Cursor.me()` ok → `Agent.create` → `agent.send` →
`run.stream()` (30 events) → terminal status `finished` in ~11s.

## Risks

- **New adapter, additive only.** No existing adapter or registry is
replaced; current `cursor` local-CLI adapter is untouched. Default
behavior of any existing agent is unchanged.
- **External dependency on `@cursor/sdk`.** Cursor's SDK is v1.0.x and
may evolve. Mocked unit tests cover the public surface used here; if the
SDK breaks compatibility we update the adapter independently.
- **Cost/budget.** `cursor_cloud` runs on Cursor's billed cloud VMs;
operators must understand they are spending money outside Paperclip's
budget controls when they enable this adapter. Same shape as other
API-billed adapters.
- **No webhook support in V1.** The SDK already provides
stream/wait/cancel/reattach, so V1 does not require a public callback
URL. If a future use case needs out-of-band wakes, we add a Cloud API v1
webhook bridge as a separate change. This is called out in the issue
plan document.
- **Lockfile.** Per repo policy, `pnpm-lock.yaml` is intentionally not
in this PR — CI's lockfile workflow will update it on merge given the
manifest changes.

## Model Used

- Provider: Anthropic Claude (via Claude Code / Paperclip `claude_local`
adapter)
- Model: `claude-opus-4-7` (Claude Opus 4.7), knowledge cutoff January
2026
- Mode: standard tool-use with extended reasoning
- Context: ~200k token window
- Capabilities used: code generation, multi-file edits, shell/test
execution, GitHub PR workflow

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (11/11 in
`packages/adapters/cursor-cloud`)
- [x] I have added or updated tests where applicable (4 new test files,
11 cases)
- [ ] If this change affects the UI, I have included before/after
screenshots (the only UI change is hiding the local-Cursor mode field on
the `cursor_cloud` adapter — happy to attach a screenshot if the
reviewer wants one)
- [x] I have updated relevant documentation to reflect my changes (issue
plan document supersedes the pre-SDK design; tracked in PAPA-203)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-10 17:21:04 -07:00
Dotta 0096b56a1c [codex] Add LLM Wiki plugin host support (#5597)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system needs host contracts and runtime support before
large plugins can integrate cleanly.
> - The source branch mixed the LLM Wiki package with supporting
host/runtime work, managed plugin skills, root-level storage spaces, and
a bookmarks reference plugin.
> - [PAP-9173](/PAP/issues/PAP-9173) asked for the current branch to be
split by file boundary: plugin package separately from everything else.
> - [PAP-9188](/PAP/issues/PAP-9188) clarified that LLM Wiki may have
plugin-local spaces, but Paperclip core should not reorganize top-level
local storage into spaces.
> - Follow-up review clarified that the bookmarks example should not
ship in this PR either.
> - This pull request contains the
non-`packages/plugins/plugin-llm-wiki/` host/runtime work, keeps runtime
state under the selected Paperclip instance root, and no longer includes
the bookmarks example.

## What Changed

- Added/updated plugin host contracts, SDK types, worker RPC plumbing,
managed plugin skill support, and related server tests.
- Removed the bookmarks example plugin package and its
bundled-example/workspace references.
- Removed the root-level local spaces CLI/migration surface and restored
instance-root runtime defaults for config, db, logs, storage, secrets,
workspaces, projects, and adapter homes.
- Replaced shared root `space-paths` helpers with `home-paths` helpers
for core runtime storage.
- Tightened stranded recovery unique-conflict detection so concurrent
recovery scans reuse the raced recovery issue when Postgres errors are
wrapped.
- Kept `packages/plugins/plugin-llm-wiki/` out of this PR diff;
plugin-local spaces remain in the stacked plugin-only PR.

## Verification

- `pnpm exec vitest run cli/src/__tests__/data-dir.test.ts
cli/src/__tests__/home-paths.test.ts cli/src/__tests__/onboard.test.ts
packages/shared/src/home-paths.test.ts
packages/db/src/runtime-config.test.ts
server/src/__tests__/agent-instructions-service.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts`
- `pnpm exec vitest run packages/db/src/runtime-config.test.ts`
- `pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts -t "reuses the
raced stranded recovery issue"` skipped locally because embedded
Postgres did not initialize on this macOS temp host; the code path was
typechecked and is covered by Linux CI.
- Boundary check: no core references remain for `PAPERCLIP_SPACE_ID`,
`spaces migrate-default`, `@paperclipai/shared/space-paths`,
`registerSpacesCommands`, or the removed bookmarks example.
- Previous PR head `4f23e034` had green GitHub checks: `verify`, all
four serialized server shards, `e2e`, `Canary Dry Run`, `policy`, Snyk,
and `Greptile Review`. Current head `582f466d` is re-running checks
after the bookmarks deletion.

## Risks

- Plugin host changes touch shared runtime paths, so regressions would
most likely appear in adapter startup, plugin loading, or local dev path
defaults.
- Removing the bookmarks example also removes one demonstration of
plugin database namespaces plus local-folder persistence; remaining
plugin examples still cover bundled example discovery and plugin host
flows.
- The plugin package itself is intentionally deferred to the stacked
plugin-only PR, where LLM Wiki plugin-local spaces live.
- Existing installs that tested the transient root-level spaces CLI
should stop using it; this PR intentionally removes that unsupported
migration surface before merge.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution
enabled; context window not exposed.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass, except where noted above
for host-specific embedded Postgres initialization
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Stacked follow-up: PR #5592 contains only
`packages/plugins/plugin-llm-wiki/` and targets this branch.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-10 07:34:12 -05:00
Devin Foley eb12c42009 Clarify sandbox provider messaging in company environments (#4902)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Company Environments is the operator-facing seam for choosing where
compatible adapters execute work.
> - Sandbox provider plugins such as E2B extend that seam, but they are
not agent adapters themselves.
> - The current Company Environments copy put adapter capability rows
and sandbox-provider enablement on the same page without clearly
distinguishing the two concepts.
> - That made it look like installing the E2B sandbox provider caused a
new adapter to appear under adapters.
> - This pull request clarifies the UI language so provider plugins are
described as backing the Sandbox driver rather than being adapter types.
> - The benefit is a more accurate mental model for operators
configuring environments and adapters.

## What Changed

- Added explicit Company Environments copy stating that installed
sandbox providers are not adapter types and instead back the Sandbox
driver for compatible adapters.
- Renamed the support-matrix column from `Sandbox` to `Sandbox via
plugin` to make the provider relationship visible in the table itself.
- Extended the existing environments UI test to assert the new
clarification text.

## Verification

- `pnpm test -- --run ui/src/pages/CompanySettings.test.tsx`
Result: could not complete cleanly in this worktree because the checkout
is missing its local workspace install links.
- Direct Vitest fallback against `ui/src/pages/CompanySettings.test.tsx`
Result: failed before test collection on local dependency resolution
(`react/jsx-dev-runtime`), so there is no passing automated signal from
this checkout.
- Manual review
Confirm the Company Environments page now says sandbox providers are not
adapter types and labels the table column as `Sandbox via plugin`.

## Risks

- Low risk. This is a copy-only UI clarification plus a matching test
assertion; the main risk is wording drift if the product later decides
sandbox providers should be surfaced differently.

## Model Used

- OpenAI Codex via the local `codex_local` Paperclip adapter. This run
used tool-assisted code editing and shell execution. The exact backend
model ID and context window are not exposed in the Paperclip run context
for this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-05-09 23:03:26 -07:00
Devin Foley a72731f118 fix: harden release registry verification against npm lag (#4816)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Its release automation publishes canary packages to npm and then
validates the published registry state before considering the release
healthy
> - The failing canary run `25139465018` showed that npm can expose a
newly published version through version-specific endpoints before the
root package document has fully converged
> - That made a successful canary publish look like a failed release
because the verifier trusted stale root metadata too early
> - This pull request hardens the registry verification path by
preferring version-specific manifest checks, retrying
convergence-sensitive failures, and distinguishing permanent failures
from propagation lag
> - While validating that change in CI, a separate teardown race in
`heartbeat-stale-queue-invalidation.test.ts` surfaced and was hardened
so the PR could pass reliably
> - The benefit is that transient npm propagation lag no longer fails a
successful canary publish, while genuine registry-state and
dependency-integrity failures still stop the release flow promptly

## What Changed

- Hardened `scripts/verify-release-registry-state.mjs` so it prefers
version-specific manifest resolution over stale root metadata, adds
bounded registry-fetch timeouts, and classifies failures as retriable vs
non-retriable.
- Updated `scripts/release-lib.sh` and `scripts/release.sh` so
post-publish registry verification retries only convergence-sensitive
failures and reports immediate permanent failures clearly.
- Expanded `scripts/verify-release-registry-state.test.mjs` with
regression coverage for stale root metadata, fetch timeout behavior,
peer dependency range handling, non-retriable canary-latest cases, and
related verifier edge cases.
- Hardened
`server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts`
teardown to tolerate the late-comment foreign-key race that CI exposed
while validating this branch.

## Verification

- `pnpm run test:release-registry`
- `node --check scripts/verify-release-registry-state.mjs`
- `bash -n scripts/release.sh && bash -n scripts/release-lib.sh`
- PR checks passed on head `5c422600fc12acac61f6b7c267a4dc915df622b1`:
`policy`, `verify`, `e2e`, `security/snyk`, and `Greptile Review`

## Risks

- Low risk. The main behavioral changes are limited to release
automation and verifier retry semantics, plus a test-only teardown
hardening for a CI race.

> I checked [`ROADMAP.md`](ROADMAP.md). This is a narrow release bugfix
and does not overlap planned core feature work.

## Model Used

- OpenAI Codex via Paperclip `codex_local` with tool use and local code
execution enabled. This agent session runs on a GPT-5-class coding
model; the exact backend model ID/context window is not exposed by the
local adapter runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I have addressed all Greptile and reviewer comments before
requesting merge
2026-05-09 22:18:12 -07:00
github-actions[bot] a1b2875165 chore(lockfile): refresh pnpm-lock.yaml (#5610)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-05-09 23:40:25 -05:00
Devin Foley 2f72cb29ea chore: update drizzle-orm to 0.45.2 (#5589)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The server, DB package, and CLI all rely on the shared Drizzle ORM
dependency for core persistence flows.
> - A published install was still resolving nested `drizzle-orm@0.38.4`,
which left the production package graph behind the intended security
update.
> - The repo’s documented dependency policy says GitHub Actions owns
`pnpm-lock.yaml`, so the correct maintainer workflow is to update
dependency manifests in the feature PR and let the lockfile refresh
happen separately after merge.
> - This pull request therefore keeps the Drizzle upgrade to the package
manifests only and leaves lockfile regeneration to the existing `Refresh
Lockfile` automation.

## What Changed

- Updated `drizzle-orm` dependency declarations in `cli/package.json`,
`packages/db/package.json`, and `server/package.json` from `0.38.4` /
`^0.38.4` to `0.45.2` / `^0.45.2`.
- Re-verified the packed `@paperclipai/db` and `@paperclipai/server`
publish payloads to confirm their generated `package.json` files
advertise `drizzle-orm ^0.45.2`.
- Removed the temporary lockfile/CI follow-up commits so the branch now
matches the intended manifest-only protocol.

## Verification

- `pnpm list drizzle-orm -r --depth 0`
- `pnpm exec vitest run packages/db/src/client.test.ts
server/src/__tests__/issues-service.test.ts`
- `pnpm run test:release-registry`
- Packed `@paperclipai/db` and `@paperclipai/server` locally and
inspected the tarball `package.json` files to confirm they advertise
`drizzle-orm ^0.45.2`.

## Risks

- Low to moderate risk: the runtime code paths are unchanged, but
downstream lockfile refresh now depends on the existing post-merge
GitHub automation working as documented.
- A separate packaging/versioning issue around unpublished
`@paperclipai/plugin-sdk@1.0.0` showed up during a raw local tarball
install experiment; that is called out for reviewers but is not part of
this Drizzle bump.

## Model Used

- OpenAI Codex via the `codex_local` adapter, using a GPT-5-based coding
agent with terminal tool use and code execution. The adapter does not
expose a public exact model ID or context-window value in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-09 21:31:57 -07:00
Dotta e3af7aa489 Add shared sidebar section controls (#5585)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies.
> - The board UI sidebar is one of the main ways operators scan active
agents and projects.
> - Agents and projects had duplicated section header behavior, which
made collapse controls, add actions, and future section menus harder to
keep consistent.
> - Operators also need lightweight ways to switch between their curated
sidebar order and common scan orders like alphabetical or recent
activity.
> - This pull request introduces a shared sidebar section header and
uses it for the Agents and Projects sidebar sections.
> - The benefit is a more consistent sidebar surface with reusable
header controls and persisted sort modes without losing the existing
drag-ordered Top view.

## What Changed

- Added a reusable `SidebarSection` component that supports collapsible
content, header actions, and section dropdown menus.
- Updated the Agents sidebar section to use the shared header and add
persisted `Top`, `Alphabetical`, and `Recent` sort modes.
- Updated the Projects sidebar section to use the shared header and add
persisted `Top`, `Alphabetical`, and `Recent` sort modes.
- Added local-storage helpers and cross-tab update events for
agent/project sidebar sort preferences.
- Added focused component coverage for the shared section behavior and
the updated Agents/Projects sidebar ordering paths.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
ui/src/components/SidebarSection.test.tsx
ui/src/components/SidebarProjects.test.tsx
ui/src/components/SidebarAgents.test.tsx`
  - 3 test files passed
  - 18 tests passed

## Risks

- Low-to-moderate UI risk: this changes sidebar section header
interactions and adds persisted client-side sort preferences.
- Drag ordering is intentionally limited to `Top` mode; non-top modes
render sorted lists and do not persist drag order changes.
- No database migrations or API contract changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5-based model, tool-use enabled; exact
hosted model build/context-window identifier was not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 19:49:59 -05:00
Devin Foley 433dfed33d Enable CI publish for plugin-daytona (#5586)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The release pipeline gates new public packages behind a bootstrap
policy: `scripts/check-release-package-bootstrap.mjs` requires every
package marked `publishFromCi: true` in
`scripts/release-package-manifest.json` to already exist on npm
> - PR #5580 added the new Daytona sandbox provider plugin but had to
land with `publishFromCi: false` because the package had never been
published, so CI's release plan would have failed bootstrap validation
otherwise
> - Now that `@paperclipai/plugin-daytona` has been bootstrap-published
to npm by hand, the temporary `false` flag is the only thing keeping it
out of the standard CI publish flow
> - This pull request flips the Daytona entry to `publishFromCi: true`,
matching every other release-enabled package in the manifest
> - The benefit is that future tagged releases will publish the Daytona
plugin automatically alongside the rest of the monorepo's public
packages

## What Changed

- Single-line flip in `scripts/release-package-manifest.json`:
`@paperclipai/plugin-daytona` is now `publishFromCi: true`

## Verification

- `node ./scripts/release-package-map.mjs check` → `Release package
manifest OK: 19 enabled for CI publish, 0 disabled pending bootstrap`
(was 18 + 1)
- `node ./scripts/check-release-package-bootstrap.mjs
scripts/release-package-manifest.json` against `origin/master` →
`Release bootstrap OK for changed manifests:
@paperclipai/plugin-daytona`, confirming npm sees the
bootstrap-published package
- No code changes; no tests required beyond the existing manifest
validators

## Risks

- Low risk. Only effect is that the next release run will include
`@paperclipai/plugin-daytona` in its publish set
- If the npm bootstrap was incomplete, CI's bootstrap check will fail
loudly before any release tag goes out — same safety net the policy is
designed to provide

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use
enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable (N/A —
manifest-only flag flip, covered by existing validators)
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — release config)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 16:58:35 -07:00
Dotta 778e775c35 Add secrets provider vaults and remote import (#5429)
## Thinking Path

> - Paperclip orchestrates AI-agent companies and needs secrets handling
to work across local development, hosted operators, and governed agent
execution.
> - The affected subsystem is the company-scoped secrets control plane:
database schema, server services/routes, CLI workflows, and the Secrets
settings UI.
> - The gap was that secrets were local-only and operators could not
manage provider vaults or import existing remote references without
exposing plaintext.
> - This branch adds provider vault configuration plus an AWS Secrets
Manager remote-import path while preserving company boundaries, binding
context, and audit trails.
> - I kept the PR to a single branch PR, removed unrelated
lockfile/package drift, rebased the full branch onto the current
`public-gh/master`, and addressed fresh Greptile findings.
> - The benefit is a reviewable implementation of provider-backed
secrets with focused tests covering provider selection, import
conflicts, deleted secret reuse, rotation guards, and AWS signing
behavior.

## What Changed

- Added provider vault support for company secrets, including provider
config storage, default vault handling, health checks, binding usage,
access events, and remote import preview/commit.
- Added an AWS Secrets Manager provider using SigV4 request signing,
bounded request timeouts, namespace guardrails, cached runtime
credential resolution, and external-reference linking without plaintext
reads.
- Added Secrets UI surfaces for vault management and remote import, plus
CLI/API documentation for setup and operations.
- Stabilized routine webhook secret binding paths and SSH
environment-driver fixture bindings discovered during verification.
- Addressed Greptile and CI findings: no lockfile/package drift,
monotonic migration metadata, disabled-vault default races, soft-deleted
secret hiding/recreate behavior, remove behavior with disabled vaults,
soft-deleted external-reference re-import, non-active rotation guards,
managed-secret soft deletion through PATCH, and per-call AWS SDK
credential client churn.
- Rebased this branch onto `public-gh/master` at `0e1a5828` and
force-pushed with lease to keep this as the single PR for the branch.

## Verification

- `git fetch public-gh master`
- `git rebase public-gh/master`
- `git diff --name-only public-gh/master...HEAD | grep
'^pnpm-lock\.yaml$' || true` confirmed `pnpm-lock.yaml` is not in the PR
diff.
- Confirmed migration ordering: master ends at `0081_optimal_dormammu`;
this PR adds `0082_dry_vision` and
`0083_company_secret_provider_configs`.
- Inspected migrations for repeat safety: new tables/indexes use `IF NOT
EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column
additions use `ADD COLUMN IF NOT EXISTS`.
- `pnpm -r typecheck` passed before the Greptile follow-up commits.
- `pnpm test:run` ran the full stable Vitest path before the Greptile
follow-up commits; it completed with 3 timing-related failures under
parallel load: `codex-local-execute.test.ts`,
`cursor-local-execute.test.ts`, and `environment-service.test.ts`.
- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/codex-local-execute.test.ts
src/__tests__/cursor-local-execute.test.ts
src/__tests__/environment-service.test.ts` passed on targeted rerun
(`24/24`).
- `pnpm build` passed before the Greptile follow-up commits. Vite
reported existing chunk-size/dynamic-import warnings.
- After Greptile follow-up commits: `pnpm --filter @paperclipai/server
exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`).
- After Greptile follow-up commits: `pnpm --filter @paperclipai/server
exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts
src/__tests__/secrets-service.test.ts` passed (`39/39`).
- After Greptile follow-up commits: `pnpm --filter @paperclipai/server
typecheck` passed.
- Captured Storybook screenshots from `ui/storybook-static` for visual
review.
- Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites
1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review`
pass; aggregate `verify` is still registering the completed child
checks.
- Greptile review loop continued through the latest requested pass; all
Greptile review threads are resolved and the latest `Greptile Review`
check on `5ca3a5cf` passed with 0 comments added.

## Screenshots

Before: the provider-vault and remote-import surfaces did not exist on
`master`; these are after-state screenshots from the Storybook fixtures.

![Secrets
inventory](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secrets-inventory.png)

![Secret binding
picker](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secret-binding-picker.png)

![Environment editor with
secrets](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/env-editor-with-secrets.png)

## Risks

- Migration risk: this adds new secret provider tables and extends
existing secret rows. The migrations were checked for monotonic ordering
and idempotent guards, but reviewers should still inspect upgrade
behavior carefully.
- Provider risk: AWS support uses direct SigV4 requests. Automated tests
cover signing, request timeouts, vault-config selection, namespace
guardrails, pending-version archival, sanitized provider errors, and
service-level cleanup paths. A real-vault AWS smoke test remains
deployment validation for an operator with AWS credentials rather than
an unverified merge blocker in this local branch.
- UI risk: the Secrets page and import dialog are large new surfaces;
screenshots are included above for reviewer inspection.
- Verification risk: the full local stable test command hit
parallel-load timing failures, although the exact failed files passed
when rerun directly.
- Operational risk: remote import intentionally avoids plaintext reads;
operators must understand that imported external references resolve at
runtime and may fail if AWS permissions change.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with local shell/tool use in the
Paperclip worktree. Exact context-window size was not exposed by the
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 18:22:17 -05:00
Devin Foley 06e6ee25cd Add Daytona sandbox provider plugin (#5580)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents need isolated sandbox environments to execute work safely;
Paperclip already supports E2B as a sandbox provider plugin
> - Users want to use Daytona (https://www.daytona.io/) as an
alternative sandbox backend, but no plugin existed for it
> - Without a Daytona plugin, teams that prefer Daytona's
pricing/regions/runtime can't run Paperclip agents on it
> - This pull request adds a `@paperclip/sandbox-provider-daytona`
plugin that mirrors the existing E2B plugin shape and wires up Daytona's
`@daytonaio/sdk` for sandbox lifecycle, command execution, and shell
detection
> - The benefit is that operators can pick Daytona as a first-class
sandbox provider without touching core code, broadening Paperclip's
runtime options

## What Changed

- New plugin package `packages/plugins/sandbox-providers/daytona` with
manifest, worker entry, and provider implementation backed by
`@daytonaio/sdk`
- Implements sandbox create/destroy/exec/upload/download lifecycle,
shell command detection, and config/env wiring consistent with the E2B
plugin
- Adds unit tests under `src/plugin.test.ts` and a README documenting
setup and the `DAYTONA_API_KEY` requirement
- Minor adjustments in `scripts/paperclip-issue-update.sh`,
`packages/shared/src/issue-thread-interactions.test.ts`, and
`packages/shared/src/validators/issue.ts` to support the integration

## Verification

- Re-ran the full sandbox provider matrix on the QA Paperclip instance
using Daytona as the runtime — all 6 adapters executed inside the
Daytona sandbox with zero `environmentExecute` timeouts
- 5/6 adapters pass cleanly (or with informational warns); the only
failure is `codex_local`, which is an OpenAI quota/billing issue
unrelated to Daytona
- `pnpm --filter @paperclip/sandbox-provider-daytona test` runs the
plugin unit tests

## Risks

- New optional plugin; no behavior change for users who don't enable it
- Requires `DAYTONA_API_KEY` for runtime use — documented in the plugin
README
- Daytona SDK is a new external dependency; tracked in the plugin's own
package.json so it doesn't affect the core install footprint

## Model Used

- Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use
enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — backend plugin)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-09 11:50:12 -07:00
Devin Foley f784d8d90e Retry canary registry verification (#5579)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and the
release pipeline is part of keeping that control plane shippable.
> - The relevant subsystem here is the release automation in
`scripts/release.sh`, which publishes canary builds and then verifies
npm registry state.
> - The failing CI run showed a successful publish followed by an
immediate registry-state verification failure while npm dist-tag
metadata was still propagating.
> - That made the canary job flaky even when the publish itself had
succeeded, which is the wrong failure mode for release automation.
> - This pull request adds bounded retries around the post-publish
registry-state verification step instead of failing on the first stale
read.
> - The benefit is that canary releases tolerate transient npm
propagation lag while still failing clearly if registry metadata never
converges.

## What Changed

- Wrapped the post-publish `verify-release-registry-state.mjs` call in a
bounded retry loop inside `scripts/release.sh`.
- Reused the existing publish verification retry defaults and added
optional overrides via `NPM_REGISTRY_STATE_VERIFY_ATTEMPTS` and
`NPM_REGISTRY_STATE_VERIFY_DELAY_SECONDS` for dist-tag-specific tuning.

## Verification

- `bash -n scripts/release.sh`
- CI will also exercise the release path via the existing `Canary Dry
Run` workflow job in `.github/workflows/pr.yml`.

## Risks

- Low risk. The main behavioral change is that a genuinely broken
registry-state verification can now wait through the configured retry
window before failing.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex local agent, GPT-5-based Codex runtime in Paperclip with
tool use and shell execution. The exact backend model ID/context window
is not surfaced in this local heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-09 11:40:02 -07:00
Devin Foley 0e1a582831 Revert "Add experimental newest-first issue thread" (#5460)
This is actually bad. Glad it was under experiments.
2026-05-07 16:50:31 -07:00
Devin Foley a904effb96 Add experimental newest-first issue thread (#5455)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, so issue
threads are a core operator surface for reviewing work.
> - The issue detail page is the place where humans read agent messages,
user comments, and execution context together.
> - That thread originally rendered oldest-first, which made recent
activity harder to see during active review.
> - Reversing the thread order changes navigation expectations,
timestamp placement, and the "Jump to latest" affordance, so the UI
behavior needed to move as a coherent set.
> - Because this is a visible core-product behavior shift, it also
needed a safe rollout path instead of becoming the default immediately.
> - This pull request adds the newest-first issue thread behavior behind
an Experimental setting, updates the thread UI to match that mode, and
keeps the legacy oldest-first experience unchanged by default.
> - The benefit is that reviewers can opt into a more recent-first issue
workflow without forcing a global behavior change on every Paperclip
instance.

## What Changed

- Reversed issue thread rendering so the newest comments and messages
appear first when the experiment is enabled.
- Moved the plain comment timestamp into the card header in newest-first
mode and kept the legacy timestamp placement for oldest-first mode.
- Moved the `Jump to latest` control to the bottom of the thread in
newest-first mode while leaving the existing top placement for the
legacy mode.
- Added the `Enable Newest-First Issue Thread` experimental instance
setting and wired issue detail to read that toggle.
- Added regression coverage for thread order, timestamp placement,
jump-button placement, and the issue-detail experiment toggle behavior.

## Verification

- `pnpm -r typecheck`
- `pnpm test:run`
- `pnpm build`
- Focused checks that also passed during issue review:
- `pnpm vitest run src/components/IssueChatThread.test.tsx
src/pages/IssueDetail.test.tsx` in `ui/`
- `pnpm vitest run src/__tests__/instance-settings-routes.test.ts` in
`server/`
- Manual review path:
- Enable `Instance Settings > Experimental > Enable Newest-First Issue
Thread`
- Open an issue with comments/messages and confirm newest activity
renders first, timestamps move into the header, and `Jump to latest`
sits below the thread
- Disable the experiment and confirm the legacy oldest-first behavior
returns

## Risks

- Low risk: the behavioral change is gated behind an instance-level
experimental toggle and defaults off.
- The main regression risk is thread navigation drift between the two
modes, especially around anchor scrolling and the `Jump to latest`
affordance.
- There is some UI coupling between issue-detail query state and
experimental settings fetches, so future changes in that area should
keep both modes covered.
- Screenshots are not attached in this PR body; verification is
described with automated coverage and manual steps instead.

> I checked [`ROADMAP.md`](ROADMAP.md). This is a scoped issue-thread UX
improvement and rollout gate, not a duplicate of a roadmap-level planned
core feature.

## Model Used

- OpenAI Codex via the local `codex_local` Paperclip adapter,
GPT-5-based coding agent with terminal tool use and local code execution
in this repository worktree.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-07 16:45:12 -07:00
Devin Foley 4269545b19 Stabilize Cursor sandbox runtime resolution (#5446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Cursor adapter spawns the Cursor CLI against local, SSH, and
sandbox execution targets; on a fresh sandbox lease, it has to resolve
where Cursor was installed
> - The previous resolver only looked for `~/.local/bin/cursor-agent`
even though the official installer (and the adapter's own
`SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as
`~/.local/bin/agent`, so a sandbox where the install ran successfully
would still fail to find the CLI
> - This pull request lets the resolver accept either basename and lets
the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't
pay the cost of a remote `printf $HOME` round-trip when the home
directory is already known
> - The benefit is sandboxed Cursor runs find the binary that the
install actually produced, and runtime probes are cheaper when the home
dir is already resolved

## What Changed

- `packages/adapters/cursor-local/src/server/remote-command.ts`: accept
either `agent` or `cursor-agent` as the preferred basename; new optional
`remoteSystemHomeDirHint` short-circuits the home-dir probe
- `packages/adapters/cursor-local/src/server/execute.ts`: thread the
home-dir hint through, prefer the resolved binary path, and shift the
effective execution cwd to the per-run managed subdirectory once the
runtime is prepared
- New `remote-command.test.ts` and `execute.test.ts` cover both
basenames, the hint short-circuit, and the cwd shift
- `packages/adapters/cursor-local/src/index.ts`: update doc string to
reflect the broader resolution
- `execute.remote.test.ts` updated to expect the managed-subdirectory
cwd shape introduced by the cwd shift

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-cursor-local` — 6/6 passing
- `pnpm typecheck` clean
- Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor
(binary lands as `~/.local/bin/agent`) now runs cleanly through the
adapter

## Risks

Low. Resolver is strictly broader (matches a superset of paths);
existing setups with `~/.local/bin/cursor-agent` continue to work. The
home-dir hint is opt-in; callers that don't pass it get the existing
probe behavior. Cursor's effective execution cwd now matches the rest of
the adapters (per-run managed subdirectory) — sessions previously rooted
at the workspace root will land in the new subdirectory.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
both basenames + hint short-circuit + cwd shift
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---

> **Stacked PR.** Sits on top of #5445 (which sits on #5444). Cumulative
diff against `master` includes both of those PRs' content; the files
touched by *this* PR's commit are listed under "What Changed" above.
Will rebase onto `master` and force-push once the prerequisite PRs
merge.
2026-05-07 15:00:28 -07:00
Devin Foley fe3904f434 Stabilize runtime probes and Codex env tests (#5445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters expose a Test action that probes the configured runtime —
install, resolvability, hello — to give operators a fast yes/no on
whether an environment is healthy
> - The Codex test path was running its hello probe directly without
going through the managed-runtime preparation that production runs use,
so a healthy production setup could still report a probe failure
> - The plugin worker manager wasn't surfacing terminated workers
cleanly, leaving the runtime probe waiting on a dead worker until the
request timed out
> - This pull request routes the Codex test probe through
`prepareAdapterExecutionTargetRuntime` (so it sees the same managed
Codex home production sees), exposes `commandCwd` on
`createCommandManagedRuntimeClient` so callers can target a per-probe
directory without leaking the workspace `remoteCwd`, and propagates
plugin-worker termination as a usable error instead of a hang
> - The benefit is the Codex Test action mirrors production behavior
end-to-end, and probes against a terminated plugin worker fail fast
instead of timing out

## What Changed

- `packages/adapter-utils/src/command-managed-runtime.ts`: rename the
`remoteCwd` knob to `commandCwd` so callers can target a per-probe
directory without inheriting the workspace cwd; matching test coverage
in `command-managed-runtime.test.ts`
- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
small fixes to keep callback bridge stop semantics deterministic
- `packages/adapters/codex-local/src/server/test.ts`: thread the Codex
hello probe through `prepareAdapterExecutionTargetRuntime` +
`prepareManagedCodexHome` so the probe sees the same managed home
production sees; new `test.remote.test.ts` covers the remote probe path
- `packages/adapters/cursor-local/src/server/execute.ts`: small
probe-side cleanup that aligns with the new commandCwd contract
- `server/src/services/plugin-worker-manager.ts`: surface plugin-worker
termination as a structured error so callers fail fast; new
`plugin-worker-terminated.cjs` fixture and
`plugin-worker-manager.test.ts` cases pin the behavior

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project @paperclipai/server` —
1749/1750 passing (1 unrelated skip)
- `pnpm typecheck` clean

## Risks

Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming
on an internal helper used only by adapter test/execute paths in this
repo. The plugin-worker-terminated path was previously a hang; failing
fast may surface latent timeouts as explicit termination errors in
callers that already expected them.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
commandCwd, plugin-worker termination, and Codex remote test path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---

> **Stacked PR.** Sits on top of #5444 which adds the per-run runtime
API surface this PR builds on. Cumulative diff against `master` includes
that PR's content; the files touched by *this* PR's commit are listed
under "What Changed" above. Will rebase onto `master` and force-push
once #5444 merges.
2026-05-07 14:52:31 -07:00
Devin Foley 12cb7b40fd Harden remote workspace sync and restore flows (#5444)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When an agent runs against a remote target, Paperclip syncs the
workspace out to the remote at run start and restores changes back to
the local workspace at run end
> - The previous restore flow naïvely overwrote local files with
whatever the remote returned, so files that the remote run never touched
but had timestamp/mode drift could be needlessly rewritten — and a
single static `refs/paperclip/ssh-sync/imported` ref made concurrent SSH
workspace exports race on the same git ref
> - This pull request adds a `workspace-restore-merge` module that diffs
a pre-run snapshot against the post-run remote state and only writes
back files the remote actually changed; SSH workspace exports now use a
per-import unique ref so concurrent runs can't trample each other
> - Every adapter's execute path threads the snapshot through
`prepareAdapterExecutionTargetRuntime` so the merge has the baseline it
needs
> - The benefit is workspace restores no longer churn untouched files,
and concurrent SSH runs no longer collide on the import ref

## What Changed

- `packages/adapter-utils/src/workspace-restore-merge.{ts,test.ts}`: new
module — directory snapshot (kind/mode/sha256/symlink target) plus
snapshot-aware merge that writes only the files the remote changed
- `packages/adapter-utils/src/ssh.ts`: SSH workspace export uses a
per-import unique ref (`refs/paperclip/ssh-sync/imported/<uuid>`);
restore goes through the new merge helper; `ssh-fixture.test.ts` covers
the unique-ref + merge paths
- `packages/adapter-utils/src/sandbox-managed-runtime.ts` +
`remote-managed-runtime.ts`: thread the snapshot/merge through the
sandbox and SSH paths
- `packages/adapter-utils/src/server-utils.{ts,test.ts}` +
`execution-target.ts`: helpers for capturing the pre-run snapshot;
`prepareAdapterExecutionTargetRuntime` gains required `runId` and
optional `workspaceRemoteDir`, and returns the realized
`workspaceRemoteDir`
- Each adapter's `execute.ts` (acpx, claude, codex, cursor, gemini,
opencode, pi) takes the snapshot at run start and passes it through to
the runtime restore
- Remote execute test mocks updated to match the new
`prepareWorkspaceForSshExecution` return shape and the per-run
`${managedRemoteWorkspace}` cwd subdirectory

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-acpx-local --project
@paperclipai/adapter-claude-local --project
@paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project
@paperclipai/adapter-gemini-local --project
@paperclipai/adapter-opencode-local --project
@paperclipai/adapter-pi-local` — 196/196 passing
- `pnpm typecheck` clean across the workspace

## Risks

Medium. The restore path now writes a strict subset of what it
previously did — files the remote did not touch are no longer rewritten.
If any flow was relying on a touch-without-content-change being copied
back (timestamp or permission propagation only), that behavior is now
skipped. Snapshot capture adds an O(N-files-in-workspace) hash pass at
run start; the cost is bounded by the existing exclude list. The `runId`
parameter on `prepareAdapterExecutionTargetRuntime` is now required —
every in-tree caller is updated; out-of-tree adapter authors need to
pass it.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new module +
every adapter execute path covered
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-07 14:44:45 -07:00
Dotta 824298f414 Route sidebar search icon directly to search (#5440)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators use the sidebar as their primary board navigation surface
> - The board now has a dedicated search page, so the header search icon
should behave as normal navigation instead of only dispatching a
command-palette shortcut
> - The Work nav also had a separate Search row, which duplicated the
always-visible header search affordance
> - This pull request keeps search one click away while making it a
direct `/search` link and reducing sidebar nav noise
> - The benefit is a smaller, clearer sidebar with search still
accessible from the top-level chrome

## What Changed

- Changed the sidebar header search icon into a direct `NavLink` to
`/search`.
- Removed the duplicate `Search` row from the Work navigation section.
- Added focused Sidebar coverage that asserts the header search link
target and confirms Search is not rendered in the Work nav.
- Refactored the Sidebar test setup helper to avoid repeating the React
Query wrapper across tests.

## Verification

- `pnpm install --frozen-lockfile` in the PR worktree so workspace
package symlinks existed for test execution. This completed with
existing plugin SDK bin warnings for missing built artifacts.
- `pnpm exec vitest run ui/src/components/Sidebar.test.tsx` — 3 passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.

## Risks

- Low: this changes a sidebar navigation affordance only. Users who
previously clicked the header icon now land on the full search page
instead of opening the command-palette shortcut path.
- Low: removing the Work nav Search row could affect users who expected
Search in that section, but the icon remains in the fixed sidebar header
and is covered by a targeted DOM test.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or equivalent focused UI verification
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 15:20:58 -05:00
Dotta e400315cbf Guard assigned backlog liveness (#5428)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The issue graph and liveness recovery system decide whether assigned
work is executable or parked
> - Assigned issues created without an explicit status could silently
land in backlog, making parents look blocked with no productive wake
path
> - The server, shared validators, recovery analysis, and UI all need to
agree on that execution semantic
> - This pull request makes assigned issue creation default to `todo`,
flags assigned backlog blockers, and surfaces the state in the board
> - The benefit is that parked assigned work becomes intentional and
visible instead of creating silent liveness stalls

## What Changed

- Adds contract tests for assigned issue creation defaults.
- Defaults assigned issue creation to `todo` when status is omitted
while preserving explicit `backlog` parking.
- Exposes `resolveCreateIssueStatusDefault` through shared validators.
- Teaches liveness/blocker attention paths to distinguish assigned
backlog blockers.
- Adds UI notices, row/header badges, and issue detail safeguards for
assigned backlog blockers.
- Adds Storybook fixtures and execution-semantics documentation for the
assigned-backlog behavior.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
packages/shared/src/validators/issue.test.ts
server/src/__tests__/issue-assigned-backlog-contract-routes.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
server/src/__tests__/issue-liveness.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
ui/src/components/IssueAssignedBacklogNotice.test.tsx
ui/src/components/IssueRow.test.tsx` — 50 passed, 23 skipped.
- Skipped tests were embedded Postgres suites on this host with the repo
skip message: `Postgres init script exited with code null. Please check
the logs for extra info. The data directory might already exist.`
- Pairwise merge check against the issue-controls PR branch completed
without conflicts via `git merge --no-commit --no-ff` in a temporary
worktree.
- Screenshots for assigned-backlog UI states:
[light](docs/pr-screenshots/pr-5428/assigned-backlog-light.png),
[dark](docs/pr-screenshots/pr-5428/assigned-backlog-dark.png).
- Follow-up checks: `pnpm --filter /ui typecheck`; `pnpm --filter
/mcp-server build`; `pnpm --filter /mcp-server test`; `pnpm exec vitest
run packages/shared/src/validators/issue.test.ts`; focused UI component
tests.
- Remote PR checks on head `6300b3c`: policy, verify, serialized server
shards 1/4-4/4, Canary Dry Run, e2e, Greptile Review, and Snyk all
passed.

## Risks

- Medium: changes status defaulting for assigned issue creation when the
caller omits status. Explicit `backlog` remains supported, and
server/shared tests cover both paths.
- Medium: liveness classification changes can affect blocker attention
labels; focused service and UI tests cover the new assigned-backlog
state.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 12:25:26 -05:00
Dotta 6f30003421 Polish operator UI task controls (#5427)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators spend most of their day scanning skills, routines, inbox
groups, and activity cards
> - Several small UI rough edges made those surfaces harder to scan or
easier to crash on real API payloads
> - These fixes are grouped together because they are low-risk operator
quality-of-life improvements rather than separate control-plane
contracts
> - This pull request polishes skills metadata, routine run-now access,
grouped issue creation defaults, monitor activity rendering, and
activity row identity layout
> - The benefit is a smoother board workflow with fewer small
interruptions while keeping the change set compact

## What Changed

- Improves company skill source display and the used-by agent list.
- Truncates long skill source paths and adds a copy affordance.
- Adds a row-level run-now button to the routines table.
- Adds grouped issue creation defaults for inbox issue groups and aligns
grouped add buttons to the right.
- Fixes `IssueMonitorActivityCard` when `monitorNextCheckAt` arrives as
an ISO string.
- Polishes activity row actor avatar/name layout by using the shared
avatar primitive.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
ui/src/pages/Routines.test.tsx ui/src/components/IssuesList.test.tsx
ui/src/lib/inbox.test.ts
ui/src/components/IssueMonitorActivityCard.test.tsx` — 91 passed.
- The routines test emitted the pre-existing Radix warning about missing
`DialogTitle`/description in dialog content; tests still passed.
- Pairwise merge checks against the other two PR branches reported no
textual conflicts.

## Risks

- Low: changes are UI-focused and covered by targeted component/lib
tests.
- Low-to-medium: activity row layout changes could affect dense feed
scanability; the implementation uses the shared avatar component and
keeps truncation behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 12:24:02 -05:00
Dotta 772fc92619 Add issue controls and retry-now recovery (#5426)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue operators need clear controls for execution settings, model
overrides, and recovery retries
> - Existing issue properties hid useful adapter override state and did
not expose a board-triggered retry for scheduled heartbeat recovery
> - Scheduled retries also need to respect the same safety gates as
normal execution instead of bypassing budget, review, pause, dependency,
or terminal-state checks
> - This pull request adds the issue property controls and retry-now
surfaces together because they share the issue details/properties UI
> - The benefit is that operators can inspect and adjust issue execution
settings and safely trigger pending scheduled recovery without hidden
control-plane behavior

## What Changed

- Adds editable issue assignee model override controls in
`IssueProperties`, with focused coverage.
- Removes the stale workspace tasks link from issue properties.
- Adds a scheduled retry `retry-now` backend path and shared response
types.
- Adds main-pane and properties-pane scheduled retry UI, backed by a
shared `useRetryNowMutation` hook.
- Adds suppression coverage for budget hard stops, review participant
changes, subtree pause holds, unresolved blockers, terminal issues, and
company scoping.
- Updates the `IssueProperties` test harness with toast actions required
by the retry-now hook.

## Verification

- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/IssueScheduledRetryCard.test.tsx` — 31 passed.
- `pnpm exec vitest run
server/src/__tests__/issue-scheduled-retry-routes.test.ts` — exited 0,
but this host skipped the embedded Postgres route tests with: `Postgres
init script exited with code null. Please check the logs for extra info.
The data directory might already exist.`
- Pairwise merge check against the assigned-backlog PR branch completed
without conflicts via `git merge --no-commit --no-ff` in a temporary
worktree.

### Visual verification screenshots

Storybook story: `Product/Issue Scheduled retry surfaces /
ScheduledRetrySurfaces`.

![Scheduled retry card and issue properties rows -
desktop](https://raw.githubusercontent.com/paperclipai/paperclip/62fb566f357312b43b9162af02252d0175530a8f/docs/assets/pr-5426/scheduled-retry-story-desktop.png)

![Scheduled retry card and issue properties rows -
mobile](https://raw.githubusercontent.com/paperclipai/paperclip/62fb566f357312b43b9162af02252d0175530a8f/docs/assets/pr-5426/scheduled-retry-story-mobile.png)

## Risks

- Medium: this touches issue execution/retry behavior, so CI should run
the embedded Postgres route tests on a host that can initialize
Postgres.
- Low-to-medium UI risk around duplicated retry-now entry points; both
surfaces share one mutation hook to keep behavior consistent.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled
Paperclip heartbeat environment. Context window and internal reasoning
mode are not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-07 12:23:13 -05:00
Dotta d0e9cc76f2 Show workspace changes and stale notices in issue threads (#5356)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The issue thread is the operator's durable audit trail for what
changed and why
> - Workspace changes and stale disposition notices need to be visible
in that same timeline without noisy or misleading rendering
> - The local branch already contained backend activity details,
timeline conversion, and UI rendering work for those events
> - This pull request isolates the issue-thread activity work into a
standalone branch against `origin/master`
> - The benefit is a focused audit-trail PR that can merge independently
of the sidebar/operator UI polish branch

## What Changed

- Adds readable workspace-change activity details to issue update
activity events.
- Surfaces workspace-change events in issue chat/timeline rendering.
- Makes the existing issue comment migration idempotent.
- Folds and renders stale disposition notices inline so they match
activity-log styling and spacing.
- Adds focused route, timeline, and issue-thread system notice coverage.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/__tests__/issue-activity-events-routes.test.ts
ui/src/lib/issue-timeline-events.test.ts
ui/src/components/IssueChatThreadSystemNotice.test.tsx` — 3 files
passed, 22 tests passed.
- Confirmed the PR changes 9 files and does not include `pnpm-lock.yaml`
or `.github/workflows/*`.
- `pnpm exec vitest run
server/src/__tests__/issue-closed-workspace-routes.test.ts` — 1 file
passed, 4 tests passed.
- `pnpm exec vitest run
server/src/__tests__/issue-activity-events-routes.test.ts
ui/src/lib/issue-timeline-events.test.ts
ui/src/components/IssueChatThreadSystemNotice.test.tsx
server/src/services/recovery/successful-run-handoff.test.ts
packages/shared/src/validators/issue.test.ts` — 5 files passed, 54 tests
passed.
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`.
- `pnpm --filter @paperclipai/ui typecheck` after adding the Storybook
screenshot fixture.
- Captured Storybook screenshots for the new UI rendering paths:
- Collapsed stale notice + workspace-change row:
`docs/pr-screenshots/pr-5356/issue-thread-notices-collapsed.png`
- Expanded stale notice details:
`docs/pr-screenshots/pr-5356/issue-thread-notices-expanded.png`


### Screenshots

Collapsed stale notice with workspace-change row:

![Collapsed stale notice with workspace-change
row](docs/pr-screenshots/pr-5356/issue-thread-notices-collapsed.png)

Expanded stale notice details:

![Expanded stale notice
details](docs/pr-screenshots/pr-5356/issue-thread-notices-expanded.png)

## Risks

- Moderate risk: this touches issue activity serialization and
issue-thread rendering, both of which are central operator surfaces.
- Migration risk is low: the only migration change makes an existing
migration idempotent.
- No new migrations are introduced, so there is no cross-PR migration
ordering requirement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, shell/tool-use enabled, used to
split the existing branch, verify the isolated PR branch, and create
this PR.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 09:00:54 -05:00
Dotta 4103978578 Polish operator sidebar and issue property controls (#5355)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators use the board sidebar and issue properties panel to move
between companies and understand task metadata
> - Small UI regressions in these controls make repeated board operation
slower and less predictable
> - The local branch already contained targeted fixes for company
ordering, issue date display, and sidebar rail sizing
> - This pull request isolates those operator UI quality-of-life fixes
into a standalone branch against `origin/master`
> - The benefit is a focused, reviewable PR that can merge independently
of the issue-thread activity work

## What Changed

- Shows issue property timestamps with time, not just dates.
- Adds edit-mode support for ordering companies in the sidebar company
menu.
- Fixes a workspace switcher rail regression and keeps the account menu
aligned with the rail width.
- Includes focused component coverage for the touched controls.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/SidebarCompanyMenu.test.tsx
ui/src/components/Layout.test.tsx
ui/src/components/SidebarAccountMenu.test.tsx` — 4 files passed, 29
tests passed.
- `pnpm --filter /ui typecheck`
- PR checks on `a4030f7a` are green: policy, verify, serialized server
suites 1/4-4/4, e2e, Canary Dry Run, Greptile Review, and Snyk.
- Captured a local Storybook screenshot of `Product/Navigation & Layout`
after the sidebar polish:
`/tmp/pap-3659-screenshots/navigation-layout-after.png`.
- Confirmed the PR changes 8 files and does not include `pnpm-lock.yaml`
or `.github/workflows/*`.

## Risks

- Low to moderate UI risk: this touches shared sidebar components and
issue metadata rendering.
- The company ordering behavior depends on existing query/cache
behavior, so stale cache bugs would show up as ordering inconsistencies.
- No database, API, workflow, or lockfile changes are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, shell/tool-use enabled, used to
split the existing branch, verify the isolated PR branch, and create
this PR.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 08:59:39 -05:00
Dotta 68f69975a4 Harden control-plane safety and issue identifiers (#5292)
## Thinking Path

> - Paperclip relies on issue identifiers, execution policies, and agent
heartbeat rules to keep autonomous work auditable.
> - Safety checks need to reject ambiguous agent handoffs, and
identifier parsing needs to support Cloud tenant prefixes.
> - Agent instructions also need to make final-disposition rules
explicit so work does not stall in vague states.
> - This pull request isolates backend correctness and governance
hardening from the UI and recovery-system-notice branches.
> - The benefit is safer in-review transitions, better identifier
compatibility, and clearer agent operating contracts.

## What Changed

- Fixed run-aware confirmation ordering and interrupted-run state
cleanup.
- Added Cloud tenant identity bootstrap and alphanumeric issue
identifier support across shared parsing and server routes.
- Guarded agent-authored `in_review` updates unless a real review path
exists.
- Tightened heartbeat disposition instructions in adapter
utilities/default AGENTS/Paperclip skill.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-identifier-routes.test.ts
server/src/__tests__/issue-execution-policy-routes.test.ts
packages/adapter-utils/src/server-utils.test.ts` initially had the first
execution-policy test hit Vitest's 5s timeout under the parallel bundle
while the rest passed.
- `pnpm exec vitest run
server/src/__tests__/issue-execution-policy-routes.test.ts
--testTimeout=20000` passed with 10/10 tests.

- Follow-up: `pnpm run typecheck:build-gaps` passed.
- Follow-up: `pnpm --filter @paperclipai/ui typecheck` passed.
- Follow-up: `pnpm vitest run
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/company-portability.test.ts
server/src/__tests__/costs-service.test.ts` passed.
- Follow-up: `pnpm vitest run ui/src/context/LiveUpdatesProvider.test.ts
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/issue-reference.test.ts
ui/src/lib/issue-timeline-events.test.ts` passed.

## Risks

- Medium control-plane risk: in-review update validation changes agent
behavior. The error message is explicit and tests cover allowed review
paths.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 07:49:47 -05:00
Dotta a1b30c9f35 Add planning mode for issue work (#5353)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies.
> - Issues are the core unit of work, and issue comments are how board
users and agents coordinate execution.
> - Some issue conversations need to produce plans and approvals instead
of immediate implementation work.
> - The existing issue contract did not distinguish standard execution
comments from planning-oriented issue work.
> - This pull request adds an issue work-mode contract and board UI
affordances for standard vs planning mode.
> - The benefit is that planning-mode issues can be created, displayed,
discussed, and carried through agent heartbeat context without losing
the normal issue workflow.

## What Changed

- Added `standard` / `planning` issue work-mode contracts across DB,
shared validators/types, server issue flows, plugin protocol, and
adapter heartbeat payloads.
- Added an idempotent `0081_optimal_dormammu` migration for
`issues.work_mode`, ordered after current `public-gh/master` migrations.
- Updated heartbeat/context summaries and issue-thread interaction
behavior so planning work mode is preserved when creating suggested
follow-up issues.
- Added UI support for planning-mode issue creation, issue rows, detail
composer styling, and composer work-mode toggles.
- Added focused server/shared/UI tests plus a Playwright visual
verification spec for planning-mode surfaces.
- Rebased the branch onto current `public-gh/master` and added durable
planning-mode screenshots under `doc/assets/pap-3368/`.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/issue.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-context-summary.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/NewIssueDialog.test.tsx
ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config
tests/e2e/playwright.config.ts
tests/e2e/planning-mode-visual-verification.spec.ts`

## Screenshots

Desktop planning detail:

![Desktop planning
detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-detail.png)

Desktop planning row:

![Desktop planning
row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-planning-row.png)

Desktop staged standard toggle:

![Desktop staged standard
toggle](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/desktop-standard-toggle.png)

Mobile planning detail:

![Mobile planning
detail](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-detail.png)

Mobile planning row:

![Mobile planning
row](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-3368-plan-a-planning-mode-for-issues/doc/assets/pap-3368/mobile-planning-row.png)

## Risks

- Medium migration risk: this adds a non-null issue column. The
migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied
an older branch-local migration number can still apply the final
numbered migration safely.
- Medium contract risk: issue payloads, plugin payloads, and adapter
heartbeat payloads now include work mode; compatibility is handled by
defaulting missing values to `standard`.
- UI risk is moderate because composer controls changed; focused
component tests and visual e2e coverage exercise standard vs planning
display and toggle behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with
shell/tool use. Exact context-window size is not exposed in this
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 07:01:28 -05:00
Dotta 320fd5d23b Add full company search page (#5293)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators need to find work, documents, agents, projects, comments,
and activity across a company without jumping through separate surfaces.
> - The existing Command-K flow was useful for fast navigation but not
enough for deeper company-wide discovery.
> - Search also needs company-scoped backend contracts, query cost
controls, and indexed document matching so it stays safe as company data
grows.
> - This pull request adds a full company search API and a dedicated
board search page that Command-K can hand off to.
> - The benefit is a single searchable control-plane surface with richer
result context, recents, highlights, and test coverage across server and
UI behavior.

## What Changed

- Added a company-scoped search endpoint/service with query validation,
rate limiting, text matching, fuzzy title matching, and result typing
shared through `@paperclipai/shared`.
- Added idempotent search migrations for document search indexes and
fuzzy matching support.
- Added the full `/companies/:companyKey/search` UI, search result row
components, highlighted snippets, recent searches, and sidebar/Command-K
handoff.
- Added Storybook coverage for search surfaces and Vitest coverage for
server search behavior, rate limiting, route generation, Command-K
behavior, and the search page.
- Addressed Greptile findings by renaming the no-match SQL helper,
applying search pagination after cross-type merge sorting, and
lazy-initializing the default search service so unrelated route-test
mocks do not need to know about it.
- Merged current `public-gh/master` and renumbered the search migrations
behind upstream `0078_white_darwin`: search indexes are now
`0079_company_search_document_indexes` and fuzzy matching is
`0080_company_search_fuzzystrmatch`.

## Verification

- `git fetch public-gh master`
- `git diff --check public-gh/master...HEAD`
- `git diff --name-only public-gh/master...HEAD | rg '^pnpm-lock\.yaml$'
|| true` produced no output before opening the PR.
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/company-search-service.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts
ui/src/pages/Search.test.tsx ui/src/components/CommandPalette.test.tsx
ui/src/lib/company-routes.test.ts` passed: 5 files, 25 tests.
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck
&& pnpm --filter @paperclipai/ui typecheck` passed.
- `pnpm exec vitest run
server/src/__tests__/company-search-service.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm
--filter @paperclipai/server typecheck` passed after Greptile pagination
fixes.
- `pnpm exec vitest run
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts
server/src/__tests__/company-search-service.test.ts && pnpm --filter
@paperclipai/server typecheck` passed after the CI mock fix.
- After resolving the migration conflict with current
`public-gh/master`: `pnpm --filter @paperclipai/db typecheck && pnpm
exec vitest run server/src/__tests__/company-search-service.test.ts
server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm
--filter @paperclipai/server typecheck` passed.
- DB migration numbering check passed as part of `@paperclipai/db`
typecheck.
- UI states are covered by the added Storybook stories in
`ui/storybook/stories/search.stories.tsx`.
- GitHub reports the PR merge state as `CLEAN` on head `18e54fa8`.
- GitHub PR checks are green on head `18e54fa8`: policy, verify,
serialized server shards 1/4 through 4/4, e2e, canary dry run, Snyk, and
Greptile Review.

## Risks

- Search ranking and snippets are new user-facing behavior, so reviewers
should check whether result ordering feels right on real company data.
- Search touches broad company data, so company scoping and query
cost/rate-limit behavior should be reviewed carefully.
- The migrations add search indexes/extensions; they are idempotent with
`IF NOT EXISTS` for users who may have applied an earlier branch
migration number.

> ROADMAP.md checked. This PR adds a focused board search surface and
does not duplicate an open roadmap item.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub CLI
session with medium reasoning effort. Existing branch commits were
produced across prior agent sessions; this packaging pass verified,
opened the PR, addressed Greptile findings, resolved migration conflicts
after upstream PRs landed, and got PR checks green.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 06:32:37 -05:00
Dotta 424e81d087 Improve operator workflow QoL (#5291)
## Thinking Path

> - Paperclip is a control plane operators use repeatedly to supervise
agent companies.
> - Common operator workflows depend on fast scanning of inboxes, issue
sidebars, workspaces, cost totals, and runtime services.
> - Several small UI and service gaps made those workflows slower or
less clear.
> - This pull request groups the operator-facing QoL changes that can
stand alone from recovery and adapter work.
> - The benefit is a denser, clearer board experience for issue triage
and workspace operation.

## What Changed

- Added inbox assignee/project grouping and issue list token/runtime
totals.
- Improved issue properties with removable blocker chips and workspace
task links.
- Improved execution workspace layout, runtime controls, issues tab
default, and stopped-port reuse behavior.
- Added mobile markdown/routine dialog fixes, page title company names,
sidebar polish, and dashboard run task label cleanup.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/lib/inbox.test.ts
ui/src/components/IssueProperties.test.tsx
ui/src/components/WorkspaceRuntimeControls.test.tsx
server/src/__tests__/workspace-runtime.test.ts
server/src/__tests__/costs-service.test.ts`

## Risks

- Medium UI risk because this touches several operator surfaces. The
branch is intentionally grouped around workflow/QoL files and keeps the
file count below the Greptile limit.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 06:30:44 -05:00
Dotta 11ffd6f2c5 Improve ACPX adapter configuration (#5290)
## Thinking Path

> - Paperclip orchestrates AI agents across several adapter
implementations.
> - ACPX is a local adapter path that can proxy Claude and Codex-style
execution.
> - Its configuration needed stronger schema defaults, provider-aware
model handling, and better UI support.
> - Plugin authors also need clear docs for managed resources.
> - This pull request improves ACPX adapter configuration and documents
plugin-managed resources.
> - The benefit is a more predictable adapter setup path without
changing unrelated control-plane behavior.

## What Changed

- Improved ACPX config schema, execution config handling, UI build
config, and route coverage.
- Added ACPX model filtering support and tests.
- Updated the agent config form and storybook coverage for ACPX
model/provider behavior.
- Expanded plugin authoring documentation for managed resources.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/acpx-local-execute.test.ts
server/src/__tests__/adapter-routes.test.ts
ui/src/lib/acpx-model-filter.test.ts`

## Risks

- Low-to-medium risk: adapter configuration behavior changes can affect
ACPX users, but the change is isolated to ACPX/plugin-doc surfaces and
covered by targeted adapter tests.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 06:06:47 -05:00
Dotta 454edfe81e Add recovery handoff system notices (#5289)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Agent runs can end productively while the source issue still lacks a
durable final disposition.
> - That leaves the control plane unsure whether to resume, escalate, or
close the work.
> - Issue comments also need a presentation contract so system-authored
recovery notices can render as first-class thread messages without
overloading normal comments.
> - This pull request adds successful-run handoff recovery, comment
presentation metadata, and system notice rendering.
> - The benefit is stricter task liveness with clearer operator-facing
recovery state.

## What Changed

- Added successful-run handoff decisions, wake payloads, escalation
behavior, and recovery tests.
- Added issue comment presentation metadata with migration
`0078_white_darwin.sql` and shared/server/company portability support.
- Rendered recovery/system notices in issue chat with dedicated UI
components, fixtures, tests, and storybook/lab coverage.
- Included the current recovery model-profile hint patch so automatic
recovery follow-ups use the cheap profile.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/services/recovery/successful-run-handoff.test.ts
ui/src/components/SystemNotice.test.tsx
ui/src/lib/system-notice-comment.test.ts
ui/src/components/IssueChatThreadSystemNotice.test.tsx`

## Risks

- Migration-bearing PR: merge this before any other branch that might
later add a migration.
- The branch touches both recovery services and issue-thread rendering,
so review should pay attention to recovery wake idempotency and comment
metadata compatibility.

## Model Used

- OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with
shell/git/GitHub CLI tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-06 06:05:58 -05:00
Devin Foley 50db8c01d2 Serialize sandbox callback bridge against concurrent heartbeats (#5326)
> **Stacked PR.** This PR's branch carries cumulative content from #5324
(bridge allowlist expand) and #5325 (env sanitization) — the
mutex/sha256 logic in this PR sits on top of both. Reviewers should
focus on the files this PR's commit touches:
`packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`,
`packages/adapter-utils/src/ssh.ts`, and
`packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto
`master` and force-push once both prerequisite PRs are merged.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent that runs in a sandbox or via SSH talks back to the
Paperclip server through a per-lease callback bridge whose entrypoint
script is uploaded to the remote
> - When two heartbeats target the same agent on the same machine
concurrently, both upload the bridge entrypoint and both write to the
same response files — producing torn-write races: `SyntaxError:
Identifier 'randomUUID' has already been declared` from a concatenated
upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and
0-byte commits from a truncated stdin
> - This pull request serializes those operations with a POSIX
`mkdir`-mutex (PID liveness check + atomic rename) at the bridge
entrypoint upload, applies the same lock to the bridge response writer,
forwards stdin into remote ssh commands so the entrypoint payload
arrives intact, and verifies a sha256 of the upload before promoting it
> - The benefit is concurrent heartbeats no longer corrupt each other's
bridge state

## What Changed

- `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize
entrypoint upload and response writes via POSIX `mkdir`-mutex with PID
liveness; sha256 the upload before promoting via `mv`; content-skip when
the existing entrypoint already matches
- `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh
commands through the SSH managed runtime so `cat > "$remote_upload"`
actually receives the base64-encoded entrypoint
- `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the
stdin-forwarded SSH path
- `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover
the mutex, content-skip, sha256-verify, and atomic-rename paths

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
- `pnpm typecheck` clean
- Manual: two parallel heartbeats targeting the same SSH agent no longer
race on the bridge entrypoint or response files

## Risks

Medium. Serializing previously-parallel operations adds latency on the
contended path (one heartbeat waits on another), bounded by the
entrypoint upload time. The mutex includes PID liveness so a crashed
heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear
"torn upload" failure mode instead of silent 0-byte commits.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — tests cover mutex
+ sha256-verify + stdin-forwarded ssh
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 20:01:04 -07:00
Devin Foley f6bad8f6bf Sanitize remote execution envs at the boundary (#5325)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters spawn CLIs against local, SSH, and sandbox targets,
threading a runtime env through `runAdapterExecutionTargetProcess` and
the SSH/sandbox runners
> - Host identity vars (HOME, TMPDIR, XDG_*, NVM_DIR, PATH) routinely
leak into the env we send to remote targets — sometimes via test probes,
sometimes via runtime config — and break sandboxed/SSH'd CLIs whose own
profiles set those values correctly
> - The sanitization logic existed but lived alongside other helpers in
`server-utils.ts` and was applied piecemeal at adapter callsites, so it
was easy to bypass
> - This pull request lifts the sanitization into a standalone
`remote-execution-env.ts`, applies it at the SSH and sandbox runtime
boundary so every remote spawn goes through it, and removes the
duplicated callsite-level filtering
> - The benefit is identity-bound host env stops leaking across
SSH/sandbox transports regardless of which adapter calls in

## What Changed

- `packages/adapter-utils/src/remote-execution-env.ts`: new module —
single source of truth for which env keys are identity-bound and how to
strip them when the value matches the host's value
- `packages/adapter-utils/src/server-utils.ts`: remove the inline
sanitization (now in `remote-execution-env.ts`)
- `packages/adapter-utils/src/execution-target.ts`: apply sanitization
at the sandbox runtime boundary
- `packages/adapter-utils/src/ssh.ts`: apply sanitization at the SSH
spawn boundary
- `packages/adapters/opencode-local/src/server/test.ts`: drop
now-redundant callsite filtering
- `packages/adapters/pi-local/src/server/test.ts`: drop now-redundant
callsite filtering
- New tests `execution-target.test.ts` and
`execution-target-sandbox.test.ts` cover the sanitizer flow at both
transports, including positive cases (host-shaped path stripped) and
explicit-override preservation

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-opencode-local --project
@paperclipai/adapter-pi-local`
- `pnpm typecheck` clean

## Risks

Low–medium. The sanitization is now applied at one layer (boundary)
instead of N (callsites), so behavior is more consistent. Any adapter
that previously relied on a leaked host var landing on the remote shell
would now see it stripped — but those reliances were what this change
exists to fix.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests at both
transports
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:14 -07:00
Devin Foley 36eaf9778f Expand sandbox callback bridge allowlist to cover the documented heartbeat surface (#5324)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When an agent runs in an e2b sandbox or other non-managed
environment, it talks back to the Paperclip server through a per-lease
callback bridge that proxies HTTP requests
> - The bridge has an allowlist of method/path patterns it will forward;
anything outside the list is rejected to keep the bridge tight
> - The allowlist had drifted behind what the heartbeat documentation
describes as the supported callback surface — several documented
endpoints (issue updates, agent-side log emit, work-status writes) were
being rejected at the bridge
> - This pull request expands the allowlist to cover the documented
heartbeat surface and adds tests that pin every newly-allowed pattern,
so the doc and the bridge stay in sync
> - The benefit is sandboxed runs no longer hit "method not allowed" /
"path not allowed" rejections on the documented set of callbacks

## What Changed

- `packages/adapter-utils/src/sandbox-callback-bridge.ts`: expand the
method/path allowlist to match the documented heartbeat callback surface
- `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: add
coverage for every newly-allowed pattern, plus negative cases for
patterns that should still be rejected

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
- `pnpm typecheck` clean
- Manual: previously-rejected callbacks from sandboxed runs now succeed
end-to-end

## Risks

Low. The allowlist only grows; nothing previously allowed is now
blocked. Tests pin both the new allowed patterns and that out-of-doc
patterns stay rejected.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
added patterns + still-rejected negatives
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:11 -07:00
Devin Foley 83e7ecc58e Preserve scope on manual heartbeat invokes (#5323)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The agent live-run route lets operators trigger a manual heartbeat
invocation so an agent can pick up a specific issue or step out of band
> - The current route flow drops the caller's scope (issue/run context)
when forwarding the manual invoke into the heartbeat service, so the
resulting run loses the targeting the operator specified
> - This pull request threads the operator-supplied scope through the
manual invoke path on both the server route and the UI client, with a
regression test that confirms the scope round-trips
> - The benefit is manual heartbeat invokes from the live-run UI
actually pick up the scoped issue/run instead of falling through to the
agent's default routine

## What Changed

- `server/src/routes/agents.ts`: forward the operator-supplied scope
into the manual invoke heartbeat service call
- `server/src/__tests__/agent-live-run-routes.test.ts`: new test
verifying the manual invoke path preserves scope
- `ui/src/api/agents.ts`: pass scope through the live-run client API

## Verification

- `pnpm vitest run --no-coverage
server/src/__tests__/agent-live-run-routes.test.ts`
- `pnpm typecheck` clean

## Risks

Low. The change is purely additive on the route surface — handlers that
did not previously pass scope continue to work; handlers that did pass
it now have it preserved instead of dropped.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new test covers
the preserved-scope path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (internal API change, no visible UI shift)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:08 -07:00
Devin Foley 9fb0c73e0a Raise gemini-local hello probe timeout to 60s for SSH and E2B targets (#5322)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Gemini adapter's environment Test surfaces a hello probe so
operators can confirm the CLI runs end-to-end on the configured target
> - On SSH and E2B sandbox targets the round-trip cost (login-shell
sourcing, network, model warm-up) routinely exceeds the existing 10s
probe timeout, so the probe spuriously fails on environments that are
actually healthy
> - This pull request raises the gemini-local hello probe timeout to
60s, matching the timeout we use for slower-bootstrapping adapters
> - The benefit is the Gemini Test action no longer reports false
negatives on remote targets that need a longer first-run window

## What Changed

- `packages/adapters/gemini-local/src/server/test.ts`: hello probe
timeout raised from 10s to 60s

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-gemini-local`
- Manual: SSH and E2B Gemini hello probes now complete cleanly without
spurious timeouts

## Risks

Low. A 60s ceiling on a non-blocking probe is consistent with sibling
adapters; the only behavior change is a longer worst-case wait when the
probe genuinely hangs.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — N/A (one-line
timeout change)
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 19:30:04 -07:00
Dotta d6d7a7cea6 Add routine revision history and restore flow (#5285)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Routines are the scheduled/recurring work surface that keeps a
company operating without manual kicks.
> - Operators need routine edits to be auditable and recoverable,
especially when routines control assignments, prompts, triggers, and
webhook secrets.
> - Documents already have revision-style safety, but routines did not
have equivalent history or restore semantics.
> - This pull request adds append-only routine revisions across the
database, shared contracts, server routes, and board UI.
> - The benefit is safer routine iteration: users can inspect history,
compare changes, restore older definitions, and avoid overwriting newer
edits.

## What Changed

- Added `routine_revisions` storage, latest revision pointers on
routines, shared types, validators, and API docs for routine revision
history.
- Added server service/route support for listing routine revisions,
conflict-aware routine saves, and append-only restore operations.
- Added a History tab on routine detail with revision preview,
structured change summaries, description line diffs, dirty-edit
blocking, restore confirmation, and restored webhook secret surfacing.
- Extracted the line diff helper from `DocumentDiffModal` into
`ui/src/lib/line-diff.ts` for reuse.
- Rebased the branch onto current `public-gh/master` and renumbered the
routine revision migration to `0077_unusual_karnak` after upstream
`0076_useful_elektra`.
- Made the `0077` routine revision migration idempotent so installs that
already applied the branch-local `0076_unusual_karnak` can safely
advance.
- Updated the plugin SDK test harness routine fixture with the new
revision fields required by the shared `Routine` contract.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations` passed.
- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/routine.test.ts` passed.
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/lib/line-diff.test.ts
ui/src/components/RoutineHistoryTab.test.tsx
ui/src/lib/workspace-routines.test.ts ui/src/pages/Routines.test.tsx`
passed.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/routines-service.test.ts --pool=forks
--poolOptions.forks.isolate=true` passed.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/routines-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true` passed.
- `pnpm --filter @paperclipai/plugin-sdk typecheck` passed after
updating the SDK test harness fixture.
- `pnpm --filter @paperclipai/plugin-sdk build` passed; this refreshed
local generated SDK output needed by plugin example typechecks.
- `pnpm -r typecheck` passed.

## Risks

- Medium migration risk: this adds routine revision storage and
backfills existing routines. The migration is ordered after upstream
`0076` and uses `IF NOT EXISTS` / duplicate-object guards to tolerate
earlier branch-local migration application.
- Restore behavior intentionally appends a new revision instead of
mutating history; callers expecting an in-place rollback need to follow
the new latest revision pointer.
- Restoring webhook triggers recreates webhook secret material, so users
must copy newly surfaced secrets after restore.
- Conflict-aware saves now reject stale routine edits when the client
sends an older `baseRevisionId`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, with shell/tool use in a local
git worktree. Exact context-window size is not exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshots: not attached in this draft PR; the new UI flow is covered
by component tests listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-05 11:54:52 -05:00
Devin Foley 9578dc3da7 Wire per-adapter sandbox install commands through test and execute paths (#5280)
> **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin
staging) and #5279 (honest-resolvability + login-profiles). The
cumulative diff against `master` includes both of those PRs' content;
the files touched by *this* PR's commit are the new
`maybeRunSandboxInstallCommand` helper in
`packages/adapter-utils/src/execution-target.ts` and the per-adapter
`index.ts`/`server/test.ts`/`server/execute.ts` wiring under
`packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The
honest resolvability check from #5279 is what gives this PR's install
command a meaningful "did it actually land on PATH" follow-up.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Sandbox execution targets are ephemeral — each fresh lease starts
from a template image that may or may not have the agent CLIs
preinstalled
> - When a CLI isn't preinstalled, the resolvability probe fails at
`command -v` and the hello probe never runs
> - There's no shared mechanism for "before you probe or provision,
install the CLI on this sandbox"
> - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per
adapter and a `maybeRunSandboxInstallCommand` helper that runs it via
the existing sandbox login shell, captures structured output, and never
throws (so the resolvability + hello probe still run after); each
adapter's `test()` and `execute()` share the constant so the two
callsites can't drift
> - The benefit is a fresh sandbox lease without a preinstalled CLI now
installs it once via `sh -lc` before the resolvability probe and before
managed-runtime provisioning, with a uniform
`<adapter>_install_command_run` check on the test report

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: add
`AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand`
(runs the install via existing sandbox shell, captures
exit/stdout/stderr, returns a structured info/warn check, never throws)
- Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()`
and `execute()` share a single source of truth
- Wire each of the 6 affected adapter `testEnvironment()`s to call
`maybeRunSandboxInstallCommand` before
`ensureAdapterExecutionTargetCommandResolvable`
- Pass `installCommand: SANDBOX_INSTALL_COMMAND` through
`prepareAdapterExecutionTargetRuntime` in each adapter's `execute()`
- Per-adapter install commands use npm globals where possible so
binaries land on a PATH segment the template already exports:
  - claude → `npm install -g @anthropic-ai/claude-code`
  - codex → `npm install -g @openai/codex`
  - cursor → `curl https://cursor.com/install -fsS | bash`
  - gemini → `npm install -g @google/gemini-cli`
  - opencode → `npm install -g opencode-ai`
  - pi → `npm install -g @mariozechner/pi-coding-agent`

SSH and local targets ignore `installCommand` (SSH runtime takes no such
param; local short-circuits before runtime prep), so this is a no-op for
non-sandbox environments.

## Verification

- `pnpm typecheck` clean
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
and per-adapter projects pass
- Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) —
each goes `install_command_run → resolvable → hello_probe_passed` (Codex
and Pi land on `hello_probe_auth_required`, which is the
configured-credentials problem, not an install issue)
- SSH no-regression: SSH Claude still passes; the helper short-circuits
on non-sandbox targets

## Risks

Medium — adds a network/CPU cost (npm install / curl) on every fresh
sandbox lease. Cost is bounded (one-time per lease, typically tens of
seconds for npm globals), and the helper never throws so a failing
install still lets the report run resolvability and hello probes. If a
sandbox image already has the CLI, the install is an idempotent
reinstall.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:29:28 -07:00
Devin Foley af9386f879 Run a real command-v probe and source login profiles before exec in e2b sandboxes (#5279)
> **Stacked PR.** Sits on top of #5278 (`e2b/stage-stdin-to-temp-file`)
which ships the stdin-staging fix this builds on. The cumulative diff
against `master` includes that PR's content; the files touched by *this*
PR's commit are `packages/adapter-utils/src/execution-target.ts`,
`packages/plugins/sandbox-providers/e2b/src/plugin.ts`, and
`packages/plugins/sandbox-providers/e2b/src/plugin.test.ts`.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The adapter Test flow does an "is the command resolvable?" probe
before running the hello probe so the report distinguishes "binary not
installed" from "binary errored"
> - For sandbox targets, that resolvability check was a no-op
early-return — every sandboxed adapter test reported "Command is
executable" regardless of whether the binary existed
> - That made the resolvability check disagree with the hello probe in a
way that looked like a PATH bug, when it was actually a missing CLI
> - Separately, the e2b spawn used `sandbox.commands.run` with a
non-login non-interactive shell whose PATH did not include npm-globals,
nvm shims, or anything else the template installs via
`.profile`/`.bashrc`
> - This pull request makes the resolvability check honest by running a
real `command -v` invocation through the sandbox runner, and aligns the
e2b spawn with SSH by sourcing login profiles before `exec env KEY=val
<cmd>`
> - The benefit is the e2b sandbox spawn agrees with the hello probe and
finds CLIs at template-installed paths

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: add
`ensureSandboxCommandResolvable` that runs `command -v <cli>` through
the sandbox runner; replace the early-return in
`ensureAdapterExecutionTargetCommandResolvable` for sandbox targets
- `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: replace
`buildCommandLine` with `buildLoginShellScript` (sources `/etc/profile`,
`~/.profile`, `~/.bash_profile`, `~/.bashrc`, `~/.zprofile`, and nvm.sh
before `exec env KEY=val <cmd>`); env vars are interpolated inline so
user-configured adapter env always wins over profile-exported values;
drop the now-unused `envs:` SDK option
- `plugin.test.ts` updated for the login-shell wrapping

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/sandbox-e2b` —
17/17 plugin tests pass
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils`
clean
- `pnpm typecheck` clean
- Manual: previously every sandboxed adapter said "Command is
executable" then the hello probe failed with "exec: not found". After
this change, missing CLIs surface honestly at the resolvability step.
SSH no-regression: SSH Claude probe still passes.

## Risks

Medium — sandbox adapter Test reports will start failing at the
resolvability step for environments where the CLI was never actually
installed. This was always the real state; the previous "Command is
executable" message was incorrect. Operators should expect
previously-green-but-broken sandbox environments to report accurately.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — `plugin.test.ts`
updated for the login-shell wrapping
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:21:37 -07:00
Devin Foley cb6af7c2cc Stage stdin to a temp file so the e2b sandbox executor delivers it reliably (#5278)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The e2b sandbox provider implements `onEnvironmentExecute` so
adapters can spawn CLIs in an e2b sandbox
> - For commands that need stdin (e.g. piping a hello prompt to a CLI),
the previous implementation awaited a foreground `commands.run({ stdin:
true, ... })` and then tried to call `sendStdin(pid)` on the now-dead
PID
> - That call resolves only after the process exits, so stdin was never
delivered and e2b raised "process not found"
> - This pull request stages stdin to `/tmp/paperclip-stdin-<uuid>`
inside the sandbox and shell-redirects it (`exec '<cmd>' '<args>' <
'<file>'`), making the command synchronous regardless of whether stdin
is supplied
> - The benefit is adapter Test probes that pipe a hello prompt to a CLI
inside an e2b sandbox now actually deliver the prompt

## What Changed

- `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: replace the
broken async `commands.run` + `sendStdin` flow with stdin-staging to a
sandbox temp file and shell-redirection
- Staged file is removed in a `finally` block; write failures propagate
after best-effort cleanup

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/sandbox-e2b` —
all 17 unit tests pass
- `pnpm typecheck` clean
- Manual: a sandboxed adapter Test probe that pipes a hello prompt now
receives the prompt

## Risks

Low risk — `plugin.test.ts` already encodes the temp-file design; the
change brings the implementation in line with the test.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — existing tests
already encode the new design
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:49 -07:00
Devin Foley 5c2f9aba9d Run explicit-environment adapter tests on the requested target instead of falling back to the host (#5277)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When a user clicks "Test" on a configured environment (SSH or
sandbox), the agent-test route exercises the adapter against that target
> - The route previously fell back to running the probe on the Paperclip
host whenever an explicit environment target couldn't be resolved, with
the test report still saying "passed"
> - That hid two real failure modes: misconfigured environments looked
green, and sandbox environments were never actually exercised
> - This pull request acquires an ad-hoc lease and realizes a workspace
for sandbox/plugin test environments, resolves a sandbox execution
target wired to the environment runtime, and returns synthesized
diagnostics instead of running a host probe when an explicit env target
can't be resolved
> - The benefit is the Test action surfaces the real environment state
and never silently exercises the wrong machine

## What Changed

- `server/routes/agents.ts`: acquire an ad-hoc lease and realize a
workspace for sandbox/plugin test environments; resolve a sandbox
execution target wired to the environment runtime
- Return synthesized diagnostics (no host fallback) when an explicit env
target can't be resolved
- `server/services/environment-runtime.ts`: small adjustments to support
the explicit-env-target case
- Clarify test-route messages so they no longer claim a host fallback in
explicit env flows
- New `agent-test-environment-routes.test.ts` covers the guard and
missing-environment path

## Verification

- `pnpm vitest run --no-coverage
server/src/__tests__/agent-test-environment-routes.test.ts`
- `pnpm typecheck` clean
- Manual: a deliberately misconfigured sandbox environment now reports
diagnostics instead of a misleading host-pass

## Risks

Medium — Test route behavior change. Explicit environments that
previously appeared to pass via host fallback will now report their real
state. This is the desired behavior, but operators should expect to see
new failures for environments that were never actually working.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
guard + missing-env paths
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:32 -07:00
Devin Foley 9042b8d042 Write apikey-mode auth.json so Codex CLI 0.122+ can authenticate via OPENAI_API_KEY (#5276)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Codex adapter spawns the OpenAI Codex CLI to drive the model
> - Codex CLI 0.122 changed how it reads credentials: it ignores
`OPENAI_API_KEY` from the environment and reads only
`$CODEX_HOME/auth.json`
> - Without auth.json, Codex 0.122+ returns 401 "Missing bearer or basic
authentication" on `/v1/responses` even when `OPENAI_API_KEY` is
forwarded into the sandbox or remote shell
> - This pull request materializes an apikey-mode `auth.json` in the
managed Codex home (or per-run for the test probe) when an
`OPENAI_API_KEY` is configured
> - The benefit is configured Codex API keys authenticate correctly with
current Codex CLI versions across local, SSH, and sandbox targets

## What Changed

- `codex-home.ts`: add `writeApiKeyAuthJson()` and let
`prepareManagedCodexHome` accept an `apiKey` override that replaces the
symlinked host auth.json with an apikey-mode file
- `execute.ts`: pass `envConfig.OPENAI_API_KEY` into
`prepareManagedCodexHome` so the managed (and synced-to-remote) Codex
home authenticates via the configured key
- `test.ts`: when `OPENAI_API_KEY` is available, wrap the hello probe
with a small shell that materializes a per-run `$CODEX_HOME/auth.json`
before exec'ing codex; key content rides through env to avoid leaking
into process listings
- Update the `codex_hello_probe_auth_required` hint to explain Codex CLI
does not read `OPENAI_API_KEY` from env

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-codex-local`
- `pnpm typecheck` clean
- Manual: Codex 0.122.0 with empty `CODEX_HOME` returns 401 with
env-only auth; with this change it authenticates cleanly

## Risks

Low risk — when no API key is configured, behavior is unchanged (no
auth.json written, existing chatgpt-mode flow preserved). Apikey-mode
`auth.json` is the upstream-supported format.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:27 -07:00
Devin Foley 44c365dea3 Stop leaking host process.env into the remote Pi SSH probe (#5275)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Pi adapter runs the pi-coding-agent CLI against local, SSH, and
sandbox execution targets
> - The Test path's hello probe spreads the host's `process.env` into
the remote process env, including the macOS PATH
> - The leaked Mac PATH overrides the nvm-sourced PATH set up by
`buildSshSpawnTarget`, so on a Linux SSH target `node` resolves to
system Node 18 instead of nvm's Node 20+
> - pi-coding-agent v0.68 / pi-tui then crashes at
`pi-tui/dist/utils.js:27` with `SyntaxError: Invalid regular expression
flags` on the `/v` unicode-sets regex (a Node 20+ feature)
> - This pull request stops the leak — same fix as the opencode SSH
probe — by passing only user-configured adapter env to the probe when
the target is remote
> - The benefit is the Pi hello probe now passes end-to-end against an
SSH target without the Node version mismatch

## What Changed

- `packages/adapters/pi-local/src/server/test.ts` passes only the
user-configured adapter env (`normalizeEnv(env)`) to
`runAdapterExecutionTargetProcess` when the target is remote
- Local probes still get the full `runtimeEnv` so headless permission
injection keeps working

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-pi-local`
- `pnpm typecheck` clean
- Manual: Pi hello probe goes from `pi_hello_probe_failed` (Node 18
regex error) to `pi_hello_probe_passed` against an SSH target

## Risks

Low risk — same pattern shipped for opencode-local and consistent with
claude-local / codex-local / gemini-local.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — pattern mirrors
sibling adapters
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:23 -07:00
Devin Foley 028c5aa00a Stop leaking host process.env into the remote OpenCode SSH probe (#5274)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The OpenCode adapter runs against local, SSH, and sandbox execution
targets
> - The Test path's hello probe spreads the Paperclip host's
`process.env` into the remote process env, which over SSH gets exported
on the remote shell
> - On a Linux SSH target, `HOME=/Users/...` and a host XDG_CONFIG_HOME
pointing at a macOS `/var/folders/...` temp dir cause OpenCode to walk a
host-only path and fail with `EACCES: permission denied, mkdir '/Users'`
> - This pull request stops the leak by passing only user-configured
adapter env to the probe when the target is remote, matching the pattern
already used by claude-local, codex-local, and gemini-local
> - The benefit is the OpenCode hello probe now passes end-to-end
against an SSH target without spurious filesystem errors

## What Changed

- `prepareOpenCodeRuntimeConfig` short-circuits when the target is
remote — the host-fs temp config dir is meaningless and harmful for a
remote target
- `test.ts` passes only the user-configured adapter env (no host
`process.env` spread) to `runAdapterExecutionTargetProcess` when
`targetIsRemote`
- Local probes still get the full `runtimeEnv` so headless permission
injection keeps working

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-opencode-local`
- `pnpm typecheck` clean
- Manual: SSH OpenCode hello probe goes from `EACCES … mkdir '/Users'`
to `opencode_hello_probe_passed`

## Risks

Low risk — local probe behavior is unchanged; the change only narrows
the env passed to remote targets, matching the pattern already shipped
in sibling adapters.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — pattern mirrors
existing sibling tests
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:19 -07:00
Devin Foley ea7f53fd7d Handle Gemini CLI v0.38 stream-json wire format across parser, UI, and CLI formatter (#5273)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent uses an adapter that drives a CLI (Claude, Gemini, Codex,
etc.)
> - The Gemini adapter parses a JSONL transcript stream the CLI emits to
learn what the model said
> - Gemini CLI v0.38 changed the transcript shape: assistant text now
comes through `type=message` with `role`/`content` and terminal status
comes through `type=status` / `type=stats`
> - The existing parser was written against the older `type=assistant` /
`type=result` shape, so post-v0.38 outputs left the parsed summary empty
and downgraded the SSH hello probe to "unexpected output"
> - This pull request updates every Gemini consumer (server parser, UI
parser, CLI formatter) to accept the v0.38 shape while keeping the
legacy shape working
> - The benefit is the Gemini adapter handles current upstream output
without losing backward compatibility, with explicit test coverage for
both shapes

## What Changed

- `packages/adapters/gemini-local/src/server/parse.ts` recognizes
`type=message` events with role/content and stops downgrading them
- `packages/adapters/gemini-local/src/ui/parse-stdout.ts` mirrors the
parser changes for the live UI transcript
- `packages/adapters/gemini-local/src/cli/format-event.ts` formats the
new event shape correctly for CLI output
- `parse.test.ts` and `parse-stdout.test.ts` add v0.38 coverage;
`gemini-local-adapter.test.ts` and `execute.remote.test.ts` switch
happy-path fixtures to the current real wire format and keep dedicated
tests for the older schema

## Verification

- `pnpm vitest run --no-coverage --project
@paperclipai/adapter-gemini-local` — full suite passes including new
v0.38 cases and preserved legacy cases
- `pnpm typecheck` clean

## Risks

Low risk — additive event handling. Legacy event shape path is preserved
with its own tests, so existing fixtures continue to parse identically.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-05 08:00:14 -07:00
Dotta 3c73ed26b5 Expand plugin host surface (#5205)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The plugin system is the extension boundary for optional product
capabilities
> - Rich plugins need more than a worker entrypoint: they need scoped
database storage, local project folders, managed agents/routines, host
navigation, and reusable UI components
> - The LLM Wiki work exposed those missing host surfaces while keeping
plugin code outside the core control plane
> - This pull request expands the core plugin host, SDK, server APIs,
and UI bridge so plugins can declare and use those surfaces
> - The benefit is that future plugins can integrate with Paperclip
through documented, validated contracts instead of bespoke server or UI
imports

## What Changed

- Added plugin-managed database namespaces and migration tracking,
including Drizzle schema/migration files and SQL validation for
namespace isolation.
- Added server support for plugin local folders, managed agents, managed
routines, scoped plugin APIs, and plugin operation visibility.
- Expanded shared plugin manifest/types/validators and SDK
host/testing/UI exports for richer plugin surfaces.
- Added reusable UI pieces for file trees, managed routines, resizable
sidebars, route sidebars, and plugin bridge initialization.
- Updated plugin docs and example plugins to use the expanded host and
SDK surface.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
packages/shared/src/validators/plugin.test.ts
server/src/__tests__/plugin-database.test.ts
server/src/__tests__/plugin-local-folders.test.ts
server/src/__tests__/plugin-managed-agents.test.ts
server/src/__tests__/plugin-managed-routines.test.ts
server/src/__tests__/plugin-orchestration-apis.test.ts
ui/src/api/plugins.test.ts ui/src/components/FileTree.test.tsx
ui/src/components/ResizableSidebarPane.test.tsx
ui/src/pages/PluginPage.test.tsx ui/src/plugins/bridge.test.ts` passed:
11 files, 67 tests.
- Confirmed this PR changes 89 files and does not include
`pnpm-lock.yaml` or `.github/workflows/*`.

## Risks

- Medium: this expands plugin host contracts across db/shared/server/ui
and includes a new core migration (`0076_useful_elektra.sql`).
- The plugin database namespace validator is intentionally restrictive;
plugin authors may need follow-up affordances for SQL patterns that
remain blocked.
- Merge this before the LLM Wiki plugin PR so the plugin can resolve the
new SDK and host APIs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window size was not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-05 07:42:57 -05:00
Dotta d6bee62f02 Fix Cloud tenant issue identifier routes (#5196)
## Summary

- Allow Cloud tenant issue identifiers with alphanumeric prefixes, such
as `PC1897-1`, to normalize as issue references.
- Resolve those identifiers through issue detail/update routes, active
run/live run polling, activity, costs, and `issueService.getById`.
- Keep UI issue-link parsing aligned so tenant links normalize back to
`/issues/<IDENTIFIER>`.

## Root Cause

Cloud tenant issue prefixes include digits from the stack-id hash. The
app-side route normalization still accepted only all-letter prefixes, so
`/api/issues/PC1897-1` skipped identifier lookup and fell through as a
non-UUID id.

## Verification

- `pnpm exec vitest run packages/shared/src/issue-references.test.ts
ui/src/lib/issue-reference.test.ts
server/src/__tests__/issue-identifier-routes.test.ts
server/src/__tests__/activity-routes.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/agent-live-run-routes.test.ts
server/src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck`
- `git diff --check`

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-04 13:20:58 -05:00
Dotta edbb670c3b Merge pull request #5154 from paperclipai/pap-3474-docker-timeout
Raise Docker image build timeout
2026-05-03 23:01:46 -05:00
Dotta fd10404374 Raise Docker image build timeout
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-03 22:52:33 -05:00
Devin Foley 47920f9c47 Speed up PR CI critical path (#5147)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies, so
developer throughput on the control plane repo directly affects how fast
the product can evolve.
> - The PR workflow is part of that throughput surface because every
change waits on it before review and merge.
> - This branch started from measured evidence that the PR critical path
was dominated by work that was either serialized unnecessarily or placed
on the wrong part of the graph.
> - The biggest concrete problems were: the canary dry run living inside
`verify`, the server isolated suites running one-by-one in a single
lane, and duplicate CI work that the PR path was paying for without
increasing coverage proportionally.
> - This pull request restructures the PR workflow so those costs are
reduced without removing the important coverage that was already
protecting release and test quality.
> - Follow-up fixes on the branch hardened the new entrypoints so they
work on clean GitHub runners and so the reduced PR typecheck path stays
self-maintaining as workspace packages evolve.
> - The benefit is materially faster PR wall-clock time while keeping
canary packaging checks, serialized-suite isolation, plugin SDK
consumers, and explicit TypeScript coverage where builds do not already
provide it.

## What Changed

- Moved the PR canary dry run into its own `Canary Dry Run` job so it
still runs on PRs but no longer extends the `verify` critical path.
- Split the custom Vitest runner into `general`, `serialized`, and `all`
modes, and added shard support for the isolated server suites.
- Added `test:run:general` and `test:run:serialized` scripts, then
rewired PR CI to fan the serialized server suites out across a 4-way
matrix.
- Added the required `@paperclipai/plugin-sdk` build preflight before
the new reduced-scope typecheck and test entrypoints so they succeed on
clean CI runners.
- Replaced the hardcoded PR build-gap list with
`scripts/run-typecheck-build-gaps.mjs`, which discovers workspace
packages whose `build` scripts skip TypeScript and runs only their
explicit `typecheck` scripts.
- Removed the redundant `pnpm build` from the PR `e2e` job because the
Playwright onboarding path boots Paperclip from source.

## Verification

- `ruby -e "require 'yaml'; YAML.load_file('.github/workflows/pr.yml');
puts 'workflow ok'"`
- `node scripts/run-vitest-stable.mjs --mode general --dry-run`
- `node scripts/run-vitest-stable.mjs --mode serialized --shard-index 0
--shard-count 4 --dry-run`
- `pnpm run typecheck:build-gaps`
- `pnpm test:run:general`
- `pnpm test:run:serialized -- --shard-index 0 --shard-count 4`
- `pnpm build`
- `pnpm paperclipai onboard --yes --run`
- `curl http://127.0.0.1:3299/api/health`

## Risks

- Branch protection or required-check configuration may need to be
updated for the new standalone `Canary Dry Run` job and the
serialized-suite matrix job names.
- `scripts/run-typecheck-build-gaps.mjs` assumes packages that need
explicit PR-time typechecking are the ones whose `build` scripts omit
`tsc`; if build conventions change, that heuristic needs to stay
aligned.
- Serialized test sharding preserves per-suite isolation, but the first
few CI runs should still be watched for shard-balance or naming
assumptions in downstream tooling.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, using high reasoning
effort with shell, git, and file-edit tool use in a local worktree.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 20:20:14 -07:00
Dotta e01ffc18d3 Merge pull request #5148 from paperclipai/pap-3474-tenant-identity-deploy
Support Cloud tenant identity bootstrap
2026-05-03 21:57:28 -05:00
Dotta ae23e02526 Support Cloud tenant identity bootstrap
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-03 21:55:52 -05:00
Devin Foley 29401b231b fix(ci): gate new release packages on npm bootstrap (#5146)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
release automation is part of the core operator trust boundary.
> - The affected subsystem is npm/GitHub Actions release publishing for
the public monorepo packages.
> - The concrete failure was that a newly added package reached
`master`, the canary workflow attempted its first publish, and npm
trusted publishing was not yet bootstrapped for that package.
> - That means the problem is not just one broken run; it is a missing
pre-merge guard that lets release-ineligible packages land and only fail
once `publish_canary` runs.
> - This pull request makes release enrollment explicit, validates that
enrollment in CI, and adds a PR-time bootstrap check against npm for
changed release-enabled package manifests.
> - The result is that we keep trusted publishing, avoid teaching CI to
`npm adduser`, and move this class of failure from post-merge canary
time to pre-merge review time.

## What Changed

- Added `scripts/release-package-manifest.json` so release-managed
public packages are explicitly enrolled instead of being inferred from
every non-private workspace package.
- Hardened `scripts/release-package-map.mjs` to validate the manifest
before release workflows rewrite versions or assemble publish payloads.
- Added `scripts/check-release-package-bootstrap.mjs` and wired it into
`.github/workflows/pr.yml` so PRs that change a release-enabled package
manifest fail if that package does not already exist on npm.
- Added release-package manifest coverage tests to
`scripts/release-package-map.test.mjs` and included them in `pnpm run
test:release-registry`.
- Wired manifest validation into `.github/workflows/release.yml` and
documented the first-publish bootstrap policy in `doc/PUBLISHING.md` and
`doc/RELEASE-AUTOMATION-SETUP.md`.

## Verification

- `pnpm run test:release-registry`
- `./scripts/release.sh canary --skip-verify --dry-run`
- Confirmed the committed diff contains no obvious PII/secrets via
targeted pattern scan before pushing.

## Risks

- Low risk overall: this is CI/release-policy code, not product runtime
logic.
- The new PR bootstrap check depends on npm metadata availability, so a
transient npm outage could block a PR that changes a release-enabled
package manifest.
- The manifest introduces a new source of truth that must stay aligned
with public package additions, but that is intentional and now enforced.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter; GPT-5-based
coding agent with tool use, terminal execution, git, and GitHub CLI.
Exact served model ID/context window are not exposed by the local
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 19:31:28 -07:00
Devin Foley a5430f010d Handle Gemini assistant message events in JSONL parser (#5143)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, including
agents
>   running the Gemini CLI (`gemini-local` adapter)
> - The Gemini CLI emits a JSONL event stream during a run that the
adapter
> parses to extract the assistant's response text, tool results, and
usage
> - Recent versions of the Gemini CLI emit assistant responses as
> `{ "type": "message", "role": "assistant", "content": ... }` events in
>   addition to the previously-handled event shapes
> - The parser was not handling the new event type, so the assistant's
actual
> response text was being silently dropped from parsed output. Callers
ended
>   up with empty assistant messages even when Gemini had successfully
>   responded
> - This PR teaches the parser to recognize `{type: "message", role:
>   "assistant"}` events and extract their content text via the same
>   `collectMessageText` helper used for other message-shaped events
> - The benefit is that Gemini runs surface the assistant's real
response in
> downstream consumers (issue comments, run logs, downstream agent
context)
>   instead of vanishing

## What Changed

- `packages/adapters/gemini-local/src/server/parse.ts`: in
`parseGeminiJsonl(...)`, add a branch for `event.type === "message"`
with
  `role === "assistant"` that calls
  `messages.push(...collectMessageText(event.content))`.
- `packages/adapters/gemini-local/src/server/parse.test.ts`: ~19 lines
of
  coverage for the new branch.

## Verification

- `pnpm --filter @paperclipai/adapter-gemini-local test -- parse`
- Manual QA: run a Gemini agent on an issue, confirm the assistant's
response
appears as the issue comment / run output. Before this fix the comment
was
  empty even when the run completed successfully.

## Risks

- Tightly scoped: 8 lines of production code in one parser branch. No
effect
  on existing event shapes or other adapters.
- If the Gemini CLI changes its event schema again, this branch may need
to be
  revisited — but adding it is strictly additive over current behaviour.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:36:50 -07:00
Devin Foley 6c090f84a9 Strip inherited host shell env from SSH remote execution (#5142)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on remote SSH hosts receive an env map built from
the host
>   process's env plus per-run additions like `PAPERCLIP_API_KEY`,
>   `PAPERCLIP_RUN_ID`, etc.
> - The env map currently includes inherited host vars by default,
including
> identity-bound ones like `PATH`, `HOME`, `USER`, `NVM_DIR`, `XDG_*` —
> variables whose values are meaningful only on the host they came from
> - Sending the host's `PATH` (containing host-only directories like a
local
> nvm install path) to a remote SSH box overrides the remote's actual
`PATH`
> and breaks command resolution. Same hazard for `HOME` (commands
looking for
> config files end up in a non-existent dir), `USER` (writes go to the
wrong
>   path), etc.
> - This PR adds `sanitizeSshRemoteEnv()` that drops inherited
identity-bound
> vars when their value matches the host process's value. Explicitly-set
> values pass through untouched, so callers that genuinely want to
override
>   remote `PATH` etc. still can — but accidental leakage from
>   `process.env` is filtered.
> - The benefit is that SSH remote execution stops corrupting the remote
> shell's environment with host-shaped paths, so commands resolve
correctly
>   against the remote PATH and config files land in the remote `HOME`

## What Changed

- New `sanitizeSshRemoteEnv(env, inheritedEnv = process.env)` in
`packages/adapter-utils/src/server-utils.ts`. The identity-bound key set
is:
    - `PATH`, `HOME`, `PWD`, `SHELL`, `USER`, `LOGNAME`
    - `NVM_DIR`, `TMPDIR`, `TMP`, `TEMP`
    - `XDG_CONFIG_HOME`, `XDG_CACHE_HOME`, `XDG_DATA_HOME`,
      `XDG_STATE_HOME`, `XDG_RUNTIME_DIR`
For any key in this set, the entry is dropped iff the env value equals
the
  inherited (host process) value. Other keys pass through unchanged.
- `readEnvValueCaseInsensitive(...)` helper handles Windows-style
  case-insensitive env var lookups.
- Wired into `resolveSpawnTarget(...)` for the SSH transport. Sandbox
and local
  paths are unaffected.
- Tests added in `server-utils.test.ts` (~50 lines) covering: matching
keys
filtered, mismatched keys preserved, non-identity keys passed through,
case
  insensitivity.

## Verification

- `pnpm --filter @paperclipai/adapter-utils test -- server-utils`
- Manual QA: run any adapter against an SSH-backed environment, confirm
remote command resolution works (e.g. `node`, `npm`, the adapter's CLI)
and
config files land in the remote user's `HOME`. Compare to the prior
behaviour
by transiently re-introducing the inherited `PATH` and watching commands
  fail with `command not found`.

## Risks

- Behavioural shift: SSH remote execution previously passed inherited
host env
  vars verbatim. Code that relied on that (e.g. a remote command somehow
  expecting the host's `PATH`) will see different behaviour. None of the
  adapter code in this repo has such a dependency.
- Edge case: if a caller explicitly sets `PATH` to the same value as the
host's
`PATH` (literally — same exact string), the sanitizer drops it as a
leak.
  In practice no caller constructs the env this way.
- Windows host: case-insensitive lookup handles `Path` vs `PATH`
correctly.
  Tested.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:36:13 -07:00
Devin Foley 90631b09b3 Let adapters declare runtime command spec for remote provisioning (#5141)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, running
adapter
> commands like `claude`, `codex`, `pi` either locally or on remote
runtimes
>   (SSH hosts, sandboxes, etc.)
> - On a fresh remote runtime — particularly an ephemeral sandbox — the
> adapter's CLI may not be installed yet. Today operators handle this
via
> external configuration (e.g. a project-level `provisionCommand` shell
> script) that has to know about every adapter the operator might want
to use
> - This means every adapter has its own well-known npm package, but
operators
>   end up writing duplicate provision shell scripts that paste together
> `npm install -g @anthropic-ai/claude-code`, `npm install -g
@openai/codex`,
>   etc. — knowledge the adapter itself already has
> - This PR moves that knowledge into the adapter modules: each adapter
declares
> how its runtime command should be detected and (if applicable)
installed
> via `getRuntimeCommandSpec(config)`. The execution path runs the
adapter's
> own install command on remote sandbox targets before launching, so a
fresh
> sandbox bootstraps itself instead of requiring a hand-written
provision script
> - The benefit is fewer footguns for operators provisioning remote
runtimes,
>   and a clean place for new adapters to plug in their install recipe

## What Changed

- New types in `packages/adapter-utils/src/types.ts`:
    - `AdapterRuntimeCommandSpec` describing `command`, optional
      `detectCommand`, and optional `installCommand`
    - Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule`
- Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters
      receive the resolved spec at execute time
- New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)`
in
`packages/adapter-utils/src/execution-target.ts` that runs the install
command
on remote targets when `transport === "sandbox"`. SSH and local targets
are
  no-ops. Throws on timeout or non-zero exit so failures surface early.
- Each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`,
  `opencode-local`, `pi-local`'s `execute.ts` now reads
`ctx.runtimeCommandSpec?.installCommand` and calls the helper before
launching
  the adapter command.
- `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for
each
  adapter:
- claude/codex/gemini/opencode/pi-local: `npm install -g <package>`
recipe via
a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard
that
only auto-installs when the configured `command` matches the well-known
      fallback (custom binaries are left alone).
- cursor-local: declares `command` only; no auto-install (no public npm
      package), preserving the existing manual setup.
- `server/src/services/heartbeat.ts` resolves the spec via
`adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through
to
  `AdapterExecutionContext`.
- Tests added in `execution-target.test.ts` (~75 lines), e2b
`plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts`
  (~76 lines).

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-run-orchestrator`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: run an adapter (claude/codex/etc.) against a fresh
sandbox-backed
environment that does NOT have the adapter CLI pre-installed. Confirm
the
install runs once at the start of the agent run and the adapter then
launches
successfully. Re-run on the same sandbox; confirm the install command is
  idempotent and the second run starts faster.
- Confirm SSH and local execution paths are unaffected (gated by
  `transport === "sandbox"`).

## Risks

- Behavioural shift on sandbox runs: a new install step now runs at the
start
  of every sandbox agent run for adapters with `installCommand` set. The
install commands are idempotent (`if ! command -v X >/dev/null 2>&1;
then
npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold
  sandbox, the first run takes longer.
- Operators who used the legacy project-level `provisionCommand` to
install
adapter CLIs can drop that part of their script; the adapter handles it
now.
  Existing scripts continue to work — installs are idempotent.
- The cursor-local adapter has no auto-install (no public npm package).
  Behaviour for cursor-local on sandboxes is unchanged.
- New optional surface on `ServerAdapterModule`. Plugins that don't
implement
  `getRuntimeCommandSpec` retain previous behaviour (no auto-install).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:35:36 -07:00
Devin Foley 2dce81fbf6 Add optional bridge proxy request logging via PAPERCLIP_BRIDGE_DEBUG (#5140)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents on remote sandboxes call back into the Paperclip control
plane via a
> callback bridge — the host process running the bridge proxies HTTP
requests
>   from the sandbox to the Paperclip API
> - When something goes wrong end-to-end (sandbox can't reach Paperclip,
requests
> timing out, malformed responses), it's hard to tell whether the bridge
> processed the request, what URL/method it used, and what the upstream
>   responded with
> - There was no built-in way to log bridge proxy traffic without
modifying
>   adapter code or attaching a debugger
> - This PR adds opt-in stdout logging of every bridge proxy request and
response
> (method, path, query, status), gated behind `PAPERCLIP_BRIDGE_DEBUG`
so it
>   stays off by default
> - The benefit is that operators can flip a single env var to get full
visibility
> into bridge traffic when debugging remote runs, without changing code

## What Changed

- `packages/adapter-utils/src/execution-target.ts`:
`startAdapterExecutionTargetPaperclipBridge`'s `handleRequest` now logs
each
  proxied request and response when `PAPERCLIP_BRIDGE_DEBUG` is truthy:
    - `[paperclip] Bridge proxy <METHOD> <path>?<query>` before fetch
- `[paperclip] Bridge proxy response <status> for <METHOD>
<path>?<query>` after
- Logging is no-op when the env var is unset/`"0"`/`"false"`.

## Verification

- Set `PAPERCLIP_BRIDGE_DEBUG=1` in the host process env, run an agent
against
a sandbox-backed environment, confirm the bridge log lines appear in
stdout.
- Unset the env var and confirm no extra log lines appear during normal
runs.

## Risks

- Off-by-default, no observable change for shipping users.
- When enabled, the logging is verbose — every API call from the sandbox
  produces 2 stdout lines. Operators should only enable it during active
  debugging.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable — covered by
exercising the
      flag in dev; the underlying handleRequest behavior is unchanged
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:34:48 -07:00
Devin Foley 0e51fa2b0d Honor reuse-existing preference and assignee default environment in issue runs (#5139)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside execution workspaces (a per-issue cwd + env), and
an issue
> can prefer to reuse an existing workspace or get a fresh one each time
> - The heartbeat service was reading the existing workspace's config to
derive
> environment selection regardless of whether the issue actually wanted
to reuse
> it. So fresh-run issues were inheriting stale config from a workspace
that was
>   about to be discarded
> - Separately, when an issue is assigned to an agent, the issue's
execution
> workspace settings weren't picking up the agent's
`defaultEnvironmentId`,
>   even though the agent's choice is the natural default for that issue
> - This PR makes both selection paths honor the obvious source of
truth:
> workspace config flows only when the issue actually wants
`reuse_existing`,
> and the assignee agent's default environment is applied at assignment
time if
>   nothing else is set on the issue
> - The benefit is that re-running a flaky issue picks up the right
environment
> instead of inheriting the previous run's config, and assigning an
agent to an
>   issue does the obvious thing without operator intervention

## What Changed

- `server/src/services/heartbeat.ts`: introduce
`reusableExecutionWorkspaceConfig`
  that is non-null only when `shouldReuseExisting` is true. Both
  `resolveExecutionWorkspaceEnvironmentId(...)` and
`applyPersistedExecutionWorkspaceConfig(...)` now read from it instead
of
unconditionally consulting `existingExecutionWorkspace?.config`.
Fresh-run
issues no longer inherit stale environment config from an in-flight
workspace
  about to be discarded.
- `server/src/services/issues.ts`: when an issue update sets a new
  `assigneeAgentId` and isolated workspaces are enabled, populate
  `executionWorkspaceSettings.environmentId` from the assignee agent's
  `defaultEnvironmentId` if the issue doesn't have an explicit
  `environmentId` set yet.
- Tests added in `heartbeat-plugin-environment.test.ts` (~216 lines) and
  `issues-service.test.ts` (~85 lines) covering both paths.

## Verification

- `pnpm --filter @paperclipai/server test --
heartbeat-plugin-environment issues-service`
- Manual QA: assign an issue to an agent that has a non-default
`defaultEnvironmentId`, confirm the issue's workspace settings now
include that
environment id without operator intervention. Trigger a rerun on an
issue
whose existing workspace points at a stale environment, confirm the
rerun uses
  the freshly-resolved environment.

## Risks

- Behavioural shift on assignment: previously assigning an agent didn't
propagate the agent's default environment to the issue. Now it does.
Callers
that explicitly want the issue to keep its existing/null environment
must set
`executionWorkspaceSettings.environmentId` themselves; the new logic
only
  fires when no explicit value is set.
- Behavioural shift on rerun: stale workspace config is no longer
applied to
  fresh runs. Operators who relied on this implicit inheritance may see
different environment selection on the first rerun after deploy.
Mitigation:
the explicit isssue settings and project policy are still honored as
before.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:33:55 -07:00
Devin Foley 09eceb952a Avoid resuming stale remote sessions (Pi adapter) (#5120)
> **Stacked PR (part 7 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
  - PR #5118
  - PR #5119
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Pi adapter persists a session jsonl per agent so subsequent runs
resume
>   conversation context instead of starting cold
> - SSH testing reproduced a real failure: a verification issue reached
terminal
>   `done` and the agent claimed success, but the proof artifact
> `manual-qa/environment-matrix/ssh/pi_local.md` was missing from the
realized
>   SSH workspace on the QA target box
> - Root cause: the saved session header recorded a different cwd than
the new
> execution cwd, but the resume eligibility check only compared
session-params
> cwd via local-style `path.resolve` (which doesn't roundtrip on remote
POSIX
> paths). The stale session got resumed and writes landed in the wrong
cwd
> - This PR tightens resume eligibility for remote targets: it adds
remote-aware
> cwd normalisation, reads the first line of the session jsonl over SSH
(`head
>   -n 1`) to verify the saved header cwd, and only resumes when both
> session-params cwd *and* the on-disk header cwd match the realised
execution
>   cwd. Stale sessions are skipped silently and the run starts cold
> - The benefit is that Pi runs across cwd-changing environments stop
> accidentally resuming each other's sessions, and proof artifacts land
where
>   reviewers expect them

## What Changed

- Added `normalizeExecutionCwd`, `executionCwdsMatch`,
`readSessionHeaderCwd`,
  and `readSavedSessionCwd` helpers in `pi-local/src/server/execute.ts`
- `readSavedSessionCwd` reads the first line of the session jsonl —
locally via
`fs.readFile`, remotely via `runAdapterExecutionTargetShellCommand`
(`head -n 1`)
- Resume eligibility now requires:
  1. Saved session id is non-empty
  2. Execution target shape matches (existing check)
  3. Session-params cwd matches the realised execution cwd
4. Session-header cwd (from the on-disk jsonl) matches the realised
execution cwd
- Stale sessions are skipped silently (run starts cold) instead of
resumed
- `execute.remote.test.ts` extended with: matching header → resume;
mismatched
header → start fresh; missing/unreadable header → start fresh; remote
head
  command failure → start fresh

## Verification

- `pnpm --filter @paperclipai/adapter-pi-local test`
- `pnpm test -- pi-local`
- Manual QA: ran a Pi agent twice in two different remote cwds,
confirmed
the second run did not pick up the first run's session and that
subsequent
  runs in the original cwd still resumed correctly

## Risks

- Adds a `head -n 1` shell call per Pi run on remote targets. Negligible
  latency (single read of session jsonl), bounded by 15s timeout.
- If the `head` call fails for unrelated reasons (transient remote
unreachability), the run will start cold instead of resuming. This is
the
safe default but worth noting — operators may see one extra cold run if
a
  remote glitches mid-session.
- No data is deleted or migrated; stale sessions remain on disk for
manual
  inspection if desired.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:51:38 -07:00
Devin Foley d22e790bd4 Validate remote model probes on execution target (OpenCode) (#5119)
> **Stacked PR (part 6 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
  - PR #5118
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The OpenCode adapter validates that its configured model exists
before letting
>   a run start so misconfiguration fails fast with a clear error
> - SSH testing reproduced an OpenCode failure where issues stayed
`backlog`,
>   timed out, and produced no comments. The root cause was in
> `packages/adapters/opencode-local/src/server/execute.ts`: the local
model
> guard `ensureOpenCodeModelConfiguredAndAvailable(...)` only ran when
execution
> was *not* remote, so SSH OpenCode bypassed it and failed silently
later
> - Subsequent testing surfaced a related remote-only failure where the
probe
> (when wired up naively) hits `EACCES: permission denied, mkdir
'/var/folders'`
> on the SSH box because of how OpenCode's runtime config picks a
tempdir
> - This PR runs the model probe on the actual execution target —
`opencode
> models` via `runAdapterExecutionTargetProcess` — instead of the local
CLI,
> parses the output with the shared `parseOpenCodeModelsOutput` helper,
and
> reports a concrete error naming the offending model and a sample of
available
>   remote models when the configured model isn't present
> - The benefit is that mismatched OpenCode models surface as a clear
pre-flight
> error referencing the remote target instead of a silent run that never
leaves
>   `backlog`

## What Changed

- Added `ensureRemoteOpenCodeModelConfiguredAndAvailable` in
  `opencode-local/src/server/execute.ts` that runs `opencode models` via
`runAdapterExecutionTargetProcess` and validates the configured model is
in
  the parsed output
- `models.ts` now exports `parseOpenCodeModelsOutput` and
`requireOpenCodeModelId`
  so the remote path can reuse them
- `execute.ts` calls the remote variant when `executionTargetIsRemote`,
otherwise
  the existing local `ensureOpenCodeModelConfiguredAndAvailable`
- Errors include the offending model id and a sample of available remote
models
  so the operator knows exactly what's missing
- `execute.remote.test.ts` extended with cases for: probe timeout, probe
  non-zero exit, empty model list, and missing-model error

## Verification

- `pnpm --filter @paperclipai/adapter-opencode-local test`
- `pnpm test -- opencode-local`
- Manual QA: configured an OpenCode agent with a model that exists
locally but
not in the remote sandbox, and confirmed the new error fires before the
run
  starts and references the remote target

## Risks

- New behaviour: remote model validation adds a `~20s timeout` `opencode
models`
call on every remote run start. For most environments this is fast, but
a
network-slow sandbox could see startup latency rise. Timeout is bounded.
- If the remote CLI is missing or misconfigured, the new error replaces
the old
generic startup failure — clearer message, but the failure point shifts
earlier. Monitor for any QA flows that relied on the old failure shape.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:34:09 -07:00
Devin Foley 856c6cb192 Fix remote workspace environment shaping (#5118)
> **Stacked PR (part 5 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run with a Paperclip-shaped environment
(`PAPERCLIP_WORKSPACE_CWD`,
> worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can
locate the
>   correct project tree
> - SSH testing reproduced a real failure: a Codex SSH run wrote to
> `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the
realized
> remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...`
> because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into
the
>   remote env
> - Code review on the initial codex-only fix asked to roll the same
approach
> into every other SSH-capable adapter (claude, acpx, cursor, opencode,
gemini,
>   pi) via a shared helper rather than duplicating per-adapter
> - This PR adds `shapePaperclipWorkspaceEnvForExecution` in
adapter-utils that,
> when the execution target is remote: replaces local cwd with the
realized
> execution cwd, nulls out worktree path (which has no remote meaning),
and
> rewrites/strips `cwd` entries in workspace hints based on what was
actually
>   synced. Every adapter calls it before invoking the remote runner
> - The benefit is that remote runs see the realized remote workspace,
host-local
> paths stop leaking into remote env, and the rule is unit-tested in one
place

## What Changed

- Added `shapePaperclipWorkspaceEnvForExecution` to
  `packages/adapter-utils/src/server-utils.ts` with full unit coverage
  (`server-utils.test.ts`)
- Each of acpx-local, claude-local, codex-local, cursor-local,
gemini-local,
opencode-local, pi-local now calls the new shaper before issuing the
remote
  command and feeds the shaped values into `applyPaperclipWorkspaceEnv`
- Per-adapter `execute.remote.test.ts` files extended to cover the new
shaping
  behaviour: localhost paths replaced with remote cwd, foreign-cwd hints
  stripped, worktree path nulled out for remote targets
- `acpx-local/src/server/execute.test.ts` extended with shaping coverage

## Verification

- `pnpm test -- server-utils execute.remote`
- `pnpm --filter @paperclipai/adapter-acpx-local test`
- Manual QA reproducing the original failure:
  1. Provision an E2B sandbox environment for the Paperclip QA company
2. Assign an issue to a remote-targeted claude-local agent and confirm
the
run starts in the correct remote cwd (no `/Users/...` path leakage in
the
     run logs)
  3. Repeat for opencode-local and pi-local

## Risks

- Behavioural shift: hints whose `cwd` doesn't match the workspace cwd
are now
stripped on remote targets. If any adapter relied on a leaked local hint
cwd,
it will see a missing `cwd` instead. Reviewed all current callers — none
do.
- Adds a small per-run cost (path resolve + string normalisation) on
every remote
  execution. Negligible.
- Worktree path is now nulled out on remote (it has no meaning there).
Adapters
  that previously read the value defensively will continue to work.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:17:52 -07:00
Devin Foley bb7d040894 Switch OpenCode to explicit static/local-aware model selection (#5117)
> **Stacked PR (part 4 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When creating an OpenCode-local agent, Paperclip currently validates
> `adapterConfig.model` against the *Paperclip host's* `opencode models`
output
> - SSH testing surfaced that this blocks creating an OpenCode agent for
an SSH
> environment: the model that exists on the SSH target isn't visible to
the
> host, so creation fails with "OpenCode requires `adapterConfig.model`
in
> provider/model format" even when the operator picked a real remote
model
> - The initial direction was environment-aware model discovery; the
final
> decision was to keep OpenCode on the same explicit-model pattern as
other
> adapters (default + curated list + manual override) and stop blocking
>   creation on host-side discovery
> - This PR does both: the adapter-models endpoint now accepts
`environmentId` and
> probes against the target environment, and the create-time hard gate
is
> replaced by `requireOpenCodeModelId` which validates `provider/model`
*format*
> without requiring host-local discovery. Test/run-time still surfaces
real
>   auth/availability problems
> - The benefit is that operators can create OpenCode agents for remote
> environments without out-of-band setup, and the model picker in the UI
>   reflects the actually-targeted environment

## What Changed

- Added `requireOpenCodeModelId(input)` in
`opencode-local/src/server/models.ts`,
  exported it from the adapter index
- `ensureOpenCodeModelConfiguredAndAvailable` now delegates the format
check to
  `requireOpenCodeModelId`
- `agentsApi.adapterModels(companyId, adapterType, { environmentId })`
now accepts
  an environment ID and passes it as a query parameter
- `queryKeys.agents.adapterModels` now keys on `(companyId, adapterType,
environmentId)`
- `server/src/routes/agents.ts` reads and validates the new query
parameter,
  forwarding it to the adapter's model probe
- `AgentConfigForm.tsx` and `OnboardingWizard.tsx` build the model query
key from
the currently selected default environment ID and disable autodetect for
  `opencode_local` (model selection is explicit)
- `NewAgent.tsx` simplified — no longer special-cases OpenCode
autodetect
- `company-portability.ts` no longer needs OpenCode-specific autodetect
handling
- Tests added/updated:
  `adapter-model-refresh-routes.test.ts`, `adapter-models.test.ts`,
`agent-permissions-routes.test.ts`,
`opencode-local/src/server/models.test.ts`

## Verification

- `pnpm --filter @paperclipai/server test -- adapter-models
adapter-model-refresh agent-permissions`
- `pnpm --filter @paperclipai/adapter-opencode-local test`
- `pnpm --filter @paperclipai/ui test -- AgentConfigForm
OnboardingWizard NewAgent`
- Manual QA in browser:
1. Boot Paperclip on Tailscale-bound port (so it's reachable from
another
machine), create an OpenCode-local agent, switch the default environment
between two installed sandboxes, and confirm the model list refreshes
     per-environment
  2. Submit with a malformed `provider/model` string and verify the new
     `requireOpenCodeModelId` error surfaces
- Before/after screenshots attached for `AgentConfigForm` model picker

## Risks

- Behavioural shift: switching default environment now triggers a model
refetch.
Should be cheap but introduces a new UI loading state for OpenCode
users.
- Removing dynamic autodetect for OpenCode: if any user configured an
agent
without specifying `model` and relied on autodetect populating it, that
agent
will now fail at submit time. Mitigation: validation error is explicit
and
  actionable.
- New query string parameter on `/api/companies/:id/adapter-models` —
older
clients that omit it still work (parameter is optional and defaults to
null).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:01:34 -07:00
Devin Foley 076067865f Migrate SSH environment callback to bridge (#5116)
> **Stacked PR (part 3 of 7).** Depends on:
  - PR #5114
  - PR #5115
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on a remote SSH-backed environment need a way to
call back into
>   the Paperclip control plane (run events, log streaming, signals)
> - When the SSH host can't reach the Paperclip host (NAT, firewalls, or
simply not
> on the same network), the run silently fails or hangs — a recurring
class of
>   failure during SSH testing
> - In sandboxed environments we already solved this with a callback
bridge that
> tunnels back through the existing connection; SSH was the odd one out
> - This PR migrates SSH execution to use the same callback bridge, so
every
> adapter's remote run uses one consistent reverse-channel. Per-adapter
SSH glue
> is deleted in favour of a shared `CommandManagedRuntimeRunner` built
from the
>   SSH spec
> - The benefit is fewer SSH-specific failure modes, a smaller code
surface, and
>   one place to evolve the callback contract going forward

## What Changed

- Added `createSshCommandManagedRuntimeRunner` in
`packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a
generic
  command-managed-runtime runner (with cwd, env, and timeout handling)
- Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge
URL now flows
  through the shared runner
- Reworked `execution-target.ts` to use the SSH runner alongside sandbox
runners
  via a unified `CommandManagedRuntimeRunner` interface
- Simplified `remote-managed-runtime.ts` and
`sandbox-managed-runtime.ts` to consume
  the shared runner abstraction
- Deleted per-adapter SSH callback wiring from claude-local,
codex-local,
  cursor-local, gemini-local, opencode-local, pi-local execute.ts files
- Removed `environment-runtime-driver-contract.test.ts` (the contract is
now
  enforced by `environment-execution-target.test.ts`)
- Added/updated `execute.remote.test.ts` cases for each adapter to cover
the SSH
  runner path

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm test -- execute.remote` (covers all six local adapters' SSH
paths)
- Manual QA: ran a claude-local agent against an SSH-backed environment,
confirmed
the agent successfully called back to `/api/agent-callback/*` endpoints
during
  the run

## Risks

- Refactor touches all six local adapters. If any adapter had subtle
SSH-specific
behaviour that wasn't captured in tests, it could regress. Mitigation:
each
  adapter's `execute.remote.test.ts` was extended.
- `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking
type change
for any internal consumer. Verified no external plugins consume this
type.
- The new `CommandManagedRuntimeRunner` shape is a public surface in
`@paperclipai/adapter-utils`; downstream plugins implementing custom
runners may
  need updates, but no such plugins exist in this repo.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:43:52 -07:00
Devin Foley a7b45938b7 Let sandbox providers declare shell defaults (#5114)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
>   providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
>   provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
>   provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
>   helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
>   preference

## What Changed

- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
  else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
  and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
  and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
  `shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
  startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
  (validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
  external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
  `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
  `environment-execution-target.test.ts`

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
  shell semantics flow through the callback bridge end-to-end)

## Risks

- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
  providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
  Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
Dotta 15eac43b43 [codex] Retry max-turn exhausted heartbeats (#5096)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies, and
heartbeat execution is the control-plane loop that keeps assigned work
moving.
> - Max-turn exhaustion is a recoverable local-adapter stop condition
for Claude and Gemini agents when a run needs another heartbeat to
continue safely.
> - The previous behavior could leave max-turn continuation details hard
to inspect, and duplicate/stale continuation wakes could keep running
after issue state changed.
> - The adapter layer also needed to avoid trusting arbitrary
stdout/stderr text as scheduler control metadata.
> - This pull request adds bounded max-turn continuation scheduling,
visible retry state, structured stop metadata handling, and
stale/duplicate continuation guards.
> - The benefit is safer automatic continuation after max-turn stops,
clearer operator visibility, and fewer duplicate or stale agent runs.

## What Changed

- Replaces closed PR #4952, whose head repository was deleted.
- Rebases the recovered max-turn continuation branch onto current
`paperclipai/paperclip:master`.
- Adds max-turn continuation scheduling and retry-state plumbing for
heartbeat runs.
- Adds stale/duplicate continuation suppression when issue status,
ownership, or execution locks change.
- Normalizes Claude/Gemini max-turn detection around structured stop
metadata instead of unstructured stdout/stderr text.
- Surfaces max-turn continuation settings and retry visibility in the
board UI.
- Adds focused server, adapter, and UI tests for max-turn stop metadata,
retry scheduling, stale queued-run invalidation, adapter
parsing/execution, run ledger display, and agent config patching.

## Verification

- `pnpm install --no-frozen-lockfile` to refresh local dependencies
after rebasing onto current `master`.
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/claude-local-adapter.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter.test.ts
server/src/__tests__/gemini-local-execute.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts ui/src/lib/runRetryState.test.ts
--testTimeout=20000`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck && pnpm
--filter @paperclipai/adapter-gemini-local typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`
- UI screenshot note: the UI changes are limited to config/ledger state
rendering rather than layout changes; component/unit coverage above
verifies the rendered behavior.

## Risks

- Medium behavior risk: heartbeat retry gating now suppresses max-turn
continuations when issue state or execution locks drift, so any callers
that relied on stale continuations running will now see cancellation
instead.
- Low adapter risk: Claude/Gemini unstructured text no longer triggers
max-turn scheduler metadata, so only structured stop signals and Gemini
exit code 53 are trusted.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5-class model, tool-enabled local
repository editing and command execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: state/default rendering only; covered by
component/unit tests)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no user-facing command or docs contract changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 11:30:48 -05:00
Dotta 57229d0f24 [codex] Add issue monitor liveness controls (#4988)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies where work
must stay observable, governable, and recoverable.
> - The task/heartbeat subsystem owns agent execution continuity, issue
state transitions, and visible recovery behavior.
> - Waiting on an external service is not the same as being blocked when
the assignee still owns a future check.
> - The gap was that agents had no first-class one-shot monitor state
for external-service waits, so recovery could look stalled or require ad
hoc comments.
> - This pull request adds bounded issue monitors that can wake the
owner, clear exhausted waits, and produce explicit recovery behavior.
> - It also surfaces monitor status in the board UI and documents when
to use monitors versus `blocked`.
> - The benefit is clearer liveness semantics for asynchronous waits
without weakening single-assignee task ownership.

## What Changed

- Added issue monitor fields, shared types, validators, constants, and
an idempotent `0075` migration for scheduled monitor state.
- Added server-side monitor scheduling, dispatch, recovery bounds,
activity logging, and external-ref redaction.
- Added board/agent route coverage for monitor permissions and child
monitor scheduling.
- Added issue detail/property UI for monitor state, a monitor activity
card, and Storybook stories for review surfaces.
- Documented monitor semantics and recovery policy behavior in
`doc/execution-semantics.md`.
- Addressed Greptile review feedback by preserving monitor state in
skipped-stage builders and making board monitor saves send `scheduledBy:
"board"`.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issue-execution-policy-routes.test.ts
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/lib/activity-format.test.ts`
- First run passed 5 files and failed to collect 2 server suites because
the worktree was missing the optional `acpx/runtime` dependency.
- After `pnpm install --frozen-lockfile`, reran the 2 failed suites
successfully.
- `pnpm exec vitest run
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck
&& pnpm --filter @paperclipai/ui typecheck`
- `pnpm exec vitest run
server/src/__tests__/issue-execution-policy.test.ts
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/server typecheck && pnpm --filter
@paperclipai/ui typecheck`
- `pnpm exec vitest run
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- Storybook screenshot captured from
`http://127.0.0.1:6006/iframe.html?viewMode=story&id=product-issue-monitor-surfaces--monitor-surfaces`
with Playwright.

## Screenshots

![Issue monitor Storybook
surfaces](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2945-when-a-task-is-waiting-for-an-_external-service_-what-state-should-it-be-in-and-what-recovery-method-could-it-h/docs/pr-screenshots/pap-2945/monitor-surfaces.png)

## Risks

- Medium: this changes heartbeat recovery behavior for scheduled
external-service waits, so regressions could affect wake timing or
recovery issue creation.
- Migration risk is reduced by using `IF NOT EXISTS` for the new issue
monitor columns and index.
- External monitor references are treated as secret-adjacent and are
intentionally omitted from visible activity/wake payloads.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and terminal
execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or Storybook review surfaces
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 08:58:53 -05:00
Dotta 76f09c8eb6 [PAP-3180] Move workspace switcher into sidebar (#4981)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - The board UI needs a clear persistent way to move between company
workspaces.
> - The previous layout kept company switching in a separate left rail,
which made the sidebar feel split between workspace selection and
navigation.
> - The workspace switcher belongs in the sidebar header so navigation
and workspace context stay together.
> - This pull request removes the separate company rail from the layout
and turns the sidebar company menu into the primary workspace switcher.
> - The benefit is a cleaner sidebar structure that keeps workspace
identity, switching, company actions, and navigation in one place.

## What Changed

- Removed the standalone `CompanyRail` from the main layout.
- Added the company/workspace switcher to the default, company settings,
and instance settings sidebars.
- Expanded `SidebarCompanyMenu` to list active workspaces, indicate the
current workspace, navigate out of instance settings when switching, and
expose add-company onboarding.
- Updated focused component tests for the new workspace-switcher
behavior.

## Verification

- `pnpm --filter @paperclipai/ui exec vitest run
src/components/SidebarCompanyMenu.test.tsx
src/components/CompanySettingsSidebar.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `git diff --check`
- Visual smoke attempted against the managed dev server at
`http://127.0.0.1:57385`; a fresh browser context reached the
authenticated sign-in screen, so I could not capture an authenticated
sidebar screenshot from this heartbeat.

## Risks

- Low-to-medium UI risk: this changes the primary sidebar structure and
workspace-switching entry point.
- The instance-settings switch behavior now routes back to the selected
company dashboard when a workspace is selected.
- No migrations, API contracts, or lockfile changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled, medium reasoning mode.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-02 08:13:53 -05:00
Dotta 685ee84e4a [codex] Document terminal bench dispatch config (#4961)
## Thinking Path

> - Paperclip agents rely on skills for repeatable operating procedures
> - The Terminal-Bench loop skill needs to preserve enough dispatch
configuration to reproduce real heartbeat behavior
> - A bare benchmark command can create unassigned work with no
heartbeat-enabled agent, which is a harness setup failure rather than
product evidence
> - The Paperclip heartbeat skill also needs to keep escalation biased
toward agent-owned follow-through
> - This pull request documents dispatch runner config requirements and
strengthens the agent follow-through rule
> - The benefit is fewer misleading benchmark loops and clearer agent
operating guidance

## What Changed

- Documented `PAPERCLIP_HARBOR_RUNNER_CONFIG` / runner dispatch config
as required Terminal-Bench loop input.
- Updated the Terminal-Bench loop smoke check to require the dispatch
config mention.
- Added stronger Paperclip skill guidance to avoid asking humans for
work an agent can perform.

## Verification

- `pnpm smoke:terminal-bench-loop-skill`

## Risks

- Low risk: documentation and smoke expectation changes only. The
stricter smoke assertion is intentional so future edits do not drop the
dispatch config requirement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 12:00:47 -05:00
Dotta d7719423e9 [codex] Harden non-system database backup schemas (#4960)
## Thinking Path

> - Paperclip is a control plane whose database is the durable audit and
work record
> - Database backup needs to include operator/plugin schemas while
excluding PostgreSQL-owned internals
> - PostgreSQL reserves the `pg_` schema prefix for system schemas,
including temp and toast variants
> - A single escaped `pg_` prefix predicate is less brittle than
enumerating individual `pg_toast` and `pg_temp` forms
> - This pull request tightens non-system schema discovery for logical
backups without changing the normal user/plugin schema path

## What Changed

- Replaced narrow `pg_toast` and `pg_temp` schema exclusions with an
escaped `pg_` reserved-prefix exclusion.
- Kept `information_schema` excluded from logical backup metadata
discovery.
- Addressed Greptile feedback by removing redundant no-op additions from
the prior iteration.

## Verification

- `pnpm exec vitest run packages/db/src/backup-lib.test.ts`
- PR checks on the latest pushed head: policy, verify, e2e, Greptile
Review, and Snyk

## Risks

- Low risk: PostgreSQL reserves `pg_` schema names for system use, so
this should only exclude database-owned internals that should not be
restored from Paperclip logical backups.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:59:53 -05:00
Dotta fe401b7fa9 [codex] Polish inbox nested issue UI (#4959)
## Thinking Path

> - Paperclip orchestrates AI agents through issue lists and
issue-thread interactions
> - The inbox must preserve nested issue visibility and keyboard
navigation as work decomposes into deeper sub-issues
> - Some UI polish issues made nested rows harder to scan and pending
question cancellation less covered
> - The issue list also had a small test indentation regression around
load-more behavior
> - This pull request tightens nested inbox rendering and related
issue-thread/list polish
> - The benefit is a more reliable operator inbox for multi-level work
trees

## What Changed

- Included nested grandchild issues in inbox keyboard navigation and
recursive row rendering.
- Sort parent rows by descendant activity so active subtrees remain
visible.
- Removed extra inbox card background styling in favor of the page
surface.
- Added regression coverage for pending question cancellation.
- Cleaned up the issue-list load-more test indentation.

## Verification

- `pnpm exec vitest run ui/src/lib/inbox.test.ts
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx`
- Screenshots were not captured in this PR split; the visible flow is
covered by focused component/helper tests and should get browser QA in
the follow-up issue.

## Risks

- Medium risk: nested inbox rendering and keyboard navigation are
user-visible. The changes are localized to inbox grouping/rendering
helpers and covered by targeted tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:58:53 -05:00
Dotta 2d72292ad6 [codex] Add workspace routine run tab (#4958)
## Thinking Path

> - Paperclip orchestrates AI agents through reusable execution
workspaces and routines
> - Operators need a fast way to run workspace-aware routines against a
specific execution workspace
> - The existing workspace detail surface showed configuration, runtime
logs, and linked issues, but not routines that depend on workspace
variables
> - Routine runs also needed to prefill the selected execution workspace
so branch variables resolve correctly
> - This pull request adds a workspace routines tab and prefilled
routine-run dialog support
> - The benefit is a tighter workflow for rerunning reviews, smoke
checks, and other workspace-specific routines

## What Changed

- Added an execution workspace `Routines` tab and company-prefixed
routes.
- Listed routines that declare or reference workspace-specific
variables.
- Added `Run now` support that preselects the current execution
workspace in `RoutineRunVariablesDialog`.
- Centralized reusable execution workspace ordering/deduplication for
issue creation and workspace cards.
- Added focused UI helper and dialog regression tests.

## Verification

- `pnpm exec vitest run ui/src/lib/reusable-execution-workspaces.test.ts
ui/src/lib/workspace-routines.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx
ui/src/lib/company-routes.test.ts`
- Screenshots were not captured in this PR split; the visible flow is
covered by focused component/helper tests and should get browser QA in
the follow-up issue.

## Risks

- Medium risk: this adds a new workspace detail tab and routine-run
path. It is isolated to workspace-scoped routines and uses existing
routine run APIs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:58:15 -05:00
Dotta 570a4206da [codex] Recover productive terminal continuations (#4956)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-scoped heartbeat runs
> - Recovery logic decides whether in-progress work still has a live
path after a terminal run
> - A productive terminal continuation can still leave an issue stranded
when no active run or wake remains
> - Treating that state as healthy leaves work stuck despite evidence
that more action is needed
> - This pull request re-enqueues recovery for productive terminal
continuations that left no live path
> - The benefit is fewer silently stranded in-progress issues after
agents make partial progress

## What Changed

- Reclassified successful-but-productive terminal continuations as
recoverable when no live path remains.
- Enqueue a follow-up recovery wake with the original run id and
continuation metadata.
- Added regression tests covering productive terminal continuation
recovery and advanced liveness handoff.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/run-continuations.test.ts`

## Risks

- Medium risk: recovery may schedule one more follow-up where Paperclip
previously considered the work observed. The existing uniqueness,
budget, and escalation checks still constrain retry loops.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:57:23 -05:00
Dotta 3cd26a78fc [codex] Surface live run comment context (#4957)
## Thinking Path

> - Paperclip orchestrates AI agents through issue comments and
heartbeat runs
> - The board UI needs to distinguish a comment that triggered a live
run from comments queued after that run started
> - The run payload already stores comment context, but active-run API
responses did not expose the ids the UI needs
> - Without those ids, the triggering comment can flash as queued while
the agent is already responding to it
> - This pull request exposes live-run comment context and teaches the
optimistic comment helper to ignore the trigger comment
> - The benefit is clearer issue-chat state during comment-triggered
agent interruptions

## What Changed

- Added `contextCommentId` and `contextWakeCommentId` to active/live run
payloads.
- Threaded those ids through server routes, heartbeat summaries, UI API
types, and issue detail rendering.
- Updated optimistic comment classification to avoid marking the
triggering comment as queued.
- Added server and UI regression coverage.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-live-run-routes.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low-to-medium risk: adds optional fields to existing run payloads.
Existing consumers should ignore unknown fields, and UI handling is
null-safe.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:44:11 -05:00
Dotta e8275318ba [codex] Raise agent heartbeat concurrency default (#4954)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agent heartbeat settings control how much parallel work one employee
can run
> - The previous default of 5 concurrent runs was too restrictive for
active local agent teams
> - The shared default, heartbeat clamp, docs, and route/import/UI
expectations need to agree
> - This pull request raises the default heartbeat concurrency to 20
while keeping explicit headroom up to 50 for power users
> - The benefit is higher throughput for agent teams without each new
agent needing manual runtime config edits

## What Changed

- Raised `AGENT_DEFAULT_MAX_CONCURRENT_RUNS` from 5 to 20.
- Raised the heartbeat service max clamp from 10 to 50, keeping the new
default below the ceiling.
- Updated V1 implementation docs and tests that assert default
imported/exported runtime config.
- Updated the new-agent UI runtime config test to assert the shared
default constant instead of duplicating the numeric value.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/company-portability.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`

## Risks

- Medium risk: new agents can consume more local execution capacity by
default. The heartbeat scheduler still respects configured max
concurrency and budget/pause controls, and operators can lower or raise
the per-agent cap within the `1..50` clamp.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:42:56 -05:00
Dotta e273d621fc [PAP-3154] Stop padding /live-runs by default (#4963)
## Summary
- Fix [PAP-3154](/PAP/issues/PAP-3154): the Sidebar's "Dashboard NN
live" badge showed a constant 50 in every company because `GET
/api/companies/:companyId/live-runs` was padding its response with up to
50 recent (non-live) heartbeat runs whenever the caller did not pass
`minCount`.
- Regression introduced by
[#4875](https://github.com/paperclipai/paperclip/pull/4875) (commit
`6445bef9`), which capped both `minCount` and `limit` at 50 with a
fallback of 50 for omitted values. The cap is correct for `limit` (real
unboundedness guard); for `minCount` it conflates "no padding" with "pad
to the cap".
- Default `minCount` to 0 so callers asking for "live runs" only get
actually-live runs unless they explicitly request padding
(`ActiveAgentsPanel` is the only caller that does). Keep `limit` capped
at 50 by default.

## Test plan
- [x] `pnpm exec vitest run
server/src/__tests__/agent-live-run-routes.test.ts` — 7/7 pass,
including new tests for the no-pad default and explicit padding.
- [x] `pnpm exec vitest run ui/src/components/Sidebar.test.tsx
ui/src/components/ActiveAgentsPanel.test.tsx
ui/src/api/heartbeats.test.ts` — 6/6 pass.
- [ ] Verify in dev: with ~8 truly-live runs in a company, the sidebar
Dashboard badge shows the real count (not 50).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:33:13 -05:00
Dotta 42a299fb9d [codex] Bound productivity review recovery loops (#4948)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat/productivity review subsystem detects when assigned
work is likely stuck or churning.
> - Productivity reviews are useful, but repeated reconciliation can
create noisy refresh comments or repeated review issues around the same
source issue.
> - That makes manager follow-up harder because the signal can get
buried under duplicate review activity.
> - This pull request bounds productivity review refreshes and creation
loops while preserving the existing escalation path.
> - The benefit is a quieter recovery loop that still surfaces stuck or
high-churn work for manager attention.

## What Changed

- Added refresh throttling for open productivity review issues,
including a one-hour default interval and a maximum of three refresh
comments per open review.
- Added a rolling 24-hour creation cap so completed/closed reviews
cannot immediately recreate review issues indefinitely for the same
source issue.
- Excluded cancelled productivity reviews from the creation cap so
manager cancellations do not silently suppress future legitimate
reviews.
- Preserved productivity review timestamps in deterministic test paths
and added targeted coverage for immediate refresh suppression, refresh
caps, creation caps, and cancelled-review exclusion.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- `pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- Greptile Review: 5/5 on commit
`bcf25832d0ffae25890b2ee7eed112d1c2d114fe` with review threads resolved.
- GitHub PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`Greptile Review`, and `security/snyk (cryppadotta)`.
- Verified the branch is rebased onto `public-gh/master` with no
conflicts.
- Verified the diff does not include `pnpm-lock.yaml`, database schema
changes, or migrations.

## Risks

- Low-to-medium risk: this changes automation cadence for productivity
reviews. A truly stuck issue may receive fewer repeated refresh
comments, but the original review issue remains open and assigned for
manager action.
- No migration risk: this is server logic and tests only.

> Checked [`ROADMAP.md`](ROADMAP.md) for overlapping planned core work;
this is a targeted recovery-loop fix and does not add a new roadmap
feature.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family, tool-using software
engineering mode. Exact context window is not exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable; server-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable; no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 08:32:04 -05:00
Devin Foley d2dd759caa plugins: make e2b template default explicit (#4901)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Remote execution environments are part of that control plane,
including sandbox-provider plugins like E2B
> - The E2B provider already normalizes config and runtime behavior
around a `base` template default
> - But the manifest still presented `template` as required, which
forces redundant operator input and makes the UI contract stricter than
runtime behavior
> - That mismatch showed up while building a repeatable QA workflow for
sandbox testing
> - This pull request makes the manifest and validation contract line up
with the existing `base` default
> - The benefit is a simpler and more accurate E2B environment setup
experience

## What Changed

- Removed the E2B manifest's `required: ["template"]` requirement so the
config schema matches runtime behavior
- Clarified the manifest description to say the template defaults to
`base` when omitted
- Added a focused unit test proving that validation normalizes a missing
template to `base`

## Verification

- Ran the focused E2B plugin test for the new behavior:
- `cd packages/plugins/sandbox-providers/e2b && pnpm test --
--testNamePattern "defaults a missing template to base"`

## Risks

- Low risk. This only loosens the schema to match the plugin's existing
runtime normalization and adds a test for that path.
- The broader E2B plugin suite currently has unrelated existing failures
outside this change; this PR does not modify those paths.

## Model Used

- OpenAI Codex, GPT-5 Codex via Codex CLI agent tooling, large-context
coding workflow with terminal tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 22:43:24 -07:00
Devin Foley b02e67cea5 fix(ci): diff PR workflow paths from merge base (#4903)
## Thinking Path

> - Paperclip’s PR workflow is part of the control-plane safety surface
because it decides whether a branch is allowed to merge.
> - This issue started in that workflow: the lockfile and manifest
policy checks were diffing `base.sha..head.sha`, which incorrectly
treated unrelated `master` commits as if they belonged to the PR branch.
> - The right fix there is to diff from the PR merge base
(`base...head`) so policy checks only evaluate files introduced by the
branch itself.
> - Once that workflow fix was in place, `/checkpr` exposed a second
blocker on the PR merge ref: `verify` was failing in newer `master`-side
tests that were not part of the original branch diff.
> - The actionable repeated failure came from the ACPX local adapter
test suite, where a test hard-coded the managed Codex home under
`instances/default` even though the stable Vitest runner sets a
non-default `PAPERCLIP_INSTANCE_ID`.
> - This pull request now includes both the original CI diff-scope fix
and the targeted ACPX test fix so the PR’s actual checks align with
current base-branch execution.
> - The benefit is that the original false-positive lockfile failure is
removed, and the merge-ref verify path is hardened against the
instance-id isolation used in CI.

## What Changed

- Updated `.github/workflows/pr.yml` so the lockfile policy and manifest
policy steps diff `pull_request.base.sha...pull_request.head.sha` from
the merge base instead of using a two-dot base/head diff.
- Added an inline workflow comment explaining why the three-dot diff is
required for PR-scoped file detection.
- Updated `packages/adapters/acpx-local/src/server/execute.test.ts` so
the managed Codex home assertion uses a test-specific
`PAPERCLIP_INSTANCE_ID` instead of hard-coding `default`.
- Restored `PAPERCLIP_INSTANCE_ID` after that ACPX test finishes so the
test remains isolated and does not leak process env changes.

## Verification

- Reproduced the original false positive locally by comparing PR heads
`#4901` and `#4902` with the old `base..head` logic; both incorrectly
included `pnpm-lock.yaml` from unrelated `master` commits.
- Verified the new `base...head` logic reduces those PRs to only their
actual changed files and excludes `pnpm-lock.yaml`.
- Verified a real manifest-changing PR (`#4893`) still reports
`package.json` changes under the new logic.
- Ran `pnpm -r typecheck` successfully.
- Ran `pnpm vitest run
packages/adapters/acpx-local/src/server/execute.test.ts` successfully
after the ACPX test fix.
- Ran `pnpm vitest run packages/db/src/backup-lib.test.ts` successfully
against the merge-ref-related DB failure path observed during
`/checkpr`.
- Pushed commit `9520a976` and allowed PR `#4903` checks to rerun on the
updated branch.

## Risks

- Low risk: the workflow change only affects how PR policy checks
determine the changed file set.
- Low risk: the ACPX change is test-only and aligns the test with the
instance-isolation behavior already used by
`scripts/run-vitest-stable.mjs` in CI.
- The remaining operational risk is limited to other unrelated
merge-ref-only failures that were not reproduced in the targeted local
verification above.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5-codex`, via the Codex local adapter in Paperclip.
- Tool-using coding model with shell execution, git, GitHub CLI, and
repository inspection in a local worktree.
- Context included the current repo, the Paperclip task thread, PR check
output, and the isolated execution workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 21:22:40 -07:00
github-actions[bot] 6a7cca95ef chore(lockfile): refresh pnpm-lock.yaml (#4899)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-04-30 20:00:07 -05:00
Dotta 4272c1604d Add ACPX local adapter runtime (#4893)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a control plane
that can start, supervise, and recover agent runs.
> - Local adapters are the bridge between Paperclip issues and concrete
agent runtimes such as Claude, Codex, and other ACP-compatible tools.
> - The roadmap calls out broader “bring your own agent” and claw-style
agent support, and ACPX gives Paperclip one path to normalize multiple
ACP agents behind a single adapter.
> - The branch needed to become one reviewable PR against current
`paperclipai/paperclip:master`, without carrying stale base conflicts or
generated lockfile churn.
> - This pull request adds an experimental built-in `acpx_local`
adapter, integrates it through the server/CLI/UI adapter surfaces, and
adds regression coverage for runtime execution, skill sync, stream
parsing, diagnostics, and log redaction.
> - The benefit is that Paperclip can run Claude/Codex/custom ACP agents
through ACPX while keeping operator configuration, skills, logging, and
transcript rendering inside the existing adapter model.

## What Changed

- Added `@paperclipai/adapter-acpx-local` with server execution, config
schema, ACPX session handling, CLI formatting, UI config helpers, and
stdout parsing.
- Registered `acpx_local` across CLI, server, shared constants, UI
adapter metadata, adapter capabilities, and agent creation/editing
surfaces.
- Added ACPX runtime execution support with persistent sessions,
local-agent JWT environment handling, skill snapshots, runtime skill
materialization, and isolation/security regressions.
- Added ACPX adapter diagnostics and marked the adapter experimental in
the UI.
- Added command/env secret redaction for resolved command metadata in
adapter-utils, server event storage, and the Agent Detail invocation UI.
- Added Storybook coverage for ACPX config, transcript rendering, and
skill states, plus PR screenshots under `docs/pr-screenshots/pap-2944/`.
- Rebased the branch onto current `public-gh/master`; `pnpm-lock.yaml`
is intentionally not included and there are no migration/schema changes.

## Verification

- `pnpm exec vitest run
packages/adapters/acpx-local/src/server/execute.test.ts
packages/adapters/acpx-local/src/server/test.test.ts
packages/adapters/acpx-local/src/cli/format-event.test.ts
packages/adapters/acpx-local/src/ui/parse-stdout.test.ts
packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/redaction.test.ts
server/src/__tests__/acpx-local-execute.test.ts
server/src/__tests__/acpx-local-skill-sync.test.ts
server/src/__tests__/acpx-local-adapter-environment.test.ts
server/src/__tests__/adapter-routes.test.ts
server/src/__tests__/agent-skills-routes.test.ts
ui/src/adapters/metadata.test.ts` — 12 files, 87 tests passed.
- `pnpm --filter @paperclipai/adapter-acpx-local typecheck` — passed.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Confirmed PR diff does not include `pnpm-lock.yaml`, database schema
files, or migrations.

Screenshots:

![ACPX Claude skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-light.png?raw=true)
![ACPX Claude skills
dark](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-dark.png?raw=true)
![ACPX custom skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-custom-light.png?raw=true)

## Risks

- Medium risk: this introduces a new built-in adapter package and
touches runtime execution, adapter registration, agent config, skills,
and transcript rendering.
- ACPX and ACP agent behavior can vary by installed tool versions; the
adapter is marked experimental to set operator expectations.
- `pnpm-lock.yaml` is excluded per repository PR policy, so dependency
lock refresh must be handled by the repo’s automation or maintainers.
- No database migration risk: no schema or migration files changed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with repository tool use,
shell execution, git operations, and local verification. Exact hosted
context window was not exposed in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 19:57:05 -05:00
Dotta ad5432fece [codex] Harden issue recovery reliability (#4875)
## Thinking Path

> - Paperclip is the control plane for autonomous agent companies, so
non-terminal issue state must always have a clear live, waiting, or
recovery owner.
> - This change stays inside the server reliability and liveness
subsystem for assigned issue recovery, blocker attention, and live-run
polling.
> - Closed PR #4860 mixed this reliability work with separate
mutation-boundary policy changes, which made review and merge risk too
broad.
> - [PAP-2981](/PAP/issues/PAP-2981) asked for a replacement PR
containing only the remaining reliability slice and explicitly excluding
user-assignment and execution-policy restrictions.
> - Follow-up review also split `advanced` run-liveness continuation
behavior out of this PR so it can be reviewed separately.
> - The implementation hardens repeated recovery escalation, expands
blocker-attention coverage for explicit waiting and recovery paths, and
caps company live-run polling defaults.
> - The benefit is a smaller reliability PR that improves liveness
behavior without changing agent/user mutation authorization boundaries
or `advanced` continuation semantics.

## What Changed

- Avoid repeated liveness escalation updates when the source issue is
already blocked by the same open escalation.
- Treat open liveness escalation recovery issues, their source issues,
and their leaf blockers as covered waiting paths in blocker attention.
- Cap default company live-run polling at 50 rows for both `minCount`
and `limit`, including explicit zero values, to avoid unbounded
responses.
- Preserve the existing behavior where succeeded `advanced` runs are
considered productive/healthy for stranded-work recovery and are not
actionable bounded run-liveness continuations.
- Added focused server coverage for recovery dedupe, blocker attention,
liveness escalation, run continuations, and live-run polling.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/agent-live-run-routes.test.ts`
- Result: 5 files passed, 63 tests passed.
- `pnpm --filter @paperclipai/server typecheck`
- Result: passed.
- No UI changes; screenshots are not applicable.

## Risks

- Recovery and blocker-attention classification changes can affect which
blocked chains are shown as covered versus needing attention.
- Live-run polling now treats omitted, invalid, or non-positive `limit`
/ `minCount` values as the capped default of 50.
- `advanced` run-liveness continuation behavior is intentionally
excluded from this PR and split for separate review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 16:44:28 -05:00
Dotta a3de1d764d Add cheap model profiles for local adapters (#4881)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where
adapters are the boundary between the board, agents, and execution
runtimes.
> - Local adapters currently expose a primary runtime configuration, but
operators often need a cheaper model lane for routine or low-risk work.
> - That cheap lane has to stay adapter-owned: runtime profile settings
should not mutate the primary adapter config or bypass existing
auth/secret mediation.
> - Issue creation also needs an ergonomic way to request primary,
cheap, or custom model behavior for a selected assignee.
> - This pull request adds a first-class `cheap` model profile contract
across adapter capabilities, heartbeat config resolution, agent
configuration, and issue creation.
> - The benefit is cheaper task execution can be configured and
requested explicitly while preserving adapter boundaries, secret
handling, and audit visibility.

## What Changed

- Added adapter model-profile capability metadata and a `cheap` profile
contract for supported local adapters.
- Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during
heartbeat config resolution, including requested/applied/fallback run
metadata.
- Added agent configuration UI for cheap model profile settings without
writing those settings into primary `adapterConfig`.
- Added New Issue assignee model lane controls for Primary / Cheap /
Custom and request payload handling.
- Added run ledger profile badges and Storybook stories for the new
cheap-lane UI states.
- Added tests for validators, heartbeat model profile application,
permission/secret mediation, UI payload helpers, and run ledger
rendering.
- Added committed UI verification screenshots under
`docs/pr-screenshots/pap-2837/`.
- Addressed Greptile review feedback around cheap-profile defaults,
shared profile types, and fallback test data.

## Verification

Local:

- `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/heartbeat-model-profile.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts
ui/src/lib/issue-assignee-overrides.test.ts
ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103
tests.
- `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts
ui/src/components/IssueRunLedger.test.tsx` — passed after
Greptile/rebase follow-up, 2 files / 17 tests.
- `pnpm --filter @paperclipai/ui typecheck` — passed after
Greptile/rebase follow-up.
- `pnpm -r typecheck` — passed.
- `pnpm build` — passed.
- `pnpm test:run` — did not complete successfully in this local
worktree: it stopped in pre-existing `@paperclipai/adapter-utils`
sandbox/SSH fixture suites outside this PR diff. Failures were 5s local
timeouts plus `git init -b` unsupported by this machine's Git 2.21.0.
The branch-specific targeted suites above passed.
- Branch was fetched/rebased onto `public-gh/master`; `git rev-list
--left-right --count public-gh/master...HEAD` reports `0 9`.

Remote PR checks on latest head
`e30bf399146451c86cee98ed528d51d33fa5af5a`:

- `policy` — passed.
- `verify` — passed.
- `e2e` — passed.
- `Greptile Review` — passed, confidence score 5/5; Greptile review
threads resolved.
- `security/snyk (cryppadotta)` — passed.

Screenshots:

- [New issue cheap lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png)
- [New issue custom lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png)
- [New issue unsupported adapter
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png)
- [Run ledger model profile badges
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png)
- Mobile variants are also in `docs/pr-screenshots/pap-2837/`.

## Risks

- Medium: heartbeat config mediation now merges runtime model profiles
into adapter configs, so adapter secret normalization and host-command
restrictions must keep covering nested config paths.
- Medium: the UI adds another issue creation choice; unsupported
adapters must keep hiding the cheap lane and preserve primary behavior.
- Low migration risk: no database migration is included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use
and command execution. Exact served model/context window was not exposed
by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 15:32:04 -05:00
Dotta 1fe1067361 Polish board settings and skills workflow (#4863)
## Thinking Path

> - Paperclip's board UI and bundled skills are the operator layer for
configuring agents, routines, issue workflows, and local troubleshooting
loops.
> - The prior rollup mixed this operator polish with database backups,
backend reliability, thread scale, and cost/workflow primitives.
> - This pull request isolates the remaining board QoL, settings,
issue-detail integration, adapter config cleanup, and skills smoke
tooling.
> - It includes some integration-level overlap with the thread and
workflow slices so this branch can run from `origin/master` while still
preserving the full original work.
> - Preferred merge order is the narrower primitives first, then this
integration PR last.
> - The benefit is that reviewers can inspect the user-facing
board/settings/skills layer separately from backend infrastructure
changes.

## What Changed

- Added board/settings polish for agents, routines, company settings,
project workspace detail, and issue detail controls.
- Added agent/routine UI regression tests and New Issue dialog coverage.
- Integrated issue-detail activity/cost/interaction surfaces and leaf
work pause/resume controls.
- Cleaned bundled adapter UI config defaults and onboarding copy.
- Added terminal-bench loop and work-stoppage diagnosis skills plus a
smoke test script.
- Updated attachment type handling and Paperclip skill/API guidance.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/pages/Agents.test.tsx
ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx
ui/src/pages/IssueDetail.test.tsx
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts`
- Result: 7 test files passed, 54 tests passed.
- `pnpm run smoke:terminal-bench-loop-skill`
- Result: JSON output included `"ok": true` and `"cleanup": true`.
- UI screenshots not included because verification is focused
component/page coverage for the changed board surfaces.

## Risks

- This is the integration-heavy PR in the split and intentionally
overlaps some component/API primitives with the issue-thread and
workflow PRs so it can run from `origin/master`.
- Preferred merge order: #4859, #4860, #4861, #4862, then this PR last.
If earlier branches merge first, this PR may need a straightforward
conflict refresh in shared UI files.
- The terminal-bench smoke script creates temporary mock issues and
relies on cleanup; the verified run returned `cleanup: true`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 15:28:11 -05:00
Dotta c4269bab59 Add workflow interaction cancellation and issue cost summaries (#4862)
## Thinking Path

> - Paperclip coordinates work through issue-thread interactions, run
history, and cost telemetry.
> - Operators need workflow prompts to be cancellable and costs to be
visible at the issue level.
> - The earlier rollup mixed this workflow/cost work with database
backups, reliability recovery, thread scaling, and settings polish.
> - This pull request isolates the interaction and cost surfaces into a
reviewable slice.
> - The backend now supports cancelling pending question interactions
and summarizing issue-tree costs.
> - The UI component layer can render cancelled questions and interleave
activity with run ledger rows.

## What Changed

- Added `cancelled` as an issue-thread interaction status and result
shape for question interactions.
- Added the board-only `POST
/issues/:id/interactions/:interactionId/cancel` route and service
implementation.
- Added issue-tree cost summary support in the cost service and
`/issues/:id/cost-summary` API route.
- Extended shared cost exports and UI API/query keys for issue cost
summaries.
- Updated `IssueThreadInteractionCard` and `IssueRunLedger` components
for cancelled questions, issue cost surfaces, and activity/run
interleaving.
- Added focused server and component regression coverage.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- Result: 4 test files passed, 45 tests passed.
- UI screenshots not included because this PR updates reusable
components and API surfaces without wiring a new page-level layout.

## Risks

- Adds a new interaction terminal status; clients that switch
exhaustively on interaction status may need to handle `cancelled`.
- Issue-tree cost summaries use recursive issue traversal and should be
watched on unusually large issue trees.
- Page-level issue detail wiring is intentionally left to the board
QoL/issue-detail branch to keep this PR narrow.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:57:25 -05:00
Dotta 87f19cd9a6 Improve issue thread scale and markdown polish (#4861)
## Thinking Path

> - Paperclip's board UI is the operator surface for supervising
AI-agent companies.
> - Issue threads are where operators read progress, respond to agents,
inspect markdown, and jump through long histories.
> - Large threads and rich markdown had become difficult to navigate and
expensive to render.
> - The previous rollup mixed these UI scale fixes with unrelated
backend recovery, costs, backups, and settings changes.
> - This pull request isolates the issue-thread scale and markdown
polish work.
> - The benefit is a reviewable UI slice that can merge independently of
the backend reliability, database backup, workflow, and board QoL PRs.

## What Changed

- Virtualized long issue chat threads and stabilized
anchor/jump-to-latest behavior for large histories.
- Added incremental issue-list row loading and tests for
scroll-triggered pagination behavior.
- Hardened markdown body rendering and markdown editor behavior around
HTML tags, image drops, code-copy UI, and escaped newline handling.
- Added a long-thread measurement harness at
`scripts/measure-issue-chat-long-thread.mjs` plus
`perf:issue-chat-long-thread`.
- Added focused UI/lib regression coverage for thread rendering,
markdown, optimistic comments, and message building.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx
ui/src/components/MarkdownBody.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`
- Result: 6 test files passed, 170 tests passed.
- UI screenshots not included because this PR is covered by targeted
component tests and does not introduce a new page layout.

## Risks

- Virtualization changes can affect scroll anchoring in edge cases on
very long threads.
- Markdown/editor hardening changes are intentionally defensive, but
malformed content may render differently than before.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:18:01 -05:00
Dotta cd606563f6 Expand database backups to non-system schemas (#4859)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Reliable backups are part of operating that control plane safely.
> - The previous backup path was public-schema oriented and did not
clearly cover plugin-owned schemas or migration history.
> - Paperclip now has plugin database namespaces and Drizzle migration
state that must survive backup/restore.
> - This pull request expands logical database backups to non-system
schemas and documents the backup boundary.
> - The benefit is safer restore behavior for core and plugin-owned
database state without implying full filesystem disaster recovery.

## What Changed

- Include non-system database schemas in JavaScript and pg_dump backup
paths.
- Preserve enum, table, sequence, index, constraint, migration, and
plugin-schema objects across backup/restore.
- Add restore coverage for plugin-owned schemas and Drizzle migration
history.
- Clarify docs that DB backups are logical database backups, not full
instance filesystem backups.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/db/src/backup-lib.test.ts`
- Result: 1 test file passed, 4 tests passed.
- Confirmed this PR does not include `pnpm-lock.yaml` or
`.github/workflows/*` changes.

## Risks

- Medium: backup generation touches schema discovery and restore
ordering, so unusual database objects may need additional coverage
later.
- No migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use enabled, medium reasoning
effort. Exact hosted context-window details are not exposed in this
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: no UI changes are included in this PR, so screenshots are not
applicable.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 12:54:35 -05:00
Devin Foley c0ce35d1fb Improve E2B plugin configuration UX and fix execution timeouts (#4802)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - E2B is a sandbox provider plugin that runs agent code in isolated
cloud environments
> - Operators configure E2B through the plugin settings page
> - But the E2B API key configuration was unclear — the settings field
description didn't explain that pasted keys are auto-saved as company
secrets, and the fallback to the host `E2B_API_KEY` variable wasn't
documented
> - Additionally, long-running E2B sandbox commands were timing out
because the plugin environment RPC driver used a fixed timeout, and
environment commands competed for the single foreground command slot
> - This PR clarifies the E2B configuration UX, fixes RPC timeouts for
plugin environment execution, and runs E2B environment commands in
background mode to avoid blocking the foreground slot
> - The benefit is clearer E2B setup for operators and more reliable
sandbox command execution

## What Changed

- Updated E2B plugin manifest and settings UI to clarify API key
configuration — field description now explains that pasted keys are
saved as company secrets and documents the `E2B_API_KEY` host fallback
- Added test coverage for the plugin settings page rendering
- Fixed `plugin-environment-driver.ts` to pass the configured timeout
through to RPC calls instead of using a hardcoded default
- Updated `environment-runtime.ts` to propagate timeout from the
environment lease to the plugin driver
- Changed E2B sandbox command execution to use background handles so
long-running agent commands don't block the foreground slot needed by
the callback bridge

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to plugin settings, verify E2B API key field shows
the updated description text
- Manual: run an E2B-backed agent task with a long-running command,
verify it completes without RPC timeout

## Risks

- Low risk. Configuration UX change is cosmetic. The timeout fix passes
an existing value through instead of dropping it. Background command
execution is a behavioral change but only affects E2B sandbox commands —
the foreground slot is still available for bridge health checks.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 17:12:30 -07:00
Devin Foley a4ac6ff133 Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end

## What Changed

- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export

## Verification

- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge

## Risks

- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
Devin Foley 4cf612a92d Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments that are leased, and the server
manages runtime state, workspace configuration, and plugin lifecycle
> - Several edge cases caused failures during concurrent operations: a
race condition in runtime state insertion could produce duplicate-key
errors, reused workspaces didn't sync their configuration when the
parent issue was updated, sandbox provider plugins could be queried
before registration completed, and orphaned environment leases from
failed runs were never released
> - This PR fixes these four runtime/environment issues
> - The benefit is more reliable concurrent agent execution and proper
resource cleanup

## What Changed

- `services/heartbeat.ts`: Fixed a race condition where concurrent
runtime state inserts could fail with a duplicate-key error by using an
upsert pattern
- `services/issues.ts`: Sync reused workspace configuration when an
issue is updated, so the workspace reflects the latest issue state
- `services/environment-runtime.ts`: Fixed a startup race where sandbox
provider plugins could be queried before registration completed, by
awaiting plugin readiness before resolving environment drivers
- `services/heartbeat.ts`: Release environment leases for orphaned runs
that lost their process without cleanup

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
runtime state upsert and process recovery lease cleanup
- `pnpm typecheck` — clean
- Manual: trigger concurrent agent runs to verify no duplicate-key
failures; verify orphaned leases are released after process loss

## Risks

- Low risk. The runtime state upsert changes insert-to-upsert behavior,
which could mask a legitimate duplicate if two different runs produce
the same key — but this is prevented by the run ID being part of the
key. The plugin startup await is bounded by the existing registration
timeout.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:10 -07:00
Devin Foley f9cf1d2f6a Add cursor sandbox support and fix SSH workspace sync (#4803)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, or on remote
hosts via SSH
> - The cursor adapter needs to resolve `cursor-agent` inside sandbox
environments where it's installed in `~/.local/bin`
> - But when using the default `agent` command on a sandbox target, the
adapter didn't know to look in `~/.local/bin/cursor-agent`, causing
"command not found" failures
> - Additionally, repeated SSH runs failed because `git checkout` during
workspace sync conflicted with leftover `.paperclip-runtime` files from
previous runs
> - This PR adds sandbox-aware command resolution for cursor and fixes
the SSH workspace sync conflict
> - The benefit is cursor works in E2B sandboxes out of the box, and
repeated SSH runs don't fail on workspace sync

## What Changed

- `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox
targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and
prefers `~/.local/bin/cursor-agent` when the default command is
requested; tightened the sandbox command probe to validate the binary
exists before launching; preserves explicit custom command overrides
- `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH
workspace sync to handle `.paperclip-runtime` untracked file conflicts
from previous runs

## Verification

- `pnpm test` — all existing and new tests pass, including cursor
sandbox probe, sandbox execution, and custom command override tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run a cursor-local task, verify
it resolves cursor-agent from the sandbox install path

## Risks

- Low-medium. The `--force` flag on git checkout could discard
uncommitted changes in the remote workspace, but the workspace is
managed by Paperclip and should not contain user edits.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:12:06 -07:00
Devin Foley a0f5cbffd7 Harden release flow with registry verification and dist-tag checks (#4800)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Paperclip is distributed as npm packages, including plugins like
`plugin-e2b`
> - The release process publishes canary and stable builds via npm
dist-tags
> - But there was no automated verification that published packages
actually landed with the correct dist-tags, and broken canary publishes
could silently ship to users
> - This PR adds a registry verification script that checks published
packages match their expected dist-tags, and wires it into PR CI so
regressions are caught before merge
> - The benefit is release integrity is verified automatically, and
broken dist-tag states are caught early

## What Changed

- Added `scripts/verify-release-registry-state.mjs` — verifies that
published npm packages have correct dist-tag assignments and detects
orphaned or mispointed tags
- Added `scripts/verify-release-registry-state.test.mjs` — test coverage
for the verification logic
- Updated `scripts/release.sh` to include canary dist-tag safety checks
before publishing
- Updated `.github/workflows/pr.yml` to run registry verification as a
CI step
- Updated `doc/PUBLISHING.md` and `doc/RELEASING.md` with the new
verification workflow

## Verification

- `pnpm test` — all tests pass including new verification script tests
- `node scripts/verify-release-registry-state.mjs` — runs against the
live npm registry and reports current state
- CI: the new PR workflow step runs on every PR push

## Risks

- Low risk. This is additive CI and tooling — no runtime code changes.
The registry verification is read-only (queries npm, does not publish).
The release script changes add safety checks that abort before
publishing if state is unexpected.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:20 -07:00
Devin Foley 367d4cab72 Fix SSH callback URL selection for LAN and private networks (#4799)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run on remote hosts via SSH environments
> - When a remote agent needs to call back to the Paperclip API, it
needs a reachable URL
> - But the runtime API URL candidate builder did not account for
private network topologies where the server is only reachable via LAN or
VPN addresses
> - Agents on SSH hosts were failing to connect because the callback URL
pointed to localhost or an unreachable address
> - This PR fixes callback URL selection to honor `PAPERCLIP_API_URL`,
prefer LAN-reachable candidates, filter unreachable link-local
addresses, and include interface hosts in onboarding invite URLs
> - The benefit is SSH-based agents can reliably reach the Paperclip API
on private networks without manual URL configuration

## What Changed

- `runtime-api.ts`: Added `PAPERCLIP_API_URL` as a first-priority
candidate in `buildRuntimeApiCandidateUrls`; extracted
`collectReachableInterfaceHosts` to enumerate non-loopback,
non-link-local network interface IPs with IPv4 preference
- `server/src/index.ts`: Export `PAPERCLIP_API_URL` from the server
environment so it is available to callback candidate resolution
- `server/src/routes/access.ts`: Include LAN interface hosts in
onboarding invite connection candidates
- `server/src/config.ts`: Attempted auto-allowing LAN interface hosts,
then reverted to the per-instance allowlist approach (both commits
included for history clarity)

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
LAN candidate ordering and link-local filtering
- `pnpm typecheck` — clean
- Manual: start a Paperclip server on a machine with a LAN IP, create an
SSH environment pointing to another host on the same LAN, verify the
agent's callback URL uses the LAN IP rather than localhost

## Risks

- Low-medium. The candidate list now includes more addresses (all
non-loopback LAN interfaces). These are candidates for the agent to try,
not an allowlist — the server's allowed hostnames still gate which
origins are accepted. Ordering change (LAN preferred over loopback)
could affect existing setups where localhost was intentionally
preferred.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:17 -07:00
Devin Foley 9b99d30330 Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use

## What Changed

- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment

## Risks

- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
Dotta 3494e84a29 Add v2026.428.0 release changelog (#4665)
## Summary

- Adds `releases/v2026.428.0.md` covering the diff between `v2026.427.0`
and `origin/master` (seven merged PRs).
- Generated via `.agents/skills/release-changelog/SKILL.md`.
- Flags the additive migrations `0071_default_hire_approval_off` and
`0072_large_sandman` plus the new-companies hire-approval default flip
in the upgrade guide.

## Notes

- Highlight: pause/resume actions in the sidebar agents panel
([#4616](https://github.com/paperclipai/paperclip/pull/4616)).
- Improvements: assigned-todo recovery dispatch, recovery issue
hardening, hire-approval opt-in default, inline selector keyboard
handling.
- Fixes: manual routine inbox visibility, stale company skill refresh
rejection, stale stored company-selection cleanup.

## Test plan

- [ ] Reviewer confirms section coverage and PR-to-bullet attribution.
- [ ] Confirm the file lands at \`releases/v2026.428.0.md\`.
- [ ] Confirm no canary suffix in title or filename.

Source issue: [PAP-2599](/PAP/issues/PAP-2599)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:40:20 -05:00
Dotta 6b7f6ce4b8 [codex] Split PR #4692 UI/QoL updates (#4701)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane.
> - The affected surface is the board UI for issue threads, issue lists,
routines, dialogs, navigation, and issue review indicators.
> - Closed PR #4692 bundled backend, schema, docs, workflow, and UI/QoL
work into one oversized change set.
> - Greptile could not keep reviewing that broad PR because it exceeded
the 100-file review limit and mixed unrelated concerns.
> - This pull request extracts the UI/QoL slice into a fresh branch
under the review limit while leaving workflow and lockfile churn out.
> - The benefit is a focused review path for the board UI performance
and workflow improvements without reopening the oversized PR.

## What Changed

- Added long issue-thread virtualization, scroll-container binding,
anchor preservation, latest-comment jump targeting, and related
regression/perf fixtures.
- Improved issue list scalability with scroll-based loading, server
offset parameters, and pagination-focused UI tests.
- Reduced new issue dialog typing churn and split dialog action
subscriptions so broad layout/nav surfaces avoid unnecessary renders.
- Added routine variables help and routine description mention options
for users, agents, and projects.
- Added productivity review badge/link UI and fixed the badge to use
Paperclip's company-prefixed router link.
- Kept the split PR below Greptile's review limit and excluded
`.github/workflows/pr.yml` and `pnpm-lock.yaml`.

## Verification

- `pnpm install --no-frozen-lockfile` in the clean worktree to install
`@tanstack/react-virtual` locally without committing lockfile churn.
- `pnpm --filter @paperclipai/ui exec vitest run --config
vitest.config.ts src/components/IssueChatThread.test.tsx
src/components/IssuesList.test.tsx
src/components/NewIssueDialog.test.tsx src/pages/Routines.test.tsx
src/pages/Issues.test.tsx` passed: 5 files, 83 tests.
- `pnpm --filter @paperclipai/ui typecheck` passed.
- `git diff --check origin/master..HEAD` passed.
- Split-scope checks: 53 changed files; no `.github/workflows/pr.yml`;
no `pnpm-lock.yaml`.
- Screenshots were not captured in this heartbeat; the changes are
primarily virtualization, routing, pagination, and editor behavior
covered by focused regression tests.

## Risks

- Moderate UI risk because issue-thread virtualization changes scroll
behavior on long conversations; regression tests cover anchor jumps,
latest-comment targeting, row metadata, and short-thread fallback.
- Moderate integration risk because the issue-list offset parameter and
productivity review field depend on matching API behavior.
- Dependency risk: the UI package adds `@tanstack/react-virtual` while
repository policy keeps `pnpm-lock.yaml` out of PRs, so CI must resolve
dependency changes through the repo's normal lockfile policy.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local repository and
GitHub workflow. Exact runtime context window was not exposed by the
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:18:58 -05:00
Dotta 1991ec9d6f [codex] Split backend control-plane QoL slice (#4700)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
backend task ownership, recovery, review visibility, and company-scoped
limits need to stay enforceable without UI-only coupling.
> - Closed PR #4692 bundled those backend changes with UI workflow,
docs, skills, workflow, and lockfile churn.
> - PAP-2694 asks for a clean backend/control-plane slice from that
closed branch.
> - This branch starts from current `master` and mines only the `cli`,
`packages/db`, `packages/shared`, and `server` contracts/tests needed
for the backend behavior.
> - It explicitly excludes UI workflow/performance work,
`.github/workflows/pr.yml`, `pnpm-lock.yaml`, docs, skills,
package-script, adapter UI build-config, and perf fixture script
changes; the only UI files are fixture/test updates required by the
tightened shared `Company` contract.
> - The benefit is a smaller reviewable PR that preserves the
control-plane fixes while staying under Greptile s 100-file review
limit.

## What Changed

- Added company-scoped attachment-size limits through DB
schema/migrations, shared company portability contracts, CLI
import/export coverage, and server attachment upload enforcement.
- Added productivity review service/API behavior for no-comment streak,
long-active, and high-churn review issues, including request-depth
clamping and issue summary exposure.
- Hardened issue ownership and recovery/control-plane paths: peer-agent
mutation denial, issue tree pause/resume behavior, stranded recovery
origins, and related activity/test coverage.
- Preserved related backend contract updates for routine timestamp
variables and managed agent instruction bundles because they live in
shared/server contracts from the source branch.
- Addressed Greptile feedback by making `Company.attachmentMaxBytes`
non-optional, simplifying review request-depth clamping, fixing the
migration final newline, and enforcing the process-level attachment cap
as the final ceiling for uploads.
- Added minimal company fixtures needed for repo-wide typecheck/build
and kept the PR to 66 changed files with forbidden/non-slice paths
excluded.

## Verification

- `pnpm install --frozen-lockfile`
- `git diff --check origin/master..HEAD`
- `git diff --name-only origin/master..HEAD | wc -l` -> 66 files
- `git diff --name-only origin/master..HEAD -- .github/workflows/pr.yml
pnpm-lock.yaml package.json doc skills .agents scripts
packages/adapters` -> no output
- `pnpm exec vitest run --config vitest.config.ts
packages/shared/src/validators/issue.test.ts
packages/shared/src/routine-variables.test.ts
packages/shared/src/adapter-types.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
cli/src/__tests__/company.test.ts
server/src/__tests__/productivity-review-service.test.ts
server/src/__tests__/issue-tree-control-service.test.ts
server/src/__tests__/issue-tree-control-routes.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/issue-attachment-routes.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/issues-service.test.ts` -> 12 files, 147 tests
passed
- `pnpm exec vitest run --config vitest.config.ts
cli/src/__tests__/company-delete.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
server/src/__tests__/productivity-review-service.test.ts` -> 3 files, 18
tests passed
- `pnpm exec vitest run --config vitest.config.ts
server/src/__tests__/issue-attachment-routes.test.ts` -> 1 file, 6 tests
passed
- `pnpm --filter @paperclipai/db typecheck && pnpm --filter
@paperclipai/shared typecheck && pnpm --filter @paperclipai/server
typecheck && pnpm --filter paperclipai typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck && pnpm --filter
@paperclipai/ui build`

## Risks

- Includes migrations `0073_shiny_salo.sql` and
`0074_striped_genesis.sql`; merge ordering matters if another PR adds
migrations first.
- This is intentionally backend-only apart from fixture/test updates
forced by shared type correctness; UI affordances from PR #4692 are not
present here and should land in separate UI slices.
- The worktree install emitted plugin SDK bin-link warnings for unbuilt
plugin packages, but the targeted tests and package typechecks completed
successfully.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected; check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:46:45 -05:00
Dotta d9f540c331 [codex] Refresh docs and agent skills (#4693)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane
> - Contributors and agents need docs and skills that match the current
V1 behavior
> - The source branch included documentation updates alongside
implementation work
> - Keeping docs and skill guidance separate makes the implementation PR
easier to review
> - This pull request refreshes the V1 docs and agent-operating guidance
without changing runtime behavior
> - The benefit is current contributor guidance that can merge
independently from code changes

## What Changed

- Refreshed V1 product, goal, implementation, database, and development
documentation.
- Updated the Paperclip heartbeat skill guidance and create-agent skill
references.
- Added the Paperclip plan-to-task conversion skill.
- Updated release changelog skill guidance.

## Verification

- `git diff --check public-gh/master..HEAD` passed in the PR worktree
after the Greptile fix.
- Greptile Review passed on head `673317ed` with zero unresolved review
threads.
- GitHub PR checks passed on head `673317ed`: `policy`, `verify`, `e2e`,
and `security/snyk (cryppadotta)`.

## Risks

- Low runtime risk because this branch only changes docs and skill
guidance.
- Documentation may need follow-up wording adjustments if reviewers want
a different framing for V1 behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:12:03 -05:00
Dotta d0bdbe11a9 Stabilize inline selector keyboard handling (#4617)
## Thinking Path

> - Paperclip's board UI relies on compact selectors for frequent issue
and agent edits.
> - Inline selectors often live inside larger keyboard-aware surfaces
such as composers and popovers.
> - Arrow, enter, tab, and escape keys handled by the selector should
not leak to parent document shortcuts.
> - Stale company selection should also stay hidden until the company
list confirms it is valid.
> - This pull request tightens inline selector keyboard handling and
adds regression coverage for stale company bootstrap behavior.
> - The benefit is fewer accidental parent interactions and safer
company-scoped UI initialization.

## What Changed

- Added a stable empty `recentOptionIds` default so selector filtering
does not get a new array every render.
- Mirrored highlighted option state into a ref so Enter/Tab commits the
current highlighted option reliably after keyboard navigation.
- Stopped propagation for selector-owned navigation/commit/escape keys.
- Added jsdom regressions for inline selector keyboard handling and
CompanyProvider stale selection behavior.

## Verification

- `pnpm exec vitest run ui/src/components/InlineEntitySelector.test.tsx
ui/src/context/CompanyContext.test.tsx`
- Targeted selector and CompanyProvider tests pass cleanly without React
`act(...)` warnings.
- Screenshots not attached: this is keyboard/state behavior covered by
component tests.

## Risks

- Low risk: changes are scoped to inline selector key handling and
tests. The main behavior shift is intentionally preventing handled
selector keys from reaching parent listeners.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:04:35 -05:00
Dotta 43b0f2ae58 Add pause and resume actions to sidebar agents (#4616)
## Thinking Path

> - Paperclip operators need fast control over the agents running their
company.
> - The sidebar is the persistent place operators scan agent state while
navigating the board UI.
> - Agent pause and resume already exist as control-plane actions, but
sidebar users had to navigate away to use them.
> - This pull request adds a compact per-agent action menu in the
sidebar.
> - It keeps edit, pause, and resume close to the visible agent list
while preserving existing navigation behavior.
> - The benefit is faster operator intervention when an agent needs to
be paused or restarted.

## What Changed

- Refactored sidebar agent rows into a small item component with a
hover/focus action menu.
- Added edit, pause, and resume actions using existing agent API calls
and cache invalidation keys.
- Added success/error toasts for pause and resume mutations.
- Tracked pause/resume pending state per agent so one active mutation
does not disable every sidebar row.
- Disabled direct sidebar resume for budget-paused agents and labeled
that state clearly.
- Added jsdom coverage for active-agent pause, paused-agent resume,
per-agent pending state, and budget-paused resume protection.
- Added visual review artifacts for the default row and opened action
menu:
  - [Sidebar row](docs/pr-screenshots/pr-4616/sidebar-agent-row.png)
- [Sidebar action
menu](docs/pr-screenshots/pr-4616/sidebar-agent-actions.png)

## Verification

- `pnpm exec vitest run ui/src/components/SidebarAgents.test.tsx`
- `pnpm --filter @paperclipai/server prepare:ui-dist`
- Browser screenshot pass against a temporary local trusted instance at
`http://127.0.0.1:3102` using Playwright.

## Risks

- Low risk: UI-only addition using existing agent pause/resume
endpoints. The main risk is layout crowding for very narrow sidebars,
mitigated by the icon-only trigger and existing truncation.
- Budget-paused agents now require a non-sidebar path to resume, which
is intentional to avoid accidentally restarting agents stopped by budget
enforcement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell/browser access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:54 -05:00
Dotta f88f538e6d Keep manual routine runs visible in the runner inbox (#4615)
## Thinking Path

> - Paperclip coordinates recurring agent work through scheduled and
manual routines.
> - Manual routine runs are board-initiated work and should stay visible
to the human who kicked them off.
> - Routine execution issues are agent-assigned, so they can be filtered
away from a board user's inbox unless the user is recorded as touching
the work.
> - Coalesced or skipped active routine runs have the same visibility
problem because they reuse an existing live issue.
> - This pull request carries the manual runner actor into routine
dispatch and touches the linked issue for that user's inbox.
> - The benefit is that manually triggered routine work stays
discoverable by the operator who started it.

## What Changed

- Passed the board or agent actor from the routine run route into the
routine service.
- Recorded manual board runners as `createdByUserId` on fresh routine
execution issues.
- Touched coalesced or skipped active routine issues for the manual
runner by updating read state and clearing that user's inbox archive.
- Added route and service regressions for manual routine run actor
propagation and inbox visibility.

## Verification

- `pnpm exec vitest run server/src/__tests__/routines-routes.test.ts
server/src/__tests__/routines-service.test.ts`

## Risks

- Low risk: the change is scoped to manual routine runs and only updates
issue attribution/read-state metadata for the initiating actor.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:24 -05:00
Dotta 68c37660f0 Dispatch assigned todo work during recovery sweeps (#4614)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies.
> - Agent assignments must reliably turn into heartbeat work without
board operators manually nudging stuck tasks.
> - The stranded-assignment recovery sweep already handles failed or
lost runs.
> - But assigned `todo` issues with no prior run could sit idle because
there was nothing to retry or recover.
> - This pull request dispatches those never-started assigned todos as
normal assignment wakes.
> - The benefit is that recovery fixes missed initial dispatches without
creating unnecessary recovery issues.

## What Changed

- Added an initial assigned-todo dispatch path to the recovery service
when an assigned `todo` issue has no heartbeat run yet.
- Reused invocation budget hard-stop checks before dispatching or
requeueing recovery work.
- Counted `assignmentDispatched` in startup/scheduled recovery logs.
- Added heartbeat recovery regressions for first dispatch, duplicate
queued wake prevention, budget-blocked skips, and paused-agent skips.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- Low to medium risk: this changes liveness recovery behavior for
assigned `todo` issues, but it stays on the existing assignment wake
path and skips paused or budget-blocked agents.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 20:02:44 -05:00
Dotta 7a9b3a6037 [codex] Harden recovery issue handling (#4600)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The control plane must recover stranded agent work without creating
new operational loops
> - Stranded recovery issues can themselves fail, and exposing raw retry
errors in comments can leak sensitive adapter details
> - New local companies also should not force a hire-approval gate
unless operators enable that policy
> - This pull request hardens recovery issue handling, redacts retry
failure details in issue copy, preserves `maxConcurrentRuns: 1`, and
flips new-hire approval to an opt-in default
> - The benefit is safer automatic recovery and smoother default company
setup without hidden migration conflicts

## What Changed

- Added migration `0071_default_hire_approval_off` and updated company
schema/import/export/docs so hire approvals default off and serialize
only when enabled.
- Added migration `0072_large_sandman` with a partial unique index
preventing duplicate active stranded recovery issues for the same source
issue.
- Blocked failed `stranded_issue_recovery` issues in place instead of
creating nested recovery issues.
- Redacted latest retry failure details from recovery issue comments
while still linking reviewers to run evidence.
- Allowed `maxConcurrentRuns: 1` to be honored by heartbeat concurrency
normalization.
- Added focused regression coverage for recovery recursion, redaction,
migration ordering, and concurrency behavior.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-portability.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-process-recovery.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
--pool=forks --poolOptions.forks.isolate=true` exits 0, but this host
skipped the embedded Postgres tests with the existing init guard.

## Risks

- Migration risk is low but this PR intentionally owns both new
migrations to avoid separate PR migration-journal conflicts.
- Recovery comments now require operators to inspect linked run evidence
for details instead of reading raw errors inline.
- The hire approval default changes behavior for newly created/imported
companies only; existing persisted company settings are not changed
except by the SQL default for future rows.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 15:02:47 -05:00
Dotta 6ccf80bcf2 [codex] Reject stale company skill refreshes (#4601)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the reusable agent capability layer
> - Skill inventory refresh work can outlive the company it was
requested for
> - Without an explicit company existence check, stale refreshes can
continue into bundled/local skill cleanup for deleted or missing
companies
> - This pull request makes company-skill listing fail fast when the
company no longer exists
> - The benefit is clearer API behavior and less stale background work
against missing company scope

## What Changed

- Added a company existence check before `companySkillService.list()`
refreshes bundled and local-path skill state.
- Added regression coverage asserting missing companies return `404
Company not found`.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-service.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.

## Risks

- Low risk. Existing callers for valid companies are unchanged.
- Missing-company callers now receive an explicit 404 instead of
continuing refresh work.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 13:19:38 -05:00
Dotta d95968a9f8 [codex] Ignore stale stored company selections (#4602)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI is the operator’s control surface for selecting the
active company
> - A company id stored in localStorage can become stale across resets,
imports, or deleted companies
> - Exposing that stale id before companies load can briefly put
downstream UI in an invalid company scope
> - This pull request defers selected-company exposure until the loaded
company list validates the stored id
> - The benefit is a cleaner company-selection bootstrap path and fewer
transient invalid API requests

## What Changed

- Initialized `CompanyProvider` selection as `null` until companies
finish loading.
- Reused a stored company id only when it exists in the loaded
selectable company list.
- Cleared storage and selected state when no companies are available.
- Added jsdom regression coverage for stale stored ids before and after
company loading.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/context/CompanyContext.test.tsx`

## Risks

- Low risk. The change only affects selection bootstrap and keeps valid
stored selections intact.
- There may be a slightly longer initial `null` selected-company state
while the company list is loading.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 13:18:21 -05:00
Dotta 15c0ce3722 Add Twitter/X link to READMEs (PAP-2475) (#4593)
## Summary

- Adds [@papercliping on X](https://x.com/papercliping) next to the
existing Discord link in the top-of-file nav and the bottom
**Community** list of both `README.md` and `cli/README.md`.
- The `cli/README.md` change keeps the npm-published readme consistent
with the GitHub one.

Resolves PAP-2475.

## Test plan

- [ ] Render `README.md` on GitHub and confirm the new Twitter link in
the header strip and the Community section.
- [ ] Render `cli/README.md` (preview on GitHub or via npm) and confirm
the same.
- [ ] Click both Twitter links and verify they land on
`https://x.com/papercliping`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:35:33 -05:00
Dotta ecd92af001 release: v2026.427.0 notes (#4590)
## Summary

- Adds `releases/v2026.427.0.md` covering the 77 PRs since `v2026.416.0`
- Highlights: multi-user access + invites, structured issue-thread
interactions, run liveness continuations, sub-issues as a workflow
checklist, issue subtree pause/cancel/restore, first-class issue
references
- Beta Features: Environments + pluggable sandbox providers (incl.
`@paperclipai/plugin-e2b`)
- Notes 14 additive migrations (`0057`–`0070`) in the Upgrade Guide; no
breaking changes flagged

Source issue: [PAP-2476](/PAP/issues/PAP-2476)

## Test plan
- [ ] Spot-check that each linked PR actually landed in the v2026.427.0
range
- [ ] Confirm migration list matches `server/src/database/migrations`
- [ ] Verify rendered markdown looks right on GitHub

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:16:20 -05:00
Dotta 215b6cd161 [codex] Add security role route coverage (#4589)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Agent creation accepts roles that become part of the agent contract
and telemetry.
> - The shared role list already includes the security role.
> - Direct agent creation should preserve that role through route
handling and analytics metadata.
> - This pull request adds route coverage for creating a security-role
agent and asserting telemetry receives the same role.
> - The benefit is regression coverage for security agents without
changing the production route behavior.

## What Changed

- Added a server route test that creates an agent with `role:
"security"`.
- Asserted the create payload and telemetry metadata preserve `security`
as the agent role.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-skills-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`

## Risks

- Low risk; test-only coverage.
- No runtime behavior, schema, or API contract changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:49:59 -05:00
Dotta 53396f272a [codex] Fix sub-issue progress summary styling (#4588)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The issue list and issue detail surfaces summarize child/sub-issue
progress for operators.
> - Those summaries need to be compact and visually consistent because
they appear in dense lists.
> - The progress strip is most useful when there are multiple sub-issues
to compare, so the summary intentionally stays hidden for a single
sub-issue.
> - This pull request tightens the sub-issue progress summary styling
and updates the related tests.
> - The benefit is a cleaner, more scannable task list without changing
task ownership, status, or workflow behavior.

## What Changed

- Adjusted sub-issue progress summary copy/styling in the issue list and
detail summary helpers.
- Intentionally render the progress summary only for two or more child
issues; a single child issue still appears in the normal sub-issue list
without a redundant progress strip.
- Updated the UI tests that assert the rendered summary behavior.
- Clarified the two-plus-child threshold in code with a named constant.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssuesList.test.tsx
ui/src/lib/issue-detail-subissues.test.ts`

## Screenshots

![Before/after comparison of sub-issue progress summary
styling](https://gist.githubusercontent.com/cryppadotta/3a0aded379de3515acd3360bd54638e0/raw/cd26b5bd63ee65d01334f6c8ad88b1c831eb5d8f/pap-2449-subissue-progress-before-after.svg)

## Risks

- Low risk; this is a small UI presentation change with focused test
coverage.
- The intentional threshold change means parents with exactly one child
no longer show the aggregate progress strip, avoiding redundant summary
chrome while keeping the child visible in the list.
- No schema or API behavior changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:48:26 -05:00
Dotta fda296ee4f [codex] Add configurable liveness auto-recovery controls (#4587)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat liveness recovery decides when stalled issue trees need
manager-visible follow-up.
> - Automatic recovery issue creation is useful, but operators need
instance-level controls for how aggressive it is.
> - Without controls, recovery behavior is harder to tune for local
development, production operations, and noisy edge cases.
> - This pull request adds configurable liveness auto-recovery settings
across shared contracts, API routes, services, and the instance
experimental settings UI.
> - The benefit is that operators can keep liveness findings advisory or
enable bounded recovery automation with explicit intervals and lookback
windows.

## What Changed

- Added shared types and validators for liveness auto-recovery settings.
- Extended instance settings routes and services to persist and validate
the new controls.
- Wired heartbeat/recovery services to honor enablement, minimum
interval, and lookback settings.
- Added UI controls for liveness recovery under instance experimental
settings.
- Covered the new server behavior with instance settings and liveness
escalation tests.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/instance-settings-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Moderate behavioral risk because recovery automation timing changes
when enabled; defaults keep existing advisory behavior unless the
setting is turned on.
- No database migration in this PR; settings are stored through the
existing instance settings path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:46:44 -05:00
Neeraj Kumar Singh B f0f9460d1d docs: AWS ECS Fargate deployment runbook (#3897)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies and ships
a
>   "local-first, cloud-ready" deployment model
> - The deploy docs currently cover local/Docker but not a production
> cloud target, so teams asking "how do I put this behind a real domain"
>   have no canonical path
> - We already support Docker images, RDS-compatible Postgres, and an
EFS
>   storage profile, so AWS ECS Fargate is a natural fit
> - Without a runbook, each team reinvents VPC, security groups, TLS,
and
>   secrets wiring and usually gets at least one step wrong
> - This pull request adds `docs/deploy/aws-ecs.md`, an ECS
task-definition
> template, and an `.env.aws.example`, cross-linked from the deploy
overview
> - The benefit is a single, reproducible ~$110/mo path to a production
>   deployment, plus a full teardown for throwaway environments

## What Changed

- New `docs/deploy/aws-ecs.md` — an 11-step ECS Fargate runbook covering
ECR,
  VPC, RDS, EFS, Secrets Manager, IAM, ALB, and ECS service with the
  deployment circuit breaker enabled
- New `docker/ecs-task-definition.json` — Fargate-ready task definition
with
  `<ACCOUNT_ID>`, `<REGION>`, `<EFS_ID>`, `<DOMAIN>` placeholder tokens
- New `docker/.env.aws.example` — documents every non-secret env var the
  ECS deployment needs
- `docs/deploy/overview.md` — one-line cross-reference to the new guide
- Greptile feedback addressed in follow-up commits:
  - `containerName` in the service-create call now matches
    `paperclip-server` in the task definition
  - HTTP :80 listener added that 301-redirects to :443
  - Dedicated RDS DB subnet group created before `create-db-instance`
  - EFS teardown polls on mount-target deletion instead of `sleep 30`

## Verification

- Walked every step of the runbook against the task definition to
confirm
  variable names (`$ALB_SG`, `$ECS_SG`, `$RDS_SG`, `$EFS_SG`, `$TG_ARN`,
`$LISTENER_ARN`, `$HTTP_LISTENER_ARN`, `$EFS_ID`, `$RDS_ENDPOINT`, etc.)
are
  defined before they are referenced
- Confirmed the `containerName` in Step 10 (`paperclip-server`) matches
  `docker/ecs-task-definition.json` line 11
- Confirmed the `sed` placeholder substitution in Step 8 matches the
tokens
  in the task definition template
- Teardown order was checked in reverse-dependency order: ECS service →
  listeners → target group → ALB → RDS (waits for deletion) → DB subnet
  group → EFS mount targets (polled) → EFS → secrets → SGs → ECR → IAM →
  log group

## Risks

- **Low risk for the repo.** Docs-only change plus two template files
under
`docker/`; no runtime code paths are touched and nothing is imported by
  the build.
- **Risk for users who follow the runbook:** AWS bills accrue
immediately
  once RDS/ALB/EFS exist. The runbook calls this out and includes a full
  teardown procedure. Placeholder tokens (`<ACCOUNT_ID>`, `<REGION>`,
`<EFS_ID>`, `<DOMAIN>`) are documented so nothing is silently
hard-coded.

## Model Used

- Claude (Anthropic), model `claude-opus-4-6`, ~200K context window,
extended thinking mode on, used with tool access (file edit, shell) via
Claude Code. The Greptile follow-up commits were authored the same way.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass — N/A for docs/config
templates; validated by reading
- [x] I have added or updated tests where applicable — N/A for docs
- [x] If this change affects the UI, I have included before/after
screenshots — N/A, no UI
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 08:41:47 -05:00
Dotta 1d8c7a09b8 [codex] Add security role route regression (#4586)
## Thinking Path

> - Paperclip orchestrates AI agents through company-scoped
control-plane workflows.
> - Agent creation is one of the core board/operator surfaces for
defining who works in a company.
> - The shared taxonomy now includes a first-class `security` agent
role.
> - Direct agent creation must preserve that role through default
instruction materialization and telemetry.
> - A prior replacement PR covered this path, but Greptile identified
that the route-test mock could let a future patch object shadow the
regression.
> - This pull request reopens the narrow regression coverage from
current `master` with the mock ordering fixed.
> - The benefit is a focused guardrail that keeps `security` role
creation observable without expanding the production diff.

## What Changed

- Added a direct agent creation route regression test for `role:
"security"`.
- Verified telemetry receives `agentRole: "security"` after the default
instruction materialization update path.
- Ordered the regression mock as `...patch` before `role: "security"` so
future patch fields cannot shadow the asserted role.

## Verification

- `pnpm install --frozen-lockfile` to link dependencies in the fresh
worktree; it completed with existing plugin SDK bin warnings.
- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
packages/shared/src/adapter-types.test.ts`

## Risks

- Low risk. This is test-only coverage and does not change runtime
behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 based coding agent, tool-enabled with local shell
and repository editing capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
test-only regression)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:11:52 -05:00
Devin Foley d2cbe2cb23 Prefer pushing feature branches to a user fork in paperclip-dev skill (#4572)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The `paperclip-dev` skill is the canonical reference agents read
before doing development work on the Paperclip repo itself
> - Today the skill assumes feature branches get pushed to `origin` (=
`paperclipai/paperclip`), which clutters the upstream branch list when
contributors actually have personal forks
> - This is the standard open-source contribution pattern (push to fork,
PR upstream) and the skill should reflect it
> - This pull request adds a "Forks — Prefer Pushing to a User Fork"
section that teaches agents to detect a fork remote, push there, and
only fall back to `origin` when no fork is configured
> - The benefit is cleaner upstream branch hygiene and behavior that
matches typical contributor workflows without any code/runtime change

## What Changed

- Added a new **Forks — Prefer Pushing to a User Fork** section to
`skills/paperclip-dev/SKILL.md` covering:
- How to detect a user fork via `git remote -v` (treat any
non-`paperclipai` GitHub remote as the fork)
  - How to push to the fork (`git push -u <fork-remote> HEAD`)
- How to create the PR from the fork (`gh pr create --repo
paperclipai/paperclip --head <fork-owner>:<branch>`)
- The no-fork fallback (push to `origin`, do not auto-create a fork —
ask first)
  - Keeping the fork's `master` in sync
- Added a reinforcing entry to the **Common Mistakes** table linking
back to the new section

## Verification

- Docs-only change to a single markdown skill file. Reviewer can confirm
by reading the diff in `skills/paperclip-dev/SKILL.md`:
- New `## Forks — Prefer Pushing to a User Fork` section sits between
`## Worktrees` and `## Pull Requests`
  - New row appended to the `## Common Mistakes` table
- No tests, no build, no runtime behavior affected.

## Risks

- Low risk. Documentation-only edit. The instructions are advisory —
they only change agent behavior on future runs that read the skill.

## Model Used

- Provider: Anthropic (Claude)
- Model ID: `claude-opus-4-7` (Claude Opus 4)
- Capabilities: tool use (file read/edit, shell, git, gh CLI), extended
reasoning
- Context: invoked via Claude Code / Paperclip heartbeat for issue
PAPA-139

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (N/A — docs-only change; no
test surface)
- [x] I have added or updated tests where applicable (N/A)
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — no UI change)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 22:19:07 -07:00
Dotta 82e257c7ba Cancel stale queued heartbeats when issue graph changes (PAP-2314) (#4534)
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:17:38 -05:00
Devin Foley 868d08903e test: isolate CLI company import e2e state (#4560)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and its
CLI import/export path is part of how operators move company state
safely between environments.
> - The `paperclipai company import/export` e2e test is supposed to
validate that portability flow inside a hermetic harness, not against a
developer's live Paperclip home.
> - This regression showed nested CLI subprocesses could silently fall
back to ambient `PAPERCLIP_*` state and mutate a real local instance by
creating extra companies such as `CLI-1-Roundtrip-Test`.
> - The first job was to pin the test subprocesses to isolated config,
home, instance, auth, and context paths, and to add a regression
assertion that proves the nested CLI writes stay inside the test-owned
state.
> - Once the PR was up, CI and Greptile exposed two follow-on issues
that were blocking merge: plugin SDK typecheck bootstrap was racing
across packages in fresh CI, and the new lock helper needed one more fix
to release its lock on failure.
> - This pull request therefore ends up doing two tightly related
things: fixing the original CLI isolation leak, and hardening the
supporting typecheck/bootstrap path enough for the fix to verify cleanly
in CI.
> - The benefit is that the portability e2e test is now actually
isolated, and the PR verification path is stable enough to catch
regressions instead of introducing its own nondeterministic failures.

## What Changed

- Hardened `cli/src/__tests__/company-import-export-e2e.test.ts` so
nested CLI subprocesses re-seed isolated `PAPERCLIP_CONFIG`,
`PAPERCLIP_HOME`, `PAPERCLIP_INSTANCE_ID`, `PAPERCLIP_CONTEXT`,
`PAPERCLIP_AUTH_STORE`, and throwaway `HOME` values instead of falling
back to ambient machine state.
- Added a regression assertion around `paperclipai context set --json`,
then cleared the temporary `context.json` so the isolation check and the
later export/import flow stay independent.
- Passed the same isolated `HOME` into the server subprocess so both
sides of the e2e harness are symmetric.
- Introduced locking in `scripts/ensure-plugin-build-deps.mjs` and
switched the server/plugin example `typecheck` scripts to use that
helper instead of launching concurrent raw `@paperclipai/plugin-sdk`
builds.
- Fixed the helper failure path so it releases the lock before exiting
non-zero, which prevents stale-lock timeouts during parallel typecheck
runs.

## Verification

- `pnpm vitest run cli/src/__tests__/company-import-export-e2e.test.ts
--project paperclipai`
- `pnpm --filter paperclipai typecheck`
- `pnpm -r typecheck`
- PR checks now pass on the current head, including `policy`, `verify`,
`e2e`, `security/snyk`, and `Greptile Review`.

## Risks

- Low risk. The product-facing behavior change is scoped to test harness
code in the CLI e2e suite.
- The CI stabilization changes only affect bootstrap/typecheck helper
paths for the server and plugin/example packages, but they do touch
shared verification plumbing; the main risk is changing how fresh build
artifacts are prepared in local/CI typecheck runs.

## Model Used

- Anthropic Claude via Paperclip `claude_local`, model
`claude-opus-4-7`, high-effort local coding agent, used for the initial
implementation and first peer-reviewed verification.
- OpenAI Codex via Paperclip `codex_local`, model `gpt-5.4`, high
reasoning-effort local coding agent with tool use, used for CI triage,
Greptile follow-up fixes, verification, and PR maintenance.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 19:10:01 -07:00
Devin Foley 1d9f7a5149 Fix flaky heartbeat recovery teardown CI failure (#4559)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The linked CI job is in the server test/recovery path, where
heartbeat runs and issue cleanup need to leave the control plane in a
consistent state even when retries fail.
> - In this case the failure was not runtime product behavior but test
teardown behavior inside `heartbeat-process-recovery.test.ts`.
> - The failing GitHub Actions job showed a foreign-key race on
`company_skills_company_id_companies_id_fk` while the test tried to
delete the parent company record.
> - The surrounding teardown code already uses bounded retry cleanup for
other dependent tables (`issues`, `heartbeatRuns`, and `agents`) because
this test file intentionally exercises asynchronous recovery flows.
> - This pull request applies that same retry pattern to the final
`db.delete(companies)` step, re-clearing `companySkills` before each
retry.
> - The benefit is a targeted fix for the CI flake without changing
runtime behavior or expanding the scope beyond the failing teardown
path.

## What Changed

- Wrapped the final `db.delete(companies)` call in
`server/src/__tests__/heartbeat-process-recovery.test.ts` with the same
5-attempt retry pattern already used elsewhere in that teardown.
- Re-cleared `companySkills` before each company-delete retry so
late-arriving FK-dependent rows do not mask the real test result.
- Verified the fix against the originally failing
`heartbeat-process-recovery` test file and the broader `pnpm test:run`
command under CI-like env conditions.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`
- Re-ran `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts` multiple times
locally; the previously failing teardown stayed green.
- `env -u PAPERCLIP_API_URL -u PAPERCLIP_RUNTIME_API_URL -u
PAPERCLIP_RUN_ID -u PAPERCLIP_TASK_ID -u PAPERCLIP_AGENT_ID -u
PAPERCLIP_COMPANY_ID -u PAPERCLIP_API_KEY -u PAPERCLIP_WAKE_REASON -u
PAPERCLIP_WAKE_COMMENT_ID -u PAPERCLIP_WAKE_PAYLOAD_JSON -u
PAPERCLIP_APPROVAL_ID -u PAPERCLIP_APPROVAL_STATUS pnpm test:run`

## Risks

- Low risk. The change is test-only and scoped to teardown retry
behavior in a single server test file.
- If the underlying async cleanup behavior changes again, this test
could still become flaky in a different way, but this PR addresses the
specific FK race seen in the linked CI job.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI `gpt-5.4` via Paperclip `codex_local`, high reasoning mode,
with tool use for shell, git, HTTP API calls, and patch application.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:30:20 -07:00
Devin Foley 8145141c55 Fix external issue URL rewriting in markdown (#4558)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Issue and comment rendering is part of the board UI where humans
supervise and inspect agent work.
> - External Paperclip issue URLs can appear in comments as references
to other runs, review threads, or remote test environments.
> - Those links must preserve their full destination, including origin,
port, and `#comment-...` fragments, or the operator is taken to the
wrong place.
> - The bug here was that absolute `http(s)` issue URLs were being
normalized into internal `/issues/...` routes in the markdown path.
> - This pull request stops rewriting absolute URLs while keeping
internal issue-reference behavior for relative paths and identifiers.
> - The benefit is that authored external links now navigate exactly
where the operator expects, especially for remote test and
comment-deep-link workflows.

## What Changed

- Stopped `ui/src/lib/issue-reference.ts` from treating absolute
`http(s)` URLs as internal issue paths.
- Added defense-in-depth in `ui/src/lib/mention-chips.ts` so absolute
`http(s)` URLs are never reclassified as issue mention chips.
- Updated `ui/src/lib/issue-reference.test.ts` to cover absolute
Paperclip URLs with preserved origin, port, and comment hash.
- Updated `ui/src/components/MarkdownBody.test.tsx` to assert the
reported URL renders as an external link, not an internal `/issues/...`
href.

## Verification

- `pnpm exec vitest run ui/src/lib/issue-reference.test.ts
ui/src/components/MarkdownBody.test.tsx`
- Expected result: `2` files passed, `37` tests passed.
- Manual spot-check from the issue report path: a URL like
`http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-...`
should remain an external link with its full destination preserved.

## Risks

- Low risk. The change narrows when Paperclip rewrites URLs, so the main
risk is if some existing workflow depended on absolute `http(s)`
Paperclip URLs being converted into internal issue links. The added
regression coverage is aimed at preventing that from regressing
silently.

## Model Used

- OpenAI Codex local agent via Paperclip `codex_local`
- Backing model family: GPT-5-based Codex runtime
- Exact backend model ID/version: not exposed by this adapter/runtime
surface
- Context window: not exposed by this adapter/runtime surface
- Capabilities used: tool use, shell command execution, code editing,
git operations, and local test execution

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:19:23 -07:00
Devin Foley 54ab0d24cd Fix disappearing issue comments (#4557)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue detail
pages are a primary surface for understanding agent work and human
feedback.
> - The relevant subsystem here is the issue comments/chat experience
across the React issue detail page and the server comment pagination
API.
> - Long issue threads were only surfacing the newest page of comments
at first render, which hid earlier human and agent messages behind extra
pagination.
> - The first UI fix exposed that the descending cursor path on the
server could also fail for older-page fetches, leaving the chat tab
stuck on an infinite "Loading earlier comments..." state.
> - This needed to be addressed in both layers so the chat tab can
surface earlier conversation history without manual recovery and without
server errors.
> - This pull request auto-loads earlier comment pages in the issue
detail chat view and fixes the descending cursor predicate used by issue
comment pagination.
> - The benefit is that long-running issues like `PAPA-103` now show the
missing conversation history near the top of the chat surface instead of
hiding it or failing to load it.

## What Changed

- Auto-load earlier issue comment pages in the issue detail chat tab
until the thread reaches a 150-comment cap or there are no older
comments left.
- Add UI-side guard logic and regression coverage for optimistic issue
comment pagination so the autoload behavior stops cleanly.
- Replace the raw SQL descending cursor predicate in
`issueService.listComments` with typed Drizzle comparisons for the
`(createdAt, id)` anchor tuple.
- Add a server regression test that paginates earlier comments in
descending order from an anchor comment.
- Smoke-test the exact previously failing seeded `PAPA-103` cursor path
on the isolated dev instance used for review.

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- Manual smoke against seeded `PAPA-103` data on the isolated dev
server:
- `GET /api/issues/PAPA-103/comments?order=desc&limit=50` returns `200`
- `GET
/api/issues/PAPA-103/comments?after=765d3609-edc6-4d11-a8fe-d466affbe85d&order=desc&limit=50`
now returns `200` with 50 comments instead of `500`

## Risks

- Moderate UI/perf risk on very large threads because the chat tab now
prefetches multiple earlier pages on mount; the cap is intentionally
limited to 150 comments to bound that work.
- Low API risk because the server fix only changes the cursor predicate
construction for anchor-based comment pagination, but any mistake there
would affect older-comment paging order.

> I checked `ROADMAP.md` before opening this PR and this bug fix does
not duplicate planned core work.

## Model Used

- OpenAI Codex coding agent in the Paperclip local adapter environment.
The exact backend model ID and context window were not exposed
in-session. Tool-assisted workflow included shell execution, git/GitHub
CLI, local test execution, and targeted code edits.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 16:23:53 -07:00
Devin Foley b2496c8067 fix(auth): trust allowed hostname port variants on detected listen port (#4554)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
authenticated board access has to be predictable across local and
worktree deployments.
> - This change sits in the authenticated-mode server startup and Better
Auth origin-trust wiring.
> - The original auth branch fixed one real gap by adding port-qualified
trusted origins for allowed hostnames on non-default ports.
> - Review of that branch found a second-order bug: trusted origins were
still derived from the configured port before startup detected the
actual listen port.
> - In isolated worktrees, that meant a common `3100 -> 3101` port shift
could still leave Better Auth trusting the stale origin.
> - This pull request keeps the original allowed-hostname port-variant
fix, then moves trust derivation onto the resolved listen port and adds
regression coverage around startup wiring.
> - The benefit is that authenticated sessions keep working on allowed
private hostnames even when Paperclip has to auto-shift to a different
local port.

## What Changed

- Added `:port` trusted-origin variants for authenticated-mode
`allowedHostnames` when Paperclip runs on non-default ports.
- Changed authenticated startup so `listenPort` is detected before
Better Auth initialization, and explicit auth base URLs are rewritten
before auth startup.
- Updated `deriveAuthTrustedOrigins()` to accept the resolved listen
port so Better Auth trusts the actual browser origin instead of the
stale configured port.
- Added focused regression coverage in
`server/src/__tests__/better-auth.test.ts` and
`server/src/__tests__/server-startup-feedback-export.test.ts`.

## Verification

- `pnpm exec vitest run server/src/__tests__/better-auth.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts`
- Reviewer re-check: reviewed commits `380f5b9f` and `092bb34c` after
the follow-up fix landed and found no remaining issues.

## Risks

- Low risk: this only affects authenticated-mode origin derivation and
startup ordering around detected listen ports.
- Main behavioral shift: startup no longer mutates `config.port` to the
selected port; it now carries `requestedListenPort` separately and uses
`listenPort` where runtime behavior needs the resolved value.
- If another path was implicitly relying on `config.port` being
overwritten during startup, that path would need follow-up, though the
current startup/test coverage did not reveal one.

> I checked `ROADMAP.md` and did not find an overlapping planned core
work item for this auth trusted-origin port handling fix.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agents for implementation and
review. Exact backend model ID/context window were not surfaced in this
run context; work was performed through the Codex local adapter with
tool use, code execution, and review passes.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 15:40:39 -07:00
Devin Foley 08af830430 Tighten publicBaseUrl port rewriting (#4553)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
local and authenticated deployment behavior has to stay predictable
under port rebinding and worktree isolation.
> - This change sits in the server/worktree configuration path that
derives runtime URLs and auth origins from `auth.publicBaseUrl`.
> - The original hostname-port rewrite change fixed one real gap for
private/tailnet host:port worktree setups, but it widened the rewrite
rule too far.
> - Rewriting every explicit `auth.publicBaseUrl` can corrupt public or
reverse-proxy URLs by turning a stable origin like
`https://paperclip.example` into a local listen-port URL.
> - Paperclip's auth and trusted-origin handling depend on that URL
staying semantically correct, so this had to be narrowed before merge.
> - This pull request tightens the rewrite rule to explicit-port URLs
only and adds regression coverage across the CLI helper, worktree config
persistence, and server startup path.
> - The benefit is that private host:port worktree flows still work,
while public/default-port URLs remain stable and safe.

## What Changed

- Tightened `rewriteLocalUrlPort` in `cli/src/commands/worktree-lib.ts`,
`server/src/worktree-config.ts`, and `server/src/index.ts` so it only
rewrites URLs that already include an explicit port.
- Removed the old loopback-only hostname gate from the CLI/worktree
helpers and replaced it with the more precise `parsed.port` guard.
- Updated CLI helper coverage to assert that explicit-port non-loopback
URLs still rewrite while no-port public URLs stay unchanged.
- Expanded `server/src/__tests__/worktree-config.test.ts` to cover
explicit-port rewrite and no-port stability for both persisted worktree
config and in-memory runtime port selection.
- Added startup-path coverage in
`server/src/__tests__/server-startup-feedback-export.test.ts` for
`detect-port` rebinding with both explicit-port and no-port
`auth.publicBaseUrl` values.

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `npx vitest run
server/src/__tests__/server-startup-feedback-export.test.ts`
- `npx vitest run cli/src/__tests__/worktree.test.ts
server/src/__tests__/worktree-config.test.ts`
- All of the above were run locally in this issue worktree and passed.

## Risks

- Low risk. The behavior change is deliberately narrower than the
reviewed broad-host rewrite and is guarded by regression coverage for
both the explicit-port and no-port cases.
- The main remaining risk is behavioral only if another code path starts
depending on port rewriting for URLs that never declared a port, which
would be a separate bug.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex local agent using `gpt-5.4` with high reasoning effort,
tool use, shell execution, and file editing.
- Anthropic Claude local agent using `claude-opus-4-6` for follow-up
code review approval on the implementation issue.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 14:29:22 -07:00
Devin Foley d47ffa87f0 Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The local adapter layer is responsible for turning Paperclip runtime
context into the environment seen by the child agent process.
> - The CEO onboarding bundle tells the agent where to read and write
its persistent memory and fact files.
> - That bundle was using `./memory/...` and `./life/...`, which only
works when the process cwd happens to equal the agent home directory.
> - At the same time, six local adapters each duplicated the same
workspace-env propagation logic, including `AGENT_HOME`, which makes
this contract easy to drift.
> - This pull request fixes the CEO instructions to use
`$AGENT_HOME/...` and centralizes workspace-env propagation in one
shared helper with shared tests.
> - The benefit is a real bug fix for agent memory paths plus a single
tested contract that makes future built-in adapter work less likely to
forget `AGENT_HOME`.

## What Changed

- Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use
`$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of
cwd-relative `./memory/...` and `./life/...`.
- Added `applyPaperclipWorkspaceEnv(...)` in
`packages/adapter-utils/src/server-utils.ts` to centralize
`PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation.
- Added shared helper coverage in
`packages/adapter-utils/src/server-utils.test.ts` for both populated and
skip-empty cases.
- Switched the built-in local adapters (`claude_local`, `codex_local`,
`cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to
the shared helper instead of inline env assignment blocks.

## Verification

- `pnpm install`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- Result: 7 test files passed, 31 tests passed, 0 failures.

## Risks

- Low risk.
- The only behavioral surface is the shared env propagation refactor
across six adapters; if the helper diverged from prior semantics, an
adapter could miss a workspace env var.
- The shared helper test plus the affected adapter execute tests reduce
that risk, and the helper preserves the prior "set only non-empty
strings" behavior.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted
coding workflow with shell execution, file patching, git operations, and
API interaction. The exact backend model identifier and context window
are not surfaced by this local runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 13:57:35 -07:00
Devin Foley d1484551ee Add open-source hygiene note to paperclip-dev skill (#4541)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The `paperclip-dev` skill is part of the contributor and agent
workflow layer that tells developers how to work in this repository
safely.
> - That skill already references the public upstream `origin`, but it
did not explicitly say that pushes there must be treated as publishable
open-source output.
> - Without that reminder, contributors are more likely to leak secrets,
PII, private logs, machine-local config, or noisy throwaway git history
into the public repo.
> - This pull request adds a prominent `OPEN SOURCE HYGIENE` callout
near the top of the skill, before the git workflow guidance.
> - The benefit is clearer safety guidance for contributors and less
accidental disclosure or branch/commit noise on the upstream project.

## What Changed

- Added an `OPEN SOURCE HYGIENE` callout near the top of
`skills/paperclip-dev/SKILL.md`.
- Explicitly warned that anything pushed to `origin` must be
publishable.
- Called out avoiding secrets, API keys, PII, private logs,
machine-local config, and noisy throwaway branches or checkpoint
commits.

## Verification

- N/a

## Risks

- Low risk. This is a docs-only change in a skill file; the main risk is
wording tone or placement, not runtime behavior.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter, GPT-5-based
coding agent runtime. Exact backend serving model ID is not exposed in
this heartbeat environment. Tool use, shell execution, and patch
application were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 12:14:49 -07:00
Devin Foley 91333ec86f feat: add paperclip-dev skill with optional bundled skill support (#3854)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents working on the Paperclip codebase itself need guidance on dev
workflows: server lifecycle, worktrees, builds, database ops,
diagnostics
> - There was no bundled skill covering these workflows — agents had to
figure it out from scratch each time
> - Additionally, not every skill should be force-installed on every
agent — a dev-focused skill should be opt-in
> - This PR adds a `paperclip-dev` skill with `required: false`
frontmatter so it ships with Paperclip but isn't auto-installed
> - The skill's PR section references canonical files
(`.github/PULL_REQUEST_TEMPLATE.md`, `CONTRIBUTING.md`) instead of
duplicating their content, with gated instructions that force agents to
read those files before creating any PR
> - The benefit is that developers (human or agent) can opt in to
structured dev guidance without polluting the default agent skill set or
creating drift between duplicated docs

## What Changed

- Added `skills/paperclip-dev/SKILL.md` covering server management,
worktree lifecycle, builds, database ops, diagnostics, agent operations,
and common mistakes
- The Pull Requests section uses gated, reference-based instructions —
agents MUST read `.github/PULL_REQUEST_TEMPLATE.md` and
`CONTRIBUTING.md` before running `gh pr create`, with a brief checklist
of required section names (no content duplication)
- Updated `packages/adapter-utils/src/server-utils.ts` to respect
`required: false` frontmatter — optional skills are bundled but not
auto-installed on agents
- Added test in `server/src/__tests__/paperclip-skill-utils.test.ts`
verifying that optional skills are excluded from the default install set

## Verification

```bash
# Run tests
pnpm test

# Manual verification: create a fresh worktree without seeding
npx paperclipai worktree:make test-optional-skill --no-seed
cd ~/paperclip-test-optional-skill
eval "$(npx paperclipai worktree env)"
npx paperclipai run

# Verify paperclip-dev appears in company skill library but is NOT auto-assigned
# Call listPaperclipSkillEntries() — paperclip-dev should show required: false
# Call resolvePaperclipDesiredSkillNames() — paperclip-dev should NOT be in the default set

# Cleanup
npx paperclipai worktree:cleanup test-optional-skill
```

## Risks

- Low risk. The `required` field defaults to `true` when absent, so all
existing skills behave identically. Only the new `paperclip-dev` skill
sets `required: false`.

## Model Used

Claude Opus 4.6 (`claude-opus-4-6`) via Claude Code, with tool use and
extended context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 11:06:13 -07:00
Dotta c036bbfa98 Add first-class security agent role to taxonomy (#4532)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies, so agent
metadata is part of the platform's governance and audit surface.
> - The shared agent taxonomy in `@paperclipai/shared` is the source of
truth for allowed agent roles and their UI labels.
> - The current taxonomy lacks a `security` role, which causes Security
Engineer hires to collapse into `engineer`.
> - That breaks separation-of-duties evidence in telemetry and weakens
role-level audit fidelity even though it does not directly change
permissions.
> - This pull request adds `security` as a first-class shared role and
covers the prior rejection path with a regression test.
> - The benefit is that Security Engineer agents can now be persisted
and rendered under the correct role without schema or permission churn.

## What Changed

- Added `security` to `AGENT_ROLES` in
`packages/shared/src/constants.ts`.
- Added the `Security` display label to `AGENT_ROLE_LABELS` so existing
UI consumers render the new role automatically.
- Added a shared validator regression test proving `createAgentSchema`
accepts `role: "security"` and that the label stays stable.

## Verification

- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/shared exec vitest run
src/adapter-types.test.ts`

## Risks

- Low risk. This is a shared enum expansion with no database migration
and no permission-model change.
- Residual risk: this PR does not backfill existing agents already
persisted as `engineer`; it only fixes new validations and labels going
forward.

> I checked `ROADMAP.md`/`doc` for overlap and did not find an existing
planned item covering this taxonomy fix.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, with tool use and local
code execution enabled. The runtime did not surface a separate
context-window identifier in agent metadata.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 07:52:05 -05:00
Dotta df425fde96 Present ordered sub-issues as a workflow checklist (#4523)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators use issue detail pages and child issue lists to understand
multi-step execution plans.
> - Ordered sub-issues currently read like a flat table, so dependency
chains and current next steps are harder to scan.
> - The branch work adds a workflow-oriented presentation for child
issues without changing the single-assignee task model.
> - This pull request makes ordered sub-issues read more like a progress
checklist while preserving normal issue list controls.
> - The benefit is that operators can see completed steps, active work,
blocked follow-ups, and dependency order at a glance.

## What Changed

- Added workflow sorting utilities and tests for dependency-aware child
issue ordering.
- Added sub-issue progress summary, checklist numbering, current-step
affordances, blocker context, and done-state de-emphasis in the issue
list UI.
- Wired issue detail sub-issue panels to use the workflow sort/progress
checklist presentation.
- Updated issue service behavior/tests for child issue ordering inputs
used by the UI.
- Added a Storybook visual review fixture and screenshot helper for the
sub-issue workflow checklist surface.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issues-service.test.ts
ui/src/components/IssueRow.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/pages/IssueDetail.test.tsx
ui/src/lib/issue-detail-subissues.test.ts
ui/src/lib/workflow-sort.test.ts`
- Result: 6 test files passed, 55 tests passed, 34 embedded Postgres
issue-service tests skipped because `@embedded-postgres/darwin-x64` is
unavailable on this host.
- Visual review: generated Storybook screenshots from the existing local
Storybook server on port 6006 with `node
scripts/screenshot-subissues.mjs /tmp/pap-2189-subissues-screens
http://localhost:6006`.
- Screenshot artifacts:
- Desktop dark: ![Desktop
dark](doc/assets/pap-2189/desktop-1440x900-dark.png)
- Desktop light: ![Desktop
light](doc/assets/pap-2189/desktop-1440x900-light.png)
- Mobile dark: ![Mobile
dark](doc/assets/pap-2189/mobile-390x844-dark.png)
- Mobile light: ![Mobile
light](doc/assets/pap-2189/mobile-390x844-light.png)
- Local Storybook note: starting a second Storybook process selected
port 6008 because 6006 was occupied, then Vite failed with an esbuild
host/binary version mismatch (`0.25.12` host vs `0.27.3` binary). The
already-running Storybook server on 6006 served the fixture successfully
for screenshots.

## Risks

- Medium UI risk: the issue list now has additional sub-issue-specific
visual states, so dense lists should be checked for spacing and
scanability.
- Low ordering risk: workflow sorting is covered by focused unit tests,
but unusual dependency topologies may still need reviewer attention.
- No migration risk: this PR does not add database migrations or touch
`pnpm-lock.yaml`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window is runtime-provided and not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 07:36:49 -05:00
Devin Foley 40782f703d Fix release packaging for standalone public packages (#4494)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and the
sandbox-provider work just moved E2B into a standalone publishable
plugin package.
> - That plugin is intentionally excluded from the root pnpm workspace
so it can model third-party install behavior without forcing lockfile
churn in the main repo.
> - The merged architecture change exposed a follow-up release problem:
the canary publish workflow tried to publish `@paperclipai/plugin-e2b`,
but the tarball had no `dist/` payload because standalone public
packages were not being built in the release path.
> - That means the release pipeline needed a packaging fix in core
release tooling, not another architectural change in the sandbox
provider itself.
> - This pull request adds a generic release step for public packages
that live outside the pnpm workspace, instead of hardcoding E2B-specific
behavior into the release script.
> - The benefit is that standalone publishable packages can be built and
packed correctly during release, including future sandbox-provider
plugins that follow the same pattern.

## What Changed

- Added `scripts/build-standalone-public-packages.mjs` to discover
public packages outside the pnpm workspace, run a clean package-local
install, and build them before publish.
- Updated `scripts/release.sh` to invoke that helper immediately after
the normal workspace build step.
- Kept the behavior generic by driving off the existing public package
map and pnpm workspace patterns rather than special-casing
`@paperclipai/plugin-e2b`.

## Verification

- `rm -rf packages/plugins/sandbox-providers/e2b/dist`
- `node ./scripts/build-standalone-public-packages.mjs`
- `cd packages/plugins/sandbox-providers/e2b && npm pack --dry-run`
- Confirm the tarball now includes the rebuilt `dist/` files instead of
only `README.md` / `package.json`

## Risks

- Low risk: this only changes the release build path for public packages
outside the pnpm workspace.
- The helper performs a clean package-local install for each standalone
public package, so release time may increase slightly as more such
packages are added.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 12:16:23 -07:00
Devin Foley 4ef969f084 Add E2B sandbox provider plugin (#4452)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Sandbox environments are part of that execution layer, and the
recent core refactor moved provider-specific behavior to a generic
plugin seam
> - This pull request adds a dedicated `@paperclipai/plugin-e2b` package
so E2B can live entirely outside core host code
> - Because the feature is still unreleased, the plugin should model
third-party packaging directly instead of carrying extra
backward-compatibility complexity in core or the workspace lockfile
> - This branch therefore makes the E2B provider a standalone
publishable package, documents the package-local dev flow, and keeps the
publish manifest/runtime dependency story correct
> - The benefit is that E2B becomes a true plugin reference
implementation that can be installed by package name without reopening
core Paperclip code

## What Changed

- Added `packages/plugins/paperclip-plugin-e2b` as the E2B sandbox
provider plugin package
- Implemented config validation, lease acquire/resume/release/destroy
handlers, workspace realization, and command execution for E2B sandboxes
- Excluded the E2B plugin package from the root workspace so the repo no
longer needs `pnpm-lock.yaml` churn for its third-party dependency graph
- Added package-local development/install support plus a prepack
manifest generator so the published tarball still declares
`@paperclipai/plugin-sdk` and `e2b` runtime dependencies
- Addressed review feedback by fixing sandbox cleanup on acquire
failures, rejecting blank templates, normalizing fractional `timeoutMs`,
and always passing the configured template name to the E2B SDK
- Updated focused Vitest coverage for config normalization, validation,
acquire cleanup, command execution, and lease release behavior
- Updated the Dockerfile deps stage to copy the E2B package manifest so
the policy check stays in sync

## Verification

- `cd packages/plugins/paperclip-plugin-e2b && pnpm install
--ignore-workspace --no-lockfile`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm build`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
test`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
typecheck`
- `cd packages/plugins/paperclip-plugin-e2b && npm pack --dry-run`

## Risks

- The package now relies on a prepack manifest rewrite so the
publish-time dependency list stays correct while the repo-local dev
manifest stays workspace-light
- The current repo snapshot is still unreleased, so the generated
publish manifest points at the repo SDK version until the normal release
flow rewrites versions before publish
- Real-world E2B environments may still expose edge cases around
lifecycle timing or sandbox metadata beyond the mocked unit coverage

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 11:01:11 -07:00
Devin Foley 5bd0f578fd Generalize sandbox provider core for plugin-only providers (#4449)
## Thinking Path

> - Paperclip is a control plane, so optional execution providers should
sit at the plugin edge instead of hardcoding provider-specific behavior
into core shared/server/ui layers.
> - Sandbox environments are already first-class, and the fake provider
proves the built-in path; the remaining gap was that real providers
still leaked provider-specific config and runtime assumptions into core.
> - That coupling showed up in config normalization, secret persistence,
capabilities reporting, lease reconstruction, and the board UI form
fields.
> - As long as core knew about those provider-shaped details, shipping a
provider as a pure third-party plugin meant every new provider would
still require host changes.
> - This pull request generalizes the sandbox provider seam around
schema-driven plugin metadata and generic secret-ref handling.
> - The runtime and UI now consume provider metadata generically, so
core only special-cases the built-in fake provider while third-party
providers can live entirely in plugins.

## What Changed

- Added generic sandbox-provider capability metadata so plugin-backed
providers can expose `configSchema` through shared environment support
and the environments capabilities API.
- Reworked sandbox config normalization/persistence/runtime resolution
to handle schema-declared secret-ref fields generically, storing them as
Paperclip secrets and resolving them for probe/execute/release flows.
- Generalized plugin sandbox runtime handling so provider validation,
reusable-lease matching, lease reconstruction, and plugin worker calls
all operate on provider-agnostic config instead of provider-shaped
branches.
- Replaced hardcoded sandbox provider form fields in Company Settings
with schema-driven rendering and blocked agent environment selection
from the built-in fake provider.
- Added regression coverage for the generic seam across shared support
helpers plus environment config, probe, routes, runtime, and
sandbox-provider runtime tests.

## Verification

- `pnpm vitest --run packages/shared/src/environment-support.test.ts
server/src/__tests__/environment-config.test.ts
server/src/__tests__/environment-probe.test.ts
server/src/__tests__/environment-routes.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/sandbox-provider-runtime.test.ts`
- `pnpm -r typecheck`

## Risks

- Plugin sandbox providers now depend more heavily on accurate
`configSchema` declarations; incorrect schemas can misclassify
secret-bearing fields or omit required config.
- Reusable lease matching is now metadata-driven for plugin-backed
providers, so providers that fail to persist stable metadata may
reprovision instead of resuming an existing lease.
- The UI form is now fully schema-driven for plugin-backed sandbox
providers; provider manifests without good defaults or descriptions may
produce a rougher operator experience.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, and local code/test
inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 18:03:41 -07:00
Dotta deba60ebb2 Stabilize serialized server route tests (#4448)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The server route suite is a core confidence layer for auth, issue
context, and workspace runtime behavior
> - Some route tests were doing extra module/server isolation work that
made local runs slower and more fragile
> - The stable Vitest runner also needs to pass server-relative exclude
paths to avoid accidentally re-including serialized suites
> - This pull request tightens route test isolation and runner
serialization behavior
> - The benefit is more reliable targeted and stable-route test
execution without product behavior changes

## What Changed

- Updated `run-vitest-stable.mjs` to exclude serialized server tests
using server-relative paths.
- Forced the server Vitest config to use a single worker in addition to
isolated forks.
- Simplified agent permission route tests to create per-request test
servers without shared server lifecycle state.
- Stabilized issue goal context route mocks by using static mocked
services and a sequential suite.
- Re-registered workspace runtime route mocks before cache-busted route
imports.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts
server/src/__tests__/workspace-runtime-routes-authz.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `node --check scripts/run-vitest-stable.mjs`

## Risks

- Low risk. This is test infrastructure only.
- The stable runner path fix changes which tests are excluded from the
non-serialized server batch, matching the server project root that
Vitest applies internally.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:27:00 -05:00
Dotta f68e9caa9a Polish markdown external link wrapping (#4447)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI renders agent comments, PR links, issue links, and
operational markdown throughout issue threads
> - Long GitHub and external links can wrap awkwardly, leaving icons
orphaned from the text they describe
> - Small inbox visual polish also helps repeated board scanning without
changing behavior
> - This pull request glues markdown link icons to adjacent link
characters and removes a redundant inbox list border
> - The benefit is cleaner, more stable markdown and inbox rendering for
day-to-day operator review

## What Changed

- Added an external-link indicator for external markdown links.
- Kept the GitHub icon attached to the first link character so it does
not wrap onto a separate line.
- Kept the external-link icon attached to the final link character so it
does not wrap away from the URL/text.
- Added markdown rendering regressions for GitHub and external link icon
wrapping.
- Removed the extra border around the inbox list card.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Low risk. The markdown change is limited to link child rendering and
preserves existing href/target/rel behavior.
- Visual-only inbox polish.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:26:13 -05:00
Dotta 73fbdf36db Gate stale-run watchdog decisions by board access (#4446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The run ledger surfaces stale-run watchdog evaluation issues and
recovery actions
> - Viewer-level board users should be able to inspect status without
getting controls that the server will reject
> - The UI also needs enough board-access context to know when to hide
those decision actions
> - This pull request exposes board memberships in the current board
access snapshot and gates watchdog action controls for known viewer
contexts
> - The benefit is clearer least-privilege UI behavior around recovery
controls

## What Changed

- Included memberships in `/api/cli-auth/me` so the board UI can
distinguish active viewer memberships from operator/admin access.
- Added the stale-run evaluation issue assignee to output silence
summaries.
- Hid stale-run watchdog decision buttons for known non-owner viewer
contexts.
- Surfaced watchdog decision failures through toast and inline error
text.
- Threaded `companyId` through the issue activity run ledger so access
checks are company-scoped.
- Added IssueRunLedger coverage for non-owner viewers.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Medium-low risk. This is a UI gating change backed by existing server
authorization.
- Local implicit and instance-admin board contexts continue to show
watchdog decision controls.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:25:23 -05:00
Dotta 6916e30f8e Cancel stale retries when issue ownership changes (#4445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue execution is guarded by run locks and bounded retry scheduling
> - A failed run can schedule a retry, but the issue may be reassigned
before that retry becomes due
> - The old assignee's scheduled retry should not continue to hold or
reclaim execution for the issue
> - This pull request cancels stale scheduled retries when ownership
changes and cancels live work when an issue is explicitly cancelled
> - The benefit is cleaner issue handoff semantics and fewer stranded or
incorrect execution locks

## What Changed

- Cancel scheduled retry runs when their issue has been reassigned
before the retry is promoted.
- Clear stale issue execution locks and cancel the associated wakeup
request when a stale retry is cancelled.
- Avoid deferring a new assignee behind a previous assignee's scheduled
retry.
- Cancel an active run when an issue status is explicitly changed to
`cancelled`, while leaving `done` transitions alone.
- Added route and heartbeat regressions for reassignment and
cancellation behavior.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
  - `issue-comment-reopen-routes.test.ts`: 28 passed.
- `heartbeat-retry-scheduling.test.ts`: skipped by the existing embedded
Postgres host guard (`Postgres init script exited with code null`).
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Medium risk because this changes heartbeat retry lifecycle behavior.
- The cancellation path is scoped to scheduled retries whose issue
assignee no longer matches the retrying agent, and logs a lifecycle
event for auditability.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:24:13 -05:00
Dotta 0c6961a03e Normalize escaped multiline issue and approval text (#4444)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board and agent APIs accept multiline issue, approval,
interaction, and document text
> - Some callers accidentally send literal escaped newline sequences
like `\n` instead of JSON-decoded line breaks
> - That makes comments, descriptions, documents, and approval notes
render as flattened text instead of readable markdown
> - This pull request centralizes multiline text normalization in shared
validators
> - The benefit is newline-preserving API behavior across issue and
approval workflows without route-specific fixes

## What Changed

- Added a shared `multilineTextSchema` helper that normalizes escaped
`\n`, `\r\n`, and `\r` sequences to real line breaks.
- Applied the helper to issue descriptions, issue update comments, issue
comment bodies, suggested task descriptions, interaction summaries,
issue documents, approval comments, and approval decision notes.
- Added shared validator regressions for issue and approval multiline
inputs.

## Verification

- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/approval.test.ts
packages/shared/src/validators/issue.test.ts`
- `pnpm --filter @paperclipai/shared typecheck`

## Risks

- Low risk. This only changes text fields that are explicitly multiline
user/operator content.
- If a caller intentionally wanted literal backslash-n text in these
fields, it will now render as a real line break.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 18:02:45 -05:00
Dotta 5a0c1979cf [codex] Add runtime lifecycle recovery and live issue visibility (#4419) 2026-04-24 15:50:32 -05:00
Dotta 9a8d219949 [codex] Stabilize tests and local maintenance assets (#4423)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - A fast-moving control plane needs stable local tests and repeatable
local maintenance tools so contributors can safely split and review work
> - Several route suites needed stronger isolation, Codex manual model
selection needed a faster-mode option, and local browser cleanup missed
Playwright's headless shell binary
> - Storybook static output also needed to be preserved as a generated
review artifact from the working branch
> - This pull request groups the test/local-dev maintenance pieces so
they can be reviewed separately from product runtime changes
> - The benefit is more predictable contributor verification and cleaner
local maintenance without mixing these changes into feature PRs

## What Changed

- Added stable Vitest runner support and serialized route/authz test
isolation.
- Fixed workspace runtime authz route mocks and stabilized
Claude/company-import related assertions.
- Allowed Codex fast mode for manually selected models.
- Broadened the agent browser cleanup script to detect
`chrome-headless-shell` as well as Chrome for Testing.
- Preserved generated Storybook static output from the source branch.

## Verification

- `pnpm exec vitest run
src/__tests__/workspace-runtime-routes-authz.test.ts
src/__tests__/claude-local-execute.test.ts --config vitest.config.ts`
from `server/` passed: 2 files, 19 tests.
- `pnpm exec vitest run src/server/codex-args.test.ts --config
vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file,
3 tests.
- `bash -n scripts/kill-agent-browsers.sh &&
scripts/kill-agent-browsers.sh --dry` passed; dry-run detected
`chrome-headless-shell` processes without killing them.
- `test -f ui/storybook-static/index.html && test -f
ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed.
- `git diff --check public-gh/master..pap-2228-test-local-maintenance --
. ':(exclude)ui/storybook-static'` passed.
- `pnpm exec vitest run
cli/src/__tests__/company-import-export-e2e.test.ts --config
cli/vitest.config.ts` did not complete in the isolated split worktree
because `paperclipai run` exited during build prep with `TS2688: Cannot
find type definition file for 'react'`; this appears to be caused by the
worktree dependency symlink setup, not the code under test.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: the stable Vitest runner changes how route/authz tests
are scheduled.
- Generated `ui/storybook-static` files are large and contain minified
third-party output; `git diff --check` reports whitespace inside those
generated assets, so reviewers may choose to drop or regenerate that
artifact before merge.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable to source UI behavior;
the included Storybook static output is generated artifact preservation
from the source branch.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 15:11:42 -05:00
Devin Foley 70679a3321 Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.

## What Changed

- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.

## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly

## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
Dotta 641eb44949 [codex] Harden create-agent skill governance (#4422)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring agents is a governance-sensitive workflow because it grants
roles, adapter config, skills, and execution capability
> - The create-agent skill needs explicit templates and review guidance
so hires are auditable and not over-permissioned
> - Skill sync also needs to recognize bundled Paperclip skills
consistently for Codex local agents
> - This pull request expands create-agent role templates, adds a
security-engineer template, and documents capability/secret-handling
review requirements
> - The benefit is safer, more repeatable agent creation with clearer
approval payloads and less permission sprawl

## What Changed

- Expanded `paperclip-create-agent` guidance for template selection,
adjacent-template drafting, and role-specific review bars.
- Added a Security Engineer agent template and collaboration/safety
sections for Coder, QA, and UX Designer templates.
- Hardened draft-review guidance around desired skills, external-system
access, secrets, and confidential advisory handling.
- Updated LLM agent-configuration guidance to point hiring workflows at
the create-agent skill.
- Added tests for bundled skill sync, create-agent skill injection, hire
approval payloads, and LLM route guidance.

## Verification

- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
server/src/__tests__/codex-local-skill-injection.test.ts
server/src/__tests__/codex-local-skill-sync.test.ts
server/src/__tests__/llms-routes.test.ts
server/src/__tests__/paperclip-skill-utils.test.ts --config
server/vitest.config.ts` passed: 5 files, 23 tests.
- `git diff --check public-gh/master..pap-2228-create-agent-governance
-- . ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this primarily changes skills/docs and tests, but
it affects future hiring guidance and approval expectations.
- Reviewers should check whether the new Security Engineer template is
too broad for default company installs.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable; this PR changes
skills, docs, and server tests.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:15:28 -05:00
Dotta 77a72e28c2 [codex] Polish issue composer and long document display (#4420)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue comments and documents are the main working surface where
operators and agents collaborate
> - File drops, markdown editing, and long issue descriptions need to
feel predictable because they sit directly in the task execution loop
> - The composer had edge cases around drag targets, attachment
feedback, image drops, and long markdown content crowding the page
> - This pull request polishes the issue composer, hardens markdown
editor regressions, and adds a fold curtain for long issue
descriptions/documents
> - The benefit is a calmer issue detail surface that handles uploads
and long work products without hiding state or breaking layout

## What Changed

- Scoped issue-composer drag/drop behavior so the composer owns file
drops without turning the whole thread into a competing drop target.
- Added clearer attachment upload feedback for non-image files and
image-drop stability coverage.
- Hardened markdown editor and markdown body handling around HTML-like
tag regressions.
- Added `FoldCurtain` and wired it into issue descriptions and issue
documents so long markdown previews can expand/collapse.
- Added Storybook coverage for the fold curtain state.

## Verification

- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/components/MarkdownBody.test.tsx --config ui/vitest.config.ts`
passed: 3 files, 75 tests.
- `git diff --check public-gh/master..pap-2228-editor-composer-polish --
. ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this changes user-facing composer/drop behavior
and long markdown display.
- The fold curtain uses DOM measurement and `ResizeObserver`; reviewers
should check browser behavior for very long descriptions and documents.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshots were not newly captured during branch splitting; the
UI states are covered by component tests and a Storybook story.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:12:41 -05:00
Dotta 8f1cd0474f [codex] Improve transient recovery and Codex model refresh (#4383)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapter execution and retry classification decide whether agent work
pauses, retries, or recovers automatically
> - Transient provider failures need to be classified precisely so
Paperclip does not convert retryable upstream conditions into false hard
failures
> - At the same time, operators need an up-to-date model list for
Codex-backed agents and prompts should nudge agents toward targeted
verification instead of repo-wide sweeps
> - This pull request tightens transient recovery classification for
Claude and Codex, updates the agent prompt guidance, and adds Codex
model refresh support end-to-end
> - The benefit is better automatic retry behavior plus fresher
operator-facing model configuration

## What Changed

- added Codex usage-limit retry-window parsing and Claude extra-usage
transient classification
- normalized the heartbeat transient-recovery contract across adapter
executions and heartbeat scheduling
- documented that deferred comment wakes only reopen completed issues
for human/comment-reopen interactions, while system follow-ups leave
closed work closed
- updated adapter-utils prompt guidance to prefer targeted verification
- added Codex model refresh support in the server route, registry,
shared types, and agent config form
- added adapter/server tests covering the new parsing, retry scheduling,
and model-refresh behavior

## Verification

- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-claude-local
packages/adapters/claude-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-codex-local
packages/adapters/codex-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/adapter-model-refresh-routes.test.ts
server/src/__tests__/adapter-models.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts`

## Risks

- Moderate behavior risk: retry classification affects whether runs
auto-recover or block, so mistakes here could either suppress needed
retries or over-retry real failures
- Low workflow risk: deferred comment wake reopening is intentionally
scoped to human/comment-reopen interactions so system follow-ups do not
revive completed issues unexpectedly

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:40:40 -05:00
Dotta 4fdbbeced3 [codex] Refine markdown issue reference rendering (#4382)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Task references are a core part of how operators understand issue
relationships across the UI
> - Those references appear both in markdown bodies and in sidebar
relationship panels
> - The rendering had drifted between surfaces, and inline markdown
pills were reading awkwardly inside prose and lists
> - This pull request unifies the underlying issue-reference treatment,
routes issue descriptions through `MarkdownBody`, and switches inline
markdown references to a cleaner text-link presentation
> - The benefit is more consistent issue-reference UX with better
readability in markdown-heavy views

## What Changed

- unified sidebar and markdown issue-reference rendering around the
shared issue-reference components
- routed resting issue descriptions through `MarkdownBody` so
description previews inherit the richer issue-reference treatment
- replaced inline markdown pill chrome with a cleaner inline reference
presentation for prose contexts
- added and updated UI tests for `MarkdownBody` and `InlineEditor`

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx
ui/src/components/InlineEditor.test.tsx`

## Risks

- Moderate UI risk: issue-reference rendering now differs intentionally
between inline markdown and relationship sidebars, so regressions would
show up as styling or hover-preview mismatches

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:39:21 -05:00
Dotta 7ad225a198 [codex] Improve issue thread review flow (#4381)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue detail is where operators coordinate review, approvals, and
follow-up work with active runs
> - That thread UI needs to surface blockers, descendants, review
handoffs, and reply ergonomics clearly enough for humans to guide agent
work
> - Several small gaps in the issue-thread flow were making review and
navigation clunkier than necessary
> - This pull request improves the reply composer, descendant/blocker
presentation, interaction folding, and review-request handoff plumbing
together as one cohesive issue-thread workflow slice
> - The benefit is a cleaner operator review loop without changing the
broader task model

## What Changed

- restored and refined the floating reply composer behavior in the issue
thread
- folded expired confirmation interactions and improved post-submit
thread scrolling behavior
- surfaced descendant issue context and inline blocker/paused-assignee
notices on the issue detail view
- tightened large-board first paint behavior in `IssuesList`
- added loose review-request handoffs through the issue
execution-policy/update path and covered them with tests

## Verification

- `pnpm vitest run ui/src/pages/IssueDetail.test.tsx`
- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-execution-policy.test.ts`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/lib/issue-tree.test.ts
ui/src/api/issues.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-comment-reopen-routes.test.ts -t "coerces
executor handoff patches into workflow-controlled review wakes|wakes the
return assignee with execution_changes_requested"`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issues-service.test.ts`

## Visual Evidence

- UI layout changes are covered by the focused issue-thread component
and issue-detail tests listed above. Browser screenshots were not
attachable from this automated greploop environment, so reviewers should
use the running preview for final visual confirmation.

## Risks

- Moderate UI-flow risk: these changes touch the issue detail experience
in multiple spots, so regressions would most likely show up as
thread-layout quirks or incorrect review-handoff behavior

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or documented the visual verification path
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 08:02:45 -05:00
Dotta 35a9dc37b0 [codex] Speed up company skill detail loading (#4380)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the control plane for distributing
reusable capabilities
> - Board flows that inspect company skill detail should stay responsive
because they are operator-facing control-plane reads
> - The existing detail path was doing broader work than needed for the
specific detail screen
> - This pull request narrows that company-skill detail loading path and
adds a regression test around it
> - The benefit is faster company skill detail reads without changing
the external API contract

## What Changed

- tightened the company-skill detail loading path in
`server/src/services/company-skills.ts`
- added `server/src/__tests__/company-skills-detail.test.ts` to verify
the detail route only pulls the required data

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-detail.test.ts`

## Risks

- Low risk: this only changes the company-skill detail query path, but
any missed assumption in the detail consumer would surface when loading
that screen

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 07:37:13 -05:00
Devin Foley e4995bbb1c Add SSH environment support (#4358)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation

## What Changed

- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.

## Verification

- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
  - enabled the experimental flag
  - created an SSH environment
  - created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back

## Risks

- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.

## Model Used

- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
Dotta f98c348e2b [codex] Add issue subtree pause, cancel, and restore controls (#4332)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - This branch extends the issue control-plane so board operators can
pause, cancel, and later restore whole issue subtrees while keeping
descendant execution and wake behavior coherent.
> - That required new hold state in the database, shared contracts,
server routes/services, and issue detail UI controls so subtree actions
are durable and auditable instead of ad hoc.
> - While this branch was in flight, `master` advanced with new
environment lifecycle work, including a new `0065_environments`
migration.
> - Before opening the PR, this branch had to be rebased onto
`paperclipai/paperclip:master` without losing the existing
subtree-control work or leaving conflicting migration numbering behind.
> - This pull request rebases the subtree pause/cancel/restore feature
cleanly onto current `master`, renumbers the hold migration to
`0066_issue_tree_holds`, and preserves the full branch diff in a single
PR.
> - The benefit is that reviewers get one clean, mergeable PR for the
subtree-control feature instead of stale branch history with migration
conflicts.

## What Changed

- Added durable issue subtree hold data structures, shared
API/types/validators, server routes/services, and UI flows for subtree
pause, cancel, and restore operations.
- Added server and UI coverage for subtree previewing, hold
creation/release, dependency-aware scheduling under holds, and issue
detail subtree controls.
- Rebased the branch onto current `paperclipai/paperclip:master` and
renumbered the branch migration from `0065_issue_tree_holds` to
`0066_issue_tree_holds` so it no longer conflicts with upstream
`0065_environments`.
- Added a small follow-up commit that makes restore requests return `200
OK` explicitly while keeping pause/cancel hold creation at `201
Created`, and updated the route test to match that contract.

## Verification

- `pnpm --filter @paperclipai/db typecheck`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `cd server && pnpm exec vitest run
src/__tests__/issue-tree-control-routes.test.ts
src/__tests__/issue-tree-control-service.test.ts
src/__tests__/issue-tree-control-service-unit.test.ts
src/__tests__/heartbeat-dependency-scheduling.test.ts`
- `cd ui && pnpm exec vitest run src/components/IssueChatThread.test.tsx
src/pages/IssueDetail.test.tsx`

## Risks

- This is a broad cross-layer change touching DB/schema, shared
contracts, server orchestration, and UI; regressions are most likely
around subtree status restoration or wake suppression/resume edge cases.
- The migration was renumbered during PR prep to avoid the new upstream
`0065_environments` conflict. Reviewers should confirm the final
`0066_issue_tree_holds` ordering is the only hold-related migration that
lands.
- The issue-tree restore endpoint now responds with `200` instead of
relying on implicit behavior, which is semantically better for a restore
operation but still changes an API detail that clients or tests could
have assumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent in the Paperclip Codex runtime (GPT-5-class
tool-using coding model; exact deployment ID/context window is not
exposed inside this session).

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-23 14:51:46 -05:00
Russell Dempsey 854fa81757 fix(pi-local): prepend installed skill bin/ dirs to child PATH (#4331)
## Thinking Path

> - Paperclip orchestrates AI agents; each agent runs under an adapter
that spawns a model CLI as a child process.
> - The pi-local adapter (`packages/adapters/pi-local`) spawns `pi` and
inherits the child's shell environment — including `PATH`, which
determines what the child's bash tool can execute by name.
> - Paperclip skills ship executable helpers under `<skill>/bin/` (e.g.
`paperclip-get-issue`) and Reviewer/QA-style `AGENTS.md` files invoke
them by name via the agent's bash tool.
> - Pi-local builds its runtime env with `ensurePathInEnv({
...process.env, ...env })` only — it never adds the installed skills'
`bin/` dirs to PATH. The pi CLI's `--skill` arg loads each skill's
SKILL.md but does not augment PATH.
> - Consequence: every bash invocation of a skill helper fails with
`exit 127: command not found`. The agent then spends its heartbeat
guessing (re-reading SKILL.md, trying `find`, inventing command paths)
and either times out or gives up.
> - This PR prepends each injected skill's `bin/` directory to the child
PATH immediately before runtimeEnv is constructed.
> - The benefit: pi_local agents whose AGENTS.md uses any `paperclip-*`
skill helper can actually run those helpers.

## What Changed

- `packages/adapters/pi-local/src/server/execute.ts`: compute
`skillBinDirs` from the already-resolved `piSkillEntries`, dedupe
against the existing PATH, prepend them to whichever of `PATH` / `Path`
the merged env uses, then build `runtimeEnv`. No new helpers, no
adapter-utils changes.

## Verification

Manual repro before the fix:

1. Create a pi_local agent wired to a paperclip skill (e.g.
paperclip-control).
2. Wake the agent on an in_review issue with an AGENTS.md that starts
with `paperclip-get-issue "$PAPERCLIP_TASK_ID"`.
3. Session file: `{ "role": "toolResult", "isError": true, "content": [{
"text": "/bin/bash: paperclip-get-issue: command not found\n\nCommand
exited with code 127" }] }`.

After the fix: same wake; `paperclip-get-issue` resolves and returns the
issue JSON; agent proceeds.

Local commands:

```
pnpm --filter @paperclipai/adapter-pi-local typecheck   # clean
pnpm --filter @paperclipai/adapter-pi-local build       # clean
pnpm --filter @paperclipai/server exec vitest run \
  src/__tests__/pi-local-execute.test.ts \
  src/__tests__/pi-local-adapter-environment.test.ts \
  src/__tests__/pi-local-skill-sync.test.ts
# 5/5 passing
```

No new tests: the existing `pi-local-skill-sync.test.ts` covers skill
symlink injection (upstream of the PATH step), and
`pi-local-execute.test.ts` covers the spawn path; this change only
augments env on the same spawn path.

## Risks

Low. Pure PATH augmentation on the child env. Edge cases:

- Zero skills installed → no PATH change (guarded by
`skillBinDirs.length > 0`).
- Duplicate bin dirs already on PATH → deduped; no pollution on re-runs.
- Windows `Path` casing → falls back correctly when merged env uses
`Path` instead of `PATH`.
- Skill dir without `bin/` subdir → joined path simply won't resolve;
harmless.

No behavioral change for pi_local agents that don't use skill-provided
commands.

## Model Used

- Claude, `claude-opus-4-7` (1M context), extended thinking enabled,
tool use enabled. Walked pi-local/cursor-local/claude-local and
adapter-utils to isolate the gap, wrote the inlined fix, and ran
typecheck/build/test locally.

## Checklist

- [x] Thinking path from project context to this change
- [x] Model used specified
- [x] Checked ROADMAP.md — no overlap
- [x] Tests run locally, passing
- [x] Tests added — new case in
`server/src/__tests__/pi-local-execute.test.ts`; verified it fails when
the fix is reverted
- [ ] UI screenshots — N/A (backend adapter change)
- [x] Docs updated — N/A (internal adapter, no user-facing docs)
- [x] Risks documented
- [x] Will address reviewer comments before merge
2026-04-23 10:15:10 -05:00
Dotta fe14de504c [codex] Document README architecture systems (#4250)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - The public README is the first place many operators and contributors
learn what the product already includes.
> - The existing README explained the product promise but did not give a
compact, concrete tour of the major systems behind it.
> - This made Paperclip easier to underestimate as a wrapper around
agents instead of a full control plane with identity, work, execution,
governance, budgets, plugins, and portability.
> - This pull request adds an under-the-hood README section that names
those systems and shows how adapters connect into the server.
> - Greptile caught consistency gaps between the diagram and prose, so
the final version aligns the system labels and adapter examples across
both surfaces.
> - The benefit is a clearer first-read model of Paperclip's
architecture and shipped capabilities without changing runtime behavior.

## What Changed

- Added a `What's Under the Hood` section to `README.md`.
- Added an ASCII architecture diagram for the Paperclip server and
external agent adapters.
- Added a systems table covering identity, org charts, tasks, heartbeat
execution, workspaces, governance, budgets, routines, plugins,
secrets/storage, activity/events, and company portability.
- Addressed Greptile feedback by aligning diagram labels with table rows
and grouping adapter examples consistently.

## Verification

- `git diff --check public-gh/master...HEAD`
- Attempted `pnpm exec prettier --check README.md`, but this checkout
does not expose a `prettier` binary through `pnpm exec`.
- Greptile review rerun passed after addressing its two comments; review
threads are resolved.
- Remote PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`security/snyk (cryppadotta)`, and `Greptile Review`.
- Not run locally: Vitest/build suites, because this is a README-only
documentation change and the PR's remote `verify` job ran typecheck,
tests, build, and release canary dry run.

## Risks

- Low runtime risk: documentation-only change.
- The main risk is wording drift if the README overstates or
underspecifies evolving product capabilities; the section was aligned
against the current product/spec docs and roadmap.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex / GPT-5 coding agent in a Paperclip heartbeat, with shell
and GitHub CLI tool use. Exact runtime model identifier and context
window were not exposed by the adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-23 09:48:19 -05:00
Michel Tomas 3d15798c22 fix(adapters/routes): apply resolveExternalAdapterRegistration on hot-install (#4324)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The external adapter plugin system (#2218) lets adapters ship as npm
modules loaded via `server/src/adapters/plugin-loader.ts`; since #4296
merged, each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) and have it preserved through the init-time
load via the new `resolveExternalAdapterRegistration` helper
> - #4296 fixed the init-time IIFE path at
`server/src/adapters/registry.ts:363-369` but noted that the hot-install
path at `server/src/routes/adapters.ts:174
registerWithSessionManagement` still unconditionally overwrites
module-provided `sessionManagement` during `POST /api/adapters/install`
> - Practical impact today: an external adapter installed via the API
needs a Paperclip restart before its declared `sessionManagement` takes
effect — the IIFE runs on next boot and preserves it, but until then the
hot-install overwrite wins
> - This PR closes that parity gap: `registerWithSessionManagement`
delegates to the same `resolveExternalAdapterRegistration` helper
introduced by #4296, unifying both load paths behind one resolver
> - The benefit is consistent behaviour between cold-start and
hot-install: no "install then restart" ritual; declared
`sessionManagement` on an external module is honoured the moment `POST
/api/adapters/install` returns 201

## What Changed

- `server/src/routes/adapters.ts`: `registerWithSessionManagement`
delegates to the exported `resolveExternalAdapterRegistration` helper
(added in #4296). Honours module-provided `sessionManagement` first,
falls back to host registry lookup, defaults `undefined`. Updated the
section comment to document the parity-with-IIFE intent.
- `server/src/routes/adapters.ts`: dropped the now-unused
`getAdapterSessionManagement` import.
- `server/src/adapters/registry.ts`: updated the JSDoc on
`resolveExternalAdapterRegistration` — previously said "Exported for
unit tests; runtime callers use the IIFE below", now says the helper is
used by both the init-time IIFE and the hot-install path in
`routes/adapters.ts`. Addresses Greptile C1.
- `server/src/__tests__/adapter-routes.test.ts`: new integration test —
installs a mocked external adapter module carrying a non-trivial
`sessionManagement` declaration and asserts
`findServerAdapter(type).sessionManagement` preserves it after `POST
/api/adapters/install` returns 201.
- `server/src/__tests__/adapter-routes.test.ts`: added
`findServerAdapter` to the shared test-scope variable set so the new
test can inspect post-install registry state.

## Verification

Targeted test runs from a clean tree on
`fix/external-session-management-hot-install` (rebased onto current
`upstream/master` now that #4296 has merged):

- `pnpm test server/src/__tests__/adapter-routes.test.ts` — 6 passed
(new test + 5 pre-existing)
- `pnpm test server/src/__tests__/adapter-registry.test.ts` — 15 passed
(ensures the IIFE path from #4296 continues to behave correctly)
- `pnpm -w run test` full workspace suite — 1923 passed / 1 skipped
(unrelated skip)

End-to-end smoke on file:
[`@superbiche/cline-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/cline-paperclip-adapter)
and
[`@superbiche/qwen-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/qwen-paperclip-adapter),
both public on npm, both declare `sessionManagement`. With this PR in
place, the "restart after install" step disappears — the declared
compaction policy is active immediately after the install response.

## Risks

- Low risk. The change replaces an inline mutation with a call to a
helper that already has dedicated unit coverage (#4296 added three tests
for `resolveExternalAdapterRegistration` covering module-provided,
registry-fallback, and undefined paths). Behaviour is a strict superset
of the prior path — externals that did not declare `sessionManagement`
continue to get the hardcoded-registry lookup; externals that did
declare it now have those values preserved instead of overwritten.
- No migration impact. The stored plugin records
(`~/.paperclip/adapter-plugins.json`) are unchanged. Existing
hot-installed adapters behave correctly before and after.
- No behavioural change for builtin adapters; they hit
`registerServerAdapter` directly and never flow through
`registerWithSessionManagement`.

## Model Used

- Provider and model: Claude (Anthropic) via Claude Code
- Model ID: `claude-opus-4-7` (1M context)
- Reasoning mode: standard (no extended thinking on this PR)
- Tool use: yes — file edits, subprocess invocations for
builds/tests/git via the Claude Code harness

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — server-only change)
- [x] I have updated relevant documentation to reflect my changes (the
JSDoc on `resolveExternalAdapterRegistration` and the section comment
above `registerWithSessionManagement` now document the parity-with-IIFE
intent)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 09:45:24 -05:00
Michel Tomas 24232078fd fix(adapters/registry): honor module-provided sessionManagement for external adapters (#4296)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters are how paperclip hands work off to specific agent
runtimes; since #2218, external adapter packages can ship as npm modules
loaded via `server/src/adapters/plugin-loader.ts`
> - Each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) — but the init-time load at
`registry.ts:363-369` hard-overwrote it with a hardcoded-registry lookup
that has no entries for external types, so modules could not actually
set these fields
> - The hot-install path at `routes/adapters.ts:179` →
`registerServerAdapter` preserves module-provided `sessionManagement`,
so externals worked after `POST /api/adapters/install` — *until the next
server restart*, when the init-time IIFE wiped it back to `undefined`
> - #2218 explicitly deferred this: *"Adapter execution model, heartbeat
protocol, and session management are untouched."* This PR is the natural
follow-up for session management on the plugin-loader path
> - This PR aligns init-time registration with the hot-install path:
honor module-provided `sessionManagement` first, fall back to the
hardcoded registry when absent (so externals overriding a built-in type
still inherit its policy). Extracted as a testable helper with three
unit tests
> - The benefit is external adapters can declare session-resume
capabilities consistently across cold-start and hot-install, without
requiring upstream additions to the hardcoded registry for each new
plugin

## What Changed

- `server/src/adapters/registry.ts`: extracted the merge logic into a
new exported helper `resolveExternalAdapterRegistration()` — honors
module-provided `sessionManagement` first, falls back to
`getAdapterSessionManagement(type)`, else `undefined`. The init-time
IIFE calls the helper instead of inlining an overwrite.
- `server/src/adapters/registry.ts`: updated the section comment (lines
331–340) to reflect the new semantics and cross-reference the
hot-install path's behavior.
- `server/src/__tests__/adapter-registry.test.ts`: new
`describe("resolveExternalAdapterRegistration")` block with three tests
— module-provided value preserved, registry fallback when module omits,
`undefined` when neither provides.

## Verification

Targeted test run from a clean tree on
`fix/external-session-management`:

```
cd server && pnpm exec vitest run src/__tests__/adapter-registry.test.ts
# 1 test file, 15 tests passed, 0 failed (12 pre-existing + 3 new)
```

Full server suite via the independent review pass noted under Model
Used: **1,156 tests passed, 0 failed**.

Typecheck note: `pnpm --filter @paperclipai/server exec tsc --noEmit`
surfaces two errors in `src/services/plugin-host-services.ts:1510`
(`createInteraction` + implicit-any). Verified by `git stash` + re-run
on clean `upstream/master` — they reproduce without this PR's changes.
Pre-existing, out of scope.

## Risks

- **Low behavioral risk.** Strictly additive: externals that do NOT
provide `sessionManagement` continue to receive exactly the same value
as before (registry lookup → `undefined` for pure externals, or the
builtin's entry for externals overriding a built-in type). Only a new
capability is unlocked; no existing behavior changes for existing
adapters.
- **No breaking change.** `ServerAdapterModule.sessionManagement` was
already optional at the type level. Externals that never set it see no
difference on either path.
- **Consistency verified.** Init-time IIFE now matches the post-`POST
/api/adapters/install` behavior — a server restart no longer regresses
the field.

## Note

This is part of a broader effort to close the parity gap between
external and built-in adapters. Once externals reach 1:1 capability
coverage with internals, new-adapter contributions can increasingly be
steered toward the external-plugin path instead of the core product — a
trajectory CONTRIBUTING.md already encourages ("*If the idea fits as an
extension, prefer building it with the plugin system*").

## Model Used

- **Provider**: Anthropic
- **Model**: Claude Opus 4.7
- **Exact model ID**: `claude-opus-4-7` (1M-context variant:
`claude-opus-4-7[1m]`)
- **Context window**: 1,000,000 tokens
- **Harness**: Claude Code (Anthropic's official CLI), orchestrated by
@superbiche as human-in-the-loop. Full file-editing, shell, and `gh`
tool use, plus parallel research subagents for fact-finding against
paperclip internals (plugin-loader contract, sessionCodec reachability,
UI parser surface, Cline CLI JSON schema).
- **Independent local review**: Gemini 3.1 Pro (Google) performed a
separate verification pass on the committed branch — confirmed the
approach & necessity, ran the full workspace build, and executed the
complete server test suite (1,156 tests, all passing). Not used for
authoring; second-opinion pass only.
- **Authoring split**: @superbiche identified the gap (while mapping the
external-adapter surface for a downstream adapter build) and shaped the
plan — categorising the surface into `works / acceptable /
needs-upstream` buckets, directing the surgical-diff approach on a fresh
branch from `upstream/master`, and calling the framing ("alignment bug
between init-time IIFE and hot-install path" rather than "missing
capability"). Opus 4.7 executed the fact-finding, the diff, the tests,
and drafted this PR body — all under direct review.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work (convention-aligned bug fix on the external-adapter
plugin path introduced by #2218)
- [x] I have run tests locally and they pass (15/15 in the touched file;
1,156/1,156 full server suite via the independent Gemini 3.1 Pro review)
- [x] I have added tests where applicable (3 new for the extracted
helper)
- [x] If this change affects the UI, I have included before/after
screenshots (no UI touched)
- [x] I have updated relevant documentation to reflect my changes
(in-file comment reflects new semantics)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 07:39:43 -05:00
Devin Foley 13551b2bac Add local environment lifecycle (#4297)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Every heartbeat run needs a concrete place where the agent's adapter
process executes.
> - Today that execution location is implicitly the local machine, which
makes it hard to track, audit, and manage as a first-class runtime
concern.
> - The first step is to represent the current local execution path
explicitly without changing how users experience agent runs.
> - This pull request adds core Environment and Environment Lease
records, then routes existing local heartbeat execution through a
default `Local` environment.
> - The benefit is that local runs remain behavior-preserving while the
system now has durable environment identity, lease lifecycle tracking,
and activity records for execution placement.

## What Changed

- Added `environments` and `environment_leases` database tables, schema
exports, and migration `0065_environments.sql`.
- Added shared environment constants, TypeScript types, and validators
for environment drivers, statuses, lease policies, lease statuses, and
cleanup states.
- Added `environmentService` for listing, reading, creating, updating,
and ensuring company-scoped environments.
- Added environment lease lifecycle operations for acquire, metadata
update, single-lease release, and run-wide release.
- Updated heartbeat execution to lazily ensure a company-scoped default
`Local` environment before adapter execution.
- Updated heartbeat execution to acquire an ephemeral local environment
lease, write `paperclipEnvironment` into the run context snapshot, and
release active leases during run finalization.
- Added activity log events for environment lease acquisition and
release.
- Added tests for environment service behavior and the local heartbeat
environment lifecycle.
- Added a CI-follow-up heartbeat guard so deferred issue comment wakes
are promoted before automatic missing-comment retries, with focused
batching test coverage.

## Verification

Local verification run for this branch:

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm exec vitest run server/src/__tests__/environment-service.test.ts
server/src/__tests__/heartbeat-local-environment.test.ts --pool=forks`

Additional reviewer/CI verification:

- Confirm `pnpm-lock.yaml` is not modified.
- Confirm `pnpm test:run` passes in CI.
- Confirm `PAPERCLIP_E2E_SKIP_LLM=true pnpm run test:e2e` passes in CI.
- Confirm a local heartbeat run creates one active `Local` environment
when needed, records one lease for the run, releases the lease when the
run finishes, and includes `paperclipEnvironment` in the run context
snapshot.

Screenshots: not applicable; this PR has no UI changes.

## Risks

- Migration risk: introduces two new tables and a new migration journal
entry. Review should verify company scoping, indexes, foreign keys, and
enum defaults are correct.
- Lifecycle risk: heartbeat finalization now releases environment leases
in addition to existing runtime cleanup. A finalization bug could leave
stale active leases or mark a failed run's lease incorrectly.
- Behavior-preservation risk: local adapter execution should remain
unchanged apart from environment bookkeeping. Review should pay
attention to the heartbeat path around context snapshot updates and
final cleanup ordering.
- Activity volume risk: each heartbeat run now logs lease acquisition
and release events, increasing activity log volume by two records per
run.

## Model Used

OpenAI GPT-5.4 via Codex CLI. Capabilities used: repository inspection,
TypeScript implementation review, local test/build execution, and
PR-description drafting.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 20:07:41 -07:00
Dotta b69b563aa8 [codex] Fix stale issue execution run locks (#4258)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue
checkout and execution ownership are core safety contracts.
> - The affected subsystem is the issue service and route layer that
gates agent writes by `checkoutRunId` and `executionRunId`.
> - PAP-1982 exposed a stale-lock failure mode where a terminal
heartbeat run could leave `executionRunId` pinned after checkout
ownership had moved or been cleared.
> - That stale execution lock could reject legitimate
PATCH/comment/release requests from the rightful assignee after a
harness restart.
> - This pull request centralizes terminal-run cleanup, applies it
before ownership-gated writes, and adds a board-only recovery endpoint
for operator intervention.
> - The benefit is that crashed or terminal runs no longer strand issues
behind stale execution locks, while live execution locks still block
conflicting writes.

## What Changed

- Added `issueService.clearExecutionRunIfTerminal()` to atomically lock
the issue/run rows and clear terminal or missing execution-run locks.
- Reused stale execution-lock cleanup from checkout,
`assertCheckoutOwner()`, and `release()`.
- Allowed the same assigned agent/current run to adopt an unowned
`in_progress` checkout after stale execution-lock cleanup.
- Updated release to clear `executionRunId`, `executionAgentNameKey`,
and `executionLockedAt`.
- Added board-only `POST /api/issues/:id/admin/force-release` with
company access checks, optional `clearAssignee=true`, and
`issue.admin_force_release` audit logging.
- Added embedded Postgres service tests and route integration tests for
stale-lock recovery, release behavior, and admin force-release
authorization/audit behavior.
- Documented the new force-release API in `doc/SPEC-implementation.md`.

## Verification

- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-stale-execution-lock-routes.test.ts` passed.
- `pnpm vitest run
server/src/__tests__/issue-stale-execution-lock-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-telemetry-routes.test.ts` passed.
- `pnpm -r typecheck` passed.
- `pnpm build` passed.
- `git diff --check` passed.
- `pnpm lint` could not run because this repo has no `lint` command.
- Full `pnpm test:run` completed with 4 failures in existing route
suites: `approval-routes-idempotency.test.ts` (2),
`issue-comment-reopen-routes.test.ts` (1), and
`issue-telemetry-routes.test.ts` (1). Those same files pass when run
isolated and when run together with the new stale-lock route test, so
this appears to be a whole-suite ordering/mock-isolation issue outside
this patch path.

## Risks

- Medium: this changes ownership-gated write behavior. The new adoption
path is limited to the current run, the current assignee, `in_progress`
issues, and rows with no checkout owner after terminal-lock cleanup.
- Low: the admin force-release endpoint is board-only and
company-scoped, but misuse can intentionally clear a live lock. It
writes an audit event with prior lock IDs.
- No schema or migration changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent (`gpt-5`), agentic coding with
terminal/tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 10:43:38 -05:00
Dotta a957394420 [codex] Add structured issue-thread interactions (#4244)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators supervise that work through issues, comments, approvals,
and the board UI.
> - Some agent proposals need structured board/user decisions, not
hidden markdown conventions or heavyweight governed approvals.
> - Issue-thread interactions already provide a natural thread-native
surface for proposed tasks and questions.
> - This pull request extends that surface with request confirmations,
richer interaction cards, and agent/plugin/MCP helpers.
> - The benefit is that plan approvals and yes/no decisions become
explicit, auditable, and resumable without losing the single-issue
workflow.

## What Changed

- Added persisted issue-thread interactions for suggested tasks,
structured questions, and request confirmations.
- Added board UI cards for interaction review, selection, question
answers, and accept/reject confirmation flows.
- Added MCP and plugin SDK helpers for creating interaction cards from
agents/plugins.
- Updated agent wake instructions, onboarding assets, Paperclip skill
docs, and public docs to prefer structured confirmations for
issue-scoped decisions.
- Rebased the branch onto `public-gh/master` and renumbered branch
migrations to `0063` and `0064`; the idempotency migration uses `ADD
COLUMN IF NOT EXISTS` for old branch users.

## Verification

- `git diff --check public-gh/master..HEAD`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/mcp-server/src/tools.test.ts
packages/shared/src/issue-thread-interactions.test.ts
ui/src/lib/issue-thread-interactions.test.ts
ui/src/lib/issue-chat-messages.test.ts
ui/src/components/IssueThreadInteractionCard.test.tsx
ui/src/components/IssueChatThread.test.tsx
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/services/issue-thread-interactions.test.ts` -> 9 files / 79
tests passed
- `pnpm -r typecheck` -> passed, including `packages/db` migration
numbering check

## Risks

- Medium: this adds a new issue-thread interaction model across
db/shared/server/ui/plugin surfaces.
- Migration risk is reduced by placing this branch after current master
migrations (`0063`, `0064`) and making the idempotency column add
idempotent for users who applied the old branch numbering.
- UI interaction behavior is covered by component tests, but this PR
does not include browser screenshots.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-class coding agent runtime. Exact model ID and
context window are not exposed in this Paperclip run; tool use and local
shell/code execution were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 20:15:11 -05:00
Dotta 014aa0eb2d [codex] Clear stale queued comment targets (#4234)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators interact with agent work through issue threads and queued
comments.
> - When the selected comment target becomes stale, the composer can
keep pointing at an invalid target after thread state changes.
> - That makes follow-up comments easier to misroute and harder to
reason about.
> - This pull request clears stale queued comment targets and covers the
behavior with tests.
> - The benefit is more predictable issue-thread commenting during live
agent work.

## What Changed

- Clears queued comment targets when they no longer match the current
issue thread state.
- Adjusts issue detail comment-target handling to avoid stale target
reuse.
- Adds regression tests for optimistic issue comment target behavior.

## Verification

- `pnpm exec vitest run ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low risk; scoped to comment-target state handling in the issue UI.
- No migrations.

> Checked `ROADMAP.md`; this is a focused UI reliability fix, not a new
roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 16:50:26 -05:00
Dotta bcbbb41a4b [codex] Harden heartbeat runtime cleanup (#4233)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime is the control-plane path that turns issue
assignments into agent runs and recovers after process exits.
> - Several edge cases could leave high-volume reads unbounded, stale
runtime services visible, blocked dependency wakes too eager, or
terminal adapter processes still around after output finished.
> - These problems make operator views noisy and make long-running agent
work less predictable.
> - This pull request tightens the runtime/read paths and adds focused
regression coverage.
> - The benefit is safer heartbeat execution and cleaner runtime state
without changing the public task model.

## What Changed

- Bounded high-volume issue/log reads in runtime code paths.
- Hardened heartbeat handling for blocked dependency wakes and terminal
run cleanup.
- Added adapter process cleanup coverage for terminal output cases.
- Added workspace runtime control tests for stale command matching and
stopped services.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
ui/src/components/WorkspaceRuntimeControls.test.tsx`

## Risks

- Medium risk because heartbeat cleanup and runtime filtering affect
active agent execution paths.
- No migrations.

> Checked `ROADMAP.md`; this is runtime hardening and bug-fix work, not
a new roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 16:48:47 -05:00
Dotta 73ef40e7be [codex] Sandbox dynamic adapter UI parsers (#4225)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies.
> - External adapters can provide UI parser code that the board loads
dynamically for run transcript rendering.
> - Running adapter-provided parser code directly in the board page
gives that parser access to same-origin browser state.
> - This PR narrows that surface by evaluating dynamically loaded
external adapter UI parser code in a dedicated browser Web Worker with a
constrained postMessage protocol.
> - The worker here is a frontend isolation boundary for adapter UI
parser JavaScript; it is not Paperclip's server plugin-worker system and
it is not a server-side job runner.

## What Changed

- Runs dynamically loaded external adapter UI parsers inside a dedicated
Web Worker instead of importing/evaluating them directly in the board
page.
- Adds a narrow postMessage protocol for parser initialization and line
parsing.
- Caches completed async parse results and notifies the adapter registry
so transcript recomputation can synchronously drain the final parsed
line.
- Disables common worker network, persistence, child worker, Blob/object
URL, and WebRTC escape APIs inside the parser worker bootstrap.
- Handles worker error messages after initialization and drains pending
callbacks on worker termination or mid-session worker error.
- Adds focused regression coverage for the parser worker lockdown and
unused protocol removal.

## Verification

- `pnpm exec vitest run --config ui/vitest.config.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm exec tsc --noEmit --target es2021 --moduleResolution bundler
--module esnext --jsx react-jsx --lib dom,es2021 --skipLibCheck
ui/src/adapters/dynamic-loader.ts
ui/src/adapters/sandboxed-parser-worker.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm --filter @paperclipai/ui typecheck` was attempted; it reached
existing unrelated failures in HeartbeatRun test/storybook fixtures and
missing Storybook type resolution, with no adapter-module errors
surfaced.
- PR #4225 checks on current head `34c9da00`: `policy`, `e2e`, `verify`,
`security/snyk`, and `Greptile Review` are all `SUCCESS`.
- Greptile Review on current head `34c9da00` reached 5/5.

## Risks

- Medium risk: parser execution is now asynchronous through a worker
while the existing parser interface is synchronous, so transcript
updates should be watched with external adapters.
- Some adapter parser bundles may rely on direct ESM `export` syntax or
browser APIs that are no longer available inside the worker lockdown.
- The worker lockdown is a hardening layer around external parser code,
not a complete browser security sandbox for arbitrary untrusted
applications.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 13:42:44 -05:00
Dotta a26e1288b6 [codex] Polish issue board workflows (#4224)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Human operators supervise that work through issue lists, issue
detail, comments, inbox groups, markdown references, and
profile/activity surfaces
> - The branch had many small UI fixes that improve the operator loop
but do not need to ship with backend runtime migrations
> - These changes belong together as board workflow polish because they
affect scanning, navigation, issue context, comment state, and markdown
clarity
> - This pull request groups the UI-only slice so it can merge
independently from runtime/backend changes
> - The benefit is a clearer board experience with better issue context,
steadier optimistic updates, and more predictable keyboard navigation

## What Changed

- Improves issue properties, sub-issue actions, blocker chips, and issue
list/detail refresh behavior.
- Adds blocker context above the issue composer and stabilizes
queued/interrupted comment UI state.
- Improves markdown issue/GitHub link rendering and opens external
markdown links in a new tab.
- Adds inbox group keyboard navigation and fold/unfold support.
- Polishes activity/avatar/profile/settings/workspace presentation
details.

## Verification

- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownBody.test.tsx ui/src/lib/inbox.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low to medium risk: changes are UI-focused but cover high-traffic
issue and inbox surfaces.
- This branch intentionally does not include the backend runtime changes
from the companion PR; where UI calls newer API filters, unsupported
servers should continue to fail visibly through existing API error
handling.
- Visual screenshots were not captured in this heartbeat; targeted
component/helper tests cover the changed behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:25:34 -05:00
Dotta 09d0678840 [codex] Harden heartbeat scheduling and runtime controls (#4223)
## Thinking Path

> - Paperclip orchestrates AI agents through issue checkout, heartbeat
runs, routines, and auditable control-plane state
> - The runtime path has to recover from lost local processes, transient
adapter failures, blocked dependencies, and routine coalescing without
stranding work
> - The existing branch carried several reliability fixes across
heartbeat scheduling, issue runtime controls, routine dispatch, and
operator-facing run state
> - These changes belong together because they share backend contracts,
migrations, and runtime status semantics
> - This pull request groups the control-plane/runtime slice so it can
merge independently from board UI polish and adapter sandbox work
> - The benefit is safer heartbeat recovery, clearer runtime controls,
and more predictable recurring execution behavior

## What Changed

- Adds bounded heartbeat retry scheduling, scheduled retry state, and
Codex transient failure recovery handling.
- Tightens heartbeat process recovery, blocker wake behavior, issue
comment wake handling, routine dispatch coalescing, and
activity/dashboard bounds.
- Adds runtime-control MCP tools and Paperclip skill docs for issue
workspace runtime management.
- Adds migrations `0061_lively_thor_girl.sql` and
`0062_routine_run_dispatch_fingerprint.sql`.
- Surfaces retry state in run ledger/agent UI and keeps related shared
types synchronized.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/routines-service.test.ts`
- `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server`

## Risks

- Medium risk: this touches heartbeat recovery and routine dispatch,
which are central execution paths.
- Migration order matters if split branches land out of order: merge
this PR before branches that assume the new runtime/routine fields.
- Runtime retry behavior should be watched in CI and in local operator
smoke tests because it changes how transient failures are resumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:24:11 -05:00
Dotta ab9051b595 Add first-class issue references (#4214)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators and agents coordinate through company-scoped issues,
comments, documents, and task relationships.
> - Issue text can mention other tickets, but those references were
previously plain markdown/text without durable relationship data.
> - That made it harder to understand related work, surface backlinks,
and keep cross-ticket context visible in the board.
> - This pull request adds first-class issue reference extraction,
storage, API responses, and UI surfaces.
> - The benefit is that issue references become queryable, navigable,
and visible without relying on ad hoc text scanning.

## What Changed

- Added shared issue-reference parsing utilities and exported
reference-related types/constants.
- Added an `issue_reference_mentions` table, idempotent migration DDL,
schema exports, and database documentation.
- Added server-side issue reference services, route integration,
activity summaries, and a backfill command for existing issue content.
- Added UI reference pills, related-work panels, markdown/editor mention
handling, and issue detail/property rendering updates.
- Added focused shared, server, and UI tests for parsing, persistence,
display, and related-work behavior.
- Rebased `PAP-735-first-class-task-references` cleanly onto
`public-gh/master`; no `pnpm-lock.yaml` changes are included.

## Verification

- `pnpm -r typecheck`
- `pnpm test:run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-references-service.test.ts
ui/src/components/IssueRelatedWorkPanel.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/MarkdownBody.test.tsx`

## Risks

- Medium risk because this adds a new issue-reference persistence path
that touches shared parsing, database schema, server routes, and UI
rendering.
- Migration risk is mitigated by `CREATE TABLE IF NOT EXISTS`, guarded
foreign-key creation, and `CREATE INDEX IF NOT EXISTS` statements so
users who have applied an older local version of the numbered migration
can re-run safely.
- UI risk is limited by focused component coverage, but reviewers should
still manually inspect issue detail pages containing ticket references
before merge.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-using shell workflow with
repository inspection, git rebase/push, typecheck, and focused Vitest
verification.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 10:02:52 -05:00
Dotta 1954eb3048 [codex] Detect issue graph liveness deadlocks (#4209)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat harness is responsible for waking agents, reconciling
issue state, and keeping execution moving.
> - Some dependency graphs can become live-locks when a blocked issue
depends on an unassigned, cancelled, or otherwise uninvokable issue.
> - Review and approval stages can also stall when the recorded
participant can no longer be resolved.
> - This pull request adds issue graph liveness classification plus
heartbeat reconciliation that creates durable escalation work for those
cases.
> - The benefit is that harness-level deadlocks become visible,
assigned, logged, and recoverable instead of silently leaving task
sequences blocked.

## What Changed

- Added an issue graph liveness classifier for blocked dependency and
invalid review participant states.
- Added heartbeat reconciliation that creates one stable escalation
issue per liveness incident, links it as a blocker, comments on the
affected issue, wakes the recommended owner, and logs activity.
- Wired startup and periodic server reconciliation for issue graph
liveness incidents.
- Added focused tests for classifier behavior, heartbeat escalation
creation/deduplication, and queued dependency wake promotion.
- Fixed queued issue wakes so a coalesced wake re-runs queue selection,
allowing dependency-unblocked work to start immediately.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issue-liveness.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts`
- Passed locally: `server/src/__tests__/issue-liveness.test.ts` (5
tests)
- Skipped locally: embedded Postgres suites because optional package
`@embedded-postgres/darwin-x64` is not installed on this host
- `pnpm --filter @paperclipai/server typecheck`
- `git diff --check`
- Greptile review loop: ran 3 times as requested; the final
Greptile-reviewed head `0a864eab` had 0 comments and all Greptile
threads were resolved. Later commits are CI/test-stability fixes after
the requested max Greptile pass count.
- GitHub PR checks on head `87493ed4`: `policy`, `verify`, `e2e`, and
`security/snyk (cryppadotta)` all passed.

## Risks

- Moderate operational risk: the reconciler creates escalation issues
automatically, so incorrect classification could create noise. Stable
incident keys and deduplication limit repeated escalation.
- Low schema risk: this uses existing issue, relation, comment, wake,
and activity log tables with no migration.
- No UI screenshots included because this change is server-side harness
behavior only.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent. Exact runtime model ID and
context window were not exposed in this session. Used tool execution for
git, tests, typecheck, Greptile review handling, and GitHub CLI
operations.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 09:11:12 -05:00
Robin van Duiven 8d0c3d2fe6 fix(hermes): inject agent JWT into Hermes adapter env to fix identity attribution (#3608)
## Thinking Path

> - Paperclip orchestrates AI agents and records their actions through
auditable issue comments and API writes.
> - The local adapter registry is responsible for adapting each agent
runtime to Paperclip's server-side execution context.
> - The Hermes local adapter delegated directly to
`hermes-paperclip-adapter`, whose current execution context type
predates the server `authToken` field.
> - Without explicitly passing the run-scoped agent token and run id
into Hermes, Hermes could inherit a server or board-user
`PAPERCLIP_API_KEY` and lack a usable `PAPERCLIP_RUN_ID` for mutating
API calls.
> - That made Paperclip writes from Hermes agents risk appearing under
the wrong identity or without the correct run-scoped attribution.
> - This pull request wraps the Hermes execution call so Hermes receives
the agent run JWT as `PAPERCLIP_API_KEY` and the current execution id as
`PAPERCLIP_RUN_ID` while preserving explicit adapter configuration where
appropriate.
> - Follow-up review fixes preserve Hermes' built-in prompt when no
custom prompt template exists and document the intentional type cast.
> - The benefit is reliable agent attribution for the covered local
Hermes path without clobbering Hermes' default heartbeat/task
instructions.

## What Changed

- Wrapped `hermesLocalAdapter.execute` so `ctx.authToken` is injected
into `adapterConfig.env.PAPERCLIP_API_KEY` when no explicit Paperclip
API key is already configured.
- Injected `ctx.runId` into `adapterConfig.env.PAPERCLIP_RUN_ID` so the
auth guard's `X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID` instruction
resolves to the current run id.
- Added a Paperclip API auth guard to existing custom Hermes
`promptTemplate` values without creating a replacement prompt when no
custom template exists.
- Documented the intentional `as unknown as` cast needed until
`hermes-paperclip-adapter` ships an `AdapterExecutionContext` type that
includes `authToken`.
- Added registry tests for JWT injection, run-id injection, explicit key
preservation, default prompt preservation, and the no-`authToken`
early-return path.

## Verification

- [x] `pnpm --filter "./server" exec vitest run adapter-registry` - 8
tests passed.
- [x] `pnpm --filter "./server" typecheck` - passed.
- [x] Trigger a Hermes agent heartbeat and verify Paperclip writes
appear under the agent identity rather than a shared board-user
identity, with the correct run id on mutating requests.

## Risks

- Low migration risk: this changes only the Hermes local adapter wrapper
and tests.
- Existing explicit `adapterConfig.env.PAPERCLIP_API_KEY` values are
preserved to avoid breaking intentionally configured agents.
- `PAPERCLIP_RUN_ID` is set from `ctx.runId` for each execution so
mutating API calls use the current run id instead of a stale or literal
placeholder value.
- Prompt behavior is intentionally conservative: the auth guard is only
prepended when a custom prompt template already exists, so Hermes'
built-in default prompt remains intact for unconfigured agents.
- Remaining operational risk: the identity and run-id behavior should
still be verified with a live Hermes heartbeat before relying on it in
production.

## Model Used

- OpenAI Codex, GPT-5 family coding agent, tool use enabled for local
shell, GitHub CLI, and test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: backend-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no product docs changed; PR description updated)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Dotta <bippadotta@protonmail.com>
2026-04-21 07:18:11 -05:00
Dotta 1266954a4e [codex] Make heartbeat scheduling blocker-aware (#4157)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-driven heartbeats,
checkouts, and wake scheduling.
> - This change sits in the server heartbeat and issue services that
decide which queued runs are allowed to start.
> - Before this branch, queued heartbeats could be selected even when
their issue still had unresolved blocker relationships.
> - That let blocked descendant work compete with actually-ready work
and risked auto-checking out issues that were not dependency-ready.
> - This pull request teaches the scheduler and checkout path to consult
issue dependency readiness before claiming queued runs.
> - It also exposes dependency readiness in the agent inbox so agents
can see which assigned issues are still blocked.
> - The result is that heartbeat execution follows the DAG of blocked
dependencies instead of waking work out of order.

## What Changed

- Added `IssueDependencyReadiness` helpers to `issueService`, including
unresolved blocker lookup for single issues and bulk issue lists.
- Prevented issue checkout and `in_progress` transitions when unresolved
blockers still exist.
- Made heartbeat queued-run claiming and prioritization dependency-aware
so ready work starts before blocked descendants.
- Included dependency readiness fields in `/api/agents/me/inbox-lite`
for agent heartbeat selection.
- Added regression coverage for dependency-aware heartbeat promotion and
issue-service participation filtering.

## Verification

- `pnpm run preflight:workspace-links`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issues-service.test.ts`
- On this host, the Vitest command passed, but the embedded-Postgres
portions of those files were skipped because
`@embedded-postgres/darwin-x64` is not installed.

## Risks

- Scheduler ordering now prefers dependency-ready runs, so any hidden
assumptions about strict FIFO ordering could surface in edge cases.
- The new guardrails reject checkout or `in_progress` transitions for
blocked issues; callers depending on the old permissive behavior would
now get `422` errors.
- Local verification did not execute the embedded-Postgres integration
paths on this macOS host because the platform binary package was
missing.

> I checked `ROADMAP.md`; this is a targeted execution/scheduling fix
and does not duplicate planned roadmap feature work.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter in this
workspace. Exact backend model ID is not surfaced in the runtime here;
tool-enabled coding agent with terminal execution and repository editing
capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 16:03:57 -05:00
Hiuri Noronha 1bf2424377 fix: honor Hermes local command override (#3503)
## Summary

This fixes the Hermes local adapter so that a configured command
override is respected during both environment tests and execution.

## Problem

The Hermes adapter expects `adapterConfig.hermesCommand`, but the
generic local command path in the UI was storing
`adapterConfig.command`.

As a result, changing the command in the UI did not reliably affect
runtime behavior. In real use, the adapter could still fall back to the
default `hermes` binary.

This showed up clearly in setups where Hermes is launched through a
wrapper command rather than installed directly on the host.

## What changed

- switched the Hermes local UI adapter to the Hermes-specific config
builder
- updated the configuration form to read and write `hermesCommand` for
`hermes_local`
- preserved the override correctly in the test-environment path
- added server-side normalization from legacy `command` to
`hermesCommand`

## Compatibility

The server-side normalization keeps older saved agent configs working,
including configs that still store the value under `command`.

## Validation

Validated against a Docker-based Hermes workflow using a local wrapper
exposed through a symlinked command:

- `Command = hermes-docker`
- environment test respects the override
- runs no longer fall back to `hermes`

Typecheck also passed for both UI and server.

Co-authored-by: NoronhaH <NoronhaH@users.noreply.github.com>
2026-04-20 15:55:08 -05:00
LeonSGP 51f127f47b fix(hermes): stop advertising unsupported instructions bundles (#3908)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Local adapter capability flags decide which configuration surfaces
the UI and server expose for each adapter.
> - `hermes_local` currently advertises managed instructions bundle
support, so Paperclip exposes the AGENTS.md bundle flow for Hermes
agents.
> - The bundled `hermes-paperclip-adapter` only consumes
`promptTemplate` at runtime and does not read `instructionsFilePath`, so
that advertised bundle path silently does nothing.
> - Issue #3833 reports exactly that mismatch: users configure AGENTS.md
instructions, but Hermes only receives the built-in heartbeat prompt.
> - This pull request stops advertising managed instructions bundles for
`hermes_local` until the adapter actually consumes bundle files at
runtime.

## What Changed

- Changed the built-in `hermes_local` server adapter registration to
report `supportsInstructionsBundle: false`.
- Updated the UI's synchronous built-in capability fallback so Hermes no
longer shows the managed instructions bundle affordance on first render.
- Added regression coverage in
`server/src/__tests__/adapter-routes.test.ts` to assert that
`hermes_local` still reports skills + local JWT support, but not
instructions bundle support.

## Verification

- `git diff --check`
- `node --experimental-strip-types --input-type=module -e "import {
findActiveServerAdapter } from './server/src/adapters/index.ts'; const
adapter = findActiveServerAdapter('hermes_local');
console.log(JSON.stringify({ type: adapter?.type,
supportsInstructionsBundle: adapter?.supportsInstructionsBundle,
supportsLocalAgentJwt: adapter?.supportsLocalAgentJwt, supportsSkills:
Boolean(adapter?.listSkills || adapter?.syncSkills) }));"`
- Observed
`{"type":"hermes_local","supportsInstructionsBundle":false,"supportsLocalAgentJwt":true,"supportsSkills":true}`
- Added adapter-routes regression assertions for the Hermes capability
contract; CI should validate the full route path in a clean workspace.

## Risks

- Low risk: this only changes the advertised capability surface for
`hermes_local`.
- Behavior change: Hermes agents will no longer show the broken managed
instructions bundle UI until the underlying adapter actually supports
`instructionsFilePath`.
- Existing Hermes skill sync and local JWT behavior are unchanged.

## Model Used

- OpenAI Codex, GPT-5.4 class coding agent, medium reasoning,
terminal/git/gh tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 15:54:14 -05:00
github-actions[bot] b94f1a1565 chore(lockfile): refresh pnpm-lock.yaml (#4139)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-04-20 12:15:59 -05:00
Dotta 2de893f624 [codex] add comprehensive UI Storybook coverage (#4132)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The board UI is the main operator surface, so its component and
workflow coverage needs to stay reviewable as the product grows.
> - This branch adds Storybook as a dedicated UI reference surface for
core Paperclip screens and interaction patterns.
> - That work spans Storybook infrastructure, app-level provider wiring,
and a large fixture set that can render real control-plane states
without a live backend.
> - The branch also expands coverage across agents, budgets, issues,
chat, dialogs, navigation, projects, and data visualization so future UI
changes have a concrete visual baseline.
> - This pull request packages that Storybook work on top of the latest
`master`, excludes the lockfile from the final diff per repo policy, and
fixes one fixture contract drift caught during verification.
> - The benefit is a single reviewable PR that adds broad UI
documentation and regression-surfacing coverage without losing the
existing branch work.

## What Changed

- Added Storybook 10 wiring for the UI package, including root scripts,
UI package scripts, Storybook config, preview wrappers, Tailwind
entrypoints, and setup docs.
- Added a large fixture-backed data source for Storybook so complex
board states can render without a live server.
- Added story suites covering foundations, status language,
control-plane surfaces, overview, UX labs, agent management, budget and
finance, forms and editors, issue management, navigation and layout,
chat and comments, data visualization, dialogs and modals, and
projects/goals/workspaces.
- Adjusted several UI components for Storybook parity so dialogs, menus,
keyboard shortcuts, budget markers, markdown editing, and related
surfaces render correctly in isolation.
- Rebasing work for PR assembly: replayed the branch onto current
`master`, removed `pnpm-lock.yaml` from the final PR diff, and aligned
the dashboard fixture with the current `DashboardSummary.runActivity`
API contract.

## Verification

- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/ui build-storybook`
- Manual diff audit after rebase: verified the PR no longer includes
`pnpm-lock.yaml` and now cleanly targets current `master`.
- Before/after UI note: before this branch there was no dedicated
Storybook surface for these Paperclip views; after this branch the local
Storybook build includes the new overview and domain story suites in
`ui/storybook-static`.

## Risks

- Large static fixture files can drift from shared types as dashboard
and UI contracts evolve; this PR already needed one fixture correction
for `runActivity`.
- Storybook bundle output includes some large chunks, so future growth
may need chunking work if build performance becomes an issue.
- Several component tweaks were made for isolated rendering parity, so
reviewers should spot-check key board surfaces against the live app
behavior.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Paperclip harness; exact
serving model ID is not exposed in-runtime to the agent.
- Tool-assisted workflow with terminal execution, git operations, local
typecheck/build verification, and GitHub CLI PR creation.
- Context window/reasoning mode not surfaced by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 12:13:23 -05:00
Dotta 7a329fb8bb Harden API route authorization boundaries (#4122)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The REST API is the control-plane boundary for companies, agents,
plugins, adapters, costs, invites, and issue mutations.
> - Several routes still relied on broad board or company access checks
without consistently enforcing the narrower actor, company, and
active-checkout boundaries those operations require.
> - That can allow agents or non-admin users to mutate sensitive
resources outside the intended governance path.
> - This pull request hardens the route authorization layer and adds
regression coverage for the audited API surfaces.
> - The benefit is tighter multi-company isolation, safer plugin and
adapter administration, and stronger enforcement of active issue
ownership.

## What Changed

- Added route-level authorization checks for budgets, plugin
administration/scoped routes, adapter management, company import/export,
direct agent creation, invite test resolution, and issue mutation/write
surfaces.
- Enforced active checkout ownership for agent-authenticated issue
mutations, while preserving explicit management overrides for permitted
managers.
- Restricted sensitive adapter and plugin management operations to
instance-admin or properly scoped actors.
- Tightened company portability and invite probing routes so agents
cannot cross company boundaries.
- Updated access constants and the Company Access UI copy for the new
active-checkout management grant.
- Added focused regression tests covering cross-company denial, agent
self-mutation denial, admin-only operations, and active checkout
ownership.
- Rebased the branch onto `public-gh/master` and fixed validation
fallout from the rebase: heartbeat-context route ordering and a company
import/export e2e fixture that now opts out of direct-hire approval
before using direct agent creation.
- Updated onboarding and signoff e2e setup to create seed agents through
`/agent-hires` plus board approval, so they remain compatible with the
approval-gated new-agent default.
- Addressed Greptile feedback by removing a duplicate company export API
alias, avoiding N+1 reporting-chain lookups in active-checkout override
checks, allowing agent mutations on unassigned `in_progress` issues, and
blocking NAT64 invite-probe targets.

## Verification

- `pnpm exec vitest run
server/src/__tests__/issues-goal-context-routes.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/adapter-routes-authz.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/company-portability-routes.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/invite-test-resolution-route.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/agent-adapter-validation-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/invite-test-resolution-route.test.ts`
- `pnpm -r typecheck`
- `pnpm --filter server typecheck`
- `pnpm --filter ui typecheck`
- `pnpm build`
- `pnpm test:e2e -- tests/e2e/onboarding.spec.ts
tests/e2e/signoff-policy.spec.ts`
- `pnpm test:e2e -- tests/e2e/signoff-policy.spec.ts`
- `pnpm test:run` was also run. It failed under default full-suite
parallelism with two order-dependent failures in
`plugin-routes-authz.test.ts` and `routines-e2e.test.ts`; both files
passed when rerun directly together with `pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/routines-e2e.test.ts`.

## Risks

- Medium risk: this changes authorization behavior across multiple
sensitive API surfaces, so callers that depended on broad board/company
access may now receive `403` or `409` until they use the correct
governance path.
- Direct agent creation now respects the company-level board-approval
requirement; integrations that need pending hires should use
`/api/companies/:companyId/agent-hires`.
- Active in-progress issue mutations now require checkout ownership or
an explicit management override, which may reveal workflow assumptions
in older automation.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex, GPT-5 coding agent, tool-using workflow with local shell,
Git, GitHub CLI, and repository tests.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:56:48 -05:00
Dotta 549ef11c14 [codex] Respect manual workspace runtime controls (#4125)
## Thinking Path

> - Paperclip orchestrates AI agents inside execution and project
workspaces
> - Workspace runtime services can be controlled manually by operators
and reused by agent runs
> - Manual start/stop state was not preserved consistently across
workspace policies and routine launches
> - Routine launches also needed branch/workspace variables to default
from the selected workspace context
> - This pull request makes runtime policy state explicit, preserves
manual control, and auto-fills routine branch variables from workspace
data
> - The benefit is less surprising workspace service behavior and fewer
manual inputs when running workspace-scoped routines

## What Changed

- Added runtime-state handling for manual workspace control across
execution and project workspace validators, routes, and services.
- Updated heartbeat/runtime startup behavior so manually stopped
services are respected.
- Auto-filled routine workspace branch variables from available
workspace context.
- Added focused server and UI tests for workspace runtime and routine
variable behavior.
- Removed muted gray background styling from workspace pages and cards
for a cleaner workspace UI.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/routines-service.test.ts
server/src/__tests__/workspace-runtime.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx`
- Result: 55 tests passed, 21 skipped. The embedded Postgres routines
tests skipped on this host with the existing PGlite/Postgres init
warning; workspace-runtime and UI tests passed.

## Risks

- Medium risk: this touches runtime service start/stop policy and
heartbeat launch behavior.
- The focused tests cover manual runtime state, routine variables, and
workspace runtime reuse paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component/service verification
is sufficient here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:39:37 -05:00
Dotta c7c1ca0c78 [codex] Clean up terminal-result adapter process groups (#4129)
## Thinking Path

> - Paperclip runs local adapter processes for agents and streams their
output into heartbeat runs
> - Some adapters can emit a terminal result before all descendant
processes have exited
> - If those descendants keep running, a heartbeat can appear complete
while the process group remains alive
> - Claude local runs need a bounded cleanup path after terminal JSON
output is observed and the child exits
> - This pull request adds terminal-result cleanup support to adapter
process utilities and wires it into the Claude local adapter
> - The benefit is fewer stranded adapter process groups after
successful terminal results

## What Changed

- Added terminal-result cleanup options to `runChildProcess`.
- Tracked child exit plus terminal output before signaling lingering
process groups.
- Added Claude local adapter configuration for terminal result cleanup
grace time.
- Added process cleanup tests covering terminal-output cleanup and noisy
non-terminal runs.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts`
- Result: 9 tests passed.

## Risks

- Medium risk: this changes adapter child-process cleanup behavior.
- The cleanup only arms after terminal result detection and child exit,
and it is covered by process-group tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why it is not applicable
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:38:57 -05:00
Dotta 56b3120971 [codex] Improve mobile org chart navigation (#4127)
## Thinking Path

> - Paperclip models companies as teams of human and AI operators
> - The org chart is the primary visual map of that company structure
> - Mobile users need to pan and inspect the chart without awkward
gestures or layout jumps
> - The roadmap also needed to reflect that the multiple-human-users
work is complete
> - This pull request improves mobile org chart gestures and updates the
roadmap references
> - The benefit is a smoother company navigation experience and docs
that match shipped multi-user support

## What Changed

- Added one-finger mobile pan handling for the org chart.
- Expanded org chart test coverage for touch gesture behavior.
- Updated README, ROADMAP, and CLI README references to mark
multiple-human-users work as complete.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run ui/src/pages/OrgChart.test.tsx`
- Result: 4 tests passed.

## Risks

- Low-medium risk: org chart pointer/touch handling changed, but the
behavior is scoped to the org chart page and covered by targeted tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted interaction tests are sufficient
here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:35:33 -05:00
Dotta 4357a3f352 [codex] Harden dashboard run activity charts (#4126)
## Thinking Path

> - Paperclip gives operators a live view of agent work across
dashboards, transcripts, and run activity charts
> - Those views consume live run updates and aggregate run activity from
backend dashboard data
> - Missing or partial run data could make charts brittle, and live
transcript updates were heavier than needed
> - Operators need dashboard data to stay stable even when recent run
payloads are incomplete
> - This pull request hardens dashboard run aggregation, guards chart
rendering, and lightens live run update handling
> - The benefit is a more reliable dashboard during active agent
execution

## What Changed

- Added dashboard run activity types and backend aggregation coverage.
- Guarded activity chart rendering when run data is missing or partial.
- Reduced live transcript update churn in active agent and run chat
surfaces.
- Fixed issue chat avatar alignment in the thread renderer.
- Added focused dashboard, activity chart, and live transcript tests.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/dashboard-service.test.ts
ui/src/components/ActivityCharts.test.tsx
ui/src/components/transcript/useLiveRunTranscripts.test.tsx`
- Result: 8 tests passed, 1 skipped. The embedded Postgres dashboard
service test skipped on this host with the existing PGlite/Postgres init
warning; UI chart and transcript tests passed.

## Risks

- Medium-low risk: aggregation semantics changed, but the UI remains
guarded around incomplete data.
- The dashboard service test is host-skipped here, so CI should confirm
the embedded database path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component tests are sufficient
here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:34:21 -05:00
Dotta 0f4e4b4c10 [codex] Split reusable agent hiring templates (#4124)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring new agents depends on clear, reusable operating instructions
> - The create-agent skill had one large template reference that mixed
multiple roles together
> - That made it harder to reuse, review, and adapt role-specific
instructions during governed hires
> - This pull request splits the reusable agent instruction templates
into focused role files and polishes the agent instructions pane layout
> - The benefit is faster, clearer agent hiring without bloating the
main skill document

## What Changed

- Split coder, QA, and UX designer reusable instructions into dedicated
reference files.
- Kept the index reference concise and pointed it at the role-specific
files.
- Updated the create-agent skill to describe the separated template
structure.
- Polished the agent detail instructions/package file tree layout so the
longer template references remain readable.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm --filter @paperclipai/ui typecheck`
- UI screenshot rationale: no screenshots attached because the visible
change is limited to the Agent detail instructions file-tree layout
(`wrapLabels` plus the side-by-side breakpoint). There is no new user
flow or state transition to demonstrate; reviewers can verify visually
by opening an agent's Instructions tab and resizing across the
single-column and side-by-side breakpoints to confirm long file names
wrap instead of truncating or overflowing.

## Risks

- Low risk: this is documentation and UI layout only.
- Main risk is stale links in the skill references; the new files are
committed in the referenced paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component/type verification is
sufficient here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:33:19 -05:00
Aron Prins 73eb23734f docs: use structured agent mentions in paperclip skill (#4103)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents coordinate work through tasks and comments, and @-mentions
are part of the wakeup path for cross-agent handoffs and review requests
> - The current repo skill still instructs machine-authored comments to
use raw `@AgentName` text as the default mention format
> - But the current backend mention parsing is still unreliable for
multi-word display names, so agents following that guidance can silently
fail to wake the intended target
> - This pull request updates the Paperclip skill and API reference to
prefer structured `agent://` markdown mentions for machine-authored
comments
> - The benefit is a low-risk documentation workaround that steers
agents onto the mention format the server already resolves reliably
while broader runtime fixes are reviewed upstream

## What Changed

- Updated `skills/paperclip/SKILL.md` to stop recommending raw
`@AgentName` mentions for machine-authored comments
- Updated `skills/paperclip/references/api-reference.md` with a concrete
workflow: resolve the target via `GET
/api/companies/{companyId}/agents`, then emit `[@Display
Name](agent://<agent-id>)`
- Added explicit guidance that raw `@AgentName` text is fallback-only
and unreliable for names containing spaces
- Cross-referenced the current upstream mention-bug context so reviewers
can connect this docs workaround to the open parser/runtime fixes
  Related issue/PR refs: #448, #459, #558, #669, #722, #1412, #2249

## Verification

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm test:run` currently fails on upstream `master` in existing tests
unrelated to this docs-only change:
- `src/__tests__/worktree.test.ts` — `seeds authenticated users into
minimally cloned worktree instances` timed out after 20000ms
- `src/__tests__/onboard.test.ts` — `keeps tailnet quickstart on
loopback until tailscale is available` expected `127.0.0.1` but got
`100.125.202.3`
- Confirmed the git diff is limited to:
  - `skills/paperclip/SKILL.md`
  - `skills/paperclip/references/api-reference.md`

## Risks

- Low risk. This is a docs/skill-only change and does not alter runtime
behavior.
- It is a mitigation, not a full fix: it helps agent-authored comments
that follow the Paperclip skill, but it does not fix manually typed raw
mentions or other code paths that still emit plain `@Name` text.
- If upstream chooses a different long-term mention format, this
guidance may need to be revised once the runtime-side fix lands.

## Model Used

- OpenAI Codex desktop agent on a GPT-5-class model. Exact deployed
model ID and context window are not exposed by the local harness. Tool
use enabled, including shell execution, git, and GitHub CLI.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 07:38:04 -07:00
Dotta 9c6f551595 [codex] Add plugin orchestration host APIs (#4114)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system is the extension path for optional capabilities
that should not require core product changes for every integration.
> - Plugins need scoped host APIs for issue orchestration, documents,
wakeups, summaries, activity attribution, and isolated database state.
> - Without those host APIs, richer plugins either cannot coordinate
Paperclip work safely or need privileged core-side special cases.
> - This pull request adds the plugin orchestration host surface, scoped
route dispatch, a database namespace layer, and a smoke plugin that
exercises the contract.
> - The benefit is a broader plugin API that remains company-scoped,
auditable, and covered by tests.

## What Changed

- Added plugin orchestration host APIs for issue creation, document
access, wakeups, summaries, plugin-origin activity, and scoped API route
dispatch.
- Added plugin database namespace tables, schema exports, migration
checks, and idempotent replay coverage under migration
`0059_plugin_database_namespaces`.
- Added shared plugin route/API types and validators used by server and
SDK boundaries.
- Expanded plugin SDK types, protocol helpers, worker RPC host behavior,
and testing utilities for orchestration flows.
- Added the `plugin-orchestration-smoke-example` package to exercise
scoped routes, restricted database namespaces, issue orchestration,
documents, wakeups, summaries, and UI status surfaces.
- Kept the new orchestration smoke fixture out of the root pnpm
workspace importer so this PR preserves the repository policy of not
committing `pnpm-lock.yaml`.
- Updated plugin docs and database docs for the new orchestration and
database namespace surfaces.
- Rebased the branch onto `public-gh/master`, resolved conflicts, and
removed `pnpm-lock.yaml` from the final PR diff.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm --filter @paperclipai/db typecheck`
- `pnpm exec vitest run packages/db/src/client.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-database.test.ts
server/src/__tests__/plugin-orchestration-apis.test.ts
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/plugin-scoped-api-routes.test.ts
server/src/__tests__/plugin-sdk-orchestration-contract.test.ts`
- From `packages/plugins/examples/plugin-orchestration-smoke-example`:
`pnpm exec vitest run --config ./vitest.config.ts`
- `pnpm --dir
packages/plugins/examples/plugin-orchestration-smoke-example run
typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- PR CI on latest head `293fc67c`: `policy`, `verify`, `e2e`, and
`security/snyk` all passed.

## Risks

- Medium risk: this expands plugin host authority, so route auth,
company scoping, and plugin-origin activity attribution need careful
review.
- Medium risk: database namespace migration behavior must remain
idempotent for environments that may have seen earlier branch versions.
- Medium risk: the orchestration smoke fixture is intentionally excluded
from the root workspace importer to avoid a `pnpm-lock.yaml` PR diff;
direct fixture verification remains listed above.
- Low operational risk from the PR setup itself: the branch is rebased
onto current `master`, the migration is ordered after upstream
`0057`/`0058`, and `pnpm-lock.yaml` is not in the final diff.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

Roadmap checked: this work aligns with the completed Plugin system
milestone and extends the plugin surface rather than duplicating an
unrelated planned core feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in a tool-enabled CLI
environment. Exact hosted model build and context-window size are not
exposed by the runtime; reasoning/tool use were enabled for repository
inspection, editing, testing, git operations, and PR creation.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no core UI screen change; example plugin UI contract
is covered by tests)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 08:52:51 -05:00
Dotta 16b2b84d84 [codex] Improve agent runtime recovery and governance (#4086)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime, agent import path, and agent configuration
defaults determine whether work is dispatched safely and predictably.
> - Several accumulated fixes all touched agent execution recovery, wake
routing, import behavior, and runtime concurrency defaults.
> - Those changes need to land together so the heartbeat service and
agent creation defaults stay internally consistent.
> - This pull request groups the runtime/governance changes from the
split branch into one standalone branch.
> - The benefit is safer recovery for stranded runs, bounded high-volume
reads, imported-agent approval correctness, skill-template support, and
a clearer default concurrency policy.

## What Changed

- Fixed stranded continuation recovery so successful automatic retries
are requeued instead of incorrectly blocking the issue.
- Bounded high-volume issue/log reads across issue, heartbeat, agent,
project, and workspace paths.
- Fixed imported-agent approval and instruction-path permission
handling.
- Quarantined seeded worktree execution state during worktree
provisioning.
- Queued approval follow-up wakes and hardened SQL_ASCII heartbeat
output handling.
- Added reusable agent instruction templates for hiring flows.
- Set the default max concurrent agent runs to five and updated related
UI/tests/docs.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/company-portability.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-comment-wake-batching.test.ts
server/src/__tests__/heartbeat-list.test.ts
server/src/__tests__/issues-service.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
packages/adapter-utils/src/server-utils.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- Split integration check: merged this branch first, followed by the
other [PAP-1614](/PAP/issues/PAP-1614) branches, with no merge
conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: touches heartbeat recovery, queueing, and issue list
bounds in central runtime paths.
- Imported-agent and concurrency default behavior changes may affect
existing automation that assumes one-at-a-time default runs.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:19:48 -05:00
Dotta 057fee4836 [codex] Polish issue and operator workflow UI (#4090)
## Thinking Path

> - Paperclip operators spend much of their time in issues, inboxes,
selectors, and rich comment threads.
> - Small interaction problems in those surfaces slow down supervision
of AI-agent work.
> - The branch included related operator quality-of-life fixes for issue
layout, inbox actions, recent selectors, mobile inputs, and chat
rendering stability.
> - These changes are UI-focused and can land independently from
workspace navigation and access-profile work.
> - This pull request groups the operator QoL fixes into one standalone
branch.
> - The benefit is a more stable and efficient board workflow for issue
triage and task editing.

## What Changed

- Widened issue detail content and added a desktop inbox archive action.
- Fixed mobile text-field zoom by keeping touch input font sizes at
16px.
- Prioritized recent picker selections for assignees/projects in issue
and routine flows.
- Showed actionable approvals in the Mine inbox model.
- Fixed issue chat renderer state crashes and hardened tests.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/lib/inbox.test.ts ui/src/lib/recent-selections.test.ts`
- Split integration check: merged last after the other
[PAP-1614](/PAP/issues/PAP-1614) branches with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Low to medium risk: mostly UI state, layout, and selection-priority
behavior.
- Visual layout and mobile zoom behavior may need browser/device QA
beyond component tests.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:16:41 -05:00
Dotta fee514efcb [codex] Improve workspace navigation and runtime UI (#4089)
## Thinking Path

> - Paperclip agents do real work in project and execution workspaces.
> - Operators need workspace state to be visible, navigable, and
copyable without digging through raw run logs.
> - The branch included related workspace cards, navigation, runtime
controls, stale-service handling, and issue-property visibility.
> - These changes share the workspace UI and runtime-control surfaces
and can stand alone from unrelated access/profile work.
> - This pull request groups the workspace experience changes into one
standalone branch.
> - The benefit is a clearer workspace overview, better metadata copy
flows, and more accurate runtime service controls.

## What Changed

- Polished project workspace summary cards and made workspace metadata
copyable.
- Added a workspace navigation overview and extracted reusable project
workspace content.
- Squared and polished the execution workspace configuration page.
- Fixed stale workspace command matching and hid stopped stale services
in runtime controls.
- Showed live workspace service context in issue properties.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
ui/src/components/ProjectWorkspaceSummaryCard.test.tsx
ui/src/lib/project-workspaces-tab.test.ts
ui/src/components/Sidebar.test.tsx
ui/src/components/WorkspaceRuntimeControls.test.tsx
ui/src/components/IssueProperties.test.tsx`
- `pnpm exec vitest run packages/shared/src/workspace-commands.test.ts
--config /dev/null` because the root Vitest project config does not
currently include `packages/shared` tests.
- Split integration check: merged after runtime/governance,
dev-infra/backups, and access/profiles with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: touches workspace navigation, runtime controls, and issue
property rendering.
- Visual layout changes may need browser QA, especially around smaller
screens and dense workspace metadata.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:14:32 -05:00
Dotta d8b63a18e7 [codex] Add access cleanup and user profile page (#4088)
## Thinking Path

> - Paperclip is moving from a solo local operator model toward teams
supervising AI-agent companies.
> - Human access management and human-visible profile surfaces are part
of that multiple-user path.
> - The branch included related access cleanup, archived-member removal,
permission protection, and a user profile page.
> - These changes share company membership, user attribution, and
access-service behavior.
> - This pull request groups those human access/profile changes into one
standalone branch.
> - The benefit is safer member removal behavior and a first profile
surface for user work, activity, and cost attribution.

## What Changed

- Added archived company member removal support across shared contracts,
server routes/services, and UI.
- Protected company member removal with stricter permission checks and
tests.
- Added company user profile API, shared types, route wiring, client
API, route, and UI page.
- Simplified the user profile page visual design to a neutral
typography-led layout.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/access-service.test.ts
server/src/__tests__/user-profile-routes.test.ts
ui/src/pages/CompanyAccess.test.tsx --hookTimeout=30000`
- `pnpm exec vitest run server/src/__tests__/user-profile-routes.test.ts
--testTimeout=30000 --hookTimeout=30000` after an initial local
embedded-Postgres hook timeout in the combined run.
- Split integration check: merged after runtime/governance and
dev-infra/backups with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: changes member removal permissions and adds a new user
profile route with cross-table stats.
- The profile page is a new UI surface and may need visual follow-up in
browser QA.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:10:20 -05:00
Dotta e89d3f7e11 [codex] Add backup endpoint and dev runtime hardening (#4087)
## Thinking Path

> - Paperclip is a local-first control plane for AI-agent companies.
> - Operators need predictable local dev behavior, recoverable instance
data, and scripts that do not churn the running app.
> - Several accumulated changes improve backup streaming, dev-server
health, static UI caching/logging, diagnostic-file ignores, and instance
isolation.
> - These are operational improvements that can land independently from
product UI work.
> - This pull request groups the dev-infra and backup changes from the
split branch into one standalone branch.
> - The benefit is safer local operation, easier manual backups, less
noisy dev output, and less cross-instance auth leakage.

## What Changed

- Added a manual instance database backup endpoint and route tests.
- Streamed backup/restore handling to avoid materializing large payloads
at once.
- Reduced dev static UI log/cache churn and ignored Node diagnostic
report captures.
- Added guarded dev auto-restart health polling coverage.
- Preserved worktree config during provisioning and scoped auth cookies
by instance.
- Added a Discord daily digest helper script and environment
documentation.
- Hardened adapter-route and startup feedback export tests around the
changed infrastructure.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/db/src/backup-lib.test.ts
server/src/__tests__/instance-database-backups-routes.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts
server/src/__tests__/adapter-routes.test.ts
server/src/__tests__/dev-runner-paths.test.ts
server/src/__tests__/health-dev-server-token.test.ts
server/src/__tests__/http-log-policy.test.ts
server/src/__tests__/vite-html-renderer.test.ts
server/src/__tests__/workspace-runtime.test.ts
server/src/__tests__/better-auth.test.ts`
- Split integration check: merged after the runtime/governance branch
and before UI branches with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: touches server startup, backup streaming, auth cookie
naming, dev health checks, and worktree provisioning.
- Backup endpoint behavior depends on existing board/admin access
controls and database backup helpers.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:08:55 -05:00
Dotta 236d11d36f [codex] Add run liveness continuations (#4083)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat runs are the control-plane record of each agent execution
window.
> - Long-running local agents can exhaust context or stop while still
holding useful next-step state.
> - Operators need that stop reason, next action, and continuation path
to be durable and visible.
> - This pull request adds run liveness metadata, continuation
summaries, and UI surfaces for issue run ledgers.
> - The benefit is that interrupted or long-running work can resume with
clearer context instead of losing the agent's last useful handoff.

## What Changed

- Added heartbeat-run liveness fields, continuation attempt tracking,
and an idempotent `0058` migration.
- Added server services and tests for run liveness, continuation
summaries, stop metadata, and activity backfill.
- Wired local and HTTP adapters to surface continuation/liveness context
through shared adapter utilities.
- Added shared constants, validators, and heartbeat types for liveness
continuation state.
- Added issue-detail UI surfaces for continuation handoffs and the run
ledger, with component tests.
- Updated agent runtime docs, heartbeat protocol docs, prompt guidance,
onboarding assets, and skills instructions to explain continuation
behavior.
- Addressed Greptile feedback by scoping document evidence by run,
excluding system continuation-summary documents from liveness evidence,
importing shared liveness types, surfacing hidden ledger run counts,
documenting bounded retry behavior, and moving run-ledger liveness
backfill off the request path.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/run-liveness.test.ts
server/src/__tests__/activity-service.test.ts
server/src/__tests__/documents-service.test.ts
server/src/__tests__/issue-continuation-summary.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/components/IssueContinuationHandoff.test.tsx
ui/src/components/IssueDocumentsSection.test.tsx`
- `pnpm --filter @paperclipai/db build`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/run-continuations.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a
plan document update"`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity
service|treats a plan document update"`
- Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and
Snyk all passed.
- Confirmed `public-gh/master` is an ancestor of this branch after
fetching `public-gh master`.
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.
- Confirmed migration `0058_wealthy_starbolt.sql` is ordered after
`0057` and uses `IF NOT EXISTS` guards for repeat application.
- Greptile inline review threads are resolved.

## Risks

- Medium risk: this touches heartbeat execution, liveness recovery,
activity rendering, issue routes, shared contracts, docs, and UI.
- Migration risk is mitigated by additive columns/indexes and idempotent
guards.
- Run-ledger liveness backfill is now asynchronous, so the first ledger
response can briefly show historical missing liveness until the
background backfill completes.
- UI screenshot coverage is not included in this packaging pass;
validation is currently through focused component tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git,
GitHub connector, GitHub CLI, and Paperclip API access.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshot note: no before/after screenshots were captured in this PR
packaging pass; the UI changes are covered by focused component tests
listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:01:49 -05:00
Dotta b9a80dcf22 feat: implement multi-user access and invite flows (#3784)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - V1 needs to stay local-first while also supporting shared,
authenticated deployments.
> - Human operators need real identities, company membership, invite
flows, profile surfaces, and company-scoped access controls.
> - Agents and operators also need the existing issue, inbox, workspace,
approval, and plugin flows to keep working under those authenticated
boundaries.
> - This branch accumulated the multi-user implementation, follow-up QA
fixes, workspace/runtime refinements, invite UX improvements,
release-branch conflict resolution, and review hardening.
> - This pull request consolidates that branch onto the current `master`
branch as a single reviewable PR.
> - The benefit is a complete multi-user implementation path with tests
and docs carried forward without dropping existing branch work.

## What Changed

- Added authenticated human-user access surfaces: auth/session routes,
company user directory, profile settings, company access/member
management, join requests, and invite management.
- Added invite creation, invite landing, onboarding, logo/branding,
invite grants, deduped join requests, and authenticated multi-user E2E
coverage.
- Tightened company-scoped and instance-admin authorization across
board, plugin, adapter, access, issue, and workspace routes.
- Added profile-image URL validation hardening, avatar preservation on
name-only profile updates, and join-request uniqueness migration cleanup
for pending human requests.
- Added an atomic member role/status/grants update path so Company
Access saves no longer leave partially updated permissions.
- Improved issue chat, inbox, assignee identity rendering,
sidebar/account/company navigation, workspace routing, and execution
workspace reuse behavior for multi-user operation.
- Added and updated server/UI tests covering auth, invites, membership,
issue workspace inheritance, plugin authz, inbox/chat behavior, and
multi-user flows.
- Merged current `public-gh/master` into this branch, resolved all
conflicts, and verified no `pnpm-lock.yaml` change is included in this
PR diff.

## Verification

- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts
ui/src/components/IssueChatThread.test.tsx ui/src/pages/Inbox.test.tsx`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/workspace-runtime-service-authz.test.ts
server/src/__tests__/access-validators.test.ts`
- `pnpm exec vitest run
server/src/__tests__/authz-company-access.test.ts
server/src/__tests__/routines-routes.test.ts
server/src/__tests__/sidebar-preferences-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/openclaw-invite-prompt-route.test.ts
server/src/__tests__/agent-cross-tenant-authz-routes.test.ts
server/src/__tests__/routines-e2e.test.ts`
- `pnpm exec vitest run server/src/__tests__/auth-routes.test.ts
ui/src/pages/CompanyAccess.test.tsx`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server
typecheck`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm db:generate`
- `npx playwright test --config tests/e2e/playwright.config.ts --list`
- Confirmed branch has no uncommitted changes and is `0` commits behind
`public-gh/master` before PR creation.
- Confirmed no `pnpm-lock.yaml` change is staged or present in the PR
diff.

## Risks

- High review surface area: this PR contains the accumulated multi-user
branch plus follow-up fixes, so reviewers should focus especially on
company-boundary enforcement and authenticated-vs-local deployment
behavior.
- UI behavior changed across invites, inbox, issue chat, access
settings, and sidebar navigation; no browser screenshots are included in
this branch-consolidation PR.
- Plugin install, upgrade, and lifecycle/config mutations now require
instance-admin access, which is intentional but may change expectations
for non-admin board users.
- A join-request dedupe migration rejects duplicate pending human
requests before creating unique indexes; deployments with unusual
historical duplicates should review the migration behavior.
- Company member role/status/grant saves now use a new combined
endpoint; older separate endpoints remain for compatibility.
- Full production build was not run locally in this heartbeat; CI should
cover the full matrix.

## Model Used

- OpenAI Codex coding agent, GPT-5-based model, CLI/tool-use
environment. Exact deployed model identifier and context window were not
exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note on screenshots: this is a branch-consolidation PR for an
already-developed multi-user branch, and no browser screenshots were
captured during this heartbeat.

---------

Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 09:44:19 -05:00
Roman Barinov e93e418cbf fix: add ssh client and jq to production image (#3826)
## Thinking Path

> - Paperclip is the control plane that runs long-lived AI-agent work in
production.
> - The production container image is the runtime boundary for agent
tools and shell access.
> - In our deployment, Paperclip agents now need a native SSH client and
`jq` available inside the final runtime container.
> - Installing those tools only via ai-rig entrypoint hacks is brittle
and drifts from the image source of truth.
> - This pull request updates the production Docker image itself so the
required binaries are present whenever the image is built.
> - The change is intentionally scoped to the final production stage so
build/deps stages do not gain extra packages unnecessarily.
> - The benefit is a cleaner, reproducible runtime image with fewer
deploy-specific workarounds.

## What Changed

- Added `openssh-client` to the production Docker image stage.
- Added `jq` to the production Docker image stage.
- Kept the package install in the final `production` stage instead of
the shared base stage to minimize scope.

## Verification

- Reviewed the final Dockerfile diff to confirm the packages are
installed in the `production` stage only.
- Attempted local image build with:
  - `docker build --target production -t paperclip:ssh-jq-test .`
- Local build could not be completed in this environment because the
local Docker daemon was unavailable:
- `Cannot connect to the Docker daemon at
unix:///Users/roman/.docker/run/docker.sock. Is the docker daemon
running?`

## Risks

- Low risk: image footprint increases slightly because two Debian
packages are added.
- `openssh-client` expands runtime capability, so this is appropriate
only because the deployed Paperclip runtime explicitly needs SSH access.

## Model Used

- OpenAI Codex / `gpt-5.4`
- Tool-using agent workflow via Hermes
- Context from local repository inspection, git, and shell tooling

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [ ] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-16 17:11:55 -05:00
Dotta 407e76c1db [codex] Fix Docker gh installation (#3844)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and the
Docker image is the no-local-Node path for running that control plane.
> - The deploy workflow builds and pushes that image from the repository
`Dockerfile`.
> - The current image setup adds GitHub CLI through GitHub's external
apt repository and verifies a mutable keyring URL with a pinned SHA256.
> - GitHub rotated the CLI Linux package signing key, so that pinned
keyring checksum now fails before Buildx can publish the image.
> - Paperclip already has a repo-local precedent in
`docker/untrusted-review/Dockerfile`: install Debian trixie's packaged
`gh` directly from the base distribution.
> - This pull request removes the external GitHub CLI apt
keyring/repository path from the production image and installs `gh` with
the rest of the Debian packages.
> - The benefit is a simpler Docker build that no longer fails when
GitHub rotates the apt keyring file.

## What Changed

- Updated the main `Dockerfile` base stage to install `gh` from Debian
trixie's package repositories.
- Removed the mutable GitHub CLI apt keyring download, pinned checksum
verification, extra apt source, second `apt-get update`, and separate
`gh` install step.

## Verification

- `git diff --check`
- `./scripts/docker-build-test.sh` skipped because Docker is installed
but the daemon is not running on this machine.
- Confirmed `https://packages.debian.org/trixie/gh` returns HTTP 200,
matching the base image distribution package source.

## Risks

- Debian's `gh` package can lag the latest upstream GitHub CLI release.
This is acceptable for the current image contract, which requires `gh`
availability but does not document a latest-upstream version guarantee.
- A full image build still needs to run in CI because the local Docker
daemon is unavailable in this environment.

## Model Used

- OpenAI Codex, GPT-5-based coding agent. Exact backend model ID was not
exposed in this runtime; tool use and shell execution were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-16 17:10:42 -05:00
Devin Foley e458145583 docs: add public roadmap and update contribution policy for feature PRs (#3835)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - As the project grows, more contributors want to build features —
which is great
> - Without a public roadmap or clear contribution guidance,
contributors spend time on PRs that overlap with planned core work
> - This creates frustration on both sides when those PRs can't be
merged
> - This PR publishes a roadmap, updates the contribution guide with a
clear path for feature proposals, and reinforces the workflow in the PR
template
> - The benefit is that contributors know exactly how to propose
features and where to focus for the highest-impact contributions

## What Changed

- Added `ROADMAP.md` with expanded descriptions of all shipped and
planned milestones, plus guidance on coordinating feature contributions
- Added "Feature Contributions" section to `CONTRIBUTING.md` explaining
how to propose features (check roadmap → discuss in #dev → consider the
plugin system)
- Updated `.github/PULL_REQUEST_TEMPLATE.md` with a callout linking to
the roadmap and a new checklist item to check for overlap with planned
work, while preserving the newer required `Model Used` section from
`master`
- Added `Memory / Knowledge` to the README roadmap preview and linked
the preview to the full `ROADMAP.md`

## Verification

- Open `ROADMAP.md` on GitHub and confirm it renders correctly with all
milestone sections
- Read the new "Feature Contributions" section in `CONTRIBUTING.md` and
verify all links resolve
- Open a new PR and confirm the template shows the roadmap callout and
the new checklist item
- Verify README links to `ROADMAP.md` and the roadmap preview includes
"Memory / Knowledge"

## Risks

- Docs-only change — no runtime or behavioral impact
- Contribution policy changes were written to be constructive and to
offer clear alternative paths (plugins, coordination via #dev, reference
implementations as feedback)

## Model Used

- OpenAI Codex local agent (GPT-5-based coding model; exact runtime
model ID is not exposed in this environment)
- Tool use enabled for shell, git, GitHub CLI, and patch application
- Used to rebase the branch, resolve merge conflicts, update the PR
metadata, and verify the repo state

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable (N/A — docs only)
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — no UI changes)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-16 13:04:50 -07:00
Dewaldt Huysamen f701c3e78c feat(claude-local): add Opus 4.7 to adapter model dropdown (#3828)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each adapter advertises a model list that powers the agent config UI
dropdown
> - The `claude_local` adapter's dropdown is sourced from the hard-coded
`models` array in `packages/adapters/claude-local/src/index.ts`
> - Anthropic recently released Opus 4.7, the newest current-generation
Opus model
> - Without a list entry, users cannot discover or select Opus 4.7 from
the dropdown (they can still type it manually, since the field is
creatable, but discoverability is poor)
> - This pull request adds `claude-opus-4-7` to the `claude_local` model
list so new agents can be configured with the latest model by default
> - The benefit is out-of-the-box access to the newest Opus model,
consistent with how every other current-generation Claude model is
already listed

## What Changed

- Added `{ id: "claude-opus-4-7", label: "Claude Opus 4.7" }` as the
**first** entry of the `models` array in
`packages/adapters/claude-local/src/index.ts`. Newest-first ordering
matches the convention already used for 4.6.

## Verification

- `pnpm --filter @paperclipai/adapter-claude-local typecheck` → passes.
- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/adapter-models.test.ts
src/__tests__/claude-local-adapter.test.ts` → 12/12 passing (both
directly-related files).
- No existing test pins the `claude_local` models array (see
`server/src/__tests__/adapter-models.test.ts`), so appending a new entry
is non-breaking.
- Manual check of UI consumer: `AgentConfigForm.tsx` fetches the list
via `agentsApi.adapterModels()` and renders it in a creatable popover —
no hard-coded expectations anywhere in the UI layer.
- Screenshots: single new option appears at the top of the Claude Code
(local) model dropdown; existing options unchanged.

## Risks

- Low risk. Purely additive: one new entry in a list consumed by a UI
dropdown. No behavior change for existing agents, no schema change, no
migration, no env var.
- `BEDROCK_MODELS` in
`packages/adapters/claude-local/src/server/models.ts` is intentionally
**not** touched — the exact region-qualified Bedrock id for Opus 4.7 is
not yet confirmed, and shipping a guessed id could produce a broken
option for Bedrock users. Tracked as a follow-up on the linked issue.

## Model Used

- None — human-authored.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable (no tests needed:
existing suite already covers the list-consumer paths)
- [x] If this change affects the UI, I have included before/after
screenshots (dropdown gains one new top entry; all other entries
unchanged)
- [x] I have updated relevant documentation to reflect my changes (no
doc update needed: `docs/adapters/claude-local.md` uses
`claude-opus-4-6` only as an example, still valid)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes #3827
2026-04-16 13:18:30 -05:00
akhater 1afb6be961 fix(heartbeat): add hermes_local to SESSIONED_LOCAL_ADAPTERS (#3561)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The heartbeat service monitors agent health via PID liveness checks
for local adapters
> - `SESSIONED_LOCAL_ADAPTERS` in `heartbeat.ts` controls which adapters
get PID tracking and retry-on-lost behavior
> - `hermes_local` (the Hermes Agent adapter) was missing from this set
> - Without it, the orphan reaper immediately marks all Hermes runs as
`process_lost` instead of retrying
> - This PR adds the one-line registration so `hermes_local` gets the
same treatment as `claude_local`, `codex_local`, `cursor`, and
`gemini_local`
> - The benefit is Hermes agent runs complete normally instead of being
killed after ~5 minutes

## What Changed

- Added `"hermes_local"` to the `SESSIONED_LOCAL_ADAPTERS` set in
`server/src/services/heartbeat.ts`

## Verification

- Trigger a Hermes agent run via the wakeup API
- Confirm `heartbeat_runs.status` transitions to `succeeded` (not
`process_lost`)
- Tested end-to-end on a production Paperclip instance with Hermes agent
running heartbeat cycles for 48+ hours

## Risks

Low risk. Additive one-line change — adds a string to an existing set.
No behavioral change for other adapters. Consistent with
`BUILTIN_ADAPTER_TYPES` which already includes `hermes_local`.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.6 (claude-opus-4-6)
- Context window: 1M tokens
- Capabilities: Tool use, code execution

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Antoine Khater <akhater@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:35:02 -05:00
Dotta b8725c52ef release: v2026.416.0 notes (#3782)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and
stable releases need a clear changelog artifact for operators upgrading
between versions.
> - The release-note workflow in this repo stores one stable changelog
file per release under `releases/`.
> - `v2026.410.0` and `v2026.413.0` were intermediate drafts for the
same release window, while the next stable release is `v2026.416.0`.
> - Keeping superseded draft release notes around would make the stable
release history noisy and misleading.
> - This pull request consolidates the intended content into
`releases/v2026.416.0.md` and removes the older
`releases/v2026.410.0.md` and `releases/v2026.413.0.md` files.
> - The benefit is a single canonical stable release note for
`v2026.416.0` with no duplicate release artifacts.

## What Changed

- Added `releases/v2026.416.0.md` as the canonical stable changelog for
the April 16, 2026 release.
- Removed the superseded `releases/v2026.410.0.md` and
`releases/v2026.413.0.md` draft release-note files.
- Kept the final release-note ordering and content as edited in the
working tree before commit.

## Verification

- Reviewed the git diff to confirm the PR only changes release-note
artifacts in `releases/`.
- Confirmed the branch is based on `public-gh/master` and contains a
single release-note commit.
- Did not run tests because this is a docs-only changelog update.

## Risks

- Low risk. The change is limited to release-note markdown files.
- The main risk is editorial: if any release item was meant to stay in a
separate changelog file, it now exists only in `v2026.416.0.md`.

## Model Used

- OpenAI GPT-5 Codex, model `gpt-5.4`, medium reasoning, tool use and
code execution in the Codex CLI environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-15 21:40:35 -05:00
Dotta 5f45712846 Sync/master post pap1497 followups 2026 04 15 (#3779)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board depends on issue, inbox, cost, and company-skill surfaces
to stay accurate and fast while agents are actively working
> - The PAP-1497 follow-up branch exposed a few rough edges in those
surfaces: stale active-run state on completed issues, missing creator
filters, oversized issue payload scans, and placeholder issue-route
parsing
> - Those gaps make the control plane harder to trust because operators
can see misleading run state, miss the right subset of work, or pay
extra query/render cost on large issue records
> - This pull request tightens those follow-ups across server and UI
code, and adds regression coverage for the affected paths
> - The benefit is a more reliable issue workflow, safer high-volume
cost aggregation, and clearer board/operator navigation

## What Changed

- Added the `v2026.415.0` release changelog entry.
- Fixed stale issue-run presentation after completion and reused the
shared issue-path parser so literal route placeholders no longer become
issue links.
- Added creator filters to the Issues page and Inbox, including
persisted filter-state normalization and regression coverage.
- Bounded issue detail/list project-mention scans and trimmed large
issue-list payload fields to keep issue reads lighter.
- Hardened company-skill list projection and cost/finance aggregation so
large markdown blobs and large summed values do not leak into list
responses or overflow 32-bit casts.
- Added targeted server/UI regression tests for company skills,
costs/finance, issue mention scanning, creator filters, inbox
normalization, and issue reference parsing.

## Verification

- `pnpm exec vitest run
server/src/__tests__/company-skills-service.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts
server/src/__tests__/issues-service.test.ts ui/src/lib/inbox.test.ts
ui/src/lib/issue-filters.test.ts ui/src/lib/issue-reference.test.ts`
- `gh pr checks 3779`
Current pass set on the PR head: `policy`, `verify`, `e2e`,
`security/snyk (cryppadotta)`, `Greptile Review`

## Risks

- Creator filter options are derived from the currently loaded
issue/agent data, so very sparse result sets may not surface every
historical creator until they appear in the active dataset.
- Cost/finance aggregate casts now use `double precision`; that removes
the current overflow risk, but future schema changes should keep
large-value aggregation behavior under review.
- Issue detail mention scanning now skips comment-body scans on the
detail route, so any consumer that relied on comment-only project
mentions there would need to fetch them separately.

## Model Used

- OpenAI Codex, GPT-5-based coding agent with terminal tool use and
local code execution in the Paperclip workspace. Exact internal model
ID/context-window exposure is not surfaced in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-15 21:13:56 -05:00
Dotta d4c3899ca4 [codex] improve issue and routine UI responsiveness (#3744)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators rely on issue, inbox, and routine views to understand what
the company is doing in real time
> - Those views need to stay fast and readable even when issue lists,
markdown comments, and run metadata get large
> - The current branch had a coherent set of UI and live-update
improvements spread across issue search, issue detail rendering, routine
affordances, and workspace lookups
> - This pull request groups those board-facing changes into one
standalone branch that can merge independently of the heartbeat/runtime
work
> - The benefit is a faster, clearer issue and routine workflow without
changing the underlying task model

## What Changed

- Show routine execution issues by default and rename the filter to
`Hide routine runs` so the default state no longer looks like an active
filter.
- Show the routine name in the run dialog and tighten the issue
properties pane with a workspace link, copy-on-click behavior, and an
inline parent arrow.
- Reduce issue detail rerenders, keep queued issue chat mounted, improve
issues page search responsiveness, and speed up issues first paint.
- Add inbox "other search results", refresh visible issue runs after
status updates, and optimize workspace lookups through summary-mode
execution workspace queries.
- Improve markdown wrapping and scrolling behavior for long strings and
self-comment code blocks.
- Relax the markdown sanitizer assertion so the test still validates
safety after the new wrap-friendly inline styles.

## Verification

- `pnpm vitest run ui/src/components/IssuesList.test.tsx
ui/src/lib/inbox.test.ts ui/src/pages/Issues.test.tsx
ui/src/context/BreadcrumbContext.test.tsx
ui/src/context/LiveUpdatesProvider.test.ts
ui/src/components/MarkdownBody.test.tsx
ui/src/api/execution-workspaces.test.ts
server/src/__tests__/execution-workspaces-routes.test.ts`

## Risks

- This touches several issue-facing UI surfaces at once, so regressions
would most likely show up as stale rendering, search result mismatches,
or small markdown presentation differences.
- The workspace lookup optimization depends on the summary-mode route
shape staying aligned between server and UI.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment.
Exact backend model deployment ID was not exposed in-session.
Tool-assisted editing and shell execution were used.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-15 15:54:05 -05:00
Jannes Stubbemann 7463479fc8 fix: disable HTTP caching on run log endpoints (#3724)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Every run emits a streaming log that the web UI polls so humans can
watch what the agent is doing
> - Log responses go out without explicit cache directives, so Express
adds an ETag
> - If the first poll lands before any bytes have been written, the
browser caches the empty / partial snapshot and keeps getting `304 Not
Modified` on every subsequent poll
> - The transcript pane then stays stuck on "Waiting for transcript…"
even after the log has plenty of content
> - This pull request sets `Cache-Control: no-cache, no-store` on both
run-log endpoints so the conditional-request path is defeated

## What Changed

- `server/src/routes/agents.ts` — `GET /heartbeat-runs/:runId/log` now
sets `Cache-Control: no-cache, no-store` on the response.
- Same change applied to `GET /workspace-operations/:operationId/log`
(same structure, same bug).

## Verification

- Reproduction: start a long-running agent, watch the transcript pane.
Before the fix, open devtools and observe `304 Not Modified` on each
poll after the initial 200 with an empty body; the UI never updates.
After the fix, each poll is a 200 with fresh bytes.
- Existing tests pass.

## Risks

Low. Cache headers only affect whether the browser revalidates; the
response body is unchanged. No API surface change.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:53:25 -05:00
Dotta 3fa5d25de1 [codex] harden heartbeat run summaries and recovery context (#3742)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Heartbeat runs are the control-plane record of what agents did, why
they woke up, and what operators should see next
> - Run lists, stranded issue comments, and live log polling all depend
on compact but accurate heartbeat summaries
> - The current branch had a focused backend slice that improves how run
result JSON is summarized, how stale process recovery comments are
written, and how live log polling resolves the active run
> - This pull request isolates that heartbeat/runtime reliability work
from the unrelated UI and dev-tooling changes
> - The benefit is more reliable issue context and cheaper run lookups
without dragging unrelated board UI changes into the same review

## What Changed

- Include the latest run failure in stranded issue comments during
orphaned process recovery.
- Bound heartbeat `result_json` payloads for list responses while
preserving the raw stored payloads.
- Narrow heartbeat log endpoint lookups so issue polling resolves the
relevant active run with less unnecessary scanning.
- Add focused tests for heartbeat list summaries, live run polling,
orphaned process recovery, and the run context/result summary helpers.

## Verification

- `pnpm vitest run
server/src/__tests__/heartbeat-context-summary.test.ts
server/src/__tests__/heartbeat-list.test.ts
server/src/__tests__/agent-live-run-routes.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- The main risk is accidentally hiding a field that some client still
expects from summarized `result_json`, or over-constraining the live log
lookup path for edge-case run routing.
- Recovery comments now surface the latest failure more aggressively, so
wording changes may affect downstream expectations if anyone parses
those comments too strictly.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment.
Exact backend model deployment ID was not exposed in-session.
Tool-assisted editing and shell execution were used.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-15 09:48:39 -05:00
Dotta c1a02497b0 [codex] fix worktree dev dependency ergonomics (#3743)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Local development needs to work cleanly across linked git worktrees
because Paperclip itself leans on worktree-based engineering workflows
> - Dev-mode asset routing, Vite watch behavior, and workspace package
links are part of that day-to-day control-plane ergonomics
> - The current branch had a small but coherent set of
worktree/dev-tooling fixes that are independent from both the issue UI
changes and the heartbeat runtime changes
> - This pull request isolates those environment fixes into a standalone
branch that can merge without carrying unrelated product work
> - The benefit is a smoother multi-worktree developer loop with fewer
stale links and less noisy dev watching

## What Changed

- Serve dev public assets before the HTML shell and add a routing test
that locks that behavior in.
- Ignore UI test files in the Vite dev watch helper so the dev server
does less unnecessary work.
- Update `ensure-workspace-package-links.ts` to relink stale workspace
dependencies whenever a workspace `node_modules` directory exists,
instead of only inside linked-worktree detection paths.

## Verification

- `pnpm vitest run server/src/__tests__/app-vite-dev-routing.test.ts
ui/src/lib/vite-watch.test.ts`
- `node cli/node_modules/tsx/dist/cli.mjs
scripts/ensure-workspace-package-links.ts`

## Risks

- The asset routing change is low risk but sits near app shell behavior,
so a regression would show up as broken static assets in dev mode.
- The workspace-link repair now runs in more cases, so the main risk is
doing unexpected relinks when a checkout has intentionally unusual
workspace symlink state.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment.
Exact backend model deployment ID was not exposed in-session.
Tool-assisted editing and shell execution were used.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-15 09:47:29 -05:00
Jannes Stubbemann 390502736c chore(ui): drop console.* and legal comments in production builds (#3728)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The web UI is a single-page app built with Vite and shipped as a
static bundle to every deployment
> - Production bundles carry `console.log` / `console.debug` calls from
dev code and `/*! … */` legal-comment banners from third-party packages
> - The console calls leak internals to anyone opening devtools and
waste bytes per call site; the legal banners accumulate throughout the
bundle
> - Both problems affect every self-hoster, since they all ship the same
UI bundle
> - This pull request configures esbuild (via `vite.config.ts`) to strip
`console` and `debugger` statements and drop inline legal comments from
production builds only

## What Changed

- `ui/vite.config.ts`:
  - Switch to the functional `defineConfig(({ mode }) => …)` form.
- Add `build.minify: "esbuild"` (explicit — it's the existing default).
- Add `esbuild.drop: ["console", "debugger"]` and
`esbuild.legalComments: "none"`, gated on `mode === "production"` so
`vite dev` is unaffected.

## Verification

- `pnpm --filter @paperclipai/ui build` then grep the
`ui/dist/assets/*.js` bundle for `console.log` — no occurrences.
- `pnpm --filter @paperclipai/ui dev` — `console.log` calls in source
still reach the browser console.
- Bundle size: small reduction (varies with project but measurable on a
fresh build).

## Risks

Low. No API surface change. Production code should not depend on
`console.*` for side effects; any call that did is now a dead call,
which is the same behavior most minifiers apply.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:46:12 -05:00
Jannes Stubbemann 0d87fd9a11 fix: proper cache headers for static assets and SPA fallback (#3734)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Every deployment serves the same Vite-built UI bundle from the same
express app
> - Vite emits JS/CSS under `/assets/<name>.<hash>.<ext>` — the hash
rolls whenever the content rolls, so these files are inherently
immutable
> - `index.html` references specific hashed filenames, so it has the
opposite lifecycle: whenever we deploy, the file changes but the URL
doesn't
> - Today the static middleware sends neither with cache headers, and
the SPA fallback serves `index.html` for any unmatched route — including
paths under `/assets/` that no longer exist after a deploy
> - That combination produces the familiar "blank screen after deploy" +
`Failed to load module script: Expected a JavaScript MIME type but
received 'text/html'` bug
> - This pull request caches hashed assets immutably, forces
`index.html` to `no-cache` everywhere it gets served, and returns 404
for missing `/assets/*` paths

## What Changed

- `server/src/app.ts`:
- Serve `/assets/*` with `Cache-Control: public, max-age=31536000,
immutable`.
- Serve the remaining static files (favicon, manifest, robots.txt) with
a 1-hour cache, but override to `no-cache` specifically for `index.html`
via the `setHeaders` hook — because `express.static` serves it directly
for `/` and `/index.html`.
- The SPA fallback (`app.get(/.*/, …)`) sets `Cache-Control: no-cache`
on its `index.html` response.
- The fallback returns 404 for paths under `/assets/` so browsers don't
cache the HTML shell as a JavaScript module.

## Verification

- `curl -i http://localhost:3100/assets/index-abc123.js` →
`cache-control: public, max-age=31536000, immutable`.
- `curl -i http://localhost:3100/` → `cache-control: no-cache`.
- `curl -i http://localhost:3100/assets/missing.js` → `404`.
- `curl -i http://localhost:3100/some/spa/route` → `200` HTML with
`cache-control: no-cache`.

## Risks

Low. Asset URLs and HTML content are unchanged; only response headers
and the 404 behavior for missing asset paths change. No API surface
affected.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:45:22 -05:00
Jannes Stubbemann 6059c665d5 fix(a11y): remove maximum-scale and user-scalable=no from viewport (#3726)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Humans watch and oversee those agents through a web UI
> - Accessibility matters for anyone who cannot read small text
comfortably — they rely on browser zoom
> - The app shell's viewport meta tag includes `maximum-scale=1.0,
user-scalable=no`
> - Those tokens disable pinch-zoom and are a WCAG 2.1 SC 1.4.4 (Resize
Text) failure
> - The original motivation — suppressing iOS Safari's auto-zoom on
focused inputs — is actually a font-size issue, not a viewport issue,
and modern Safari only auto-zooms when input font-size is below 16px
> - This pull request drops the two tokens, restoring pinch-zoom while
leaving the real fix (inputs at ≥16px) to CSS

## What Changed

- `ui/index.html` — remove `maximum-scale=1.0, user-scalable=no` from
the viewport meta tag. Keep `width=device-width, initial-scale=1.0,
viewport-fit=cover`.

## Verification

- Manual on iOS and Chrome mobile: pinch-to-zoom now works across the
app.
- Manual on desktop: Ctrl+/- zoom already worked via
`initial-scale=1.0`; unchanged.

## Risks

Low. Users who were relying on auto-zoom-suppression for text inputs
will notice nothing (modern Safari only auto-zooms below 16px). No API
surface change.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:43:45 -05:00
Jannes Stubbemann f460f744ef fix: trust PAPERCLIP_PUBLIC_URL in board mutation guard (#3731)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Humans interact with the system through a web UI that authenticates
a session and then issues mutations against the board
> - A CSRF-style guard (`boardMutationGuard`) protects those mutations
by requiring the request origin match a trusted set built from the
`Host` / `X-Forwarded-Host` header
> - Behind certain reverse proxies, neither header matches the public
URL — TLS terminates at the edge and the inbound `Host` carries an
internal service name (cluster-local hostname, IP, or an Ingress backend
reference)
> - Mutations from legitimate browser sessions then fail with `403 Board
mutation requires trusted browser origin`
> - `PAPERCLIP_PUBLIC_URL` is already the canonical "what operators told
us the public URL is" value — it's used by better-auth and `config.ts`
> - This pull request adds it to the trusted-origin set when set, so
browsers reaching the legit public URL aren't blocked

## What Changed

- `server/src/middleware/board-mutation-guard.ts` — parse
`PAPERCLIP_PUBLIC_URL` and add its origin to the trusted set in
`trustedOriginsForRequest`. Additive only.

## Verification

- `PAPERCLIP_PUBLIC_URL=https://example.com pnpm start` then issue a
mutation from a browser pointed at `https://example.com`: 200, as
before. From an unrecognized origin: 403, as before.
- Without `PAPERCLIP_PUBLIC_URL` set: behavior is unchanged.

## Risks

Low. Additive only. The default dev origins and the
`Host`/`X-Forwarded-Host`-derived origins continue to be trusted; this
just adds the operator-configured public URL on top.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:42:55 -05:00
Dotta 32a9165ddf [codex] harden authenticated routes and issue editor reliability (#3741)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The control plane depends on authenticated routes enforcing company
boundaries and role permissions correctly
> - This branch also touches the issue detail and markdown editing flows
operators use while handling advisory and triage work
> - Partial issue cache seeds and fragile rich-editor parsing could
leave important issue content missing or blank at the moment an operator
needed it
> - Blocked issues becoming actionable again should wake their assignee
automatically instead of silently staying idle
> - This pull request rebases the advisory follow-up branch onto current
`master`, hardens authenticated route authorization, and carries the
issue-detail/editor reliability fixes forward with regression tests
> - The benefit is tighter authz on sensitive routes plus more reliable
issue/advisory editing and wakeup behavior on top of the latest base

## What Changed

- Hardened authenticated route authorization across agent, activity,
approval, access, project, plugin, health, execution-workspace,
portability, and related server paths, with new cross-tenant and
runtime-authz regression coverage.
- Switched issue detail queries from `initialData` to placeholder-based
hydration so list/quicklook seeds still refetch full issue bodies.
- Normalized advisory-style HTML images before mounting the markdown
editor and strengthened fallback behavior when the rich editor silently
fails or rejects the content.
- Woke assigned agents when blocked issues move back to `todo`, with
route coverage for reopen and unblock transitions.
- Rebasing note: this branch now sits cleanly on top of the latest
`master` tip used for the PR base.

## Verification

- `pnpm exec vitest run ui/src/lib/issueDetailQuery.test.tsx
ui/src/components/MarkdownEditor.test.tsx
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/activity-routes.test.ts
server/src/__tests__/agent-cross-tenant-authz-routes.test.ts`
- Confirmed `pnpm-lock.yaml` is not part of the PR diff.
- Rebased the branch onto current `public-gh/master` before publishing.

## Risks

- Broad authz tightening may expose existing flows that were relying on
permissive board or agent access and now need explicit grants.
- Markdown editor fallback changes could affect focus or rendering in
edge-case content that mixes HTML-like advisory markup with normal
markdown.
- This verification was intentionally scoped to touched regressions and
did not run the full repository suite.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment
with tool use for terminal, git, and GitHub operations. The exact
runtime model identifier is not exposed inside this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, it is behavior-only and does not
need before/after screenshots
- [x] I have updated relevant documentation to reflect my changes, or no
documentation changes were needed for these internal fixes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-15 08:41:15 -05:00
Chris Farhood 50cd76d8a3 feat(adapters): add capability flags to ServerAdapterModule (#3540)
## Thinking Path

> - Paperclip orchestrates AI agents via adapters (`claude_local`,
`codex_local`, etc.)
> - Each adapter type has different capabilities — instructions bundles,
skill materialization, local JWT — but these were gated by 5 hardcoded
type lists scattered across server routes and UI components
> - External adapter plugins (e.g. a future `opencode_k8s`) cannot add
themselves to those hardcoded lists without patching Paperclip source
> - The existing `supportsLocalAgentJwt` field on `ServerAdapterModule`
proves the right pattern already exists; it just wasn't applied to the
other capability gates
> - This pull request replaces the 4 remaining hardcoded lists with
declarative capability flags on `ServerAdapterModule`, exposed through
the adapter listing API
> - The benefit is that external adapter plugins can now declare their
own capabilities without any changes to Paperclip source code

## What Changed

- **`packages/adapter-utils/src/types.ts`** — added optional capability
fields to `ServerAdapterModule`: `supportsInstructionsBundle`,
`instructionsPathKey`, `requiresMaterializedRuntimeSkills`
- **`server/src/routes/agents.ts`** — replaced
`DEFAULT_MANAGED_INSTRUCTIONS_ADAPTER_TYPES` and
`ADAPTERS_REQUIRING_MATERIALIZED_RUNTIME_SKILLS` hardcoded sets with
capability-aware helper functions that fall back to the legacy sets for
adapters that don't set flags
- **`server/src/routes/adapters.ts`** — `GET /api/adapters` now includes
a `capabilities` object per adapter (all four flags + derived
`supportsSkills`)
- **`server/src/adapters/registry.ts`** — all built-in adapters
(`claude_local`, `codex_local`, `process`, `cursor`) now declare flags
explicitly
- **`ui/src/adapters/use-adapter-capabilities.ts`** — new hook that
fetches adapter capabilities from the API
- **`ui/src/pages/AgentDetail.tsx`** — replaced hardcoded `isLocal`
allowlist with `capabilities.supportsInstructionsBundle` from the API
- **`ui/src/components/AgentConfigForm.tsx`** /
**`OnboardingWizard.tsx`** — replaced `NONLOCAL_TYPES` denylist with
capability-based checks
- **`server/src/__tests__/adapter-registry.test.ts`** /
**`adapter-routes.test.ts`** — tests covering flag exposure,
undefined-when-unset, and per-adapter values
- **`docs/adapters/creating-an-adapter.md`** — new "Capability Flags"
section documenting all flags and an example for external plugin authors

## Verification

- Run `pnpm test --filter=@paperclip/server -- adapter-registry
adapter-routes` — all new tests pass
- Run `pnpm test --filter=@paperclip/adapter-utils` — existing tests
still pass
- Spin up dev server, open an agent with `claude_local` type —
instructions bundle tab still visible
- Create/open an agent with a non-local type — instructions bundle tab
still hidden
- Call `GET /api/adapters` and verify each adapter includes a
`capabilities` object with the correct flags

## Risks

- **Low risk overall** — all new flags are optional with
backwards-compatible fallbacks to the existing hardcoded sets; no
adapter behaviour changes unless a flag is explicitly set
- Adapters that do not declare flags continue to use the legacy lists,
so there is no regression risk for built-in adapters
- The UI capability hook adds one API call to AgentDetail mount; this is
a pre-existing endpoint, so no new latency path is introduced

## Model Used

- Provider: Anthropic
- Model: Claude Sonnet 4.6 (`claude-sonnet-4-6`)
- Context: 200k token context window
- Mode: Agentic tool use (code editing, bash, grep, file reads)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Pawla Abdul (Bot) <pawla@groombook.dev>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-15 07:10:52 -05:00
Knife.D f6ce976544 fix: Anthropic subscription quota always shows 100% used (#3589)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Costs > Providers tab displays live subscription quota from each
adapter (Claude, Codex)
> - The Claude adapter fetches utilization from the Anthropic OAuth
usage API and converts it to a 0-100 percent via `toPercent()`
> - The API changed to return utilization as 0-100 percentages (e.g.
`34.0` = 34%), but `toPercent()` assumed 0-1 fractions and multiplied by
100
> - After `Math.min(100, ...)` clamping, every quota window displayed as
100% used regardless of actual usage
> - Additionally, `extra_usage.used_credits` and `monthly_limit` are
returned in cents but were formatted as dollars, showing $6,793 instead
of $67.93
> - This PR applies the same `< 1` heuristic already proven in the Codex
adapter and fixes the cents-to-dollars conversion
> - The benefit is accurate quota display matching what users see on
claude.ai/settings/usage

## What Changed

- `toPercent()`: apply `< 1` heuristic to handle both legacy 0-1
fractions and current 0-100 percentage API responses (consistent with
Codex adapter's `normalizeCodexUsedPercent()`)
- `formatExtraUsageLabel()`: divide `used_credits` and `monthly_limit`
by 100 to convert cents to dollars before formatting
- Updated all `toPercent` and `fetchClaudeQuota` tests to use current
API format (0-100 range)
- Added backward-compatibility test for legacy 0-1 fraction values
- Added test for enabled extra usage with utilization and
cents-to-dollars conversion

## Verification

- `toPercent(34.0)` → `34` (was `100`)
- `toPercent(91.0)` → `91` (was `100`)
- `toPercent(0.5)` → `50` (legacy format still works)
- Extra usage `used_credits: 6793, monthly_limit: 14000` → `$67.93 /
$140.00` (was `$6,793.00 / $14,000.00`)
- Verified on a live instance with Claude Max subscription — Costs >
Providers tab now shows correct percentages matching
claude.ai/settings/usage

## Risks

Low risk. The `< 1` heuristic is already battle-tested in the Codex
adapter. The only edge case is a true utilization of exactly `1.0` which
maps to `1%` instead of `100%` — this is consistent with the Codex
adapter behavior and is an acceptable trade-off since 1% and 100% are
distinguishable in practice (100% would be returned as `100.0` by the
API).

## Model Used

Claude Opus 4.6 (1M context) via Claude Code CLI — tool use, code
analysis, and code generation

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes #2188

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 06:44:26 -05:00
Chris Farhood b816809a1e fix(server): respect externally set PAPERCLIP_API_URL env var (#3472)
## Thinking Path

> - Paperclip server starts up and sets internal `PAPERCLIP_API_URL` for
downstream services and adapters
> - The server startup code was unconditionally overwriting
`PAPERCLIP_API_URL` with `http://localhost:3100` (or equivalent based on
`config.host`)
> - In Kubernetes deployments, `PAPERCLIP_API_URL` is set via a
ConfigMap to the externally accessible load balancer URL (e.g.
`https://paperclip.example.com`)
> - Because the env var was unconditionally set after loading the
ConfigMap value, the ConfigMap-provided URL was ignored and replaced
with the internal localhost address
> - This caused downstream services (adapter env building) to use the
wrong URL, breaking external access
> - This pull request makes the assignment conditional — only set if not
already provided by the environment
> - External deployments can now supply `PAPERCLIP_API_URL` and it will
be respected; local development continues to work without setting it

## What Changed

- `server/src/index.ts`: Wrapped `PAPERCLIP_API_URL` assignment in `if
(!process.env.PAPERCLIP_API_URL)` guard so externally provided values
are preserved
- `server/src/__tests__/server-startup-feedback-export.test.ts`: Added
tests verifying external `PAPERCLIP_API_URL` is respected and fallback
behavior is correct
- `docs/deploy/environment-variables.md`: Updated `PAPERCLIP_API_URL`
description to clarify it can be externally provided and the load
balancer/reverse proxy use case

## Verification

- Run the existing test suite: `pnpm test:run
server/src/__tests__/server-startup-feedback-export.test.ts` — all 3
tests pass
- Manual verification: Set `PAPERCLIP_API_URL` to a custom value before
starting the server and confirm it is not overwritten

## Risks

- Low risk — purely additive conditional check; existing behavior for
unset env var is unchanged

## Model Used

MiniMax M2.7 — reasoning-assisted for tracing the root cause through the
startup chain (`buildPaperclipEnv` → `startServer` → `config.host` →
`HOST` env var)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Pawla Abdul (Bot) <pawla@groombook.dev>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-15 06:43:48 -05:00
Lempkey d0a8d4e08a fix(routines): include cronExpression and timezone in list trigger response (#3209)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Routines are recurring tasks that trigger agents on a schedule or
via webhook
> - Routine triggers store their schedule as a `cronExpression` +
`timezone` in the database
> - The `GET /companies/:companyId/routines` list endpoint is the
primary way API consumers (and the UI) discover all routines and their
triggers
> - But the list endpoint was silently dropping `cronExpression` and
`timezone` from each trigger object — the DB query fetched them, but the
explicit object-construction mapping only forwarded seven other fields
> - This PR fixes the mapping to include `cronExpression` and
`timezone`, and extends the `RoutineListItem.triggers` type to match
> - The benefit is that API consumers can now see the actual schedule
from the list endpoint, and future UI components reading from the list
cache will get accurate schedule data

## What Changed

- **`server/src/services/routines.ts`** — Added `cronExpression` and
`timezone` to the explicit trigger object mapping inside
`routinesService.list()`. The DB query (`listTriggersForRoutineIds`)
already fetched all columns via `SELECT *`; the values were being
discarded during object construction.
- **`packages/shared/src/types/routine.ts`** — Extended
`RoutineListItem.triggers` `Pick<RoutineTrigger, ...>` to include
`cronExpression` and `timezone` so the TypeScript type contract matches
the actual runtime shape.
- **`server/src/__tests__/routines-e2e.test.ts`** — Added assertions to
the existing schedule-trigger E2E test that verify both `cronExpression`
and `timezone` are present in the `GET /companies/:companyId/routines`
list response.

## Verification

```bash
# Run the route + service unit tests
npx vitest run server/src/__tests__/routines-routes.test.ts server/src/__tests__/routines-service.test.ts
# → 21 tests pass

# Confirm cronExpression appears in list response
curl /api/companies/{id}/routines | jq '.[].triggers[].cronExpression'
# → now returns the actual cron string instead of undefined
```

Manual reproduction per the issue:
1. Create a routine with a schedule trigger (`cronExpression: "47 14 * *
*"`, `timezone: "America/Mexico_City"`)
2. `GET /api/companies/{id}/routines` — trigger object now includes
`cronExpression` and `timezone`

## Risks

Low risk. The change only adds two fields to an existing response shape
— no fields removed, no behavior changed. The `cronExpression` is `null`
for non-schedule trigger kinds (webhook, etc.), consistent with
`RoutineTrigger.cronExpression: string | null`. No migration required.

## Model Used

- **Provider:** Anthropic
- **Model:** Claude Sonnet 4.6 (`claude-sonnet-4-6`)
- **Context window:** 200k tokens
- **Mode:** Extended thinking + tool use (agentic)
- Secondary adversarial review: OpenAI Codex (via codex-companion
plugin)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (API-only fix; no UI rendering change)
- [ ] I have updated relevant documentation to reflect my changes (no
doc changes needed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-15 06:42:24 -05:00
Clément DREISKI 213bcd8c7a fix: include routine-execution issues in agent inbox-lite (#3329)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents query their own inbox via `/agents/me/inbox-lite` to discover
assigned work
> - `issuesSvc.list()` excludes `routine_execution` issues by default,
which is appropriate for the board UI
> - But agents calling `inbox-lite` need to see **all** their assigned
work, including routine-created issues
> - Without `includeRoutineExecutions: true`, agents miss their own
in-progress issues after the first delegation step
> - This causes routine-driven pipelines to stall — agents report "Inbox
empty" and exit
> - This pull request adds `includeRoutineExecutions: true` to the
`inbox-lite` query
> - The benefit is routine-driven pipelines no longer stall after
delegation

## What Changed

- Added `includeRoutineExecutions: true` to the `issuesSvc.list()` call
in the `/agents/me/inbox-lite` route (`server/src/routes/agents.ts`)

## Verification

1. Create a routine that assigns an issue to an agent
2. Trigger the routine — first run works via `issue_assigned` event
injection
3. Agent delegates (creates a subtask) and exits
4. On next heartbeat, agent queries `inbox-lite`
5. **Before fix**: issue is invisible, agent reports "Inbox empty"
6. **After fix**: issue appears in inbox, agent continues working

Tested on production instance — fix resolves the stall immediately.

## Risks

Low risk — additive change, only affects agent-facing inbox endpoint.
Board UI keeps its default behavior (routine executions hidden for clean
view).

## Model Used

Claude Opus 4.6 (`claude-opus-4-6`) via Claude Code CLI — high thinking
effort, tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes #3282
2026-04-15 06:41:40 -05:00
Dotta 7f893ac4ec [codex] Harden execution reliability and heartbeat tooling (#3679)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Reliable execution depends on heartbeat routing, issue lifecycle
semantics, telemetry, and a fast enough local verification loop to keep
regressions visible
> - The remaining commits on this branch were mostly server/runtime
correctness fixes plus test and documentation follow-ups in that area
> - Those changes are logically separate from the UI-focused
issue-detail and workspace/navigation branches even when they touch
overlapping issue APIs
> - This pull request groups the execution reliability, heartbeat,
telemetry, and tooling changes into one standalone branch
> - The benefit is a focused review of the control-plane correctness
work, including the follow-up fix that restored the implicit
comment-reopen helpers after branch splitting

## What Changed

- Hardened issue/heartbeat execution behavior, including self-review
stage skipping, deferred mention wakes during active execution, stranded
execution recovery, active-run scoping, assignee resolution, and
blocked-to-todo wake resumption
- Reduced noisy polling/logging overhead by trimming issue run payloads,
compacting persisted run logs, silencing high-volume request logs, and
capping heartbeat-run queries in dashboard/inbox surfaces
- Expanded telemetry and status semantics with adapter/model fields on
task completion plus clearer status guidance in docs/onboarding material
- Updated test infrastructure and verification defaults with faster
route-test module isolation, cheaper default `pnpm test`, e2e isolation
from local state, and repo verification follow-ups
- Included docs/release housekeeping from the branch and added a small
follow-up commit restoring the implicit comment-reopen helpers that were
dropped during branch reconstruction

## Verification

- `pnpm vitest run
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-telemetry-routes.test.ts`
- `pnpm vitest run server/src/__tests__/http-log-policy.test.ts
server/src/__tests__/heartbeat-run-log.test.ts
server/src/__tests__/health.test.ts`
- `server/src/__tests__/activity-service.test.ts`,
`server/src/__tests__/heartbeat-comment-wake-batching.test.ts`, and
`server/src/__tests__/heartbeat-process-recovery.test.ts` were attempted
on this host but the embedded Postgres harness reported
init-script/data-dir problems and skipped or failed to start, so they
are noted as environment-limited

## Risks

- Medium: this branch changes core issue/heartbeat routing and
reopen/wakeup behavior, so regressions would affect agent execution flow
rather than isolated UI polish
- Because it also updates verification infrastructure, reviewers should
pay attention to whether the new tests are asserting the right failure
modes and not just reshaping harness behavior

## Model Used

- OpenAI Codex coding agent (GPT-5-class runtime in Codex CLI; exact
deployed model ID is not exposed in this environment), reasoning
enabled, tool use and local code execution enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-14 13:34:52 -05:00
Dotta e89076148a [codex] Improve workspace runtime and navigation ergonomics (#3680)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - That operator experience depends not just on issue chat, but also on
how workspaces, inbox groups, and navigation state behave over
long-running sessions
> - The current branch included a separate cluster of workspace-runtime
controls, inbox grouping, sidebar ordering, and worktree lifecycle fixes
> - Those changes cross server, shared contracts, database state, and UI
navigation, but they still form one coherent operator workflow area
> - This pull request isolates the workspace/runtime and navigation
ergonomics work into one standalone branch
> - The benefit is better workspace recovery and navigation persistence
without forcing reviewers through the unrelated issue-detail/chat work

## What Changed

- Improved execution workspace and project workspace controls, request
wiring, layout, and JSON editor ergonomics
- Hardened linked worktree reuse/startup behavior and documented the
`worktree repair` flow for recovering linked worktrees safely
- Added inbox workspace grouping, mobile collapse, archive undo,
keyboard navigation, shared group-header styling, and persisted
collapsed-group behavior
- Added persistent sidebar order preferences with the supporting DB
migration, shared/server contracts, routes, services, hooks, and UI
integration
- Scoped issue-list preferences by context and added targeted UI/server
tests for workspace controls, inbox behavior, sidebar preferences, and
worktree validation

## Verification

- `pnpm vitest run
server/src/__tests__/sidebar-preferences-routes.test.ts
ui/src/pages/Inbox.test.tsx
ui/src/components/ProjectWorkspaceSummaryCard.test.tsx
ui/src/components/WorkspaceRuntimeControls.test.tsx
ui/src/api/workspace-runtime-control.test.ts`
- `server/src/__tests__/workspace-runtime.test.ts` was attempted, but
the embedded Postgres suite self-skipped/hung on this host after
reporting an init-script issue, so it is not counted as a local pass
here

## Risks

- Medium: this branch includes migration-backed preference storage plus
worktree/runtime behavior, so merge review should pay attention to state
persistence and worktree recovery semantics
- The sidebar preference migration is standalone, but it should still be
watched for conflicts if another migration lands first

## Model Used

- OpenAI Codex coding agent (GPT-5-class runtime in Codex CLI; exact
deployed model ID is not exposed in this environment), reasoning
enabled, tool use and local code execution enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-14 12:57:11 -05:00
Dotta 6e6f538630 [codex] Improve issue detail and issue-list UX (#3678)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - A core part of that is the operator experience around reading issue
state, agent chat, and sub-task structure
> - The current branch had a long run of issue-detail and issue-list UX
fixes that all improve how humans follow and steer active work
> - Those changes mostly live in the UI/chat surface and should be
reviewed together instead of mixed with workspace/runtime work
> - This pull request packages the issue-detail, chat, markdown, and
sub-issue list improvements into one standalone change
> - The benefit is a cleaner, less jumpy, more reliable issue workflow
on desktop and mobile without coupling it to unrelated server/runtime
refactors

## What Changed

- Stabilized issue chat runtime wiring, optimistic comment handling,
queued-comment cancellation, and composer anchoring during live updates
- Fixed several issue-detail rendering and navigation regressions
including placeholder bleed, local polling scope, mobile inbox-to-issue
transitions, and visible refresh resets
- Improved markdown and rich-content handling with advisory image
normalization, editor fallback behavior, touch mention recovery, and
`issue:` quicklook links
- Refined sub-issue behavior with parent-derived defaults, current-user
inheritance fixes, empty-state cleanup, and a reusable issue-list
presentation for sub-issues
- Added targeted UI tests for the new issue-detail, chat scroll/message,
placeholder-data, markdown, and issue-list behaviors

## Verification

- `pnpm vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/components/IssuesList.test.tsx
ui/src/context/LiveUpdatesProvider.test.tsx
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/issue-chat-scroll.test.ts
ui/src/lib/issue-detail-subissues.test.ts
ui/src/lib/query-placeholder-data.test.tsx
ui/src/hooks/usePaperclipIssueRuntime.test.tsx`

## Risks

- Medium: this branch touches the highest-traffic issue-detail UI paths,
so regressions would show up as chat/thread or sub-issue UX glitches
- The changes are UI-heavy and would benefit from reviewer screenshots
or a quick manual browser pass before merge

## Model Used

- OpenAI Codex coding agent (GPT-5-class runtime in Codex CLI; exact
deployed model ID is not exposed in this environment), reasoning
enabled, tool use and local code execution enabled

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-14 12:50:48 -05:00
Dotta 5d1ed71779 chore: add v2026.410.0 release notes for security release
## Summary

- Adds release notes for v2026.410.0 security release covering
GHSA-68qg-g8mg-6pr7
- Required before triggering the stable release workflow to publish
2026.410.0 to npm

## Context

The security advisory GHSA-68qg-g8mg-6pr7 was published on 2026-04-10
listing 2026.410.0 as the patched version, but only canary builds exist
on npm. The authz fix (PR #3315) is already merged to master. This PR
adds release notes so the stable release workflow can be triggered.

## Test plan

- Verify release notes content is accurate
- Merge, then trigger release.yml workflow_dispatch with
source_ref=master, stable_date=2026-04-10, dry_run=false
- Confirm npm view paperclipai version returns 2026.410.0

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-13 12:43:17 -05:00
Dotta 76fe736e8e chore: add v2026.410.0 release notes for security release
## Summary

- Adds comprehensive release notes for `v2026.410.0`, the security
release that patches GHSA-68qg-g8mg-6pr7 (unauthenticated RCE via import
authorization bypass)
- Required before triggering the stable release workflow to publish
`2026.410.0` to npm and create the GitHub Release

## Context

The security fix (PR #3315) is already merged to master. The GHSA
advisory references `2026.410.0` as the patched version, but only canary
builds exist on npm. This PR unblocks the stable release.

## Test plan

- [x] Release notes file is valid markdown
- [ ] Merge and trigger `release.yml` workflow with `source_ref=master`,
`stable_date=2026-04-10`
- [ ] Verify `npm view paperclipai version` returns `2026.410.0`
- [ ] Verify GitHub Release `v2026.410.0` exists

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Paperclip <noreply@paperclip.ing>

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-13 12:43:07 -05:00
Dotta d6b06788f6 Merge pull request #3542 from cryppadotta/PAP-1346-faster-issue-to-issue-links
Speed up issue-to-issue navigation
2026-04-12 21:39:27 -05:00
Dotta 6844226572 Address Greptile navigation review 2026-04-12 21:30:50 -05:00
Dotta 0cb42f49ea Fix rebased issue detail prefetch typing 2026-04-12 21:18:57 -05:00
Dotta e59047187b Reset scroll on issue detail navigation 2026-04-12 21:14:12 -05:00
Dotta 1729e41179 Speed up issue-to-issue navigation 2026-04-12 21:14:12 -05:00
Dotta 11de5ae9c9 Merge pull request #3538 from paperclipai/PAP-1355-right-now-when-agents-boot-they-re-instructed-to-call-the-api-to-checkout-the-issue-so-that-they-have-exclusive
Improve scoped wake checkout and linked worktree reuse
2026-04-12 21:08:20 -05:00
Dotta 8e82ac7e38 Handle harness checkout conflicts gracefully 2026-04-12 20:57:31 -05:00
Dotta be82a912b2 Fix signoff e2e for auto-checked out issues 2026-04-12 20:43:50 -05:00
Dotta ab5eeca94e Fix stale issue live-run state 2026-04-12 20:41:31 -05:00
Dotta 2172476e84 Fix linked worktree reuse for execution workspaces 2026-04-12 20:34:06 -05:00
Dotta c1bb938519 Auto-checkout scoped issue wakes in the harness 2026-04-11 10:53:28 -05:00
Dotta b649bd454f Merge pull request #3383 from paperclipai/pap-1347-codex-fast-mode
feat(codex-local): add fast mode support
2026-04-11 08:45:50 -05:00
Dotta a692e37f3e Merge pull request #3386 from paperclipai/pap-1347-dev-runner-worktree-env
fix: isolate dev runner worktree env
2026-04-11 08:45:16 -05:00
Dotta 96637a1e09 Merge pull request #3385 from paperclipai/pap-1347-inbox-issue-search
feat(inbox): improve issue search matches
2026-04-11 08:44:59 -05:00
Dotta a5aed931ab fix(dev-runner): tighten worktree env bootstrap 2026-04-11 08:35:53 -05:00
Dotta a63e847525 fix(inbox): avoid refetching on filter-only changes 2026-04-11 08:34:17 -05:00
Dotta a7dc88941b fix(codex-local): avoid fast mode in env probe 2026-04-11 08:33:18 -05:00
Dotta b6115424b1 fix: isolate dev runner worktree env 2026-04-11 08:27:25 -05:00
Dotta 1f78e55072 Broaden comment matches in issue search 2026-04-11 08:26:09 -05:00
Dotta fcab770518 Add inbox issue search fallback 2026-04-11 08:26:09 -05:00
Dotta 2d8f97feb0 feat(codex-local): add fast mode support 2026-04-11 08:21:55 -05:00
Dotta 03a2cf5c8a Merge pull request #3303 from cryppadotta/PAP-438-review-openclaw-s-docs-on-networking-discovery-and-binding-what-could-we-learn-from-this
Introduce bind presets for deployment setup
2026-04-11 07:24:57 -05:00
Dotta a77206812e Harden tailnet bind setup 2026-04-11 07:13:41 -05:00
dotta 6208899d0a Fix dev runner workspace import regression 2026-04-11 07:09:07 -05:00
dotta 2a84e53c1b Introduce bind presets for deployment setup
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-11 07:09:07 -05:00
Dotta e1bf9d66a7 Merge pull request #3355 from cryppadotta/pap-1331-issue-thread-ux
feat: polish issue thread markdown and references
2026-04-11 06:55:26 -05:00
Dotta b48be80d5d fix: address PR 3355 review regressions 2026-04-11 06:40:37 -05:00
Dotta 45ebecab5a Merge pull request #3356 from cryppadotta/pap-1331-inbox-ux
feat: polish inbox and issue list workflows
2026-04-11 06:35:59 -05:00
Dotta dae888cc5d Merge pull request #3354 from cryppadotta/pap-1331-runtime-workflows
fix: harden heartbeat and adapter runtime workflows
2026-04-11 06:31:28 -05:00
Dotta aaf42f3a7e Merge pull request #3353 from cryppadotta/pap-1331-dev-tools-docs
chore: improve worktree tooling and security docs
2026-04-11 06:28:49 -05:00
Dotta 62d05a7ae2 Merge pull request #3232 from officialasishkumar/fix/clear-empty-agent-env-bindings
fix(ui): persist cleared agent env bindings on save
2026-04-11 06:23:14 -05:00
Dotta 1cd0281b4d test(ui): fix heartbeat run fixture drift 2026-04-10 22:42:52 -05:00
Dotta 65480ffab1 fix: restore inbox optimistic run fixture 2026-04-10 22:40:49 -05:00
Dotta dc94e3d1df fix: keep thread polish independent of quicklook routing 2026-04-10 22:36:45 -05:00
Dotta 0162bb332c fix: keep runtime UI changes self-contained 2026-04-10 22:36:45 -05:00
Dotta 7ec8716159 fix: keep inbox quicklook and tests standalone 2026-04-10 22:36:45 -05:00
Dotta 8cb70d897d fix: use CLI tsx entrypoint for workspace preflight 2026-04-10 22:32:55 -05:00
Dotta 8bdf4081ee chore: improve worktree tooling and security docs 2026-04-10 22:26:30 -05:00
Dotta 958c11699e feat: polish issue thread markdown and references 2026-04-10 22:26:21 -05:00
Dotta c566a9236c fix: harden heartbeat and adapter runtime workflows 2026-04-10 22:26:21 -05:00
Dotta dab95740be feat: polish inbox and issue list workflows 2026-04-10 22:26:21 -05:00
Devin Foley 548721248e fix(ui): keep latest issue document revision current (#3342)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Board users and agents collaborate on issue-scoped documents such as
plans and revisions need to be trustworthy because they are the audit
trail for those artifacts.
> - The issue document UI now supports revision history and restore, so
the UI has to distinguish the current revision from historical revisions
correctly even while multiple queries are refreshing.
> - In `PAPA-72`, the newest content could appear under an older
revision label because the current document snapshot and the
revision-history query could temporarily disagree after an edit.
> - That made the UI treat the newest revision like a historical restore
target, which is the opposite of the intended behavior.
> - This pull request derives one authoritative revision view from both
sources, sorts revisions newest-first, and keeps the freshest revision
marked current.
> - The benefit is that revision history stays stable and trustworthy
immediately after edits instead of briefly presenting the newest content
as an older revision.

## What Changed

- Added a `document-revisions` helper that merges the current document
snapshot with fetched revision history into one normalized revision
state.
- Updated `IssueDocumentsSection` to render from that normalized state
instead of trusting either query in isolation.
- Added focused tests covering the current-revision selection and
ordering behavior.

## Verification

- `pnpm -r typecheck`
- `pnpm build`
- Targeted revision tests passed locally.
- Manual reviewer check:
  - Open an issue document with revision history.
  - Edit and save the document.
  - Immediately open the revision selector.
- Confirm the newest revision remains marked current and older revisions
remain the restore targets.

## Risks

- Low risk. The change is isolated to issue document revision
presentation in the UI.
- Main risk is merging the current snapshot with fetched history
incorrectly for edge cases, which is why the helper has focused unit
coverage.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-10 17:14:06 -07:00
Devin Foley f4a05dc35c fix(cli): prepare plugin sdk before cli dev boot (#3343)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The company import/export e2e exercises the local CLI startup path
that boots the dev server inside a workspace
> - That startup path loads server and plugin code which depends on
built workspace package artifacts such as `@paperclipai/shared` and
`@paperclipai/plugin-sdk`
> - In a clean worktree those `dist/*` artifacts may not exist yet even
though `paperclipai run` can still attempt to import the local server
entry
> - That mismatch caused the import/export e2e to fail before the actual
company package flow ran
> - This pull request adds a CLI preflight step that prepares the needed
workspace build dependencies before the local server import and fails
closed if that preflight is interrupted or stalls
> - The benefit is that clean worktrees can boot `paperclipai run`
reliably without silently continuing after incomplete dependency
preparation

## What Changed

- Updated `cli/src/commands/run.ts` to execute
`scripts/ensure-plugin-build-deps.mjs` before importing
`server/src/index.ts` for local dev startup.
- Ensured `paperclipai run` can materialize missing workspace artifacts
such as `packages/shared/dist` and `packages/plugins/sdk/dist`
automatically in clean worktrees.
- Made the preflight fail closed when the child process exits via signal
and bounded it with a 120-second timeout so the CLI does not hang
indefinitely.
- Kept the fix isolated to the CLI startup path; no API contract,
schema, or UI behavior changed.
- Reused the existing
`cli/src/__tests__/company-import-export-e2e.test.ts` coverage that
already exercises the failing boot path, so no additional test file was
needed.

## Verification

- `pnpm test:run cli/src/__tests__/company-import-export-e2e.test.ts`
- `pnpm --filter paperclipai typecheck`
- On the isolated branch, confirmed `packages/shared/dist/index.js` and
`packages/plugins/sdk/dist/index.js` were absent before the run, then
reran the targeted e2e and observed a passing result.

## Risks

- Low risk: the change only affects the local CLI dev startup path
before the server import.
- Residual risk: other entrypoints still rely on their own
preflight/build behavior, so this does not normalize every workspace
startup path.
- The 120-second timeout is intentionally generous, but unusually slow
machines could still hit it and surface a startup error instead of
waiting forever.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment,
with shell/tool execution enabled. The exact runtime revision and
context window are not exposed by this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-10 17:01:06 -07:00
Dotta b00d52c5b6 Merge pull request #3015 from aronprins/feature/backups-configuration
feat(backups): gzip compression and tiered retention with UI controls
2026-04-10 11:56:12 -05:00
Dotta ac664df8e4 fix(authz): scope import, approvals, activity, and heartbeat routes (#3315)
## Thinking Path

> - Paperclip orchestrates AI agents and company-scoped control-plane
actions for zero-human companies.
> - This change touches the server authz boundary around company
portability, approvals, activity, and heartbeat-run operations.
> - The vulnerability was that board-authenticated callers could cross
company boundaries or create new companies through import paths without
the same authorization checks enforced elsewhere.
> - Once that gap existed, an attacker could chain it into higher-impact
behavior through agent execution paths.
> - The fix needed to harden every confirmed authorization gap in the
reported chain, not just the first route that exposed it.
> - This pull request adds the missing instance-admin and company-access
checks and adds regression tests for each affected route.
> - The benefit is that cross-company actions and new-company import
flows now follow the same control-plane authorization rules as the rest
of the product.

## What Changed

- Required instance-admin access for `new_company` import preview/apply
flows in `server/src/routes/companies.ts`.
- Required company access before approval decision routes in
`server/src/routes/approvals.ts`.
- Required company access for activity creation and heartbeat-run issue
listing in `server/src/routes/activity.ts`.
- Required company access before heartbeat cancellation in
`server/src/routes/agents.ts`.
- Added regression coverage in the corresponding server route tests.

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/company-portability-routes.test.ts
src/__tests__/approval-routes-idempotency.test.ts
src/__tests__/activity-routes.test.ts
src/__tests__/agent-permissions-routes.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- Prior verification on the original security patch branch also included
`pnpm build`.

## Risks

- Low code risk: the change is narrow and only adds missing
authorization gates to existing routes.
- Operational risk: the advisory is already public, so this PR should be
merged quickly to minimize the public unpatched window.
- Residual product risk remains around open signup / bootstrap defaults,
which was intentionally left out of this patch because the current
first-user onboarding flow depends on it.

## Model Used

- OpenAI GPT-5 Codex coding agent with tool use and local code execution
in the Codex CLI environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Forgotten <forgottenrunes@protonmail.com>
2026-04-10 11:55:27 -05:00
Dotta 4477ca2a7e Merge pull request #3299 from aronprins/codex/fix-ceo-instruction-relative-paths
[codex] Clarify Claude instruction sibling file base path
2026-04-10 11:54:46 -05:00
Aron Prins 724893ad5b fix claude instruction sibling path hint 2026-04-10 14:22:48 +02:00
Aron Prins 7c42345177 chore: re-trigger CI to refresh PR base SHA 2026-04-10 12:16:25 +02:00
Dotta 0e87fdbe35 Merge pull request #3222 from paperclipai/pap-1266-issue-workflow
feat(issue-ui): refine issue workflow surfaces and live updates
2026-04-09 14:52:16 -05:00
dotta 4077ccd343 Fix signoff stage access and comment wake retries 2026-04-09 14:48:12 -05:00
Asish Kumar 44d94d0add fix(ui): persist cleared agent env bindings on save
Agent configuration edits already had an API path for replacing the full adapterConfig, but the edit form was still sending merge-style patches. That meant clearing the last environment variable serialized as undefined, the key disappeared from JSON, and the server merged the old env bindings back into the saved config.

Build adapter config save payloads as full replacement patches, strip undefined keys before send, and reuse the existing replaceAdapterConfig contract so explicit clears persist correctly. Add regression coverage for the cleared-env case and for adapter-type changes that still need to preserve adapter-agnostic fields.

Fixes #3179
2026-04-09 17:50:14 +00:00
dotta 03dff1a29a Refine issue workflow surfaces and live updates 2026-04-09 10:26:17 -05:00
Aron Prins b1e457365b fix: clean up orphaned .sql on compression failure and fix stale startup log
- backup-lib: delete uncompressed .sql file in catch block when gzip
  compression fails, preventing silent disk usage accumulation
- server: replace stale retentionDays scalar with retentionSource in
  startup log since retention is now read from DB on each backup tick
2026-04-08 14:40:05 +02:00
Aron Prins fcbae62baf feat(backups): tiered daily/weekly/monthly retention with UI controls
Replace single retentionDays with a three-tier BackupRetentionPolicy:
- Daily: keep all backups (presets: 3, 7, 14 days; default 7)
- Weekly: keep one per calendar week (presets: 1, 2, 4 weeks; default 4)
- Monthly: keep one per calendar month (presets: 1, 3, 6 months; default 1)

Pruning sorts backups newest-first and applies each tier's cutoff,
keeping only the newest entry per ISO week/month bucket. The Instance
Settings General page now shows three preset selectors (no icon, matches
existing page design). Remove Database icon import.
2026-04-08 14:40:05 +02:00
Aron Prins cc44d309c0 feat(backups): gzip compress backups and add retention config to Instance Settings
Compress database backups with gzip (.sql.gz), reducing file size ~83%.
Add backup retention configuration to Instance Settings UI with preset
options (7 days, 2 weeks, 1 month). The backup scheduler now reads
retention from the database on each tick so changes take effect without
restart. Default retention changed from 30 to 7 days.
2026-04-08 14:40:05 +02:00
1743 changed files with 613024 additions and 14817 deletions
+8
View File
@@ -154,6 +154,14 @@ Each AGENTS.md body should include not just what the agent does, but how they fi
This turns a collection of agents into an organization that actually works together. Without workflow context, agents operate in isolation — they do their job but don't know what happens before or after them.
Add a concise execution contract to every generated working agent:
- Start actionable work in the same heartbeat and do not stop at a plan unless planning was requested.
- Leave durable progress in comments, documents, or work products with the next action.
- Use child issues for long or parallel delegated work instead of polling agents, sessions, or processes.
- Mark blocked work with the unblock owner and action.
- Respect budget, pause/cancel, approval gates, and company boundaries.
### Step 5: Confirm Output Location
Ask the user where to write the package. Common options:
@@ -105,6 +105,13 @@ Your responsibilities:
- Implement features and fix bugs
- Write tests and documentation
- Participate in code reviews
Execution contract:
- Start actionable implementation work in the same heartbeat; do not stop at a plan unless planning was requested.
- Leave durable progress with a clear next action.
- Use child issues for long or parallel delegated work instead of polling agents, sessions, or processes.
- Mark blocked work with the unblock owner and action.
```
## teams/engineering/TEAM.md
+1 -1
View File
@@ -548,7 +548,7 @@ Import from `@paperclipai/adapter-utils/server-utils`:
### Prompt Templates
- Support `promptTemplate` for every run
- Use `renderTemplate()` with the standard variable set
- Default prompt: `"You are agent {{agent.id}} ({{agent.name}}). Continue your Paperclip work."`
- Default prompt should use `DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE` from `@paperclipai/adapter-utils/server-utils` so local adapters share Paperclip's execution contract: act in the same heartbeat, avoid planning-only exits unless requested, leave durable progress and a next action, use child issues instead of polling, mark blockers with owner/action, and respect governance boundaries.
### Error Handling
- Differentiate timeout vs process error vs parse failure
@@ -0,0 +1,230 @@
---
name: deal-with-security-advisory
description: >
Handle a GitHub Security Advisory response for Paperclip, including
confidential fix development in a temporary private fork, human coordination
on advisory-thread comments, CVE request, synchronized advisory publication,
and immediate security release steps.
---
# Security Vulnerability Response Instructions
## ⚠️ CRITICAL: This is a security vulnerability. Everything about this process is confidential until the advisory is published. Do not mention the vulnerability details in any public commit message, PR title, branch name, or comment. Do not push anything to a public branch. Do not discuss specifics in any public channel. Assume anything on the public repo is visible to attackers who will exploit the window between disclosure and user upgrades.
***
## Context
A security vulnerability has been reported via GitHub Security Advisory:
* **Advisory:** {{ghsaId}} (e.g. GHSA-x8hx-rhr2-9rf7)
* **Reporter:** {{reporterHandle}}
* **Severity:** {{severity}}
* **Notes:** {{notes}}
***
## Step 0: Fetch the Advisory Details
Pull the full advisory so you understand the vulnerability before doing anything else:
```
gh api repos/paperclipai/paperclip/security-advisories/{{ghsaId}}
```
Read the `description`, `severity`, `cvss`, and `vulnerabilities` fields. Understand the attack vector before writing code.
## Step 1: Acknowledge the Report
⚠️ **This step requires a human.** The advisory thread does not have a comment API. Ask the human operator to post a comment on the private advisory thread acknowledging the report. Provide them this template:
> Thanks for the report, @{{reporterHandle}}. We've confirmed the issue and are working on a fix. We're targeting a patch release within {{timeframe}}. We'll keep you updated here.
Give your human this template, but still continue
Below we use `gh` tools - you do have access and credentials outside of your sandbox, so use them.
## Step 2: Create the Temporary Private Fork
This is where all fix development happens. Never push to the public repo.
```
gh api --method POST \
repos/paperclipai/paperclip/security-advisories/{{ghsaId}}/forks
```
This returns a repository object for the private fork. Save the `full_name` and `clone_url`.
Clone it and set up your workspace:
```
# Clone the private fork somewhere outside ~/paperclip
git clone <clone_url_from_response> ~/security-patch-{{ghsaId}}
cd ~/security-patch-{{ghsaId}}
git checkout -b security-fix
```
**Do not edit `~/paperclip`** — the dev server is running off the `~/paperclip` master branch and we don't want to touch it. All work happens in the private fork clone.
**TIPS:**
* Do not commit `pnpm-lock.yaml` — the repo has actions to manage this
* Do not use descriptive branch names that leak the vulnerability (e.g., no `fix-dns-rebinding-rce`). Use something generic like `security-fix`
* All work stays in the private fork until publication
* CI/GitHub Actions will NOT run on the temporary private fork — this is a GitHub limitation by design. You must run tests locally
## Step 3: Develop and Validate the Fix
Write the patch. Same content standards as any PR:
* It must functionally work — **run tests locally** since CI won't run on the private fork
* Consider the whole codebase, not just the narrow vulnerability path. A patch that fixes one vector but opens another is worse than no patch
* Ensure backwards compatibility for the database, or be explicit about what breaks
* Make sure any UI components still look correct if the fix touches them
* The fix should be minimal and focused — don't bundle unrelated changes into a security patch. Reviewers (and the reporter) should be able to read the diff and understand exactly what changed and why
**Specific to security fixes:**
* Verify the fix actually closes the attack vector described in the advisory. Reproduce the vulnerability first (using the reporter's description), then confirm the patch prevents it
* Consider adjacent attack vectors — if DNS rebinding is the issue, are there other endpoints or modes with the same class of problem?
* Do not introduce new dependencies unless absolutely necessary — new deps in a security patch raise eyebrows
Push your fix to the private fork:
```
git add -A
git commit -m "Fix security vulnerability"
git push origin security-fix
```
## Step 4: Coordinate with the Reporter
⚠️ **This step requires a human.** Ask the human operator to post on the advisory thread letting the reporter know the fix is ready and giving them a chance to review. Provide them this template:
> @{{reporterHandle}} — fix is ready in the private fork if you'd like to review before we publish. Planning to release within {{timeframe}}.
Proceed
## Step 5: Request a CVE
This makes vulnerability scanners (npm audit, Snyk, Dependabot) warn users to upgrade. Without it, nobody gets automated notification.
```
gh api --method POST \
repos/paperclipai/paperclip/security-advisories/{{ghsaId}}/cve
```
GitHub is a CVE Numbering Authority and will assign one automatically. The CVE may take a few hours to propagate after the advisory is published.
## Step 6: Publish Everything Simultaneously
This all happens at once — do not stagger these steps. The goal is **zero window** between the vulnerability becoming public knowledge and the fix being available.
### 6a. Verify reporter credit before publishing
```
gh api repos/paperclipai/paperclip/security-advisories/{{ghsaId}} --jq '.credits'
```
If the reporter is not credited, add them:
```
gh api --method PATCH \
repos/paperclipai/paperclip/security-advisories/{{ghsaId}} \
--input - << 'EOF'
{
"credits": [
{
"login": "{{reporterHandle}}",
"type": "reporter"
}
]
}
EOF
```
### 6b. Update the advisory with the patched version and publish
```
gh api --method PATCH \
repos/paperclipai/paperclip/security-advisories/{{ghsaId}} \
--input - << 'EOF'
{
"state": "published",
"vulnerabilities": [
{
"package": {
"ecosystem": "npm",
"name": "paperclip"
},
"vulnerable_version_range": "< {{patchedVersion}}",
"patched_versions": "{{patchedVersion}}"
}
]
}
EOF
```
Publishing the advisory simultaneously:
* Makes the GHSA public
* Merges the temporary private fork into your repo
* Triggers the CVE assignment (if requested in step 5)
### 6c. Cut a release immediately after merge
```
cd ~/paperclip
git pull origin master
gh release create v{{patchedVersion}} \
--repo paperclipai/paperclip \
--title "v{{patchedVersion}} — Security Release" \
--notes "## Security Release
This release fixes a critical security vulnerability.
### What was fixed
{{briefDescription}} (e.g., Remote code execution via DNS rebinding in \`local_trusted\` mode)
### Advisory
https://github.com/paperclipai/paperclip/security/advisories/{{ghsaId}}
### Credit
Thanks to @{{reporterHandle}} for responsibly disclosing this vulnerability.
### Action required
All users running versions prior to {{patchedVersion}} should upgrade immediately."
```
## Step 7: Post-Publication Verification
```
# Verify the advisory is published and CVE is assigned
gh api repos/paperclipai/paperclip/security-advisories/{{ghsaId}} \
--jq '{state: .state, cve_id: .cve_id, published_at: .published_at}'
# Verify the release exists
gh release view v{{patchedVersion}} --repo paperclipai/paperclip
```
If the CVE hasn't been assigned yet, that's normal — it can take a few hours.
⚠️ **Human step:** Ask the human operator to post a final comment on the advisory thread confirming publication and thanking the reporter.
Tell the human operator what you did by posting a comment to this task, including:
* The published advisory URL: `https://github.com/paperclipai/paperclip/security/advisories/{{ghsaId}}`
* The release URL
* Whether the CVE has been assigned yet
* All URLs to any pull requests or branches
@@ -0,0 +1,406 @@
---
name: release-changelog-discord-message
description: >
Write the Discord release announcement for a stable Paperclip release. Companion
to `release-changelog` — that skill produces the file at `releases/vYYYY.MDD.P.md`;
this one turns that file into a single copy-pasteable Discord post in dotta's
voice and attaches it as the `discord_announcement` document on the release
issue.
---
# Release Discord Announcement Skill
Write the Discord release announcement for the **stable** Paperclip release.
This is the companion to `.agents/skills/release-changelog/SKILL.md`. That skill
generates the file at `releases/vYYYY.MDD.P.md`. This skill turns that file into
a single copy-pasteable Discord block, in dotta's voice, and posts it as the
`discord_announcement` document on the release issue.
## What dotta said
> This is for discord — try to follow my format. If I have a section where I
> think about the future, pull from recent issues we're working on etc.
The Discord announcement is **not** the changelog. The changelog is exhaustive;
the announcement is opinionated, in-voice, and built around the same handful of
shipped highlights plus a real "what's next" + "what's on my mind" pulled from
current Paperclip work — not invented.
## When to use
- After `release-changelog` has produced `releases/vYYYY.MDD.P.md` on the
release worktree/PR.
- When the release issue (the one assigned by the release routine) asks for a
Discord announcement, or has a `discord_announcement` document that needs to
be refreshed for a new date/version.
- Never run this in isolation. The version, date, contributor list, and
highlight set MUST match the matching changelog file — if the changelog has
been updated, refresh this too.
## Output
A single fenced markdown code block, ready to paste into Discord. Attached as
issue document key `discord_announcement` on the release issue, and pasted
verbatim into a comment on that issue so the human can copy it out.
```bash
PUT /api/issues/{releaseIssueId}/documents/discord_announcement
{
"title": "Discord announcement",
"format": "markdown",
"body": "<the announcement>",
"baseRevisionId": "<latest if updating>"
}
```
If the document already exists, fetch it first and pass the current
`baseRevisionId`. Never overwrite silently — if the version has changed since
the document was last written, mention what changed in the issue comment.
## Format (follow this template)
Use Discord emoji shortcodes (`:paperclip:`, `:lock:`, `:brain:` …) — NOT the
Unicode emoji. Discord renders the shortcodes; the changelog file uses prose.
```
:paperclip: :paperclip: :paperclip: CLIPPERS!!! v{VERSION} IS OUT :paperclip: :paperclip: :paperclip:
OFFICIAL TWITTER: https://x.com/papercliping - follow it, report any others
## Highlights
:emoji: **Feature Name** - one-sentence description in dotta's voice.
:emoji: **Feature Name** - …
:emoji: **Feature Name** - …
... and a long tail of {flavor of the rest}. Read the [full release notes](<github link>).
## WHATS NEXT (:motorway: Roadmap)
* **Theme A** - one-line forward-looking blurb
* **Theme B** - …
* **Theme C** - …
## What's on my mind
* **Topic** - what's bugging dotta / what's queued / open questions
* **Topic** - …
## PRESS (optional — only if there is real press)
* **Outlet / Person** - what happened ([link](<x.com link>))
## WHAT I NEED FROM YOU (optional — only if there's a real ask)
FOLLOW THE TWITTER: https://x.com/papercliping - that's the only official one
TELL ME if you're using Paperclip in your business - I want to meet you
## Community
Thank you to everyone who contributed to this release!
```
@username1, @username2, @username3
```
## In Summary
PAPERCLIP IS THE AI ORCHESTRATOR FOR HUMANS TO ACCOMPLISH 100x MORE WORK
Every single person will be managing a team of a dozen, or a hundred, or a
thousand agents and Paperclip will be the default tool to manage it all.
ITS TIME TO CLIP :paperclip: :paperclip: :paperclip:
FULL RELEASE NOTES
https://github.com/paperclipai/paperclip/blob/master/releases/v{VERSION}.md
||@everyone||
```
Notes on the template:
- The opening and closing `:paperclip: :paperclip: :paperclip:` bookends are
part of the brand — keep them.
- Sections may be UPPERCASE or Title Case — dotta has used both. Pick a style
and stay consistent within a single post.
- Use `||@everyone||` (Discord spoiler-wrapped) at the very end so it pings
exactly once when the spoiler is removed by the poster.
## Language tips
These are extracted from how dotta has written the last several announcements.
Mimic this register; do not invent a "professional" tone.
- **First person, conversational.** "I want to meet companies using Paperclip",
"what's on my mind", "if that's you let me know". Not "Paperclip is excited
to announce".
- **ALL CAPS for excitement and asks**, especially in the opener, the section
headers, the "WHAT I NEED FROM YOU" section, and the closing tagline. Do not
ALL-CAPS feature descriptions.
- **One emoji shortcode per highlight bullet**, picked to evoke the feature
(`:lock:` for secrets, `:brain:` for planning, `:mag:` for search,
`:cloud:` for cloud / sandbox, `:jigsaw:` for plugins, `:rewind:` for
history/restore, `:thread:` for threads, etc.).
- **Highlight bullets are one sentence**, opinionated, told from the user's
perspective — "the cloud-secrets prereq is real now", not "added support
for…".
- **Tail line after highlights** wraps the rest in a single sentence and links
to the full release notes ("… and a long tail of {flavor}. Read the [full
release notes](url).").
- **"WHATS NEXT" is forward-looking themes**, not a literal sprint list. 35
bullets is the right size. Pull these from active goals, in-flight projects,
and recent issues the team is working on — do not invent themes.
- **"What's on my mind"** is dotta's personal/strategic thinking — docs gaps,
philosophical positioning ("we're the human control plane for ai labor"),
invitations ("if you've ever wanted to write about how you use Paperclip,
hit me up"). Pull real tensions from recent issues/comments; do not invent.
- **Press section** is optional. Only include it if there is real press in the
release window (a tweet, a podcast, a talk, a star milestone). No press →
drop the section entirely.
- **"WHAT I NEED FROM YOU"** is optional. Use it for a single concrete ask
(follow the twitter, intros, beta sign-ups). No real ask → drop it.
- **Community** is the same contributors list that's in the changelog file,
fenced in a triple-backtick block, comma-separated `@username, @username`.
Exclude bots and Paperclip founders, same rules as the changelog skill.
- **The "In Summary" mission line** evolves slowly. Use the most recent
variant unless dotta tells you otherwise. Recent variants:
- "PAPERCLIP IS THE AI ORCHESTRATOR FOR HUMANS TO ACCOMPLISH 100x MORE WORK"
- "PAPERCLIP WILL BE THE DEFAULT AGENT-MANAGEMENT TOOL FOR EVERY COMPANY"
- "Paperclip will be _the_ control plane for AI agents in **every** company."
- **Closing tagline** is always `ITS TIME TO CLIP :paperclip: :paperclip:
:paperclip:`. Keep it.
## Workflow
1. Read the matching `releases/vYYYY.MDD.P.md` produced by `release-changelog`.
Use the version and contributor list from that file — never re-derive them.
2. Read the **release issue thread** (the one assigned to you that ran the
release routine) — comments + linked issues + recent issues in the company
are the source for `WHATS NEXT` and `What's on my mind`. Pull real themes,
not invented ones.
3. Re-read the three verbatim examples below — they're the canonical voice.
4. Draft the announcement using the template above.
5. PUT it as the `discord_announcement` document on the release issue (see
"Output" above). If updating, send the latest `baseRevisionId`.
6. Post a comment on the release issue that includes the announcement inside a
single fenced markdown code block, so dotta can copy-paste it into Discord
without opening the document.
Do not publish to Discord. This skill only prepares the artifact.
## Verbatim previous examples
Three previous Discord announcements from dotta, included **verbatim** as the
ground-truth examples for voice, structure, and emoji usage. When in doubt,
match these.
### Example 1 — v2026.403.0
```
CLIPPERS! v2026.403.0 has dropped!! :paperclip: :paperclip: :paperclip:
## Highlights
:inbox_tray: **Inbox overhaul** - there is a new "mine" tab that has mail-client like keyboard shortcuts. It's my new default view for managing work
:thumbsup: **Feedback and evals** - you can now vote :thumbsup: / :thumbsdown: on your agent's responses. If you choose to share your traces with me, I'll use it to make Paperclip better. In either case you can export locally for your own org's learning
:page_with_curl: **Document revisions** - you can now restore old versions of your documents
:ping_pong: **Telemetry** - this version has anonymized telemetry that helps me better understand the basic uses of Paperclip (adapters and so on) - if you hate that, just it disable with `DO_NOT_TRACK=1` or `PAPERCLIP_TELEMETRY_DISABLED=1` environment variables
:construction_worker: **Execution Workspaces (experimental)** - Paperclip is not a "code review" tool, but I have been finding worktrees are important for certain projects. Enable it in experimental settings
:loop: **Routine variables** - sometimes you need to customize a routine and the new variables feature makes that easy
PLUS **tons** of improvements aound adapters, bugfixes, qol
## COMMUNITY
HUGE THANKS to the contributors with commits in this release:
```
@aronprins, @bittoby, @edimuj, @HenkDz, @kevmok, @mvanhorn, @radiusred, @remdev, @statxc, @vanductai
```
## WHATS NEXT (ROADMAP)
* **Multi-human users** -- you've been asking for it, we have a draft and will have this shortly
* **Sandbox execution** - the other half of cloud deployment: run your agents in a sandbox across any provider
PLUS: just dealing with the excellent PRs we have sitting in our inbox.
**What's also on my mind (coming soonish)**
* MAXIMIZER MODE - for when you've got a dream and tokens to burn
* Artifacts, work products, and deployments
* CEO Chat
* Stronger agent defaults
## PRESS
I've been doing my part to spread the word about Paperclip
* We talked to the incredible [Andrew Warner of Mixergy Fame](https://x.com/dotta/status/2039087507514507407)
* We gave a tutorial with the [inimitable Greg Isenberg](https://x.com/dotta/status/2037279902445994345)
* We met with the [Seed Club guys](https://x.com/dotta/status/2039020365926576377)
* We crossed [40k stars (46k now!)](https://x.com/dotta/status/2038638188227387613)
* ... and a couple others that will be released in a few days
## SUCCESS STORIES
* [Nevo made $76k in march](https://x.com/dotta/status/2039406772859920758) after using Paperclip to automate his marketing
* [Lewis Jackson](https://x.com/WhatSayLew/status/2039810227394978158) said 34 agents were already operating his trading firm through Paperclip and called it his "holy s***" AI moment.
* [Neal Kotak](https://x.com/nkotak1/status/2039582439459209638) said Paperclip already runs most of Roominary for him and praised how strong the product is.
* [Sam Woods](https://x.com/samwoods/status/2039039305960587755) said he knows several people who moved from OpenClaw to Paperclip, often with Hermes in the stack, and that they love it.
* [Josh Galt](https://x.com/JoshGalt/status/2039386307219095557) called Paperclip the coolest agent tooling he has used and said it is finally something that just works.
## IN SUMMARY
I know there are still some rough edges, but
Paperclip will be *the* control plane for AI agents in **every** company.
and I think we're moving at a pretty good clip :paperclip: :paperclip: :paperclip:
FULL RELEASE NOTES HERE
https://github.com/paperclipai/paperclip/releases/tag/v2026.403.0
||@everyone||
```
### Example 2 — v2026.416.0
```
:paperclip: :paperclip: :paperclip: CLIPPERS!!! v2026.416.0 IS OUT :paperclip: :paperclip: :paperclip:
## Highlights
This release has *tons* of quality of life improvements around speed, performance, and workflow. You should notice that comment threads feel faster and your agents stay on task longer
:thread: Issue chat threads now are a conversation more than comments
:police_officer: Execution policies like **Reviewer** and **Approver** are now first-class in the harness (e.g. enforce that QA *must* review a task)
:no_smoking: Blocker dependencies - first-class "wake on blocker resolved" which means now you can have "task graphs" that depend on one another and it's enforced by Paperclip
:woman_feeding_baby: Parent-child tasks - better support for sub-tasks all around, which makes it much easier to organize your work
And then a million fixes around ux, details, keyboard shortcuts, bug fixes, security fixes, etc. Really you should read the [full release notes here](https://github.com/paperclipai/paperclip/releases/tag/v2026.416.0)
## COMMUNITY
INCREDIBLE INCREDIBLE WORK BY folks with commits and reports in this release:
```
@AllenHyang, @antonio-mello-ai, @aronprins, @chrisschwer, @cleanunicorn, @DanielSousa, @davison, @ergonaworks, @HearthCore, @HenkDz, @KhairulA, @kimnamu, @Lempkey, @marysomething99-prog, @mvanhorn, @officialasishkumar, @plind-dm, @shoaib050326, @sparkeros, @wbelt, @offset, @sagilayani, @mattdonnelly10, @peaktwilight, @YuvalElbar6
```
## WHATS NEXT (:motorway: Roadmap)
* **Multi-human users** - in the last stages of testing, Paperclip is better with teams
* **Memory Infrastructure** - your agents will remember everything about yoru business
* **Sandbox execution** - run your agents anywhere
## What's on my mind
* I want to meet with companies who are using Paperclip in their business - if that's you let me know
* We need more Paperclip tutorials, defaults, and education - thanks to @aronprins for his work in this area already!
* We still need to get better at reviewing your PRs and we're improving our process every day
* "Zero-human company" language has to go - we're the human control plane for ai labor
* We're adding better support for *knowledge (wikis & files)*, *artifacts*, and *work product* in Paperclip soon.
## PRESS
* **AI Engineer Europe Tutorial** - I gave a tutorial for AIE. If someone is looking for a basics ABC of Paperclip [you can send them this](https://x.com/dotta/status/2044575580264316931)
* **AI Club Chicago** - JB gave a talk on Paperclip [at AI Tinkerers in Chicago](https://x.com/developwithJB/status/2044281068778316268) !
## IN SUMMARY
PAPERCLIP WILL BE THE DEFAULT AGENT-MANAGEMENT TOOL FOR EVERY COMPANY
If there's anything I can do to help you and your company use Paperclip, hit me up. Until then, enjoy the new release
ITS TIME TO CLIP :paperclip: :paperclip: :paperclip:
FULL RELEASE NOTES
https://github.com/paperclipai/paperclip/releases/tag/v2026.416.0
||@everyone||
```
### Example 3 — v2026.427.0
```
:paperclip: :paperclip: :paperclip: CLIPPERS!!! v2026.427.0 IS OUT :paperclip: :paperclip: :paperclip:
THIS IS THE OFFICIAL TWITTER FOLLOW IT: https://x.com/papercliping
## Highlights
:man_feeding_baby: **MULTI USER** - you can now invite multiple users to your instance
:factory_worker: **HARDER WORKING** - robosut liveness continuations and lifecycle recovery means your instance tries harder before involving you
:white_check_mark: **SUBISSUE CHECKLISTS** - subissues have better ordering which allows for long-run planning
:thread: **Rich Thread UX** - now your agents can ask you questions, ask for approvals, suggest tasks and you can approve or refine them right in your task threads
:cloud: **BETA: Sandbox Providers** - Cloud sandboxing is in beta - the API ships in this release and we'll be adding more providers
... and *tons* of other improvements and bugfixes.
## Community
Thank you to everyone who contributed to this release!
```
@akhater, @aronprins, @GodsBoy, @LeonSGP43, @neerazz, @NoronhaH, @rbarinov, @rvanduiven, @SgtPooki, @superbiche
```
## WHATS NEXT (:motorway: Roadmap)
* **Longer-range planning and execution** - Paperclip will support longer and longer tasks and work until it's done
* **Secrets Service v2** - an important prereq for Paperclip cloud
* **Artifacts, memory, and knowledge**
* **Conference Room** aka CEO/Agent Chat
## What's on my mind
* **Documentation & Blog posts** - I've fallen behind on the docs but aron has done a good job here - we'll be setting up Clips to help maintain these
* **Paperclip Cloud** - will be a critical unlock for us, but even the shared team story needs developed more - *where should the work be done* and *where are the outputs stored* and *how do we surface them to users*? Each of these questions are a core Paperclip service that needs developed
* **Paperclip Bench** - In the vein of SWE-Bench I've started an internal benchmark for Paperclip - we have to be able to measure that our changes are improving the system and not regressing
* **Paperclip Connections Store** - connecting to Github, Slack, Google Docs, and the hundreds of other services we use every day should be easy, secure, and configurable per agent and team
## Press
I met with the [Wisemen about Paperclip](https://x.com/dotta/status/2045146539534827998)
## WHAT I NEED FROM YOU
FOLLOW THIS TWITTER ACCOUNT: https://x.com/papercliping - that's the only official one, report any others
## In Summary
PAPERCLIP IS THE AI ORCHESTRATOR FOR HUMANS TO ACCOMPLISH 100x MORE WORK
Every single person will be managing a team of a dozen, or a hundred, or a thousand agents and Paperclip will be the default tool to manage it all.
ITS TIME TO CLIP :paperclip: :paperclip: :paperclip:
FULL RELEASE NOTES
https://github.com/paperclipai/paperclip/blob/master/releases/v2026.427.0.md
||@everyone||
```
## Review checklist
Before handing off:
1. Version + date match the matching `releases/vYYYY.MDD.P.md` exactly.
2. Contributor list matches the changelog (same exclusions: bots, founders).
3. Highlights are a subset of the changelog Highlights — same shipped features,
not invented or pre-alpha work.
4. `WHATS NEXT` and `What's on my mind` are pulled from real recent issues /
active goals — not invented themes.
5. Section style (UPPERCASE vs Title Case) is internally consistent.
6. Closing tagline is `ITS TIME TO CLIP :paperclip: :paperclip: :paperclip:`
and `||@everyone||` is the very last line.
7. Document `discord_announcement` is updated on the release issue, and the
announcement is also posted in a comment inside a fenced code block.
This skill never posts to Discord. It only prepares the announcement artifact.
+6 -2
View File
@@ -177,8 +177,12 @@ real name or email). To find GitHub usernames:
**Never expose contributor email addresses.** Use `@username` only.
Exclude bot accounts (e.g. `lockfile-bot`, `dependabot`) from the list. List contributors
in alphabetical order by GitHub username (case-insensitive).
Exclude bot accounts (e.g. `lockfile-bot`, `dependabot`) from the list.
Exclude Paperclip founders from the list (e.g. `cryppadotta`, `forgottendev`, `devinfoley`, `sockmonster`, `scotttong`)
List contributors in alphabetical order by GitHub username (case-insensitive).
If there are no contributors left after exclusions, then just skip this section and don't mention it.
## Step 6 — Review Before Release
+3
View File
@@ -2,3 +2,6 @@ DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip
PORT=3100
SERVE_UI=false
BETTER_AUTH_SECRET=paperclip-dev-secret
# Discord webhook for daily merge digest (scripts/discord-daily-digest.sh)
# DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...
+3
View File
@@ -38,6 +38,8 @@
-
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`.
## Model Used
<!--
@@ -57,6 +59,7 @@
- [ ] I have included a thinking path that traces from project context to this change
- [ ] I have specified the model used (with version and capability details)
- [ ] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [ ] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots
+77
View File
@@ -0,0 +1,77 @@
name: "Build: Dev"
on:
push:
branches: [dev]
workflow_dispatch:
permissions:
contents: read
packages: write
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 30
outputs:
image-tag: ${{ steps.tag.outputs.sha }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set image tag
id: tag
run: echo "sha=$(echo ${{ github.sha }} | cut -c1-7)" >> $GITHUB_OUTPUT
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Gitea Registry
uses: docker/login-action@v3
with:
registry: git.farh.net
username: admin
password: ${{ secrets.REGISTRY_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: git.farh.net/farhoodlabs/paperclip-dev
tags: |
type=sha,prefix=
type=semver,pattern={{version}}
type=raw,value=latest,enable=${{ startsWith(gitea.ref, 'refs/tags/v') }}
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
no-cache: true
update-infra:
needs: build
runs-on: ubuntu-latest
steps:
- name: Update dev image tag in infra repo
run: |
SHA="${{ needs.build.outputs.image-tag }}"
FILE="overlays/dev/kustomization.yaml"
response=$(curl -sS \
-H "Authorization: token ${{ secrets.REGISTRY_TOKEN }}" \
"https://git.farh.net/api/v1/repos/farhoodlabs/paperclip-infra/contents/$FILE")
file_sha=$(echo "$response" | jq -r '.sha')
content=$(echo "$response" | jq -r '.content' | base64 -d)
new_content=$(echo "$content" | sed "s/newTag: \".*\"/newTag: \"$SHA\"/")
encoded=$(printf '%s' "$new_content" | base64 -w 0)
curl -sS -X PUT \
-H "Authorization: token ${{ secrets.REGISTRY_TOKEN }}" \
"https://git.farh.net/api/v1/repos/farhoodlabs/paperclip-infra/contents/$FILE" \
-d "{\"message\":\"chore(cd): update paperclip-dev to $SHA\",\"content\":\"$encoded\",\"sha\":\"$file_sha\"}"
+48
View File
@@ -0,0 +1,48 @@
name: "Build: Production"
on:
push:
branches: [local]
workflow_dispatch:
permissions:
contents: read
packages: write
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to Gitea Registry
uses: docker/login-action@v3
with:
registry: git.farh.net
username: admin
password: ${{ secrets.REGISTRY_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: git.farh.net/farhoodlabs/paperclip
tags: |
type=sha,prefix=
type=semver,pattern={{version}}
type=raw,value=latest,enable=${{ startsWith(gitea.ref, 'refs/tags/v') }}
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: Dockerfile
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
no-cache: true
+1 -1
View File
@@ -14,7 +14,7 @@ permissions:
jobs:
build-and-push:
runs-on: ubuntu-latest
timeout-minutes: 30
timeout-minutes: 60
concurrency:
group: docker-${{ github.ref }}
cancel-in-progress: true
+3 -1
View File
@@ -29,9 +29,11 @@ jobs:
- run: pnpm install --frozen-lockfile
- run: pnpm build
- run: npx playwright install --with-deps chromium
- run: google-chrome --version
- name: Run e2e tests
env:
PAPERCLIP_PLAYWRIGHT_CHANNEL: "chrome"
run: pnpm run test:e2e
- uses: actions/upload-artifact@v4
+181 -50
View File
@@ -23,7 +23,9 @@ jobs:
- name: Block manual lockfile edits
if: github.head_ref != 'chore/refresh-lockfile'
run: |
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}")"
# Diff the PR branch against its merge base so recent base-branch commits
# do not masquerade as changes made by the PR itself.
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")"
if printf '%s\n' "$changed" | grep -qx 'pnpm-lock.yaml'; then
echo "Do not commit pnpm-lock.yaml in pull requests. CI owns lockfile updates."
exit 1
@@ -41,54 +43,33 @@ jobs:
node-version: 24
- name: Validate Dockerfile deps stage
run: node ./scripts/check-docker-deps-stage.mjs
- name: Reject git push in adapter/runtime code
run: node ./scripts/check-no-git-push.mjs
- name: Test no-git-push check
run: node --test ./scripts/check-no-git-push.test.mjs
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Verify release package bootstrap for changed manifests
run: |
missing=0
# Extract only the deps stage from the Dockerfile
deps_stage="$(awk '/^FROM .* AS deps$/{found=1; next} found && /^FROM /{exit} found{print}' Dockerfile)"
if [ -z "$deps_stage" ]; then
echo "::error::Could not extract deps stage from Dockerfile (expected 'FROM ... AS deps')"
exit 1
fi
# Derive workspace search roots from pnpm-workspace.yaml (exclude dev-only packages)
search_roots="$(grep '^ *- ' pnpm-workspace.yaml | sed 's/^ *- //' | sed 's/\*$//' | grep -v 'examples' | grep -v 'create-paperclip-plugin' | tr '\n' ' ')"
if [ -z "$search_roots" ]; then
echo "::error::Could not derive workspace roots from pnpm-workspace.yaml"
exit 1
fi
# Check all workspace package.json files are copied in the deps stage
for pkg in $(find $search_roots -maxdepth 2 -name package.json -not -path '*/examples/*' -not -path '*/create-paperclip-plugin/*' -not -path '*/node_modules/*' 2>/dev/null | sort -u); do
dir="$(dirname "$pkg")"
if ! echo "$deps_stage" | grep -q "^COPY ${dir}/package.json"; then
echo "::error::Dockerfile deps stage missing: COPY ${pkg} ${dir}/"
missing=1
fi
done
# Check patches directory is copied if it exists
if [ -d patches ] && ! echo "$deps_stage" | grep -q '^COPY patches/'; then
echo "::error::Dockerfile deps stage missing: COPY patches/ patches/"
missing=1
fi
if [ "$missing" -eq 1 ]; then
echo "Dockerfile deps stage is out of sync. Update it to include the missing files."
exit 1
fi
mapfile -t changed_paths < <(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")
PAPERCLIP_RELEASE_BOOTSTRAP_BASE_SHA="${{ github.event.pull_request.base.sha }}" \
node ./scripts/check-release-package-bootstrap.mjs "${changed_paths[@]}"
- name: Validate dependency resolution when manifests change
run: |
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}")"
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")"
manifest_pattern='(^|/)package\.json$|^pnpm-workspace\.yaml$|^\.npmrc$|^pnpmfile\.(cjs|js|mjs)$'
if printf '%s\n' "$changed" | grep -Eq "$manifest_pattern"; then
pnpm install --lockfile-only --ignore-scripts --no-frozen-lockfile
fi
verify:
typecheck_release_registry:
name: Typecheck + Release Registry
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
@@ -111,16 +92,165 @@ jobs:
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Typecheck
run: pnpm -r typecheck
- name: Typecheck workspaces whose build scripts skip TypeScript
run: pnpm run typecheck:build-gaps
- name: Run tests
run: pnpm test:run
- name: Verify release registry test coverage
run: pnpm run test:release-registry
general_tests:
name: General tests (${{ matrix.group_label }})
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
strategy:
fail-fast: false
matrix:
include:
- group: general-server
group_label: server
- group: general-workspaces-a
group_label: workspaces-a
- group: general-workspaces-b
group_label: workspaces-b
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Run grouped general test suites
run: pnpm test:run:general -- --group '${{ matrix.group }}'
verify:
# Preserve the legacy required-check name while the underlying work runs in parallel.
name: verify
if: ${{ always() }}
needs: [typecheck_release_registry, general_tests, build]
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Fail if any split verify lane failed
env:
TYPECHECK_RELEASE_REGISTRY_RESULT: ${{ needs.typecheck_release_registry.result }}
GENERAL_TESTS_RESULT: ${{ needs.general_tests.result }}
BUILD_RESULT: ${{ needs.build.result }}
run: |
test "$TYPECHECK_RELEASE_REGISTRY_RESULT" = "success"
test "$GENERAL_TESTS_RESULT" = "success"
test "$BUILD_RESULT" = "success"
build:
name: Build
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Build
run: pnpm build
- name: Release canary dry run
verify_serialized_server:
name: Verify serialized server suites (${{ matrix.shard_label }})
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
strategy:
fail-fast: false
matrix:
include:
- shard_index: 0
shard_count: 4
shard_label: 1/4
- shard_index: 1
shard_count: 4
shard_label: 2/4
- shard_index: 2
shard_count: 4
shard_label: 3/4
- shard_index: 3
shard_count: 4
shard_label: 4/4
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Run serialized server test shard
run: pnpm test:run:serialized -- --shard-index ${{ matrix.shard_index }} --shard-count ${{ matrix.shard_count }}
canary_dry_run:
name: Canary Dry Run
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
# `release.sh` always executes its Step 2/7 workspace build, even when
# `--skip-verify` bypasses the initial verification gate.
- name: Release canary dry run via release.sh internal build
run: |
git checkout -B master HEAD
git checkout -- pnpm-lock.yaml
@@ -149,11 +279,11 @@ jobs:
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Build
run: pnpm build
- name: Install Playwright
run: npx playwright install --with-deps chromium
- name: Verify runner Chrome
# GitHub's Ubuntu runner image already ships Google Chrome, so use that
# directly for the headless e2e lane instead of downloading Playwright
# browser bundles inside the 30 minute job budget.
run: google-chrome --version
- name: Generate Paperclip config
run: |
@@ -173,6 +303,7 @@ jobs:
- name: Run e2e tests
env:
PAPERCLIP_E2E_SKIP_LLM: "true"
PAPERCLIP_PLAYWRIGHT_CHANNEL: "chrome"
run: pnpm run test:e2e
- name: Upload Playwright report
+5 -2
View File
@@ -58,8 +58,10 @@ jobs:
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
- name: Install Playwright browser
run: npx playwright install --with-deps chromium
- name: Verify runner Chrome
# Release smoke also runs headless on GitHub's Ubuntu image, so use the
# runner's preinstalled Chrome instead of a Playwright browser download.
run: google-chrome --version
- name: Launch Docker smoke harness
run: |
@@ -89,6 +91,7 @@ jobs:
PAPERCLIP_RELEASE_SMOKE_BASE_URL: ${{ env.SMOKE_BASE_URL }}
PAPERCLIP_RELEASE_SMOKE_EMAIL: ${{ env.SMOKE_ADMIN_EMAIL }}
PAPERCLIP_RELEASE_SMOKE_PASSWORD: ${{ env.SMOKE_ADMIN_PASSWORD }}
PAPERCLIP_PLAYWRIGHT_CHANNEL: "chrome"
run: pnpm run test:release-smoke
- name: Capture Docker logs
+12
View File
@@ -50,6 +50,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
@@ -89,6 +92,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
@@ -139,6 +145,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
@@ -177,6 +186,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
+7
View File
@@ -1,5 +1,9 @@
node_modules
node_modules/
**/node_modules
**/node_modules/
dist/
ui/storybook-static/
.env
*.tsbuildinfo
drizzle/meta/
@@ -32,6 +36,7 @@ server/src/**/*.d.ts
server/src/**/*.d.ts.map
tmp/
feedback-export-*
diagnostics/
# Editor / tool temp files
*.tmp
@@ -50,4 +55,6 @@ tests/e2e/playwright-report/
tests/release-smoke/test-results/
tests/release-smoke/playwright-report/
.superset/
.superpowers/
.claude/worktrees/
.herenow
+3 -1
View File
@@ -1 +1,3 @@
Dotta <bippadotta@protonmail.com> Forgotten <forgottenrunes@protonmail.com>
Dotta <bippadotta@protonmail.com> <34892728+cryppadotta@users.noreply.github.com>
Dotta <bippadotta@protonmail.com> <forgottenrunes@protonmail.com>
Dotta <bippadotta@protonmail.com> <dotta@example.com>
+18 -1
View File
@@ -108,7 +108,24 @@ Notes:
## 7. Verification Before Hand-off
Run this full check before claiming done:
Default local/agent test path:
```sh
pnpm test
```
This is the cheap default and only runs the Vitest suite. Browser suites stay opt-in:
```sh
pnpm test:e2e
pnpm test:release-smoke
```
Run the browser suites only when your change touches them or when you are explicitly verifying CI/release flows.
For normal issue work, run the smallest relevant verification first. Do not default to repo-wide typecheck/build/test on every heartbeat when a narrower check is enough to prove the change.
Run this full check before claiming repo work done in a PR-ready hand-off, or when the change scope is broad enough that targeted checks are not sufficient:
```sh
pnpm -r typecheck
+43
View File
@@ -0,0 +1,43 @@
# Paperclip fork — farhoodlabs
This is a thin fork of [paperclipai/paperclip](https://github.com/paperclipai/paperclip).
Fork repo: https://git.farh.net/farhoodlabs/paperclip
## Branch model
| Branch | Purpose |
|---|---|
| `master` | Pure mirror of `upstream/master`. No fork files. Sync via `git push origin upstream/master:master --force-with-lease`. |
| `dev` | `master` + one fork commit (Dockerfile prod stage + 2 build workflows). Builds `git.farh.net/farhoodlabs/paperclip-dev:*` on push. |
| `local` | **Deployed branch.** Same content as `dev`. Builds `git.farh.net/farhoodlabs/paperclip:*` on push. |
The fork tree differs from upstream by exactly **3 files**:
```
Dockerfile (production stage adds kubectl, kubeseal, uv, forgejo CLIs, tea, mmx-cli, nano, vim)
.github/workflows/build-prod.yml (pushes to git.farh.net/farhoodlabs/paperclip)
.github/workflows/build-dev.yml (pushes to git.farh.net/farhoodlabs/paperclip-dev)
```
The base/deps/build stages of the Dockerfile match upstream verbatim so upstream changes apply cleanly.
## Sync upstream
```bash
git fetch upstream
git push origin upstream/master:master --force-with-lease
git checkout dev && git merge master && git push origin dev
git checkout local && git merge dev && git push origin local
```
Conflicts should only ever appear on `Dockerfile` itself (if upstream changes the production stage). Resolution rule: keep upstream's deps/base/build stages exactly; preserve the fork's `RUN` block in the production stage.
## Deployment
Production runs in Kubernetes (`paperclip` namespace, single replica). Image: `git.farh.net/farhoodlabs/paperclip:<tag>`. Flux does not watch moving tags — rolling a fix means either pushing a semver-tagged release or `kubectl rollout restart deploy/paperclip -n paperclip`.
## Don't
- **Don't add fork code changes.** This fork is intentionally minimal after the 2026-05-31 reset (event-loop starvation bug from accumulated drift). If a feature is missing relative to a prior fork iteration (Gitea-hosted skills, PAT support for private skill repos, secret export/import, k8s sandbox-provider plugin, agentId threading), surface the regression — don't pull it back from `git log` without explicit go-ahead.
- **Don't commit to `local` without going through `dev` first** (and through `master` for upstream syncs). The promotion order is enforced.
- **Don't recreate `.farhoodlabs/` overlay or `assemble-local.yml`.** That model was retired.
+15
View File
@@ -51,6 +51,21 @@ All tests must pass before a PR can be merged. Run them locally first and verify
We use [Greptile](https://greptile.com) for automated code review. Your PR must achieve a **5/5 Greptile score** with **all Greptile comments addressed** before it can be merged. If Greptile leaves comments, fix or respond to each one and request a re-review.
## Feature Contributions
We actively manage the core Paperclip feature roadmap.
Uncoordinated feature PRs against the core product may be closed, even when the implementation is thoughtful and high quality. That is about roadmap ownership, product coherence, and long-term maintenance commitment, not a judgment about the effort.
If you want to contribute a feature:
- Check [ROADMAP.md](ROADMAP.md) first
- Start the discussion in Discord -> `#dev` before writing code
- If the idea fits as an extension, prefer building it with the [plugin system](doc/plugins/PLUGIN_SPEC.md)
- If you want to show a possible direction, reference implementations are welcome as feedback, but they generally will not be merged directly into core
Bugs, docs improvements, and small targeted improvements are still the easiest path to getting merged, and we really do appreciate them.
## General Rules (both paths)
- Write clear commit messages
+34 -10
View File
@@ -1,16 +1,9 @@
# syntax=docker/dockerfile:1.20
FROM node:lts-trixie-slim AS base
ARG USER_UID=1000
ARG USER_GID=1000
RUN apt-get update \
&& apt-get install -y --no-install-recommends ca-certificates gosu curl git wget ripgrep python3 \
&& mkdir -p -m 755 /etc/apt/keyrings \
&& wget -nv -O/etc/apt/keyrings/githubcli-archive-keyring.gpg https://cli.github.com/packages/githubcli-archive-keyring.gpg \
&& echo "20e0125d6f6e077a9ad46f03371bc26d90b04939fb95170f5a1905099cc6bcc0 /etc/apt/keyrings/githubcli-archive-keyring.gpg" | sha256sum -c - \
&& chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \
&& mkdir -p -m 755 /etc/apt/sources.list.d \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" > /etc/apt/sources.list.d/github-cli.list \
&& apt-get update \
&& apt-get install -y --no-install-recommends gh \
&& apt-get install -y --no-install-recommends ca-certificates gosu curl gh git wget ripgrep python3 \
&& rm -rf /var/lib/apt/lists/* \
&& corepack enable
@@ -29,15 +22,24 @@ COPY packages/shared/package.json packages/shared/
COPY packages/db/package.json packages/db/
COPY packages/adapter-utils/package.json packages/adapter-utils/
COPY packages/mcp-server/package.json packages/mcp-server/
COPY packages/skills-catalog/package.json packages/skills-catalog/
COPY packages/adapters/acpx-local/package.json packages/adapters/acpx-local/
COPY packages/adapters/claude-local/package.json packages/adapters/claude-local/
COPY packages/adapters/codex-local/package.json packages/adapters/codex-local/
COPY packages/adapters/cursor-cloud/package.json packages/adapters/cursor-cloud/
COPY packages/adapters/cursor-local/package.json packages/adapters/cursor-local/
COPY packages/adapters/gemini-local/package.json packages/adapters/gemini-local/
COPY packages/adapters/grok-local/package.json packages/adapters/grok-local/
COPY packages/adapters/openclaw-gateway/package.json packages/adapters/openclaw-gateway/
COPY packages/adapters/opencode-local/package.json packages/adapters/opencode-local/
COPY packages/adapters/pi-local/package.json packages/adapters/pi-local/
COPY packages/plugins/sdk/package.json packages/plugins/sdk/
COPY --parents packages/plugins/sandbox-providers/./*/package.json packages/plugins/sandbox-providers/
COPY packages/plugins/paperclip-plugin-fake-sandbox/package.json packages/plugins/paperclip-plugin-fake-sandbox/
COPY packages/plugins/plugin-llm-wiki/package.json packages/plugins/plugin-llm-wiki/
COPY packages/plugins/plugin-workspace-diff/package.json packages/plugins/plugin-workspace-diff/
COPY patches/ patches/
COPY scripts/link-plugin-dev-sdk.mjs scripts/
RUN pnpm install --frozen-lockfile
@@ -55,7 +57,29 @@ ARG USER_UID=1000
ARG USER_GID=1000
WORKDIR /app
COPY --chown=node:node --from=build /app /app
RUN npm install --global --omit=dev @anthropic-ai/claude-code@latest @openai/codex@latest opencode-ai \
# Fork additions: kubectl, kubeseal, uv, forgejo CLIs, gitea tea CLI, editor tools, mmx-cli
# Upstream installs: claude-code, codex, opencode-ai, openssh-client, jq
RUN apt-get update \
&& apt-get install -y --no-install-recommends openssh-client jq nano vim \
&& rm -rf /var/lib/apt/lists/* \
&& curl -fsSL https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl \
&& chmod +x /usr/local/bin/kubectl \
&& curl -fsSL https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.36.6/kubeseal-0.36.6-linux-amd64.tar.gz | tar -xzf - -C /tmp \
&& mv /tmp/kubeseal /usr/local/bin/kubeseal \
&& rm -rf /tmp/kubeseal /tmp/LICENSE /tmp/README.md \
&& curl -LsSf https://astral.sh/uv/install.sh | sh \
&& mv /root/.local/bin/uv /usr/local/bin/uv \
&& mv /root/.local/bin/uvx /usr/local/bin/uvx \
&& curl -fsSL https://codeberg.org/forgejo-contrib/forgejo-cli/releases/download/v0.4.1/forgejo-cli-linux.tar.gz | tar -xzf - -C /usr/local/bin \
&& chmod +x /usr/local/bin/fj \
&& curl -fsSL https://github.com/JKamsker/forgejo-cli-ex/releases/download/v0.1.7/fj-ex-linux-x86_64.tar.gz | tar -xzf - -C /usr/local/bin \
&& chmod +x /usr/local/bin/fj-ex \
&& curl -fsSL https://codeberg.org/romaintb/fgj/releases/download/v0.3.0/fgj_linux_amd64 -o /usr/local/bin/fgj \
&& chmod +x /usr/local/bin/fgj \
&& curl -fsSL https://dl.gitea.com/tea/0.14.0/tea-0.14.0-linux-amd64 -o /usr/local/bin/tea \
&& chmod +x /usr/local/bin/tea \
&& npm install --global --omit=dev @anthropic-ai/claude-code@latest @openai/codex@latest opencode-ai \
&& npm install --global --omit=dev mmx-cli \
&& mkdir -p /paperclip \
&& chown node:node /paperclip
+155 -30
View File
@@ -1,12 +1,14 @@
<p align="center">
<img src="doc/assets/header.png" alt="Paperclip — runs your business" width="720" />
<img src="doc/assets/banner.jpg" alt="Paperclip is the app people use to manage AI agents for work." width="720" />
</p>
<p align="center">
<a href="#quickstart"><strong>Quickstart</strong></a> &middot;
<a href="https://paperclip.ing/docs"><strong>Docs</strong></a> &middot;
<a href="https://github.com/paperclipai/paperclip"><strong>GitHub</strong></a> &middot;
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a>
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a> &middot;
<a href="https://x.com/papercliping"><strong>Twitter</strong></a> &middot;
<a href="https://paperclip.ing"><strong>Website</strong></a>
</p>
<p align="center">
@@ -23,15 +25,15 @@
<br/>
## What is Paperclip?
# Paperclip is the app people use to manage AI agents for work.
# Open-source orchestration for zero-human companies
Open-source orchestration for teams of AI agents.
**If OpenClaw is an _employee_, Paperclip is the _company_**
**If OpenClaw is an _employee_, Paperclip is the _company_.**
Paperclip is a Node.js server and React UI that orchestrates a team of AI agents to run a business. Bring your own agents, assign goals, and track your agents' work and costs from one dashboard.
Paperclip is a Node.js server and React UI that orchestrates a team of AI agents to run a business. Bring your own agents, assign goals, and track work and costs from one dashboard.
It looks like a task manager — but under the hood it has org charts, budgets, governance, goal alignment, and agent coordination.
It looks like a task manager. Under the hood: org charts, budgets, governance, goal alignment, and agent coordination.
**Manage business goals, not pull requests.**
@@ -43,10 +45,6 @@ It looks like a task manager — but under the hood it has org charts, budgets,
<br/>
> **COMING SOON: Clipmart** — Download and run entire companies with one click. Browse pre-built company templates — full org structures, agent configs, and skills — and import them into your Paperclip instance in seconds.
<br/>
<div align="center">
<table>
<tr>
@@ -112,7 +110,7 @@ Every conversation traced. Every decision explained. Full tool-call tracing and
<tr>
<td align="center">
<h3>🛡️ Governance</h3>
You're the board. Approve hires, override strategy, pause or terminate any agent — at any time.
Approve hires, override strategy, pause or terminate any agent — at any time.
</td>
<td align="center">
<h3>📊 Org Chart</h3>
@@ -156,6 +154,115 @@ Paperclip handles the hard orchestration details correctly.
<br/>
## What's Under the Hood
Paperclip is a full control plane, not a wrapper. Before you build any of this yourself, know that it already exists:
```
┌──────────────────────────────────────────────────────────────┐
│ PAPERCLIP SERVER │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │Identity & │ │ Work & │ │ Heartbeat │ │Governance │ │
│ │ Access │ │ Tasks │ │ Execution │ │& Approvals│ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Org Chart │ │Workspaces │ │ Plugins │ │ Budget │ │
│ │ & Agents │ │ & Runtime │ │ │ │ & Costs │ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Routines │ │ Secrets & │ │ Activity │ │ Company │ │
│ │& Schedules│ │ Storage │ │ & Events │ │Portability│ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
└──────────────────────────────────────────────────────────────┘
▲ ▲ ▲ ▲
┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐
│ Claude │ │ Codex │ │ CLI │ │ HTTP/web │
│ Code │ │ │ │ agents │ │ bots │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
```
### The Systems
<table>
<tr>
<td width="50%">
**Identity & Access** — Two deployment modes (trusted local or authenticated), board users, agent API keys, short-lived run JWTs, company memberships, invite flows, and OpenClaw onboarding. Every mutating request is traced to an actor.
</td>
<td width="50%">
**Org Chart & Agents** — Agents have roles, titles, reporting lines, permissions, and budgets. Adapter examples match the diagram: Claude Code, Codex, CLI agents such as Cursor/Gemini/bash, HTTP/webhook bots such as OpenClaw, and external adapter plugins. If it can receive a heartbeat, it's hired.
</td>
</tr>
<tr>
<td>
**Work & Task System** — Issues carry company/project/goal/parent links, atomic checkout with execution locks, first-class blocker dependencies, comments, documents, attachments, work products, labels, and inbox state. No double-work, no lost context.
</td>
<td>
**Heartbeat Execution** — DB-backed wakeup queue with coalescing, budget checks, workspace resolution, secret injection, skill loading, and adapter invocation. Runs produce structured logs, cost events, session state, and audit trails. Recovery handles orphaned runs automatically.
</td>
</tr>
<tr>
<td>
**Workspaces & Runtime** — Project workspaces, isolated execution workspaces (git worktrees, operator branches), and runtime services (dev servers, preview URLs). Agents work in the right directory with the right context every time.
</td>
<td>
**Governance & Approvals** — Board approval workflows, execution policies with review/approval stages, decision tracking, budget hard-stops, agent pause/resume/terminate, and full audit logging. Nothing ships without your sign-off.
</td>
</tr>
<tr>
<td>
**Budget & Cost Control** — Token and cost tracking by company, agent, project, goal, issue, provider, and model. Scoped budget policies with warning thresholds and hard stops. Overspend pauses agents and cancels queued work automatically.
</td>
<td>
**Routines & Schedules** — Recurring tasks with cron, webhook, and API triggers. Concurrency and catch-up policies. Each routine execution creates a tracked issue and wakes the assigned agent — no manual kick-offs needed.
</td>
</tr>
<tr>
<td>
**Plugins** — Instance-wide plugin system with out-of-process workers, capability-gated host services, job scheduling, tool exposure, and UI contributions. Extend Paperclip without forking it.
</td>
<td>
**Secrets & Storage** — Instance and company secrets, encrypted local storage, provider-backed object storage, attachments, and work products. Sensitive values stay out of prompts unless a scoped run explicitly needs them.
</td>
</tr>
<tr>
<td>
**Activity & Events** — Mutating actions, heartbeat state changes, cost events, approvals, comments, and work products are recorded as durable activity so operators can audit what happened and why.
</td>
<td>
**Company Portability** — Export and import entire organizations — agents, skills, projects, routines, and issues — with secret scrubbing and collision handling. One deployment, many companies, complete data isolation.
</td>
</tr>
</table>
<br/>
## What Paperclip is not
| | |
@@ -177,6 +284,14 @@ Open source. Self-hosted. No Paperclip account required.
npx paperclipai onboard --yes
```
That quickstart path now defaults to trusted local loopback mode for the fastest first run. To start in authenticated/private mode instead, choose a bind preset explicitly:
```bash
npx paperclipai onboard --yes --bind lan
# or:
npx paperclipai onboard --yes --bind tailnet
```
If you already have Paperclip configured, rerunning `onboard` keeps the existing config in place. Use `paperclipai configure` to edit settings.
Or manually:
@@ -199,7 +314,7 @@ This starts the API server at `http://localhost:3100`. An embedded PostgreSQL da
**What does a typical setup look like?**
Locally, a single Node.js process manages an embedded Postgres and local file storage. For production, point it at your own Postgres and deploy however you like. Configure projects, agents, and goals — the agents take care of the rest.
If you're a solo-entreprenuer you can use Tailscale to access Paperclip on the go. Then later you can deploy to e.g. Vercel when you need it.
If you're a solo entrepreneur you can use Tailscale to access Paperclip on the go. Then later you can deploy to e.g. Vercel when you need it.
**Can I run multiple companies?**
Yes. A single deployment can run an unlimited number of companies with complete data isolation.
@@ -225,11 +340,15 @@ pnpm dev:once # Full dev without file watching
pnpm dev:server # Server only
pnpm build # Build all
pnpm typecheck # Type checking
pnpm test:run # Run tests
pnpm test # Cheap default test run (Vitest only)
pnpm test:watch # Vitest watch mode
pnpm test:e2e # Playwright browser suite
pnpm db:generate # Generate DB migration
pnpm db:migrate # Apply migrations
```
`pnpm test` does not run Playwright. Browser suites stay separate and are typically run only when working on those flows or in CI.
See [doc/DEVELOPING.md](doc/DEVELOPING.md) for the full development guide.
<br/>
@@ -243,14 +362,23 @@ See [doc/DEVELOPING.md](doc/DEVELOPING.md) for the full development guide.
- ✅ Skills Manager
- ✅ Scheduled Routines
- ✅ Better Budgeting
- Artifacts & Deployments
- ⚪ CEO Chat
- ⚪ MAXIMIZER MODE
- ⚪ Multiple Human Users
- Agent Reviews and Approvals
- ✅ Multiple Human Users
- ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
- ⚪ Artifacts & Work Products
- ⚪ Memory / Knowledge
- ⚪ Enforced Outcomes
- ⚪ MAXIMIZER MODE
- ⚪ Deep Planning
- ⚪ Work Queues
- ⚪ Self-Organization
- ⚪ Automatic Organizational Learning
- ⚪ CEO Chat
- ⚪ Cloud deployments
- ⚪ Desktop App
This is the short roadmap preview. See the full roadmap in [ROADMAP.md](ROADMAP.md).
<br/>
## Community & Plugins
@@ -263,12 +391,12 @@ Paperclip collects anonymous usage telemetry to help us understand how the produ
Telemetry is **enabled by default** and can be disabled with any of the following:
| Method | How |
|---|---|
| Environment variable | `PAPERCLIP_TELEMETRY_DISABLED=1` |
| Standard convention | `DO_NOT_TRACK=1` |
| CI environments | Automatically disabled when `CI=true` |
| Config file | Set `telemetry.enabled: false` in your Paperclip config |
| Method | How |
| -------------------- | ------------------------------------------------------- |
| Environment variable | `PAPERCLIP_TELEMETRY_DISABLED=1` |
| Standard convention | `DO_NOT_TRACK=1` |
| CI environments | Automatically disabled when `CI=true` |
| Config file | Set `telemetry.enabled: false` in your Paperclip config |
## Contributing
@@ -279,6 +407,7 @@ We welcome contributions. See the [contributing guide](CONTRIBUTING.md) for deta
## Community
- [Discord](https://discord.gg/m4HZY7xNG3) — Join the community
- [Twitter / X](https://x.com/papercliping) — Follow updates and announcements
- [GitHub Issues](https://github.com/paperclipai/paperclip/issues) — bugs and feature requests
- [GitHub Discussions](https://github.com/paperclipai/paperclip/discussions) — ideas and RFC
@@ -286,7 +415,7 @@ We welcome contributions. See the [contributing guide](CONTRIBUTING.md) for deta
## License
MIT &copy; 2026 Paperclip
MIT &copy; 2026 [Paperclip Labs, Inc](https://paperclip.ing)
## Star History
@@ -297,9 +426,5 @@ MIT &copy; 2026 Paperclip
---
<p align="center">
<img src="doc/assets/footer.jpg" alt="" width="720" />
</p>
<p align="center">
<sub>Open source under MIT. Built for people who want to run companies, not babysit agents.</sub>
<sub>Open source under MIT. Built for people who want to get work done, not babysit agents.</sub>
</p>
+97
View File
@@ -0,0 +1,97 @@
# Roadmap
This document expands the roadmap preview in `README.md`.
Paperclip is still moving quickly. The list below is directional, not promised, and priorities may shift as we learn from users and from operating real AI companies with the product.
We value community involvement and want to make sure contributor energy goes toward areas where it can land.
We may accept contributions in the areas below, but if you want to work on roadmap-level core features, please coordinate with us first in Discord (`#dev`) before writing code. Bugs, docs, polish, and tightly scoped improvements are still the easiest contributions to merge.
If you want to extend Paperclip today, the best path is often the [plugin system](doc/plugins/PLUGIN_SPEC.md). Community reference implementations are also useful feedback even when they are not merged directly into core.
## Milestones
### ✅ Plugin system
Paperclip should keep a thin core and rich edges. Plugins are the path for optional capabilities like knowledge bases, custom tracing, queues, doc editors, and other product-specific surfaces that do not need to live in the control plane itself.
### ✅ Get OpenClaw / claw-style agent employees
Paperclip should be able to hire and manage real claw-style agent workers, not just a narrow built-in runtime. This is part of the larger "bring your own agent" story and keeps the control plane useful across different agent ecosystems.
### ✅ companies.sh - import and export entire organizations
Reusable companies matter. Import/export is the foundation for moving org structures, agent definitions, and reusable company setups between environments and eventually for broader company-template distribution.
### ✅ Easy AGENTS.md configurations
Agent setup should feel repo-native and legible. Simple `AGENTS.md`-style configuration lowers the barrier to getting an agent team running and makes it easier for contributors to understand how a company is wired together.
### ✅ Skills Manager
Agents need a practical way to discover, install, and use skills without every setup becoming bespoke. The skills layer is part of making Paperclip companies more reusable and easier to operate.
### ✅ Scheduled Routines
Recurring work should be native. Routine tasks like reports, reviews, and other periodic work need first-class scheduling so the company keeps operating even when no human is manually kicking work off.
### ✅ Better Budgeting
Budgets are a core control-plane feature, not an afterthought. Better budgeting means clearer spend visibility, safer hard stops, and better operator control over how autonomy turns into real cost.
### ✅ Agent Reviews and Approvals
Paperclip should support explicit review and approval stages as first-class workflow steps, not just ad hoc comments. That means reviewer routing, approval gates, change requests, and durable audit trails that fit the same task model as the rest of the control plane.
### ✅ Multiple Human Users
Paperclip needs a clearer path from solo operator to real human teams. That means shared board access, safer collaboration, and a better model for several humans supervising the same autonomous company.
### ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
We want agents to run in more remote and sandboxed environments while preserving the same Paperclip control-plane model. This makes the system safer, more flexible, and more useful outside a single trusted local machine.
### ⚪ Artifacts & Work Products
Paperclip should make outputs first-class. That means generated artifacts, previews, deployable outputs, and the handoff from "agent did work" to "here is the result" should become more visible and easier to operate.
### ⚪ Memory / Knowledge
We want a stronger memory and knowledge surface for companies, agents, and projects. That includes durable memory, better recall of prior decisions and context, and a clearer path for knowledge-style capabilities without turning Paperclip into a generic chat app.
### ⚪ Enforced Outcomes
Paperclip should get stricter about what counts as finished work. Tasks, approvals, and execution flows should resolve to clear outcomes like merged code, published artifacts, shipped docs, or explicit decisions instead of stopping at vague status updates.
### ⚪ MAXIMIZER MODE
This is the direction for higher-autonomy execution: more aggressive delegation, deeper follow-through, and stronger operating loops with clear budgets, visibility, and governance. The point is not hidden autonomy; the point is more output per human supervisor.
### ⚪ Deep Planning
Some work needs more than a task description before execution starts. Deeper planning means stronger issue documents, revisionable plans, and clearer review loops for strategy-heavy work before agents begin execution.
### ⚪ Work Queues
Paperclip should support queue-style work streams for repeatable inputs like support, triage, review, and backlog intake. That would make it easier to route work continuously without turning every system into a one-off workflow.
### ⚪ Self-Organization
As companies grow, agents should be able to propose useful structural changes such as role adjustments, delegation changes, and new recurring routines. The goal is adaptive organizations that still stay within governance and approval boundaries.
### ⚪ Automatic Organizational Learning
Paperclip should get better at turning completed work into reusable organizational knowledge. That includes capturing playbooks, recurring fixes, and decision patterns so future work starts from what the company has already learned.
### ⚪ CEO Chat
We want a lighter-weight way to talk to leadership agents, but those conversations should still resolve to real work objects like plans, issues, approvals, or decisions. This should improve interaction without changing the core task-and-comments model.
### ⚪ Cloud deployments
Local-first remains important, but Paperclip also needs a cleaner shared deployment story. Teams should be able to run the same product in hosted or semi-hosted environments without changing the mental model.
### ⚪ Desktop App
A desktop app can make Paperclip feel more accessible and persistent for day-to-day operators. The goal is easier access, better local ergonomics, and a smoother default experience for users who want the control plane always close at hand.
+8
View File
@@ -0,0 +1,8 @@
# Security Policy
## Reporting a Vulnerability
Please report security vulnerabilities through GitHub's Security Advisory feature:
[https://github.com/paperclipai/paperclip/security/advisories/new](https://github.com/paperclipai/paperclip/security/advisories/new)
Do not open public issues for security vulnerabilities.
+33 -4
View File
@@ -6,13 +6,14 @@
<a href="#quickstart"><strong>Quickstart</strong></a> &middot;
<a href="https://paperclip.ing/docs"><strong>Docs</strong></a> &middot;
<a href="https://github.com/paperclipai/paperclip"><strong>GitHub</strong></a> &middot;
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a>
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a> &middot;
<a href="https://x.com/papercliping"><strong>Twitter</strong></a>
</p>
<p align="center">
<a href="https://github.com/paperclipai/paperclip/blob/master/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue" alt="MIT License" /></a>
<a href="https://github.com/paperclipai/paperclip/stargazers"><img src="https://img.shields.io/github/stars/paperclipai/paperclip?style=flat" alt="Stars" /></a>
<a href="https://discord.gg/m4HZY7xNG3"><img src="https://img.shields.io/badge/discord-join%20chat-5865F2?logo=discord&logoColor=white" alt="Discord" /></a>
<a href="https://discord.gg/m4HZY7xNG3"><img src="https://img.shields.io/discord/000000000?label=discord" alt="Discord" /></a>
</p>
<br/>
@@ -177,6 +178,14 @@ Open source. Self-hosted. No Paperclip account required.
npx paperclipai onboard --yes
```
That quickstart path now defaults to trusted local loopback mode for the fastest first run. To start in authenticated/private mode instead, choose a bind preset explicitly:
```bash
npx paperclipai onboard --yes --bind lan
# or:
npx paperclipai onboard --yes --bind tailnet
```
If you already have Paperclip configured, rerunning `onboard` keeps the existing config in place. Use `paperclipai configure` to edit settings.
Or manually:
@@ -217,6 +226,21 @@ By default, agents run on scheduled heartbeats and event-based triggers (task as
<br/>
## Paperclip Cloud Sync
Cloud upstream sync is behind the `Cloud Sync` experimental setting. Enable it in Instance Settings before pushing.
```bash
paperclipai cloud connect https://your-stack.paperclip.app
paperclipai cloud connect https://your-stack.paperclip.app --no-browser
paperclipai cloud push --company <local-company-id> --dry-run
paperclipai cloud push --company <local-company-id>
```
`cloud connect` authorizes the local instance against the target stack and stores the upstream token in the local instance secret store. The default path opens a browser for consent; `--no-browser` uses the device-code flow and prints the verification URL and user code.
`cloud push --dry-run` exports the selected local company, sends a preview bundle to the connected Cloud stack, and exits with code `2` when conflicts need user resolution. A schema mismatch exits with code `3`. Running without `--dry-run` stages chunks idempotently, applies the run, and prints the final summary and recent progress events.
## Development
```bash
@@ -225,11 +249,15 @@ pnpm dev:once # Full dev without file watching
pnpm dev:server # Server only
pnpm build # Build all
pnpm typecheck # Type checking
pnpm test:run # Run tests
pnpm test # Cheap default test run (Vitest only)
pnpm test:watch # Vitest watch mode
pnpm test:e2e # Playwright browser suite
pnpm db:generate # Generate DB migration
pnpm db:migrate # Apply migrations
```
`pnpm test` does not run Playwright. Browser suites stay separate and are typically run only when working on those flows or in CI.
See [doc/DEVELOPING.md](https://github.com/paperclipai/paperclip/blob/master/doc/DEVELOPING.md) for the full development guide.
<br/>
@@ -246,7 +274,7 @@ See [doc/DEVELOPING.md](https://github.com/paperclipai/paperclip/blob/master/doc
- ⚪ Artifacts & Deployments
- ⚪ CEO Chat
- ⚪ MAXIMIZER MODE
- Multiple Human Users
- Multiple Human Users
- ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
- ⚪ Cloud deployments
- ⚪ Desktop App
@@ -266,6 +294,7 @@ We welcome contributions. See the [contributing guide](https://github.com/paperc
## Community
- [Discord](https://discord.gg/m4HZY7xNG3) — Join the community
- [Twitter / X](https://x.com/papercliping) — Follow updates and announcements
- [GitHub Issues](https://github.com/paperclipai/paperclip/issues) — bugs and feature requests
- [GitHub Discussions](https://github.com/paperclipai/paperclip/discussions) — ideas and RFC
+4 -1
View File
@@ -37,10 +37,13 @@
},
"dependencies": {
"@clack/prompts": "^0.10.0",
"@paperclipai/adapter-acpx-local": "workspace:*",
"@paperclipai/adapter-claude-local": "workspace:*",
"@paperclipai/adapter-codex-local": "workspace:*",
"@paperclipai/adapter-cursor-cloud": "workspace:*",
"@paperclipai/adapter-cursor-local": "workspace:*",
"@paperclipai/adapter-gemini-local": "workspace:*",
"@paperclipai/adapter-grok-local": "workspace:*",
"@paperclipai/adapter-opencode-local": "workspace:*",
"@paperclipai/adapter-pi-local": "workspace:*",
"@paperclipai/adapter-openclaw-gateway": "workspace:*",
@@ -48,7 +51,7 @@
"@paperclipai/db": "workspace:*",
"@paperclipai/server": "workspace:*",
"@paperclipai/shared": "workspace:*",
"drizzle-orm": "0.38.4",
"drizzle-orm": "0.45.2",
"dotenv": "^17.0.1",
"commander": "^13.1.0",
"embedded-postgres": "^18.1.0-beta.16",
+243
View File
@@ -0,0 +1,243 @@
import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import type { CompanyPortabilityExportResult } from "@paperclipai/shared";
import {
assertDiscoveryCompatible,
buildBundleFromLocalCompany,
cloudCommandExitCodes,
connectCloud,
resolveDeviceCodeExpiresAt,
} from "../commands/client/cloud.js";
import {
LocalUpstreamPushCoordinator,
normalizedContentHash,
type LocalUpstreamExportBundle,
} from "../commands/client/cloud-transfer.js";
import { getCloudConnection } from "../commands/client/cloud-store.js";
const originalEnv = { ...process.env };
const originalFetch = globalThis.fetch;
describe("cloud CLI helpers", () => {
let tempHome: string;
beforeEach(() => {
tempHome = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-cloud-cli-"));
process.env = { ...originalEnv, PAPERCLIP_HOME: tempHome };
});
afterEach(() => {
process.env = { ...originalEnv };
globalThis.fetch = originalFetch;
vi.restoreAllMocks();
fs.rmSync(tempHome, { recursive: true, force: true });
});
it("connects with the device-code flow and stores the resulting cloud connection", async () => {
globalThis.fetch = vi.fn(async (url, init) => {
const requestUrl = String(url);
if (requestUrl.endsWith("/.well-known/paperclip-upstream")) {
return jsonResponse(discovery());
}
if (requestUrl.endsWith("/api/upstream-sync/device-code")) {
expect(JSON.parse(String(init?.body))).toMatchObject({
stackId: "stack-1",
scopes: ["upstream_import:preview", "upstream_import:write", "upstream_import:read"],
});
return jsonResponse({
deviceCode: "device-1",
userCode: "ABCD-EFGH",
verificationUri: "https://cloud.example.test/api/upstream-sync/device-code/approve",
expiresAt: new Date(Date.now() + 60_000).toISOString(),
intervalSeconds: 0,
});
}
if (requestUrl.endsWith("/api/upstream-sync/token")) {
return jsonResponse({
accessToken: "upt_test",
scopes: ["upstream_import:preview"],
token: {
id: "token-1",
companyStackId: "stack-1",
targetOrigin: "https://cloud.example.test",
sourceInstanceId: "paperclip-local-default",
sourceInstanceFingerprint: "sha256:test",
scopes: ["upstream_import:preview"],
expiresAt: new Date(Date.now() + 60_000).toISOString(),
},
});
}
return jsonResponse({ error: "not_found" }, 404);
}) as typeof fetch;
const connection = await connectCloud("https://cloud.example.test", { noBrowser: true, json: true });
expect(connection.accessToken).toBe("upt_test");
expect(getCloudConnection("https://cloud.example.test")?.token.id).toBe("token-1");
});
it("hard-blocks incompatible transfer schema versions with the stable schema exit code", () => {
expect(() => assertDiscoveryCompatible(discovery({ supportedSchemaMajor: 99 }))).toThrow(/schema mismatch/i);
expect(cloudCommandExitCodes.schemaMismatch).toBe(3);
});
it("falls back to a bounded device-code expiry when the cloud omits or malforms expiresAt", () => {
const now = Date.UTC(2026, 4, 22, 13, 0, 0);
const validExpiry = "2026-05-22T13:05:00.000Z";
expect(resolveDeviceCodeExpiresAt(validExpiry, now)).toBe(Date.parse(validExpiry));
expect(resolveDeviceCodeExpiresAt(undefined, now)).toBe(now + 15 * 60_000);
expect(resolveDeviceCodeExpiresAt("not-a-date", now)).toBe(now + 15 * 60_000);
});
it("builds deterministic chunks with validated payload hashes", async () => {
const bundle = await buildTestBundle();
expect(bundle.chunks).toHaveLength(2);
expect(bundle.chunks[0]?.sha256).toBe(normalizedContentHash(bundle.chunks[0]?.payload));
expect(bundle.manifest.chunks[0]?.manifestHash).toBe(bundle.manifest.manifestHash);
expect(bundle.manifest.idempotencyKey).toBe((await buildTestBundle()).manifest.idempotencyKey);
});
it("reuses the same manifest and chunk identity when an interrupted apply is retried", async () => {
const bundle = await buildTestBundle();
const calls: Array<{ path: string; body: unknown }> = [];
const coordinator = new LocalUpstreamPushCoordinator({
targetOrigin: "https://cloud.example.test",
paperclipCompanyId: "target-company-1",
fetch: async (url, init) => {
const parsed = new URL(String(url));
const body = init?.body ? JSON.parse(String(init.body)) as unknown : {};
calls.push({ path: parsed.pathname, body });
if (parsed.pathname.endsWith("/runs")) return jsonResponse({ run: { id: "run-1" } });
return jsonResponse({ run: { id: "run-1" }, summary: { create: 0, update: 0, adopt: 0, skip: 2, conflict: 0, staleMapping: 0 } });
},
});
await coordinator.apply(bundle);
await coordinator.apply(bundle);
const runBodies = calls.filter((call) => call.path.endsWith("/runs")).map((call) => call.body as { manifest: { idempotencyKey: string } });
const chunkBodies = calls.filter((call) => call.path.endsWith("/chunks")).map((call) => call.body as { chunkIndex: number; sha256: string });
expect(runBodies).toHaveLength(2);
expect(runBodies[0]?.manifest.idempotencyKey).toBe(runBodies[1]?.manifest.idempotencyKey);
expect(chunkBodies[0]).toEqual(chunkBodies[2]);
expect(chunkBodies[1]).toEqual(chunkBodies[3]);
});
});
async function buildTestBundle(): Promise<LocalUpstreamExportBundle> {
return buildBundleFromLocalCompany({
localCompanyId: "local-company-1",
connection: {
id: "conn-1",
remoteUrl: "https://cloud.example.test",
targetOrigin: "https://cloud.example.test",
targetHost: "cloud.example.test",
stackId: "stack-1",
targetCompanyId: "target-company-1",
accessToken: "upt_test",
token: {
id: "token-1",
companyStackId: "stack-1",
targetOrigin: "https://cloud.example.test",
sourceInstanceId: "paperclip-local-default",
sourceInstanceFingerprint: "sha256:test",
scopes: ["upstream_import:preview"],
expiresAt: new Date(Date.now() + 60_000).toISOString(),
},
privateKeyPem: "unused",
sourcePublicKey: "unused",
sourceInstanceId: "paperclip-local-default",
sourceInstanceFingerprint: "sha256:test",
scopes: ["upstream_import:preview"],
createdAt: "2026-05-18T00:00:00.000Z",
updatedAt: "2026-05-18T00:00:00.000Z",
},
discovery: discovery(),
localApi: {
post: async <T>() => portabilityExport() as T,
},
maxEntitiesPerChunk: 1,
mode: "apply",
});
}
function discovery(overrides: Partial<{ supportedSchemaMajor: number }> = {}) {
return {
schema: "paperclip-upstream-discovery-v1",
stack: {
id: "stack-1",
slug: "cloud-test",
displayName: "Cloud Test",
companyId: "target-company-1",
origin: "https://cloud.example.test",
},
auth: {
deviceCode: {
deviceCodeUrl: "https://cloud.example.test/api/upstream-sync/device-code",
verificationUrl: "https://cloud.example.test/api/upstream-sync/device-code/approve",
tokenUrl: "https://cloud.example.test/api/upstream-sync/token",
},
scopes: ["upstream_import:preview", "upstream_import:write", "upstream_import:read"],
},
transfer: {
supportedSchemaMajor: overrides.supportedSchemaMajor ?? 1,
featureFlags: ["cloud_sync"],
},
};
}
function portabilityExport(): CompanyPortabilityExportResult {
return {
rootPath: ".",
paperclipExtensionPath: ".paperclip.yaml",
manifest: {
schemaVersion: 1,
generatedAt: "2026-05-18T00:00:00.000Z",
source: {
companyId: "local-company-1",
companyName: "Local Company",
},
includes: {
company: true,
agents: true,
projects: true,
issues: true,
skills: true,
},
company: {
path: "company.json",
name: "Local Company",
description: null,
brandColor: null,
logoPath: null,
attachmentMaxBytes: null,
requireBoardApprovalForNewAgents: false,
feedbackDataSharingEnabled: false,
feedbackDataSharingConsentAt: null,
feedbackDataSharingConsentByUserId: null,
feedbackDataSharingTermsVersion: null,
},
sidebar: null,
agents: [],
skills: [],
projects: [],
issues: [],
envInputs: [],
},
files: {
"README.md": "Local Company",
},
warnings: [],
};
}
function jsonResponse(body: unknown, status = 200): Response {
return new Response(JSON.stringify(body), {
status,
headers: { "Content-Type": "application/json" },
});
}
+1
View File
@@ -14,6 +14,7 @@ function makeCompany(overrides: Partial<Company>): Company {
issueCounter: 1,
budgetMonthlyCents: 0,
spentMonthlyCents: 0,
attachmentMaxBytes: 10 * 1024 * 1024,
requireBoardApprovalForNewAgents: false,
feedbackDataSharingEnabled: false,
feedbackDataSharingConsentAt: null,
@@ -1,5 +1,5 @@
import { execFile, spawn } from "node:child_process";
import { mkdirSync, mkdtempSync, readFileSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import { existsSync, mkdirSync, mkdtempSync, readFileSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import net from "node:net";
import os from "node:os";
import path from "node:path";
@@ -104,20 +104,50 @@ function writeTestConfig(configPath: string, tempRoot: string, port: number, con
writeFileSync(configPath, `${JSON.stringify(config, null, 2)}\n`, "utf8");
}
function createServerEnv(configPath: string, port: number, connectionString: string) {
interface TestPaperclipEnv {
configPath: string;
paperclipHome: string;
instanceId: string;
shellHome?: string;
}
function createBasePaperclipEnv(options: TestPaperclipEnv) {
const env = { ...process.env };
for (const key of Object.keys(env)) {
if (key.startsWith("PAPERCLIP_")) {
delete env[key];
}
}
env.PAPERCLIP_CONFIG = options.configPath;
env.PAPERCLIP_HOME = options.paperclipHome;
env.PAPERCLIP_INSTANCE_ID = options.instanceId;
env.PAPERCLIP_CONTEXT = path.join(options.paperclipHome, "context.json");
env.PAPERCLIP_AUTH_STORE = path.join(options.paperclipHome, "auth.json");
if (options.shellHome) {
env.HOME = options.shellHome;
}
return env;
}
function createServerEnv(
configPath: string,
port: number,
connectionString: string,
options: Omit<TestPaperclipEnv, "configPath">,
) {
const env = createBasePaperclipEnv({
configPath,
...options,
});
delete env.DATABASE_URL;
delete env.PORT;
delete env.HOST;
delete env.SERVE_UI;
delete env.HEARTBEAT_SCHEDULER_ENABLED;
env.PAPERCLIP_CONFIG = configPath;
env.DATABASE_URL = connectionString;
env.HOST = "127.0.0.1";
env.PORT = String(port);
@@ -130,13 +160,8 @@ function createServerEnv(configPath: string, port: number, connectionString: str
return env;
}
function createCliEnv() {
const env = { ...process.env };
for (const key of Object.keys(env)) {
if (key.startsWith("PAPERCLIP_")) {
delete env[key];
}
}
function createCliEnv(options: TestPaperclipEnv) {
const env = createBasePaperclipEnv(options);
delete env.DATABASE_URL;
delete env.PORT;
delete env.HOST;
@@ -183,14 +208,25 @@ async function api<T>(baseUrl: string, pathname: string, init?: RequestInit): Pr
return text ? JSON.parse(text) as T : (null as T);
}
async function runCliJson<T>(args: string[], opts: { apiBase: string; configPath: string }) {
async function runCliJson<T>(
args: string[],
opts: TestPaperclipEnv & { apiBase?: string; includeConfigArg?: boolean },
) {
const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../../..");
const cliArgs = ["--silent", "paperclipai", ...args];
if (opts.apiBase) {
cliArgs.push("--api-base", opts.apiBase);
}
if (opts.includeConfigArg !== false) {
cliArgs.push("--config", opts.configPath);
}
cliArgs.push("--json");
const result = await execFileAsync(
"pnpm",
["--silent", "paperclipai", ...args, "--api-base", opts.apiBase, "--config", opts.configPath, "--json"],
cliArgs,
{
cwd: repoRoot,
env: createCliEnv(),
env: createCliEnv(opts),
maxBuffer: 10 * 1024 * 1024,
},
);
@@ -235,6 +271,9 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
let configPath = "";
let exportDir = "";
let apiBase = "";
let paperclipHome = "";
let cliShellHome = "";
let paperclipInstanceId = "";
let serverProcess: ServerProcess | null = null;
let tempDb: Awaited<ReturnType<typeof startEmbeddedPostgresTestDatabase>> | null = null;
@@ -242,6 +281,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
tempRoot = mkdtempSync(path.join(os.tmpdir(), "paperclip-company-cli-e2e-"));
configPath = path.join(tempRoot, "config", "config.json");
exportDir = path.join(tempRoot, "exported-company");
paperclipHome = path.join(tempRoot, "paperclip-home");
cliShellHome = path.join(tempRoot, "shell-home");
paperclipInstanceId = "company-cli-e2e";
mkdirSync(paperclipHome, { recursive: true });
mkdirSync(cliShellHome, { recursive: true });
tempDb = await startEmbeddedPostgresTestDatabase("paperclip-company-cli-db-");
@@ -256,7 +300,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
["paperclipai", "run", "--config", configPath],
{
cwd: repoRoot,
env: createServerEnv(configPath, port, tempDb.connectionString),
env: createServerEnv(configPath, port, tempDb.connectionString, {
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
}),
stdio: ["ignore", "pipe", "pipe"],
},
);
@@ -282,11 +330,41 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
it("exports a company package and imports it into new and existing companies", async () => {
expect(serverProcess).not.toBeNull();
const cliContext = await runCliJson<{
contextPath: string;
profileName: string;
profile: { apiBase?: string };
}>(
["context", "set", "--profile", "isolation-check", "--api-base", "https://example.test"],
{
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
includeConfigArg: false,
},
);
const expectedContextPath = path.join(paperclipHome, "context.json");
const leakedContextPath = path.join(cliShellHome, ".paperclip", "context.json");
expect(cliContext.contextPath).toBe(expectedContextPath);
expect(cliContext.profileName).toBe("isolation-check");
expect(cliContext.profile.apiBase).toBe("https://example.test");
expect(existsSync(expectedContextPath)).toBe(true);
expect(existsSync(leakedContextPath)).toBe(false);
rmSync(expectedContextPath, { force: true });
expect(existsSync(expectedContextPath)).toBe(false);
const sourceCompany = await api<{ id: string; name: string; issuePrefix: string }>(apiBase, "/api/companies", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ name: `CLI Export Source ${Date.now()}` }),
});
await api(apiBase, `/api/companies/${sourceCompany.id}`, {
method: "PATCH",
headers: { "content-type": "application/json" },
body: JSON.stringify({ requireBoardApprovalForNewAgents: false }),
});
const sourceAgent = await api<{ id: string; name: string }>(
apiBase,
@@ -298,8 +376,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
name: "Export Engineer",
role: "engineer",
adapterType: "claude_local",
adapterConfig: {
promptTemplate: "You verify company portability.",
adapterConfig: {},
instructionsBundle: {
files: {
"AGENTS.md": "You verify company portability.",
},
},
}),
},
@@ -350,7 +431,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"--include",
"company,agents,projects,issues",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(exportResult.ok).toBe(true);
@@ -374,7 +461,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"company,agents,projects,issues",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedNew.company.action).toBe("created");
@@ -393,10 +486,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const importedMatchingIssues = importedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(importedAgents.map((agent) => agent.name)).toContain(sourceAgent.name);
expect(importedProjects.map((project) => project.name)).toContain(sourceProject.name);
expect(importedIssues.map((issue) => issue.title)).toContain(sourceIssue.title);
expect(importedMatchingIssues).toHaveLength(1);
const previewExisting = await runCliJson<{
errors: string[];
@@ -421,7 +515,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"rename",
"--dry-run",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(previewExisting.errors).toEqual([]);
@@ -448,7 +548,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"rename",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedExisting.company.action).toBe("unchanged");
@@ -466,11 +572,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const twiceImportedMatchingIssues = twiceImportedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(twiceImportedAgents).toHaveLength(2);
expect(new Set(twiceImportedAgents.map((agent) => agent.name)).size).toBe(2);
expect(twiceImportedProjects).toHaveLength(2);
expect(twiceImportedIssues).toHaveLength(2);
expect(twiceImportedMatchingIssues).toHaveLength(2);
expect(new Set(twiceImportedMatchingIssues.map((issue) => issue.identifier)).size).toBe(2);
const zipPath = path.join(tempRoot, "exported-company.zip");
const portableFiles: Record<string, string> = {};
@@ -493,10 +601,16 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"company,agents,projects,issues",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedFromZip.company.action).toBe("created");
expect(importedFromZip.agents.some((agent) => agent.action === "created")).toBe(true);
}, 60_000);
}, 90_000);
});
+4
View File
@@ -160,6 +160,7 @@ describe("renderCompanyImportPreview", () => {
path: "COMPANY.md",
name: "Source Co",
description: null,
attachmentMaxBytes: null,
brandColor: null,
logoPath: null,
requireBoardApprovalForNewAgents: false,
@@ -243,6 +244,7 @@ describe("renderCompanyImportPreview", () => {
billingCode: null,
executionWorkspaceSettings: null,
assigneeAdapterOverrides: null,
comments: [],
metadata: null,
},
],
@@ -375,6 +377,7 @@ describe("import selection catalog", () => {
path: "COMPANY.md",
name: "Source Co",
description: null,
attachmentMaxBytes: null,
brandColor: null,
logoPath: "images/company-logo.png",
requireBoardApprovalForNewAgents: false,
@@ -458,6 +461,7 @@ describe("import selection catalog", () => {
billingCode: null,
executionWorkspaceSettings: null,
assigneeAdapterOverrides: null,
comments: [],
metadata: null,
},
],
+24
View File
@@ -0,0 +1,24 @@
import path from "node:path";
import { describe, expect, it } from "vitest";
import { collectEnvLabDoctorStatus, resolveEnvLabSshStatePath } from "../commands/env-lab.js";
describe("env-lab command", () => {
it("resolves the default SSH fixture state path under the instance root", () => {
const statePath = resolveEnvLabSshStatePath("fixture-test");
expect(statePath).toContain(
path.join("instances", "fixture-test", "env-lab", "ssh-fixture", "state.json"),
);
});
it("reports doctor status for an instance without a running fixture", async () => {
const status = await collectEnvLabDoctorStatus({ instance: "fixture-test-missing" });
expect(status.statePath).toContain(
path.join("instances", "fixture-test-missing", "env-lab", "ssh-fixture", "state.json"),
);
expect(typeof status.ssh.supported).toBe("boolean");
expect(status.ssh.running).toBe(false);
expect(status.ssh.environment).toBeNull();
});
});
+6 -4
View File
@@ -1,3 +1,4 @@
import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it } from "vitest";
@@ -16,13 +17,14 @@ describe("home path resolution", () => {
});
it("defaults to ~/.paperclip and default instance", () => {
delete process.env.PAPERCLIP_HOME;
const home = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-home-paths-"));
process.env.PAPERCLIP_HOME = home;
delete process.env.PAPERCLIP_INSTANCE_ID;
const paths = describeLocalInstancePaths();
expect(paths.homeDir).toBe(path.resolve(os.homedir(), ".paperclip"));
expect(paths.homeDir).toBe(home);
expect(paths.instanceId).toBe("default");
expect(paths.configPath).toBe(path.resolve(os.homedir(), ".paperclip", "instances", "default", "config.json"));
expect(paths.configPath).toBe(path.resolve(home, "instances", "default", "config.json"));
});
it("supports PAPERCLIP_HOME and explicit instance ids", () => {
@@ -34,7 +36,7 @@ describe("home path resolution", () => {
});
it("rejects invalid instance ids", () => {
expect(() => resolvePaperclipInstanceId("bad/id")).toThrow(/Invalid instance id/);
expect(() => resolvePaperclipInstanceId("bad/id")).toThrow(/Invalid PAPERCLIP_INSTANCE_ID/);
});
it("expands ~ prefixes", () => {
+69
View File
@@ -0,0 +1,69 @@
import { describe, expect, it } from "vitest";
import { resolveRuntimeBind, validateConfiguredBindMode } from "@paperclipai/shared";
import { buildPresetServerConfig } from "../config/server-bind.js";
const ORIGINAL_PATH = process.env.PATH;
describe("network bind helpers", () => {
it("rejects non-loopback bind modes in local_trusted", () => {
expect(
validateConfiguredBindMode({
deploymentMode: "local_trusted",
deploymentExposure: "private",
bind: "lan",
host: "0.0.0.0",
}),
).toContain("local_trusted requires server.bind=loopback");
});
it("resolves tailnet bind using the detected tailscale address", () => {
const resolved = resolveRuntimeBind({
bind: "tailnet",
host: "127.0.0.1",
tailnetBindHost: "100.64.0.8",
});
expect(resolved.errors).toEqual([]);
expect(resolved.host).toBe("100.64.0.8");
});
it("requires a custom bind host when bind=custom", () => {
const resolved = resolveRuntimeBind({
bind: "custom",
host: "127.0.0.1",
});
expect(resolved.errors).toContain("server.customBindHost is required when server.bind=custom");
});
it("stores the detected tailscale address for tailnet presets", () => {
process.env.PAPERCLIP_TAILNET_BIND_HOST = "100.64.0.8";
const preset = buildPresetServerConfig("tailnet", {
port: 3100,
allowedHostnames: [],
serveUi: true,
});
expect(preset.server.host).toBe("100.64.0.8");
delete process.env.PAPERCLIP_TAILNET_BIND_HOST;
});
it("falls back to loopback when no tailscale address is available for tailnet presets", () => {
delete process.env.PAPERCLIP_TAILNET_BIND_HOST;
process.env.PATH = "";
try {
const preset = buildPresetServerConfig("tailnet", {
port: 3100,
allowedHostnames: [],
serveUi: true,
});
expect(preset.server.host).toBe("127.0.0.1");
} finally {
process.env.PATH = ORIGINAL_PATH;
}
});
});
+94
View File
@@ -6,6 +6,8 @@ import { onboard } from "../commands/onboard.js";
import type { PaperclipConfig } from "../config/schema.js";
const ORIGINAL_ENV = { ...process.env };
const ORIGINAL_CWD = process.cwd();
const ORIGINAL_PATH = process.env.PATH;
function createExistingConfigFixture() {
const root = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-onboard-"));
@@ -74,16 +76,29 @@ function createExistingConfigFixture() {
return { configPath, configText: fs.readFileSync(configPath, "utf8") };
}
function createFreshConfigPath() {
const root = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-onboard-fresh-"));
return path.join(root, ".paperclip", "config.json");
}
describe("onboard", () => {
beforeEach(() => {
process.env = { ...ORIGINAL_ENV };
delete process.env.PAPERCLIP_AGENT_JWT_SECRET;
delete process.env.PAPERCLIP_SECRETS_MASTER_KEY;
delete process.env.PAPERCLIP_SECRETS_MASTER_KEY_FILE;
delete process.env.PAPERCLIP_HOME;
delete process.env.PAPERCLIP_CONFIG;
delete process.env.PAPERCLIP_INSTANCE_ID;
delete process.env.PAPERCLIP_BIND;
delete process.env.PAPERCLIP_BIND_HOST;
delete process.env.PAPERCLIP_TAILNET_BIND_HOST;
delete process.env.HOST;
});
afterEach(() => {
process.env = { ...ORIGINAL_ENV };
process.chdir(ORIGINAL_CWD);
});
it("preserves an existing config when rerun without flags", async () => {
@@ -105,4 +120,83 @@ describe("onboard", () => {
expect(fs.existsSync(`${fixture.configPath}.backup`)).toBe(false);
expect(fs.existsSync(path.join(path.dirname(fixture.configPath), ".env"))).toBe(true);
});
it("keeps --yes onboarding on local trusted loopback defaults", async () => {
const configPath = createFreshConfigPath();
process.env.HOST = "0.0.0.0";
process.env.PAPERCLIP_BIND = "lan";
await onboard({ config: configPath, yes: true, invokedByRun: true });
const raw = JSON.parse(fs.readFileSync(configPath, "utf8")) as PaperclipConfig;
expect(raw.server.deploymentMode).toBe("local_trusted");
expect(raw.server.exposure).toBe("private");
expect(raw.server.bind).toBe("loopback");
expect(raw.server.host).toBe("127.0.0.1");
});
it("creates instance-root config and data paths for a fresh PAPERCLIP_HOME", async () => {
const home = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-onboard-home-"));
const cwd = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-onboard-cwd-"));
process.chdir(cwd);
process.env.PAPERCLIP_HOME = home;
await onboard({ yes: true, invokedByRun: true });
const instanceRoot = path.join(home, "instances", "default");
const configPath = path.join(instanceRoot, "config.json");
const raw = JSON.parse(fs.readFileSync(configPath, "utf8")) as PaperclipConfig;
expect(raw.database.embeddedPostgresDataDir).toBe(path.join(instanceRoot, "db"));
expect(raw.database.backup.dir).toBe(path.join(instanceRoot, "data", "backups"));
expect(raw.logging.logDir).toBe(path.join(instanceRoot, "logs"));
expect(raw.storage.localDisk.baseDir).toBe(path.join(instanceRoot, "data", "storage"));
expect(raw.secrets.localEncrypted.keyFilePath).toBe(path.join(instanceRoot, "secrets", "master.key"));
expect(fs.existsSync(path.join(instanceRoot, ".env"))).toBe(true);
expect(fs.existsSync(path.join(instanceRoot, "secrets", "master.key"))).toBe(true);
});
it("supports authenticated/private quickstart bind presets", async () => {
const configPath = createFreshConfigPath();
process.env.PAPERCLIP_TAILNET_BIND_HOST = "100.64.0.8";
await onboard({ config: configPath, yes: true, invokedByRun: true, bind: "tailnet" });
const raw = JSON.parse(fs.readFileSync(configPath, "utf8")) as PaperclipConfig;
expect(raw.server.deploymentMode).toBe("authenticated");
expect(raw.server.exposure).toBe("private");
expect(raw.server.bind).toBe("tailnet");
expect(raw.server.host).toBe("100.64.0.8");
});
it("keeps tailnet quickstart on loopback until tailscale is available", async () => {
const configPath = createFreshConfigPath();
delete process.env.PAPERCLIP_TAILNET_BIND_HOST;
process.env.PATH = "";
try {
await onboard({ config: configPath, yes: true, invokedByRun: true, bind: "tailnet" });
} finally {
process.env.PATH = ORIGINAL_PATH;
}
const raw = JSON.parse(fs.readFileSync(configPath, "utf8")) as PaperclipConfig;
expect(raw.server.deploymentMode).toBe("authenticated");
expect(raw.server.exposure).toBe("private");
expect(raw.server.bind).toBe("tailnet");
expect(raw.server.host).toBe("127.0.0.1");
});
it("ignores deployment env overrides during --yes quickstart", async () => {
const configPath = createFreshConfigPath();
process.env.PAPERCLIP_DEPLOYMENT_MODE = "authenticated";
await onboard({ config: configPath, yes: true, invokedByRun: true });
const raw = JSON.parse(fs.readFileSync(configPath, "utf8")) as PaperclipConfig;
expect(raw.server.deploymentMode).toBe("local_trusted");
expect(raw.server.exposure).toBe("private");
expect(raw.server.bind).toBe("loopback");
expect(raw.server.host).toBe("127.0.0.1");
});
});
+164
View File
@@ -0,0 +1,164 @@
import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import { Command } from "commander";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
const mocks = vi.hoisted(() => ({
scaffoldPluginProject: vi.fn((options: { outputDir: string }) => options.outputDir),
}));
vi.mock("../../../packages/plugins/create-paperclip-plugin/src/index.js", async () => {
const actual =
await vi.importActual<typeof import("../../../packages/plugins/create-paperclip-plugin/src/index.js")>(
"../../../packages/plugins/create-paperclip-plugin/src/index.js",
);
return {
...actual,
scaffoldPluginProject: mocks.scaffoldPluginProject,
};
});
import {
buildPluginInstallRequest,
buildPluginInitNextCommands,
buildPluginInitScaffoldOptions,
registerPluginCommands,
} from "../commands/client/plugin.js";
const tempDirs: string[] = [];
function makeTempDir(): string {
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-cli-plugin-"));
tempDirs.push(dir);
return dir;
}
afterEach(() => {
while (tempDirs.length > 0) {
const dir = tempDirs.pop();
if (dir) fs.rmSync(dir, { recursive: true, force: true });
}
});
describe("plugin init", () => {
beforeEach(() => {
mocks.scaffoldPluginProject.mockClear();
});
it("maps package name and flags to scaffolder options", () => {
const cwd = path.resolve("/tmp/paperclip-cli-test");
const options = buildPluginInitScaffoldOptions(
"@acme/plugin-linear",
{
output: "plugins",
template: "connector",
category: "automation",
displayName: "Linear Bridge",
description: "Syncs Linear issues",
author: "Acme",
sdkPath: "../paperclip/packages/plugins/sdk",
},
cwd,
);
expect(options).toEqual({
pluginName: "@acme/plugin-linear",
outputDir: path.resolve(cwd, "plugins", "plugin-linear"),
template: "connector",
category: "automation",
displayName: "Linear Bridge",
description: "Syncs Linear issues",
author: "Acme",
sdkPath: "../paperclip/packages/plugins/sdk",
});
});
it("builds exact next commands using the scaffold path", () => {
expect(buildPluginInitNextCommands("/tmp/acme plugin")).toEqual([
"cd '/tmp/acme plugin'",
"pnpm install",
"pnpm dev",
"paperclipai plugin install '/tmp/acme plugin'",
]);
});
it("registers the CLI wrapper and invokes the existing scaffolder", async () => {
const program = new Command();
program.exitOverride();
program.configureOutput({ writeOut: () => {}, writeErr: () => {} });
registerPluginCommands(program);
await program.parseAsync(
[
"plugin",
"init",
"demo-plugin",
"--output",
"/tmp/paperclip-init-output",
"--template",
"workspace",
"--category",
"workspace",
"--display-name",
"Demo Plugin",
"--description",
"Demo description",
"--author",
"Paperclip",
"--sdk-path",
"/repo/packages/plugins/sdk",
],
{ from: "user" },
);
expect(mocks.scaffoldPluginProject).toHaveBeenCalledTimes(1);
expect(mocks.scaffoldPluginProject).toHaveBeenCalledWith({
pluginName: "demo-plugin",
outputDir: path.resolve("/tmp/paperclip-init-output", "demo-plugin"),
template: "workspace",
category: "workspace",
displayName: "Demo Plugin",
description: "Demo description",
author: "Paperclip",
sdkPath: "/repo/packages/plugins/sdk",
});
});
});
describe("plugin install", () => {
it("resolves an existing relative local path to an absolute local install request", () => {
const cwd = makeTempDir();
const pluginDir = path.join(cwd, "demo-plugin");
fs.mkdirSync(pluginDir);
expect(buildPluginInstallRequest("demo-plugin", {}, { cwd })).toEqual({
packageName: pluginDir,
version: undefined,
isLocalPath: true,
});
});
it("keeps an absolute local path absolute and marks it as local", () => {
const pluginDir = path.join(makeTempDir(), "demo-plugin");
fs.mkdirSync(pluginDir);
expect(buildPluginInstallRequest(pluginDir, {}, { cwd: "/" })).toEqual({
packageName: pluginDir,
version: undefined,
isLocalPath: true,
});
});
it("preserves npm package installs when no local path exists", () => {
expect(
buildPluginInstallRequest("@acme/plugin-linear", { version: "1.2.3" }, {
cwd: makeTempDir(),
}),
).toEqual({
packageName: "@acme/plugin-linear",
version: "1.2.3",
isLocalPath: false,
});
});
});
+257
View File
@@ -0,0 +1,257 @@
import { afterEach, beforeEach, describe, expect, it } from "vitest";
import type { Agent, CompanySecret } from "@paperclipai/shared";
import type { PaperclipConfig } from "../config/schema.js";
import { secretsCheck } from "../checks/secrets-check.js";
import {
buildInlineMigrationSecretName,
buildMigratedAgentEnv,
collectInlineSecretMigrationCandidates,
parseSecretsInclude,
toPlainEnvValue,
} from "../commands/client/secrets.js";
function agent(partial: Partial<Agent>): Agent {
return {
id: "agent-12345678",
companyId: "company-1",
name: "Coder",
urlKey: "coder",
role: "engineer",
title: null,
icon: null,
status: "idle",
reportsTo: null,
capabilities: null,
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: {},
budgetMonthlyCents: 0,
spentMonthlyCents: 0,
pauseReason: null,
pausedAt: null,
permissions: {
canCreateAgents: false,
},
lastHeartbeatAt: null,
metadata: null,
createdAt: new Date("2026-04-26T00:00:00.000Z"),
updatedAt: new Date("2026-04-26T00:00:00.000Z"),
...partial,
};
}
function secret(partial: Partial<CompanySecret>): CompanySecret {
return {
id: "secret-1",
companyId: "company-1",
key: "agent_agent-12_anthropic_api_key",
name: "agent_agent-12_anthropic_api_key",
provider: "local_encrypted",
status: "active",
managedMode: "paperclip_managed",
externalRef: null,
providerConfigId: null,
providerMetadata: null,
latestVersion: 1,
description: null,
lastResolvedAt: null,
lastRotatedAt: null,
deletedAt: null,
createdByAgentId: null,
createdByUserId: null,
createdAt: new Date("2026-04-26T00:00:00.000Z"),
updatedAt: new Date("2026-04-26T00:00:00.000Z"),
...partial,
};
}
function configWithSecretsProvider(provider: PaperclipConfig["secrets"]["provider"]): PaperclipConfig {
return {
$meta: {
version: 1,
updatedAt: "2026-05-02T00:00:00.000Z",
source: "configure",
},
database: {
mode: "embedded-postgres",
embeddedPostgresDataDir: "/tmp/paperclip/db",
embeddedPostgresPort: 55432,
backup: {
enabled: true,
intervalMinutes: 60,
retentionDays: 30,
dir: "/tmp/paperclip/backups",
},
},
logging: {
mode: "file",
logDir: "/tmp/paperclip/logs",
},
server: {
deploymentMode: "local_trusted",
exposure: "private",
host: "127.0.0.1",
port: 3100,
allowedHostnames: [],
serveUi: true,
},
auth: {
baseUrlMode: "auto",
disableSignUp: false,
},
telemetry: {
enabled: true,
},
storage: {
provider: "local_disk",
localDisk: {
baseDir: "/tmp/paperclip/storage",
},
s3: {
bucket: "paperclip",
region: "us-east-1",
prefix: "",
forcePathStyle: false,
},
},
secrets: {
provider,
strictMode: true,
localEncrypted: {
keyFilePath: "/tmp/paperclip/secrets/master.key",
},
},
};
}
describe("secrets CLI helpers", () => {
const originalEnv = { ...process.env };
beforeEach(() => {
process.env = { ...originalEnv };
delete process.env.PAPERCLIP_SECRETS_AWS_REGION;
delete process.env.AWS_REGION;
delete process.env.AWS_DEFAULT_REGION;
delete process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID;
delete process.env.PAPERCLIP_SECRETS_AWS_KMS_KEY_ID;
});
afterEach(() => {
process.env = { ...originalEnv };
});
it("parses declaration include filters", () => {
expect(parseSecretsInclude("agents,projects,tasks")).toEqual({
company: false,
agents: true,
projects: true,
issues: true,
skills: false,
});
});
it("detects inline sensitive env values that need migration", () => {
const rows = collectInlineSecretMigrationCandidates(
[
agent({
id: "agent-12345678",
adapterConfig: {
env: {
ANTHROPIC_API_KEY: "sk-ant-test",
GH_TOKEN: {
type: "plain",
value: "ghp-test",
},
PATH: {
type: "plain",
value: "/usr/bin",
},
OPENAI_API_KEY: {
type: "secret_ref",
secretId: "secret-existing",
},
},
},
}),
],
[
secret({
id: "secret-gh-token",
name: buildInlineMigrationSecretName("agent-12345678", "GH_TOKEN"),
}),
],
);
expect(rows).toEqual([
{
agentId: "agent-12345678",
agentName: "Coder",
envKey: "ANTHROPIC_API_KEY",
secretName: "agent_agent-12_anthropic_api_key",
existingSecretId: null,
},
{
agentId: "agent-12345678",
agentName: "Coder",
envKey: "GH_TOKEN",
secretName: "agent_agent-12_gh_token",
existingSecretId: "secret-gh-token",
},
]);
});
it("builds migrated env bindings without preserving secret values", () => {
const next = buildMigratedAgentEnv(
{
ANTHROPIC_API_KEY: "sk-ant-test",
NODE_ENV: {
type: "plain",
value: "development",
},
},
new Map([["ANTHROPIC_API_KEY", "secret-1"]]),
);
expect(next).toEqual({
ANTHROPIC_API_KEY: {
type: "secret_ref",
secretId: "secret-1",
version: "latest",
},
NODE_ENV: {
type: "plain",
value: "development",
},
});
expect(JSON.stringify(next)).not.toContain("sk-ant-test");
});
it("reads only explicit plain env values", () => {
expect(toPlainEnvValue("plain-value")).toBe("plain-value");
expect(toPlainEnvValue({ type: "plain", value: "wrapped" })).toBe("wrapped");
expect(toPlainEnvValue({ type: "secret_ref", secretId: "secret-1" })).toBeNull();
});
it("reports the AWS bootstrap config required by doctor", () => {
const result = secretsCheck(configWithSecretsProvider("aws_secrets_manager"));
expect(result.status).toBe("fail");
expect(result.message).toContain("PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID");
expect(result.repairHint).toContain("AWS SDK default credential chain");
expect(result.repairHint).toContain("Do not store AWS root credentials");
});
it("passes AWS doctor checks when non-secret provider config is present", () => {
process.env.PAPERCLIP_SECRETS_AWS_REGION = "us-east-1";
process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID = "prod-us-1";
process.env.PAPERCLIP_SECRETS_AWS_KMS_KEY_ID =
"arn:aws:kms:us-east-1:123456789012:key/test";
process.env.AWS_PROFILE = "paperclip-prod";
const result = secretsCheck(configWithSecretsProvider("aws_secrets_manager"));
expect(result.status).toBe("pass");
expect(result.message).toContain("prod-us-1");
expect(result.message).toContain("AWS_PROFILE/shared config");
});
});
+506
View File
@@ -0,0 +1,506 @@
import { Command } from "commander";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { registerSkillsCommands } from "../commands/client/skills.js";
import { resolveCompanySkillReference } from "../commands/client/skills.js";
const ORIGINAL_ENV = { ...process.env };
function makeProgram(): Command {
const program = new Command();
program.exitOverride();
program.configureOutput({
writeOut: () => undefined,
writeErr: () => undefined,
});
registerSkillsCommands(program);
return program;
}
async function runCommand(args: string[]): Promise<void> {
await makeProgram().parseAsync(args, { from: "user" });
}
function jsonResponse(body: unknown, status = 200): Response {
return new Response(JSON.stringify(body), {
status,
headers: { "content-type": "application/json" },
});
}
function skill(overrides: Record<string, unknown> = {}) {
return {
id: "11111111-1111-1111-1111-111111111111",
companyId: "company-1",
key: "paperclip/review-prs",
slug: "review-prs",
name: "Review PRs",
description: "Review pull requests",
markdown: "# Review PRs",
sourceType: "local_path",
sourceLocator: null,
sourceRef: null,
trustLevel: "markdown_only",
compatibility: "compatible",
fileInventory: [{ path: "SKILL.md", kind: "skill" }],
metadata: null,
createdAt: "2026-05-26T00:00:00.000Z",
updatedAt: "2026-05-26T00:00:00.000Z",
attachedAgentCount: 2,
editable: true,
editableReason: null,
sourceLabel: null,
sourceBadge: "local",
sourcePath: null,
...overrides,
};
}
function catalogSkill(overrides: Record<string, unknown> = {}) {
return {
id: "paperclipai:bundled:software-development:github-pr-workflow",
key: "paperclipai/bundled/software-development/github-pr-workflow",
kind: "bundled",
category: "software-development",
slug: "github-pr-workflow",
name: "github-pr-workflow",
description: "Prepare pull requests, review responses, and verification notes.",
path: "catalog/bundled/software-development/github-pr-workflow",
entrypoint: "SKILL.md",
trustLevel: "markdown_only",
compatibility: "compatible",
defaultInstall: false,
recommendedForRoles: ["engineer"],
requires: [],
tags: ["github", "pull-requests"],
files: [{ path: "SKILL.md", kind: "skill", sizeBytes: 128, sha256: "sha256:abc" }],
contentHash: "sha256:catalog",
...overrides,
};
}
function agent(overrides: Record<string, unknown> = {}) {
return {
id: "agent-1",
companyId: "company-1",
name: "Coder",
role: "engineer",
status: "active",
reportsTo: null,
budgetMonthlyCents: 0,
spentMonthlyCents: 0,
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: {},
permissions: {},
createdAt: "2026-05-26T00:00:00.000Z",
updatedAt: "2026-05-26T00:00:00.000Z",
...overrides,
};
}
describe("skills CLI helpers", () => {
it("resolves skill refs by id, key, or unique normalized slug", () => {
const rows = [
skill({ id: "skill-a", key: "paperclip/a", slug: "alpha", name: "Alpha" }),
skill({ id: "skill-b", key: "paperclip/b", slug: "beta-skill", name: "Beta" }),
];
expect(resolveCompanySkillReference(rows, "skill-a").key).toBe("paperclip/a");
expect(resolveCompanySkillReference(rows, "paperclip/b").id).toBe("skill-b");
expect(resolveCompanySkillReference(rows, "Beta Skill").id).toBe("skill-b");
});
it("rejects ambiguous slug refs", () => {
const rows = [
skill({ id: "skill-a", key: "paperclip/a", slug: "same", name: "A" }),
skill({ id: "skill-b", key: "paperclip/b", slug: "same", name: "B" }),
];
expect(() => resolveCompanySkillReference(rows, "same")).toThrow(/Ambiguous skill slug/);
});
});
describe("skills CLI commands", () => {
let fetchMock: ReturnType<typeof vi.fn>;
let logSpy: ReturnType<typeof vi.spyOn>;
let writeChunks: unknown[];
beforeEach(() => {
process.env = { ...ORIGINAL_ENV };
delete process.env.PAPERCLIP_API_URL;
delete process.env.PAPERCLIP_API_KEY;
delete process.env.PAPERCLIP_COMPANY_ID;
fetchMock = vi.fn();
vi.stubGlobal("fetch", fetchMock);
logSpy = vi.spyOn(console, "log").mockImplementation(() => undefined);
writeChunks = [];
vi.spyOn(process.stdout, "write").mockImplementation((chunk: string | Uint8Array) => {
writeChunks.push(chunk);
return true;
});
});
afterEach(() => {
process.env = { ...ORIGINAL_ENV };
vi.unstubAllGlobals();
vi.restoreAllMocks();
});
it("lists company skills as JSON through the shared client context", async () => {
const rows = [skill()];
fetchMock.mockResolvedValueOnce(jsonResponse(rows));
await runCommand([
"skills",
"list",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenCalledWith(
"http://paperclip.test/api/companies/company-1/skills",
expect.objectContaining({
method: "GET",
headers: expect.objectContaining({ authorization: "Bearer token" }),
}),
);
expect(JSON.parse(String(logSpy.mock.calls[0]?.[0]))).toEqual(rows);
});
it("resolves a skill slug before reading detail", async () => {
fetchMock
.mockResolvedValueOnce(jsonResponse([skill()]))
.mockResolvedValueOnce(jsonResponse({ ...skill(), usedByAgents: [] }));
await runCommand([
"skills",
"show",
"Review PRs",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenNthCalledWith(
2,
"http://paperclip.test/api/companies/company-1/skills/11111111-1111-1111-1111-111111111111",
expect.objectContaining({ method: "GET" }),
);
});
it("prints skill files as raw pipeable content in human mode", async () => {
fetchMock
.mockResolvedValueOnce(jsonResponse([skill()]))
.mockResolvedValueOnce(jsonResponse({
skillId: "11111111-1111-1111-1111-111111111111",
path: "SKILL.md",
kind: "skill",
content: "# Review PRs",
language: "markdown",
markdown: true,
editable: true,
}));
await runCommand([
"skills",
"file",
"review-prs",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
]);
expect(logSpy).not.toHaveBeenCalled();
expect(writeChunks.join("")).toBe("# Review PRs\n");
});
it("browses catalog skills with filters in table output", async () => {
fetchMock.mockResolvedValueOnce(jsonResponse([catalogSkill()]));
await runCommand([
"skills",
"browse",
"--kind",
"bundled",
"--category",
"software-development",
"--query",
"github",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
]);
expect(fetchMock).toHaveBeenCalledWith(
"http://paperclip.test/api/skills/catalog?kind=bundled&category=software-development&q=github",
expect.objectContaining({ method: "GET" }),
);
const rendered = logSpy.mock.calls.map((call) => String(call[0])).join("\n");
expect(rendered).toContain("id");
expect(rendered).toContain("paperclipai:bundled:software-development:github-pr-workflow");
expect(rendered).toContain("roles");
});
it("searches catalog skills as JSON", async () => {
const rows = [catalogSkill()];
fetchMock.mockResolvedValueOnce(jsonResponse(rows));
await runCommand([
"skills",
"search",
"pull requests",
"--kind",
"bundled",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenCalledWith(
"http://paperclip.test/api/skills/catalog?kind=bundled&q=pull+requests",
expect.objectContaining({ method: "GET" }),
);
expect(JSON.parse(String(logSpy.mock.calls[0]?.[0]))).toEqual(rows);
});
it("inspects catalog skill detail by query ref so keys with slashes work", async () => {
const detail = catalogSkill();
fetchMock.mockResolvedValueOnce(jsonResponse(detail));
await runCommand([
"skills",
"inspect",
"paperclipai/bundled/software-development/github-pr-workflow",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenCalledWith(
"http://paperclip.test/api/skills/catalog/ref?ref=paperclipai%2Fbundled%2Fsoftware-development%2Fgithub-pr-workflow",
expect.objectContaining({ method: "GET" }),
);
expect(JSON.parse(String(logSpy.mock.calls[0]?.[0]))).toEqual(detail);
});
it("installs catalog skills into the company library without agent sync", async () => {
const result = {
action: "created",
skill: skill({
key: "paperclipai/bundled/software-development/github-pr-workflow",
slug: "pr-flow",
sourceType: "catalog",
}),
catalogSkill: catalogSkill(),
warnings: [],
};
fetchMock.mockResolvedValueOnce(jsonResponse(result, 201));
await runCommand([
"skills",
"install",
"github-pr-workflow",
"--as",
"pr-flow",
"--force",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenCalledWith(
"http://paperclip.test/api/companies/company-1/skills/install-catalog",
expect.objectContaining({
method: "POST",
body: JSON.stringify({
catalogSkillId: "github-pr-workflow",
slug: "pr-flow",
force: true,
}),
}),
);
expect(JSON.parse(String(logSpy.mock.calls[0]?.[0]))).toEqual(result);
});
it("passes force to skill updates", async () => {
fetchMock
.mockResolvedValueOnce(jsonResponse([skill()]))
.mockResolvedValueOnce(jsonResponse(skill({ sourceRef: "sha256:new" })));
await runCommand([
"skills",
"update",
"review-prs",
"--force",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenNthCalledWith(
2,
"http://paperclip.test/api/companies/company-1/skills/11111111-1111-1111-1111-111111111111/install-update",
expect.objectContaining({
method: "POST",
body: JSON.stringify({ force: true }),
}),
);
});
it("audits installed skill bytes through the server", async () => {
const audit = {
skillId: "11111111-1111-1111-1111-111111111111",
installedHash: "sha256:installed",
originHash: "sha256:origin",
verdict: "warning",
codes: ["network_reference"],
findings: [{
code: "network_reference",
severity: "warning",
message: "Skill content references network-capable commands or URLs.",
path: "SKILL.md",
}],
scannedAt: "2026-05-26T00:00:00.000Z",
scanVersion: "skills-audit-v1",
};
fetchMock
.mockResolvedValueOnce(jsonResponse([skill()]))
.mockResolvedValueOnce(jsonResponse(audit));
await runCommand([
"skills",
"audit",
"review-prs",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenNthCalledWith(
2,
"http://paperclip.test/api/companies/company-1/skills/11111111-1111-1111-1111-111111111111/audit",
expect.objectContaining({
method: "POST",
body: JSON.stringify({}),
}),
);
expect(JSON.parse(String(logSpy.mock.calls[0]?.[0]))).toEqual(audit);
});
it("requires confirmation for reset and sends force when confirmed", async () => {
fetchMock
.mockResolvedValueOnce(jsonResponse([skill({ sourceType: "catalog" })]))
.mockResolvedValueOnce(jsonResponse(skill({ sourceType: "catalog" })));
await runCommand([
"skills",
"reset",
"review-prs",
"--yes",
"--force",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenNthCalledWith(
2,
"http://paperclip.test/api/companies/company-1/skills/11111111-1111-1111-1111-111111111111/reset",
expect.objectContaining({
method: "POST",
body: JSON.stringify({ force: true }),
}),
);
});
it("syncs desired company skill refs to an agent and returns the runtime snapshot", async () => {
const snapshot = {
adapterType: "codex_local",
supported: true,
mode: "persistent",
desiredSkills: ["paperclip/review-prs"],
entries: [
{
key: "paperclip/review-prs",
runtimeName: "review-prs",
desired: true,
managed: true,
required: false,
state: "installed",
origin: "company_managed",
detail: null,
},
],
warnings: [],
};
fetchMock
.mockResolvedValueOnce(jsonResponse(agent()))
.mockResolvedValueOnce(jsonResponse(snapshot));
await runCommand([
"skills",
"agent",
"sync",
"coder",
"--skill",
"review-prs",
"--skill",
"paperclip/qa",
"--company-id",
"company-1",
"--api-base",
"http://paperclip.test",
"--api-key",
"token",
"--json",
]);
expect(fetchMock).toHaveBeenNthCalledWith(
1,
"http://paperclip.test/api/agents/coder?companyId=company-1",
expect.objectContaining({ method: "GET" }),
);
expect(fetchMock).toHaveBeenNthCalledWith(
2,
"http://paperclip.test/api/agents/agent-1/skills/sync",
expect.objectContaining({
method: "POST",
body: JSON.stringify({ desiredSkills: ["review-prs", "paperclip/qa"] }),
}),
);
expect(JSON.parse(String(logSpy.mock.calls[0]?.[0]))).toEqual(snapshot);
});
});
+537 -3
View File
@@ -2,10 +2,25 @@ import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import { execFileSync } from "node:child_process";
import { randomUUID } from "node:crypto";
import { eq } from "drizzle-orm";
import { afterEach, describe, expect, it, vi } from "vitest";
import {
agents,
authUsers,
companies,
createDb,
issueComments,
issues,
projects,
routines,
routineTriggers,
} from "@paperclipai/db";
import {
copyGitHooksToWorktreeGitDir,
copySeededSecretsKey,
pauseSeededScheduledRoutines,
quarantineSeededWorktreeExecutionState,
readSourceAttachmentBody,
rebindWorkspaceCwd,
resolveSourceConfigPath,
@@ -13,6 +28,7 @@ import {
resolveWorktreeReseedTargetPaths,
resolveGitWorktreeAddArgs,
resolveWorktreeMakeTargetPath,
worktreeRepairCommand,
worktreeInitCommand,
worktreeMakeCommand,
worktreeReseedCommand,
@@ -28,9 +44,22 @@ import {
sanitizeWorktreeInstanceId,
} from "../commands/worktree-lib.js";
import type { PaperclipConfig } from "../config/schema.js";
import {
getEmbeddedPostgresTestSupport,
startEmbeddedPostgresTestDatabase,
} from "./helpers/embedded-postgres.js";
const ORIGINAL_CWD = process.cwd();
const ORIGINAL_ENV = { ...process.env };
const embeddedPostgresSupport = await getEmbeddedPostgresTestSupport();
const itEmbeddedPostgres = embeddedPostgresSupport.supported ? it : it.skip;
const describeEmbeddedPostgres = embeddedPostgresSupport.supported ? describe : describe.skip;
if (!embeddedPostgresSupport.supported) {
console.warn(
`Skipping embedded Postgres worktree CLI tests on this host: ${embeddedPostgresSupport.reason ?? "unsupported environment"}`,
);
}
afterEach(() => {
process.chdir(ORIGINAL_CWD);
@@ -161,8 +190,9 @@ describe("worktree helpers", () => {
).toEqual(["worktree", "add", "-b", "my-worktree", "/tmp/my-worktree", "origin/main"]);
});
it("rewrites loopback auth URLs to the new port only", () => {
it("rewrites auth URLs only when they already include a port", () => {
expect(rewriteLocalUrlPort("http://127.0.0.1:3100", 3110)).toBe("http://127.0.0.1:3110/");
expect(rewriteLocalUrlPort("http://my-host.ts.net:3100", 3110)).toBe("http://my-host.ts.net:3110/");
expect(rewriteLocalUrlPort("https://paperclip.example", 3110)).toBe("https://paperclip.example");
});
@@ -257,6 +287,138 @@ describe("worktree helpers", () => {
expect(full.nullifyColumns).toEqual({});
});
itEmbeddedPostgres("quarantines copied live execution state in seeded worktree databases", async () => {
const tempDb = await startEmbeddedPostgresTestDatabase("paperclip-worktree-quarantine-");
const db = createDb(tempDb.connectionString);
const companyId = randomUUID();
const agentId = randomUUID();
const idleAgentId = randomUUID();
const inProgressIssueId = randomUUID();
const todoIssueId = randomUUID();
const reviewIssueId = randomUUID();
const userIssueId = randomUUID();
try {
await db.insert(companies).values({
id: companyId,
name: "Paperclip",
issuePrefix: "WTQ",
requireBoardApprovalForNewAgents: false,
});
await db.insert(agents).values([
{
id: agentId,
companyId,
name: "CodexCoder",
role: "engineer",
status: "running",
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: {
heartbeat: { enabled: true, intervalSec: 60 },
wakeOnDemand: true,
},
permissions: {},
},
{
id: idleAgentId,
companyId,
name: "Reviewer",
role: "reviewer",
status: "idle",
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: { heartbeat: { enabled: false, intervalSec: 300 } },
permissions: {},
},
]);
await db.insert(issues).values([
{
id: inProgressIssueId,
companyId,
title: "Copied in-flight issue",
status: "in_progress",
priority: "medium",
assigneeAgentId: agentId,
issueNumber: 1,
identifier: "WTQ-1",
executionAgentNameKey: "codexcoder",
executionLockedAt: new Date("2026-04-18T00:00:00.000Z"),
},
{
id: todoIssueId,
companyId,
title: "Copied assigned todo issue",
status: "todo",
priority: "medium",
assigneeAgentId: agentId,
issueNumber: 2,
identifier: "WTQ-2",
},
{
id: reviewIssueId,
companyId,
title: "Copied assigned review issue",
status: "in_review",
priority: "medium",
assigneeAgentId: idleAgentId,
issueNumber: 3,
identifier: "WTQ-3",
},
{
id: userIssueId,
companyId,
title: "Copied user issue",
status: "todo",
priority: "medium",
assigneeUserId: "user-1",
issueNumber: 4,
identifier: "WTQ-4",
},
]);
await expect(quarantineSeededWorktreeExecutionState(tempDb.connectionString)).resolves.toEqual({
disabledTimerHeartbeats: 1,
resetRunningAgents: 1,
quarantinedInProgressIssues: 1,
unassignedTodoIssues: 1,
unassignedReviewIssues: 1,
});
const [quarantinedAgent] = await db.select().from(agents).where(eq(agents.id, agentId));
expect(quarantinedAgent?.status).toBe("idle");
expect(quarantinedAgent?.runtimeConfig).toMatchObject({
heartbeat: { enabled: false, intervalSec: 60 },
wakeOnDemand: true,
});
const [inProgressIssue] = await db.select().from(issues).where(eq(issues.id, inProgressIssueId));
expect(inProgressIssue?.status).toBe("blocked");
expect(inProgressIssue?.assigneeAgentId).toBeNull();
expect(inProgressIssue?.executionAgentNameKey).toBeNull();
expect(inProgressIssue?.executionLockedAt).toBeNull();
const [todoIssue] = await db.select().from(issues).where(eq(issues.id, todoIssueId));
expect(todoIssue?.status).toBe("todo");
expect(todoIssue?.assigneeAgentId).toBeNull();
const [reviewIssue] = await db.select().from(issues).where(eq(issues.id, reviewIssueId));
expect(reviewIssue?.status).toBe("in_review");
expect(reviewIssue?.assigneeAgentId).toBeNull();
const [userIssue] = await db.select().from(issues).where(eq(issues.id, userIssueId));
expect(userIssue?.status).toBe("todo");
expect(userIssue?.assigneeUserId).toBe("user-1");
const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, inProgressIssueId));
expect(comments).toHaveLength(1);
expect(comments[0]?.body).toContain("Quarantined during worktree seed");
} finally {
await db.$client?.end?.({ timeout: 5 }).catch(() => undefined);
await tempDb.cleanup();
}
}, 20_000);
it("copies the source local_encrypted secrets key into the seeded worktree instance", () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-secrets-"));
const originalInlineMasterKey = process.env.PAPERCLIP_SECRETS_MASTER_KEY;
@@ -350,6 +512,136 @@ describe("worktree helpers", () => {
}
});
it("preserves repo-managed worktree checkouts when --force re-runs from the source repo", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-force-preserve-"));
const repoRoot = path.join(tempRoot, "repo");
const originalCwd = process.cwd();
try {
fs.mkdirSync(repoRoot, { recursive: true });
const repoConfigDir = path.join(repoRoot, ".paperclip");
fs.mkdirSync(repoConfigDir, { recursive: true });
fs.writeFileSync(path.join(repoConfigDir, "config.json"), "stale", "utf8");
fs.writeFileSync(path.join(repoConfigDir, ".env"), "STALE=1", "utf8");
// Simulate the repo-managed worktrees subfolder that holds every
// worktree checkout (the directory PAPA-358 reported as nuked).
const worktreesDir = path.join(repoConfigDir, "worktrees");
const checkoutDir = path.join(worktreesDir, "PAP-100-feature");
fs.mkdirSync(checkoutDir, { recursive: true });
const sentinelPath = path.join(checkoutDir, "sentinel.txt");
fs.writeFileSync(sentinelPath, "do-not-delete", "utf8");
process.chdir(repoRoot);
await worktreeInitCommand({
seed: false,
force: true,
fromConfig: path.join(tempRoot, "missing", "config.json"),
home: path.join(tempRoot, ".paperclip-worktrees"),
});
expect(fs.existsSync(sentinelPath)).toBe(true);
expect(fs.readFileSync(sentinelPath, "utf8")).toBe("do-not-delete");
expect(fs.existsSync(path.join(repoConfigDir, "config.json"))).toBe(true);
expect(fs.readFileSync(path.join(repoConfigDir, "config.json"), "utf8")).not.toBe("stale");
} finally {
process.chdir(originalCwd);
fs.rmSync(tempRoot, { recursive: true, force: true });
}
});
itEmbeddedPostgres(
"seeds authenticated users into minimally cloned worktree instances",
async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-auth-seed-"));
const worktreeRoot = path.join(tempRoot, "PAP-999-auth-seed");
const sourceHome = path.join(tempRoot, "source-home");
const sourceConfigDir = path.join(sourceHome, "instances", "source");
const sourceConfigPath = path.join(sourceConfigDir, "config.json");
const sourceEnvPath = path.join(sourceConfigDir, ".env");
const sourceKeyPath = path.join(sourceConfigDir, "secrets", "master.key");
const worktreeHome = path.join(tempRoot, ".paperclip-worktrees");
const originalCwd = process.cwd();
const sourceDb = await startEmbeddedPostgresTestDatabase("paperclip-worktree-auth-source-");
try {
const sourceDbClient = createDb(sourceDb.connectionString);
await sourceDbClient.insert(authUsers).values({
id: "user-existing",
email: "existing@paperclip.ing",
name: "Existing User",
emailVerified: true,
createdAt: new Date(),
updatedAt: new Date(),
});
fs.mkdirSync(path.dirname(sourceKeyPath), { recursive: true });
fs.mkdirSync(worktreeRoot, { recursive: true });
const sourceConfig = buildSourceConfig();
sourceConfig.database = {
mode: "postgres",
embeddedPostgresDataDir: path.join(sourceConfigDir, "db"),
embeddedPostgresPort: 54329,
backup: {
enabled: true,
intervalMinutes: 60,
retentionDays: 30,
dir: path.join(sourceConfigDir, "backups"),
},
connectionString: sourceDb.connectionString,
};
sourceConfig.logging.logDir = path.join(sourceConfigDir, "logs");
sourceConfig.storage.localDisk.baseDir = path.join(sourceConfigDir, "storage");
sourceConfig.secrets.localEncrypted.keyFilePath = sourceKeyPath;
fs.writeFileSync(sourceConfigPath, JSON.stringify(sourceConfig, null, 2) + "\n", "utf8");
fs.writeFileSync(sourceEnvPath, "", "utf8");
fs.writeFileSync(sourceKeyPath, "source-master-key", "utf8");
process.chdir(worktreeRoot);
await worktreeInitCommand({
name: "PAP-999-auth-seed",
home: worktreeHome,
fromConfig: sourceConfigPath,
force: true,
});
const targetConfig = JSON.parse(
fs.readFileSync(path.join(worktreeRoot, ".paperclip", "config.json"), "utf8"),
) as PaperclipConfig;
const { default: EmbeddedPostgres } = await import("embedded-postgres");
const targetPg = new EmbeddedPostgres({
databaseDir: targetConfig.database.embeddedPostgresDataDir,
user: "paperclip",
password: "paperclip",
port: targetConfig.database.embeddedPostgresPort,
persistent: true,
initdbFlags: ["--encoding=UTF8", "--locale=C", "--lc-messages=C"],
onLog: () => {},
onError: () => {},
});
await targetPg.start();
try {
const targetDb = createDb(
`postgres://paperclip:paperclip@127.0.0.1:${targetConfig.database.embeddedPostgresPort}/paperclip`,
);
const seededUsers = await targetDb.select().from(authUsers);
expect(seededUsers.some((row) => row.email === "existing@paperclip.ing")).toBe(true);
} finally {
await targetPg.stop();
}
} finally {
process.chdir(originalCwd);
await sourceDb.cleanup();
fs.rmSync(tempRoot, { recursive: true, force: true });
}
},
30000,
);
it("avoids ports already claimed by sibling worktree instance configs", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-claimed-ports-"));
const repoRoot = path.join(tempRoot, "repo");
@@ -629,7 +921,7 @@ describe("worktree helpers", () => {
}
fs.rmSync(tempRoot, { recursive: true, force: true });
}
}, 20_000);
}, 30_000);
it("restores the current worktree config and instance data if reseed fails", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-reseed-rollback-"));
@@ -786,7 +1078,7 @@ describe("worktree helpers", () => {
execFileSync("git", ["worktree", "remove", "--force", worktreePath], { cwd: repoRoot, stdio: "ignore" });
fs.rmSync(tempRoot, { recursive: true, force: true });
}
});
}, 15_000);
it("creates and initializes a worktree from the top-level worktree:make command", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-make-"));
@@ -822,4 +1114,246 @@ describe("worktree helpers", () => {
fs.rmSync(tempRoot, { recursive: true, force: true });
}
}, 20_000);
it("no-ops on the primary checkout unless --branch is provided", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-repair-primary-"));
const repoRoot = path.join(tempRoot, "repo");
const originalCwd = process.cwd();
try {
fs.mkdirSync(repoRoot, { recursive: true });
execFileSync("git", ["init"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["config", "user.email", "test@example.com"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["config", "user.name", "Test User"], { cwd: repoRoot, stdio: "ignore" });
fs.writeFileSync(path.join(repoRoot, "README.md"), "# temp\n", "utf8");
execFileSync("git", ["add", "README.md"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["commit", "-m", "Initial commit"], { cwd: repoRoot, stdio: "ignore" });
process.chdir(repoRoot);
await worktreeRepairCommand({});
expect(fs.existsSync(path.join(repoRoot, ".paperclip", "config.json"))).toBe(false);
expect(fs.existsSync(path.join(repoRoot, ".paperclip", "worktrees"))).toBe(false);
} finally {
process.chdir(originalCwd);
fs.rmSync(tempRoot, { recursive: true, force: true });
}
});
it("repairs the current linked worktree when Paperclip metadata is missing", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-repair-current-"));
const repoRoot = path.join(tempRoot, "repo");
const worktreePath = path.join(repoRoot, ".paperclip", "worktrees", "repair-me");
const sourceConfigPath = path.join(tempRoot, "source-config.json");
const worktreeHome = path.join(tempRoot, ".paperclip-worktrees");
const worktreePaths = resolveWorktreeLocalPaths({
cwd: worktreePath,
homeDir: worktreeHome,
instanceId: sanitizeWorktreeInstanceId(path.basename(worktreePath)),
});
const originalCwd = process.cwd();
try {
fs.mkdirSync(repoRoot, { recursive: true });
execFileSync("git", ["init"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["config", "user.email", "test@example.com"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["config", "user.name", "Test User"], { cwd: repoRoot, stdio: "ignore" });
fs.writeFileSync(path.join(repoRoot, "README.md"), "# temp\n", "utf8");
execFileSync("git", ["add", "README.md"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["commit", "-m", "Initial commit"], { cwd: repoRoot, stdio: "ignore" });
fs.mkdirSync(path.dirname(worktreePath), { recursive: true });
execFileSync("git", ["worktree", "add", "-b", "repair-me", worktreePath, "HEAD"], {
cwd: repoRoot,
stdio: "ignore",
});
fs.writeFileSync(sourceConfigPath, JSON.stringify(buildSourceConfig(), null, 2), "utf8");
fs.mkdirSync(worktreePaths.instanceRoot, { recursive: true });
fs.writeFileSync(path.join(worktreePaths.instanceRoot, "marker.txt"), "stale", "utf8");
process.chdir(worktreePath);
await worktreeRepairCommand({
fromConfig: sourceConfigPath,
home: worktreeHome,
noSeed: true,
});
expect(fs.existsSync(path.join(worktreePath, ".paperclip", "config.json"))).toBe(true);
expect(fs.existsSync(path.join(worktreePath, ".paperclip", ".env"))).toBe(true);
expect(fs.existsSync(path.join(worktreePaths.instanceRoot, "marker.txt"))).toBe(false);
} finally {
process.chdir(originalCwd);
fs.rmSync(tempRoot, { recursive: true, force: true });
}
}, 20_000);
it("creates and repairs a missing branch worktree when --branch is provided", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-repair-branch-"));
const repoRoot = path.join(tempRoot, "repo");
const sourceConfigPath = path.join(tempRoot, "source-config.json");
const worktreeHome = path.join(tempRoot, ".paperclip-worktrees");
const originalCwd = process.cwd();
const expectedWorktreePath = path.join(repoRoot, ".paperclip", "worktrees", "feature-repair-me");
try {
fs.mkdirSync(repoRoot, { recursive: true });
execFileSync("git", ["init"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["config", "user.email", "test@example.com"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["config", "user.name", "Test User"], { cwd: repoRoot, stdio: "ignore" });
fs.writeFileSync(path.join(repoRoot, "README.md"), "# temp\n", "utf8");
execFileSync("git", ["add", "README.md"], { cwd: repoRoot, stdio: "ignore" });
execFileSync("git", ["commit", "-m", "Initial commit"], { cwd: repoRoot, stdio: "ignore" });
fs.writeFileSync(sourceConfigPath, JSON.stringify(buildSourceConfig(), null, 2), "utf8");
process.chdir(repoRoot);
await worktreeRepairCommand({
branch: "feature/repair-me",
fromConfig: sourceConfigPath,
home: worktreeHome,
noSeed: true,
});
expect(fs.existsSync(path.join(expectedWorktreePath, ".git"))).toBe(true);
expect(fs.existsSync(path.join(expectedWorktreePath, ".paperclip", "config.json"))).toBe(true);
expect(fs.existsSync(path.join(expectedWorktreePath, ".paperclip", ".env"))).toBe(true);
} finally {
process.chdir(originalCwd);
fs.rmSync(tempRoot, { recursive: true, force: true });
}
}, 20_000);
});
describeEmbeddedPostgres("pauseSeededScheduledRoutines", () => {
it("pauses only routines with enabled schedule triggers", async () => {
const tempDb = await startEmbeddedPostgresTestDatabase("paperclip-worktree-routines-");
const db = createDb(tempDb.connectionString);
const companyId = randomUUID();
const projectId = randomUUID();
const agentId = randomUUID();
const activeScheduledRoutineId = randomUUID();
const activeApiRoutineId = randomUUID();
const pausedScheduledRoutineId = randomUUID();
const archivedScheduledRoutineId = randomUUID();
const disabledScheduleRoutineId = randomUUID();
try {
await db.insert(companies).values({
id: companyId,
name: "Paperclip",
issuePrefix: `T${companyId.replace(/-/g, "").slice(0, 6).toUpperCase()}`,
requireBoardApprovalForNewAgents: false,
});
await db.insert(agents).values({
id: agentId,
companyId,
name: "Coder",
adapterType: "process",
adapterConfig: {},
runtimeConfig: {},
permissions: {},
});
await db.insert(projects).values({
id: projectId,
companyId,
name: "Project",
status: "in_progress",
});
await db.insert(routines).values([
{
id: activeScheduledRoutineId,
companyId,
projectId,
assigneeAgentId: agentId,
title: "Active scheduled",
status: "active",
},
{
id: activeApiRoutineId,
companyId,
projectId,
assigneeAgentId: agentId,
title: "Active API",
status: "active",
},
{
id: pausedScheduledRoutineId,
companyId,
projectId,
assigneeAgentId: agentId,
title: "Paused scheduled",
status: "paused",
},
{
id: archivedScheduledRoutineId,
companyId,
projectId,
assigneeAgentId: agentId,
title: "Archived scheduled",
status: "archived",
},
{
id: disabledScheduleRoutineId,
companyId,
projectId,
assigneeAgentId: agentId,
title: "Disabled schedule",
status: "active",
},
]);
await db.insert(routineTriggers).values([
{
companyId,
routineId: activeScheduledRoutineId,
kind: "schedule",
enabled: true,
cronExpression: "0 9 * * *",
timezone: "UTC",
},
{
companyId,
routineId: activeApiRoutineId,
kind: "api",
enabled: true,
},
{
companyId,
routineId: pausedScheduledRoutineId,
kind: "schedule",
enabled: true,
cronExpression: "0 10 * * *",
timezone: "UTC",
},
{
companyId,
routineId: archivedScheduledRoutineId,
kind: "schedule",
enabled: true,
cronExpression: "0 11 * * *",
timezone: "UTC",
},
{
companyId,
routineId: disabledScheduleRoutineId,
kind: "schedule",
enabled: false,
cronExpression: "0 12 * * *",
timezone: "UTC",
},
]);
const pausedCount = await pauseSeededScheduledRoutines(tempDb.connectionString);
expect(pausedCount).toBe(1);
const rows = await db.select({ id: routines.id, status: routines.status }).from(routines);
const statusById = new Map(rows.map((row) => [row.id, row.status]));
expect(statusById.get(activeScheduledRoutineId)).toBe("paused");
expect(statusById.get(activeApiRoutineId)).toBe("active");
expect(statusById.get(pausedScheduledRoutineId)).toBe("paused");
expect(statusById.get(archivedScheduledRoutineId)).toBe("archived");
expect(statusById.get(disabledScheduleRoutineId)).toBe("active");
} finally {
await db.$client?.end?.({ timeout: 5 }).catch(() => undefined);
await tempDb.cleanup();
}
}, 20_000);
});
+21
View File
@@ -1,8 +1,11 @@
import type { CLIAdapterModule } from "@paperclipai/adapter-utils";
import { printAcpxStreamEvent } from "@paperclipai/adapter-acpx-local/cli";
import { printClaudeStreamEvent } from "@paperclipai/adapter-claude-local/cli";
import { printCodexStreamEvent } from "@paperclipai/adapter-codex-local/cli";
import { printCursorStreamEvent } from "@paperclipai/adapter-cursor-local/cli";
import { printCursorCloudEvent } from "@paperclipai/adapter-cursor-cloud/cli";
import { printGeminiStreamEvent } from "@paperclipai/adapter-gemini-local/cli";
import { printGrokStreamEvent } from "@paperclipai/adapter-grok-local/cli";
import { printOpenCodeStreamEvent } from "@paperclipai/adapter-opencode-local/cli";
import { printPiStreamEvent } from "@paperclipai/adapter-pi-local/cli";
import { printOpenClawGatewayStreamEvent } from "@paperclipai/adapter-openclaw-gateway/cli";
@@ -14,6 +17,11 @@ const claudeLocalCLIAdapter: CLIAdapterModule = {
formatStdoutEvent: printClaudeStreamEvent,
};
const acpxLocalCLIAdapter: CLIAdapterModule = {
type: "acpx_local",
formatStdoutEvent: printAcpxStreamEvent,
};
const codexLocalCLIAdapter: CLIAdapterModule = {
type: "codex_local",
formatStdoutEvent: printCodexStreamEvent,
@@ -34,11 +42,21 @@ const cursorLocalCLIAdapter: CLIAdapterModule = {
formatStdoutEvent: printCursorStreamEvent,
};
const cursorCloudCLIAdapter: CLIAdapterModule = {
type: "cursor_cloud",
formatStdoutEvent: printCursorCloudEvent,
};
const geminiLocalCLIAdapter: CLIAdapterModule = {
type: "gemini_local",
formatStdoutEvent: printGeminiStreamEvent,
};
const grokLocalCLIAdapter: CLIAdapterModule = {
type: "grok_local",
formatStdoutEvent: printGrokStreamEvent,
};
const openclawGatewayCLIAdapter: CLIAdapterModule = {
type: "openclaw_gateway",
formatStdoutEvent: printOpenClawGatewayStreamEvent,
@@ -46,12 +64,15 @@ const openclawGatewayCLIAdapter: CLIAdapterModule = {
const adaptersByType = new Map<string, CLIAdapterModule>(
[
acpxLocalCLIAdapter,
claudeLocalCLIAdapter,
codexLocalCLIAdapter,
openCodeLocalCLIAdapter,
piLocalCLIAdapter,
cursorLocalCLIAdapter,
cursorCloudCLIAdapter,
geminiLocalCLIAdapter,
grokLocalCLIAdapter,
openclawGatewayCLIAdapter,
processCLIAdapter,
httpCLIAdapter,
+6 -9
View File
@@ -1,24 +1,21 @@
import { inferBindModeFromHost } from "@paperclipai/shared";
import type { PaperclipConfig } from "../config/schema.js";
import type { CheckResult } from "./index.js";
function isLoopbackHost(host: string) {
const normalized = host.trim().toLowerCase();
return normalized === "127.0.0.1" || normalized === "localhost" || normalized === "::1";
}
export function deploymentAuthCheck(config: PaperclipConfig): CheckResult {
const mode = config.server.deploymentMode;
const exposure = config.server.exposure;
const auth = config.auth;
const bind = config.server.bind ?? inferBindModeFromHost(config.server.host);
if (mode === "local_trusted") {
if (!isLoopbackHost(config.server.host)) {
if (bind !== "loopback") {
return {
name: "Deployment/auth mode",
status: "fail",
message: `local_trusted requires loopback host binding (found ${config.server.host})`,
message: `local_trusted requires loopback binding (found ${bind})`,
canRepair: false,
repairHint: "Run `paperclipai configure --section server` and set host to 127.0.0.1",
repairHint: "Run `paperclipai configure --section server` and choose Local trusted / loopback reachability",
};
}
return {
@@ -86,6 +83,6 @@ export function deploymentAuthCheck(config: PaperclipConfig): CheckResult {
return {
name: "Deployment/auth mode",
status: "pass",
message: `Mode ${mode}/${exposure} with auth URL mode ${auth.baseUrlMode}`,
message: `Mode ${mode}/${exposure} with bind ${bind} and auth URL mode ${auth.baseUrlMode}`,
};
}
+98 -4
View File
@@ -5,6 +5,9 @@ import type { PaperclipConfig } from "../config/schema.js";
import type { CheckResult } from "./index.js";
import { resolveRuntimeLikePath } from "./path-resolver.js";
const AWS_CREDENTIAL_SOURCE_HINT =
"Provide AWS runtime credentials through the AWS SDK default credential chain: IAM role/workload identity, AWS_PROFILE/SSO/shared credentials, web identity, container/instance metadata, or short-lived shell credentials";
function decodeMasterKey(raw: string): Buffer | null {
const trimmed = raw.trim();
if (!trimmed) return null;
@@ -47,13 +50,16 @@ function withStrictModeNote(
export function secretsCheck(config: PaperclipConfig, configPath?: string): CheckResult {
const provider = config.secrets.provider;
if (provider === "aws_secrets_manager") {
return withStrictModeNote(awsSecretsManagerCheck(), config);
}
if (provider !== "local_encrypted") {
return {
name: "Secrets adapter",
status: "fail",
message: `${provider} is configured, but this build only supports local_encrypted`,
message: `${provider} is configured, but this build only supports local_encrypted and aws_secrets_manager`,
canRepair: false,
repairHint: "Run `paperclipai configure --section secrets` and set provider to local_encrypted",
repairHint: "Run `paperclipai configure --section secrets` and choose local_encrypted or aws_secrets_manager",
};
}
@@ -135,12 +141,100 @@ export function secretsCheck(config: PaperclipConfig, configPath?: string): Chec
};
}
const keyMode = fs.statSync(keyFilePath).mode & 0o777;
const permissionWarning =
(keyMode & 0o077) !== 0
? `; key file permissions are ${keyMode.toString(8)} (run chmod 600 ${keyFilePath})`
: "";
return withStrictModeNote(
{
name: "Secrets adapter",
status: "pass",
message: `Local encrypted provider configured with key file ${keyFilePath}`,
status: permissionWarning ? "warn" : "pass",
message: `Local encrypted provider configured with key file ${keyFilePath}${permissionWarning}`,
repairHint: permissionWarning
? "Restrict the local encrypted secrets key file to owner read/write permissions"
: undefined,
},
config,
);
}
function awsSecretsManagerCheck(): CheckResult {
const missingConfig = missingAwsSecretsManagerConfig();
if (missingConfig.length > 0) {
return {
name: "Secrets adapter",
status: "fail",
message: `AWS Secrets Manager provider is missing non-secret config: ${missingConfig.join(", ")}`,
canRepair: false,
repairHint:
`Set ${missingConfig.join(", ")} in the Paperclip server runtime. ${AWS_CREDENTIAL_SOURCE_HINT}. Do not store AWS root credentials or long-lived IAM user keys in Paperclip secrets.`,
};
}
const staticEnvCredentials =
process.env.AWS_ACCESS_KEY_ID?.trim() && process.env.AWS_SECRET_ACCESS_KEY?.trim();
const credentialSource = detectedAwsCredentialSources().join(", ");
const message =
`AWS Secrets Manager provider configured for deployment ${process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID}; ` +
`runtime credentials source: ${credentialSource || "AWS SDK default credential chain"}`;
if (staticEnvCredentials) {
return {
name: "Secrets adapter",
status: "warn",
message,
canRepair: false,
repairHint:
"AWS static environment credentials are visible. Use only short-lived shell credentials locally; prefer IAM role/workload identity for hosted deployments and never store AWS access keys in Paperclip company secrets.",
};
}
return {
name: "Secrets adapter",
status: "pass",
message,
};
}
function missingAwsSecretsManagerConfig(): string[] {
const missing: string[] = [];
if (
!(
process.env.PAPERCLIP_SECRETS_AWS_REGION?.trim() ||
process.env.AWS_REGION?.trim() ||
process.env.AWS_DEFAULT_REGION?.trim()
)
) {
missing.push("PAPERCLIP_SECRETS_AWS_REGION or AWS_REGION/AWS_DEFAULT_REGION");
}
if (!process.env.PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID?.trim()) {
missing.push("PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID");
}
if (!process.env.PAPERCLIP_SECRETS_AWS_KMS_KEY_ID?.trim()) {
missing.push("PAPERCLIP_SECRETS_AWS_KMS_KEY_ID");
}
return missing;
}
function detectedAwsCredentialSources(): string[] {
const sources: string[] = [];
if (process.env.AWS_PROFILE?.trim()) sources.push("AWS_PROFILE/shared config");
if (process.env.AWS_ACCESS_KEY_ID?.trim() && process.env.AWS_SECRET_ACCESS_KEY?.trim()) {
sources.push("temporary AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY environment credentials");
}
if (process.env.AWS_WEB_IDENTITY_TOKEN_FILE?.trim() && process.env.AWS_ROLE_ARN?.trim()) {
sources.push("AWS web identity token");
}
if (
process.env.AWS_CONTAINER_CREDENTIALS_RELATIVE_URI?.trim() ||
process.env.AWS_CONTAINER_CREDENTIALS_FULL_URI?.trim()
) {
sources.push("AWS container credentials endpoint");
}
if (process.env.AWS_SHARED_CREDENTIALS_FILE?.trim() || process.env.AWS_CONFIG_FILE?.trim()) {
sources.push("custom AWS shared credentials/config file");
}
return sources;
}
+7 -2
View File
@@ -3,6 +3,7 @@ import * as p from "@clack/prompts";
import pc from "picocolors";
import { and, eq, gt, isNull } from "drizzle-orm";
import { createDb, instanceUserRoles, invites } from "@paperclipai/db";
import { inferBindModeFromHost } from "@paperclipai/shared";
import { loadPaperclipEnvFile } from "../config/env.js";
import { readConfig, resolveConfigPath } from "../config/store.js";
@@ -40,9 +41,13 @@ function resolveBaseUrl(configPath?: string, explicitBaseUrl?: string) {
if (config?.auth.baseUrlMode === "explicit" && config.auth.publicBaseUrl) {
return config.auth.publicBaseUrl.replace(/\/+$/, "");
}
const host = config?.server.host ?? "localhost";
const bind = config?.server.bind ?? inferBindModeFromHost(config?.server.host);
const host =
bind === "custom"
? config?.server.customBindHost ?? config?.server.host ?? "localhost"
: config?.server.host ?? "localhost";
const port = config?.server.port ?? 3100;
const publicHost = host === "0.0.0.0" ? "localhost" : host;
const publicHost = host === "0.0.0.0" || bind === "lan" ? "localhost" : host;
return `http://${publicHost}:${port}`;
}
+177
View File
@@ -0,0 +1,177 @@
import fs from "node:fs";
import path from "node:path";
import { resolvePaperclipInstanceRoot } from "../../config/home.js";
export interface CloudConnectionTokenRecord {
id: string;
companyStackId: string;
targetOrigin: string;
sourceInstanceId: string;
sourceInstanceFingerprint: string;
scopes: string[];
expiresAt: string;
[key: string]: unknown;
}
export interface CloudConnection {
id: string;
remoteUrl: string;
targetOrigin: string;
targetHost: string;
stackId: string;
stackSlug?: string | null;
stackDisplayName?: string | null;
targetCompanyId: string;
accessToken: string;
token: CloudConnectionTokenRecord;
privateKeyPem: string;
sourcePublicKey: string;
sourceInstanceId: string;
sourceInstanceFingerprint: string;
scopes: string[];
createdAt: string;
updatedAt: string;
}
interface CloudConnectionStore {
version: 1;
connections: Record<string, CloudConnection>;
currentConnectionId?: string;
}
function defaultStore(): CloudConnectionStore {
return {
version: 1,
connections: {},
};
}
export function resolveCloudConnectionStorePath(): string {
return path.resolve(resolvePaperclipInstanceRoot(), "secrets", "cloud-upstream-connections.json");
}
export function readCloudConnectionStore(storePath = resolveCloudConnectionStorePath()): CloudConnectionStore {
if (!fs.existsSync(storePath)) return defaultStore();
const raw = JSON.parse(fs.readFileSync(storePath, "utf8")) as Partial<CloudConnectionStore> | null;
const connections: Record<string, CloudConnection> = {};
if (raw?.connections && typeof raw.connections === "object") {
for (const [id, value] of Object.entries(raw.connections)) {
const normalized = normalizeConnection(value);
if (normalized) connections[id] = normalized;
}
}
const currentConnectionId =
typeof raw?.currentConnectionId === "string" && connections[raw.currentConnectionId]
? raw.currentConnectionId
: Object.values(connections).sort((left, right) => right.updatedAt.localeCompare(left.updatedAt))[0]?.id;
return {
version: 1,
connections,
currentConnectionId,
};
}
export function writeCloudConnectionStore(
store: CloudConnectionStore,
storePath = resolveCloudConnectionStorePath(),
): void {
fs.mkdirSync(path.dirname(storePath), { recursive: true });
fs.writeFileSync(storePath, `${JSON.stringify(store, null, 2)}\n`, { mode: 0o600 });
}
export function upsertCloudConnection(
connection: CloudConnection,
storePath = resolveCloudConnectionStorePath(),
): CloudConnection {
const store = readCloudConnectionStore(storePath);
const existing = store.connections[connection.id];
const now = new Date().toISOString();
const next = {
...connection,
createdAt: existing?.createdAt ?? connection.createdAt ?? now,
updatedAt: now,
};
store.connections[next.id] = next;
store.currentConnectionId = next.id;
writeCloudConnectionStore(store, storePath);
return next;
}
export function getCloudConnection(
remoteUrlOrOrigin?: string,
storePath = resolveCloudConnectionStorePath(),
): CloudConnection | null {
const store = readCloudConnectionStore(storePath);
if (remoteUrlOrOrigin?.trim()) {
const needle = normalizeRemoteLookup(remoteUrlOrOrigin);
return Object.values(store.connections).find((connection) =>
normalizeRemoteLookup(connection.remoteUrl) === needle ||
normalizeRemoteLookup(connection.targetOrigin) === needle
) ?? null;
}
return store.currentConnectionId ? store.connections[store.currentConnectionId] ?? null : null;
}
function normalizeRemoteLookup(value: string): string {
try {
const url = new URL(value);
return url.origin.replace(/\/+$/u, "");
} catch {
return value.trim().replace(/\/+$/u, "");
}
}
function normalizeConnection(value: unknown): CloudConnection | null {
if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
const record = value as Record<string, unknown>;
const id = stringValue(record.id);
const remoteUrl = stringValue(record.remoteUrl);
const targetOrigin = stringValue(record.targetOrigin);
const targetHost = stringValue(record.targetHost);
const stackId = stringValue(record.stackId);
const targetCompanyId = stringValue(record.targetCompanyId);
const accessToken = stringValue(record.accessToken);
const token = typeof record.token === "object" && record.token !== null && !Array.isArray(record.token)
? record.token as CloudConnectionTokenRecord
: null;
const privateKeyPem = stringValue(record.privateKeyPem);
const sourcePublicKey = stringValue(record.sourcePublicKey);
const sourceInstanceId = stringValue(record.sourceInstanceId);
const sourceInstanceFingerprint = stringValue(record.sourceInstanceFingerprint);
const createdAt = stringValue(record.createdAt);
const updatedAt = stringValue(record.updatedAt);
if (
!id || !remoteUrl || !targetOrigin || !targetHost || !stackId || !targetCompanyId ||
!accessToken || !token || !privateKeyPem || !sourcePublicKey || !sourceInstanceId ||
!sourceInstanceFingerprint || !createdAt || !updatedAt
) {
return null;
}
return {
id,
remoteUrl,
targetOrigin,
targetHost,
stackId,
stackSlug: stringValue(record.stackSlug),
stackDisplayName: stringValue(record.stackDisplayName),
targetCompanyId,
accessToken,
token,
privateKeyPem,
sourcePublicKey,
sourceInstanceId,
sourceInstanceFingerprint,
scopes: stringArray(record.scopes),
createdAt,
updatedAt,
};
}
function stringValue(value: unknown): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
function stringArray(value: unknown): string[] {
return Array.isArray(value) ? value.filter((entry): entry is string => typeof entry === "string") : [];
}
+297
View File
@@ -0,0 +1,297 @@
import { createHash } from "node:crypto";
export const upstreamTransferSchema = {
family: "paperclip-upstream-transfer",
version: "1.0.0",
major: 1,
minor: 0,
} as const;
export type NormalizedSha256 = `sha256:${string}`;
export interface SourceEntityKey {
sourceInstanceId: string;
sourceCompanyId: string;
sourceEntityType: string;
sourceEntityId: string;
sourceNaturalKey?: string;
}
export interface UpstreamTransferWarning {
code: string;
severity: "info" | "warning" | "blocker";
message: string;
entity?: SourceEntityKey;
}
export interface UpstreamTransferEntityRecord {
key: SourceEntityKey;
contentHash: NormalizedSha256;
dependencies: SourceEntityKey[];
warnings: UpstreamTransferWarning[];
}
export interface UpstreamTransferManifestSource {
sourceInstanceId: string;
sourceCompanyId: string;
sourceInstanceKeyFingerprint: string;
exporterVersion: string;
sourceSchemaVersion: string;
}
export interface UpstreamTransferManifestTarget {
targetStackId: string;
targetCompanyId: string;
targetOrigin: string;
supportedSchemaMajor: number;
}
export interface UpstreamTransferChunk {
chunkIndex: number;
totalChunks: number;
byteLength: number;
sha256: NormalizedSha256;
manifestHash: NormalizedSha256;
}
export interface UpstreamTransferManifest {
schema: typeof upstreamTransferSchema;
source: UpstreamTransferManifestSource;
target: UpstreamTransferManifestTarget;
runId: string;
idempotencyKey: string;
generatedAt: string;
entityCount: number;
entities: UpstreamTransferEntityRecord[];
chunks: UpstreamTransferChunk[];
warnings: UpstreamTransferWarning[];
featureFlags: string[];
manifestHash: NormalizedSha256;
}
export interface LocalUpstreamExportEntityInput {
key: SourceEntityKey;
body: Record<string, unknown>;
dependencies?: SourceEntityKey[];
warnings?: UpstreamTransferWarning[];
conflictKeys?: string[];
}
export interface LocalUpstreamExportEntity {
record: UpstreamTransferEntityRecord;
body: Record<string, unknown>;
conflictKeys?: string[];
}
export interface LocalUpstreamExportChunk {
chunkIndex: number;
totalChunks: number;
byteLength: number;
sha256: NormalizedSha256;
payload: {
entityKeys: SourceEntityKey[];
};
}
export interface LocalUpstreamExportBundle {
manifest: UpstreamTransferManifest;
entities: LocalUpstreamExportEntity[];
chunks: LocalUpstreamExportChunk[];
}
export interface BuildLocalUpstreamExportBundleInput {
source: UpstreamTransferManifestSource;
target: UpstreamTransferManifestTarget;
runId: string;
idempotencyKey: string;
entities: LocalUpstreamExportEntityInput[];
warnings?: UpstreamTransferWarning[];
featureFlags?: string[];
maxEntitiesPerChunk?: number;
}
export interface LocalUpstreamPushCoordinatorOptions {
targetOrigin: string;
paperclipCompanyId: string;
fetch?: typeof fetch;
headers?: (input: { method: string; path: string }) => HeadersInit | Promise<HeadersInit>;
}
export class UpstreamImportRequestError extends Error {
readonly status: number;
readonly body: unknown;
constructor(status: number, message: string, body: unknown) {
super(message);
this.status = status;
this.body = body;
}
}
export class LocalUpstreamPushCoordinator {
readonly #targetOrigin: string;
readonly #paperclipCompanyId: string;
readonly #fetch: typeof fetch;
readonly #headers: NonNullable<LocalUpstreamPushCoordinatorOptions["headers"]>;
constructor(options: LocalUpstreamPushCoordinatorOptions) {
this.#targetOrigin = options.targetOrigin.replace(/\/+$/u, "");
this.#paperclipCompanyId = options.paperclipCompanyId;
this.#fetch = options.fetch ?? fetch;
this.#headers = options.headers ?? (() => ({}));
}
async preview(bundle: LocalUpstreamExportBundle): Promise<unknown> {
return this.post(`/api/companies/${encodeURIComponent(this.#paperclipCompanyId)}/upstream-imports/preview`, {
manifest: bundle.manifest,
entities: bundle.entities,
});
}
async apply(bundle: LocalUpstreamExportBundle): Promise<unknown> {
const run = await this.post(`/api/companies/${encodeURIComponent(this.#paperclipCompanyId)}/upstream-imports/runs`, {
mode: "apply",
manifest: bundle.manifest,
entities: bundle.entities,
}) as { run?: { id?: unknown } };
const runId = typeof run.run?.id === "string" ? run.run.id : undefined;
if (!runId) {
throw new Error("Remote upstream importer did not return a run id");
}
for (const chunk of bundle.chunks) {
await this.post(`/api/upstream-import-runs/${encodeURIComponent(runId)}/chunks`, chunk);
}
return this.post(`/api/upstream-import-runs/${encodeURIComponent(runId)}/apply`, {});
}
async events(runId: string): Promise<unknown> {
return this.get(`/api/upstream-import-runs/${encodeURIComponent(runId)}/events`);
}
private async get(path: string): Promise<unknown> {
const response = await this.#fetch(`${this.#targetOrigin}${path}`, {
method: "GET",
headers: await this.#headers({ method: "GET", path }),
});
return parseCoordinatorResponse(response);
}
private async post(path: string, body: unknown): Promise<unknown> {
const response = await this.#fetch(`${this.#targetOrigin}${path}`, {
method: "POST",
headers: {
"Content-Type": "application/json",
...(await this.#headers({ method: "POST", path })),
},
body: JSON.stringify(body),
});
return parseCoordinatorResponse(response);
}
}
export function buildLocalUpstreamExportBundle(
input: BuildLocalUpstreamExportBundleInput,
): LocalUpstreamExportBundle {
const entities = input.entities.map<LocalUpstreamExportEntity>((entity) => ({
record: {
key: entity.key,
contentHash: normalizedContentHash(entity.body),
dependencies: entity.dependencies ?? [],
warnings: entity.warnings ?? [],
},
body: entity.body,
conflictKeys: entity.conflictKeys,
}));
const chunks = buildLocalChunks(entities, input.maxEntitiesPerChunk ?? 100);
const manifestWithoutHash = {
schema: upstreamTransferSchema,
source: input.source,
target: input.target,
runId: input.runId,
idempotencyKey: input.idempotencyKey,
generatedAt: new Date(0).toISOString(),
entityCount: entities.length,
entities: entities.map((entity) => entity.record),
chunks: chunks.map(({ payload: _payload, ...chunk }) => chunk),
warnings: input.warnings ?? [],
featureFlags: (input.featureFlags ?? ["cloud_sync"]).slice().sort(),
};
const manifestHash = normalizedContentHash(manifestWithoutHash);
return {
manifest: {
...manifestWithoutHash,
chunks: manifestWithoutHash.chunks.map((chunk) => ({ ...chunk, manifestHash })),
manifestHash,
},
entities,
chunks,
};
}
export function normalizedContentHash(value: unknown): NormalizedSha256 {
return `sha256:${createHash("sha256").update(canonicalJson(value)).digest("hex")}`;
}
export function canonicalJson(value: unknown): string {
return JSON.stringify(sortJson(value));
}
function buildLocalChunks(
entities: LocalUpstreamExportEntity[],
maxEntitiesPerChunk: number,
): LocalUpstreamExportChunk[] {
if (!Number.isInteger(maxEntitiesPerChunk) || maxEntitiesPerChunk < 1) {
throw new Error("maxEntitiesPerChunk must be a positive integer");
}
if (entities.length === 0) return [];
const groups: LocalUpstreamExportEntity[][] = [];
for (let index = 0; index < entities.length; index += maxEntitiesPerChunk) {
groups.push(entities.slice(index, index + maxEntitiesPerChunk));
}
return groups.map((group, index) => {
const payload = {
entityKeys: group.map((entity) => entity.record.key),
};
return {
chunkIndex: index,
totalChunks: groups.length,
byteLength: Buffer.byteLength(canonicalJson(payload)),
sha256: normalizedContentHash(payload),
payload,
};
});
}
function sortJson(value: unknown): unknown {
if (Array.isArray(value)) return value.map(sortJson);
if (typeof value !== "object" || value === null) return value;
return Object.fromEntries(
Object.entries(value as Record<string, unknown>)
.sort(([left], [right]) => left.localeCompare(right))
.map(([key, entry]) => [key, sortJson(entry)]),
);
}
async function parseCoordinatorResponse(response: Response): Promise<unknown> {
const text = await response.text();
const parsed = text.trim() ? safeParseJson(text) : {};
if (!response.ok) {
const message = typeof parsed === "object" && parsed !== null && "error" in parsed
? String((parsed as { error: unknown }).error)
: `Upstream importer request failed with ${response.status}`;
throw new UpstreamImportRequestError(response.status, message, parsed);
}
return parsed;
}
function safeParseJson(text: string): unknown {
try {
return JSON.parse(text);
} catch {
return text;
}
}
+721
View File
@@ -0,0 +1,721 @@
import { createHash, generateKeyPairSync, randomBytes, randomUUID, sign } from "node:crypto";
import { createServer, type Server } from "node:http";
import { URL } from "node:url";
import { Command } from "commander";
import pc from "picocolors";
import type {
CompanyPortabilityExportResult,
CompanyPortabilityFileEntry,
InstanceExperimentalSettings,
} from "@paperclipai/shared";
import { openUrl } from "../../client/board-auth.js";
import { resolvePaperclipInstanceId } from "../../config/home.js";
import {
addCommonClientOptions,
handleCommandError,
printOutput,
resolveCommandContext,
type BaseClientOptions,
} from "./common.js";
import {
buildLocalUpstreamExportBundle,
LocalUpstreamPushCoordinator,
normalizedContentHash,
upstreamTransferSchema,
UpstreamImportRequestError,
type LocalUpstreamExportBundle,
type LocalUpstreamExportEntityInput,
type SourceEntityKey,
type UpstreamTransferManifestSource,
type UpstreamTransferManifestTarget,
type UpstreamTransferWarning,
} from "./cloud-transfer.js";
import {
getCloudConnection,
upsertCloudConnection,
type CloudConnection,
type CloudConnectionTokenRecord,
} from "./cloud-store.js";
const CLOUD_SYNC_CONFLICT_EXIT_CODE = 2;
const CLOUD_SYNC_SCHEMA_MISMATCH_EXIT_CODE = 3;
const CLOUD_SYNC_SCOPES = ["upstream_import:preview", "upstream_import:write", "upstream_import:read"];
const DEVICE_CODE_FALLBACK_EXPIRES_MS = 15 * 60_000;
interface CloudConnectOptions extends BaseClientOptions {
noBrowser?: boolean;
}
interface CloudPushOptions extends BaseClientOptions {
company?: string;
remoteUrl?: string;
dryRun?: boolean;
maxEntitiesPerChunk?: number;
}
interface UpstreamDiscovery {
schema: string;
stack: {
id: string;
slug?: string;
displayName?: string;
companyId: string;
origin: string;
};
auth: {
pkce?: {
authorizeUrl: string;
tokenUrl: string;
codeChallengeMethod: string;
};
deviceCode?: {
deviceCodeUrl: string;
verificationUrl: string;
tokenUrl: string;
};
scopes?: string[];
};
transfer: {
supportedSchemaMajor: number;
featureFlags?: string[];
};
}
interface TokenResponse {
accessToken: string;
token: CloudConnectionTokenRecord;
scopes?: string[];
expiresAt?: string;
}
class CloudAuthRequestError extends Error {
readonly status: number;
readonly body: unknown;
constructor(status: number, message: string, body: unknown) {
super(message);
this.status = status;
this.body = body;
}
}
export function registerCloudCommands(program: Command): void {
const cloud = program.command("cloud").description("Paperclip Cloud upstream sync commands");
addCommonClientOptions(
cloud
.command("connect")
.description("Authorize this local instance to push into a Paperclip Cloud stack")
.argument("<remote-url>", "Paperclip Cloud stack URL")
.option("--no-browser", "Use the device-code flow instead of opening a browser", false)
.action(async (remoteUrl: string, opts: CloudConnectOptions) => {
try {
await connectCloud(remoteUrl, opts);
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
cloud
.command("push")
.description("Preview or apply a local company push into the connected Paperclip Cloud stack")
.requiredOption("--company <local-company-id>", "Local company ID to export")
.option("--remote-url <remote-url>", "Use a specific stored cloud connection")
.option("--dry-run", "Preview without applying", false)
.option("--max-entities-per-chunk <count>", "Chunk size for upstream uploads", (value) => Number(value), 100)
.action(async (opts: CloudPushOptions) => {
try {
await pushCloud(opts);
} catch (err) {
if (isSchemaMismatchError(err)) {
console.error(pc.red(err instanceof Error ? err.message : String(err)));
process.exitCode = CLOUD_SYNC_SCHEMA_MISMATCH_EXIT_CODE;
return;
}
handleCommandError(err);
}
}),
);
}
export async function connectCloud(remoteUrl: string, opts: CloudConnectOptions = {}): Promise<CloudConnection> {
const ctx = resolveCommandContext(opts);
const discovery = await discoverUpstream(remoteUrl);
assertDiscoveryCompatible(discovery);
const source = createSourceIdentity();
const token = await authorizeConnection(discovery, source, {
noBrowser: Boolean(opts.noBrowser),
});
const targetOrigin = discovery.stack.origin.replace(/\/+$/u, "");
const targetHost = new URL(targetOrigin).host;
const now = new Date().toISOString();
const connection = upsertCloudConnection({
id: connectionId(targetOrigin),
remoteUrl,
targetOrigin,
targetHost,
stackId: discovery.stack.id,
stackSlug: discovery.stack.slug ?? null,
stackDisplayName: discovery.stack.displayName ?? null,
targetCompanyId: discovery.stack.companyId,
accessToken: token.accessToken,
token: token.token,
privateKeyPem: source.privateKeyPem,
sourcePublicKey: source.sourcePublicKey,
sourceInstanceId: source.sourceInstanceId,
sourceInstanceFingerprint: source.sourceInstanceFingerprint,
scopes: token.scopes ?? token.token.scopes ?? CLOUD_SYNC_SCOPES,
createdAt: now,
updatedAt: now,
});
if (ctx.json) {
printOutput(redactConnection(connection), { json: true });
} else {
console.log(pc.bold("Connected to Paperclip Cloud"));
console.log(`stack=${connection.stackDisplayName ?? connection.stackSlug ?? connection.stackId}`);
console.log(`origin=${connection.targetOrigin}`);
console.log(`company=${connection.targetCompanyId}`);
}
return connection;
}
export async function pushCloud(opts: CloudPushOptions): Promise<unknown> {
const ctx = resolveCommandContext(opts, { requireCompany: false });
const localCompanyId = requiredString(opts.company, "--company");
await assertCloudSyncEnabled(ctx.api.get<InstanceExperimentalSettings>("/api/instance/settings/experimental"));
const connection = getCloudConnection(opts.remoteUrl);
if (!connection) {
throw new Error("No cloud connection found. Run `paperclipai cloud connect <remote-url>` first.");
}
const discovery = await discoverUpstream(connection.targetOrigin);
assertDiscoveryCompatible(discovery);
const bundle = await buildBundleFromLocalCompany({
localCompanyId,
connection,
discovery,
localApi: ctx.api,
maxEntitiesPerChunk: opts.maxEntitiesPerChunk,
mode: opts.dryRun ? "preview" : "apply",
});
const coordinator = new LocalUpstreamPushCoordinator({
targetOrigin: connection.targetOrigin,
paperclipCompanyId: connection.targetCompanyId,
headers: ({ method, path }) => cloudProofHeaders(connection, method, path),
});
const result = opts.dryRun ? await coordinator.preview(bundle) : await coordinator.apply(bundle);
const runId = getRunId(result);
const events = !opts.dryRun && runId ? await coordinator.events(runId).catch(() => null) : null;
const summary = summarizeResult(result);
const conflictCount = summary.conflict + summary.staleMapping;
if (ctx.json) {
printOutput({ result, events }, { json: true });
} else {
console.log(pc.bold(opts.dryRun ? "Cloud Push Preview" : "Cloud Push Applied"));
console.log(`run=${runId ?? "-"}`);
console.log(`manifest=${bundle.manifest.manifestHash}`);
console.log(
`create=${summary.create} update=${summary.update} adopt=${summary.adopt} ` +
`skip=${summary.skip} conflict=${summary.conflict} staleMapping=${summary.staleMapping}`,
);
printWarnings(result);
printConflicts(result);
printEvents(events);
}
if (conflictCount > 0) {
process.exitCode = CLOUD_SYNC_CONFLICT_EXIT_CODE;
}
return result;
}
export async function discoverUpstream(remoteUrl: string): Promise<UpstreamDiscovery> {
const base = new URL(remoteUrl);
const discoveryUrl = new URL("/.well-known/paperclip-upstream", base);
return requestCloudJson<UpstreamDiscovery>(discoveryUrl.toString(), { method: "GET" });
}
export function assertDiscoveryCompatible(discovery: UpstreamDiscovery): void {
if (discovery.schema !== "paperclip-upstream-discovery-v1") {
throw new Error("Remote URL is not a Paperclip Cloud upstream target.");
}
if (discovery.transfer.supportedSchemaMajor !== upstreamTransferSchema.major) {
throw new Error(
`Cloud upstream schema mismatch: local major ${upstreamTransferSchema.major}, remote supports ${discovery.transfer.supportedSchemaMajor}.`,
);
}
if (!discovery.transfer.featureFlags?.includes("cloud_sync")) {
throw new Error("Remote Paperclip Cloud stack does not advertise the cloud_sync transfer flag.");
}
}
export function resolveDeviceCodeExpiresAt(expiresAt: string | undefined, nowMs = Date.now()): number {
const parsed = typeof expiresAt === "string" ? Date.parse(expiresAt) : NaN;
return Number.isFinite(parsed) ? parsed : nowMs + DEVICE_CODE_FALLBACK_EXPIRES_MS;
}
export async function buildBundleFromLocalCompany(input: {
localCompanyId: string;
connection: CloudConnection;
discovery: UpstreamDiscovery;
localApi: {
post<T>(path: string, body?: unknown): Promise<T | null>;
};
maxEntitiesPerChunk?: number;
mode: "preview" | "apply";
}): Promise<LocalUpstreamExportBundle> {
const exported = await input.localApi.post<CompanyPortabilityExportResult>(
`/api/companies/${input.localCompanyId}/export`,
{
include: {
company: true,
agents: true,
projects: true,
issues: true,
skills: true,
},
expandReferencedSkills: true,
},
);
if (!exported) throw new Error("Local company export returned no data.");
const sourceHash = normalizedContentHash({
manifest: exported.manifest,
files: exported.files,
});
const source: UpstreamTransferManifestSource = {
sourceInstanceId: input.connection.sourceInstanceId,
sourceCompanyId: input.localCompanyId,
sourceInstanceKeyFingerprint: input.connection.sourceInstanceFingerprint,
exporterVersion: "paperclipai-cli-cloud-v1",
sourceSchemaVersion: "paperclip-local-portability-v1",
};
const target: UpstreamTransferManifestTarget = {
targetStackId: input.discovery.stack.id,
targetCompanyId: input.discovery.stack.companyId,
targetOrigin: input.discovery.stack.origin,
supportedSchemaMajor: input.discovery.transfer.supportedSchemaMajor,
};
const entities = buildEntitiesFromPortableExport(input.localCompanyId, input.connection.sourceInstanceId, exported);
const idempotencyKey = [
input.mode,
input.connection.sourceInstanceId,
input.localCompanyId,
input.discovery.stack.id,
sourceHash,
].join(":");
return buildLocalUpstreamExportBundle({
source,
target,
runId: `local-${input.mode}-${shortHash(idempotencyKey)}`,
idempotencyKey,
entities,
warnings: exported.warnings.map((message): UpstreamTransferWarning => ({
code: "local_company_export_warning",
severity: "warning",
message,
})),
featureFlags: ["cloud_sync"],
maxEntitiesPerChunk: input.maxEntitiesPerChunk,
});
}
async function authorizeConnection(
discovery: UpstreamDiscovery,
source: ReturnType<typeof createSourceIdentity>,
opts: { noBrowser: boolean },
): Promise<TokenResponse> {
if (!opts.noBrowser && canOpenBrowser() && discovery.auth.pkce) {
try {
return await authorizeWithBrowser(discovery, source);
} catch (error) {
console.error(pc.yellow(`Browser authorization failed; falling back to device-code flow. ${errorMessage(error)}`));
}
}
if (!discovery.auth.deviceCode) {
throw new Error("Remote Paperclip Cloud stack does not support device-code authorization.");
}
return authorizeWithDeviceCode(discovery, source, { openBrowser: !opts.noBrowser && canOpenBrowser() });
}
async function authorizeWithBrowser(
discovery: UpstreamDiscovery,
source: ReturnType<typeof createSourceIdentity>,
): Promise<TokenResponse> {
const pkce = discovery.auth.pkce;
if (!pkce) throw new Error("Remote did not advertise PKCE authorization.");
const callback = await startPkceCallbackServer();
const verifier = randomBytes(32).toString("base64url");
const challenge = createHash("sha256").update(verifier).digest("base64url");
const state = randomUUID();
const authorizeUrl = new URL(pkce.authorizeUrl);
authorizeUrl.searchParams.set("redirectUri", callback.redirectUri);
authorizeUrl.searchParams.set("state", state);
authorizeUrl.searchParams.set("codeChallenge", challenge);
authorizeUrl.searchParams.set("codeChallengeMethod", "S256");
authorizeUrl.searchParams.set("sourceInstanceId", source.sourceInstanceId);
authorizeUrl.searchParams.set("sourceInstanceFingerprint", source.sourceInstanceFingerprint);
authorizeUrl.searchParams.set("sourcePublicKey", source.sourcePublicKey);
authorizeUrl.searchParams.set("scopes", CLOUD_SYNC_SCOPES.join(" "));
try {
console.error(`Open this URL to approve cloud sync:\n${authorizeUrl.toString()}`);
if (!openUrl(authorizeUrl.toString())) {
throw new Error("Could not open a browser.");
}
const code = await callback.waitForCode(state);
return requestCloudJson<TokenResponse>(pkce.tokenUrl, {
method: "POST",
body: JSON.stringify({
grantType: "authorization_code",
code,
redirectUri: callback.redirectUri,
codeVerifier: verifier,
}),
});
} finally {
await callback.close();
}
}
async function authorizeWithDeviceCode(
discovery: UpstreamDiscovery,
source: ReturnType<typeof createSourceIdentity>,
opts: { openBrowser: boolean },
): Promise<TokenResponse> {
const device = discovery.auth.deviceCode;
if (!device) throw new Error("Remote did not advertise device-code authorization.");
const response = await requestCloudJson<{
deviceCode: string;
userCode: string;
verificationUri: string;
expiresAt?: string;
intervalSeconds?: number;
}>(device.deviceCodeUrl, {
method: "POST",
body: JSON.stringify({
stackId: discovery.stack.id,
sourceInstanceId: source.sourceInstanceId,
sourceInstanceFingerprint: source.sourceInstanceFingerprint,
sourcePublicKey: source.sourcePublicKey,
scopes: CLOUD_SYNC_SCOPES,
}),
});
console.error(pc.bold("Cloud device authorization required"));
console.error(`Open: ${response.verificationUri}`);
console.error(`Code: ${response.userCode}`);
if (opts.openBrowser) openUrl(response.verificationUri);
const expiresAt = resolveDeviceCodeExpiresAt(response.expiresAt);
const intervalMs = Math.max(500, (response.intervalSeconds ?? 5) * 1000);
while (Date.now() < expiresAt) {
await sleep(intervalMs);
try {
return await requestCloudJson<TokenResponse>(device.tokenUrl, {
method: "POST",
body: JSON.stringify({
grantType: "device_code",
deviceCode: response.deviceCode,
}),
});
} catch (error) {
if (error instanceof CloudAuthRequestError && error.body && typeof error.body === "object") {
const code = (error.body as { error?: unknown }).error;
if (code === "authorization_pending") continue;
}
throw error;
}
}
throw new Error("Device-code authorization expired before it was approved.");
}
function buildEntitiesFromPortableExport(
localCompanyId: string,
sourceInstanceId: string,
exported: CompanyPortabilityExportResult,
): LocalUpstreamExportEntityInput[] {
const companyKey: SourceEntityKey = {
sourceInstanceId,
sourceCompanyId: localCompanyId,
sourceEntityType: "company",
sourceEntityId: localCompanyId,
sourceNaturalKey: exported.manifest.company?.name ?? localCompanyId,
};
const entities: LocalUpstreamExportEntityInput[] = [
{
key: companyKey,
body: {
kind: "paperclip_company_portability_manifest",
manifest: exported.manifest,
rootPath: exported.rootPath,
paperclipExtensionPath: exported.paperclipExtensionPath,
fileCount: Object.keys(exported.files).length,
},
conflictKeys: [`company:${companyKey.sourceNaturalKey ?? localCompanyId}`],
},
];
for (const [filePath, entry] of Object.entries(exported.files).sort(([left], [right]) => left.localeCompare(right))) {
entities.push({
key: {
sourceInstanceId,
sourceCompanyId: localCompanyId,
sourceEntityType: "company_setting",
sourceEntityId: shortHash(filePath),
sourceNaturalKey: filePath,
},
body: {
kind: "paperclip_portable_file",
path: filePath,
entry: normalizePortableFileEntry(entry),
},
dependencies: [companyKey],
conflictKeys: [`portable_file:${filePath}`],
});
}
return entities;
}
function normalizePortableFileEntry(entry: CompanyPortabilityFileEntry): Record<string, unknown> {
if (typeof entry === "string") {
return { encoding: "utf8", data: entry };
}
return { ...entry };
}
async function assertCloudSyncEnabled(settingsPromise: Promise<InstanceExperimentalSettings | null>): Promise<void> {
const settings = await settingsPromise;
if (settings?.enableCloudSync !== true) {
throw new Error(
"Cloud sync is disabled. Enable the cloud sync experimental setting before running `paperclipai cloud push`.",
);
}
}
function cloudProofHeaders(connection: CloudConnection, method: string, pathAndSearch: string): Record<string, string> {
const timestamp = new Date().toISOString();
const nonce = randomUUID();
const payload = [
method,
connection.targetHost.toLowerCase(),
pathAndSearch,
connection.token.id,
connection.sourceInstanceId,
timestamp,
nonce,
].join("\n");
return {
Authorization: `Bearer ${connection.accessToken}`,
"X-Paperclip-Upstream-Source-Instance-Id": connection.sourceInstanceId,
"X-Paperclip-Upstream-Proof-Timestamp": timestamp,
"X-Paperclip-Upstream-Proof-Nonce": nonce,
"X-Paperclip-Upstream-Proof-Signature": sign(
null,
Buffer.from(payload, "utf8"),
connection.privateKeyPem,
).toString("base64url"),
};
}
async function requestCloudJson<T>(url: string, init: RequestInit): Promise<T> {
const headers = new Headers(init.headers);
headers.set("accept", "application/json");
if (init.body !== undefined && !headers.has("content-type")) {
headers.set("content-type", "application/json");
}
const response = await fetch(url, { ...init, headers });
const text = await response.text();
const parsed = text.trim() ? JSON.parse(text) as unknown : {};
if (!response.ok) {
const message = typeof parsed === "object" && parsed !== null && "error" in parsed
? String((parsed as { error: unknown }).error)
: `Cloud request failed with ${response.status}`;
throw new CloudAuthRequestError(response.status, message, parsed);
}
return parsed as T;
}
function createSourceIdentity() {
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const sourcePublicKey = publicKey.export({ type: "spki", format: "pem" }).toString();
const sourceInstanceFingerprint = `sha256:${createHash("sha256")
.update(publicKey.export({ type: "spki", format: "der" }))
.digest("hex")}`;
return {
sourceInstanceId: `paperclip-local-${resolvePaperclipInstanceId()}`,
sourceInstanceFingerprint,
sourcePublicKey,
privateKeyPem: privateKey.export({ type: "pkcs8", format: "pem" }).toString(),
};
}
async function startPkceCallbackServer(): Promise<{
redirectUri: string;
waitForCode: (state: string) => Promise<string>;
close: () => Promise<void>;
}> {
let resolveCode: ((code: string) => void) | null = null;
let rejectCode: ((error: Error) => void) | null = null;
let expectedState = "";
const codePromise = new Promise<string>((resolve, reject) => {
resolveCode = resolve;
rejectCode = reject;
});
const server = createServer((req, res) => {
const url = new URL(req.url ?? "/", "http://127.0.0.1");
const code = url.searchParams.get("code");
const state = url.searchParams.get("state");
if (!code || state !== expectedState) {
res.writeHead(400, { "Content-Type": "text/plain" });
res.end("Paperclip Cloud authorization failed. You can close this tab.");
rejectCode?.(new Error("Authorization callback was missing a valid code or state."));
return;
}
res.writeHead(200, { "Content-Type": "text/plain" });
res.end("Paperclip Cloud authorization complete. You can close this tab.");
resolveCode?.(code);
});
await listenOnLoopback(server);
const address = server.address();
if (typeof address !== "object" || !address?.port) {
throw new Error("Failed to start local authorization callback server.");
}
return {
redirectUri: `http://127.0.0.1:${address.port}/cloud/callback`,
waitForCode: (state: string) => {
expectedState = state;
return codePromise;
},
close: () => closeServer(server),
};
}
function listenOnLoopback(server: Server): Promise<void> {
return new Promise((resolve, reject) => {
server.once("error", reject);
server.listen(0, "127.0.0.1", () => {
server.off("error", reject);
resolve();
});
});
}
function closeServer(server: Server): Promise<void> {
return new Promise((resolve, reject) => {
server.close((error) => error ? reject(error) : resolve());
});
}
function canOpenBrowser(): boolean {
if (process.platform === "darwin" || process.platform === "win32") return true;
return Boolean(process.env.DISPLAY || process.env.WAYLAND_DISPLAY);
}
function summarizeResult(result: unknown): {
create: number;
update: number;
adopt: number;
skip: number;
conflict: number;
staleMapping: number;
} {
const summary = asRecord(asRecord(result)?.summary);
return {
create: numberValue(summary?.create),
update: numberValue(summary?.update),
adopt: numberValue(summary?.adopt),
skip: numberValue(summary?.skip),
conflict: numberValue(summary?.conflict),
staleMapping: numberValue(summary?.staleMapping),
};
}
function printWarnings(result: unknown): void {
const warnings = Array.isArray(asRecord(result)?.warnings) ? asRecord(result)?.warnings as unknown[] : [];
for (const warning of warnings) {
const record = asRecord(warning);
console.log(pc.yellow(`warning=${record?.code ?? "warning"} ${record?.message ?? ""}`.trim()));
}
}
function printConflicts(result: unknown): void {
const conflicts = Array.isArray(asRecord(result)?.conflicts) ? asRecord(result)?.conflicts as unknown[] : [];
for (const conflict of conflicts.slice(0, 10)) {
const record = asRecord(conflict);
console.log(pc.red(`conflict=${record?.conflictKind ?? "target_conflict"} target=${record?.targetEntityId ?? "-"}`));
}
if (conflicts.length > 10) console.log(pc.red(`conflicts_truncated=${conflicts.length - 10}`));
}
function printEvents(events: unknown): void {
const rows = Array.isArray(asRecord(events)?.events) ? asRecord(events)?.events as unknown[] : [];
for (const row of rows.slice(-10)) {
const event = asRecord(row);
console.log(pc.dim(`event=${event?.action ?? "-"} target=${event?.targetEntityId ?? "-"}`));
}
}
function getRunId(result: unknown): string | null {
const run = asRecord(asRecord(result)?.run);
return typeof run?.id === "string" ? run.id : null;
}
function redactConnection(connection: CloudConnection): Record<string, unknown> {
return {
id: connection.id,
remoteUrl: connection.remoteUrl,
targetOrigin: connection.targetOrigin,
stackId: connection.stackId,
targetCompanyId: connection.targetCompanyId,
scopes: connection.scopes,
expiresAt: connection.token.expiresAt,
};
}
function connectionId(targetOrigin: string): string {
return `cloud-${shortHash(targetOrigin)}`;
}
function shortHash(value: string): string {
return createHash("sha256").update(value).digest("hex").slice(0, 16);
}
function requiredString(value: unknown, label: string): string {
if (typeof value === "string" && value.trim()) return value.trim();
throw new Error(`${label} is required.`);
}
function numberValue(value: unknown): number {
return typeof value === "number" && Number.isFinite(value) ? value : 0;
}
function asRecord(value: unknown): Record<string, unknown> | null {
return typeof value === "object" && value !== null && !Array.isArray(value)
? value as Record<string, unknown>
: null;
}
function isSchemaMismatchError(error: unknown): boolean {
if (error instanceof UpstreamImportRequestError) {
return JSON.stringify(error.body).toLowerCase().includes("schema");
}
return error instanceof Error && error.message.toLowerCase().includes("schema mismatch");
}
function errorMessage(error: unknown): string {
return error instanceof Error ? error.message : String(error);
}
function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
export const cloudCommandExitCodes = {
conflict: CLOUD_SYNC_CONFLICT_EXIT_CODE,
schemaMismatch: CLOUD_SYNC_SCHEMA_MISMATCH_EXIT_CODE,
} as const;
+3
View File
@@ -61,6 +61,7 @@ interface IssueUpdateOptions extends BaseClientOptions {
interface IssueCommentOptions extends BaseClientOptions {
body: string;
reopen?: boolean;
resume?: boolean;
}
interface IssueCheckoutOptions extends BaseClientOptions {
@@ -241,12 +242,14 @@ export function registerIssueCommands(program: Command): void {
.argument("<issueId>", "Issue ID")
.requiredOption("--body <text>", "Comment body")
.option("--reopen", "Reopen if issue is done/cancelled")
.option("--resume", "Request explicit follow-up and wake the assignee when resumable")
.action(async (issueId: string, opts: IssueCommentOptions) => {
try {
const ctx = resolveCommandContext(opts);
const payload = addIssueCommentSchema.parse({
body: opts.body,
reopen: opts.reopen,
resume: opts.resume,
});
const comment = await ctx.api.post<IssueComment>(`/api/issues/${issueId}/comments`, payload);
printOutput(comment, { json: ctx.json });
+185 -25
View File
@@ -1,5 +1,11 @@
import path from "node:path";
import { Command } from "commander";
import { existsSync } from "node:fs";
import { Command, Option } from "commander";
import {
scaffoldPluginProject,
shellQuote,
type ScaffoldPluginOptions,
} from "../../../../packages/plugins/create-paperclip-plugin/src/index.js";
import pc from "picocolors";
import {
addCommonClientOptions,
@@ -39,28 +45,101 @@ interface PluginInstallOptions extends BaseClientOptions {
version?: string;
}
interface PluginInstallRequest {
packageName: string;
version?: string;
isLocalPath: boolean;
}
interface PluginUninstallOptions extends BaseClientOptions {
force?: boolean;
}
interface PluginInitOptions extends BaseClientOptions {
output?: string;
template?: ScaffoldPluginOptions["template"];
category?: ScaffoldPluginOptions["category"];
displayName?: string;
description?: string;
author?: string;
sdkPath?: string;
}
interface PluginInitResult {
outputDir: string;
nextCommands: string[];
}
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function expandHomePath(packageArg: string): string {
if (!packageArg.startsWith("~")) return packageArg;
const home = process.env.HOME ?? process.env.USERPROFILE ?? "";
return path.resolve(home, packageArg.slice(1).replace(/^[\\/]/, ""));
}
function hasLocalPathSyntax(packageArg: string): boolean {
return (
path.isAbsolute(packageArg) ||
packageArg.startsWith("./") ||
packageArg.startsWith("../") ||
packageArg.startsWith("~") ||
packageArg.startsWith(".\\") ||
packageArg.startsWith("..\\")
);
}
function isExistingRelativePath(
packageArg: string,
cwd: string,
pathExists: (targetPath: string) => boolean,
): boolean {
if (packageArg.trim() === "") return false;
if (hasLocalPathSyntax(packageArg)) return false;
return pathExists(path.resolve(cwd, packageArg));
}
/**
* Resolve a local path argument to an absolute path so the server can find the
* plugin on disk regardless of where the user ran the CLI.
*/
function resolvePackageArg(packageArg: string, isLocal: boolean): string {
function resolvePackageArg(packageArg: string, isLocal: boolean, cwd = process.cwd()): string {
if (!isLocal) return packageArg;
// Already absolute
if (path.isAbsolute(packageArg)) return packageArg;
// Expand leading ~ to home directory
if (packageArg.startsWith("~")) {
const home = process.env.HOME ?? process.env.USERPROFILE ?? "";
return path.resolve(home, packageArg.slice(1).replace(/^[\\/]/, ""));
if (packageArg.startsWith("~")) return expandHomePath(packageArg);
return path.resolve(cwd, packageArg);
}
export function buildPluginInstallRequest(
packageArg: string,
opts: Pick<PluginInstallOptions, "local" | "version"> = {},
deps: { cwd?: string; existsSync?: (targetPath: string) => boolean } = {},
): PluginInstallRequest {
const cwd = deps.cwd ?? process.cwd();
const pathExists = deps.existsSync ?? existsSync;
const isLocal =
opts.local ||
hasLocalPathSyntax(packageArg) ||
(opts.version ? false : isExistingRelativePath(packageArg, cwd, pathExists));
if (isLocal && opts.version) {
throw new Error("--version is only supported for npm package installs, not local plugin paths.");
}
return path.resolve(process.cwd(), packageArg);
return {
packageName: resolvePackageArg(packageArg, Boolean(isLocal), cwd),
version: opts.version,
isLocalPath: Boolean(isLocal),
};
}
export function renderLocalPluginInstallHint(packagePath: string): string {
return [
pc.dim("Local plugin installs run trusted local code from your machine."),
pc.dim(`Keep ${pc.cyan("pnpm dev")} running in ${packagePath}; Paperclip watches rebuilt dist output and reloads the plugin worker.`),
].join("\n");
}
function formatPlugin(p: PluginRecord): string {
@@ -87,6 +166,58 @@ function formatPlugin(p: PluginRecord): string {
return parts.join(" ");
}
function packageToDirName(pluginName: string): string {
return pluginName.replace(/^@[^/]+\//, "");
}
export function buildPluginInitScaffoldOptions(
packageName: string,
opts: PluginInitOptions,
cwd = process.cwd(),
): ScaffoldPluginOptions {
const outputRoot = path.resolve(cwd, opts.output ?? ".");
const outputDir = path.resolve(outputRoot, packageToDirName(packageName));
return {
pluginName: packageName,
outputDir,
template: opts.template,
category: opts.category,
displayName: opts.displayName,
description: opts.description,
author: opts.author,
sdkPath: opts.sdkPath,
};
}
export function buildPluginInitNextCommands(outputDir: string): string[] {
const quotedOutputDir = shellQuote(outputDir);
return [
`cd ${quotedOutputDir}`,
"pnpm install",
"pnpm dev",
`paperclipai plugin install ${quotedOutputDir}`,
];
}
export function renderPluginInitSuccess(result: PluginInitResult): string {
return [
pc.green(`✓ Created plugin scaffold at ${result.outputDir}`),
"",
"Next commands:",
...result.nextCommands.map((command) => ` ${pc.cyan(command)}`),
].join("\n");
}
export function runPluginInitCommand(packageName: string, opts: PluginInitOptions): PluginInitResult {
const scaffoldOptions = buildPluginInitScaffoldOptions(packageName, opts);
const outputDir = scaffoldPluginProject(scaffoldOptions);
return {
outputDir,
nextCommands: buildPluginInitNextCommands(outputDir),
};
}
// ---------------------------------------------------------------------------
// Command registration
// ---------------------------------------------------------------------------
@@ -94,6 +225,43 @@ function formatPlugin(p: PluginRecord): string {
export function registerPluginCommands(program: Command): void {
const plugin = program.command("plugin").description("Plugin lifecycle management");
// -------------------------------------------------------------------------
// plugin init <package-name>
// -------------------------------------------------------------------------
addCommonClientOptions(
plugin
.command("init <packageName>")
.description("Scaffold a local Paperclip plugin project")
.option("--output <dir>", "Directory to create the plugin folder in")
.addOption(
new Option("--template <template>", "Starter template")
.choices(["default", "connector", "workspace", "environment"])
.default("default"),
)
.addOption(
new Option("--category <category>", "Manifest category")
.choices(["connector", "workspace", "automation", "ui", "environment"]),
)
.option("--display-name <name>", "Manifest display name")
.option("--description <description>", "Manifest description")
.option("--author <author>", "Manifest author")
.option("--sdk-path <path>", "Local @paperclipai/plugin-sdk package path")
.action((packageName: string, opts: PluginInitOptions) => {
try {
const result = runPluginInitCommand(packageName, opts);
if (opts.json) {
printOutput(result, { json: true });
return;
}
console.log(renderPluginInitSuccess(result));
} catch (err) {
handleCommandError(err);
}
}),
);
// -------------------------------------------------------------------------
// plugin list
// -------------------------------------------------------------------------
@@ -147,31 +315,19 @@ export function registerPluginCommands(program: Command): void {
try {
const ctx = resolveCommandContext(opts);
// Auto-detect local paths: starts with . or / or ~ or is an absolute path
const isLocal =
opts.local ||
packageArg.startsWith("./") ||
packageArg.startsWith("../") ||
packageArg.startsWith("/") ||
packageArg.startsWith("~");
const resolvedPackage = resolvePackageArg(packageArg, isLocal);
const installRequest = buildPluginInstallRequest(packageArg, opts);
if (!ctx.json) {
console.log(
pc.dim(
isLocal
? `Installing plugin from local path: ${resolvedPackage}`
: `Installing plugin: ${resolvedPackage}${opts.version ? `@${opts.version}` : ""}`,
installRequest.isLocalPath
? `Installing plugin from local path: ${installRequest.packageName}`
: `Installing plugin: ${installRequest.packageName}${opts.version ? `@${opts.version}` : ""}`,
),
);
}
const installedPlugin = await ctx.api.post<PluginRecord>("/api/plugins/install", {
packageName: resolvedPackage,
version: opts.version,
isLocalPath: isLocal,
});
const installedPlugin = await ctx.api.post<PluginRecord>("/api/plugins/install", installRequest);
if (ctx.json) {
printOutput(installedPlugin, { json: true });
@@ -192,6 +348,10 @@ export function registerPluginCommands(program: Command): void {
if (installedPlugin.lastError) {
console.log(pc.red(` Warning: ${installedPlugin.lastError}`));
}
if (installRequest.isLocalPath) {
console.log(renderLocalPluginInstallHint(installRequest.packageName));
}
} catch (err) {
handleCommandError(err);
}
+501
View File
@@ -0,0 +1,501 @@
import { Command } from "commander";
import pc from "picocolors";
import type {
Agent,
AgentEnvConfig,
CompanyPortabilityEnvInput,
CompanyPortabilityExportPreviewResult,
CompanyPortabilityInclude,
CompanySecret,
EnvBinding,
SecretProvider,
SecretProviderDescriptor,
} from "@paperclipai/shared";
import {
addCommonClientOptions,
formatInlineRecord,
handleCommandError,
printOutput,
resolveCommandContext,
type BaseClientOptions,
} from "./common.js";
interface SecretListOptions extends BaseClientOptions {
companyId?: string;
}
interface SecretDeclarationsOptions extends BaseClientOptions {
companyId?: string;
include?: string;
kind?: "all" | "secret" | "plain";
}
interface SecretCreateOptions extends BaseClientOptions {
companyId?: string;
name?: string;
key?: string;
provider?: SecretProvider;
value?: string;
valueEnv?: string;
description?: string;
}
interface SecretLinkOptions extends BaseClientOptions {
companyId?: string;
name?: string;
key?: string;
provider?: SecretProvider;
externalRef?: string;
providerVersionRef?: string;
description?: string;
}
interface SecretDoctorOptions extends BaseClientOptions {
companyId?: string;
}
interface SecretMigrateInlineEnvOptions extends BaseClientOptions {
companyId?: string;
apply?: boolean;
}
interface SecretProviderHealth {
provider: SecretProvider;
status: "ok" | "warn" | "error";
message: string;
warnings?: string[];
backupGuidance?: string[];
details?: Record<string, unknown>;
}
interface SecretProviderHealthResponse {
providers: SecretProviderHealth[];
}
export interface InlineSecretMigrationCandidate {
agentId: string;
agentName: string;
envKey: string;
secretName: string;
existingSecretId: string | null;
}
const SENSITIVE_ENV_KEY_RE =
/(^token$|[-_]?token$|api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i;
const DEFAULT_DECLARATION_INCLUDE: CompanyPortabilityInclude = {
company: true,
agents: true,
projects: true,
issues: false,
skills: false,
};
export function parseSecretsInclude(input: string | undefined): CompanyPortabilityInclude {
if (!input?.trim()) return { ...DEFAULT_DECLARATION_INCLUDE };
const values = input.split(",").map((part) => part.trim().toLowerCase()).filter(Boolean);
const include = {
company: values.includes("company"),
agents: values.includes("agents"),
projects: values.includes("projects"),
issues: values.includes("issues") || values.includes("tasks"),
skills: values.includes("skills"),
};
if (!Object.values(include).some(Boolean)) {
throw new Error("Invalid --include value. Use one or more of: company,agents,projects,issues,tasks,skills");
}
return include;
}
export function isSensitiveEnvKey(key: string): boolean {
return SENSITIVE_ENV_KEY_RE.test(key);
}
export function toPlainEnvValue(binding: unknown): string | null {
if (typeof binding === "string") return binding;
if (typeof binding !== "object" || binding === null || Array.isArray(binding)) return null;
const record = binding as Record<string, unknown>;
if (record.type === "plain" && typeof record.value === "string") return record.value;
return null;
}
export function buildInlineMigrationSecretName(agentId: string, key: string): string {
return `agent_${agentId.slice(0, 8)}_${key.toLowerCase()}`;
}
export function collectInlineSecretMigrationCandidates(
agents: Agent[],
existingSecrets: CompanySecret[],
): InlineSecretMigrationCandidate[] {
const secretByName = new Map(existingSecrets.map((secret) => [secret.name, secret]));
const candidates: InlineSecretMigrationCandidate[] = [];
for (const agent of agents) {
const env = asRecord(agent.adapterConfig.env);
if (!env) continue;
for (const [envKey, binding] of Object.entries(env)) {
if (!isSensitiveEnvKey(envKey)) continue;
const plain = toPlainEnvValue(binding);
if (plain === null || plain.trim().length === 0) continue;
const secretName = buildInlineMigrationSecretName(agent.id, envKey);
candidates.push({
agentId: agent.id,
agentName: agent.name,
envKey,
secretName,
existingSecretId: secretByName.get(secretName)?.id ?? null,
});
}
}
return candidates;
}
export function buildMigratedAgentEnv(
env: Record<string, unknown>,
secretIdByEnvKey: Map<string, string>,
): AgentEnvConfig {
const next: AgentEnvConfig = { ...(env as Record<string, EnvBinding>) };
for (const [envKey, secretId] of secretIdByEnvKey) {
next[envKey] = {
type: "secret_ref",
secretId,
version: "latest",
};
}
return next;
}
function asRecord(value: unknown): Record<string, unknown> | null {
if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
return value as Record<string, unknown>;
}
function readValueFromOptions(opts: SecretCreateOptions): string {
if (opts.value !== undefined && opts.valueEnv !== undefined) {
throw new Error("Use only one of --value or --value-env.");
}
if (opts.valueEnv !== undefined) {
const value = process.env[opts.valueEnv];
if (!value) throw new Error(`Environment variable ${opts.valueEnv} is empty or unset.`);
return value;
}
if (opts.value !== undefined) return opts.value;
throw new Error("Secret value is required. Pass --value or --value-env.");
}
function renderDeclaration(input: CompanyPortabilityEnvInput): Record<string, unknown> {
const scope = input.agentSlug
? `agent:${input.agentSlug}`
: input.projectSlug
? `project:${input.projectSlug}`
: "company";
return {
key: input.key,
scope,
kind: input.kind,
requirement: input.requirement,
portability: input.portability,
hasDefault: input.defaultValue !== null && input.defaultValue.length > 0,
description: input.description,
};
}
function renderSecret(secret: CompanySecret): Record<string, unknown> {
return {
id: secret.id,
name: secret.name,
key: secret.key,
provider: secret.provider,
status: secret.status,
managedMode: secret.managedMode,
latestVersion: secret.latestVersion,
externalRef: secret.externalRef ? "yes" : "no",
};
}
function printProviderHealth(rows: SecretProviderHealth[], json: boolean): void {
if (json) {
printOutput(rows, { json: true });
return;
}
if (rows.length === 0) {
printOutput([], { json: false });
return;
}
for (const row of rows) {
console.log(
formatInlineRecord({
id: row.provider,
status: row.status,
message: row.message,
}),
);
for (const warning of row.warnings ?? []) {
console.log(pc.yellow(`warning=${warning}`));
}
const missingConfig = asStringArray(row.details?.missingConfig);
if (missingConfig.length > 0) {
console.log(pc.dim(`missingConfig=${missingConfig.join(",")}`));
}
const credentialSource = typeof row.details?.credentialSource === "string"
? row.details.credentialSource
: null;
if (credentialSource) {
console.log(pc.dim(`credentialSource=${credentialSource}`));
}
const detectedCredentialSources = asStringArray(row.details?.detectedCredentialSources);
if (detectedCredentialSources.length > 0) {
console.log(pc.dim(`detectedCredentialSources=${detectedCredentialSources.join(",")}`));
}
for (const guidance of row.backupGuidance ?? []) {
console.log(pc.dim(`backup=${guidance}`));
}
}
}
function asStringArray(value: unknown): string[] {
return Array.isArray(value)
? value.filter((entry): entry is string => typeof entry === "string" && entry.length > 0)
: [];
}
async function migrateInlineEnv(opts: SecretMigrateInlineEnvOptions): Promise<void> {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const companyId = ctx.companyId!;
const agents = (await ctx.api.get<Agent[]>(`/api/companies/${companyId}/agents`)) ?? [];
const secrets = (await ctx.api.get<CompanySecret[]>(`/api/companies/${companyId}/secrets`)) ?? [];
const candidates = collectInlineSecretMigrationCandidates(agents, secrets);
if (!opts.apply) {
printOutput(
{
apply: false,
agentsToUpdate: new Set(candidates.map((candidate) => candidate.agentId)).size,
secretsToCreate: candidates.filter((candidate) => !candidate.existingSecretId).length,
secretsToRotate: candidates.filter((candidate) => candidate.existingSecretId).length,
candidates,
},
{ json: ctx.json },
);
if (!ctx.json) {
console.log(pc.dim("Re-run with --apply to create/rotate secrets and update agent env bindings."));
}
return;
}
const createdOrRotated = new Map<string, string>();
let createdSecrets = 0;
let rotatedSecrets = 0;
for (const candidate of candidates) {
const agent = agents.find((row) => row.id === candidate.agentId);
const env = asRecord(agent?.adapterConfig.env);
const value = env ? toPlainEnvValue(env[candidate.envKey]) : null;
if (!value) continue;
if (candidate.existingSecretId) {
await ctx.api.post(`/api/secrets/${candidate.existingSecretId}/rotate`, { value });
createdOrRotated.set(`${candidate.agentId}:${candidate.envKey}`, candidate.existingSecretId);
rotatedSecrets += 1;
continue;
}
const created = await ctx.api.post<CompanySecret>(`/api/companies/${companyId}/secrets`, {
name: candidate.secretName,
provider: "local_encrypted",
value,
description: `Migrated from agent ${candidate.agentId} env ${candidate.envKey}`,
});
if (!created) throw new Error(`Secret create returned no data for ${candidate.secretName}`);
createdOrRotated.set(`${candidate.agentId}:${candidate.envKey}`, created.id);
createdSecrets += 1;
}
let updatedAgents = 0;
for (const agent of agents) {
const env = asRecord(agent.adapterConfig.env);
if (!env) continue;
const secretIdByEnvKey = new Map<string, string>();
for (const [key] of Object.entries(env)) {
const secretId = createdOrRotated.get(`${agent.id}:${key}`);
if (secretId) secretIdByEnvKey.set(key, secretId);
}
if (secretIdByEnvKey.size === 0) continue;
const adapterConfig = {
...agent.adapterConfig,
env: buildMigratedAgentEnv(env, secretIdByEnvKey),
};
await ctx.api.patch(`/api/agents/${agent.id}`, {
adapterConfig,
replaceAdapterConfig: true,
});
updatedAgents += 1;
}
printOutput(
{
apply: true,
updatedAgents,
createdSecrets,
rotatedSecrets,
},
{ json: ctx.json },
);
}
export function registerSecretCommands(program: Command): void {
const secrets = program.command("secrets").description("Secret declaration and provider operations");
addCommonClientOptions(
secrets
.command("list")
.description("List secret metadata for a company")
.requiredOption("-C, --company-id <id>", "Company ID")
.action(async (opts: SecretListOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const rows = (await ctx.api.get<CompanySecret[]>(`/api/companies/${ctx.companyId}/secrets`)) ?? [];
printOutput(ctx.json ? rows : rows.map(renderSecret), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("declarations")
.description("List portable env declarations emitted by company export")
.requiredOption("-C, --company-id <id>", "Company ID")
.option("--include <values>", "Comma-separated include set: company,agents,projects,issues,tasks,skills", "company,agents,projects")
.option("--kind <kind>", "Filter declarations: all | secret | plain", "all")
.action(async (opts: SecretDeclarationsOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const kind = opts.kind ?? "all";
if (!["all", "secret", "plain"].includes(kind)) {
throw new Error("Invalid --kind value. Use: all, secret, plain");
}
const preview = await ctx.api.post<CompanyPortabilityExportPreviewResult>(
`/api/companies/${ctx.companyId}/exports/preview`,
{ include: parseSecretsInclude(opts.include) },
);
const declarations = (preview?.manifest.envInputs ?? [])
.filter((entry) => kind === "all" || entry.kind === kind);
printOutput(ctx.json ? declarations : declarations.map(renderDeclaration), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("create")
.description("Create a Paperclip-managed secret")
.requiredOption("-C, --company-id <id>", "Company ID")
.requiredOption("--name <name>", "Secret display name")
.option("--key <key>", "Portable secret key")
.option("--provider <provider>", "Secret provider id")
.option("--value <value>", "Secret value")
.option("--value-env <name>", "Read secret value from an environment variable")
.option("--description <text>", "Description")
.action(async (opts: SecretCreateOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const created = await ctx.api.post<CompanySecret>(`/api/companies/${ctx.companyId}/secrets`, {
name: opts.name,
key: opts.key,
provider: opts.provider,
value: readValueFromOptions(opts),
description: opts.description,
});
printOutput(ctx.json ? created : renderSecret(created!), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("link")
.description("Link an external provider-owned secret without storing its value in Paperclip")
.requiredOption("-C, --company-id <id>", "Company ID")
.requiredOption("--name <name>", "Secret display name")
.requiredOption("--provider <provider>", "Secret provider id")
.requiredOption("--external-ref <ref>", "Provider secret ARN/name/path/reference")
.option("--key <key>", "Portable secret key")
.option("--provider-version-ref <ref>", "Provider version id or label")
.option("--description <text>", "Description")
.action(async (opts: SecretLinkOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const created = await ctx.api.post<CompanySecret>(`/api/companies/${ctx.companyId}/secrets`, {
name: opts.name,
key: opts.key,
provider: opts.provider,
managedMode: "external_reference",
externalRef: opts.externalRef,
providerVersionRef: opts.providerVersionRef,
description: opts.description,
});
printOutput(ctx.json ? created : renderSecret(created!), { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("doctor")
.description("Run secret provider health checks through the Paperclip API")
.requiredOption("-C, --company-id <id>", "Company ID")
.action(async (opts: SecretDoctorOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const health = await ctx.api.get<SecretProviderHealthResponse>(
`/api/companies/${ctx.companyId}/secret-providers/health`,
);
printProviderHealth(health?.providers ?? [], ctx.json);
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("providers")
.description("List configured secret provider descriptors")
.requiredOption("-C, --company-id <id>", "Company ID")
.action(async (opts: SecretDoctorOptions) => {
try {
const ctx = resolveCommandContext(opts, { requireCompany: true });
const rows = (await ctx.api.get<SecretProviderDescriptor[]>(
`/api/companies/${ctx.companyId}/secret-providers`,
)) ?? [];
printOutput(rows, { json: ctx.json });
} catch (err) {
handleCommandError(err);
}
}),
);
addCommonClientOptions(
secrets
.command("migrate-inline-env")
.description("Migrate inline sensitive agent env values into secret references")
.requiredOption("-C, --company-id <id>", "Company ID")
.option("--apply", "Persist changes; default is a dry run", false)
.action(async (opts: SecretMigrateInlineEnvOptions) => {
try {
await migrateInlineEnv(opts);
} catch (err) {
handleCommandError(err);
}
}),
);
}
File diff suppressed because it is too large Load Diff
+1
View File
@@ -54,6 +54,7 @@ function defaultConfig(): PaperclipConfig {
server: {
deploymentMode: "local_trusted",
exposure: "private",
bind: "loopback",
host: "127.0.0.1",
port: 3100,
allowedHostnames: [],
+1 -1
View File
@@ -73,7 +73,7 @@ export async function dbBackupCommand(opts: DbBackupOptions): Promise<void> {
const result = await runDatabaseBackup({
connectionString: connection.value,
backupDir,
retentionDays,
retention: { dailyDays: retentionDays, weeklyWeeks: 4, monthlyMonths: 1 },
filenamePrefix,
});
spinner.stop(`Backup saved: ${formatDatabaseBackupResult(result)}`);
+174
View File
@@ -0,0 +1,174 @@
import path from "node:path";
import type { Command } from "commander";
import * as p from "@clack/prompts";
import pc from "picocolors";
import {
buildSshEnvLabFixtureConfig,
getSshEnvLabSupport,
readSshEnvLabFixtureStatus,
startSshEnvLabFixture,
stopSshEnvLabFixture,
} from "@paperclipai/adapter-utils/ssh";
import { resolvePaperclipInstanceId, resolvePaperclipInstanceRoot } from "../config/home.js";
export function resolveEnvLabSshStatePath(instanceId?: string): string {
const resolvedInstanceId = resolvePaperclipInstanceId(instanceId);
return path.resolve(
resolvePaperclipInstanceRoot(resolvedInstanceId),
"env-lab",
"ssh-fixture",
"state.json",
);
}
function printJson(value: unknown) {
process.stdout.write(`${JSON.stringify(value, null, 2)}\n`);
}
function summarizeFixture(state: {
host: string;
port: number;
username: string;
workspaceDir: string;
sshdLogPath: string;
}) {
p.log.message(`Host: ${pc.cyan(state.host)}:${pc.cyan(String(state.port))}`);
p.log.message(`User: ${pc.cyan(state.username)}`);
p.log.message(`Workspace: ${pc.cyan(state.workspaceDir)}`);
p.log.message(`Log: ${pc.dim(state.sshdLogPath)}`);
}
export async function collectEnvLabDoctorStatus(opts: { instance?: string }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const [sshSupport, sshStatus] = await Promise.all([
getSshEnvLabSupport(),
readSshEnvLabFixtureStatus(statePath),
]);
const environment = sshStatus.state ? await buildSshEnvLabFixtureConfig(sshStatus.state) : null;
return {
statePath,
ssh: {
supported: sshSupport.supported,
reason: sshSupport.reason,
running: sshStatus.running,
state: sshStatus.state,
environment,
},
};
}
export async function envLabUpCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const state = await startSshEnvLabFixture({ statePath });
const environment = await buildSshEnvLabFixtureConfig(state);
if (opts.json) {
printJson({ state, environment });
return;
}
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(state);
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabStatusCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const status = await readSshEnvLabFixtureStatus(statePath);
const environment = status.state ? await buildSshEnvLabFixtureConfig(status.state) : null;
if (opts.json) {
printJson({ ...status, environment, statePath });
return;
}
if (!status.state || !status.running) {
p.log.info(`SSH env-lab fixture is not running (${pc.dim(statePath)}).`);
return;
}
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(status.state);
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabDownCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const stopped = await stopSshEnvLabFixture(statePath);
if (opts.json) {
printJson({ stopped, statePath });
return;
}
if (!stopped) {
p.log.info(`No SSH env-lab fixture was running (${pc.dim(statePath)}).`);
return;
}
p.log.success("SSH env-lab fixture stopped.");
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabDoctorCommand(opts: { instance?: string; json?: boolean }) {
const status = await collectEnvLabDoctorStatus(opts);
if (opts.json) {
printJson(status);
return;
}
if (status.ssh.supported) {
p.log.success("SSH fixture prerequisites are installed.");
} else {
p.log.warn(`SSH fixture prerequisites are incomplete: ${status.ssh.reason ?? "unknown reason"}`);
}
if (status.ssh.state && status.ssh.running) {
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(status.ssh.state);
p.log.message(`Private key: ${pc.dim(status.ssh.state.clientPrivateKeyPath)}`);
p.log.message(`Known hosts: ${pc.dim(status.ssh.state.knownHostsPath)}`);
} else if (status.ssh.state) {
p.log.warn("SSH env-lab fixture state exists, but the process is not running.");
p.log.message(`State: ${pc.dim(status.statePath)}`);
} else {
p.log.info("SSH env-lab fixture is not running.");
p.log.message(`State: ${pc.dim(status.statePath)}`);
}
p.log.message(`Cleanup: ${pc.dim("pnpm paperclipai env-lab down")}`);
}
export function registerEnvLabCommands(program: Command) {
const envLab = program.command("env-lab").description("Deterministic local environment fixtures");
envLab
.command("up")
.description("Start the default SSH env-lab fixture")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable fixture details")
.action(envLabUpCommand);
envLab
.command("status")
.description("Show the current SSH env-lab fixture state")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable fixture details")
.action(envLabStatusCommand);
envLab
.command("down")
.description("Stop the default SSH env-lab fixture")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable stop details")
.action(envLabDownCommand);
envLab
.command("doctor")
.description("Check SSH fixture prerequisites and current status")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable diagnostic details")
.action(envLabDoctorCommand);
}
+130 -15
View File
@@ -3,10 +3,14 @@ import path from "node:path";
import pc from "picocolors";
import {
AUTH_BASE_URL_MODES,
BIND_MODES,
DEPLOYMENT_EXPOSURES,
DEPLOYMENT_MODES,
SECRET_PROVIDERS,
STORAGE_PROVIDERS,
inferBindModeFromHost,
resolveRuntimeBind,
type BindMode,
type AuthBaseUrlMode,
type DeploymentExposure,
type DeploymentMode,
@@ -23,6 +27,7 @@ import { promptLogging } from "../prompts/logging.js";
import { defaultSecretsConfig } from "../prompts/secrets.js";
import { defaultStorageConfig, promptStorage } from "../prompts/storage.js";
import { promptServer } from "../prompts/server.js";
import { buildPresetServerConfig } from "../config/server-bind.js";
import {
describeLocalInstancePaths,
expandHomePrefix,
@@ -46,10 +51,14 @@ type OnboardOptions = {
run?: boolean;
yes?: boolean;
invokedByRun?: boolean;
bind?: BindMode;
};
type OnboardDefaults = Pick<PaperclipConfig, "database" | "logging" | "server" | "auth" | "storage" | "secrets">;
const TAILNET_BIND_WARNING =
"No Tailscale address was detected during setup. The saved config will stay on loopback until Tailscale is available or PAPERCLIP_TAILNET_BIND_HOST is set.";
const ONBOARD_ENV_KEYS = [
"PAPERCLIP_PUBLIC_URL",
"DATABASE_URL",
@@ -59,6 +68,9 @@ const ONBOARD_ENV_KEYS = [
"PAPERCLIP_DB_BACKUP_DIR",
"PAPERCLIP_DEPLOYMENT_MODE",
"PAPERCLIP_DEPLOYMENT_EXPOSURE",
"PAPERCLIP_BIND",
"PAPERCLIP_BIND_HOST",
"PAPERCLIP_TAILNET_BIND_HOST",
"HOST",
"PORT",
"SERVE_UI",
@@ -104,29 +116,62 @@ function resolvePathFromEnv(rawValue: string | undefined): string | null {
return path.resolve(expandHomePrefix(rawValue.trim()));
}
function quickstartDefaultsFromEnv(): {
function describeServerBinding(server: Pick<PaperclipConfig["server"], "bind" | "customBindHost" | "host" | "port">): string {
const bind = server.bind ?? inferBindModeFromHost(server.host);
const detail =
bind === "custom"
? server.customBindHost ?? server.host
: bind === "tailnet"
? "detected tailscale address"
: server.host;
return `${bind}${detail ? ` (${detail})` : ""}:${server.port}`;
}
function quickstartDefaultsFromEnv(opts?: { preferTrustedLocal?: boolean }): {
defaults: OnboardDefaults;
usedEnvKeys: string[];
ignoredEnvKeys: Array<{ key: string; reason: string }>;
} {
const preferTrustedLocal = opts?.preferTrustedLocal ?? false;
const instanceId = resolvePaperclipInstanceId();
const defaultStorage = defaultStorageConfig();
const defaultSecrets = defaultSecretsConfig();
const databaseUrl = process.env.DATABASE_URL?.trim() || undefined;
const publicUrl =
process.env.PAPERCLIP_PUBLIC_URL?.trim() ||
process.env.PAPERCLIP_AUTH_PUBLIC_BASE_URL?.trim() ||
process.env.BETTER_AUTH_URL?.trim() ||
process.env.BETTER_AUTH_BASE_URL?.trim() ||
undefined;
const deploymentMode =
parseEnumFromEnv<DeploymentMode>(process.env.PAPERCLIP_DEPLOYMENT_MODE, DEPLOYMENT_MODES) ?? "local_trusted";
const publicUrl = preferTrustedLocal
? undefined
: (
process.env.PAPERCLIP_PUBLIC_URL?.trim() ||
process.env.PAPERCLIP_AUTH_PUBLIC_BASE_URL?.trim() ||
process.env.BETTER_AUTH_URL?.trim() ||
process.env.BETTER_AUTH_BASE_URL?.trim() ||
undefined
);
const deploymentMode = preferTrustedLocal
? "local_trusted"
: (parseEnumFromEnv<DeploymentMode>(process.env.PAPERCLIP_DEPLOYMENT_MODE, DEPLOYMENT_MODES) ?? "local_trusted");
const deploymentExposureFromEnv = parseEnumFromEnv<DeploymentExposure>(
process.env.PAPERCLIP_DEPLOYMENT_EXPOSURE,
DEPLOYMENT_EXPOSURES,
);
const deploymentExposure =
deploymentMode === "local_trusted" ? "private" : (deploymentExposureFromEnv ?? "private");
const bindFromEnv = parseEnumFromEnv<BindMode>(process.env.PAPERCLIP_BIND, BIND_MODES);
const customBindHostFromEnv = process.env.PAPERCLIP_BIND_HOST?.trim() || undefined;
const hostFromEnv = process.env.HOST?.trim() || undefined;
const configuredBindHost = customBindHostFromEnv ?? hostFromEnv;
const bind = preferTrustedLocal
? "loopback"
: (
deploymentMode === "local_trusted"
? "loopback"
: (bindFromEnv ?? (configuredBindHost ? inferBindModeFromHost(configuredBindHost) : "lan"))
);
const resolvedBind = resolveRuntimeBind({
bind,
host: hostFromEnv ?? (bind === "loopback" ? "127.0.0.1" : "0.0.0.0"),
customBindHost: customBindHostFromEnv,
tailnetBindHost: process.env.PAPERCLIP_TAILNET_BIND_HOST?.trim(),
});
const authPublicBaseUrl = publicUrl;
const authBaseUrlModeFromEnv = parseEnumFromEnv<AuthBaseUrlMode>(
process.env.PAPERCLIP_AUTH_BASE_URL_MODE,
@@ -183,7 +228,9 @@ function quickstartDefaultsFromEnv(): {
server: {
deploymentMode,
exposure: deploymentExposure,
host: process.env.HOST ?? "127.0.0.1",
bind: resolvedBind.bind,
...(resolvedBind.customBindHost ? { customBindHost: resolvedBind.customBindHost } : {}),
host: resolvedBind.host,
port: Number(process.env.PORT) || 3100,
allowedHostnames: Array.from(new Set([...allowedHostnamesFromEnv, ...(hostnameFromPublicUrl ? [hostnameFromPublicUrl] : [])])),
serveUi: parseBooleanFromEnv(process.env.SERVE_UI) ?? true,
@@ -220,12 +267,49 @@ function quickstartDefaultsFromEnv(): {
},
};
const ignoredEnvKeys: Array<{ key: string; reason: string }> = [];
if (preferTrustedLocal) {
const forcedLocalReason = "Ignored because --yes quickstart forces trusted local loopback defaults";
for (const key of [
"PAPERCLIP_DEPLOYMENT_MODE",
"PAPERCLIP_DEPLOYMENT_EXPOSURE",
"PAPERCLIP_BIND",
"PAPERCLIP_BIND_HOST",
"HOST",
"PAPERCLIP_AUTH_BASE_URL_MODE",
"PAPERCLIP_AUTH_PUBLIC_BASE_URL",
"PAPERCLIP_PUBLIC_URL",
"BETTER_AUTH_URL",
"BETTER_AUTH_BASE_URL",
] as const) {
if (process.env[key] !== undefined) {
ignoredEnvKeys.push({ key, reason: forcedLocalReason });
}
}
}
if (deploymentMode === "local_trusted" && process.env.PAPERCLIP_DEPLOYMENT_EXPOSURE !== undefined) {
ignoredEnvKeys.push({
key: "PAPERCLIP_DEPLOYMENT_EXPOSURE",
reason: "Ignored because deployment mode local_trusted always forces private exposure",
});
}
if (deploymentMode === "local_trusted" && process.env.PAPERCLIP_BIND !== undefined) {
ignoredEnvKeys.push({
key: "PAPERCLIP_BIND",
reason: "Ignored because deployment mode local_trusted always uses loopback reachability",
});
}
if (deploymentMode === "local_trusted" && process.env.PAPERCLIP_BIND_HOST !== undefined) {
ignoredEnvKeys.push({
key: "PAPERCLIP_BIND_HOST",
reason: "Ignored because deployment mode local_trusted always uses loopback reachability",
});
}
if (deploymentMode === "local_trusted" && process.env.HOST !== undefined) {
ignoredEnvKeys.push({
key: "HOST",
reason: "Ignored because deployment mode local_trusted always uses loopback reachability",
});
}
const ignoredKeySet = new Set(ignoredEnvKeys.map((entry) => entry.key));
const usedEnvKeys = ONBOARD_ENV_KEYS.filter(
@@ -239,6 +323,10 @@ function canCreateBootstrapInviteImmediately(config: Pick<PaperclipConfig, "data
}
export async function onboard(opts: OnboardOptions): Promise<void> {
if (opts.bind && !["loopback", "lan", "tailnet"].includes(opts.bind)) {
throw new Error(`Unsupported bind preset for onboard: ${opts.bind}. Use loopback, lan, or tailnet.`);
}
printPaperclipCliBanner();
p.intro(pc.bgCyan(pc.black(" paperclipai onboard ")));
const configPath = resolveConfigPath(opts.config);
@@ -293,7 +381,7 @@ export async function onboard(opts: OnboardOptions): Promise<void> {
`Database: ${existingConfig.database.mode}`,
existingConfig.llm ? `LLM: ${existingConfig.llm.provider}` : "LLM: not configured",
`Logging: ${existingConfig.logging.mode} -> ${existingConfig.logging.logDir}`,
`Server: ${existingConfig.server.deploymentMode}/${existingConfig.server.exposure} @ ${existingConfig.server.host}:${existingConfig.server.port}`,
`Server: ${existingConfig.server.deploymentMode}/${existingConfig.server.exposure} @ ${describeServerBinding(existingConfig.server)}`,
`Allowed hosts: ${existingConfig.server.allowedHostnames.length > 0 ? existingConfig.server.allowedHostnames.join(", ") : "(loopback only)"}`,
`Auth URL mode: ${existingConfig.auth.baseUrlMode}${existingConfig.auth.publicBaseUrl ? ` (${existingConfig.auth.publicBaseUrl})` : ""}`,
`Storage: ${existingConfig.storage.provider}`,
@@ -336,7 +424,13 @@ export async function onboard(opts: OnboardOptions): Promise<void> {
let setupMode: SetupMode = "quickstart";
if (opts.yes) {
p.log.message(pc.dim("`--yes` enabled: using Quickstart defaults."));
p.log.message(
pc.dim(
opts.bind
? `\`--yes\` enabled: using Quickstart defaults with bind=${opts.bind}.`
: "`--yes` enabled: using Quickstart defaults.",
),
);
} else {
const setupModeChoice = await p.select({
message: "Choose setup path",
@@ -365,7 +459,9 @@ export async function onboard(opts: OnboardOptions): Promise<void> {
if (tc) trackInstallStarted(tc);
let llm: PaperclipConfig["llm"] | undefined;
const { defaults: derivedDefaults, usedEnvKeys, ignoredEnvKeys } = quickstartDefaultsFromEnv();
const { defaults: derivedDefaults, usedEnvKeys, ignoredEnvKeys } = quickstartDefaultsFromEnv({
preferTrustedLocal: opts.yes === true && !opts.bind,
});
let {
database,
logging,
@@ -375,6 +471,19 @@ export async function onboard(opts: OnboardOptions): Promise<void> {
secrets,
} = derivedDefaults;
if (opts.bind === "loopback" || opts.bind === "lan" || opts.bind === "tailnet") {
const preset = buildPresetServerConfig(opts.bind, {
port: server.port,
allowedHostnames: server.allowedHostnames,
serveUi: server.serveUi,
});
server = preset.server;
auth = preset.auth;
if (opts.bind === "tailnet" && server.host === "127.0.0.1") {
p.log.warn(TAILNET_BIND_WARNING);
}
}
if (setupMode === "advanced") {
p.log.step(pc.bold("Database"));
database = await promptDatabase(database);
@@ -462,7 +571,13 @@ export async function onboard(opts: OnboardOptions): Promise<void> {
);
} else {
p.log.step(pc.bold("Quickstart"));
p.log.message(pc.dim("Using quickstart defaults."));
p.log.message(
pc.dim(
opts.bind
? `Using quickstart defaults with bind=${opts.bind}.`
: `Using quickstart defaults: ${server.deploymentMode}/${server.exposure} @ ${describeServerBinding(server)}.`,
),
);
if (usedEnvKeys.length > 0) {
p.log.message(pc.dim(`Environment-aware defaults active (${usedEnvKeys.length} env var(s) detected).`));
} else {
@@ -521,7 +636,7 @@ export async function onboard(opts: OnboardOptions): Promise<void> {
`Database: ${database.mode}`,
llm ? `LLM: ${llm.provider}` : "LLM: not configured",
`Logging: ${logging.mode} -> ${logging.logDir}`,
`Server: ${server.deploymentMode}/${server.exposure} @ ${server.host}:${server.port}`,
`Server: ${server.deploymentMode}/${server.exposure} @ ${describeServerBinding(server)}`,
`Allowed hosts: ${server.allowedHostnames.length > 0 ? server.allowedHostnames.join(", ") : "(loopback only)"}`,
`Auth URL mode: ${auth.baseUrlMode}${auth.publicBaseUrl ? ` (${auth.publicBaseUrl})` : ""}`,
`Storage: ${storage.provider}`,
+2
View File
@@ -9,6 +9,7 @@ import {
createEmbeddedPostgresLogBuffer,
ensurePostgresDatabase,
formatEmbeddedPostgresError,
prepareEmbeddedPostgresNativeRuntime,
routines,
} from "@paperclipai/db";
import { eq, inArray } from "drizzle-orm";
@@ -116,6 +117,7 @@ async function ensureEmbeddedPostgres(dataDir: string, preferredPort: number): P
"Embedded PostgreSQL support requires dependency `embedded-postgres`. Reinstall dependencies and try again.",
);
}
await prepareEmbeddedPostgresNativeRuntime();
const postmasterPidFile = path.resolve(dataDir, "postmaster.pid");
const runningPid = readRunningPostmasterPid(postmasterPidFile);
+27 -1
View File
@@ -1,5 +1,6 @@
import fs from "node:fs";
import path from "node:path";
import { spawnSync } from "node:child_process";
import { fileURLToPath, pathToFileURL } from "node:url";
import * as p from "@clack/prompts";
import pc from "picocolors";
@@ -21,6 +22,7 @@ interface RunOptions {
instance?: string;
repair?: boolean;
yes?: boolean;
bind?: "loopback" | "lan" | "tailnet";
}
interface StartedServer {
@@ -57,7 +59,7 @@ export async function runCommand(opts: RunOptions): Promise<void> {
}
p.log.step("No config found. Starting onboarding...");
await onboard({ config: configPath, invokedByRun: true });
await onboard({ config: configPath, invokedByRun: true, bind: opts.bind });
}
p.log.step("Running doctor checks...");
@@ -146,11 +148,35 @@ function maybeEnableUiDevMiddleware(entrypoint: string): void {
}
}
function ensureDevWorkspaceBuildDeps(projectRoot: string): void {
const buildScript = path.resolve(projectRoot, "scripts/ensure-plugin-build-deps.mjs");
if (!fs.existsSync(buildScript)) return;
const result = spawnSync(process.execPath, [buildScript], {
cwd: projectRoot,
stdio: "inherit",
timeout: 120_000,
});
if (result.error) {
throw new Error(
`Failed to prepare workspace build artifacts before starting the Paperclip dev server.\n${formatError(result.error)}`,
);
}
if ((result.status ?? 1) !== 0) {
throw new Error(
"Failed to prepare workspace build artifacts before starting the Paperclip dev server.",
);
}
}
async function importServerEntry(): Promise<StartedServer> {
// Dev mode: try local workspace path (monorepo with tsx)
const projectRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../../..");
const devEntry = path.resolve(projectRoot, "server/src/index.ts");
if (fs.existsSync(devEntry)) {
ensureDevWorkspaceBuildDeps(projectRoot);
maybeEnableUiDevMiddleware(devEntry);
const mod = await import(pathToFileURL(devEntry).href);
return await startServerFromModule(mod, devEntry);
+4 -6
View File
@@ -75,11 +75,6 @@ function nonEmpty(value: string | null | undefined): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
function isLoopbackHost(hostname: string): boolean {
const value = hostname.trim().toLowerCase();
return value === "127.0.0.1" || value === "localhost" || value === "::1";
}
export function sanitizeWorktreeInstanceId(rawValue: string): string {
const trimmed = rawValue.trim().toLowerCase();
const normalized = trimmed
@@ -168,7 +163,8 @@ export function rewriteLocalUrlPort(rawUrl: string | undefined, port: number): s
if (!rawUrl) return undefined;
try {
const parsed = new URL(rawUrl);
if (!isLoopbackHost(parsed.hostname)) return rawUrl;
// The URL API normalizes default ports like :80/:443 to "", so treat them as stable URLs.
if (!parsed.port) return rawUrl;
parsed.port = String(port);
return parsed.toString();
} catch {
@@ -214,6 +210,8 @@ export function buildWorktreeConfig(input: {
server: {
deploymentMode: source?.server.deploymentMode ?? "local_trusted",
exposure: source?.server.exposure ?? "private",
...(source?.server.bind ? { bind: source.server.bind } : {}),
...(source?.server.customBindHost ? { customBindHost: source.server.customBindHost } : {}),
host: source?.server.host ?? "127.0.0.1",
port: serverPort,
allowedHostnames: source?.server.allowedHostnames ?? [],
+524 -18
View File
@@ -39,10 +39,13 @@ import {
issues,
projectWorkspaces,
projects,
routines,
routineTriggers,
runDatabaseBackup,
runDatabaseRestore,
createEmbeddedPostgresLogBuffer,
formatEmbeddedPostgresError,
prepareEmbeddedPostgresNativeRuntime,
} from "@paperclipai/db";
import type { Command } from "commander";
import { ensureAgentJwtSecret, loadPaperclipEnvFile, mergePaperclipEnvEntries, readPaperclipEnvEntries, resolvePaperclipEnvFile } from "../config/env.js";
@@ -91,6 +94,7 @@ type WorktreeInitOptions = {
dbPort?: number;
seed?: boolean;
seedMode?: string;
preserveLiveWork?: boolean;
force?: boolean;
};
@@ -124,10 +128,23 @@ type WorktreeReseedOptions = {
fromDataDir?: string;
fromInstance?: string;
seedMode?: string;
preserveLiveWork?: boolean;
yes?: boolean;
allowLiveTarget?: boolean;
};
type WorktreeRepairOptions = {
branch?: string;
home?: string;
fromConfig?: string;
fromDataDir?: string;
fromInstance?: string;
seedMode?: string;
preserveLiveWork?: boolean;
noSeed?: boolean;
allowLiveTarget?: boolean;
};
type EmbeddedPostgresInstance = {
initialise(): Promise<void>;
start(): Promise<void>;
@@ -166,6 +183,8 @@ type CopiedGitHooksResult = {
type SeedWorktreeDatabaseResult = {
backupSummary: string;
pausedScheduledRoutines: number;
executionQuarantine: SeededWorktreeExecutionQuarantineSummary;
reboundWorkspaces: Array<{
name: string;
fromCwd: string;
@@ -173,6 +192,14 @@ type SeedWorktreeDatabaseResult = {
}>;
};
export type SeededWorktreeExecutionQuarantineSummary = {
disabledTimerHeartbeats: number;
resetRunningAgents: number;
quarantinedInProgressIssues: number;
unassignedTodoIssues: number;
unassignedReviewIssues: number;
};
function nonEmpty(value: string | null | undefined): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
@@ -185,6 +212,18 @@ function isCurrentSourceConfigPath(sourceConfigPath: string): boolean {
return path.resolve(currentConfigPath) === path.resolve(sourceConfigPath);
}
function formatSeededWorktreeExecutionQuarantineSummary(
summary: SeededWorktreeExecutionQuarantineSummary,
): string {
return [
`disabled timer heartbeats: ${summary.disabledTimerHeartbeats}`,
`reset running agents: ${summary.resetRunningAgents}`,
`quarantined in-progress issues: ${summary.quarantinedInProgressIssues}`,
`unassigned todo issues: ${summary.unassignedTodoIssues}`,
`unassigned review issues: ${summary.unassignedReviewIssues}`,
].join(", ");
}
const WORKTREE_NAME_PREFIX = "paperclip-";
function resolveWorktreeMakeName(name: string): string {
@@ -548,6 +587,46 @@ function detectGitBranchName(cwd: string): string | null {
}
}
function validateGitBranchName(cwd: string, branchName: string): string {
const value = nonEmpty(branchName);
if (!value) {
throw new Error("Branch name is required.");
}
try {
execFileSync("git", ["check-ref-format", "--branch", value], {
cwd,
stdio: ["ignore", "pipe", "pipe"],
});
} catch (error) {
throw new Error(`Invalid branch name "${branchName}": ${extractExecSyncErrorMessage(error) ?? String(error)}`);
}
return value;
}
function isPrimaryGitWorktree(cwd: string): boolean {
const workspace = detectGitWorkspaceInfo(cwd);
return Boolean(workspace && workspace.gitDir === workspace.commonDir);
}
function resolvePrimaryGitRepoRoot(cwd: string): string {
const workspace = detectGitWorkspaceInfo(cwd);
if (!workspace) {
throw new Error("Current directory is not inside a git repository.");
}
if (workspace.gitDir === workspace.commonDir) {
return workspace.root;
}
return path.resolve(workspace.commonDir, "..");
}
function resolveRepairWorktreeDirName(branchName: string): string {
const normalized = branchName.trim()
.replace(/[^A-Za-z0-9._-]+/g, "-")
.replace(/-+/g, "-")
.replace(/^[-._]+|[-._]+$/g, "");
return normalized || "worktree";
}
function detectGitWorkspaceInfo(cwd: string): GitWorkspaceInfo | null {
try {
const root = execFileSync("git", ["rev-parse", "--show-toplevel"], {
@@ -771,6 +850,21 @@ export function resolveWorktreeReseedSource(input: WorktreeReseedOptions): Resol
);
}
function resolveWorktreeRepairSource(input: WorktreeRepairOptions): ResolvedWorktreeReseedSource {
const fromConfig = nonEmpty(input.fromConfig);
const fromDataDir = nonEmpty(input.fromDataDir);
const fromInstance = nonEmpty(input.fromInstance) ?? "default";
const configPath = resolveSourceConfigPath({
fromConfig: fromConfig ?? undefined,
fromDataDir: fromDataDir ?? undefined,
fromInstance,
});
return {
configPath,
label: configPath,
};
}
export function resolveWorktreeReseedTargetPaths(input: {
configPath: string;
rootPath: string;
@@ -792,6 +886,105 @@ export function resolveWorktreeReseedTargetPaths(input: {
});
}
function resolveExistingGitWorktree(selector: string, cwd: string): MergeSourceChoice | null {
const trimmed = selector.trim();
if (trimmed.length === 0) return null;
const directPath = path.resolve(trimmed);
if (existsSync(directPath)) {
return {
worktree: directPath,
branch: null,
branchLabel: path.basename(directPath),
hasPaperclipConfig: existsSync(path.resolve(directPath, ".paperclip", "config.json")),
isCurrent: directPath === path.resolve(cwd),
};
}
return toMergeSourceChoices(cwd).find((choice) =>
choice.worktree === directPath
|| path.basename(choice.worktree) === trimmed
|| choice.branchLabel === trimmed
|| choice.branch === trimmed,
) ?? null;
}
async function ensureRepairTargetWorktree(input: {
selector?: string;
seedMode: WorktreeSeedMode;
opts: WorktreeRepairOptions;
}): Promise<ResolvedWorktreeRepairTarget | null> {
const cwd = process.cwd();
const currentRoot = path.resolve(cwd);
const currentConfigPath = path.resolve(currentRoot, ".paperclip", "config.json");
if (!input.selector) {
if (isPrimaryGitWorktree(cwd)) {
return null;
}
return {
rootPath: currentRoot,
configPath: currentConfigPath,
label: path.basename(currentRoot),
branchName: detectGitBranchName(cwd),
created: false,
};
}
const existing = resolveExistingGitWorktree(input.selector, cwd);
if (existing) {
return {
rootPath: existing.worktree,
configPath: path.resolve(existing.worktree, ".paperclip", "config.json"),
label: existing.branchLabel,
branchName: existing.branchLabel === "(detached)" ? null : existing.branchLabel,
created: false,
};
}
const repoRoot = resolvePrimaryGitRepoRoot(cwd);
const branchName = validateGitBranchName(repoRoot, input.selector);
const targetPath = path.resolve(
repoRoot,
".paperclip",
"worktrees",
resolveRepairWorktreeDirName(branchName),
);
if (existsSync(targetPath)) {
throw new Error(`Target path already exists but is not a registered git worktree: ${targetPath}`);
}
mkdirSync(path.dirname(targetPath), { recursive: true });
const spinner = p.spinner();
spinner.start(`Creating git worktree for ${branchName}...`);
try {
execFileSync("git", resolveGitWorktreeAddArgs({
branchName,
targetPath,
branchExists: localBranchExists(repoRoot, branchName),
}), {
cwd: repoRoot,
stdio: ["ignore", "pipe", "pipe"],
});
spinner.stop(`Created git worktree at ${targetPath}.`);
} catch (error) {
spinner.stop(pc.red("Failed to create git worktree."));
throw new Error(extractExecSyncErrorMessage(error) ?? String(error));
}
installDependenciesBestEffort(targetPath);
return {
rootPath: targetPath,
configPath: path.resolve(targetPath, ".paperclip", "config.json"),
label: branchName,
branchName,
created: true,
};
}
function resolveSourceConnectionString(config: PaperclipConfig, envEntries: Record<string, string>, portOverride?: number): string {
if (config.database.mode === "postgres") {
const connectionString = nonEmpty(envEntries.DATABASE_URL) ?? nonEmpty(config.database.connectionString);
@@ -867,6 +1060,7 @@ async function ensureEmbeddedPostgres(dataDir: string, preferredPort: number): P
"Embedded PostgreSQL support requires dependency `embedded-postgres`. Reinstall dependencies and try again.",
);
}
await prepareEmbeddedPostgresNativeRuntime();
const postmasterPidFile = path.resolve(dataDir, "postmaster.pid");
const runningPid = readRunningPostmasterPid(postmasterPidFile);
@@ -922,6 +1116,163 @@ async function ensureEmbeddedPostgres(dataDir: string, preferredPort: number): P
};
}
export async function pauseSeededScheduledRoutines(connectionString: string): Promise<number> {
const db = createDb(connectionString);
try {
const scheduledRoutineIds = await db
.selectDistinct({ routineId: routineTriggers.routineId })
.from(routineTriggers)
.where(and(eq(routineTriggers.kind, "schedule"), eq(routineTriggers.enabled, true)));
const idsToPause = scheduledRoutineIds
.map((row) => row.routineId)
.filter((value): value is string => Boolean(value));
if (idsToPause.length === 0) {
return 0;
}
const paused = await db
.update(routines)
.set({
status: "paused",
updatedAt: new Date(),
})
.where(and(inArray(routines.id, idsToPause), sql`${routines.status} <> 'paused'`, sql`${routines.status} <> 'archived'`))
.returning({ id: routines.id });
return paused.length;
} finally {
await db.$client?.end?.({ timeout: 5 }).catch(() => undefined);
}
}
const EMPTY_SEEDED_WORKTREE_EXECUTION_QUARANTINE_SUMMARY: SeededWorktreeExecutionQuarantineSummary = {
disabledTimerHeartbeats: 0,
resetRunningAgents: 0,
quarantinedInProgressIssues: 0,
unassignedTodoIssues: 0,
unassignedReviewIssues: 0,
};
function isRecord(value: unknown): value is Record<string, unknown> {
return Boolean(value) && typeof value === "object" && !Array.isArray(value);
}
function isEnabledValue(value: unknown): boolean {
return value === true || value === "true" || value === 1 || value === "1";
}
function normalizeWorktreeRuntimeConfig(runtimeConfig: unknown): {
runtimeConfig: Record<string, unknown>;
disabledTimerHeartbeat: boolean;
changed: boolean;
} {
const nextRuntimeConfig = isRecord(runtimeConfig) ? { ...runtimeConfig } : {};
const heartbeat = isRecord(nextRuntimeConfig.heartbeat) ? { ...nextRuntimeConfig.heartbeat } : null;
if (!heartbeat) {
return { runtimeConfig: nextRuntimeConfig, disabledTimerHeartbeat: false, changed: false };
}
const disabledTimerHeartbeat = isEnabledValue(heartbeat.enabled);
if (heartbeat.enabled !== false) {
heartbeat.enabled = false;
nextRuntimeConfig.heartbeat = heartbeat;
return { runtimeConfig: nextRuntimeConfig, disabledTimerHeartbeat, changed: true };
}
return { runtimeConfig: nextRuntimeConfig, disabledTimerHeartbeat: false, changed: false };
}
export async function quarantineSeededWorktreeExecutionState(
connectionString: string,
): Promise<SeededWorktreeExecutionQuarantineSummary> {
const db = createDb(connectionString);
const summary = { ...EMPTY_SEEDED_WORKTREE_EXECUTION_QUARANTINE_SUMMARY };
try {
await db.transaction(async (tx) => {
const seededAgents = await tx
.select({
id: agents.id,
status: agents.status,
runtimeConfig: agents.runtimeConfig,
})
.from(agents);
for (const agent of seededAgents) {
const normalized = normalizeWorktreeRuntimeConfig(agent.runtimeConfig);
const nextStatus = agent.status === "running" ? "idle" : agent.status;
if (normalized.disabledTimerHeartbeat) {
summary.disabledTimerHeartbeats += 1;
}
if (agent.status === "running") {
summary.resetRunningAgents += 1;
}
if (normalized.changed || nextStatus !== agent.status) {
await tx
.update(agents)
.set({
runtimeConfig: normalized.runtimeConfig,
status: nextStatus,
updatedAt: new Date(),
})
.where(eq(agents.id, agent.id));
}
}
const affectedIssues = await tx
.select({
id: issues.id,
companyId: issues.companyId,
status: issues.status,
})
.from(issues)
.where(
and(
sql`${issues.assigneeAgentId} is not null`,
sql`${issues.assigneeUserId} is null`,
inArray(issues.status, ["todo", "in_progress", "in_review"]),
),
);
for (const issue of affectedIssues) {
const nextStatus = issue.status === "in_progress" ? "blocked" : issue.status;
await tx
.update(issues)
.set({
status: nextStatus,
assigneeAgentId: null,
checkoutRunId: null,
executionRunId: null,
executionAgentNameKey: null,
executionLockedAt: null,
executionWorkspaceId: null,
updatedAt: new Date(),
})
.where(eq(issues.id, issue.id));
if (issue.status === "in_progress") {
summary.quarantinedInProgressIssues += 1;
await tx.insert(issueComments).values({
companyId: issue.companyId,
issueId: issue.id,
body:
"Quarantined during worktree seed so copied in-flight work does not auto-run in this isolated instance. " +
"Reassign or unblock here only if you intentionally want the worktree instance to own this task.",
});
} else if (issue.status === "todo") {
summary.unassignedTodoIssues += 1;
} else if (issue.status === "in_review") {
summary.unassignedReviewIssues += 1;
}
}
});
return summary;
} finally {
await db.$client?.end?.({ timeout: 5 }).catch(() => undefined);
}
}
async function seedWorktreeDatabase(input: {
sourceConfigPath: string;
sourceConfig: PaperclipConfig;
@@ -929,6 +1280,7 @@ async function seedWorktreeDatabase(input: {
targetPaths: WorktreeLocalPaths;
instanceId: string;
seedMode: WorktreeSeedMode;
preserveLiveWork?: boolean;
}): Promise<SeedWorktreeDatabaseResult> {
const seedPlan = resolveWorktreeSeedPlan(input.seedMode);
const sourceEnvFile = resolvePaperclipEnvFile(input.sourceConfigPath);
@@ -959,8 +1311,9 @@ async function seedWorktreeDatabase(input: {
const backup = await runDatabaseBackup({
connectionString: sourceConnectionString,
backupDir: path.resolve(input.targetPaths.backupDir, "seed"),
retentionDays: 7,
retention: { dailyDays: 7, weeklyWeeks: 4, monthlyMonths: 1 },
filenamePrefix: `${input.instanceId}-seed`,
backupEngine: "javascript",
includeMigrationJournal: true,
excludeTables: seedPlan.excludedTables,
nullifyColumns: seedPlan.nullifyColumns,
@@ -979,6 +1332,10 @@ async function seedWorktreeDatabase(input: {
backupFile: backup.backupFile,
});
await applyPendingMigrations(targetConnectionString);
const executionQuarantine = input.preserveLiveWork
? { ...EMPTY_SEEDED_WORKTREE_EXECUTION_QUARANTINE_SUMMARY }
: await quarantineSeededWorktreeExecutionState(targetConnectionString);
const pausedScheduledRoutines = await pauseSeededScheduledRoutines(targetConnectionString);
const reboundWorkspaces = await rebindSeededProjectWorkspaces({
targetConnectionString,
currentCwd: input.targetPaths.cwd,
@@ -986,6 +1343,8 @@ async function seedWorktreeDatabase(input: {
return {
backupSummary: formatDatabaseBackupResult(backup),
pausedScheduledRoutines,
executionQuarantine,
reboundWorkspaces,
};
} finally {
@@ -1028,7 +1387,12 @@ async function runWorktreeInit(opts: WorktreeInitOptions): Promise<void> {
}
if (opts.force) {
rmSync(paths.repoConfigDir, { recursive: true, force: true });
// Only remove the specific files we're about to rewrite, not the whole
// repoConfigDir — that directory can contain sibling state such as
// <repo>/.paperclip/worktrees/ holding every repo-managed worktree
// checkout, and a recursive rmSync here would nuke them all.
rmSync(paths.configPath, { force: true });
rmSync(paths.envPath, { force: true });
rmSync(paths.instanceRoot, { recursive: true, force: true });
}
@@ -1064,6 +1428,8 @@ async function runWorktreeInit(opts: WorktreeInitOptions): Promise<void> {
const copiedGitHooks = copyGitHooksToWorktreeGitDir(cwd);
let seedSummary: string | null = null;
let seedExecutionQuarantineSummary: SeededWorktreeExecutionQuarantineSummary | null = null;
let pausedScheduledRoutineCount: number | null = null;
let reboundWorkspaceSummary: SeedWorktreeDatabaseResult["reboundWorkspaces"] = [];
if (opts.seed !== false) {
if (!sourceConfig) {
@@ -1081,8 +1447,11 @@ async function runWorktreeInit(opts: WorktreeInitOptions): Promise<void> {
targetPaths: paths,
instanceId,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
});
seedSummary = seeded.backupSummary;
seedExecutionQuarantineSummary = seeded.executionQuarantine;
pausedScheduledRoutineCount = seeded.pausedScheduledRoutines;
reboundWorkspaceSummary = seeded.reboundWorkspaces;
spinner.stop(`Seeded isolated worktree database (${seedMode}).`);
} catch (error) {
@@ -1105,6 +1474,16 @@ async function runWorktreeInit(opts: WorktreeInitOptions): Promise<void> {
if (seedSummary) {
p.log.message(pc.dim(`Seed mode: ${seedMode}`));
p.log.message(pc.dim(`Seed snapshot: ${seedSummary}`));
if (opts.preserveLiveWork) {
p.log.warning("Preserved copied live work; this worktree instance may auto-run source-instance assignments.");
} else if (seedExecutionQuarantineSummary) {
p.log.message(
pc.dim(`Seed execution quarantine: ${formatSeededWorktreeExecutionQuarantineSummary(seedExecutionQuarantineSummary)}`),
);
}
if (pausedScheduledRoutineCount != null) {
p.log.message(pc.dim(`Paused scheduled routines: ${pausedScheduledRoutineCount}`));
}
for (const rebound of reboundWorkspaceSummary) {
p.log.message(
pc.dim(`Rebound workspace ${rebound.name}: ${rebound.fromCwd} -> ${rebound.toCwd}`),
@@ -1172,18 +1551,7 @@ export async function worktreeMakeCommand(nameArg: string, opts: WorktreeMakeOpt
throw new Error(extractExecSyncErrorMessage(error) ?? String(error));
}
const installSpinner = p.spinner();
installSpinner.start("Installing dependencies...");
try {
execFileSync("pnpm", ["install"], {
cwd: targetPath,
stdio: ["ignore", "pipe", "pipe"],
});
installSpinner.stop("Installed dependencies.");
} catch (error) {
installSpinner.stop(pc.yellow("Failed to install dependencies (continuing anyway)."));
p.log.warning(extractExecSyncErrorMessage(error) ?? String(error));
}
installDependenciesBestEffort(targetPath);
const originalCwd = process.cwd();
try {
@@ -1200,6 +1568,21 @@ export async function worktreeMakeCommand(nameArg: string, opts: WorktreeMakeOpt
}
}
function installDependenciesBestEffort(targetPath: string): void {
const installSpinner = p.spinner();
installSpinner.start("Installing dependencies...");
try {
execFileSync("pnpm", ["install"], {
cwd: targetPath,
stdio: ["ignore", "pipe", "pipe"],
});
installSpinner.stop("Installed dependencies.");
} catch (error) {
installSpinner.stop(pc.yellow("Failed to install dependencies (continuing anyway)."));
p.log.warning(extractExecSyncErrorMessage(error) ?? String(error));
}
}
type WorktreeCleanupOptions = {
instance?: string;
home?: string;
@@ -1233,6 +1616,14 @@ type ResolvedWorktreeReseedSource = {
label: string;
};
type ResolvedWorktreeRepairTarget = {
rootPath: string;
configPath: string;
label: string;
branchName: string | null;
created: boolean;
};
function parseGitWorktreeList(cwd: string): GitWorktreeListEntry[] {
const raw = execFileSync("git", ["worktree", "list", "--porcelain"], {
cwd,
@@ -2674,10 +3065,7 @@ export async function worktreeMergeHistoryCommand(sourceArg: string | undefined,
}
}
export async function worktreeReseedCommand(opts: WorktreeReseedOptions): Promise<void> {
printPaperclipCliBanner();
p.intro(pc.bgCyan(pc.black(" paperclipai worktree reseed ")));
async function runWorktreeReseed(opts: WorktreeReseedOptions): Promise<void> {
const seedMode = opts.seedMode ?? "full";
if (!isWorktreeSeedMode(seedMode)) {
throw new Error(`Unsupported seed mode "${seedMode}". Expected one of: minimal, full.`);
@@ -2740,11 +3128,20 @@ export async function worktreeReseedCommand(opts: WorktreeReseedOptions): Promis
targetPaths,
instanceId: targetPaths.instanceId,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
});
spinner.stop(`Reseeded ${targetEndpoint.label} (${seedMode}).`);
p.log.message(pc.dim(`Source: ${source.configPath}`));
p.log.message(pc.dim(`Target: ${targetEndpoint.configPath}`));
p.log.message(pc.dim(`Seed snapshot: ${seeded.backupSummary}`));
if (opts.preserveLiveWork) {
p.log.warning("Preserved copied live work; this worktree instance may auto-run source-instance assignments.");
} else {
p.log.message(
pc.dim(`Seed execution quarantine: ${formatSeededWorktreeExecutionQuarantineSummary(seeded.executionQuarantine)}`),
);
}
p.log.message(pc.dim(`Paused scheduled routines: ${seeded.pausedScheduledRoutines}`));
for (const rebound of seeded.reboundWorkspaces) {
p.log.message(
pc.dim(`Rebound workspace ${rebound.name}: ${rebound.fromCwd} -> ${rebound.toCwd}`),
@@ -2757,6 +3154,98 @@ export async function worktreeReseedCommand(opts: WorktreeReseedOptions): Promis
}
}
export async function worktreeReseedCommand(opts: WorktreeReseedOptions): Promise<void> {
printPaperclipCliBanner();
p.intro(pc.bgCyan(pc.black(" paperclipai worktree reseed ")));
await runWorktreeReseed(opts);
}
export async function worktreeRepairCommand(opts: WorktreeRepairOptions): Promise<void> {
printPaperclipCliBanner();
p.intro(pc.bgCyan(pc.black(" paperclipai worktree repair ")));
const seedMode = opts.seedMode ?? "minimal";
if (!isWorktreeSeedMode(seedMode)) {
throw new Error(`Unsupported seed mode "${seedMode}". Expected one of: minimal, full.`);
}
const target = await ensureRepairTargetWorktree({
selector: nonEmpty(opts.branch) ?? undefined,
seedMode,
opts,
});
if (!target) {
p.log.warn("Current checkout is the primary repo worktree. Pass --branch to create or repair a linked worktree.");
p.outro(pc.yellow("No worktree repaired."));
return;
}
const source = resolveWorktreeRepairSource(opts);
if (!existsSync(source.configPath)) {
throw new Error(`Source config not found at ${source.configPath}.`);
}
if (path.resolve(source.configPath) === path.resolve(target.configPath)) {
throw new Error("Source and target Paperclip configs are the same. Use --from-config/--from-instance to point repair at a different source.");
}
const targetConfig = existsSync(target.configPath) ? readConfig(target.configPath) : null;
const targetEnvEntries = readPaperclipEnvEntries(resolvePaperclipEnvFile(target.configPath));
const targetHasWorktreeEnv = Boolean(
nonEmpty(targetEnvEntries.PAPERCLIP_HOME) && nonEmpty(targetEnvEntries.PAPERCLIP_INSTANCE_ID),
);
if (targetConfig && targetHasWorktreeEnv && opts.noSeed) {
p.log.message(pc.dim(`Target ${target.label} already has worktree-local config/env. Skipping reseed because --no-seed was passed.`));
p.outro(pc.green(`Worktree metadata already looks healthy for ${target.label}.`));
return;
}
if (targetConfig && targetHasWorktreeEnv) {
await runWorktreeReseed({
fromConfig: source.configPath,
to: target.rootPath,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
yes: true,
allowLiveTarget: opts.allowLiveTarget,
});
return;
}
const repairInstanceId = sanitizeWorktreeInstanceId(path.basename(target.rootPath));
const repairPaths = resolveWorktreeLocalPaths({
cwd: target.rootPath,
homeDir: resolveWorktreeHome(opts.home),
instanceId: repairInstanceId,
});
const runningTargetPid = readRunningPostmasterPid(path.resolve(repairPaths.embeddedPostgresDataDir, "postmaster.pid"));
if (runningTargetPid && !opts.allowLiveTarget) {
throw new Error(
`Target worktree database appears to be running (pid ${runningTargetPid}). Stop Paperclip in ${target.rootPath} before repairing, or re-run with --allow-live-target if you want to override this guard.`,
);
}
if (runningTargetPid && opts.allowLiveTarget) {
p.log.warning(`Proceeding even though the target embedded PostgreSQL appears to be running (pid ${runningTargetPid}).`);
}
const originalCwd = process.cwd();
try {
process.chdir(target.rootPath);
await runWorktreeInit({
home: opts.home,
fromConfig: source.configPath,
fromDataDir: opts.fromDataDir,
fromInstance: opts.fromInstance,
seed: opts.noSeed ? false : true,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
force: true,
});
} finally {
process.chdir(originalCwd);
}
}
export function registerWorktreeCommands(program: Command): void {
const worktree = program.command("worktree").description("Worktree-local Paperclip instance helpers");
@@ -2773,6 +3262,7 @@ export function registerWorktreeCommands(program: Command): void {
.option("--server-port <port>", "Preferred server port", (value) => Number(value))
.option("--db-port <port>", "Preferred embedded Postgres port", (value) => Number(value))
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: minimal)", "minimal")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--no-seed", "Skip database seeding from the source instance")
.option("--force", "Replace existing repo-local config and isolated instance data", false)
.action(worktreeMakeCommand);
@@ -2789,6 +3279,7 @@ export function registerWorktreeCommands(program: Command): void {
.option("--server-port <port>", "Preferred server port", (value) => Number(value))
.option("--db-port <port>", "Preferred embedded Postgres port", (value) => Number(value))
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: minimal)", "minimal")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--no-seed", "Skip database seeding from the source instance")
.option("--force", "Replace existing repo-local config and isolated instance data", false)
.action(worktreeInitCommand);
@@ -2828,10 +3319,25 @@ export function registerWorktreeCommands(program: Command): void {
.option("--from-data-dir <path>", "Source PAPERCLIP_HOME used when deriving the source config")
.option("--from-instance <id>", "Source instance id when deriving the source config")
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: full)", "full")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--yes", "Skip the destructive confirmation prompt", false)
.option("--allow-live-target", "Override the guard that requires the target worktree DB to be stopped first", false)
.action(worktreeReseedCommand);
worktree
.command("repair")
.description("Create or repair a linked worktree-local Paperclip instance without touching the primary checkout")
.option("--branch <name>", "Existing branch/worktree selector to repair, or a branch name to create under .paperclip/worktrees")
.option("--home <path>", `Home root for worktree instances (env: PAPERCLIP_WORKTREES_DIR, default: ${DEFAULT_WORKTREE_HOME})`)
.option("--from-config <path>", "Source config.json to seed from")
.option("--from-data-dir <path>", "Source PAPERCLIP_HOME used when deriving the source config")
.option("--from-instance <id>", "Source instance id when deriving the source config (default: default)")
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: minimal)", "minimal")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--no-seed", "Repair metadata only and skip reseeding when bootstrapping a missing worktree config", false)
.option("--allow-live-target", "Override the guard that requires the target worktree DB to be stopped first", false)
.action(worktreeRepairCommand);
program
.command("worktree:cleanup")
.description("Safely remove a worktree, its branch, and its isolated instance data")
+26 -33
View File
@@ -1,32 +1,31 @@
import os from "node:os";
import path from "node:path";
import {
expandHomePrefix,
resolveDefaultBackupDir as resolveSharedDefaultBackupDir,
resolveDefaultEmbeddedPostgresDir as resolveSharedDefaultEmbeddedPostgresDir,
resolveDefaultLogsDir as resolveSharedDefaultLogsDir,
resolveDefaultSecretsKeyFilePath as resolveSharedDefaultSecretsKeyFilePath,
resolveDefaultStorageDir as resolveSharedDefaultStorageDir,
resolveHomeAwarePath,
resolvePaperclipConfigPathForInstance,
resolvePaperclipHomeDir,
resolvePaperclipInstanceId,
resolvePaperclipInstanceRoot as resolveSharedPaperclipInstanceRoot,
} from "@paperclipai/shared/home-paths";
const DEFAULT_INSTANCE_ID = "default";
const INSTANCE_ID_RE = /^[a-zA-Z0-9_-]+$/;
export function resolvePaperclipHomeDir(): string {
const envHome = process.env.PAPERCLIP_HOME?.trim();
if (envHome) return path.resolve(expandHomePrefix(envHome));
return path.resolve(os.homedir(), ".paperclip");
}
export function resolvePaperclipInstanceId(override?: string): string {
const raw = override?.trim() || process.env.PAPERCLIP_INSTANCE_ID?.trim() || DEFAULT_INSTANCE_ID;
if (!INSTANCE_ID_RE.test(raw)) {
throw new Error(
`Invalid instance id '${raw}'. Allowed characters: letters, numbers, '_' and '-'.`,
);
}
return raw;
}
export {
expandHomePrefix,
resolveHomeAwarePath,
resolvePaperclipHomeDir,
resolvePaperclipInstanceId,
};
export function resolvePaperclipInstanceRoot(instanceId?: string): string {
const id = resolvePaperclipInstanceId(instanceId);
return path.resolve(resolvePaperclipHomeDir(), "instances", id);
return resolveSharedPaperclipInstanceRoot({ instanceId });
}
export function resolveDefaultConfigPath(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "config.json");
return resolvePaperclipConfigPathForInstance({ instanceId });
}
export function resolveDefaultContextPath(): string {
@@ -38,29 +37,23 @@ export function resolveDefaultCliAuthPath(): string {
}
export function resolveDefaultEmbeddedPostgresDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "db");
return resolveSharedDefaultEmbeddedPostgresDir({ instanceId });
}
export function resolveDefaultLogsDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "logs");
return resolveSharedDefaultLogsDir({ instanceId });
}
export function resolveDefaultSecretsKeyFilePath(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "secrets", "master.key");
return resolveSharedDefaultSecretsKeyFilePath({ instanceId });
}
export function resolveDefaultStorageDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "data", "storage");
return resolveSharedDefaultStorageDir({ instanceId });
}
export function resolveDefaultBackupDir(instanceId?: string): string {
return path.resolve(resolvePaperclipInstanceRoot(instanceId), "data", "backups");
}
export function expandHomePrefix(value: string): string {
if (value === "~") return os.homedir();
if (value.startsWith("~/")) return path.resolve(os.homedir(), value.slice(2));
return value;
return resolveSharedDefaultBackupDir({ instanceId });
}
export function describeLocalInstancePaths(instanceId?: string) {
+183
View File
@@ -0,0 +1,183 @@
import { execFileSync } from "node:child_process";
import {
ALL_INTERFACES_BIND_HOST,
LOOPBACK_BIND_HOST,
inferBindModeFromHost,
isAllInterfacesHost,
isLoopbackHost,
type BindMode,
type DeploymentExposure,
type DeploymentMode,
} from "@paperclipai/shared";
import type { AuthConfig, ServerConfig } from "./schema.js";
const TAILSCALE_DETECT_TIMEOUT_MS = 3000;
type BaseServerInput = {
port: number;
allowedHostnames: string[];
serveUi: boolean;
};
export function inferConfiguredBind(server?: Partial<ServerConfig>): BindMode {
if (server?.bind) return server.bind;
return inferBindModeFromHost(server?.customBindHost ?? server?.host);
}
export function detectTailnetBindHost(): string | undefined {
const explicit = process.env.PAPERCLIP_TAILNET_BIND_HOST?.trim();
if (explicit) return explicit;
try {
const stdout = execFileSync("tailscale", ["ip", "-4"], {
encoding: "utf8",
stdio: ["ignore", "pipe", "ignore"],
timeout: TAILSCALE_DETECT_TIMEOUT_MS,
});
return stdout
.split(/\r?\n/)
.map((line) => line.trim())
.find(Boolean);
} catch {
return undefined;
}
}
export function buildPresetServerConfig(
bind: Exclude<BindMode, "custom">,
input: BaseServerInput,
): { server: ServerConfig; auth: AuthConfig } {
const host =
bind === "loopback"
? LOOPBACK_BIND_HOST
: bind === "tailnet"
? (detectTailnetBindHost() ?? LOOPBACK_BIND_HOST)
: ALL_INTERFACES_BIND_HOST;
return {
server: {
deploymentMode: bind === "loopback" ? "local_trusted" : "authenticated",
exposure: "private",
bind,
customBindHost: undefined,
host,
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
},
auth: {
baseUrlMode: "auto",
disableSignUp: false,
},
};
}
export function buildCustomServerConfig(input: BaseServerInput & {
deploymentMode: DeploymentMode;
exposure: DeploymentExposure;
host: string;
publicBaseUrl?: string;
}): { server: ServerConfig; auth: AuthConfig } {
const normalizedHost = input.host.trim();
const bind = isLoopbackHost(normalizedHost)
? "loopback"
: isAllInterfacesHost(normalizedHost)
? "lan"
: "custom";
return {
server: {
deploymentMode: input.deploymentMode,
exposure: input.deploymentMode === "local_trusted" ? "private" : input.exposure,
bind,
customBindHost: bind === "custom" ? normalizedHost : undefined,
host: normalizedHost,
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
},
auth:
input.deploymentMode === "authenticated" && input.exposure === "public"
? {
baseUrlMode: "explicit",
disableSignUp: false,
publicBaseUrl: input.publicBaseUrl,
}
: {
baseUrlMode: "auto",
disableSignUp: false,
},
};
}
export function resolveQuickstartServerConfig(input: {
bind?: BindMode | null;
deploymentMode?: DeploymentMode | null;
exposure?: DeploymentExposure | null;
host?: string | null;
port: number;
allowedHostnames: string[];
serveUi: boolean;
publicBaseUrl?: string;
}): { server: ServerConfig; auth: AuthConfig } {
const trimmedHost = input.host?.trim();
const explicitBind = input.bind ?? null;
if (explicitBind === "loopback" || explicitBind === "lan" || explicitBind === "tailnet") {
return buildPresetServerConfig(explicitBind, {
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
});
}
if (explicitBind === "custom") {
return buildCustomServerConfig({
deploymentMode: input.deploymentMode ?? "authenticated",
exposure: input.exposure ?? "private",
host: trimmedHost || LOOPBACK_BIND_HOST,
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
publicBaseUrl: input.publicBaseUrl,
});
}
if (trimmedHost) {
return buildCustomServerConfig({
deploymentMode: input.deploymentMode ?? (isLoopbackHost(trimmedHost) ? "local_trusted" : "authenticated"),
exposure: input.exposure ?? "private",
host: trimmedHost,
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
publicBaseUrl: input.publicBaseUrl,
});
}
if (input.deploymentMode === "authenticated") {
if (input.exposure === "public") {
return buildCustomServerConfig({
deploymentMode: "authenticated",
exposure: "public",
host: ALL_INTERFACES_BIND_HOST,
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
publicBaseUrl: input.publicBaseUrl,
});
}
return buildPresetServerConfig("lan", {
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
});
}
return buildPresetServerConfig("loopback", {
port: input.port,
allowedHostnames: input.allowedHostnames,
serveUi: input.serveUi,
});
}
+11 -1
View File
@@ -8,6 +8,7 @@ import { heartbeatRun } from "./commands/heartbeat-run.js";
import { runCommand } from "./commands/run.js";
import { bootstrapCeoInvite } from "./commands/auth-bootstrap-ceo.js";
import { dbBackupCommand } from "./commands/db-backup.js";
import { registerEnvLabCommands } from "./commands/env-lab.js";
import { registerContextCommands } from "./commands/client/context.js";
import { registerCompanyCommands } from "./commands/client/company.js";
import { registerIssueCommands } from "./commands/client/issue.js";
@@ -17,6 +18,9 @@ import { registerActivityCommands } from "./commands/client/activity.js";
import { registerDashboardCommands } from "./commands/client/dashboard.js";
import { registerRoutineCommands } from "./commands/routines.js";
import { registerFeedbackCommands } from "./commands/client/feedback.js";
import { registerSecretCommands } from "./commands/client/secrets.js";
import { registerCloudCommands } from "./commands/client/cloud.js";
import { registerSkillsCommands } from "./commands/client/skills.js";
import { applyDataDirOverride, type DataDirOptionLike } from "./config/data-dir.js";
import { loadPaperclipEnvFile } from "./config/env.js";
import { initTelemetryFromConfigFile, flushTelemetry } from "./telemetry.js";
@@ -50,7 +54,8 @@ program
.description("Interactive first-run setup wizard")
.option("-c, --config <path>", "Path to config file")
.option("-d, --data-dir <path>", DATA_DIR_OPTION_HELP)
.option("-y, --yes", "Accept defaults (quickstart + start immediately)", false)
.option("--bind <mode>", "Quickstart reachability preset (loopback, lan, tailnet)")
.option("-y, --yes", "Accept quickstart defaults (trusted local loopback unless --bind is set) and start immediately", false)
.option("--run", "Start Paperclip immediately after saving config", false)
.action(onboard);
@@ -108,6 +113,7 @@ program
.option("-c, --config <path>", "Path to config file")
.option("-d, --data-dir <path>", DATA_DIR_OPTION_HELP)
.option("-i, --instance <id>", "Local instance id (default: default)")
.option("--bind <mode>", "On first run, use onboarding reachability preset (loopback, lan, tailnet)")
.option("--repair", "Attempt automatic repairs during doctor", true)
.option("--no-repair", "Disable automatic repairs during doctor")
.action(runCommand);
@@ -144,7 +150,11 @@ registerActivityCommands(program);
registerDashboardCommands(program);
registerRoutineCommands(program);
registerFeedbackCommands(program);
registerSecretCommands(program);
registerCloudCommands(program);
registerSkillsCommands(program);
registerWorktreeCommands(program);
registerEnvLabCommands(program);
registerPluginCommands(program);
const auth = program.command("auth").description("Authentication and bootstrap utilities");
+4 -2
View File
@@ -32,7 +32,7 @@ export async function promptSecrets(current?: SecretsConfig): Promise<SecretsCon
{
value: "aws_secrets_manager" as const,
label: "AWS Secrets Manager",
hint: "requires external adapter integration",
hint: "requires runtime AWS credentials and provider env config",
},
{
value: "gcp_secret_manager" as const,
@@ -84,7 +84,9 @@ export async function promptSecrets(current?: SecretsConfig): Promise<SecretsCon
if (provider !== "local_encrypted") {
p.note(
`${provider} is not fully wired in this build yet. Keep local_encrypted unless you are actively implementing that adapter.`,
provider === "aws_secrets_manager"
? "AWS credentials must come from the Paperclip server runtime (IAM role/workload identity, AWS_PROFILE/SSO/shared credentials, or short-lived shell env), not from Paperclip company secrets."
: `${provider} is not fully wired in this build yet. Keep local_encrypted unless you are actively implementing that adapter.`,
"Heads up",
);
}
+147 -90
View File
@@ -1,6 +1,16 @@
import * as p from "@clack/prompts";
import { isLoopbackHost, type BindMode } from "@paperclipai/shared";
import type { AuthConfig, ServerConfig } from "../config/schema.js";
import { parseHostnameCsv } from "../config/hostnames.js";
import { buildCustomServerConfig, buildPresetServerConfig, inferConfiguredBind } from "../config/server-bind.js";
const TAILNET_BIND_WARNING =
"No Tailscale address was detected during setup. The saved config will stay on loopback until Tailscale is available or PAPERCLIP_TAILNET_BIND_HOST is set.";
function cancelled(): never {
p.cancel("Setup cancelled.");
process.exit(0);
}
export async function promptServer(opts?: {
currentServer?: Partial<ServerConfig>;
@@ -8,69 +18,37 @@ export async function promptServer(opts?: {
}): Promise<{ server: ServerConfig; auth: AuthConfig }> {
const currentServer = opts?.currentServer;
const currentAuth = opts?.currentAuth;
const currentBind = inferConfiguredBind(currentServer);
const deploymentModeSelection = await p.select({
message: "Deployment mode",
const bindSelection = await p.select({
message: "Reachability",
options: [
{
value: "local_trusted",
label: "Local trusted",
hint: "Easiest for local setup (no login, localhost-only)",
value: "loopback" as const,
label: "Trusted local",
hint: "Recommended for first run: localhost only, no login friction",
},
{
value: "authenticated",
label: "Authenticated",
hint: "Login required; use for private network or public hosting",
value: "lan" as const,
label: "Private network",
hint: "Broad private bind for LAN, VPN, or legacy --tailscale-auth style access",
},
{
value: "tailnet" as const,
label: "Tailnet",
hint: "Private authenticated access using the machine's detected Tailscale address",
},
{
value: "custom" as const,
label: "Custom",
hint: "Choose exact auth mode, exposure, and host manually",
},
],
initialValue: currentServer?.deploymentMode ?? "local_trusted",
initialValue: currentBind,
});
if (p.isCancel(deploymentModeSelection)) {
p.cancel("Setup cancelled.");
process.exit(0);
}
const deploymentMode = deploymentModeSelection as ServerConfig["deploymentMode"];
let exposure: ServerConfig["exposure"] = "private";
if (deploymentMode === "authenticated") {
const exposureSelection = await p.select({
message: "Exposure profile",
options: [
{
value: "private",
label: "Private network",
hint: "Private access (for example Tailscale), lower setup friction",
},
{
value: "public",
label: "Public internet",
hint: "Internet-facing deployment with stricter requirements",
},
],
initialValue: currentServer?.exposure ?? "private",
});
if (p.isCancel(exposureSelection)) {
p.cancel("Setup cancelled.");
process.exit(0);
}
exposure = exposureSelection as ServerConfig["exposure"];
}
const hostDefault = deploymentMode === "local_trusted" ? "127.0.0.1" : "0.0.0.0";
const hostStr = await p.text({
message: "Bind host",
defaultValue: currentServer?.host ?? hostDefault,
placeholder: hostDefault,
validate: (val) => {
if (!val.trim()) return "Host is required";
},
});
if (p.isCancel(hostStr)) {
p.cancel("Setup cancelled.");
process.exit(0);
}
if (p.isCancel(bindSelection)) cancelled();
const bind = bindSelection as BindMode;
const portStr = await p.text({
message: "Server port",
@@ -84,15 +62,113 @@ export async function promptServer(opts?: {
},
});
if (p.isCancel(portStr)) {
p.cancel("Setup cancelled.");
process.exit(0);
if (p.isCancel(portStr)) cancelled();
const port = Number(portStr) || 3100;
const serveUi = currentServer?.serveUi ?? true;
if (bind === "loopback") {
return buildPresetServerConfig("loopback", {
port,
allowedHostnames: [],
serveUi,
});
}
if (bind === "lan" || bind === "tailnet") {
const allowedHostnamesInput = await p.text({
message: "Allowed private hostnames (comma-separated, optional)",
defaultValue: (currentServer?.allowedHostnames ?? []).join(", "),
placeholder:
bind === "tailnet"
? "your-machine.tailnet.ts.net"
: "dotta-macbook-pro, host.docker.internal",
validate: (val) => {
try {
parseHostnameCsv(val);
return;
} catch (err) {
return err instanceof Error ? err.message : "Invalid hostname list";
}
},
});
if (p.isCancel(allowedHostnamesInput)) cancelled();
const preset = buildPresetServerConfig(bind, {
port,
allowedHostnames: parseHostnameCsv(allowedHostnamesInput),
serveUi,
});
if (bind === "tailnet" && isLoopbackHost(preset.server.host)) {
p.log.warn(TAILNET_BIND_WARNING);
}
return preset;
}
const deploymentModeSelection = await p.select({
message: "Auth mode",
options: [
{
value: "local_trusted",
label: "Local trusted",
hint: "No login required; only safe with loopback-only or similarly trusted access",
},
{
value: "authenticated",
label: "Authenticated",
hint: "Login required; supports both private-network and public deployments",
},
],
initialValue: currentServer?.deploymentMode ?? "authenticated",
});
if (p.isCancel(deploymentModeSelection)) cancelled();
const deploymentMode = deploymentModeSelection as ServerConfig["deploymentMode"];
let exposure: ServerConfig["exposure"] = "private";
if (deploymentMode === "authenticated") {
const exposureSelection = await p.select({
message: "Exposure profile",
options: [
{
value: "private",
label: "Private network",
hint: "Private access only, with automatic URL handling",
},
{
value: "public",
label: "Public internet",
hint: "Internet-facing deployment with explicit public URL requirements",
},
],
initialValue: currentServer?.exposure ?? "private",
});
if (p.isCancel(exposureSelection)) cancelled();
exposure = exposureSelection as ServerConfig["exposure"];
}
const defaultHost =
currentServer?.customBindHost ??
currentServer?.host ??
(deploymentMode === "local_trusted" ? "127.0.0.1" : "0.0.0.0");
const host = await p.text({
message: "Bind host",
defaultValue: defaultHost,
placeholder: defaultHost,
validate: (val) => {
if (!val.trim()) return "Host is required";
if (deploymentMode === "local_trusted" && !isLoopbackHost(val.trim())) {
return "Local trusted mode requires a loopback host such as 127.0.0.1";
}
},
});
if (p.isCancel(host)) cancelled();
let allowedHostnames: string[] = [];
if (deploymentMode === "authenticated" && exposure === "private") {
const allowedHostnamesInput = await p.text({
message: "Allowed hostnames (comma-separated, optional)",
message: "Allowed private hostnames (comma-separated, optional)",
defaultValue: (currentServer?.allowedHostnames ?? []).join(", "),
placeholder: "dotta-macbook-pro, your-host.tailnet.ts.net",
validate: (val) => {
@@ -105,15 +181,11 @@ export async function promptServer(opts?: {
},
});
if (p.isCancel(allowedHostnamesInput)) {
p.cancel("Setup cancelled.");
process.exit(0);
}
if (p.isCancel(allowedHostnamesInput)) cancelled();
allowedHostnames = parseHostnameCsv(allowedHostnamesInput);
}
const port = Number(portStr) || 3100;
let auth: AuthConfig = { baseUrlMode: "auto", disableSignUp: false };
let publicBaseUrl: string | undefined;
if (deploymentMode === "authenticated" && exposure === "public") {
const urlInput = await p.text({
message: "Public base URL",
@@ -133,32 +205,17 @@ export async function promptServer(opts?: {
}
},
});
if (p.isCancel(urlInput)) {
p.cancel("Setup cancelled.");
process.exit(0);
}
auth = {
baseUrlMode: "explicit",
disableSignUp: false,
publicBaseUrl: urlInput.trim().replace(/\/+$/, ""),
};
} else if (currentAuth?.baseUrlMode === "explicit" && currentAuth.publicBaseUrl) {
auth = {
baseUrlMode: "explicit",
disableSignUp: false,
publicBaseUrl: currentAuth.publicBaseUrl,
};
if (p.isCancel(urlInput)) cancelled();
publicBaseUrl = urlInput.trim().replace(/\/+$/, "");
}
return {
server: {
deploymentMode,
exposure,
host: hostStr.trim(),
port,
allowedHostnames,
serveUi: currentServer?.serveUi ?? true,
},
auth,
};
return buildCustomServerConfig({
deploymentMode,
exposure,
host: host.trim(),
port,
allowedHostnames,
serveUi,
publicBaseUrl,
});
}
+1 -1
View File
@@ -4,5 +4,5 @@
"outDir": "dist",
"rootDir": ".."
},
"include": ["src", "../packages/shared/src"]
"include": ["src", "../packages/shared/src", "../packages/plugins/create-paperclip-plugin/src"]
}
+180 -4
View File
@@ -2,7 +2,7 @@
Paperclip CLI now supports both:
- instance setup/diagnostics (`onboard`, `doctor`, `configure`, `env`, `allowed-hostname`)
- instance setup/diagnostics (`onboard`, `doctor`, `configure`, `env`, `allowed-hostname`, `env-lab`)
- control-plane client operations (issues, approvals, agents, activity, dashboard)
## Base Usage
@@ -32,10 +32,12 @@ Mode taxonomy and design intent are documented in `doc/DEPLOYMENT-MODES.md`.
Current CLI behavior:
- `paperclipai onboard` and `paperclipai configure --section server` set deployment mode in config
- server onboarding/configure ask for reachability intent and write `server.bind`
- `paperclipai run --bind <loopback|lan|tailnet>` passes a quickstart bind preset into first-run onboarding when config is missing
- runtime can override mode with `PAPERCLIP_DEPLOYMENT_MODE`
- `paperclipai run` and `paperclipai doctor` do not yet expose a direct `--mode` flag
- `paperclipai run` and `paperclipai doctor` still do not expose a direct low-level `--mode` flag
Target behavior (planned) is documented in `doc/DEPLOYMENT-MODES.md` section 5.
Canonical behavior is documented in `doc/DEPLOYMENT-MODES.md`.
Allow an authenticated/private hostname (for example custom Tailscale DNS):
@@ -43,6 +45,15 @@ Allow an authenticated/private hostname (for example custom Tailscale DNS):
pnpm paperclipai allowed-hostname dotta-macbook-pro
```
Bring up the default local SSH fixture for environment testing:
```sh
pnpm paperclipai env-lab up
pnpm paperclipai env-lab doctor
pnpm paperclipai env-lab status --json
pnpm paperclipai env-lab down
```
All client commands support:
- `--data-dir <path>`
@@ -132,6 +143,150 @@ pnpm paperclipai agent local-cli codexcoder --company-id <company-id>
pnpm paperclipai agent local-cli claudecoder --company-id <company-id>
```
## Skills Commands
`paperclipai skills` covers three distinct operations:
1. **Company install** — adds or updates a row in `company_skills` for the
whole company. This is what `skills install`, `skills import`, `skills create`,
and `skills scan-projects` do.
2. **Agent attach** — replaces an agent's *desired* company skill set
(`skills agent sync`/`clear`). This is a desired-state operation on the
agent's adapter config; it does not change the company library.
3. **Adapter runtime sync** — the adapter reconciles the desired skill set
with files on disk and reports an `AgentSkillSnapshot` (`skills agent list`).
`skills agent sync` triggers this automatically after updating desired state.
Required Paperclip runtime skills (heartbeat, etc.) remain server-enforced and
are added on top of whatever the desired set names.
### Catalog (app-shipped skills)
The Paperclip app ships a curated catalog under `@paperclipai/skills-catalog`.
Browse and inspect commands never mutate company state; `install` adds a catalog
skill to the company library.
```sh
pnpm paperclipai skills browse [--kind bundled|optional] [--category <slug>] [--query <text>]
pnpm paperclipai skills search "<text>" [--kind bundled|optional] [--category <slug>]
pnpm paperclipai skills inspect <catalog-id-or-key-or-slug>
pnpm paperclipai skills install <catalog-id-or-key-or-slug> [--as <slug>] [--force] --company-id <company-id>
```
Catalog semantics:
- **Bundled** skills live in `packages/skills-catalog/catalog/bundled/<category>/<slug>`
and are recommended defaults for most companies. They use canonical key
`paperclipai/bundled/<category>/<slug>`.
- **Optional** skills live in `packages/skills-catalog/catalog/optional/<category>/<slug>`
and are role-specific or domain-specific (browser, AWS ops, etc.). Same key
shape with `optional` in place of `bundled`.
- `skills install` materializes the catalog files into a company-managed skill
directory and records provenance (`catalogId`, `catalogKey`, `packageVersion`,
`originHash`, …) so future updates and audit decisions stay consistent.
- `--as <slug>` overrides the company skill slug. `--force` may replace a
same-key catalog-managed skill but never bypasses hard validation or hard-stop
audit findings.
Examples:
```sh
pnpm paperclipai skills browse --kind bundled --company-id <company-id>
pnpm paperclipai skills search "pull request" --kind bundled
pnpm paperclipai skills inspect github-pr-workflow
pnpm paperclipai skills install github-pr-workflow --company-id <company-id>
pnpm paperclipai skills install paperclipai:optional:browser:agent-browser --company-id <company-id>
```
External GitHub, skills.sh, local-path, and URL sources still go through
`skills import`; catalog commands are for the app-shipped catalog only.
### Company library
```sh
pnpm paperclipai skills list --company-id <company-id>
pnpm paperclipai skills show <skill-id-or-key-or-slug> --company-id <company-id>
pnpm paperclipai skills file <skill-id-or-key-or-slug> [--path SKILL.md] --company-id <company-id>
pnpm paperclipai skills import <source> --company-id <company-id>
pnpm paperclipai skills create --name "Review PRs" [--slug review-prs] [--description "..."] [--body-file SKILL.md] --company-id <company-id>
pnpm paperclipai skills scan-projects [--project-id <id>...] [--workspace-id <id>...] --company-id <company-id>
pnpm paperclipai skills check [skill-id-or-key-or-slug] --company-id <company-id>
pnpm paperclipai skills update <skill-id-or-key-or-slug> [--force] --company-id <company-id>
pnpm paperclipai skills update --all [--force] --company-id <company-id>
pnpm paperclipai skills audit [skill-id-or-key-or-slug] --company-id <company-id>
pnpm paperclipai skills reset <skill-id-or-key-or-slug> [--yes] [--force] --company-id <company-id>
pnpm paperclipai skills remove <skill-id-or-key-or-slug> --yes --company-id <company-id>
```
`skills import <source>` accepts a skills.sh URL, the equivalent
`<owner>/<repo>/<skill>` shorthand, a GitHub URL, a local path, or an
`npx skills add …` command. See `references/company-skills.md` in the agent
skill bundle for the source-type table.
`skills check`, `skills update`, `skills audit`, and `skills reset` are the
maintenance loop for catalog-installed skills:
- `check` reports whether each skill's installed bytes match its pinned origin
(`hasUpdate`, `installedHash`, `originHash`, `updateHoldReason`,
`auditVerdict`).
- `update` installs the pinned update through the existing install-update API.
`--all` checks every company skill and updates only those with
`hasUpdate=true`. `--force` discards local-modification or soft-audit holds;
hard-stop audit findings still block the update.
- `audit` re-scans installed bytes and reports findings without executing
anything.
- `reset` reinstalls a catalog-managed skill from its pinned origin, discarding
local edits. Prompts in a TTY; requires `--yes` for non-interactive use.
### Agent attach
```sh
pnpm paperclipai skills agent list <agent-id-or-shortname> --company-id <company-id>
pnpm paperclipai skills agent sync <agent-id-or-shortname> --skill <skill-id-or-key-or-slug> [--skill <skill-id-or-key-or-slug>...] --company-id <company-id>
pnpm paperclipai skills agent clear <agent-id-or-shortname> --yes --company-id <company-id>
```
`skills agent sync` replaces the agent's non-required desired skill set (it is
not additive) and returns the resulting adapter `AgentSkillSnapshot`.
`skills agent clear` sends an empty desired list. Required Paperclip skills are
still enforced by the server in both cases.
### Notes
- Skill references accept company skill `id`, canonical `key`, or unique
`slug`; catalog references accept catalog `id`, `key`, or unique `slug`.
- `skills file` prints raw file content in human mode so it can be piped.
- `skills create --body-file -` reads the skill markdown body from stdin.
- `skills remove`, `skills reset`, and `skills agent clear` prompt in a TTY and
require `--yes` in non-interactive use.
- `--json` prints the raw API result for each command.
## Secrets Commands
```sh
pnpm paperclipai secrets list --company-id <company-id>
pnpm paperclipai secrets declarations --company-id <company-id> [--include agents,projects] [--kind secret]
pnpm paperclipai secrets create --company-id <company-id> --name anthropic-api-key --value-env ANTHROPIC_API_KEY
pnpm paperclipai secrets link --company-id <company-id> --name prod-stripe-key --provider aws_secrets_manager --external-ref <provider-ref>
pnpm paperclipai secrets doctor --company-id <company-id>
pnpm paperclipai secrets migrate-inline-env --company-id <company-id> [--apply]
```
Secret listing and declarations never print secret values. `create` accepts
`--value-env` so shell history does not capture the value. `link` records
provider-owned references without copying the secret value into Paperclip.
For AWS-backed secrets, `secrets doctor` reports missing non-secret provider
env and the expected AWS SDK runtime credential source; do not store AWS
bootstrap credentials in Paperclip secrets.
Per-company provider vaults (multiple vault instances per provider, default
vault selection, coming-soon GCP/Vault) are configured from the board UI under
`Company Settings → Secrets → Provider vaults` or through
`/api/companies/{companyId}/secret-provider-configs`. There is no CLI surface
for vault management today. See the
[secrets deploy guide](../docs/deploy/secrets.md#provider-vaults) and
[API reference](../docs/api/secrets.md#provider-vaults) for the contract.
## Approval Commands
```sh
@@ -167,7 +322,28 @@ pnpm paperclipai heartbeat run --agent-id <agent-id> [--api-base http://localhos
## Local Storage Defaults
Default local instance root is `~/.paperclip/instances/default`:
Local Paperclip data lives under the selected instance root. `PAPERCLIP_HOME` chooses the home directory and `PAPERCLIP_INSTANCE_ID` chooses the instance.
```text
~/.paperclip/ # PAPERCLIP_HOME
└── instances/
└── default/ # instance root (PAPERCLIP_INSTANCE_ID)
├── config.json # runtime config
├── .env # instance env file
├── db/ # embedded PostgreSQL data
├── data/
│ ├── storage/ # local_disk uploads
│ └── backups/ # automatic DB backups
├── logs/
├── secrets/
│ └── master.key # local_encrypted master key
├── workspaces/ # default agent workspaces
├── projects/ # project execution workspaces
├── companies/ # per-company adapter homes (e.g. codex-home)
└── codex-home/ # per-instance codex home (when not company-scoped)
```
Default paths for the canonical install:
- config: `~/.paperclip/instances/default/config.json`
- embedded db: `~/.paperclip/instances/default/db`
+62 -12
View File
@@ -27,6 +27,18 @@ pnpm db:migrate
When `DATABASE_URL` is unset, this command targets the current embedded PostgreSQL instance for your active Paperclip config/instance.
Issue reference mentions follow the normal migration path: the schema migration creates the tracking table, but it does not backfill historical issue titles, descriptions, comments, or documents automatically.
To backfill existing content manually after migrating, run:
```sh
pnpm issue-references:backfill
# optional: limit to one company
pnpm issue-references:backfill -- --company <company-id>
```
Future issue, comment, and document writes sync references automatically without running the backfill command.
This mode is ideal for local development and one-command installs.
Docker note: the Docker quickstart image also uses embedded PostgreSQL by default. Persist `/paperclip` to keep DB state across container restarts (see `doc/DOCKER.md`).
@@ -47,11 +59,11 @@ cp .env.example .env
# DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip
```
Run migrations (once the migration generation issue is fixed) or use `drizzle-kit push`:
Run migrations:
```sh
DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip \
npx drizzle-kit push
pnpm db:migrate
```
Start the server:
@@ -88,27 +100,27 @@ postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:
### Configure
Set `DATABASE_URL` in your `.env`:
For the application runtime, use a direct PostgreSQL connection unless the database client has explicit prepared-statement configuration for your pooling mode:
```sh
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
```
If you later run the app with a pooled runtime URL, set `DATABASE_MIGRATION_URL` to the direct connection URL. Paperclip uses it for startup schema checks/migrations and plugin namespace migrations, while the app continues to use `DATABASE_URL` for runtime queries:
```sh
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:6543/postgres
DATABASE_MIGRATION_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
```
If using connection pooling (port 6543), the `postgres` client must disable prepared statements. Update `packages/db/src/client.ts`:
```ts
export function createDb(url: string) {
const sql = postgres(url, { prepare: false });
return drizzlePg(sql, { schema });
}
```
If your hosted database requires transaction-pooling-only connections, use a direct or session-pooled connection for Paperclip until runtime pooling support is documented in this guide. Do not edit database client source files as part of deployment setup.
### Push the schema
```sh
# Use the direct connection (port 5432) for schema changes
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@...5432/postgres \
npx drizzle-kit push
pnpm db:migrate
```
### Free tier limits
@@ -131,18 +143,51 @@ The database mode is controlled by `DATABASE_URL`:
Your Drizzle schema (`packages/db/src/schema/`) stays the same regardless of mode.
## Resource membership tables
Paperclip stores current-user sidebar membership state in:
- `project_memberships`
- `agent_memberships`
These rows are company-scoped and user-scoped. A missing row means the user is joined, so existing users keep seeing projects and agents in the sidebar until they explicitly leave them. Rows only control sidebar visibility; they do not affect project/agent detail access, all-pages, selectors, assignment flows, or existing company permissions.
Both tables use a unique key on `(company_id, user_id, resource_id)` and keep `state` as `joined` or `left`. Join/leave mutations are idempotent board-user `/me` operations and write activity entries when the effective state changes.
## Plugin database namespaces
The plugin runtime tracks plugin-owned database namespaces and migrations in `plugin_database_namespaces` and `plugin_migrations`. Hosted deployments that separate runtime and migration connections should set `DATABASE_MIGRATION_URL`; plugin namespace migration work uses the migration connection when present.
## Backups
Paperclip supports automatic and manual logical database backups. These dumps include
non-system database schemas such as `public`, the Drizzle migration journal, and
plugin-owned database schemas. See `doc/DEVELOPING.md` for the current
`paperclipai db:backup` / `pnpm db:backup` commands and backup retention
configuration.
Database backups do not include non-database instance files such as local-disk
uploads, workspace files, or the local encrypted secrets master key. Back those paths
up separately when you need full instance disaster recovery.
## Secret storage
Paperclip stores secret metadata and versions in:
- `company_secrets`
- `company_secret_versions`
- `company_secret_bindings`
- `secret_access_events`
Secret-aware env bindings are supported by agents, projects, and routines. Routine env lives in `routines.env`, is captured in `routine_revisions.snapshot`, and routine dispatches store `routine_runs.routine_revision_id` so runtime secret resolution uses the env snapshot that existed when the run was created. Routine secret refs bind with `target_type = 'routine'`, `target_id = routines.id`, and `config_path` values under `env.*`.
For local/default installs, the active provider is `local_encrypted`:
- Secret material is encrypted at rest with a local master key.
- Default key file: `~/.paperclip/instances/default/secrets/master.key` (auto-created if missing).
- CLI config location: `~/.paperclip/instances/default/config.json` under `secrets.localEncrypted.keyFilePath`.
- Backup/restore requires both the database metadata and the local master key file; either artifact alone is insufficient.
- The server best-effort enforces `0600` key file permissions and provider health reports permission warnings.
Optional overrides:
@@ -164,5 +209,10 @@ pnpm paperclipai configure --section secrets
Inline secret migration command:
```sh
pnpm paperclipai secrets migrate-inline-env --company-id <company-id> --apply
# direct database maintenance fallback
pnpm secrets:migrate-inline-env --apply
```
Hosted AWS provider notes live in [SECRETS-AWS-PROVIDER.md](./SECRETS-AWS-PROVIDER.md).
+67 -11
View File
@@ -17,6 +17,11 @@ Paperclip supports two runtime modes:
This keeps one authenticated auth stack while still separating low-friction private-network defaults from internet-facing hardening requirements.
Paperclip now treats **bind** as a separate concern from auth:
- auth model: `local_trusted` vs `authenticated`, plus `private/public`
- reachability model: `server.bind = loopback | lan | tailnet | custom`
## 2. Canonical Model
| Runtime Mode | Exposure | Human auth | Primary use |
@@ -25,6 +30,15 @@ This keeps one authenticated auth stack while still separating low-friction priv
| `authenticated` | `private` | Login required | Private-network access (for example Tailscale/VPN/LAN) |
| `authenticated` | `public` | Login required | Internet-facing/cloud deployment |
## Reachability Model
| Bind | Meaning | Typical use |
|---|---|---|
| `loopback` | Listen on localhost only | default local usage, reverse-proxy deployments |
| `lan` | Listen on all interfaces (`0.0.0.0`) | LAN/VPN/private-network access |
| `tailnet` | Listen on a detected Tailscale IP | Tailscale-only access |
| `custom` | Listen on an explicit host/IP | advanced interface-specific setups |
## 3. Security Policy
## `local_trusted`
@@ -38,12 +52,14 @@ This keeps one authenticated auth stack while still separating low-friction priv
- login required
- low-friction URL handling (`auto` base URL mode)
- private-host trust policy required
- bind can be `loopback`, `lan`, `tailnet`, or `custom`
## `authenticated + public`
- login required
- explicit public URL required
- stricter deployment checks and failures in doctor
- recommended bind is `loopback` behind a reverse proxy; direct `lan/custom` is advanced
## 4. Onboarding UX Contract
@@ -55,14 +71,22 @@ pnpm paperclipai onboard
Server prompt behavior:
1. ask mode, default `local_trusted`
2. option copy:
- `local_trusted`: "Easiest for local setup (no login, localhost-only)"
- `authenticated`: "Login required; use for private network or public hosting"
3. if `authenticated`, ask exposure:
- `private`: "Private network access (for example Tailscale), lower setup friction"
- `public`: "Internet-facing deployment, stricter security requirements"
4. ask explicit public URL only for `authenticated + public`
1. quickstart `--yes` defaults to `server.bind=loopback` and therefore `local_trusted/private`
2. advanced server setup asks reachability first:
- `Trusted local``bind=loopback`, `local_trusted/private`
- `Private network``bind=lan`, `authenticated/private`
- `Tailnet``bind=tailnet`, `authenticated/private`
- `Custom` → manual mode/exposure/host entry
3. raw host entry is only required for the `Custom` path
4. explicit public URL is only required for `authenticated + public`
Examples:
```sh
pnpm paperclipai onboard --yes
pnpm paperclipai onboard --yes --bind lan
pnpm paperclipai run --bind tailnet
```
`configure --section server` follows the same interactive behavior.
@@ -101,20 +125,52 @@ When running `authenticated` mode, if the only instance admin is `local-board`,
This prevents lockout when a user migrates from long-running local trusted usage to authenticated mode.
## 8. Current Code Reality (As Of 2026-02-23)
## 8. First Admin Setup For Fresh Authenticated Installs
Fresh authenticated installs start in `bootstrap_pending` until the first
`instance_admin` exists.
For `authenticated/private`, Paperclip supports a browser-first setup path:
1. open the Paperclip URL from the private network or appliance UI
2. sign in or create a Paperclip account
3. choose `Claim this instance` on the setup screen
That browser claim promotes the signed-in session user to the first instance
admin and then falls through to normal onboarding. The endpoint is available
only to real browser session actors in `authenticated/private`; unauthenticated
requests, agent keys, board API keys, and local implicit board actors are
rejected.
The CLI fallback remains supported in all authenticated setup states:
```sh
pnpm paperclipai auth bootstrap-ceo
```
That command prints a one-time first-admin invite URL. Browser claim and
bootstrap invite acceptance share the same first-admin transaction, so whichever
path wins first makes later attempts return a conflict.
For `authenticated/public`, browser first-admin claim is intentionally disabled.
Public deployments must use the high-entropy bootstrap invite path unless a
future public-hosted setup design explicitly changes this policy.
## 9. Current Code Reality (As Of 2026-02-23)
- runtime values are `local_trusted | authenticated`
- `authenticated` uses Better Auth sessions and bootstrap invite flow
- `local_trusted` ensures a real local Board user principal in `authUsers` with `instance_user_roles` admin access
- company creation ensures creator membership in `company_memberships` so user assignment/access flows remain consistent
## 9. Naming and Compatibility Policy
## 10. Naming and Compatibility Policy
- canonical naming is `local_trusted` and `authenticated` with `private/public` exposure
- no long-term compatibility alias layer for discarded naming variants
## 10. Relationship to Other Docs
## 11. Relationship to Other Docs
- implementation plan: `doc/plans/deployment-auth-mode-consolidation.md`
- V1 contract: `doc/SPEC-implementation.md`
- operator workflows: `doc/DEVELOPING.md` and `doc/CLI.md`
- invite/join state map: `doc/spec/invite-flow.md`
+197 -10
View File
@@ -43,6 +43,19 @@ This starts:
`pnpm dev` and `pnpm dev:once` are now idempotent for the current repo and instance: if the matching Paperclip dev runner is already alive, Paperclip reports the existing process instead of starting a duplicate.
Issue execution may also use project execution workspace policies and workspace runtime services for per-project worktrees, preview servers, and managed dev commands. Configure those through the project workspace/runtime surfaces rather than starting long-running unmanaged processes when a task needs a reusable service.
## Storybook
The board UI Storybook keeps stories and Storybook config under `ui/storybook/` so component review files stay out of the app source routes.
```sh
pnpm storybook
pnpm build-storybook
```
These run the `@paperclipai/ui` Storybook on port `6006` and build the static output to `ui/storybook-static/`.
Inspect or stop the current repo's managed dev runner:
```sh
@@ -55,10 +68,30 @@ pnpm dev:stop
Tailscale/private-auth dev mode:
```sh
pnpm dev --tailscale-auth
pnpm dev --bind lan
```
This runs dev as `authenticated/private` and binds the server to `0.0.0.0` for private-network access.
This runs dev as `authenticated/private` with a private-network bind preset.
On a fresh authenticated/private instance, open the app, sign in or create an
account, and use the setup screen to claim the first instance admin from the
browser. The CLI fallback remains:
```sh
pnpm paperclipai auth bootstrap-ceo
```
For Tailscale-only reachability on a detected tailnet address:
```sh
pnpm dev --bind tailnet
```
Legacy aliases still map to the old broad private-network behavior:
```sh
pnpm dev --tailscale-auth
pnpm dev --authenticated-private
```
Allow additional private hostnames (for example custom Tailscale hostnames):
@@ -66,6 +99,31 @@ Allow additional private hostnames (for example custom Tailscale hostnames):
pnpm paperclipai allowed-hostname dotta-macbook-pro
```
## Test Commands
Use the cheap local default unless you are specifically working on browser flows:
```sh
pnpm test
```
`pnpm test` runs the Vitest suite only. For interactive Vitest watch mode use:
```sh
pnpm test:watch
```
Browser suites stay separate:
```sh
pnpm test:e2e
pnpm test:release-smoke
```
These browser suites are intended for targeted local verification and CI, not the default agent/human test command.
For normal issue work, start with the smallest targeted check that proves the change. Reserve repo-wide typecheck/build/test runs for PR-ready handoff or changes broad enough that narrow checks do not cover the risk.
## One-Command Local Run
For a first-time local install, you can bootstrap and run in one command:
@@ -106,6 +164,27 @@ See `doc/DOCKER.md` for API key wiring (`OPENAI_API_KEY` / `ANTHROPIC_API_KEY`)
For a separate review-oriented container that keeps `codex`/`claude` login state in Docker volumes and checks out PRs into an isolated scratch workspace, see `doc/UNTRUSTED-PR-REVIEW.md`.
## Local Instance Layout
Every local install keeps runtime state directly under the selected instance root:
```text
~/.paperclip/instances/default/ # instance root
config.json # runtime config
.env # instance env file
db/ # embedded PostgreSQL data
data/
storage/ # local_disk uploads
backups/ # automatic DB backups
logs/
secrets/master.key # local_encrypted master key
workspaces/<agent-id>/ # default agent workspaces
projects/ # project execution workspaces
companies/<company-id>/codex-home/ # per-company codex_local home
```
`PAPERCLIP_HOME` and `PAPERCLIP_INSTANCE_ID` override the home root and instance id respectively. `paperclipai onboard` echoes the resolved values in its banner (`Local home: <home> | instance: <id> | config: <path>`) so you can confirm where state will land before continuing.
## Database in Dev (Auto-Handled)
For local development, leave `DATABASE_URL` unset.
@@ -113,7 +192,7 @@ The server will automatically use embedded PostgreSQL and persist data at:
- `~/.paperclip/instances/default/db`
Override home and instance:
Override home or instance:
```sh
PAPERCLIP_HOME=/custom/path PAPERCLIP_INSTANCE_ID=dev pnpm paperclipai run
@@ -147,6 +226,8 @@ For `codex_local`, Paperclip also manages a per-company Codex home under the ins
If the `codex` CLI is not installed or not on `PATH`, `codex_local` agent runs fail at execution time with a clear adapter error. Quota polling uses a short-lived `codex app-server` subprocess: when `codex` cannot be spawned, that provider reports `ok: false` in aggregated quota results and the API server keeps running (it must not exit on a missing binary).
Local adapters require their corresponding CLI/session setup on the machine running Paperclip. External adapters are installed through the adapter/plugin flow and should not require hardcoded imports in `server/` or `ui/`.
## Worktree-local Instances
When developing from multiple git worktrees, do not point two Paperclip servers at the same embedded PostgreSQL data directory.
@@ -173,9 +254,13 @@ Seed modes:
- `full` makes a full logical clone of the source instance
- `--no-seed` creates an empty isolated instance
Seeded worktree instances quarantine copied live execution by default for both `minimal` and `full` seeds. During restore, Paperclip disables copied agent timer heartbeats, resets copied `running` agents to `idle`, blocks and unassigns copied agent-owned `in_progress` issues, and unassigns copied agent-owned `todo`/`in_review` issues. This keeps a freshly booted worktree from starting agents for work already owned by the source instance. Pass `--preserve-live-work` only when you intentionally want the isolated worktree to resume copied assignments.
After `worktree init`, both the server and the CLI auto-load the repo-local `.paperclip/.env` when run inside that worktree, so normal commands like `pnpm dev`, `paperclipai doctor`, and `paperclipai db:backup` stay scoped to the worktree instance.
Provisioned git worktrees also pause all seeded routines in the isolated worktree database by default. This prevents copied daily/cron routines from firing unexpectedly inside the new workspace instance during development.
`pnpm dev` now fails fast in a linked git worktree when `.paperclip/.env` is missing, instead of silently booting against the default instance/port. If that happens, run `paperclipai worktree init` in the worktree first.
Provisioned git worktrees also pause seeded routines that still have enabled schedule triggers in the isolated worktree database by default. This prevents copied daily/cron routines from firing unexpectedly inside the new workspace instance during development without disabling webhook/API-only routines.
That repo-local env also sets:
@@ -184,6 +269,8 @@ That repo-local env also sets:
- `PAPERCLIP_WORKTREE_COLOR=<hex-color>`
The server/UI use those values for worktree-specific branding such as the top banner and dynamically colored favicon.
Authenticated worktree servers also use the `PAPERCLIP_INSTANCE_ID` value to scope Better Auth cookie names.
Browser cookies are shared by host rather than port, so this prevents logging into one `127.0.0.1:<port>` worktree from replacing another worktree server's session cookie.
Print shell exports explicitly when needed:
@@ -221,10 +308,10 @@ paperclipai worktree init --from-data-dir ~/.paperclip
paperclipai worktree init --force
```
Repair an already-created repo-managed worktree and reseed its isolated instance from the main default install:
Repair an already-created repo-managed worktree and reseed its isolated instance from the main default install. Point `--from-config` at the instance config:
```sh
cd ~/.paperclip/worktrees/PAP-884-ai-commits-component
cd /path/to/paperclip/.paperclip/worktrees/PAP-884-ai-commits-component
pnpm paperclipai worktree init --force --seed-mode minimal \
--name PAP-884-ai-commits-component \
--from-config ~/.paperclip/instances/default/config.json
@@ -232,6 +319,33 @@ pnpm paperclipai worktree init --force --seed-mode minimal \
That rewrites the worktree-local `.paperclip/config.json` + `.paperclip/.env`, recreates the isolated instance under `~/.paperclip-worktrees/instances/<worktree-id>/`, and preserves the git worktree contents themselves.
For an already-created worktree where you want the CLI to decide whether to rebuild missing worktree metadata or just reseed the isolated DB, use `worktree repair`.
**`pnpm paperclipai worktree repair [options]`** — Repair the current linked worktree by default, or create/repair a named linked worktree under `.paperclip/worktrees/` when `--branch` is provided. The command never targets the primary checkout unless you explicitly pass `--branch`.
| Option | Description |
|---|---|
| `--branch <name>` | Existing branch/worktree selector to repair, or a branch name to create under `.paperclip/worktrees` |
| `--home <path>` | Home root for worktree instances (default: `~/.paperclip-worktrees`) |
| `--from-config <path>` | Source config.json to seed from |
| `--from-data-dir <path>` | Source `PAPERCLIP_HOME` used when deriving the source config |
| `--from-instance <id>` | Source instance id when deriving the source config (default: `default`) |
| `--seed-mode <mode>` | Seed profile: `minimal` or `full` (default: `minimal`) |
| `--no-seed` | Repair metadata only when bootstrapping a missing worktree config |
| `--allow-live-target` | Override the guard that requires the target worktree DB to be stopped first |
Examples:
```sh
# From inside a linked worktree, rebuild missing .paperclip metadata and reseed it from the default instance.
cd /path/to/paperclip/.paperclip/worktrees/PAP-1132-assistant-ui-pap-1131-make-issues-comments-be-like-a-chat
pnpm paperclipai worktree repair
# From the primary checkout, create or repair a linked worktree for a branch under .paperclip/worktrees/.
cd /path/to/paperclip
pnpm paperclipai worktree repair --branch PAP-1132-assistant-ui-pap-1131-make-issues-comments-be-like-a-chat
```
For an already-created worktree where you want to keep the existing repo-local config/env and only overwrite the isolated database, use `worktree reseed` instead. Stop the target worktree's Paperclip server first so the command can replace the DB safely.
**`pnpm paperclipai worktree reseed [options]`** — Re-seed an existing worktree-local instance from another Paperclip instance or worktree while preserving the target worktree's current config, ports, and instance identity.
@@ -306,6 +420,62 @@ eval "$(pnpm paperclipai worktree env)"
For project execution worktrees, Paperclip can also run a project-defined provision command after it creates or reuses an isolated git worktree. Configure this on the project's execution workspace policy (`workspaceStrategy.provisionCommand`). The command runs inside the derived worktree and receives `PAPERCLIP_WORKSPACE_*`, `PAPERCLIP_PROJECT_ID`, `PAPERCLIP_AGENT_ID`, and `PAPERCLIP_ISSUE_*` environment variables so each repo can bootstrap itself however it wants.
## App-Shipped Skills Catalog
The Paperclip app ships a curated catalog of company skills out of the box. The
catalog is a workspace package at `packages/skills-catalog`:
```text
packages/skills-catalog/
catalog/
bundled/<category>/<slug>/SKILL.md # recommended defaults
optional/<category>/<slug>/SKILL.md # role/domain-specific
generated/catalog.json # checked-in manifest
scripts/
build-catalog-manifest.ts # regenerate generated/catalog.json
validate-catalog.ts # validation only
src/ # builder + types consumed by server/CLI
```
Server and CLI import the generated manifest; they do not crawl repository
paths at request time. Root `skills/` remains reserved for Paperclip runtime
skills and is not part of the catalog.
Validate the catalog without writing the manifest:
```sh
pnpm --filter @paperclipai/skills-catalog validate
```
Regenerate `generated/catalog.json` after editing any catalog `SKILL.md`,
frontmatter, file inventory, category, or slug:
```sh
pnpm --filter @paperclipai/skills-catalog build:manifest
```
The package's `build` script runs `build:manifest` and then `tsc`; tests live
under `pnpm --filter @paperclipai/skills-catalog test`. Validation fails when:
- a catalog entry is not under `catalog/bundled/<category>/<slug>` or
`catalog/optional/<category>/<slug>`
- `SKILL.md` is missing or the frontmatter `name`/`description` is empty
- the frontmatter `key` disagrees with the generated canonical key
- two catalog entries share an `id`, `key`, or `slug`
- file inventory contains absolute paths, `..`, broken symlinks, or files
outside the skill directory
- the regenerated manifest differs from the checked-in
`generated/catalog.json`
Trust level is derived from inventory: `markdown_only` (markdown + references
only), `assets` (other non-script files), or `scripts_executables` (any
executable script). The build contract is documented in
`doc/plans/2026-05-26-skills-cli-catalog-contract.md`.
CI runs `pnpm --filter @paperclipai/skills-catalog validate` and the package's
vitest suite, so always regenerate the manifest in the same commit as the
catalog change.
## Quick Health Checks
In another terminal:
@@ -335,7 +505,9 @@ If you set `DATABASE_URL`, the server will use that instead of embedded PostgreS
## Automatic DB Backups
Paperclip can run automatic DB backups on a timer. Defaults:
Paperclip can run automatic logical database backups on a timer. These backups cover
non-system database schemas, including migration history and plugin-owned database
schemas. Defaults:
- enabled
- every 60 minutes
@@ -363,6 +535,10 @@ Environment overrides:
- `PAPERCLIP_DB_BACKUP_RETENTION_DAYS=<days>`
- `PAPERCLIP_DB_BACKUP_DIR=/absolute/or/~/path`
DB backups are not full instance filesystem backups. For full local disaster
recovery, also back up local storage files and the local encrypted secrets key if
those providers are enabled.
## Secrets in Dev
Agent env vars now support secret references. By default, secret values are stored with local encryption and only secret refs are persisted in agent config.
@@ -370,6 +546,7 @@ Agent env vars now support secret references. By default, secret values are stor
- Default local key path: `~/.paperclip/instances/default/secrets/master.key`
- Override key material directly: `PAPERCLIP_SECRETS_MASTER_KEY`
- Override key file path: `PAPERCLIP_SECRETS_MASTER_KEY_FILE`
- Back up the key file and database together; either one alone is not enough to restore local encrypted secrets.
Strict mode (recommended outside local trusted machines):
@@ -378,12 +555,20 @@ PAPERCLIP_SECRETS_STRICT_MODE=true
```
When strict mode is enabled, sensitive env keys (for example `*_API_KEY`, `*_TOKEN`, `*_SECRET`) must use secret references instead of inline plain values.
Authenticated deployments default strict mode on unless explicitly overridden.
CLI configuration support:
- `pnpm paperclipai onboard` writes a default `secrets` config section (`local_encrypted`, strict mode off, key file path set) and creates a local key file when needed.
- `pnpm paperclipai configure --section secrets` lets you update provider/strict mode/key path and creates the local key file when needed.
- `pnpm paperclipai doctor` validates secrets adapter configuration and can create a missing local key file with `--repair`.
- `pnpm paperclipai doctor` validates secrets adapter configuration, can create a missing local key file with `--repair`, and reports missing AWS Secrets Manager bootstrap env when that provider is selected.
- Provider health is available at `GET /api/companies/:companyId/secret-providers/health` and reports local key permission warnings plus backup guidance.
Per-company provider vaults are configured in the board UI under
`Company Settings → Secrets → Provider vaults`, backed by
`/api/companies/{companyId}/secret-provider-configs`. The CLI does not own
vault lifecycle today. See `docs/deploy/secrets.md` (`Provider Vaults` section)
for the operator model.
Migration helper for existing inline env secrets:
@@ -432,10 +617,12 @@ pnpm paperclipai dashboard get
See full command reference in `doc/CLI.md`.
## OpenClaw Invite Onboarding Endpoints
## Agent Invite Onboarding Endpoints
Agent-oriented invite onboarding now exposes machine-readable API docs:
The board UI generates agent onboarding prompts from the add-agent modal (`+` in the agent sidebar), so agent onboarding sits with the rest of agent creation rather than company member invite settings.
- `GET /api/invites/:token` returns invite summary plus onboarding and skills index links.
- `GET /api/invites/:token/onboarding` returns onboarding manifest details (registration endpoint, claim endpoint template, skill install hints).
- `GET /api/invites/:token/onboarding.txt` returns a plain-text onboarding doc intended for both human operators and agents (llm.txt-style handoff), including optional inviter message and suggested network host candidates.
@@ -453,7 +640,7 @@ pnpm smoke:openclaw-join
What it validates:
- invite creation for agent-only join
- agent join request using `adapterType=openclaw`
- agent join request using `adapterType=openclaw_gateway`
- board approval + one-time API key claim semantics
- callback delivery on wakeup to a dockerized OpenClaw-style webhook receiver
+10
View File
@@ -117,6 +117,16 @@ services:
- bootstrap invite URL defaults
- hostname allowlist defaults (hostname extracted from URL)
For fresh `authenticated/private` Docker or appliance-style installs, the first
admin can now be claimed entirely from the browser after sign-in. Open the
Paperclip URL, sign in or create an account, then choose `Claim this instance`
on the setup screen. This browser claim is disabled for `authenticated/public`;
public deployments should run the high-entropy CLI invite fallback instead:
```sh
pnpm paperclipai auth bootstrap-ceo
```
Granular overrides remain available if needed (`PAPERCLIP_AUTH_PUBLIC_BASE_URL`, `BETTER_AUTH_URL`, `BETTER_AUTH_TRUSTED_ORIGINS`, `PAPERCLIP_ALLOWED_HOSTNAMES`).
Set `PAPERCLIP_ALLOWED_HOSTNAMES` explicitly only when you need additional hostnames beyond the public URL host (for example Tailscale/LAN aliases or multiple private hostnames).
+9 -6
View File
@@ -23,7 +23,7 @@ Paperclip is the command, communication, and control plane for a company of AI a
- **Track work in real time** — see at any moment what every agent is working on
- **Control costs** — token salary budgets per agent, spend tracking, burn rate
- **Align to goals** — agents see how their work serves the bigger mission
- **Store company knowledge** — a shared brain for the organization
- **Preserve work context** — comments, documents, work products, attachments, and company state stay attached to the work
## Architecture
@@ -36,17 +36,20 @@ The central nervous system. Manages:
- Agent registry and org chart
- Task assignment and status
- Budget and token spend tracking
- Company knowledge base
- Issue comments, documents, work products, attachments, and company state
- Goal hierarchy (company → team → agent → task)
- Heartbeat monitoring — know when agents are alive, idle, or stuck
It also enforces execution-control semantics such as single-assignee issues, atomic checkout and execution locks, blockers, recovery issues, and workspace/runtime controls.
### 2. Execution Services (adapters)
Agents run externally and report into the control plane. An agent is just Python code that gets kicked off and does work. Adapters connect different execution environments:
Agents run externally and report into the control plane. Adapters connect different execution environments and define how a heartbeat is invoked, observed, and cancelled:
- **OpenClaw** — initial adapter target
- **Heartbeat loop** — simple custom Python that loops, checks in, does work
- **Others** — any runtime that can call an API
- **Local CLI/session adapters** — built-in adapters for tools such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor
- **HTTP/process-style adapters** — command or webhook/API integrations for custom runtimes
- **OpenClaw gateway** — integration for OpenClaw-style remote agents
- **External adapter plugins** — dynamically loaded adapters installed outside the core app
The control plane doesn't run agents. It orchestrates them. Agents run wherever they run and phone home.
+1 -1
View File
@@ -3,7 +3,7 @@ Use this exact checklist.
1. Start Paperclip in auth mode.
```bash
cd <paperclip-repo-root>
pnpm dev --tailscale-auth
pnpm dev --bind lan
```
Then verify:
```bash
+15 -9
View File
@@ -32,12 +32,14 @@ Then you define who reports to the CEO: a CTO managing programmers, a CMO managi
### Agent Execution
There are two fundamental modes for running an agent's heartbeat:
Paperclip supports several ways to run an agent's heartbeat:
1. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
2. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." (OpenClaw hooks work this way.)
1. **Local CLI/session adapters** — Paperclip starts or resumes local coding-tool sessions such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor, then tracks the run.
2. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
3. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." OpenClaw-style hooks work this way.
4. **External adapter plugins** — Paperclip loads adapter packages through the plugin/adapter flow so self-hosted installs can add runtimes without hardcoding them in core.
We provide sensible defaults — a default agent that shells out to Claude Code or Codex with your configuration, remembers session IDs, runs basic scripts. But you can plug in anything.
Agent runs can use project and execution workspaces, managed runtime services such as preview/dev servers, adapter-specific session state, and HTTP/webhook-style execution. We provide sensible defaults, but the adapter is still the boundary: if a runtime can be invoked, observed, and authorized, Paperclip can coordinate it.
### Task Management
@@ -54,7 +56,7 @@ I am researching the Facebook ads Granola uses (current task)
Tasks have parentage. Every task exists in service of a parent task, all the way up to the company goal. This is what keeps autonomous agents aligned — they can always answer "why am I doing this?"
More detailed task structure TBD.
The current issue model includes stable issue identifiers, parent/sub-issues, blockers, a single assignee, comments, issue documents, attachments and work products, and review/approval handoffs. That structure keeps work inspectable by both the board and agents while still allowing agents to decompose work into smaller tasks.
## Principles
@@ -115,7 +117,8 @@ Paperclips core identity is a **control plane for autonomous AI companies**,
- Do not make the core product a general chat app. The current product definition is explicitly task/comment-centric and “not a chatbot,” and that boundary is valuable.
- Do not build a complete Jira/GitHub replacement. The repo/docs already position Paperclip as organization orchestration, not focused on pull-request review.
- Do not build enterprise-grade RBAC first. The current V1 spec still treats multi-board governance and fine-grained human permissions as out of scope, so the first multi-user version should be coarse and company-scoped.
- Do not build enterprise-grade RBAC first. Paperclip now has authenticated mode, company memberships, instance roles, and permission grants, but fine-grained enterprise governance should remain secondary to the core company control plane.
- Do not interpret agent-level privacy flags as a project/issue privacy feature in V1; work visibility stays company-scoped.
- Do not lead with raw bash logs and transcripts. Default view should be human-readable intent/progress, with raw detail beneath.
- Do not force users to understand provider/API-key plumbing unless absolutely necessary. There are active onboarding/auth issues already; friction here is clearly real.
@@ -136,11 +139,14 @@ Paperclips core identity is a **control plane for autonomous AI companies**,
5. **Output-first**
Work is not done until the user can see the result: file, document, preview link, screenshot, plan, or PR.
6. **Local-first, cloud-ready**
6. **Execution visibility without log worship**
Active runs, recovery issues, productivity review states, blockers, and work products should be first-class surfaces. Raw transcripts are available when needed, but they are not the primary product surface.
7. **Local-first, cloud-ready**
The mental model should not change between local solo use and shared/private or public/cloud deployment.
7. **Safe autonomy**
8. **Safe autonomy**
Auto mode is allowed; hidden token burn is not.
8. **Thin core, rich edges**
9. **Thin core, rich edges**
Put optional chat, knowledge, and special surfaces into plugins/extensions rather than bloating the control plane.
+59 -32
View File
@@ -115,38 +115,6 @@ If the first real publish returns npm `E404`, check npm-side prerequisites befor
- The initial publish must include `--access public` for a public scoped package.
- npm also requires either account 2FA for publishing or a granular token that is allowed to bypass 2FA.
### Manual first publish for `@paperclipai/mcp-server`
If you need to publish only the MCP server package once by hand, use:
- `@paperclipai/mcp-server`
Recommended flow from the repo root:
```bash
# optional sanity check: this 404s until the first publish exists
npm view @paperclipai/mcp-server version
# make sure the build output is fresh
pnpm --filter @paperclipai/mcp-server build
# confirm your local npm auth before the real publish
npm whoami
# safe preview of the exact publish payload
cd packages/mcp-server
pnpm publish --dry-run --no-git-checks --access public
# real publish
pnpm publish --no-git-checks --access public
```
Notes:
- Publish from `packages/mcp-server/`, not the repo root.
- If `npm view @paperclipai/mcp-server version` already returns the same version that is in [`packages/mcp-server/package.json`](../packages/mcp-server/package.json), do not republish. Bump the version or use the normal repo-wide release flow in [`scripts/release.sh`](../scripts/release.sh).
- The same npm-side prerequisites apply as above: valid npm auth, permission to publish to the `@paperclipai` scope, `--access public`, and the required publish auth/2FA policy.
## Version formats
Paperclip uses calendar versions:
@@ -175,6 +143,13 @@ This keeps the default install path unchanged while allowing explicit installs w
npx paperclipai@canary onboard
```
The release script now verifies two things after a canary publish:
- the `canary` dist-tag resolves to the version that was just published
- every published internal `@paperclipai/*` dependency referenced by that manifest exists on npm
It also treats `latest -> canary` as a failure by default, because npm metadata can otherwise leave the default install path pointing at an unreleased canary dependency graph. Only pass `./scripts/release.sh canary --allow-canary-latest` when that `latest` behavior is explicitly intended.
### Stable
Stable publishes use the npm dist-tag `latest`.
@@ -201,6 +176,58 @@ That means:
See [doc/RELEASE-AUTOMATION-SETUP.md](RELEASE-AUTOMATION-SETUP.md) for the GitHub/npm setup steps.
## Release enrollment for new public packages
Paperclip does not auto-publish every non-private workspace package anymore.
CI publishing is controlled by [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json).
When you add a new public package:
1. add it to the manifest and decide whether CI should publish it immediately
2. if CI should publish it, bootstrap the package on npm before merge
3. if CI should not publish it yet, keep `"publishFromCi": false`
4. only enable `"publishFromCi": true` after npm trusted publishing is configured for that package
PR CI now checks changed release-enabled package manifests against npm. That catches a missing first-publish bootstrap before the change reaches `master`.
### One-time bootstrap sequence for a new package
The first publish of a brand-new package still needs one human maintainer with npm write access.
After that, trusted publishing can take over.
Example for `@paperclipai/adapter-acpx-local` from the repo root:
```bash
# safe preview
pnpm run release:bootstrap-package -- @paperclipai/adapter-acpx-local
# one-time first publish from an authenticated maintainer machine
pnpm run release:bootstrap-package -- @paperclipai/adapter-acpx-local --publish --otp 123456
```
The helper script:
- checks that the package does not already exist on npm
- builds the target package unless `--skip-build` is passed
- runs `npm pack --dry-run` in the package directory
- only runs the real `npm publish --access public` when `--publish --otp <code>` is provided
For the real `--publish` step, the maintainer machine must already be authenticated to npm.
If `npm whoami` returns `401`, first run `npm logout --registry=https://registry.npmjs.org/` to clear any stale local auth, then run `npm login` or `npm adduser` locally as an npm org member, and finally rerun the helper.
That local human auth is fine for the one-time bootstrap publish; we just do not want the same auth model inside CI.
The helper now requires `--otp <code>` up front for `--publish`, so it fails before the real publish attempt if the one-time password is missing.
After that first publish succeeds:
1. open `https://www.npmjs.com/package/@paperclipai/adapter-acpx-local`
2. go to `Settings``Trusted publishing`
3. add repository `paperclipai/paperclip`
4. set workflow filename to `release.yml`
5. optionally go to `Settings``Publishing access` and enable `Require two-factor authentication and disallow tokens`
6. keep `publishFromCi: true` in [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json)
Once those steps are done, future canary and stable publishes for that package are automated through GitHub OIDC. The manual step is only the first package creation on npm.
## Rollback model
Rollback does not unpublish anything.
+21
View File
@@ -67,6 +67,27 @@ Why:
- the single `release.yml` workflow handles both canary and stable publishing
- GitHub environments `npm-canary` and `npm-stable` still enforce different approval rules on the GitHub side
### 2.2.1. Newly added public packages need a bootstrap phase
Trusted publishing is configured on the npm package itself, not at the repo scope.
That means a brand-new public package must not be auto-enrolled into CI publishing until its npm package exists and its trusted publisher has been configured.
Repo policy:
1. add every non-private package to [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json)
2. set `"publishFromCi": true` only when CI is expected to publish that package
3. if the package is not ready for CI publishing yet, keep `"publishFromCi": false`
4. complete the package bootstrap before merging any PR that changes a release-enabled new package
Bootstrap sequence for a new package:
1. publish the package once from a trusted maintainer machine using normal npm auth
2. open that package on npm and add the `paperclipai/paperclip` trusted publisher for `.github/workflows/release.yml`
3. rerun or dry-run the release flow as needed to confirm CI publishing now works
4. only then enable `"publishFromCi": true`
PR CI enforces this by checking changed release-enabled package manifests against npm. That keeps `master` canary publishing healthy while preserving the no-long-lived-token model for normal CI releases.
### 2.3. Verify trusted publishing before removing old auth
After the workflows are live:
+2
View File
@@ -63,6 +63,8 @@ It:
- verifies the pushed commit
- computes the canary version for the current UTC date
- publishes under npm dist-tag `canary`
- verifies that `canary` resolves to the just-published version and that published internal dependencies exist on npm
- fails by default if npm leaves `latest` pointing at a canary; use `--allow-canary-latest` only when that state is intentional
- creates a git tag `canary/vYYYY.MDD.P-canary.N`
Users install canaries with:
+368
View File
@@ -0,0 +1,368 @@
# AWS Secrets Manager Provider
Operational contract for the hosted `aws_secrets_manager` secret provider used by Paperclip Cloud.
## Scope
- Hosted provider for Paperclip-managed secrets when Paperclip Cloud runs on AWS.
- Source of truth for secret values is AWS Secrets Manager, not Postgres.
- Paperclip stores only metadata needed for ownership, bindings, version selection, audit, and runtime resolution.
- AWS provider bootstrap credentials are deployment/runtime credentials, not Paperclip-managed company secrets.
- Remote import for existing AWS secrets is metadata-only. Preview/import uses
AWS inventory metadata and creates Paperclip external references; it does not
copy plaintext into Paperclip.
- Per-company AWS provider vaults (named instances of `aws_secrets_manager`
with their own region, namespace, prefix, KMS key id, and tags) are managed
in the board UI under `Company Settings → Secrets → Provider vaults`. See
[Provider Vaults](../docs/deploy/secrets.md#provider-vaults) for the operator
model and [Provider Vaults API](../docs/api/secrets.md#provider-vaults) for
the routes. The bootstrap trust model in this document still applies — vault
config carries non-sensitive routing metadata only, never AWS credentials.
## Bootstrap Trust Model
The AWS provider has a chicken-and-egg boundary: Paperclip cannot use
`company_secrets` to unlock the AWS provider that stores those secrets. The
initial AWS trust must exist before the Paperclip server starts.
Allowed bootstrap locations:
- Infrastructure IAM or workload identity attached to the Paperclip server
runtime.
- Process environment or orchestrator secret store used to start the Paperclip
server.
- Local AWS SDK sources such as `AWS_PROFILE`, AWS SSO/shared config, web
identity, container metadata, or instance metadata.
- Short-lived shell credentials for local development only.
Do not ask operators to paste AWS root credentials or long-lived IAM user access
keys into the Paperclip board UI. Do not store those bootstrap keys in
`company_secrets`.
## Paperclip Cloud Bootstrap
Paperclip Cloud must provision the AWS backing resources before any board user
can create AWS-backed company secrets:
1. Create or select the deployment KMS key.
2. Create the Paperclip server runtime role for the deployment.
3. Attach a minimum IAM policy scoped to the deployment Secrets Manager prefix
and the configured KMS key.
4. Configure the server runtime with the non-secret provider environment
variables below.
5. Run `paperclipai doctor` or the provider health endpoint from the deployed
runtime and confirm that the provider reports the expected region, prefix,
deployment id, KMS setting, and AWS SDK credential source.
Once this is in place, the board UI can create Paperclip-managed AWS secrets and
Paperclip will write them under the deployment/company namespace.
## Self-Hosted And Local Bootstrap
Self-hosted AWS deployments should use the AWS SDK default credential provider
chain. Preferred sources are role-based:
- EC2 instance profile.
- ECS task role.
- EKS IRSA or another OIDC web identity role.
- AWS SSO/shared config via `AWS_PROFILE`.
Local development can use:
```sh
aws sso login --profile paperclip-dev
AWS_PROFILE=paperclip-dev \
PAPERCLIP_SECRETS_PROVIDER=aws_secrets_manager \
PAPERCLIP_SECRETS_AWS_REGION=us-east-1 \
PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID=dev-local \
PAPERCLIP_SECRETS_AWS_KMS_KEY_ID=arn:aws:kms:us-east-1:123456789012:key/abcd-... \
pnpm dev
```
Temporary `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` environment credentials
are acceptable only as a local break-glass or short-lived test source. They
should not be written to Paperclip config, committed to `.env` files, stored in
`company_secrets`, or used as the default Paperclip Cloud bootstrap path.
## Deployment Config
Required environment variables:
```sh
PAPERCLIP_SECRETS_PROVIDER=aws_secrets_manager
PAPERCLIP_SECRETS_AWS_REGION=us-east-1
PAPERCLIP_SECRETS_AWS_DEPLOYMENT_ID=prod-us-1
PAPERCLIP_SECRETS_AWS_KMS_KEY_ID=arn:aws:kms:us-east-1:123456789012:key/abcd-...
```
Optional environment variables:
```sh
PAPERCLIP_SECRETS_AWS_PREFIX=paperclip
PAPERCLIP_SECRETS_AWS_ENVIRONMENT=production
PAPERCLIP_SECRETS_AWS_PROVIDER_OWNER=paperclip
PAPERCLIP_SECRETS_AWS_ENDPOINT=
PAPERCLIP_SECRETS_AWS_DELETE_RECOVERY_DAYS=30
```
Naming convention for Paperclip-managed secrets:
```text
paperclip/{deploymentId}/{companyId}/{secretKey}
```
Tag set for Paperclip-managed secrets:
- `paperclip:managed-by=paperclip`
- `paperclip:provider-owner=<owner tag>`
- `paperclip:deployment-id=<deployment id>`
- `paperclip:company-id=<company id>`
- `paperclip:secret-key=<secret key>`
- `paperclip:environment=<environment tag>`
## IAM And KMS Assumptions
Launch posture:
- One Paperclip app role per deployment.
- One deployment-scoped KMS key per deployment at launch.
- Future per-company KMS keys remain compatible because Paperclip stores provider refs and version metadata separately from values.
Minimum IAM boundary:
- Allow `secretsmanager:CreateSecret`, `PutSecretValue`, `GetSecretValue`, and `DeleteSecret`.
- Scope resources to the deployment prefix:
```text
arn:aws:secretsmanager:<region>:<account-id>:secret:paperclip/<deployment-id>/*
```
- Allow `kms:Encrypt`, `kms:Decrypt`, `kms:GenerateDataKey`, and `kms:DescribeKey` for the configured deployment CMK.
- Deny wildcard access outside the deployment prefix.
- Prefer workload identity / role-based auth. Do not store AWS credentials inline in Paperclip config.
Example minimum policy shape:
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PaperclipDeploymentSecrets",
"Effect": "Allow",
"Action": [
"secretsmanager:CreateSecret",
"secretsmanager:PutSecretValue",
"secretsmanager:GetSecretValue",
"secretsmanager:DeleteSecret"
],
"Resource": "arn:aws:secretsmanager:<region>:<account-id>:secret:paperclip/<deployment-id>/*"
},
{
"Sid": "PaperclipDeploymentKms",
"Effect": "Allow",
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:GenerateDataKey",
"kms:DescribeKey"
],
"Resource": "arn:aws:kms:<region>:<account-id>:key/<key-id>"
}
]
}
```
Operational expectation:
- Paperclip-managed secrets may be deleted only by Paperclip or an operator with equivalent break-glass access.
- External references may resolve through Paperclip runtime, but Paperclip should not delete the external secret resource.
## Remote Import Inventory IAM
Remote import preview needs one additional AWS permission:
```json
{
"Sid": "PaperclipRemoteSecretInventory",
"Effect": "Allow",
"Action": "secretsmanager:ListSecrets",
"Resource": "*"
}
```
This is intentionally separate from the managed create/rotate/delete policy.
AWS treats `ListSecrets` as an account/Region inventory action; do not document
secret ARNs, names, tags, or AWS request filters as an IAM boundary for it. Use
`Resource: "*"` and decide whether inventory exposure is acceptable for the AWS
account and Region behind each provider vault.
Remote import preview/import must not call:
- `secretsmanager:GetSecretValue`
- `secretsmanager:BatchGetSecretValue`
- `kms:Decrypt`
Those permissions are only needed later when a bound runtime resolves an
imported external reference. For imported refs, scope read permissions to the
operator-approved external prefixes that Paperclip is allowed to consume:
```json
{
"Sid": "PaperclipResolveImportedExternalReferences",
"Effect": "Allow",
"Action": "secretsmanager:GetSecretValue",
"Resource": [
"arn:aws:secretsmanager:<region>:<account-id>:secret:<approved-external-prefix>/*"
]
}
```
If selected external secrets use customer-managed KMS keys, also grant
`kms:Decrypt` and `kms:DescribeKey` on those keys. Keep managed write/delete
permissions scoped to `paperclip/<deployment-id>/*`; do not broaden them for
remote import.
Safe scoping guidance:
- Prefer one Paperclip runtime role per environment/account.
- Point provider vaults at the intended AWS account and Region instead of a
broad central admin role.
- Enable `ListSecrets` only in accounts where inventory exposure is acceptable.
- Keep preview/import board-only; agent API keys must not call these routes.
- Treat AWS tag/name filters as search UX only, not permission enforcement.
Paperclip also blocks importing refs under its own managed namespace as
external references. Use the Paperclip-managed flow for
`paperclip/{deploymentId}/{companyId}/{secretKey}` resources.
## Existing AWS Secrets
V1 keeps existing AWS Secrets Manager entries as **linked external references**, not adopted
Paperclip-managed resources.
Use the Paperclip-managed flow when Paperclip should create and rotate the value. The AWS
secret name is derived from deployment and company scope:
```text
paperclip/{deploymentId}/{companyId}/{secretKey}
```
Use the external-reference flow when the secret already exists at an operator-owned path such
as:
```text
/paperclip-bench/anthropic_api_key
```
In that mode Paperclip stores only the path or ARN, resolves it at runtime, and records
redacted access events. Operators rotate the actual value in AWS. Update the Paperclip
reference only when the AWS path, ARN, or pinned provider version changes.
Paperclip does not currently offer an "adopt existing AWS secret" flow that takes over future
`PutSecretValue` writes for an arbitrary existing secret. Adding that later requires explicit
confirmation UX, scope validation, expected Paperclip tags, and security/cloud-ops review.
## Data Custody
- Paperclip stores `externalRef`, `providerVersionRef`, provider id, fingerprint hash, status, and binding metadata.
- Paperclip does not store AWS secret plaintext in `company_secret_versions.material`.
- Runtime resolution fetches the value from AWS only when a bound consumer needs it.
## Rotation Runbook
Manual Paperclip-managed rotation:
1. Write the new value through the Paperclip secret rotate flow.
2. Paperclip creates a new AWS secret version with `PutSecretValue`.
3. Paperclip records the new `providerVersionRef` in `company_secret_versions`.
4. Re-run or restart affected workloads that consume `latest`, or pin consumers to a specific Paperclip version before rollout when you need staged release safety.
Guidance:
- Prefer pinned Paperclip secret versions for risky rollouts.
- Treat provider-native automatic rotation as a later enhancement; current V1 flow is explicit create-new-version plus controlled rollout.
## Backup And Restore Runbook
What must survive:
- Paperclip database metadata for secret ownership, bindings, status, and provider version refs.
- AWS Secrets Manager namespace under the configured deployment prefix.
- The configured KMS key and its decrypt permissions.
Restore checklist:
1. Restore Paperclip database metadata.
2. Confirm the same AWS Secrets Manager namespace still exists.
3. Confirm the Paperclip runtime role can call `GetSecretValue` on the restored prefix.
4. Confirm the role still has decrypt access to the CMK referenced by `PAPERCLIP_SECRETS_AWS_KMS_KEY_ID`.
5. Run the live smoke below or a targeted runtime secret resolution test.
## Provider Outage Runbook
Symptoms:
- Secret create/rotate/resolve operations fail with AWS provider errors.
- Agent runs fail before adapter invocation on required secret resolution.
- Remote import preview fails to list AWS inventory.
Immediate actions:
1. Confirm AWS regional health and Secrets Manager availability.
2. Confirm the runtime role still has `GetSecretValue` and KMS decrypt permissions.
3. Check for accidental prefix, region, deployment id, or KMS key config drift.
4. Retry a single resolution after AWS service health is green.
5. If outage persists, pause high-risk runs that require secret access rather than churning retries.
Remote import-specific actions:
- Missing list permission: add `secretsmanager:ListSecrets` with
`Resource: "*"` only when inventory import is approved for that vault's
AWS account and Region.
- Throttling: narrow the search, wait briefly, and retry with backoff. Avoid
full-account enumeration.
- Invalid or stale cursor: refresh the preview and discard the old
`NextToken`.
- Large account: load pages intentionally, keep one in-flight preview request
per vault/search, and do not run background full-account crawls.
- Runtime read failure after import: verify `GetSecretValue` and KMS decrypt
on the selected external secret. Visibility in `ListSecrets` does not prove
read permission.
## Incident Response Runbook
Potential incidents:
- Cross-company access caused by IAM scoping drift.
- KMS policy drift causing decrypt failures or over-broad access.
- Suspected secret exposure in logs, transcripts, or downstream agent output.
Response steps:
1. Stop or pause affected Paperclip runs.
2. Audit recent Paperclip secret access events for impacted secret ids and consumers.
3. Audit AWS CloudTrail for `ListSecrets`, `GetSecretValue`,
`PutSecretValue`, and `DeleteSecret` calls on the relevant vault account,
Region, deployment prefix, and approved external prefixes.
4. Rotate impacted secrets in AWS through Paperclip-managed versioning.
5. Re-scope IAM and KMS policies before resuming normal traffic.
6. If a value may have reached an agent transcript or external system, treat it as exposed and rotate immediately.
## Optional Live Smoke
This is safe to skip locally. Run it only against a dedicated AWS test namespace.
Prerequisites:
- AWS credentials or workload identity with the deployment-scoped IAM permissions above.
- `PAPERCLIP_SECRETS_PROVIDER=aws_secrets_manager`
- The required `PAPERCLIP_SECRETS_AWS_*` environment variables set.
Suggested smoke:
1. Create a test secret through the Paperclip board or API under a throwaway company.
2. Confirm the resulting AWS secret name matches `paperclip/{deploymentId}/{companyId}/{secretKey}`.
3. Rotate the secret once and confirm a new `providerVersionRef` appears in Paperclip metadata.
4. Resolve the secret through a bound runtime path, not by adding a general-purpose reveal endpoint.
5. Delete the throwaway secret and confirm AWS schedules deletion with the configured recovery window.
+194 -21
View File
@@ -1,7 +1,7 @@
# Paperclip V1 Implementation Spec
Status: Implementation contract for first release (V1)
Date: 2026-02-17
Date: 2026-04-28
Audience: Product, engineering, and agent-integration authors
Source inputs: `GOAL.md`, `PRODUCT.md`, `SPEC.md`, `DATABASE.md`, current monorepo code
@@ -34,11 +34,12 @@ These decisions close open questions from `SPEC.md` for V1.
| Company model | Company is first-order; all business entities are company-scoped |
| Board | Single human board operator per deployment |
| Org graph | Strict tree (`reports_to` nullable root); no multi-manager reporting |
| Visibility | Full visibility to board and all agents in same company |
| Visibility | Company-scoped visibility: board + all in-company agents can see all work objects by default; public/private deployment flags affect external exposure only and do **not** imply project/issue privacy |
| Communication | Tasks + comments only (no separate chat system) |
| Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
| Recovery | No automatic reassignment; work recovery stays manual/explicit |
| Agent adapters | Built-in `process` and `http` adapters |
| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise open visible source-scoped recovery actions by default, use issue-backed recovery only for independent repair work, or require human escalation (see `doc/execution-semantics.md`) |
| Agent adapters | Built-in `process`, `http`, local CLI/session adapters, and OpenClaw gateway support; external adapters can also be loaded through the adapter plugin flow |
| Plugin framework | Local/self-hosted early plugin runtime is in scope; cloud marketplace and packaged public distribution remain out of scope |
| Auth | Mode-dependent human auth (`local_trusted` implicit board in current code; authenticated mode uses sessions), API keys for agents |
| Budget period | Monthly UTC calendar window |
| Budget enforcement | Soft alerts + hard limit auto-pause |
@@ -73,7 +74,7 @@ V1 implementation extends this baseline into a company-centric, governance-aware
## 5.2 Out of Scope (V1)
- Plugin framework and third-party extension SDK
- Cloud-grade plugin marketplace/distribution beyond the local/self-hosted plugin runtime
- Revenue/expense accounting beyond model/token costs
- Knowledge base subsystem
- Public marketplace (ClipHub)
@@ -123,6 +124,16 @@ Human auth tables (`users`, `sessions`, and provider-specific auth artifacts) ar
- `name` text not null
- `description` text null
- `status` enum: `active | paused | archived`
- `pause_reason` text null
- `paused_at` timestamptz null
- `issue_prefix` text not null
- `issue_counter` int not null
- `budget_monthly_cents` int not null default 0
- `spent_monthly_cents` int not null default 0
- `attachment_max_bytes` int not null
- `require_board_approval_for_new_agents` boolean not null default false
- feedback sharing consent fields
- branding fields such as `brand_color`
Invariant: every business record belongs to exactly one company.
@@ -133,15 +144,21 @@ Invariant: every business record belongs to exactly one company.
- `name` text not null
- `role` text not null
- `title` text null
- `status` enum: `active | paused | idle | running | error | terminated`
- `icon` text null
- `status` enum: `active | paused | idle | running | error | pending_approval | terminated`
- `reports_to` uuid fk `agents.id` null
- `capabilities` text null
- `adapter_type` enum: `process | http`
- `adapter_type` text; built-ins include `process`, `http`, `claude_local`, `codex_local`, `gemini_local`, `opencode_local`, `pi_local`, `cursor`, and `openclaw_gateway`
- `adapter_config` jsonb not null
- `runtime_config` jsonb not null default `{}`; may include Paperclip runtime policy such as `modelProfiles.cheap.adapterConfig` for an optional low-cost model lane that does not change the primary adapter config
- `default_environment_id` uuid fk `environments.id` null
- `context_mode` enum: `thin | fat` default `thin`
- `budget_monthly_cents` int not null default 0
- `spent_monthly_cents` int not null default 0
- pause fields: `pause_reason`, `paused_at`
- `permissions` jsonb not null default `{}`
- `last_heartbeat_at` timestamptz null
- `metadata` jsonb null
Invariants:
@@ -190,11 +207,14 @@ Invariant:
- project env is merged into run environment for issues in that project and overrides conflicting agent env keys before Paperclip runtime-owned keys are injected
Routine execution issues add a routine-scoped env overlay after project env and before Paperclip runtime-owned keys. Routine env uses the same secret-aware binding format, is stored on `routines.env`, is snapshotted in routine revisions, and resolves secret refs against the routine binding target so routine-owned secrets do not require direct bindings on the executing agent.
## 7.6 `issues` (core task entity)
- `id` uuid pk
- `company_id` uuid fk not null
- `project_id` uuid fk `projects.id` null
- `project_workspace_id` uuid fk `project_workspaces.id` null
- `goal_id` uuid fk `goals.id` null
- `parent_id` uuid fk `issues.id` null
- `title` text not null
@@ -202,13 +222,22 @@ Invariant:
- `status` enum: `backlog | todo | in_progress | in_review | done | blocked | cancelled`
- `priority` enum: `critical | high | medium | low`
- `assignee_agent_id` uuid fk `agents.id` null
- `assignee_user_id` text null
- checkout/execution locks: `checkout_run_id`, `execution_run_id`, `execution_agent_name_key`, `execution_locked_at`
- `created_by_agent_id` uuid fk `agents.id` null
- `created_by_user_id` uuid fk `users.id` null
- identifier fields: `issue_number`, `identifier`
- origin fields: `origin_kind`, `origin_id`, `origin_run_id`, `origin_fingerprint`
- `request_depth` int not null default 0
- `billing_code` text null
- `assignee_adapter_overrides` jsonb null
- `execution_policy` jsonb null
- `execution_state` jsonb null
- execution workspace fields: `execution_workspace_id`, `execution_workspace_preference`, `execution_workspace_settings`
- `started_at` timestamptz null
- `completed_at` timestamptz null
- `cancelled_at` timestamptz null
- `hidden_at` timestamptz null
Invariants:
@@ -261,10 +290,10 @@ Invariant: each event must attach to agent and company; rollups are aggregation,
- `id` uuid pk
- `company_id` uuid fk not null
- `type` enum: `hire_agent | approve_ceo_strategy`
- `type` enum: `hire_agent | approve_ceo_strategy | budget_override_required | request_board_approval`
- `requested_by_agent_id` uuid fk `agents.id` null
- `requested_by_user_id` uuid fk `users.id` null
- `status` enum: `pending | approved | rejected | cancelled`
- `status` enum: `pending | revision_requested | approved | rejected | cancelled`
- `payload` jsonb not null
- `decision_note` text null
- `decided_by_user_id` uuid fk `users.id` null
@@ -282,7 +311,32 @@ Invariant: each event must attach to agent and company; rollups are aggregation,
- `details` jsonb null
- `created_at` timestamptz not null default now()
## 7.12 `company_secrets` + `company_secret_versions`
## 7.12 `project_memberships` + `agent_memberships`
Per-user project/agent membership is personal visibility state for board users. It only controls whether a resource appears in the current user's sidebar; it must not grant or revoke access to all-pages, detail pages, selectors, assignment flows, search, or existing permissions.
`project_memberships`:
- `id` uuid pk
- `company_id` uuid fk `companies.id` not null
- `project_id` uuid fk `projects.id` not null
- `user_id` text not null
- `state` enum-like text: `joined | left`
- `created_at` timestamptz not null default now()
- `updated_at` timestamptz not null default now()
- unique `(company_id, user_id, project_id)`
`agent_memberships` mirrors the same shape with `agent_id` instead of `project_id` and unique `(company_id, user_id, agent_id)`.
Invariants:
- Missing membership rows mean `joined` for backward compatibility.
- Mutations are board-user-only `/me` operations; agent API keys are rejected.
- Viewer-role board users may update only their own membership rows through the narrow self-service helper.
- Target project/agent ownership is checked against the path company before mutation.
- Successful state changes write `resource_membership.joined` or `resource_membership.left` activity entries.
## 7.13 `company_secrets` + `company_secret_versions`
- Secret values are not stored inline in `agents.adapter_config.env`.
- Agent env entries should use secret refs for sensitive values.
@@ -296,7 +350,7 @@ Operational policy:
- Activity and approval payloads must not persist raw sensitive values.
- Config revisions may include redacted placeholders; such revisions are non-restorable for redacted fields.
## 7.13 Required Indexes
## 7.14 Required Indexes
- `agents(company_id, status)`
- `agents(company_id, reports_to)`
@@ -314,8 +368,12 @@ Operational policy:
- `issue_attachments(company_id, issue_id)`
- `company_secrets(company_id, name)` unique
- `company_secret_versions(secret_id, version)` unique
- `project_memberships(company_id, user_id)`
- `project_memberships(company_id, user_id, project_id)` unique
- `agent_memberships(company_id, user_id)`
- `agent_memberships(company_id, user_id, agent_id)` unique
## 7.14 `assets` + `issue_attachments`
## 7.15 `assets` + `issue_attachments`
- `assets` stores provider-backed object metadata (not inline bytes):
- `id` uuid pk
@@ -349,6 +407,10 @@ Operational policy:
- `created_by_user_id` uuid/text fk null
- `updated_by_agent_id` uuid fk null
- `updated_by_user_id` uuid/text fk null
- `locked_at` timestamptz null
- `locked_by_agent_id` uuid fk null
- `locked_by_user_id` uuid/text fk null
- Locked documents are immutable until unlocked. Board operators can lock/unlock; agent writes to a locked key create a new issue document with a derived key instead of overwriting the locked document.
- `document_revisions` stores append-only history:
- `id` uuid pk
- `company_id` uuid fk not null
@@ -363,6 +425,15 @@ Operational policy:
- `document_id` uuid fk not null
- `key` text not null (`plan`, `design`, `notes`, etc.)
## 7.16 Current Implementation Addenda
The current implementation includes additional V1-control-plane tables beyond the original February snapshot:
- Issue structure and review: `issue_relations` for blockers, `labels`/`issue_labels`, `issue_thread_interactions`, `issue_approvals`, `issue_execution_decisions`, `issue_work_products`, `issue_inbox_archives`, `issue_read_states`, and issue reference mention indexes.
- Execution and workspace control: `execution_workspaces`, `project_workspaces`, `workspace_runtime_services`, `workspace_operations`, `environments`, `environment_leases`, `agent_task_sessions`, `agent_runtime_state`, `agent_wakeup_requests`, heartbeat events, and watchdog decision tables.
- Plugins and routines: `plugins`, plugin config/state/entities/jobs/logs/webhooks, plugin database namespaces/migrations, plugin company settings, `routines`, `routine_revisions`, `routine_triggers`, and `routine_runs`.
- Access and operations: company memberships, instance roles, principal permission grants, invites, join requests, board API keys, CLI auth challenges, budget policies/incidents, feedback exports/votes, company skills, sidebar preferences, and company logos.
## 8. State Machines
## 8.1 Agent Status
@@ -395,6 +466,16 @@ Side effects:
- entering `done` sets `completed_at`
- entering `cancelled` sets `cancelled_at`
V1 non-terminal liveness rule:
- agent-owned `todo`, `in_progress`, `in_review`, and `blocked` issues must have a live execution path, an explicit waiting path, or an explicit recovery path
- `in_review` is healthy only when a typed execution participant, pending issue-thread interaction or approval, user owner, active run, queued wake, or explicit recovery action owns the next action
- a blocked chain is covered only when each unresolved leaf issue is live or explicitly waiting
- when Paperclip cannot safely infer the next action, it surfaces the problem through visible blocked/recovery work instead of silently completing or reassigning work
- explicit recovery actions are the liveness primitive; source-scoped actions are the default form, issue-backed recovery is a fallback for independent repair work or safety boundaries, and comments alone are evidence rather than a healthy liveness path
Detailed ownership, execution, blocker, active-run watchdog, crash-recovery, and non-terminal liveness semantics are documented in `doc/execution-semantics.md`.
## 8.3 Approval Status
- `pending -> approved | rejected | cancelled`
@@ -435,6 +516,59 @@ Side effects:
| Report cost | yes | yes |
| Set company budget | yes | no |
| Set subordinate budget | yes | yes (manager subtree only) |
| Set work-object visibility (issue/project) | no | no (pro gate) |
## 9.4 Permission Terminology and Default Visibility Rule
Paperclip V1 keeps a company-scoped visibility model as the default because centralized authorization and scoped work-object controls are not yet a core V1 control surface.
The approved term set is:
- **Agent profile visibility**: identity-level facts needed for delegation and governance (name, role, capabilities, reporting lines).
- **Agent config visibility**: adapter/runtime config metadata and secret-access policy.
- **Assignment/invocation permission**: who may modify or execute a task.
- **Work-object visibility**: who can read/write issues, comments, projects, and attachments.
- **Tool/secret policy**: what tools and secret-backed credentials an agent can use and what appears in logs.
- **Escalation authority**: where refusal/blocked decisions route (manager, then board).
## 9.5 Core V1 Rule: what “private” means
- A **private marker** on an agent profile (where represented) does **not** make company-visible work private.
- Company-visible work objects (issues, comments, work products, costs, activity, project/task state) remain visible to the board and in-company agents by default.
- Project/issue-level privacy, scoped assignment-only object visibility, and organization-wide custom ACLs are deferred to Pro/Enterprise controls.
## 9.6 V1 vs Pro/Enterprise Controls (recommended target split)
| Permission area | Free / V1 default | Pro / Enterprise |
|---|---|---|
| Company boundary | Hard boundary only (`company_id`) | Multi-company policy overlays (`membership`, `project`, and `task` scopes) |
| Simple roles | Board + agent roles with existing approval/budget gates | Additional role aliases + scoped approver roles |
| Profile visibility | Full profile visibility for coordination and audit | Optional profile redaction / selective sharing for external surfaces |
| Config visibility | Board full read with redacted secret fields; agent config read/write constrained by own agent identity | Scoped config visibility controls and central policy enforcement |
| Assignment/invocation | Assignment creates execution authority; board can reassign or force release | Delegation policies and scoped invokers with deny-listed tool classes |
| Work-object visibility | All issues and projects in-company are visible to board and agents | Project/issue ACLs and reviewer-only channels |
| Tool/secret policy | Secret refs, log redaction, and adapter-level command/webhook restrictions | Tool allowlists with centralized policy evaluation |
| Escalation | Escalate from agent to manager to board; board approval/budget gates remain authoritative | Escalation routing and SLA windows |
## 9.7 Recommended first-slice implementation order
1. Lock route-level checks for existing company boundaries, actor extraction, and approval/budget gates.
2. Treat profile privacy as external-facing signal only; do not use it to hide company-visible work objects.
3. Enforce assignment/invocation coupling (`assignee`/`agent` checks, checkout semantics, invocation checks).
4. Standardize read-path redaction for secrets and secret references, including logs and activity.
5. Standardize escalation paths (`blocked` and refusal) so non-board agents hand off by manager/board with immutable audit.
## 9.8 Scoped Task Assignment Grants
`tasks:assign` remains the broad assignment permission. Existing unscoped grants preserve compatibility and allow the principal to assign any visible company task within normal company-boundary checks.
`tasks:assign_scope` is the constrained assignment permission. Its `principal_permission_grants.scope` JSON must include at least one recognized constraint:
- Project scope: `projectId`, `projectIds`, or `allow: ["project:<projectId>"]`.
- Target-agent allowlist: `agentId`, `agentIds`, `assigneeAgentId`, `assigneeAgentIds`, `targetAgentId`, `targetAgentIds`, or `allow: ["agent:<agentId>"]`.
- Managed-subtree scope: `managerAgentId`, `managerAgentIds`, `managedSubtreeAgentId`, `managedSubtreeAgentIds`, `subtreeAgentId`, `subtreeAgentIds`, `subtreeRootAgentId`, `subtreeRootAgentIds`, or `allow: ["subtree:<agentId>"]`.
When multiple constraint families are present, assignment must satisfy all of them. Denials return `403` with a generic scope explanation and do not disclose details about hidden or unrelated resources.
## 10. API Contract (REST)
@@ -478,10 +612,13 @@ All endpoints are under `/api` and return JSON.
- `GET /issues/:issueId/documents`
- `GET /issues/:issueId/documents/:key`
- `PUT /issues/:issueId/documents/:key`
- `POST /issues/:issueId/documents/:key/lock`
- `POST /issues/:issueId/documents/:key/unlock`
- `GET /issues/:issueId/documents/:key/revisions`
- `DELETE /issues/:issueId/documents/:key`
- `POST /issues/:issueId/checkout`
- `POST /issues/:issueId/release`
- `POST /issues/:issueId/admin/force-release` (board-only lock recovery)
- `POST /issues/:issueId/comments`
- `GET /issues/:issueId/comments`
- `POST /companies/:companyId/issues/:issueId/attachments` (multipart upload)
@@ -506,6 +643,8 @@ Server behavior:
2. if updated row count is 0, return `409` with current owner/status
3. successful checkout sets `assignee_agent_id`, `status = in_progress`, and `started_at`
`POST /issues/:issueId/admin/force-release` is an operator recovery endpoint for stale harness locks. It requires board access to the issue company, clears checkout and execution run lock fields, and may clear the agent assignee when `clearAssignee=true` is passed. The route must write an `issue.admin_force_release` activity log entry containing the previous checkout and execution run IDs.
## 10.5 Projects
- `GET /companies/:companyId/projects`
@@ -513,14 +652,28 @@ Server behavior:
- `GET /projects/:projectId`
- `PATCH /projects/:projectId`
## 10.6 Approvals
## 10.6 Current-user Resource Memberships
- `GET /companies/:companyId/resource-memberships/me`
- `PUT /companies/:companyId/resource-memberships/me/projects/:projectId`
- `PUT /companies/:companyId/resource-memberships/me/agents/:agentId`
Request payload:
```json
{ "state": "joined" }
```
Allowed states are `joined` and `left`. Endpoints require a concrete board user and active company membership, reject agent API keys, and only mutate the caller's own sidebar visibility state. Joining/leaving is idempotent; missing rows read as `joined`.
## 10.7 Approvals
- `GET /companies/:companyId/approvals?status=pending`
- `POST /companies/:companyId/approvals`
- `POST /approvals/:approvalId/approve`
- `POST /approvals/:approvalId/reject`
## 10.7 Cost and Budgets
## 10.8 Cost and Budgets
- `POST /companies/:companyId/cost-events`
- `GET /companies/:companyId/costs/summary`
@@ -529,7 +682,7 @@ Server behavior:
- `PATCH /companies/:companyId/budgets`
- `PATCH /agents/:agentId/budgets`
## 10.8 Activity and Dashboard
## 10.9 Activity and Dashboard
- `GET /companies/:companyId/activity`
- `GET /companies/:companyId/dashboard`
@@ -541,7 +694,7 @@ Dashboard payload must include:
- month-to-date spend and budget utilization
- pending approvals count
## 10.9 Error Semantics
## 10.10 Error Semantics
- `400` validation error
- `401` unauthenticated
@@ -551,6 +704,17 @@ Dashboard payload must include:
- `422` semantic rule violation
- `500` server error
## 10.11 Current Implementation API Addenda
The current app also exposes V1-supporting surfaces for:
- issue thread interactions (`suggest_tasks`, `ask_user_questions`, `request_confirmation`)
- issue approvals, issue references/search, labels, read state, inbox/archive state, and work products
- execution workspaces, project workspaces, workspace runtime services, and workspace operations
- routines and scheduled/API/webhook triggers
- plugin installation, configuration, state, jobs, logs, webhooks, and plugin database namespace migration
- company import/export preview/apply, feedback export/vote routes, instance backup/config routes, invites, join requests, memberships, and permission grants
## 11. Heartbeat and Adapter Contract
## 11.1 Adapter Interface
@@ -611,13 +775,19 @@ Behavior:
- `thin`: send IDs and pointers only; agent fetches context via API
- `fat`: include current assignments, goal summary, budget snapshot, and recent comments
## 11.5 Scheduler Rules
## 11.5 Recovery Model Profiles
The optional `modelProfiles.cheap` lane is not a retry worker lane. Paperclip may request the cheap profile only for status-only recovery coordination, and those wakes must include guard context that prevents deliverable work and document/plan updates (`allowDeliverableWork: false`, `allowDocumentUpdates: false`, `resumeRequiresNormalModel: true`).
Failed source-work retries, process-loss retries, transient/scheduled retries, max-turn continuations, source-assignee continuations, and downstream source-work child/requeue/resume contexts must use the normal/original model lane. If cheap recovery repairs liveness while actual work remains, the next live continuation path must be a separate normal-model worker run with cheap hints scrubbed.
## 11.6 Scheduler Rules
Per-agent schedule fields in `adapter_config`:
- `enabled` boolean
- `intervalSec` integer (minimum 30)
- `maxConcurrentRuns` fixed at `1` for V1
- `maxConcurrentRuns` integer; new agents default to `20`; scheduler clamps configured values to `1..50`
Scheduler must skip invocation when:
@@ -726,13 +896,14 @@ Required UX behaviors:
- Node 20+
- `DATABASE_URL` optional
- if unset, auto-use PGlite and push schema
- if unset, auto-use embedded PostgreSQL under `~/.paperclip/instances/default/db`
## 15.2 Migrations
- Drizzle migrations are source of truth
- local/dev startup applies pending migrations automatically where supported
- `pnpm db:migrate` applies pending migrations manually
- no destructive migration in-place for V1 upgrade path
- provide migration script from existing minimal tables to company-scoped schema
## 15.3 Logging and Audit
@@ -787,6 +958,8 @@ A release candidate is blocked unless these pass:
## 18. Delivery Plan
Current implementation note: the milestones below describe the original V1 sequencing. Several systems originally framed as future work have since shipped or advanced materially, including issue documents/interactions, blockers, routines, execution workspaces, import/export portability, authenticated deployment modes, multi-user basics, and the local/self-hosted plugin runtime.
## Milestone 1: Company Core and Auth
- add `companies` and company scoping to existing entities
@@ -839,7 +1012,7 @@ V1 is complete only when all criteria are true:
## 20. Post-V1 Backlog (Explicitly Deferred)
- plugin architecture
- cloud-grade plugin marketplace/distribution
- richer workflow-state customization per team
- milestones/labels/dependency graph depth beyond V1 minimum
- realtime transport optimization (SSE/WebSockets)
+2
View File
@@ -141,6 +141,8 @@ Hierarchical reporting structure. CEO at top, reports cascade down.
**Full visibility across the org.** Every agent can see the entire org chart, all tasks, all agents. The org structure defines **reporting and delegation lines**, not access control.
Visibility settings on an agent profile (where supported) do not alter company-level visibility for tasks, projects, issues, comments, costs, or activity. Those work-object privacy controls are not a V1 feature until centralized scoped authorization is in place.
Each agent publishes a short description of their responsibilities and capabilities — almost like skills ("when I'm relevant"). This lets other agents discover who can help with what.
### Cross-Team Work
Binary file not shown.

After

Width:  |  Height:  |  Size: 404 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 174 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 174 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 177 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 177 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 140 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 137 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 61 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 29 KiB

+541
View File
@@ -0,0 +1,541 @@
# Execution Semantics
Status: Current implementation guide
Date: 2026-05-23
Audience: Product and engineering
This document explains how Paperclip interprets issue assignment, issue status, execution runs, wakeups, parent/sub-issue structure, and blocker relationships.
`doc/SPEC-implementation.md` remains the V1 contract. This document is the detailed execution model behind that contract.
## 1. Core Model
Paperclip separates four concepts that are easy to blur together:
1. structure: parent/sub-issue relationships
2. dependency: blocker relationships
3. ownership: who is responsible for the issue now
4. execution: whether the control plane currently has a live path to move the issue forward
The system works best when those are kept separate.
## 2. Assignee Semantics
An issue has at most one assignee.
- `assigneeAgentId` means the issue is owned by an agent
- `assigneeUserId` means the issue is owned by a human board user
- both cannot be set at the same time
This is a hard invariant. Paperclip is single-assignee by design.
## 3. Status Semantics
Paperclip issue statuses are not just UI labels. They imply different expectations about ownership and execution.
### `backlog`
The issue is not ready for active work.
- no execution expectation
- no pickup expectation
- safe resting state for future work
### `todo`
The issue is actionable but not actively claimed.
- it may be assigned or unassigned
- no checkout/execution lock is required yet
- for agent-assigned work, Paperclip may still need a wake path to ensure the assignee actually sees it
### `in_progress`
The issue is actively owned work.
- requires an assignee
- for agent-owned issues, this is a strict execution-backed state
- for user-owned issues, this is a human ownership state and is not backed by heartbeat execution
For agent-owned issues, `in_progress` should not be allowed to become a silent dead state.
### `blocked`
The issue cannot proceed until something external changes.
This is the right state for:
- waiting on another issue
- waiting on a human decision
- waiting on an external dependency or system when Paperclip does not own a scheduled re-check
- work that automatic recovery could not safely continue
### `in_review`
Execution work is paused because the next move belongs to a reviewer or approver, not the current executor.
An external review service can also be a valid review path when the issue keeps an agent assignee and has an active one-shot monitor that will wake that assignee to check the service later.
### `done`
The work is complete and terminal.
### `cancelled`
The work will not continue and is terminal.
## 4. Agent-Owned vs User-Owned Execution
The execution model differs depending on assignee type.
### Agent-owned issues
Agent-owned issues are part of the control plane's execution loop.
- Paperclip can wake the assignee
- Paperclip can track runs linked to the issue
- Paperclip can recover some lost execution state after crashes/restarts
### User-owned issues
User-owned issues are not executed by the heartbeat scheduler.
- Paperclip can track the ownership and status
- Paperclip cannot rely on heartbeat/run semantics to keep them moving
- stranded-work reconciliation does not apply to them
This is why `in_progress` can be strict for agents without forcing the same runtime rules onto human-held work.
## 5. Checkout and Active Execution
Checkout is the bridge from issue ownership to active agent execution.
- checkout is required to move an issue into agent-owned `in_progress`
- `checkoutRunId` represents issue-ownership lock for the current agent run
- `executionRunId` represents the currently active execution path for the issue
These are related but not identical:
- `checkoutRunId` answers who currently owns execution rights for the issue
- `executionRunId` answers which run is actually live right now
Paperclip already clears stale execution locks and can adopt some stale checkout locks when the original run is gone.
## 6. Parent/Sub-Issue vs Blockers
Paperclip uses two different relationships for different jobs.
### Parent/Sub-Issue (`parentId`)
This is structural.
Use it for:
- work breakdown
- rollup context
- explaining why a child issue exists
- waking the parent assignee when all direct children become terminal
Do not treat `parentId` as execution dependency by itself.
### Blockers (`blockedByIssueIds`)
This is dependency semantics.
Use it for:
- \"this issue cannot continue until that issue changes state\"
- explicit waiting relationships
- automatic wakeups when all blockers resolve
Blocked issues should stay idle while blockers remain unresolved. Paperclip should not create a queued heartbeat run for that issue until the final blocker is done and the `issue_blockers_resolved` wake can start real work.
If a parent is truly waiting on a child, model that with blockers. Do not rely on the parent/child relationship alone.
## 7. Accepted-Plan Decomposition
An accepted plan confirmation is permission to decompose one specific accepted plan revision into child issues.
This complements the existing accepted-plan continuation rule: once a plan is accepted, the source issue may create child implementation issues, but it must not start implementation work on the source issue itself during that continuation.
Paperclip must treat accepted-plan decomposition as an exact-once control-plane primitive, not as a free-floating wake that any later run may interpret again.
### Exact-once fingerprint
The canonical decomposition fingerprint is:
- `(sourceIssueId, acceptedPlanRevisionId)`
Where:
- `sourceIssueId` is the issue whose `plan` document revision was accepted
- `acceptedPlanRevisionId` is the accepted `plan` document revision
This is the product contract because the accepted revision is the thing being authorized for decomposition. Re-accepting, re-waking, or re-reading the same accepted revision must not authorize a second child tree. A later accepted revision on the same source issue is a new fingerprint and may produce a different decomposition result.
An implementation may also store the accepted interaction id, acceptance run id, or other evidence, but those values must collapse onto the same uniqueness guarantee. They must not allow a second decomposition claim for the same `(sourceIssueId, acceptedPlanRevisionId)` pair.
### Durable claim and durable result
Before creating child issues, the first decomposition attempt must create or reuse a durable record for the fingerprint.
That durable record must be able to answer, without reconstructing the thread from comments or transcripts:
- whether decomposition for the fingerprint is `in_flight` or `completed`
- which run or owner currently holds the in-flight claim
- which child issues, if any, have already been created under that fingerprint
- which final child issue ids belong to the completed result
Paperclip does not need to mandate a specific storage shape in this document. The record may live in a dedicated table, source-issue execution state, interaction metadata, or another durable product surface. What matters is the contract:
- the claim is durable before fan-out starts
- partial progress is durable while fan-out is underway
- the completed child result set is durable after fan-out finishes
If a run creates some children and then dies, retries must continue from the same fingerprint and reuse the already-recorded partial result. They must not restart decomposition as if nothing happened.
### Parent live path while decomposition is in flight
While decomposition for an accepted fingerprint is incomplete, the source issue must expose an explicit live path for that same fingerprint.
The accepted interaction by itself is only evidence that the plan was approved. It is not a sufficient live path once decomposition begins. The source issue must make it clear what moves the fingerprint forward next, such as:
- the active decomposition run
- a queued continuation wake for the same assignee
- a monitor or explicit recovery action tied to the same decomposition claim
- a blocked state that names the real blocker for finishing that claimed decomposition
If the live run disappears, Paperclip must repair, resume, or visibly block the existing claim. It must not leave the source issue in a state where a second run can interpret the same acceptance as fresh permission to create sibling issues again.
### Concurrent and repeat attempts
Every later run that encounters the same accepted-plan fingerprint must consult the durable claim/result before creating children.
- If no claim exists, the run may atomically create the claim and become the decomposition owner.
- If a claim exists and is `in_flight`, the later run must reuse that claim. It may resume the same decomposition if it is the valid continuation owner, or it may exit after observing that another run already owns the work.
- If a claim exists and is `completed`, the later run must reuse the recorded child result and must not create new sibling issues.
- If the prior attempt ended after partial child creation, the retry must continue under the same fingerprint and preserve the already-created child ids.
Concurrent accepted-plan runs are therefore idempotent relative to the fingerprint. Creating multiple child trees for the same `(sourceIssueId, acceptedPlanRevisionId)` pair is a product bug.
## 8. Non-Terminal Issue Liveness Contract
For agent-owned, non-terminal issues, Paperclip should never leave work in a state where nobody is responsible for the next move and nothing will wake or surface it.
This is a visibility contract, not an auto-completion contract. If Paperclip cannot safely infer the next action, it should surface the ambiguity with a blocked state, a visible notice, or an explicit recovery action. It must not silently mark work done from prose comments or guess that a dependency is complete.
An issue is healthy when the product can answer "what moves this forward next?" without requiring a human to reconstruct intent from the whole thread. An issue is stalled when it is non-terminal but has no live execution path, no explicit waiting path, and no recovery path.
The valid action-path primitives are:
- an active run linked to the issue
- a queued wake or continuation that can be delivered to the responsible agent
- a typed execution-policy participant, such as `executionState.currentParticipant`
- a pending issue-thread interaction or linked approval that is waiting for a specific responder
- a one-shot issue monitor (`executionPolicy.monitor.nextCheckAt`) that will wake the assignee for a future check
- a human owner via `assigneeUserId`
- a first-class blocker chain whose unresolved leaf issues are themselves healthy
- an open explicit recovery action that names the owner and action needed to restore liveness
### Explicit recovery actions
An explicit recovery action is a typed liveness repair path for a source issue. It is the recovery primitive; the action can be rendered directly on the source issue or backed by a separate recovery issue when the repair needs its own work item.
A valid recovery action must name:
- the source issue and company
- the recovery kind and idempotency fingerprint
- the recovery owner, plus previous or return owner when ownership may temporarily shift
- the cause, bounded evidence, and next action
- the wake, monitor, timeout, retry, or escalation policy that will move the action forward
- the resolution outcome when closed, such as restored, delegated, false positive, blocked, escalated, or cancelled
A source-scoped recovery action is the default form. Use it when the next safe move is to repair the source issue's liveness directly: move the source issue back to `todo` so it can be retried, clarify disposition, re-establish a monitor, record a false positive, or delegate real follow-up work from the source issue.
Use an issue-backed recovery action only when the recovery is genuinely independent work or when source-scoped handling would be unsafe or unclear. Examples include:
- long or cross-agent repair work with its own assignee, subtasks, or blockers
- real delegated follow-up that should block the source issue as a first-class dependency
- active-run watchdog work that must observe a still-running source process without interfering with it
- recovery that needs separate review, approval, security handling, or escalation ownership
- cases where source issue ownership cannot be changed or restored safely
A comment or system notice can be evidence for a recovery action, but it is not a recovery action by itself. Comment-only recovery is not a healthy liveness path because it does not define a typed owner, wake or monitor policy, retry bound, timeout, escalation path, or resolution outcome.
#### Recovery action freshness
Source-scoped recovery actions are snapshots of the source issue's liveness state at the time the action was opened. They must be revalidated after newer durable source activity, including source issue status changes, assignee changes, blocker changes, execution policy or monitor changes, document or work-product updates that define a valid waiting path, and structured resume or disposition updates.
When newer source activity restores a valid live or waiting path, the recovery action is stale and should be folded through the explicit recovery lifecycle instead of being hidden or deleted. Folding means resolving or cancelling the recovery action with a resolution outcome and note that preserve the audit trail.
Plain comments alone do not make a recovery action stale. A comment can provide evidence, but the recovery action should remain visible when the source issue is still stalled and the comment does not create a valid action-path primitive such as a wake, monitor, interaction, approval, blocker, human owner, execution participant, terminal disposition, or delegated follow-up.
### Agent-assigned `todo`
This is dispatch state: ready to start, not yet actively claimed.
A healthy dispatch state means at least one of these is true:
- the issue already has a queued wake path
- the issue is intentionally resting in `todo` after a completed agent heartbeat, with no interrupted dispatch evidence
- the issue has been explicitly surfaced as stranded through a visible blocked/recovery path
An assigned `todo` issue is stalled when dispatch was interrupted, no wake remains queued or running, and no recovery path has been opened.
### Agent-assigned `backlog`
This is parked state, not dispatch state.
Assigning an issue normally implies executable intent. When create APIs receive an assignee and no explicit status, Paperclip defaults the issue to `todo` so the assignee has a wake path instead of silently inheriting the unassigned `backlog` default.
An explicit assigned `backlog` issue remains valid when the creator is deliberately parking the work. It must not wake the assignee just because it has an assignee. Paperclip should make that choice visible in activity and UI so operators can distinguish intentional parking from a missed handoff.
An assigned `backlog` issue becomes a liveness problem when another issue is blocked on it and there is no explicit waiting path such as a human owner, active run, queued wake, pending interaction or approval, monitor, or open recovery action. In that case the blocked parent should surface "blocked by parked work" rather than treating the dependency chain as healthy.
### Agent-assigned `in_progress`
This is active-work state.
A healthy active-work state means at least one of these is true:
- there is an active run for the issue
- there is already a queued continuation wake
- there is an active one-shot monitor that will wake the assignee for a future check
- there is an open explicit recovery action for the lost execution path
An agent-owned `in_progress` issue is stalled when it has no active run, no queued continuation, and no explicit recovery surface. A still-running but silent process is not automatically stalled; it is handled by the active-run watchdog contract.
### `in_review`
This is review/approval state: execution is paused because the next move belongs to a reviewer, approver, board user, or recovery owner.
A healthy `in_review` issue has at least one valid action path:
- a typed execution-policy participant who can approve or request changes
- a pending issue-thread interaction or linked approval waiting for a named responder
- a human owner via `assigneeUserId`
- an active run or queued wake that is expected to process the review state
- an active one-shot monitor for an external service or async review loop that the assignee owns
- an open explicit recovery action for an ambiguous review handoff
Agent-assigned `in_review` with no typed participant is only healthy when one of the other paths exists. Assignment to the same agent that produced the handoff is not, by itself, a review path.
An `in_review` issue is stalled when it has no typed participant, no pending interaction or approval, no user owner, no active monitor, no active run, no queued wake, and no explicit recovery action. Paperclip should surface that state as recovery work rather than silently completing the issue or leaving blocker chains parked indefinitely.
### Issue monitors
An issue monitor is a one-shot deferred action path for agent-owned issues in `in_progress` or `in_review`.
Use a monitor when the current assignee owns a future check against an async system or external service. Examples include Greptile review loops, GitHub checks, Vercel deployments, or provider jobs where the agent should come back later and decide what happens next.
Monitor policy lives under `executionPolicy.monitor` and includes:
- `nextCheckAt`: when Paperclip should wake the assignee
- `notes`: non-secret instructions for what the assignee should check
- `serviceName`: optional non-secret external-service context
- `externalRef`: optional external-service reference input; Paperclip treats it as secret-adjacent, redacts it before persistence/visibility, and omits it from activity and wake payloads
- `timeoutAt`, `maxAttempts`, and `recoveryPolicy`: optional recovery hints for bounded waits
Monitors are not recurring intervals. When a monitor fires, Paperclip clears the scheduled monitor and queues an `issue_monitor_due` wake for the assignee. If the external service is still pending, the assignee must explicitly re-arm the monitor with a new `nextCheckAt`. If the issue moves to `done`, `cancelled`, an invalid status, or a human/unassigned owner, the monitor is cleared.
Because `serviceName` and `notes` remain visible in issue activity and wake context, operators should keep them short and non-secret. Put enough context for the assignee to know what to inspect, but do not include signed URLs, bearer tokens, customer secrets, tenant-private identifiers, or provider links with embedded credentials.
Monitor bounds are enforced. Paperclip rejects attempts to re-arm a monitor whose `timeoutAt` or `maxAttempts` is already exhausted. When a scheduled monitor reaches an exhausted bound at trigger time, Paperclip clears it and follows `recoveryPolicy`: `wake_owner` queues a bounded recovery wake for the assignee, `create_recovery_issue` opens visible issue-backed recovery work, and `escalate_to_board` records a board-visible escalation comment/activity.
Use `blocked` instead of a monitor when no Paperclip assignee owns a responsible polling path. In that case, name the external owner/action or create first-class recovery/blocker work.
### `blocked`
This is explicit waiting state.
A healthy `blocked` issue has an explicit waiting path:
- first-class blockers exist, and each unresolved leaf has a valid action path under this contract
- the issue has an explicit recovery action that itself has a live or waiting path
- the issue is waiting on a pending interaction, linked approval, human owner, or clearly named external owner/action
A blocker chain is covered only when its unresolved leaf is live or explicitly waiting. An intermediate `blocked` issue does not make the chain healthy by itself.
A `blocked` issue is stalled when the unresolved blocker leaf has no active run, queued wake, typed participant, pending interaction or approval, user owner, external owner/action, or recovery action. In that case the parent should show the first stalled leaf instead of presenting the dependency as calmly covered.
## 9. Crash and Restart Recovery
Paperclip now treats crash/restart recovery as a stranded-assigned-work problem, not just a stranded-run problem.
There are two distinct failure modes.
### 9.1 Stranded assigned `todo`
Example:
- issue is assigned to an agent
- status is `todo`
- the original wake/run died during or after dispatch
- after restart there is no queued wake and nothing picks the issue back up
Recovery rule:
- if the latest issue-linked run failed/timed out/cancelled and no live execution path remains, Paperclip queues one automatic assignment recovery wake
- if that recovery wake also finishes and the issue is still stranded, Paperclip moves the issue to `blocked` and opens or updates an explicit recovery action when a bounded owner/action is known; the visible comment is evidence, not the recovery path by itself
This is a dispatch recovery, not a continuation recovery.
### 9.2 Stranded assigned `in_progress`
Example:
- issue is assigned to an agent
- status is `in_progress`
- the live run disappeared
- after restart there is no active run and no queued continuation
Recovery rule:
- Paperclip queues one automatic continuation wake
- if that continuation wake also finishes and the issue is still stranded, Paperclip moves the issue to `blocked` and opens or updates an explicit recovery action when a bounded owner/action is known; the visible comment is evidence, not the recovery path by itself
This is an active-work continuity recovery.
### 9.3 Recovery model-profile lane
Cheap model profiles are only for status-only operational recovery overhead. Paperclip may request `modelProfile: "cheap"` for bounded recovery-owner work that updates task liveness, clears bad status, records a disposition, or asks for human/manager intervention. Those wakes must carry guard context such as `allowDeliverableWork: false`, `allowDocumentUpdates: false`, and `resumeRequiresNormalModel: true`.
Automatic retries that can continue source work must use the original/normal model lane. This includes failed source-work retries, process-loss retries, transient/scheduled retries, max-turn continuations, source-assignee continuations, assigned-todo dispatch recovery, and any run that can update repo files, issue documents, plans, work products, or attachments. When a cheap status-only recovery determines that actual work remains, it must hand back to a normal-model worker run before source work or persistent deliverable updates resume. Cheap recovery hints must be scrubbed from copied retry, resume, child, and downstream source-work contexts.
## 10. Startup and Periodic Reconciliation
Startup recovery and periodic recovery are different from normal wakeup delivery.
On startup and on the periodic recovery loop, Paperclip now does five things in sequence:
1. reap orphaned `running` runs
2. resume persisted `queued` runs
3. reconcile stranded assigned work
4. scan silent active runs, revalidate their source issues, and either fold source-resolved watchdogs or create/update explicit watchdog recovery actions
5. reconcile productivity reviews
The stranded-work pass closes the gap where issue state survives a crash but the wake/run path does not. The silent-run scan covers the separate case where a live process exists but has stopped producing observable output. The productivity-review pass is later and separate; it reviews unusual progression patterns on assigned source issues, not stale run handles after a source issue already has a valid disposition.
## 11. Silent Active-Run Watchdog
An active run can still be unhealthy even when its process is `running`. Paperclip treats prolonged output silence as a watchdog signal, not as proof that the run is failed.
The recovery service owns this contract:
- classify active-run output silence as `ok`, `suspicious`, `critical`, `snoozed`, or `not_applicable`
- collect bounded evidence from run logs, recent run events, child issues, and blockers
- preserve redaction and truncation before evidence is written to issue descriptions
- create at most one open watchdog recovery action per run; issue-backed implementations use `stale_active_run_evaluation` issues
- honor active snooze decisions before creating more review work
- build the `outputSilence` summary shown by live-run and active-run API responses
Suspicious silence creates a medium-priority watchdog recovery action for the selected recovery owner. Critical silence raises that recovery action to high priority and, when issue-backed evaluation is needed for correctness, blocks the source issue on the explicit evaluation task without cancelling the active process.
Watchdog decisions are explicit operator/recovery-owner decisions:
- `snooze` records an operator-chosen future quiet-until time and suppresses scan-created review work during that window
- `continue` records that the current evidence is acceptable, does not cancel or mutate the active run, and sets a 30-minute default re-arm window before the watchdog evaluates the still-silent run again
- `dismissed_false_positive` records why the review was not actionable
Operators should prefer `snooze` for known time-bounded quiet periods. `continue` is only a short acknowledgement of the current evidence; if the run remains silent after the re-arm window, the periodic watchdog scan can create or update review work again.
The board can record watchdog decisions. The assigned owner of an issue-backed watchdog evaluation can also record them. Other agents cannot.
### Source-aware watchdog folding
Active-run watchdog work is source-aware. Before the watchdog creates, refreshes, escalates, or blocks on reviewer work, it must re-read the linked source issue and decide whether the watchdog signal is still about productive source work or only about stale run/process bookkeeping.
Fold watchdog work when all of these are true:
- the run is linked to a source issue in the same company
- the source issue is terminal (`done` or `cancelled`)
- durable source activity from the same run proves the source issue reached that terminal disposition after the stale-run or output-silence evidence point
- there is no independent evidence that the still-running or detached process is doing harmful work, still owns external cleanup that needs an operator decision, or needs a separate security/ownership review
Folding means resolving or cancelling the watchdog recovery action or issue-backed evaluation through the explicit recovery lifecycle. It must preserve the run id, source issue, detected silence or detached-process evidence, terminal source activity, decision reason, and best-effort process cleanup result. It must be idempotent for the `(companyId, runId, sourceIssueId)` signal and must not recursively recover the watchdog evaluation issue itself.
Do not fold watchdog work only because the run is quiet. The watchdog must still create or continue reviewer work when:
- the source issue is still `todo` or `in_progress`, because productive work may still be happening or stuck
- the source issue remains `in_progress` after a successful run with no valid disposition, because the successful-run handoff path owns that bounded correction
- the run terminated or disappeared while the source issue remains `in_progress` without a live path, because stranded assigned recovery owns that continuity repair
- the source issue is terminal but there is no durable same-run terminal activity after the stale evidence point
- there is independent evidence that the process may still be mutating external state, leaking resources, crossing company or ownership boundaries, or otherwise needs operator review
In the normal non-terminal case, critical silence can still create issue-backed evaluation work and block the source issue when blocking is necessary for correctness. In the source-resolved case, a completed source issue should not acquire a new manager review or blocker merely because an old run handle stayed active; only real unresolved work should block work.
This is distinct from productivity review. Productivity review asks whether an assigned source issue has unusual progression patterns, such as no-comment terminal-run streaks, long active duration, or high churn. Source-resolved watchdog folding asks whether a stale active-run signal outlived a source issue that already reached a valid terminal disposition. One does not substitute for the other.
Detached process cleanup is operational hygiene, not source issue liveness. Cleanup should be best-effort and auditable. If cleanup fails but the source issue is already terminal with same-run durable evidence, Paperclip should preserve the cleanup failure on the run/watchdog audit trail and route only the cleanup concern to bounded recovery when a real owner/action remains.
## 12. Auto-Recover vs Explicit Recovery vs Human Escalation
Paperclip uses three different recovery outcomes, depending on how much it can safely infer.
### Auto-Recover
Auto-recovery is allowed when ownership is clear and the control plane only lost execution continuity.
Examples:
- requeue one dispatch wake for an assigned `todo` issue whose latest run failed, timed out, or was cancelled
- requeue one continuation wake for an assigned `in_progress` issue whose live execution path disappeared
- assign an orphan blocker back to its creator when that blocker is already preventing other work
Auto-recovery preserves the existing owner. It does not choose a replacement agent.
### Explicit Recovery Action
Paperclip opens an explicit recovery action when the system can identify a problem but cannot safely complete the work itself.
Examples:
- automatic stranded-work retry was already exhausted
- a dependency graph has an invalid/uninvokable owner, unassigned blocker, or invalid review participant
- an active run is silent past the watchdog threshold
The recovery action stays source-scoped by default. The source issue should show the recovery owner, cause, evidence, next action, and wake or monitor policy in its own thread/detail surface.
Create an issue-backed recovery action only when a separate issue is the right execution object. In that fallback form, the source issue remains visible and is blocked on the recovery issue when blocking is necessary for correctness. The recovery owner must restore a live path, resolve the source issue manually, delegate real follow-up work, or record the reason the signal is a false positive.
Instance-level issue-graph liveness auto-recovery is disabled by default. When enabled, its lookback window means "dependency paths updated within the last N hours"; older findings remain advisory and are counted as outside the configured lookback instead of creating recovery actions automatically. This is an operator noise control, not the older staleness delay for determining whether a chain is old enough to surface.
### Human Escalation
Human escalation is required when the next safe action depends on board judgment, budget/approval policy, or information unavailable to the control plane.
Examples:
- all candidate recovery owners are paused, terminated, pending approval, or budget-blocked
- the issue is human-owned rather than agent-owned
- the run is intentionally quiet but needs an operator decision before cancellation or continuation
In these cases Paperclip should leave a visible issue/comment trail instead of silently retrying.
## 13. What This Does Not Mean
These semantics do not change V1 into an auto-reassignment system.
Paperclip still does not:
- automatically reassign work to a different agent
- infer dependency semantics from `parentId` alone
- treat human-held work as heartbeat-managed execution
The recovery model is intentionally conservative:
- preserve ownership
- retry once when the control plane lost execution continuity
- open an explicit recovery action when the system can identify a bounded recovery owner/action
- escalate visibly when the system cannot safely keep going
## 14. Practical Interpretation
For a board operator, the intended meaning is:
- agent-owned `in_progress` should mean \"this is live work or clearly surfaced as a problem\"
- agent-owned `todo` should not stay assigned forever after a crash with no remaining wake path
- parent/sub-issue explains structure
- blockers explain waiting
That is the execution contract Paperclip should present to operators.
+4
View File
@@ -22,6 +22,7 @@ The question is not "which memory project wins?" The question is "what is the sm
### Hosted memory APIs
- `mem0`
- `AWS Bedrock AgentCore Memory`
- `supermemory`
- `Memori`
@@ -49,6 +50,7 @@ These emphasize local persistence, inspectability, and low operational overhead.
|---|---|---|---|---|
| [nuggets](https://github.com/NeoVertex1/nuggets) | local memory engine + messaging gateway | topic-scoped HRR memory with `remember`, `recall`, `forget`, fact promotion into `MEMORY.md` | good example of lightweight local memory and automatic promotion | very specific architecture; not a general multi-tenant service |
| [mem0](https://github.com/mem0ai/mem0) | hosted + OSS SDK | `add`, `search`, `getAll`, `get`, `update`, `delete`, `deleteAll`; entity partitioning via `user_id`, `agent_id`, `run_id`, `app_id` | closest to a clean provider API with identities and metadata filters | provider owns extraction heavily; Paperclip should not assume every backend behaves like mem0 |
| [AWS Bedrock AgentCore Memory](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/memory.html) | AWS-managed memory service | explicit short-term and long-term memories, actor/session/event APIs, memory strategies, namespace templates, optional self-managed extraction pipeline | strong example of provider-managed memory with clear scoped ids, retention controls, and standalone API access outside a single agent framework | AWS-hosted and IAM-centric; Paperclip would still need its own company/run/comment provenance, cost rollups, and likely a plugin wrapper instead of baking AWS semantics into core |
| [MemOS](https://github.com/MemTensor/MemOS) | memory OS / framework | unified add-retrieve-edit-delete, memory cubes, multimodal memory, tool memory, async scheduler, feedback/correction | strong source for optional capabilities beyond plain search | much broader than the minimal contract Paperclip should standardize first |
| [supermemory](https://github.com/supermemoryai/supermemory) | hosted memory + context API | `add`, `profile`, `search.memories`, `search.documents`, document upload, settings; automatic profile building and forgetting | strong example of "context bundle" rather than raw search results | heavily productized around its own ontology and hosted flow |
| [memU](https://github.com/NevaMind-AI/memU) | proactive agent memory framework | file-system metaphor, proactive loop, intent prediction, always-on companion model | good source for when memory should trigger agent behavior, not just retrieval | proactive assistant framing is broader than Paperclip's task-centric control plane |
@@ -77,6 +79,7 @@ These differences are exactly why Paperclip needs a layered contract instead of
### 1. Who owns extraction?
- `mem0`, `supermemory`, and `Memori` expect the provider to infer memories from conversations.
- `AWS Bedrock AgentCore Memory` supports both provider-managed extraction and self-managed pipelines where the host writes curated long-term memory records.
- `memsearch` expects the host to decide what markdown to write, then indexes it.
- `MemOS`, `memU`, `EverMemOS`, and `OpenViking` sit somewhere in between and often expose richer memory construction pipelines.
@@ -104,6 +107,7 @@ Paperclip should make plain search the minimum contract and richer outputs optio
### 4. Is memory synchronous or asynchronous?
- local tools often work synchronously in-process.
- `AWS Bedrock AgentCore Memory` is synchronous at the API edge, but its long-term memory path includes background extraction/indexing behavior and retention policies managed by the provider.
- larger systems add schedulers, background indexing, compaction, or sync jobs.
Paperclip needs both direct request/response operations and background maintenance hooks.
@@ -1,6 +1,6 @@
# 2026-03-14 Adapter Skill Sync Rollout
Status: Proposed
Status: Implemented for local adapters; gateway remains unsupported
Date: 2026-03-14
Audience: Product and engineering
Related:
@@ -25,8 +25,10 @@ Paperclip currently has these adapters:
- `claude_local`
- `codex_local`
- `cursor_local`
- `cursor`
- `gemini_local`
- `grok_local`
- `acpx_local`
- `opencode_local`
- `pi_local`
- `openclaw_gateway`
@@ -39,12 +41,14 @@ The current skill API supports:
Current implementation state:
- `codex_local`: implemented, `persistent`
- `codex_local`: implemented, `ephemeral`
- `claude_local`: implemented, `ephemeral`
- `cursor_local`: not yet implemented, but technically suited to `persistent`
- `gemini_local`: not yet implemented, but technically suited to `persistent`
- `pi_local`: not yet implemented, but technically suited to `persistent`
- `opencode_local`: not yet implemented; likely `persistent`, but with special handling because it currently injects into Claudes shared skills home
- `cursor`: implemented, `persistent`
- `gemini_local`: implemented, `persistent`
- `pi_local`: implemented, `persistent`
- `opencode_local`: implemented, `persistent`, with shared Claude skills home caveats
- `acpx_local`: implemented, `ephemeral` for Claude/Codex sub-agents and `unsupported` for custom commands
- `grok_local`: implemented, `ephemeral`
- `openclaw_gateway`: not yet implemented; blocked on gateway protocol support, so `unsupported` for now
## 3. Product Principles
@@ -64,8 +68,7 @@ These adapters have a stable local skills directory that Paperclip can read and
Candidates:
- `codex_local`
- `cursor_local`
- `cursor`
- `gemini_local`
- `pi_local`
- `opencode_local` with caveats
@@ -84,7 +87,10 @@ These adapters do not have a meaningful Paperclip-owned persistent install state
Current adapter:
- `codex_local`
- `claude_local`
- `acpx_local` when configured for Claude or Codex
- `grok_local`
Expected UX:
@@ -99,6 +105,7 @@ These adapters cannot support skill sync without new external capabilities.
Current adapter:
- `acpx_local` when configured for custom commands
- `openclaw_gateway`
Expected UX:
@@ -114,7 +121,7 @@ Expected UX:
Target mode:
- `persistent`
- `ephemeral`
Current state:
@@ -122,15 +129,15 @@ Current state:
Requirements to finish:
- keep as reference implementation
- tighten tests around external custom skills and stale removal
- ensure imported company skills can be attached and synced without manual path work
- keep runtime-mounted snapshots separate from persistent install snapshots
- ensure imported company skills can be attached and mounted without manual path work
- keep `CODEX_HOME/skills` mutation scoped to heartbeat execution, not `skills/sync`
Success criteria:
- list installed managed and external skills
- sync desired skills into `CODEX_HOME/skills`
- preserve external user-managed skills
- desired skills are stored in Paperclip
- selected skills are linked into the effective `CODEX_HOME/skills` during runs
- no persistent installed/stale state is reported from `skills/sync`
### 5.2 Claude Local
@@ -162,18 +169,11 @@ Target mode:
Technical basis:
- runtime already injects Paperclip skills into `~/.cursor/skills`
- Paperclip reconciles desired skills into `~/.cursor/skills`
Implementation work:
Current state:
1. Add `listSkills` for Cursor.
2. Add `syncSkills` for Cursor.
3. Reuse the same managed-symlink pattern as Codex.
4. Distinguish:
- managed Paperclip skills
- external skills already present
- missing desired skills
- stale managed skills
- implemented
Testing:
@@ -194,14 +194,11 @@ Target mode:
Technical basis:
- runtime already injects Paperclip skills into `~/.gemini/skills`
- Paperclip reconciles desired skills into `~/.gemini/skills`
Implementation work:
Current state:
1. Add `listSkills` for Gemini.
2. Add `syncSkills` for Gemini.
3. Reuse managed-symlink conventions from Codex/Cursor.
4. Verify auth remains untouched while skills are reconciled.
- implemented
Potential caveat:
@@ -219,14 +216,11 @@ Target mode:
Technical basis:
- runtime already injects Paperclip skills into `~/.pi/agent/skills`
- Paperclip reconciles desired skills into `~/.pi/agent/skills`
Implementation work:
Current state:
1. Add `listSkills` for Pi.
2. Add `syncSkills` for Pi.
3. Reuse managed-symlink helpers.
4. Verify session-file behavior remains independent from skill sync.
- implemented
Success criteria:
@@ -250,9 +244,7 @@ This is product-risky because:
Plan:
Phase 1:
- implement `listSkills` and `syncSkills`
- implemented `listSkills` and `syncSkills`
- treat it as `persistent`
- explicitly label the home as shared in UI copy
- only remove stale managed Paperclip skills that are clearly marked as Paperclip-managed
@@ -290,6 +282,30 @@ Future target:
- likely a fourth truth model eventually, such as remote-managed persistent state
- for now, keep the current API and treat gateway as unsupported
### 5.8 ACPX Local
Target mode:
- `ephemeral` for built-in Claude/Codex ACPX sub-agents
- `unsupported` for custom ACP commands
Success criteria:
- Claude/Codex ACPX snapshots show skills as configured for the next session
- custom command snapshots keep desired skills tracked only and do not imply runtime sync
### 5.9 Grok Local
Target mode:
- `ephemeral`
Success criteria:
- desired skills are stored in Paperclip
- selected skills are copied into the execution workspace for the next run
- no persistent installed/stale state is reported from `skills/sync`
## 6. API Plan
## 6.1 Keep the current minimal adapter API
@@ -333,14 +349,13 @@ Additional UI requirement for shared-home adapters:
Ship:
- `cursor_local`
- `cursor`
- `gemini_local`
- `pi_local`
Rationale:
Status:
- these are the closest to Codex in architecture
- they already inject into stable local skill homes
- implemented
### Phase 2: OpenCode shared-home support
@@ -348,10 +363,9 @@ Ship:
- `opencode_local`
Rationale:
Status:
- technically feasible now
- needs slightly more careful product language because of the shared Claude skills home
- implemented with shared Claude skills-home warning
### Phase 3: Gateway support decision
@@ -390,10 +404,10 @@ Adapter-wide skill support is ready when all are true:
The recommended immediate order is:
1. `cursor_local`
1. `cursor`
2. `gemini_local`
3. `pi_local`
4. `opencode_local`
5. defer `openclaw_gateway`
That gets Paperclip from “skills work for Codex and Claude” to “skills work for the whole local-adapter family,” which is the meaningful V1 milestone.
The local-adapter family now has explicit truth models. The remaining V1 boundary is `openclaw_gateway`, which should stay unsupported until the gateway protocol can report real remote skill state.
@@ -7,10 +7,10 @@ Define a Paperclip memory service and surface API that can sit above multiple me
- company scoping
- auditability
- provenance back to Paperclip work objects
- budget / cost visibility
- budget and cost visibility
- plugin-first extensibility
This plan is based on the external landscape summarized in `doc/memory-landscape.md` and on the current Paperclip architecture in:
This plan is based on the external landscape summarized in `doc/memory-landscape.md`, the AWS AgentCore comparison captured in [PAP-1274](/PAP/issues/PAP-1274), and the current Paperclip architecture in:
- `doc/SPEC-implementation.md`
- `doc/plugins/PLUGIN_SPEC.md`
@@ -19,23 +19,26 @@ This plan is based on the external landscape summarized in `doc/memory-landscape
## Recommendation In One Sentence
Paperclip should not embed one opinionated memory engine into core. It should add a company-scoped memory control plane with a small normalized adapter contract, then let built-ins and plugins implement the provider-specific behavior.
Paperclip should add a company-scoped memory control plane with company default plus agent override resolution, shared hook delivery, and full operation attribution, while leaving extraction and storage semantics to built-ins and plugins.
## Product Decisions
### 1. Memory is company-scoped by default
### 1. Memory resolution is company default plus agent override
Every memory binding belongs to exactly one company.
That binding can then be:
Resolution order in V1:
- the company default
- an agent override
- a project override later if we need it
- company default binding
- optional per-agent override
There is no per-project override in V1.
Project context can still appear in scope and provenance so providers can use it for retrieval and partitioning, but projects do not participate in binding selection.
No cross-company memory sharing in the initial design.
### 2. Providers are selected by key
### 2. Providers are selected by stable binding key
Each configured memory provider gets a stable key inside a company, for example:
@@ -44,36 +47,53 @@ Each configured memory provider gets a stable key inside a company, for example:
- `local-markdown`
- `research-kb`
Agents and services resolve the active provider by key, not by hard-coded vendor logic.
Agents, tools, and background hooks resolve the active provider by key, not by hard-coded vendor logic.
### 3. Plugins are the primary provider path
Built-ins are useful for a zero-config local path, but most providers should arrive through the existing Paperclip plugin runtime.
That keeps the core small and matches the current direction that optional knowledge-like systems live at the edges.
That keeps the core small and matches the broader Paperclip direction that specialized knowledge systems live at the edges.
### 4. Paperclip owns routing, provenance, and accounting
### 4. Paperclip owns routing, provenance, and policy
Providers should not decide how Paperclip entities map to governance.
Paperclip core should own:
- binding resolution
- who is allowed to call a memory operation
- which company / agent / project scope is active
- what issue / run / comment / document the operation belongs to
- how usage gets recorded
- which company, agent, issue, project, run, and subject scope is active
- what source object the operation belongs to
- how usage and costs are attributed
- how operators inspect what happened
### 5. Automatic memory should be narrow at first
### 5. Paperclip exposes shared hooks, providers own extraction
Paperclip should emit a common set of memory hooks that built-ins, third-party adapters, and plugins can all use.
Those hooks should pass structured Paperclip source objects plus normalized metadata. The provider then decides how to extract from those objects.
Paperclip should not force one extraction pipeline or one canonical "memory text" transform before the provider sees the input.
### 6. Automatic memory should start narrow, but the hook surface should be general
Automatic capture is useful, but broad silent capture is dangerous.
Initial automatic hooks should be:
Initial built-in automatic hooks should be:
- pre-run hydrate for agent context recall
- post-run capture from agent runs
- issue comment / document capture when the binding enables it
- pre-run recall for agent context hydration
- optional issue comment capture
- optional issue document capture
Everything else should start explicit.
The hook registry itself should be general enough that other providers can subscribe to the same events without core changes.
### 7. No approval gate for binding changes in the open-source product
For the open-source version, changing memory bindings should not require approvals.
Paperclip should still log those changes in activity and preserve full auditability. Approval-gated memory governance can remain an enterprise or future policy layer.
## Proposed Concepts
@@ -83,7 +103,7 @@ A built-in or plugin-supplied implementation that stores and retrieves memory.
Examples:
- local markdown + vector index
- local markdown plus semantic index
- mem0 adapter
- supermemory adapter
- MemOS adapter
@@ -94,6 +114,15 @@ A company-scoped configuration record that points to a provider and carries prov
This is the object selected by key.
### Memory binding target
A mapping from a Paperclip target to a binding.
V1 targets:
- `company`
- `agent`
### Memory scope
The normalized Paperclip scope passed into a provider request.
@@ -105,7 +134,9 @@ At minimum:
- optional `projectId`
- optional `issueId`
- optional `runId`
- optional `subjectId` for external/user identity
- optional `subjectId` for external or user identity
- optional `sessionKey` for providers that organize memory around sessions
- optional `namespace` for providers that need an explicit partition hint
### Memory source reference
@@ -121,24 +152,36 @@ Supported source kinds should include:
- `manual_note`
- `external_document`
### Memory hook
A normalized trigger emitted by Paperclip when something memory-relevant happens.
Initial hook kinds:
- `pre_run_hydrate`
- `post_run_capture`
- `issue_comment_capture`
- `issue_document_capture`
- `manual_capture`
### Memory operation
A normalized write, query, browse, or delete action performed through Paperclip.
A normalized capture, record-write, query, browse, get, correction, or delete action performed through Paperclip.
Paperclip should log every operation, whether the provider is local or external.
Paperclip should log every memory operation whether the provider is local, plugin-backed, or external.
## Required Adapter Contract
The required core should be small enough to fit `memsearch`, `mem0`, `Memori`, `MemOS`, or `OpenViking`.
The required core should be small enough to fit `memsearch`, `mem0`, `Memori`, `MemOS`, or `OpenViking`, but strong enough to satisfy Paperclip's attribution and inspectability requirements.
```ts
export interface MemoryAdapterCapabilities {
profile?: boolean;
browse?: boolean;
correction?: boolean;
asyncIngestion?: boolean;
multimodal?: boolean;
providerManagedExtraction?: boolean;
asyncExtraction?: boolean;
providerNativeBrowse?: boolean;
}
export interface MemoryScope {
@@ -148,6 +191,8 @@ export interface MemoryScope {
issueId?: string;
runId?: string;
subjectId?: string;
sessionKey?: string;
namespace?: string;
}
export interface MemorySourceRef {
@@ -168,10 +213,34 @@ export interface MemorySourceRef {
externalRef?: string;
}
export interface MemoryHookContext {
hookKind:
| "pre_run_hydrate"
| "post_run_capture"
| "issue_comment_capture"
| "issue_document_capture"
| "manual_capture";
hookId: string;
triggeredAt: string;
actorAgentId?: string;
heartbeatRunId?: string;
}
export interface MemorySourcePayload {
text?: string;
mimeType?: string;
metadata?: Record<string, unknown>;
object?: Record<string, unknown>;
}
export interface MemoryUsage {
provider: string;
biller?: string;
model?: string;
billingType?: "metered_api" | "subscription_included" | "subscription_overage" | "unknown";
attributionMode?: "billed_directly" | "included_in_run" | "external_invoice" | "untracked";
inputTokens?: number;
cachedInputTokens?: number;
outputTokens?: number;
embeddingTokens?: number;
costCents?: number;
@@ -179,20 +248,32 @@ export interface MemoryUsage {
details?: Record<string, unknown>;
}
export interface MemoryWriteRequest {
bindingKey: string;
scope: MemoryScope;
source: MemorySourceRef;
content: string;
metadata?: Record<string, unknown>;
mode?: "append" | "upsert" | "summarize";
}
export interface MemoryRecordHandle {
providerKey: string;
providerRecordId: string;
}
export interface MemoryCaptureRequest {
bindingKey: string;
scope: MemoryScope;
source: MemorySourceRef;
payload: MemorySourcePayload;
hook?: MemoryHookContext;
mode?: "capture_residue" | "capture_record";
metadata?: Record<string, unknown>;
}
export interface MemoryRecordWriteRequest {
bindingKey: string;
scope: MemoryScope;
source?: MemorySourceRef;
records: Array<{
text: string;
summary?: string;
metadata?: Record<string, unknown>;
}>;
}
export interface MemoryQueryRequest {
bindingKey: string;
scope: MemoryScope;
@@ -202,6 +283,14 @@ export interface MemoryQueryRequest {
metadataFilter?: Record<string, unknown>;
}
export interface MemoryListRequest {
bindingKey: string;
scope: MemoryScope;
cursor?: string;
limit?: number;
metadataFilter?: Record<string, unknown>;
}
export interface MemorySnippet {
handle: MemoryRecordHandle;
text: string;
@@ -217,30 +306,149 @@ export interface MemoryContextBundle {
usage?: MemoryUsage[];
}
export interface MemoryListPage {
items: MemorySnippet[];
nextCursor?: string;
usage?: MemoryUsage[];
}
export interface MemoryExtractionJob {
providerJobId: string;
status: "queued" | "running" | "succeeded" | "failed" | "cancelled";
hookKind?: MemoryHookContext["hookKind"];
source?: MemorySourceRef;
error?: string;
submittedAt?: string;
startedAt?: string;
finishedAt?: string;
}
export interface MemoryAdapter {
key: string;
capabilities: MemoryAdapterCapabilities;
write(req: MemoryWriteRequest): Promise<{
capture(req: MemoryCaptureRequest): Promise<{
records?: MemoryRecordHandle[];
jobs?: MemoryExtractionJob[];
usage?: MemoryUsage[];
}>;
upsertRecords(req: MemoryRecordWriteRequest): Promise<{
records?: MemoryRecordHandle[];
usage?: MemoryUsage[];
}>;
query(req: MemoryQueryRequest): Promise<MemoryContextBundle>;
list(req: MemoryListRequest): Promise<MemoryListPage>;
get(handle: MemoryRecordHandle, scope: MemoryScope): Promise<MemorySnippet | null>;
forget(handles: MemoryRecordHandle[], scope: MemoryScope): Promise<{ usage?: MemoryUsage[] }>;
}
```
This contract intentionally does not force a provider to expose its internal graph, filesystem, or ontology.
This contract intentionally does not force a provider to expose its internal graph, file tree, or ontology. It does require enough structure for Paperclip to browse, attribute, and audit what happened.
## Optional Adapter Surfaces
These should be capability-gated, not required:
- `browse(scope, filters)` for file-system / graph / timeline inspection
- `correct(handle, patch)` for natural-language correction flows
- `profile(scope)` when the provider can synthesize stable preferences or summaries
- `sync(source)` for connectors or background ingestion
- `listExtractionJobs(scope, cursor)` when async extraction needs richer operator visibility
- `retryExtractionJob(jobId)` when a provider supports re-drive
- `explain(queryResult)` for providers that can expose retrieval traces
- provider-native browse or graph surfaces exposed through plugin UI
## Lessons From AWS AgentCore Memory API
AWS AgentCore Memory is a useful check on whether this plan is too abstract or missing important operational surfaces.
The broad direction still looks right:
- AWS splits memory into a control plane (`CreateMemory`, `UpdateMemory`, `ListMemories`) and a data plane (`CreateEvent`, `RetrieveMemoryRecords`, `GetMemoryRecord`, `ListMemoryRecords`)
- AWS separates raw interaction capture from curated long-term memory records
- AWS supports both provider-managed extraction and self-managed pipelines
- AWS treats browse and list operations as first-class APIs, not ad hoc debugging helpers
- AWS exposes extraction jobs instead of hiding asynchronous maintenance completely
That lines up with the Paperclip plan at a high level: provider configuration, scoped writes, scoped retrieval, provider-managed extraction as a capability, and a browse and inspect surface.
The concrete changes Paperclip should take from AWS are:
### 1. Keep config APIs separate from runtime traffic
The rollout should preserve a clean separation between:
- control-plane APIs for binding CRUD, defaults, overrides, and capability metadata
- runtime APIs and tools for capture, record writes, query, list, get, forget, and extraction status
This keeps governance changes distinct from high-volume memory traffic.
### 2. Distinguish capture from curated record writes
AWS does not flatten everything into one write primitive. It distinguishes captured events from durable memory records.
Paperclip should do the same:
- `capture(...)` for raw run, comment, document, or activity residue
- `upsertRecords(...)` for curated durable facts and notes
That is a better fit for provider-managed extraction and for manual curation flows.
### 3. Make list and browse first-class
AWS exposes list and retrieve surfaces directly. Paperclip should not make browse optional at the portable layer.
The minimum portable surface should include:
- `query`
- `list`
- `get`
Provider-native graph or file browsing can remain optional beyond that.
### 4. Add pagination and cursors for operator inspection
AWS consistently uses pagination on browse-heavy APIs.
Paperclip should add cursor-based pagination to:
- record listing
- extraction job listing
- memory operation explorer APIs
Prompt hydration can continue to use `topK`, but operator surfaces need cursors.
### 5. Add explicit session and namespace hints
AWS uses `actorId`, `sessionId`, `namespace`, and `memoryStrategyId` heavily.
Paperclip should keep its own control-plane-centric model, but the adapter contract needs obvious places to map those concepts:
- `sessionKey`
- `namespace`
The provider adapter can map them to AWS or other vendor-specific identifiers without leaking those identifiers into core.
### 6. Treat asynchronous extraction as a real operational surface
AWS exposes extraction jobs explicitly. Paperclip should too.
Operators should be able to see:
- pending extraction work
- failed extraction work
- which hook or source caused the work
- whether a retry is available
### 7. Keep Paperclip provenance primary
Paperclip should continue to center:
- `companyId`
- `agentId`
- `projectId`
- `issueId`
- `runId`
- issue comments, documents, and activity as sources
The lesson from AWS is to support clean mapping into provider-specific models, not to let provider identifiers take over the core product model.
## What Paperclip Should Persist
@@ -248,39 +456,67 @@ Paperclip should not mirror the full provider memory corpus into Postgres unless
Paperclip core should persist:
- memory bindings and overrides
- memory bindings
- company default and agent override resolution targets
- provider keys and capability metadata
- normalized memory operation logs
- provider record handles returned by operations when available
- source references back to issue comments, documents, runs, and activity
- usage and cost data
- provider record handles returned by operations when available
- hook delivery records and extraction job state
- usage and cost attribution
For external providers, the memory payload itself can remain in the provider.
For external providers, the actual memory payload can remain in the provider.
## Hook Model
### Automatic hooks
### Shared hook surface
Paperclip should expose one shared hook system for memory.
That same system must be available to:
- built-in memory providers
- plugin-based memory providers
- third-party adapter integrations that want to use memory hooks
### What a hook delivers
Each hook delivery should include:
- resolved binding key
- normalized `MemoryScope`
- `MemorySourceRef`
- structured source payload
- hook metadata such as hook kind, trigger time, and related run id
The payload should include structured objects where possible so the provider can decide how to extract and chunk.
### Initial automatic hooks
These should be low-risk and easy to reason about:
1. `pre-run hydrate`
1. `pre_run_hydrate`
Before an agent run starts, Paperclip may call `query(... intent = "agent_preamble")` using the active binding.
2. `post-run capture`
After a run finishes, Paperclip may write a summary or transcript-derived note tied to the run.
2. `post_run_capture`
After a run finishes, Paperclip may call `capture(...)` with structured run output, excerpts, and provenance.
3. `issue comment / document capture`
When enabled on the binding, Paperclip may capture selected issue comments or issue documents as memory sources.
3. `issue_comment_capture`
When enabled on the binding, Paperclip may call `capture(...)` for selected issue comments.
### Explicit hooks
4. `issue_document_capture`
When enabled on the binding, Paperclip may call `capture(...)` for selected issue documents.
These should be tool- or UI-driven first:
### Explicit tools and APIs
These should be tool-driven or UI-driven first:
- `memory.search`
- `memory.note`
- `memory.forget`
- `memory.correct`
- `memory.browse`
- memory record list and get
- extraction-job inspection
### Not automatic in the first version
@@ -309,34 +545,69 @@ The initial browse surface should support:
- active binding by company and agent
- recent memory operations
- recent write sources
- recent write and capture sources
- record list and record detail with source backlinks
- query results with source backlinks
- filters by agent, issue, run, source kind, and date
- provider usage / cost / latency summaries
- extraction job status
- filters by agent, issue, project, run, source kind, and date
- provider usage, cost, and latency summaries
When a provider supports richer browsing, the plugin can add deeper views through the existing plugin UI surfaces.
## Cost And Evaluation
Every adapter response should be able to return usage records.
Paperclip should treat memory accounting as two related but distinct concerns:
Paperclip should roll up:
### 1. `memory_operations` is the authoritative audit trail
- memory inference tokens
- embedding tokens
- external provider cost
Every memory action should create a normalized operation record that captures:
- binding
- scope
- source provenance
- operation type
- success or failure
- latency
- query count
- write count
- usage details reported by the provider
- attribution mode
- related run, issue, and agent when available
It should also record evaluation-oriented metrics where possible:
This is where operators answer "what memory work happened and why?"
### 2. `cost_events` remains the canonical spend ledger for billable metered usage
The current `cost_events` model is already the canonical cost ledger for token and model spend, and `agent_runtime_state` plus `heartbeat_runs.usageJson` already roll up and summarize run usage.
The recommendation is:
- if a memory operation runs inside a normal Paperclip agent heartbeat and the model usage is already counted on that run, do not create a duplicate `cost_event`
- instead, store the memory operation with `attributionMode = "included_in_run"` and link it to the related `heartbeatRunId`
- if a memory provider makes a direct metered model call outside the agent run accounting path, the provider must report usage and Paperclip should create a `cost_event`
- that direct `cost_event` should still link back to the memory operation, agent, company, and issue or run context when possible
### 3. `finance_events` should carry flat subscription or invoice-style costs
If a memory service incurs:
- monthly subscription cost
- storage invoices
- provider platform charges not tied to one request
those should be represented as `finance_events`, not as synthetic per-query memory operations.
That keeps usage telemetry separate from accounting entries like invoices and flat fees.
### 4. Evaluation metrics still matter
Paperclip should record evaluation-oriented metrics where possible:
- recall hit rate
- empty query rate
- manual correction count
- per-binding success / failure counts
- extraction failure count
- per-binding success and failure counts
This is important because a memory system that "works" but silently burns budget is not acceptable in Paperclip.
This is important because a memory system that "works" but silently burns budget or silently fails extraction is not acceptable in Paperclip.
## Suggested Data Model Additions
@@ -344,23 +615,36 @@ At the control-plane level, the likely new core tables are:
- `memory_bindings`
- company-scoped key
- provider id / plugin id
- provider id or plugin id
- config blob
- enabled status
- `memory_binding_targets`
- target type (`company`, `agent`, later `project`)
- target type (`company`, `agent`)
- target id
- binding id
- `memory_operations`
- company id
- binding id
- operation type (`write`, `query`, `forget`, `browse`, `correct`)
- operation type (`capture`, `record_upsert`, `query`, `list`, `get`, `forget`, `correct`)
- scope fields
- source refs
- usage / latency / cost
- success / error
- usage, latency, and attribution mode
- related heartbeat run id
- related cost event id
- success or error
- `memory_extraction_jobs`
- company id
- binding id
- operation id
- provider job id
- hook kind
- status
- source refs
- error
- submitted, started, and finished timestamps
Provider-specific long-form state should stay in plugin state or the provider itself unless a built-in local provider needs its own schema.
@@ -382,45 +666,46 @@ The design should still treat that built-in as just another provider behind the
### Phase 1: Control-plane contract
- add memory binding models and API types
- add plugin capability / registration surface for memory providers
- add operation logging and usage reporting
- add company default plus agent override resolution
- add plugin capability and registration surface for memory providers
### Phase 2: One built-in + one plugin example
### Phase 2: Hook delivery and operation audit
- add shared memory hook emission in core
- add operation logging, extraction job state, and usage attribution
- add direct-provider cost and finance-event linkage rules
### Phase 3: One built-in plus one plugin example
- ship a local markdown-first provider
- ship one hosted adapter example to validate the external-provider path
### Phase 3: UI inspection
### Phase 4: UI inspection
- add company / agent memory settings
- add company and agent memory settings
- add a memory operation explorer
- add record list and detail surfaces
- add source backlinks to issues and runs
### Phase 4: Automatic hooks
- pre-run hydrate
- post-run capture
- selected issue comment / document capture
### Phase 5: Rich capabilities
- correction flows
- provider-native browse / graph views
- project-level overrides if needed
- provider-native browse or graph views
- evaluation dashboards
- retention and quota controls
## Open Questions
## Remaining Open Questions
- Should project overrides exist in V1 of the memory service, or should we force company default + agent override first?
- Do we want Paperclip-managed extraction pipelines at all, or should built-ins be the only place where Paperclip owns extraction?
- Should memory usage extend the current `cost_events` model directly, or should memory operations keep a parallel usage log and roll up into `cost_events` secondarily?
- Do we want provider install / binding changes to require approvals for some companies?
- Which built-in local provider should ship first: pure markdown, markdown plus embeddings, or a lightweight local vector store?
- How much source payload should Paperclip snapshot inside `memory_operations` for debugging without duplicating large transcripts?
- Should correction flows mutate provider state directly, create superseding records, or both depending on provider capability?
- What default retention and size limits should the local built-in enforce?
## Bottom Line
The right abstraction is:
- Paperclip owns memory bindings, scopes, provenance, governance, and usage reporting.
- Paperclip owns bindings, resolution, hooks, provenance, policy, and attribution.
- Providers own extraction, ranking, storage, and provider-native memory semantics.
That gives Paperclip a stable "memory service" without locking the product to one memory philosophy or one vendor.
That gives Paperclip a stable memory service without locking the product to one memory philosophy or one vendor, and it integrates the AWS lessons without importing AWS's model into core.
@@ -0,0 +1,382 @@
# VS Code Task Interoperability Plan
Status: planning only, no code changes
Date: 2026-04-12
Related issue: `PAP-1377`
## Summary
Paperclip should not replace its workspace runtime service model with VS Code tasks.
It should add a narrow interoperability layer that can discover and adopt supported entries from `.vscode/tasks.json`.
The core product model should stay:
- Paperclip owns long-running workspace services and their desired state
- Paperclip shows operators exactly which named thing they are starting or stopping
- Paperclip distinguishes long-running services from one-shot jobs
VS Code tasks should be treated as:
- an import/discovery format for workspace commands
- a convenience for repos that already maintain `tasks.json`
- a partial compatibility layer, not a full execution model
## Current State
The current implementation is already service-oriented:
- project workspaces and execution workspaces can store `workspaceRuntime` config plus `desiredState` and per-service `serviceStates`
- the UI renders one control row per configured service and persists start/stop intent
- the backend supervises long-running local processes, reuses eligible services, and restores desired services on startup
Relevant files:
- `packages/shared/src/types/workspace-runtime.ts`
- `server/src/services/workspace-runtime.ts`
- `server/src/services/project-workspace-runtime-config.ts`
- `ui/src/components/WorkspaceRuntimeControls.tsx`
- `ui/src/pages/ProjectWorkspaceDetail.tsx`
- `ui/src/pages/ExecutionWorkspaceDetail.tsx`
This is directionally correct for Paperclip because it gives the control plane an explicit model for service lifecycle, health, reuse, and restart behavior.
## Problem To Solve
The current UX is still too raw:
- operators have to hand-author runtime JSON
- a workspace can have multiple attached services, but the higher-level intent is not obvious
- start/stop controls are visible in multiple places, which makes it easy to lose track of what is being controlled
- there is no interoperability with repos that already define useful local workflows in `.vscode/tasks.json`
The issue is not that services are the wrong abstraction.
The issue is that the configuration surface is too low-level and Paperclip does not yet leverage existing workspace metadata.
## Recommendation
Keep Paperclip runtime services as the source of truth for service supervision.
Add a new workspace command model above the raw JSON layer, with VS Code task discovery as one input.
The product model should become:
1. `Workspace command`
A named runnable thing attached to a workspace.
2. `Workspace service`
A workspace command that is expected to stay alive and be supervised.
3. `Workspace job`
A workspace command that runs once and exits.
4. `Runtime service instance`
The live process record that already exists today in Paperclip.
In that model, VS Code tasks are a way to populate workspace commands.
Only commands that map cleanly to Paperclip service or job semantics should become runnable in Paperclip.
## Why Not Fully Adopt VS Code Tasks
VS Code tasks are broader than Paperclip runtime services.
They include shell/process tasks, compound tasks, background/watch tasks, presentation settings, extension/task-provider types, variable substitution, and problem-matcher-driven lifecycle.
That creates a bad fit if Paperclip tries to use `tasks.json` as its only runtime model:
- many tasks are one-shot jobs, not long-running services
- some tasks depend on VS Code task providers or editor-only variable resolution
- compound task graphs are useful, but they are not the same thing as a supervised service
- problem matcher readiness is useful metadata, but it is not enough to replace Paperclip's persisted service lifecycle model
The right boundary is interoperability, not replacement.
## Interoperability Contract
Paperclip should support a conservative subset of VS Code tasks and clearly mark unsupported entries.
### Supported in phase 1
- `shell` and `process` tasks with a concrete command Paperclip can resolve
- optional task `options.cwd`
- optional task environment values that can be flattened safely
- task labels and detail text for naming and display
- `dependsOn` for import-time expansion or display-only dependency hints
- background/watch-oriented tasks that can reasonably be treated as long-running services
### Maybe supported in later phases
- grouping and default task metadata for better UX
- selected variable substitution when Paperclip can resolve it safely from workspace context
- mapping task metadata into Paperclip readiness/expose hints
- limited compound-task launch flows
### Not supported initially
- extension-provided task types Paperclip cannot execute directly
- arbitrary VS Code variable substitution semantics
- problem matcher parsing as the main source of service health
- full parity with VS Code task execution behavior
## Long-Running Service Detection
Paperclip needs an explicit classification layer instead of assuming every VS Code task is a service.
Recommended classification:
- `service`
Explicitly marked by Paperclip metadata, or confidently inferred from background/watch task semantics
- `job`
One-shot command expected to exit
- `unsupported`
Present in `tasks.json`, but not safely runnable by Paperclip
The important product decision is that service classification must be visible and editable by the operator.
Inference can help, but it should not be the only source of truth.
## Proposed Product Shape
### 1. Replace raw-first editing with command-first editing
Project and execution workspace pages should stop making raw runtime JSON the primary editing surface.
Default UI should show:
- workspace commands
- command type: service or job
- source: Paperclip or VS Code
- exact command and cwd
- current state for services
- explicit start, stop, restart, and run-now actions
Raw JSON should remain available behind an advanced section.
### 2. Add VS Code task discovery on workspaces
For a workspace with `cwd`, Paperclip should look for `.vscode/tasks.json`.
The workspace UI should show:
- whether a `tasks.json` file was found
- last parse time
- supported commands discovered
- unsupported tasks with reasons
- whether commands are inherited into execution workspaces
### 3. Make the controlled thing explicit
Start and stop UI should always name the exact entry being controlled.
Examples:
- `Start web`
- `Stop api`
- `Run db:migrate`
Avoid generic workspace-level labels when multiple commands exist.
### 4. Separate services from jobs in the UI
Do not mix one-shot jobs and long-running services into one undifferentiated list.
Recommended sections:
- `Services`
- `Jobs`
- `Unsupported imported tasks`
That resolves the ambiguity called out in the issue.
## Data Model Direction
Do not replace `workspaceRuntime` immediately.
Instead add a higher-level representation that can compile down to the existing runtime-service machinery.
Suggested workspace metadata shape:
```ts
type WorkspaceCommandSource =
| { type: "paperclip" }
| { type: "vscode_task"; taskLabel: string; taskPath: ".vscode/tasks.json" };
type WorkspaceCommandKind = "service" | "job";
type WorkspaceCommandDefinition = {
id: string;
name: string;
kind: WorkspaceCommandKind;
source: WorkspaceCommandSource;
command: string | null;
cwd: string | null;
env?: Record<string, string> | null;
autoStart?: boolean;
serviceConfig?: {
lifecycle?: "shared" | "ephemeral";
reuseScope?: "project_workspace" | "execution_workspace" | "run";
readiness?: Record<string, unknown> | null;
expose?: Record<string, unknown> | null;
} | null;
importWarnings?: string[];
disabledReason?: string | null;
};
```
`workspaceRuntime` can then become a derived or advanced representation for service-type commands until the rest of the system is migrated.
## VS Code Mapping Rules
Paperclip should map imported tasks with explicit, documented rules.
Recommended rules:
1. A task becomes a `job` by default.
2. A task becomes a `service` only when:
- Paperclip metadata marks it as a service, or
- the task clearly represents a background/watch process and the operator confirms the classification.
3. Unsupported tasks stay visible but disabled.
4. Task labels become default command names.
5. `dependsOn` is preserved as metadata, not silently flattened into hidden behavior.
Paperclip-specific metadata can live in a namespaced field on the imported task definition, for example:
```json
{
"label": "web",
"type": "shell",
"command": "pnpm dev",
"isBackground": true,
"paperclip": {
"kind": "service",
"readiness": {
"type": "http",
"urlTemplate": "http://127.0.0.1:${port}"
},
"expose": {
"type": "url",
"urlTemplate": "http://127.0.0.1:${port}"
}
}
}
```
That gives us interoperability without depending on VS Code-only semantics for service readiness and exposure.
## Execution Policy
Project workspaces should be the main place where imported commands are discovered and curated.
Execution workspaces should inherit that curated command set by default, with optional issue-level overrides.
Recommended precedence:
1. execution workspace override
2. project workspace command set
3. imported VS Code tasks from the linked workspace
4. advanced raw runtime fallback
This matches the existing direction in `doc/plans/2026-03-10-workspace-strategy-and-git-worktrees.md`.
## Implementation Plan
### Phase 1: Discovery and read-only visibility
Goal:
show imported VS Code tasks in the workspace UI without changing runtime behavior.
Work:
- parse `.vscode/tasks.json` for project workspaces with local `cwd`
- derive a list of candidate commands plus unsupported items
- show source, label, command, cwd, and classification
- show parse warnings and unsupported reasons
Success condition:
an operator can see what Paperclip would import and why.
### Phase 2: Command model and explicit classification
Goal:
introduce a first-class workspace command layer above raw runtime JSON.
Work:
- add a persisted command definition model in workspace metadata or a dedicated table
- allow operator edits to imported command classification
- separate `service` and `job` in UI
- keep existing runtime-service storage for live supervised processes
Success condition:
the workspace UI is command-first, and raw runtime JSON is advanced-only.
### Phase 3: Service execution backed by existing runtime supervisor
Goal:
run supported imported service commands through the current Paperclip supervisor.
Work:
- compile service commands into the existing runtime service start/stop path
- persist desired state per named command
- keep startup restoration behavior for service commands
- make the active command name explicit everywhere control actions appear
Success condition:
imported service commands behave like native Paperclip services once adopted.
### Phase 4: Job execution and optional dependency handling
Goal:
support one-shot imported commands without pretending they are services.
Work:
- add `Run` actions for jobs
- record output in workspace operations
- optionally support simple `dependsOn` execution for jobs with clear logging
Success condition:
one-shot tasks are runnable, but they are not mixed into the service lifecycle model.
### Phase 5: Adapter and execution workspace integration
Goal:
let agents and issue-scoped workspaces consume the curated command model consistently.
Work:
- expose inherited workspace commands to execution workspaces
- allow issue-level selection of a default service command when relevant
- make service selection explicit in issue and workspace views
Success condition:
agents, operators, and workspaces all refer to the same named commands.
## Non-Goals
- full VS Code task-runner parity
- support for every VS Code task type
- removal of Paperclip's own runtime supervision model
- editor-dependent execution semantics inside the control plane
## Risks
- overfitting Paperclip to VS Code and making the model worse for non-VS-Code repos
- misclassifying watch tasks as durable services
- hiding too much detail and making debugging harder
- allowing imported task graphs to become implicit magic
These risks are manageable if the import layer stays explicit, conservative, and operator-editable.
## Decision
Paperclip should adopt VS Code tasks as an optional workspace command source, not as the canonical runtime model.
The main UX change should be:
- move from raw runtime JSON to named workspace commands
- separate services from jobs
- make the exact controlled command explicit
- let `.vscode/tasks.json` pre-populate those commands when available
## External References
- VS Code tasks documentation: https://code.visualstudio.com/docs/debugtest/tasks
- Existing Paperclip workspace plan: `doc/plans/2026-03-10-workspace-strategy-and-git-worktrees.md`
@@ -0,0 +1,86 @@
# Plugin Secret Refs: Company Scope Reintroduction Plan
Date: 2026-04-26
Status: follow-up after fail-closed mitigation
Related issue: PAP-2394
## Current state
`PAP-2394` now fails closed:
- `POST /api/plugins/:pluginId/config` rejects any config containing plugin secret refs.
- `ctx.secrets.resolve()` is disabled for plugin workers.
This removes the release-blocking cross-company exposure path, but it also disables plugin secret-ref support until the runtime carries company scope end to end.
## Vulnerability summary
The original design mixed an instance-global config store with company-scoped secret bindings:
- [server/src/routes/plugins.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/server/src/routes/plugins.ts:1898) saved one global plugin config row, then wrote bindings into `company_secret_bindings` grouped by each referenced secret's owning company.
- [packages/db/src/schema/plugin_config.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/packages/db/src/schema/plugin_config.ts:15) stored one config row per plugin, with no company dimension.
- [packages/db/src/schema/company_secret_bindings.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/packages/db/src/schema/company_secret_bindings.ts:5) already modeled bindings as company-scoped.
- [server/src/services/plugin-secrets-handler.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/server/src/services/plugin-secrets-handler.ts:212) resolved by `pluginId` + secret UUID, with no active company context from the bridge call.
- [packages/plugins/sdk/src/worker-rpc-host.ts](/Users/dotta/paperclip/.paperclip/worktrees/PAP-2339-secrets-make-a-plan/packages/plugins/sdk/src/worker-rpc-host.ts:384) exposed `ctx.config.get()` and `ctx.secrets.resolve()` without a company parameter.
This violated Least Privilege, Complete Mediation, and Secure Defaults.
## Recommended end state
Re-enable plugin secret refs only after both of these are true:
1. Plugin config reads/writes are company-scoped.
2. Runtime secret resolution carries explicit company context and enforces it at resolution time.
## Implementation plan
### 1. Make plugin config company-scoped
- Add `company_id` to `plugin_config`, with a unique index on `(plugin_id, company_id)`.
- Update registry helpers to require `companyId` for `getConfig`, `upsertConfig`, `patchConfig`, and `deleteConfig`.
- Update plugin config routes to require `companyId` and call `assertCompanyAccess(req, companyId)`.
- Keep instance-global plugin lifecycle state separate from company-scoped plugin config.
### 2. Propagate company context through the worker runtime
- Extend the SDK so `ctx.config.get()` and `ctx.secrets.resolve()` can receive or derive `companyId`.
- Introduce worker request context storage for handlers that already run with company scope:
- `getData`
- `performAction`
- scoped API routes
- tool executions
- environment driver calls
- Fail closed when plugin code tries to read company-scoped config or secrets outside an active company context.
### 3. Rebind secrets by `(companyId, pluginId, configPath)`
- On config save, validate every referenced secret belongs to the authorized company.
- Store bindings only for that company.
- Resolve secrets only by the current company-scoped binding, never by bare plugin ID plus UUID.
- Treat stale bindings as invalid and remove them on config replacement.
### 4. Prevent cross-company config disclosure
- When returning config to the UI, only materialize the selected company's secret refs.
- Never expose another company's secret UUIDs through the global plugin config surface.
## Required regression coverage
- Company A board user cannot save plugin config that references a Company B secret.
- Company A plugin execution cannot resolve a Company B secret even if the same plugin is configured for Company B.
- Company-scoped config reads only return the selected company's secret bindings.
- Config replacement removes stale bindings for the same `(companyId, pluginId)` target.
- Runtime calls without company context fail closed.
## Migration notes
- Existing `plugin_config` rows need a migration strategy before re-enable.
- Safest default: do not auto-assume a company for historical secret refs.
- Prefer one of:
- explicit admin migration per company, or
- import existing rows as non-secret config only and require re-entry of secret refs.
## Release posture
- Keep plugin secret refs disabled until all steps above land.
- Do not restore the feature behind a soft warning; the insecure path must remain unavailable by default.
@@ -0,0 +1,90 @@
# Scaled Kanban Board Design
Date: 2026-05-05
Branch: `feat/scaled-kanban-board`
## Context
The Issues page currently supports list and board modes. List mode already has grouping, sorting, filtering, nested parent/child rows, deferred row rendering, and incremental render limits. Board mode uses classic status columns with draggable cards. It fetches per-status board data, but the current UI still presents each lane as an unbounded stack of cards, which becomes tall and heavy when a company has hundreds of issues.
The goal is to keep the Kanban mental model while making high-volume boards usable. This is a UI-first change. It should not introduce schema changes or new API contracts in the first pass.
## Problem
When Paperclip has many issues, board columns get too tall and slow. The operator loses the ability to scan the board quickly, and rendering or dragging through long columns becomes unpleasant. The first version should solve this by reducing the number of visible cards per column and by collapsing low-signal columns, not by replacing Kanban with a different inventory surface.
## Design
Board mode remains status-column based. Each column shows its total issue count, a bounded set of visible cards, and a local affordance to reveal more cards in that column. The board should keep active workflow lanes expanded by default and collapse cold or noisy lanes once issue volume is high.
Default high-volume behavior activates when the filtered board has more than 100 issues:
- Compact cards are used by default.
- `backlog`, `done`, and `cancelled` auto-collapse to narrow rails.
- `todo`, `in_progress`, `in_review`, and `blocked` remain expanded by default.
- Each expanded column renders an initial 10 cards by default.
- The user can choose a page size of 10, 25, or 50 cards per column.
- The user can reveal one additional page at a time in each column without changing other columns.
- Drag and drop continues to work for visible cards.
The toolbar should expose compact controls for:
- toggling compact cards
- hiding or showing cold lanes
- choosing cards per column
- resetting board density to defaults
These preferences should persist through the existing issue view-state/localStorage mechanism and remain scoped by company.
## Component Shape
`IssuesList` remains the owner of issue board view state. It should store board-density preferences alongside the existing issue view state, including compact card preference, cold-lane mode, and cards-per-column page size.
`KanbanBoard` receives board tuning props from `IssuesList` and delegates per-lane display to `KanbanColumn`.
`KanbanColumn` owns only local presentation mechanics for a lane:
- whether the lane is rendered as an expanded column or collapsed rail
- how many cards are currently visible in that lane
- the local "show more" action
`KanbanCard` gets a compact variant. The compact card should still show the issue identifier, title, live state, priority, and assignee when available, but with tighter spacing and fewer vertical affordances.
## Data Flow
The first implementation uses the current issue data already available to board mode. No database, shared type, or route change is required.
Column totals are computed from the in-memory filtered board issues. If a column reaches the existing remote board query cap, the existing warning remains the truth source that more filtering may be required.
Future server-side column pagination can be added later if the UI-only version is not enough for very large instances.
## Error Handling
This feature should not introduce new network errors. Existing issue loading and update errors continue to surface through the Issues page.
For drag and drop:
- Moving a visible card keeps the current optimistic behavior.
- Hidden cards remain hidden until revealed.
- A collapsed lane rail is a valid drop target. Dropping onto it moves the issue to that status and keeps the lane collapsed.
## Testing
Focused tests should cover:
- board mode passes density preferences into `KanbanBoard`
- columns render only the initial visible card count
- "show more" reveals more cards in a single column
- high-volume cold lanes render as collapsed rails by default
- compact cards preserve identifier/title/live/priority/assignee signals
- drag/drop status updates still call `onUpdateIssue`
Manual verification should include opening the Issues board with a large fixture or mocked issue set and confirming that columns remain usable with hundreds of issues.
## Out of Scope
- Server-side per-column pagination
- New issue schema fields
- Replacing Kanban with a dense table or action-only board
- Changing issue status semantics
- Broad visual redesign of the Issues page
+250
View File
@@ -0,0 +1,250 @@
# Scaled Kanban Board Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Make the Issues Kanban board usable with hundreds of issues by adding compact high-volume rendering, collapsed cold lanes, and per-column reveal controls.
**Architecture:** Keep the change UI-only. `IssuesList` owns persisted board density preferences in existing company-scoped view state, while `KanbanBoard` owns lane rendering, card density, collapsed rails, and per-column "show more" state.
**Tech Stack:** React 19, TypeScript, Vite, Vitest/jsdom, `@dnd-kit/core`, `@dnd-kit/sortable`, Tailwind utility classes.
---
## File Structure
- Modify `ui/src/components/IssuesList.tsx`: extend `IssueViewState`, derive high-volume board preferences, add toolbar controls, pass props into `KanbanBoard`.
- Modify `ui/src/components/KanbanBoard.tsx`: add compact cards, collapsed rail lanes, visible-card limits, and per-column reveal behavior.
- Create `ui/src/components/KanbanBoard.test.tsx`: focused tests for high-volume behavior and drag/drop update callback.
- Modify `ui/src/components/IssuesList.test.tsx`: update the mocked `KanbanBoard` expectations for new props.
- Keep `doc/plans/2026-05-05-scaled-kanban-board-design.md` as the design source of truth.
## Task 1: Add Kanban Board Scaling Mechanics
**Files:**
- Modify: `ui/src/components/KanbanBoard.tsx`
- Create: `ui/src/components/KanbanBoard.test.tsx`
- [ ] **Step 1: Write focused tests**
Create `ui/src/components/KanbanBoard.test.tsx` with tests that render 60 todo issues and assert:
```tsx
renderBoard({ issues: createIssues(60, "todo"), compactCards: true, initialVisibleCount: 10, revealIncrement: 10 });
expect(container.textContent).toContain("Showing 10 of 60");
expect(container.textContent).toContain("Show 10 more");
```
Also test collapsed rails:
```tsx
renderBoard({ issues: createIssues(3, "done"), collapsedStatuses: ["done"] });
expect(container.textContent).toContain("Done");
expect(container.textContent).toContain("3");
expect(container.textContent).not.toContain("Issue 1");
```
- [ ] **Step 2: Run tests to verify failure**
Run:
```bash
pnpm exec vitest run ui/src/components/KanbanBoard.test.tsx
```
Expected: fail because `KanbanBoard.test.tsx` is new and the props/behavior do not exist.
- [ ] **Step 3: Implement minimal board behavior**
In `KanbanBoard.tsx`, add exported constants:
```ts
export const KANBAN_BOARD_HIGH_VOLUME_THRESHOLD = 100;
export const KANBAN_COLUMN_PAGE_SIZE_OPTIONS = [10, 25, 50] as const;
export const KANBAN_COLUMN_DEFAULT_PAGE_SIZE = 10;
export const KANBAN_COLD_STATUSES = ["backlog", "done", "cancelled"] as const;
```
Extend props:
```ts
compactCards?: boolean;
collapsedStatuses?: string[];
initialVisibleCount?: number;
revealIncrement?: number;
```
Add per-status visible-count state keyed by status. Expanded columns render `issues.slice(0, visibleCount)` and show a button when hidden issues remain. Collapsed columns render a narrow droppable rail with status icon, label, and count, but no cards.
Reset per-status visible-count state when `initialVisibleCount` or `revealIncrement` changes so choosing a smaller cards-per-column preset does not leave a column expanded past the newly selected page size.
- [ ] **Step 4: Preserve drag/drop**
Keep `DndContext`, `SortableContext`, and `handleDragEnd` status detection. Because collapsed rails use `useDroppable({ id: status })`, dropping a visible card onto a rail continues to resolve `targetStatus` through the existing status-id branch.
- [ ] **Step 5: Run focused test**
Run:
```bash
pnpm exec vitest run ui/src/components/KanbanBoard.test.tsx
```
Expected: pass.
- [ ] **Step 6: Commit**
```bash
git add ui/src/components/KanbanBoard.tsx ui/src/components/KanbanBoard.test.tsx
git commit -m "Scale kanban board columns"
```
## Task 2: Wire Board Density State Into IssuesList
**Files:**
- Modify: `ui/src/components/IssuesList.tsx`
- Modify: `ui/src/components/IssuesList.test.tsx`
- [ ] **Step 1: Write/update tests**
In `IssuesList.test.tsx`, update the `KanbanBoard` mock to capture:
```ts
compactCards?: boolean;
collapsedStatuses?: string[];
initialVisibleCount?: number;
revealIncrement?: number;
```
Add a test that stores board mode in localStorage, renders more than 100 issues, and expects:
```ts
expect(mockKanbanBoard).toHaveBeenLastCalledWith(expect.objectContaining({
compactCards: true,
collapsedStatuses: expect.arrayContaining(["backlog", "done", "cancelled"]),
initialVisibleCount: 10,
revealIncrement: 10,
}));
```
- [ ] **Step 2: Run test to verify failure**
Run:
```bash
pnpm exec vitest run ui/src/components/IssuesList.test.tsx
```
Expected: fail because `IssuesList` does not pass the new props yet.
- [ ] **Step 3: Add persisted board density preferences**
Extend `IssueViewState`:
```ts
boardCardDensity: "auto" | "compact" | "comfortable";
boardColdLaneMode: "auto" | "collapsed" | "expanded";
boardColumnPageSize: 10 | 25 | 50;
```
Default the density modes to `"auto"` and page size to `10`. Derive:
```ts
const boardHighVolume = viewState.viewMode === "board" && filtered.length > KANBAN_BOARD_HIGH_VOLUME_THRESHOLD;
const boardCompactCards = viewState.boardCardDensity === "compact"
|| (viewState.boardCardDensity === "auto" && boardHighVolume);
const boardCollapsedStatuses = viewState.boardColdLaneMode === "collapsed"
|| (viewState.boardColdLaneMode === "auto" && boardHighVolume)
? [...KANBAN_COLD_STATUSES]
: [];
```
- [ ] **Step 4: Add toolbar controls**
When `viewState.viewMode === "board"`, add small outline/icon buttons near the existing view controls:
```tsx
<Button ... title={boardCompactCards ? "Use comfortable cards" : "Use compact cards"}>...</Button>
<Button ... title={boardCollapsedStatuses.length > 0 ? "Expand cold lanes" : "Collapse cold lanes"}>...</Button>
<Button ... title="Cards per column">...</Button>
<Button ... title="Reset board density">...</Button>
```
Use lucide icons already available or import `ChevronsDownUp`, `PanelTopClose`, and `RotateCcw`.
- [ ] **Step 5: Pass board props**
Update the `KanbanBoard` call:
```tsx
<KanbanBoard
issues={filtered}
agents={agents}
liveIssueIds={liveIssueIds}
compactCards={boardCompactCards}
collapsedStatuses={boardCollapsedStatuses}
initialVisibleCount={viewState.boardColumnPageSize}
revealIncrement={viewState.boardColumnPageSize}
onUpdateIssue={onUpdateIssue}
/>
```
- [ ] **Step 6: Run focused tests**
Run:
```bash
pnpm exec vitest run ui/src/components/IssuesList.test.tsx ui/src/components/KanbanBoard.test.tsx
```
Expected: pass.
- [ ] **Step 7: Commit**
```bash
git add ui/src/components/IssuesList.tsx ui/src/components/IssuesList.test.tsx
git commit -m "Wire issue board density controls"
```
## Task 3: Verification And PR Prep
**Files:**
- Verify existing changes only.
- [ ] **Step 1: Run targeted UI tests**
```bash
pnpm exec vitest run ui/src/components/IssuesList.test.tsx ui/src/components/KanbanBoard.test.tsx
```
Expected: pass.
- [ ] **Step 2: Run broader cheap test path**
```bash
pnpm test
```
Expected: pass.
- [ ] **Step 3: Check worktree**
```bash
git status --short
```
Expected: only intentional changes before committing, or clean after final commit.
- [ ] **Step 4: Prepare PR**
Read `.github/PULL_REQUEST_TEMPLATE.md` and use it for the PR body. Include:
- design spec path
- scaled Kanban behavior summary
- test commands and results
- Model Used section with the current Codex model details available in this session
## Self-Review
- Spec coverage: The plan covers compact high-volume board cards, collapsed cold lanes, cards-per-column presets, per-column reveal controls, persisted board preferences, current API reuse, and focused tests.
- Placeholder scan: No unresolved markers or unspecified implementation placeholders remain.
- Type consistency: The plan consistently uses `boardCardDensity`, `boardColdLaneMode`, `boardColumnPageSize`, `compactCards`, `collapsedStatuses`, `initialVisibleCount`, and `revealIncrement`.
@@ -0,0 +1,135 @@
# LLM Wiki Paperclip Asset And Work-Product Security Gate
Status: accepted Phase 5 policy
Date: 2026-05-06
Owner: Security engineering
Scope: Paperclip-derived ingestion into the LLM Wiki before any asset or work-product content indexing ships
## Decision
Phase 5 remains **fail-closed** for Paperclip assets and work products.
- Paperclip-derived **text extraction is allowed only** for issue titles/descriptions, issue comments, and issue documents.
- Paperclip **assets/attachments** and **issue work products** are **metadata-only** in Phase 5.
- **Linked summaries** and **content extraction** for assets/work products are **not approved** in Phase 5.
- No implementation may fetch `/api/assets/:id/content`, dereference a work-product `url`, scrape preview pages, or embed binary/blob content into source bundles or source snapshots.
This keeps the secure path easier than the insecure one and avoids broadening the wiki into a second content-distribution channel.
## Allowed Source Kinds
These source kinds may contribute body text to Paperclip-derived source bundles:
| Source kind | Allowed body fields | Reason |
| --- | --- | --- |
| Issue | `title`, `description`, identifier/status metadata | First-party Paperclip text under company ACL |
| Comment | `body` | First-party Paperclip text under company ACL |
| Document | `body`, `title`, `key`, revision metadata | First-party Paperclip text under company ACL |
## Assets And Work Products
### Assets / attachments
Allowed in Phase 5:
- metadata-only references built from allowlisted structured fields already stored in Paperclip
- recommended fields: `issueId`, `issueCommentId`, `attachmentId`, `assetId`, `originalFilename`, `contentType`, `byteSize`, `sha256`, `createdAt`, `createdByAgentId`, `createdByUserId`
Disallowed in Phase 5:
- fetching asset bytes from `/api/assets/:id/content`
- parsing any blob body, including `text/plain`, `text/markdown`, `application/json`, images, SVG, PDFs, archives, or office formats
- storing `contentPath` in wiki source bundles or source snapshots
- model summarization of attachment bodies
### Work products
Allowed in Phase 5:
- metadata-only references built from allowlisted structured fields already stored in Paperclip
- recommended fields: `issueId`, `workProductId`, `type`, `provider`, `title`, `status`, `reviewState`, `healthStatus`, `externalId`, `isPrimary`, `createdAt`, `updatedAt`
- optional boolean/derived metadata such as `hasUrl: true`
Disallowed in Phase 5:
- fetching or crawling the work-product `url`
- scraping preview pages, artifacts, pull requests, branches, commits, or custom provider targets through the wiki ingestion path
- storing raw `url` values in wiki source bundles or source snapshots
- model-authored linked summaries derived from off-record content
## MIME Allowlists And Size Caps
No MIME allowlist is approved for asset content extraction in Phase 5 because **no asset body extraction is approved at all**.
- Every asset MIME type is treated as opaque for Paperclip-derived indexing.
- Existing upload limits remain storage concerns, not ingestion approvals.
- Work-product destinations are also opaque regardless of MIME type or size.
Any future issue that wants blob parsing must define:
- a positive MIME allowlist
- per-type parser strategy
- per-source size caps
- sandbox/isolation requirements
- prompt-injection handling
- regression tests for refusal paths
## Redaction Rules
Metadata-only means **structured facts only**, not capability-bearing links.
- Do not persist `contentPath` for assets.
- Do not persist raw work-product `url` values.
- Do not persist query strings, fragments, signed URL tokens, or userinfo.
- Prefer stable identifiers (`assetId`, `workProductId`, `externalId`) over links.
This addresses Sensitive Information Disclosure, Unsafe Consumption of APIs, and Insecure Output Handling risks.
## Provenance Rules
Every metadata-only reference must preserve enough provenance to explain where it came from without reading the underlying content:
- `companyId`
- `issueId`
- attachment/work-product id
- producer identity when available
- timestamps
- an explicit `metadata_only` marker in any future reference/snapshot schema
## Review-Required Behavior
Human review is **not** required for plain metadata-only references that stay inside the allowlisted fields above.
Human review **is required**, with a separate security sign-off issue, before enabling any of the following:
- asset body extraction
- work-product URL fetching
- linked summaries generated from asset/work-product content
- storing raw blob links or raw remote URLs in wiki source material
- non-default-space routing for Paperclip-derived asset/work-product references
## Security Rationale
This gate exists because the current host surfaces have different trust properties:
- issue/comment/document text is first-party Paperclip content already exposed through company-scoped issue/document APIs
- asset content is a blob download surface (`/api/assets/:id/content`) and can carry prompt-injection or parser-risk payloads
- work products can point at arbitrary destinations through `url`, which reintroduces SSRF, token leakage, and prompt-injection risk if dereferenced automatically
Relevant threat classes:
- OWASP LLM Top 10: Prompt Injection, Sensitive Information Disclosure, Insecure Output Handling, Excessive Agency
- OWASP API Top 10: SSRF, Unsafe Consumption of APIs, Broken Object Property Level Authorization
- Saltzer & Schroeder: Least Privilege, Fail Securely, Complete Mediation, Secure Defaults
## Follow-Up Implementation Scope
A follow-up implementation issue is justified only for **metadata-only references**.
That implementation must:
- keep assets/work products out of source-bundle body text
- never fetch blob bytes or remote URLs
- redact capability-bearing link fields
- mark references as `metadata_only`
- ship tests proving source bundles/snapshots never contain `contentPath` or raw work-product `url` fields
@@ -0,0 +1,486 @@
# Skills CLI And Catalog Contract
Status: Phase A engineering contract
Date: 2026-05-26
Source plan: approved Paperclip skills CLI and catalog plan
This document freezes the first implementation contract for the `paperclipai skills`
command group and the app-shipped skills catalog. It is intentionally a build
contract, not a full product spec.
## Decisions
- `paperclipai skills` manages Paperclip company skills. It does not manage
local adapter homes directly.
- Installing a skill means adding or updating a company-scoped
`company_skills` record.
- Attaching a skill to an agent is a separate agent desired-state operation.
- Adapter runtime sync is a third step handled through adapter skill APIs.
- Root `skills/` remains reserved for Paperclip runtime and operational skills.
- App-shipped catalog skills live in `packages/skills-catalog`, not root
`skills/`.
- Catalog skills are inspectable before install. Inspection never mutates company
state.
- External sources continue to use the existing company skill import API in the
first release. No separate marketplace, tap, or source registry is part of this
phase.
- Agent desired skills continue to live in
`adapterConfig.paperclipSkillSync.desiredSkills` for the first release. Do not
add a normalized `agent_skills` table unless later implementation evidence
requires it.
## Terms
- Company skill: a row in `company_skills`, owned by one company.
- Catalog skill: an app-shipped skill entry in `@paperclipai/skills-catalog`.
- Skill ref: a user-supplied company skill reference. The CLI accepts company
skill `id`, canonical `key`, or unique `slug`.
- Catalog ref: a user-supplied catalog reference. The CLI accepts catalog `id`,
canonical `key`, or unique `slug`.
- Desired skills: the skill key set stored on the agent adapter config.
- Runtime snapshot: the adapter-reported `AgentSkillSnapshot` for desired,
installed, missing, stale, external, required, or unsupported skills.
## CLI Contract
All skills commands use the existing client command stack:
- Global client options: `--data-dir`, `--config`, `--context`, `--profile`,
`--api-base`, `--api-key`, and `--json`.
- Company-scoped commands also accept `-C, --company-id <id>` and otherwise use
`PAPERCLIP_COMPANY_ID` or the active context profile.
- Human output goes to stdout. Errors go to stderr.
- `--json` prints pretty JSON and no decorative labels.
- Successful commands exit `0`. Validation, API, or conflict errors exit `1`.
- API errors use the existing `API error <status>: <message>` formatting.
- Mutating commands print a short summary in human mode and the raw result in
JSON mode.
- Commands that can delete or clear state must prompt in a TTY. In non-TTY mode
they must require `--yes`.
### Company Skill Commands
These commands are Phase B and must work over existing APIs.
| Command | Behavior | JSON output |
|---|---|---|
| `skills list` | Lists company skills from `GET /api/companies/:companyId/skills`. Human rows include `id`, `key`, `slug`, `name`, `source`, `trust`, `compatibility`, and `attachedAgents`. | `CompanySkillListItem[]` |
| `skills show <skill-ref>` | Resolves `id`, `key`, or unique `slug`, then reads detail. Ambiguous slugs are conflicts. | `CompanySkillDetail` |
| `skills file <skill-ref> [--path <path>]` | Resolves the skill, reads a file with default `SKILL.md`, and prints raw file content in human mode. This command must remain pipeable. | `CompanySkillFileDetail` |
| `skills import <source>` | Calls existing import API. Source may be a local path, GitHub URL, skills.sh URL or command, `owner/repo`, `owner/repo/skill`, or URL-like source already accepted by the server. | `CompanySkillImportResult` |
| `skills create --name <name> [--slug <slug>] [--description <text>] [--body-file <path|->]` | Creates a managed local company skill. If `--body-file` is omitted, the server default body is used. `-` reads markdown from stdin. | `CompanySkill` |
| `skills scan-projects [--project-id <id>...] [--workspace-id <id>...]` | Calls project scan. Repeated flags become arrays. With neither flag, scan all accessible project workspaces. | `CompanySkillProjectScanResult` |
| `skills check [skill-ref]` | Reads update status for one skill, or for every listed company skill when no ref is provided. Unsupported statuses are shown, not hidden. | `CompanySkillCheckRow[]` |
| `skills update <skill-ref>` | Installs the update for one skill through the existing install-update API. | `CompanySkillUpdateRow` |
| `skills update --all` | Checks all skills, installs only those with `hasUpdate=true`, and reports skipped unsupported or current skills. | `CompanySkillUpdateRow[]` |
| `skills remove <skill-ref> [--yes]` | Deletes one company skill after confirmation. | `CompanySkill` |
`CompanySkillCheckRow` is a CLI-side shape:
```ts
interface CompanySkillCheckRow {
skill: Pick<CompanySkillListItem, "id" | "key" | "slug" | "name">;
status: CompanySkillUpdateStatus;
}
```
`CompanySkillUpdateRow` is a CLI-side shape:
```ts
interface CompanySkillUpdateRow {
skillRef: string;
action: "updated" | "skipped" | "failed";
skill?: CompanySkill;
status?: CompanySkillUpdateStatus;
reason?: string;
}
```
### Agent Skill Commands
These commands are Phase B and use existing agent skill APIs.
| Command | Behavior | JSON output |
|---|---|---|
| `skills agent list <agent-ref>` | Resolves the agent using existing agent reference behavior, then prints the adapter `AgentSkillSnapshot`. Human rows include `key`, `runtimeName`, `desired`, `managed`, `required`, `state`, `origin`, and `detail`. | `AgentSkillSnapshot` |
| `skills agent sync <agent-ref> --skill <skill-ref>...` | Replaces the agent's non-required desired skill set with the supplied refs and triggers adapter sync. Required Paperclip skills remain enforced by the server. | `AgentSkillSnapshot` |
| `skills agent clear <agent-ref> [--yes]` | Clears non-required desired skills by sending an empty desired list, then returns the adapter snapshot. | `AgentSkillSnapshot` |
The word `sync` is deliberate: it is a desired-state replacement, not an append.
An additive command can be added later if operators need it.
### Catalog CLI Commands
These commands are Phase E and depend on the catalog APIs from Phase D.
| Command | Behavior | JSON output |
|---|---|---|
| `skills browse [--kind bundled|optional] [--category <slug>] [--query <text>]` | Lists app-shipped catalog skills. Human rows include `id`, `key`, `kind`, `category`, `slug`, `name`, `trust`, and `recommendedForRoles`. | `CatalogSkillListItem[]` |
| `skills search <query> [--kind bundled|optional] [--category <slug>]` | Alias for catalog browse with `query`. | `CatalogSkillListItem[]` |
| `skills inspect <catalog-ref>` | Shows app-shipped catalog detail and file inventory. Does not mutate company state. | `CatalogSkillDetail` |
| `skills install <catalog-ref> [--as <slug>] [--force]` | Installs a catalog skill into a company library. `--as` overrides the company skill slug. `--force` may replace a same-key catalog skill but must not bypass hard validation or dangerous security findings. | `CompanySkillInstallCatalogResult` |
Catalog commands are for the app-shipped Paperclip catalog only. External GitHub,
skills.sh, local path, and URL installs remain under `skills import <source>` in
the first release.
## Catalog Package Contract
Add a workspace package:
```text
packages/skills-catalog/
package.json
tsconfig.json
src/
index.ts
types.ts
catalog/
bundled/
<category>/
<slug>/
SKILL.md
references/
scripts/
assets/
optional/
<category>/
<slug>/
SKILL.md
references/
scripts/
assets/
generated/
catalog.json
scripts/
build-catalog-manifest.ts
validate-catalog.ts
```
Package name: `@paperclipai/skills-catalog`.
The package exports:
- `catalogManifest`
- `catalogSkills`
- `resolveCatalogSkillRef(ref)`
- `getCatalogSkill(id)`
- TypeScript types for every manifest shape
Server and CLI code must import the generated manifest. They must not crawl
arbitrary repository paths at request time.
## Catalog Manifest
The generated artifact is `packages/skills-catalog/generated/catalog.json`.
It is checked in and regenerated by the package build or validation script.
```ts
interface CatalogManifest {
schemaVersion: 1;
packageName: "@paperclipai/skills-catalog";
packageVersion: string;
generatedAt: string;
skills: CatalogSkill[];
}
interface CatalogSkill {
id: string;
key: string;
kind: "bundled" | "optional";
category: string;
slug: string;
name: string;
description: string;
path: string;
entrypoint: "SKILL.md";
trustLevel: "markdown_only" | "assets" | "scripts_executables";
compatibility: "compatible" | "unknown" | "invalid";
defaultInstall: boolean;
recommendedForRoles: string[];
requires: string[];
tags: string[];
files: CatalogSkillFile[];
contentHash: string;
}
interface CatalogSkillFile {
path: string;
kind: "skill" | "markdown" | "reference" | "script" | "asset" | "other";
sizeBytes: number;
sha256: string;
}
```
`id` is path-safe:
```text
paperclipai:<kind>:<category>:<slug>
```
`key` is the canonical company skill key installed into `company_skills`:
```text
paperclipai/<kind>/<category>/<slug>
```
Example:
```json
{
"id": "paperclipai:bundled:software-development:github-pr-workflow",
"key": "paperclipai/bundled/software-development/github-pr-workflow",
"kind": "bundled",
"category": "software-development",
"slug": "github-pr-workflow",
"name": "github-pr-workflow",
"description": "Prepare pull requests, review responses, and verification notes.",
"path": "catalog/bundled/software-development/github-pr-workflow",
"entrypoint": "SKILL.md",
"trustLevel": "markdown_only",
"compatibility": "compatible",
"defaultInstall": false,
"recommendedForRoles": ["engineer"],
"requires": [],
"tags": ["github", "pull-requests"],
"files": [
{
"path": "SKILL.md",
"kind": "skill",
"sizeBytes": 1200,
"sha256": "..."
}
],
"contentHash": "sha256:..."
}
```
## Catalog Skill Frontmatter
Each catalog `SKILL.md` must include:
```yaml
---
name: github-pr-workflow
description: Prepare pull requests, review responses, and verification notes.
key: paperclipai/bundled/software-development/github-pr-workflow
recommendedForRoles:
- engineer
tags:
- github
- pull-requests
---
```
Optional frontmatter:
- `slug`
- `defaultInstall`
- `requires`
- `metadata`
The manifest generator owns `kind`, `category`, `path`, `files`,
`trustLevel`, `compatibility`, and `contentHash`.
## Catalog Validation Rules
Validation must fail when:
- A catalog entry is not under `catalog/bundled/<category>/<slug>` or
`catalog/optional/<category>/<slug>`.
- `SKILL.md` is missing.
- `category` or `slug` is not a lowercase URL slug.
- `name` or `description` frontmatter is missing or empty.
- The frontmatter `key`, when present, does not equal the generated key.
- Two catalog entries have the same `id`, `key`, or `slug`.
- File inventory includes absolute paths, `..` segments, broken symlinks, or
files outside the skill directory.
- A file exceeds the package-level size limit chosen by implementation.
- A skill marked `compatible` cannot be parsed as Agent Skills markdown.
- The generated manifest differs from the checked-in
`generated/catalog.json`.
Trust level is derived from inventory:
- `scripts_executables` when any file is classified as `script`.
- `assets` when any file is classified as `asset` or `other` and no script is
present.
- `markdown_only` when all files are markdown, references, or `SKILL.md`.
Validation must report all discovered catalog errors when practical, not just
the first one.
## Catalog API Contract
Phase D adds read APIs and one company install API.
```text
GET /api/skills/catalog
GET /api/skills/catalog/:catalogId
GET /api/skills/catalog/:catalogId/files?path=SKILL.md
POST /api/companies/:companyId/skills/install-catalog
```
`GET /api/skills/catalog` accepts:
- `kind=bundled|optional`
- `category=<slug>`
- `q=<text>`
`catalogId` is the path-safe manifest `id`. The server should also support
resolution by `key` or unique `slug` where the ref is carried in a query or body,
but route parameters use `id` to avoid slash handling ambiguity.
Install request:
```ts
interface CompanySkillInstallCatalogRequest {
catalogSkillId: string;
slug?: string | null;
force?: boolean;
}
```
Install result:
```ts
interface CompanySkillInstallCatalogResult {
action: "created" | "updated" | "unchanged";
skill: CompanySkill;
catalogSkill: CatalogSkill;
warnings: string[];
}
```
Install behavior:
- Creates or updates a company skill with `sourceType="catalog"`.
- Uses catalog `key` as the company skill canonical key.
- Uses catalog `slug` unless `slug` is provided.
- Materializes the catalog files into a company-managed skill directory so
existing skill file reads continue to work.
- Stores provenance in metadata:
- `catalogId`
- `catalogKey`
- `catalogKind`
- `catalogCategory`
- `catalogPath`
- `packageName`
- `packageVersion`
- `originHash`
- `originVersion`
- `userModifiedAt`
- `updateHoldReason`
- Writes activity log entries for install and update.
- Returns `409` for duplicate slug/key conflicts that cannot be resolved safely.
- Returns `422` for invalid, incompatible, or hard-blocked catalog entries.
- `force` may replace a same-key catalog-managed skill. It must not bypass
company boundaries, permission checks, hard validation, or hard security
findings.
## Error Semantics
Use existing HTTP semantics:
- `400`: invalid CLI arguments, invalid query/body shape, or malformed refs.
- `401`: missing or invalid auth.
- `403`: authenticated principal lacks access or mutation permission.
- `404`: skill, catalog entry, agent, file, company, or source not found.
- `409`: ambiguous slug, duplicate key/slug, update conflict, or unsafe overwrite.
- `422`: semantic violation such as invalid skill content or unsupported source.
- `500`: unexpected server failure.
CLI messages should name the next useful correction, for example:
- `Skill slug "review" is ambiguous. Use an id or key.`
- `Company ID is required. Pass --company-id, set PAPERCLIP_COMPANY_ID, or set a context profile.`
- `Catalog skill contains executable scripts and cannot be force-installed until security review semantics allow it.`
## Phase Acceptance Criteria
Phase A is complete when this contract is available in the repo and the issue
thread links it.
Phase B, CLI MVP:
- `paperclipai skills --help` exposes the Phase B command group.
- All Phase B commands work against existing company skills and agent skills
APIs without schema or server changes.
- Skill refs resolve by id, key, or unique slug.
- Human and JSON output are covered by focused CLI tests.
- `doc/CLI.md` documents company install vs agent desired sync vs runtime sync.
Phase C, catalog package:
- `packages/skills-catalog` is a workspace package.
- Build or validation regenerates `generated/catalog.json`.
- Validation covers frontmatter, id/key/slug uniqueness, directory shape, file
inventory, trust derivation, and stale generated output.
- Server and CLI can import the manifest without crawling arbitrary paths.
- Root `skills/` is not expanded with the app-shipped catalog.
Phase D, catalog APIs:
- Catalog list/detail/file APIs are read-only and covered by tests.
- Install-from-catalog creates auditable company-scoped skill records with
provenance metadata and materialized files.
- Company boundary and mutation permission checks match or exceed existing
company skill mutations.
- Duplicate and unsafe overwrite behavior is explicit and tested.
Phase E, catalog CLI:
- Operators can browse, search, inspect, and install app-shipped catalog skills.
- External source behavior remains routed through `skills import`.
- Output and errors follow the Phase B CLI conventions.
- Catalog install is clearly distinct from agent attach/sync in help and docs.
Phase F, update/reset/audit:
- Security review records decisions for origin hash, user modification detection,
reset, audit findings, and force behavior.
- Implementation follows the review or records explicit deferrals.
- Mutating reset/update actions are activity logged.
- Tests cover dangerous findings, force behavior, and unchanged/current states.
Phase G, adapter truth model:
- Adapter snapshots accurately report `unsupported`, `persistent`, or
`ephemeral`.
- Desired, missing, installed, stale, external, and required states are tested.
- External adapter plugins remain dynamically loaded. No hardcoded plugin imports
are added.
Phase H, UI:
- The existing Company Skills page is extended rather than replaced.
- UX guidance covers Company, Bundled, Optional, and External source views.
- Install preview shows source, trust, provenance, update state, and file
inventory.
- Agent attach/detach states are clear.
- Frontend handoff includes screenshots or equivalent browser evidence.
Phase I, initial skill content:
- Bundled and optional entries use the finalized frontmatter and category rules.
- Skill descriptions are specific enough for browse/search.
- No script-bearing skill lands without explicit security review evidence.
- Validation fixtures or tests cover representative content.
Phase J, QA and docs:
- QA validates CLI, catalog APIs, UI install, agent sync, portability, and adapter
snapshots against a dev instance.
- Blocking defects are linked as first-class issues.
- `doc/CLI.md`, `doc/DEVELOPING.md`, and skill workflow docs match shipped
behavior.
## Deferrals
- No cloud marketplace.
- No user-home tap registry.
- No hidden curator or autonomous catalog mutator.
- No normalized `agent_skills` table in the first release.
- No skill sets or bundles in the first release.
- No automatic install of every optional catalog skill.
- No replacement of company import/export as the portability path.
+142
View File
@@ -0,0 +1,142 @@
# Local Plugin Development
This is the short happy-path guide for developing a Paperclip plugin from a folder on your machine. You will scaffold a plugin, run it in watch mode, install it into a running Paperclip instance from an absolute local path, and edit code with the plugin worker reloading after each rebuild.
For the full alpha surface — manifest fields, capabilities, managed agents/projects/routines/skills, UI slots, scoped API routes — see [`PLUGIN_AUTHORING_GUIDE.md`](./PLUGIN_AUTHORING_GUIDE.md).
If your plugin has background-like recurring work, model it as managed resources:
declare managed routines plus managed agents/projects/skills, then reconcile those
resources in worker actions. This gives operators visible work items, budgets,
pause controls, and consistent audits instead of hidden daemon behavior.
## Prerequisites
- Node.js 22+ and `pnpm`.
- A local Paperclip checkout you can run from source. Local plugin installs read source from disk, so the running server must be able to see the path you give it.
## The five steps
```bash
# 1. Start Paperclip locally
pnpm paperclipai run
# 2. Scaffold a plugin outside the Paperclip repo
paperclipai plugin init @acme/hello-plugin --output ~/dev/paperclip-plugins
# 3. Install dependencies and start the watch build
cd ~/dev/paperclip-plugins/hello-plugin
pnpm install
pnpm dev
# 4. In another terminal, install the plugin from its absolute path
paperclipai plugin install ~/dev/paperclip-plugins/hello-plugin
# 5. Confirm it loaded
paperclipai plugin list
paperclipai plugin inspect acme.hello-plugin
```
That's the loop. The rest of this page explains what each step does and what to expect when you edit code.
### 1. Start Paperclip
```bash
pnpm paperclipai run
```
Paperclip listens on `http://127.0.0.1:3100` by default. The CLI talks to that server, so leave it running.
### 2. Scaffold the plugin
```bash
paperclipai plugin init @acme/hello-plugin --output ~/dev/paperclip-plugins
```
This creates `~/dev/paperclip-plugins/hello-plugin/` with `src/manifest.ts`, `src/worker.ts`, `src/ui/index.tsx`, an esbuild watch config, a Vitest config, and a snapshot of `@paperclipai/plugin-sdk` from your local Paperclip checkout. You can run the package and tests without publishing anything to npm.
Useful flags:
- `--template <default|connector|workspace|environment>` — starter shape.
- `--category <connector|workspace|automation|ui|environment>` — manifest category.
- `--display-name`, `--description`, `--author` — manifest metadata.
- `--sdk-path <absolute-path>` — point at a specific `packages/plugins/sdk` checkout if you have more than one.
When `plugin init` finishes, it prints the next four commands literally. You can copy them.
### 3. Install dependencies and run the watch build
```bash
cd ~/dev/paperclip-plugins/hello-plugin
pnpm install
pnpm dev
```
`pnpm dev` runs `esbuild --watch` against the plugin source and emits `dist/manifest.js`, `dist/worker.js`, and `dist/ui/`. Leave it running. Every time you save, esbuild rebuilds the affected output file.
If your plugin has UI and you want a browser-side dev server with hot module replacement during local UI iteration, run `pnpm dev:ui` in a second terminal. It serves `dist/ui/` on `http://127.0.0.1:4177`. This is optional; Paperclip can load the built UI directly from `dist/ui/` without it.
### 4. Install from the absolute path
```bash
paperclipai plugin install ~/dev/paperclip-plugins/hello-plugin
```
The CLI auto-detects local paths (anything that looks absolute, starts with `./`, `../`, or `~`, or resolves to an existing folder relative to the current directory) and sends `{ isLocalPath: true }` to `POST /api/plugins/install` with the resolved absolute path. If you want to be explicit, pass `--local`.
You will see a confirmation like:
```
Installing plugin from local path: /Users/you/dev/paperclip-plugins/hello-plugin
✓ Installed acme.hello-plugin v0.1.0 (ready)
Local plugin installs run trusted local code from your machine.
Keep `pnpm dev` running in /Users/you/dev/paperclip-plugins/hello-plugin;
Paperclip watches rebuilt dist output and reloads the plugin worker.
```
Relative paths are resolved against the current working directory, so `paperclipai plugin install .` from inside the plugin folder works too.
### 5. Inspect
```bash
paperclipai plugin list
paperclipai plugin inspect acme.hello-plugin
```
`list` shows plugin key, status, version, and short error. `inspect` prints the same record with the full last error if there is one. Both accept `--json` if you want to script against them.
## Reload semantics, honestly
Paperclip watches the on-disk plugin package after a local install. The watcher targets the runtime entrypoints declared in the package's `paperclipPlugin` field (`dist/manifest.js`, `dist/worker.js`, `dist/ui/`).
What that means in practice:
- **Worker code:** save a `.ts` file → esbuild rewrites `dist/worker.js` → Paperclip debounces ~500ms and restarts the plugin worker. The next worker call uses the new code. There is no in-process hot module replacement for worker code; it is a worker restart.
- **Manifest:** save `src/manifest.ts``dist/manifest.js` rewrites → the worker restarts and the host re-reads the manifest.
- **Plugin UI:** save a `.tsx` file → esbuild rewrites `dist/ui/` → Paperclip reloads the UI bundle on its next mount. To get HMR during UI iteration, run `pnpm dev:ui` and point at the dev server with `devUiUrl` in your manifest while developing.
- **Without `pnpm dev`:** the watcher only fires on `dist/*` changes. If you stop the watch build, source edits do not reach Paperclip. Restart `pnpm dev` (or run `pnpm build` once) before expecting changes.
- **`node_modules`, `.git`, `.paperclip-sdk`, and other dotfolders are ignored.** Adding a dependency requires the new code to actually be imported and rebuilt before the worker sees it.
The server never compiles plugin source for you. The package's own build scripts own that step.
## Local path plugins vs npm packages
Both go through the same install endpoint, but they mean different things:
- **Local path plugins are trusted local code.** Paperclip executes worker code from disk under the same trust boundary as the rest of the running instance. This is meant for developing or operating a plugin against a checkout you control. There is no signature check, no sandboxing of worker code, and no provenance metadata beyond the path. Do not install local-path plugins you did not write.
- **npm packages are the deployable artifact.** `paperclipai plugin install @acme/plugin-foo` (optionally `--version 1.2.3`) installs from your configured npm registry, version-pins, and produces an install record that other operators can reproduce. Ship plugins this way.
When you are done iterating locally, publish the package and reinstall the npm-package form so the install reflects what you will ship.
## Common things to do next
- **Restart cleanly:** `paperclipai plugin disable <key>` pauses the plugin without removing it. `paperclipai plugin enable <key>` brings it back. `paperclipai plugin uninstall <key>` removes the install record; add `--force` to also purge plugin state and settings.
- **Browse examples:** `paperclipai plugin examples` lists the bundled example plugins that ship with the repo, each with a ready-to-run `paperclipai plugin install <path>` line.
- **Go deeper:** [`PLUGIN_AUTHORING_GUIDE.md`](./PLUGIN_AUTHORING_GUIDE.md) covers worker capabilities, managed agents/projects/routines/skills, plugin database namespaces, scoped API routes, and the shared UI components in `@paperclipai/plugin-sdk/ui`. [`PLUGIN_SPEC.md`](./PLUGIN_SPEC.md) is the longer-form specification, including future ideas that are not yet implemented.
- **Routine-first automation:** If your plugin should produce periodic issue work, prefer managed routines and `ctx.routines.managed` reconciliation over custom process loops or unobserved cron code.
## Troubleshooting
- **`Plugin install returned no plugin record` or `error` status.** Run `paperclipai plugin inspect <key>` for the last error. The most common causes are (1) the plugin has not built yet — run `pnpm dev` or `pnpm build` first, (2) the `paperclipPlugin` entries in `package.json` point at files that do not exist on disk, or (3) the manifest failed validation. The Paperclip server log has the full validation error.
- **Edits do not seem to reload.** Confirm `pnpm dev` is still running and writing to `dist/`. If you renamed entry files, update the `paperclipPlugin.manifest` / `paperclipPlugin.worker` / `paperclipPlugin.ui` fields in `package.json` so the watcher targets them.
- **Worker restarts but UI is stale.** Hard-reload the page. If you want HMR, run `pnpm dev:ui` and set `devUiUrl` in your manifest to `http://127.0.0.1:4177` during development.
- **Path arguments fail on Windows.** Quote paths that contain spaces, and prefer absolute paths over `~`-prefixed paths in non-bash shells.
+437 -30
View File
@@ -4,34 +4,33 @@ This guide describes the current, implemented way to create a Paperclip plugin i
It is intentionally narrower than [PLUGIN_SPEC.md](./PLUGIN_SPEC.md). The spec includes future ideas; this guide only covers the alpha surface that exists now.
> **New to plugins?** Start with the short [Local Plugin Development guide](./LOCAL_PLUGIN_DEVELOPMENT.md) — it walks the CLI happy path (`plugin init` → `pnpm dev` → `plugin install <path>`) end to end. Come back here for the full manifest surface, worker capabilities, and UI components.
## Current reality
- Treat plugin workers and plugin UI as trusted code.
- Plugin UI runs as same-origin JavaScript inside the main Paperclip app.
- Worker-side host APIs are capability-gated.
- Plugin UI is not sandboxed by manifest capabilities.
- There is no host-provided shared React component kit for plugins yet.
- Plugin database migrations are restricted to a host-derived plugin namespace.
- Plugin-managed surfaces are first-class records (agents, projects, routines, and
skills) rather than private plugin-only state.
- Plugin-owned JSON API routes must be declared in the manifest and are mounted
only under `/api/plugins/:pluginId/api/*`.
- The host provides a small shared React component kit through
`@paperclipai/plugin-sdk/ui`; use it for common Paperclip controls before
building custom versions.
- `ctx.assets` is not supported in the current runtime.
## Scaffold a plugin
Use the scaffold package:
Use the CLI scaffold command:
```bash
pnpm --filter @paperclipai/create-paperclip-plugin build
node packages/plugins/create-paperclip-plugin/dist/index.js @yourscope/plugin-name --output ./packages/plugins/examples
paperclipai plugin init @yourscope/plugin-name --output /absolute/path/to/plugin-repos
```
For a plugin that lives outside the Paperclip repo:
```bash
pnpm --filter @paperclipai/create-paperclip-plugin build
node packages/plugins/create-paperclip-plugin/dist/index.js @yourscope/plugin-name \
--output /absolute/path/to/plugin-repos \
--sdk-path /absolute/path/to/paperclip/packages/plugins/sdk
```
That creates a package with:
That creates `<output>/plugin-name/` with:
- `src/manifest.ts`
- `src/worker.ts`
@@ -42,11 +41,13 @@ That creates a package with:
Inside this monorepo, the scaffold uses `workspace:*` for `@paperclipai/plugin-sdk`.
Outside this monorepo, the scaffold snapshots `@paperclipai/plugin-sdk` from the local Paperclip checkout into a `.paperclip-sdk/` tarball so you can build and test a plugin without publishing anything to npm first.
Outside this monorepo, the scaffold snapshots `@paperclipai/plugin-sdk` from the local Paperclip checkout into a `.paperclip-sdk/` tarball so you can build and test a plugin without publishing anything to npm first. Pass `--sdk-path /absolute/path/to/paperclip/packages/plugins/sdk` if you have more than one Paperclip checkout.
## Recommended local workflow
## Local development workflow
From the generated plugin folder:
See the short [Local Plugin Development guide](./LOCAL_PLUGIN_DEVELOPMENT.md) for the full happy path (`pnpm dev``paperclipai plugin install <absolute-path>``paperclipai plugin list`) and reload semantics.
Minimum verification from the generated plugin folder:
```bash
pnpm install
@@ -55,16 +56,6 @@ pnpm test
pnpm build
```
For local development, install it into Paperclip from an absolute local path through the plugin manager or API. The server supports local filesystem installs and watches local-path plugins for file changes so worker restarts happen automatically after rebuilds.
Example:
```bash
curl -X POST http://127.0.0.1:3100/api/plugins/install \
-H "Content-Type: application/json" \
-d '{"packageName":"/absolute/path/to/your-plugin","isLocalPath":true}'
```
## Supported alpha surface
Worker:
@@ -77,11 +68,15 @@ Worker:
- secrets
- activity
- state
- database namespace via `ctx.db`
- scoped JSON API routes declared with `apiRoutes`
- entities
- projects and project workspaces
- projects, project workspaces, and plugin-managed projects
- companies
- issues and comments
- agents and agent sessions
- issues, comments, namespaced `plugin:<pluginKey>` origins, blocker relations, checkout assertions, assignment wakeups, and orchestration summaries
- agents, plugin-managed agents, and agent sessions
- plugin-managed routines
- plugin-managed skills
- goals
- data/actions
- streams
@@ -89,6 +84,232 @@ Worker:
- metrics
- logger
### Plugin database declarations
First-party or otherwise trusted orchestration plugins can declare:
```ts
database: {
migrationsDir: "migrations",
coreReadTables: ["issues"],
}
```
Required capabilities are `database.namespace.migrate` and
`database.namespace.read`; add `database.namespace.write` for runtime mutations.
The host derives `ctx.db.namespace`, runs SQL files in filename order before the
worker starts, records checksums in `plugin_migrations`, and rejects changed
already-applied migrations.
Migration SQL may create or alter objects only inside `ctx.db.namespace`. It may
reference whitelisted `public` core tables for foreign keys or read-only views,
but may not mutate/alter/drop/truncate public tables, create extensions,
triggers, untrusted languages, or runtime multi-statement SQL. Runtime
`ctx.db.query()` is restricted to `SELECT`; runtime `ctx.db.execute()` is
restricted to namespace-local `INSERT`, `UPDATE`, and `DELETE`.
### Scoped plugin API routes
Plugins can expose JSON-only routes under their own namespace:
```ts
apiRoutes: [
{
routeKey: "initialize",
method: "POST",
path: "/issues/:issueId/smoke",
auth: "board-or-agent",
capability: "api.routes.register",
checkoutPolicy: "required-for-agent-in-progress",
companyResolution: { from: "issue", param: "issueId" },
},
]
```
The host resolves the plugin, checks that it is ready, enforces
`api.routes.register`, matches the declared method/path, resolves company access,
and applies checkout policy before dispatching to the worker's `onApiRequest`
handler. The worker receives sanitized headers, route params, query, parsed JSON
body, actor context, and company id. Do not use plugin routes to claim core
paths; they always remain under `/api/plugins/:pluginId/api/*`.
## Managed Paperclip resources
Plugins that provide durable Paperclip business objects should declare them in
the manifest and let the host create or relink the actual records per company.
Do this for plugin-owned agents, projects, routines, and skills.
Do not hide long-lived work behind private plugin state when it should be visible
to the board, scoped to a company, audited, budgeted, and assigned like normal
Paperclip work.
Content-oriented plugins, such as LLM Wiki-style ingestion or durable knowledge
systems, should use the same pattern: managed projects for operation issues,
managed agents plus managed skills for LLM work, and managed routines for
ingest, lint, refresh, or maintenance runs.
Use these surfaces:
- Managed agents: declare top-level `agents[]` and require
`agents.managed`. Use this when the plugin provides a named worker the board
should see in the org, budget, pause, invoke, and inspect. Managed agents are
normal Paperclip agents with plugin ownership metadata, not background plugin
workers.
- Managed projects: declare top-level `projects[]` and require
`projects.managed`. Use this when the plugin needs a stable company-scoped
project for its issues, routines, or workspace-oriented UI. Keep plugin work
in a project instead of scattering generated issues across unrelated projects.
- Managed routines: declare top-level `routines[]` and require
`routines.managed`. Use this for scheduled, webhook, or manually triggered
jobs that should create visible Paperclip issues. Prefer managed routines over
plugin `jobs[]` for recurring business work; plugin jobs are for plugin
runtime maintenance that does not need a board-visible task trail.
- Managed skills: declare top-level `skills[]` and require `skills.managed`.
Use this for reusable plugin capabilities that should be surfaced to operators and
synced into Paperclip managed agents.
Managed resources are resolved by stable plugin keys, not hardcoded database
ids. In a worker action or data handler, call `ctx.agents.managed.reconcile()`,
`ctx.projects.managed.reconcile()`, `ctx.routines.managed.reconcile()`, and
`ctx.skills.managed.reconcile()` for
the current `companyId`. `reconcile()` creates the missing resource, relinks a
recoverable binding, or returns the existing resource. `reset()` reapplies the
manifest defaults when the operator wants to restore the plugin's suggested
configuration.
Declare dependencies between managed resources with refs. A routine can point
at a managed agent through `assigneeRef` and at a managed project through
`projectRef`. Reconcile the referenced agent and project before reconciling the
routine; if a ref is still missing, the routine resolution reports
`missing_refs` instead of guessing.
```ts
import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
const manifest: PaperclipPluginManifestV1 = {
id: "example.research-plugin",
apiVersion: 1,
version: "0.1.0",
displayName: "Research Plugin",
description: "Creates a managed research agent and scheduled research routine.",
author: "Example",
categories: ["automation"],
capabilities: [
"agents.managed",
"projects.managed",
"routines.managed",
"skills.managed",
"instance.settings.register",
],
entrypoints: {
worker: "./dist/worker.js",
ui: "./dist/ui",
},
agents: [
{
agentKey: "researcher",
displayName: "Researcher",
role: "research",
title: "Research Agent",
capabilities: "Runs recurring research briefs for this company.",
adapterPreference: ["codex_local", "claude_local", "process"],
instructions: {
content: "Follow the Paperclip heartbeat and produce concise research briefs.",
},
},
],
projects: [
{
projectKey: "research",
displayName: "Research",
description: "Recurring research work created by the Research Plugin.",
status: "in_progress",
},
],
routines: [
{
routineKey: "weekly-brief",
title: "Weekly research brief",
description: "Create a short research brief for the board.",
assigneeRef: { resourceKind: "agent", resourceKey: "researcher" },
projectRef: { resourceKind: "project", resourceKey: "research" },
priority: "medium",
triggers: [
{
kind: "schedule",
label: "Monday morning",
cronExpression: "0 9 * * 1",
timezone: "America/Chicago",
enabled: false,
},
],
},
],
skills: [
{
skillKey: "weekly-brief-skills",
displayName: "Weekly Briefer",
description: "Reusable skill for the managed research workflow.",
},
],
ui: {
slots: [
{
type: "settingsPage",
id: "settings",
displayName: "Research",
exportName: "SettingsPage",
},
],
},
};
export default manifest;
```
In the worker, expose a small setup action or settings-page action that
reconciles the resources for the selected company:
```ts
import { definePlugin } from "@paperclipai/plugin-sdk";
export default definePlugin({
setup(ctx) {
ctx.actions.register("setup-company", async (params) => {
const companyId = String(params.companyId ?? "");
if (!companyId) throw new Error("companyId is required");
const project = await ctx.projects.managed.reconcile("research", companyId);
const agent = await ctx.agents.managed.reconcile("researcher", companyId);
const routine = await ctx.routines.managed.reconcile("weekly-brief", companyId);
const skill = await ctx.skills.managed.reconcile("weekly-brief-skills", companyId);
return { project, agent, routine, skill };
});
},
});
```
Authoring rules:
- Keep keys stable once published. Renaming `agentKey`, `projectKey`,
`routineKey`, or `skillKey` creates a new managed resource from the host's
point of view.
- Use managed agents for plugin-provided labor. Use `ctx.agents.invoke()` or
`ctx.agents.sessions` only after you have a real agent id, either selected by
the operator or resolved from `ctx.agents.managed`.
- Use managed routines for recurring or externally triggered work that should
produce tasks. Schedule, webhook, and API triggers are visible routine
triggers, and each run has the normal Paperclip issue/audit trail.
- Use managed skills for reusable operator-visible capabilities that are shared
by managed agents. Reconcile skill declarations by `skillKey` and keep the
declared skill markdown and files in sync with agent behavior.
- Use managed projects to keep plugin-generated work organized and to give
project-scoped plugin UI a stable home. For filesystem access inside a
project, still resolve project workspaces through `ctx.projects`.
- Keep defaults conservative. Managed declarations are suggestions owned by the
plugin, but the resulting resources are normal Paperclip records that the
operator can inspect, pause, and adjust.
UI:
- `usePluginData`
@@ -104,6 +325,7 @@ Mount surfaces currently wired in the host include:
- `settingsPage`
- `dashboardWidget`
- `sidebar`
- `routeSidebar`
- `sidebarPanel`
- `detailTab`
- `taskDetailView`
@@ -114,6 +336,191 @@ Mount surfaces currently wired in the host include:
- `commentAnnotation`
- `commentContextMenuItem`
## Shared host components
Use shared components from `@paperclipai/plugin-sdk/ui` when the plugin needs a
Paperclip-native control. The host owns the implementation, so plugins inherit
the board's current styling, ordering, recent selections, and dark-mode behavior
without importing `ui/src` internals.
Prefer shared components for common Paperclip UX patterns to reduce drift and
deprecation risk, especially for task/assignment flows and routine or sidebar-like
plugin screens.
Currently exposed components include:
- `MarkdownBlock` and `MarkdownEditor` for rendered and editable markdown.
- `FileTree` for serializable file and directory trees.
- `IssuesList` for a native company-scoped issue table.
- `AssigneePicker` for the same agent/user selector used in the new issue pane.
Use the controlled `value` format `agent:<id>`, `user:<id>`, or `""`.
- `ProjectPicker` for the same project selector used in the new issue pane.
Use the controlled project id value, or `""` for no project.
- `ManagedRoutinesList` for plugin-owned routine settings pages.
```tsx
import { AssigneePicker, ProjectPicker } from "@paperclipai/plugin-sdk/ui";
export function PluginAssignmentControls({ companyId }: { companyId: string }) {
const [assignee, setAssignee] = useState("");
const [projectId, setProjectId] = useState("");
return (
<>
<AssigneePicker
companyId={companyId}
value={assignee}
onChange={(value) => setAssignee(value)}
/>
<ProjectPicker
companyId={companyId}
value={projectId}
onChange={setProjectId}
/>
</>
);
}
```
## File and path UI
Plugin UI often needs to render a file tree, accept a folder path, or browse a
project workspace. There are three different surfaces for that, and they map to
different trust and data-flow boundaries. Pick the surface that matches the
data the plugin actually has.
### When to use the shared `FileTree`
Use `FileTree` from `@paperclipai/plugin-sdk/ui` whenever the plugin only needs
to render a serializable file/directory list and react to selection or
expand/collapse. The host owns the implementation, so plugin UI inherits the
board's icons, indent, focus ring, and dark-mode styling without importing host
internals.
```tsx
import {
FileTree,
type FileTreeNode,
} from "@paperclipai/plugin-sdk/ui";
const nodes: FileTreeNode[] = [
{ name: "AGENTS.md", path: "AGENTS.md", kind: "file", children: [] },
{
name: "wiki",
path: "wiki",
kind: "dir",
children: [
{ name: "index.md", path: "wiki/index.md", kind: "file", children: [] },
],
},
];
export function WikiTree() {
const [expanded, setExpanded] = useState<Set<string>>(() => new Set(["wiki"]));
const [selected, setSelected] = useState<string | null>(null);
return (
<FileTree
nodes={nodes}
selectedFile={selected}
expandedPaths={expanded}
onSelectFile={(path) => setSelected(path)}
onToggleDir={(path) =>
setExpanded((current) => {
const next = new Set(current);
next.has(path) ? next.delete(path) : next.add(path);
return next;
})
}
/>
);
}
```
Good fits:
- LLM Wiki page navigation in `packages/plugins/plugin-llm-wiki` builds a
`FileTreeNode[]` from worker query results and renders it through `FileTree`.
- The example `plugin-file-browser-example` lazily fetches a directory's
children through a `loadFileList` action when `onToggleDir` fires, then
merges the children into the local tree state — letting the shared component
handle rendering and selection.
Boundary rules:
- Keep the prop surface serializable (`nodes`, `expandedPaths`, `checkedPaths`,
`fileBadges`, `fileTones`). Do not pass arbitrary render functions across the
plugin/host boundary in v1; the supported escape hatches are
`fileBadges` (status pill keyed by path) and `fileTones` (row tone keyed by
path).
- Do not import the host's `FileTree.tsx` or any `ui/src/*` module. The SDK
declaration is the only supported import path for plugin UI.
- The shared `FileTree` is for rendering and selection. Plugin-specific editors,
ingest flows, query forms, and lint runs stay inside the plugin and do not
belong as `FileTree` props.
### When to declare `localFolders`
When the plugin needs operator-configured filesystem roots — typically for
trusted local plugins like wiki tooling — declare `localFolders[]` on the
manifest and add the `local.folders` capability. The host renders a settings
surface for the operator to set the absolute path, validates the path
server-side (containment, symlinks, required files/directories), and exposes
`ctx.localFolders.readText()` and `ctx.localFolders.writeTextAtomic()` in the
worker.
```ts
export const manifest = {
capabilities: ["local.folders"],
localFolders: [
{
folderKey: "content-root",
displayName: "Content root",
access: "readWrite",
requiredDirectories: ["sources", "pages"],
requiredFiles: ["schema.md"],
},
],
};
```
Use this when:
- The data lives outside any project workspace.
- Reads and writes need company-scoped configuration.
- The operator picks the path once in plugin settings and the worker resolves
files relative to that root.
Do not use `localFolders` to grant the UI direct browser-side access to the
filesystem — there is no such capability. The browser still goes through the
worker via `getData` / `performAction`, and the worker only exposes paths it
chose to expose.
### When to keep worker-mediated project workspace browsing
When the data lives inside an existing project workspace, keep the browsing
flow worker-mediated:
- The worker uses `ctx.projects.listWorkspaces()` to resolve the workspace
path, then reads its filesystem with normal Node APIs.
- The plugin UI calls a `getData` handler for the root listing and an action
for lazy children, then renders them through `FileTree`.
- The worker is the only side that touches the disk. The browser receives a
serializable tree and never sees raw absolute paths it can replay.
The example `plugin-file-browser-example` is the reference for this pattern:
the worker registers `fileList` (data) and `loadFileList` (action) over the
same handler, and the UI uses the action for on-toggle directory loading so the
shared `FileTree` stays the rendering surface.
### Mixing surfaces
A single plugin can use more than one of these. The LLM Wiki uses
`localFolders` for its content root, then renders the resulting page list
through `FileTree`. The file browser example uses `ctx.projects.listWorkspaces`
to pick a workspace and renders its on-disk tree through `FileTree` with lazy
loading. Pick the boundary per data source, not per plugin.
## Company routes
Plugins may declare a `page` slot with `routePath` to own a company route like:
+167 -6
View File
@@ -27,7 +27,10 @@ Current limitations to keep in mind:
- Published npm packages are the intended install artifact for deployed plugins.
- The repo example plugins under `packages/plugins/examples/` are development conveniences. They work from a source checkout and should not be assumed to exist in a generic published build unless they are explicitly shipped with that build.
- Dynamic plugin install is not yet cloud-ready for horizontally scaled or ephemeral deployments. There is no shared artifact store, install coordination, or cross-node distribution layer yet.
- The current runtime does not yet ship a real host-provided plugin UI component kit, and it does not support plugin asset uploads/reads. Treat those as future-scope ideas in this spec, not current implementation promises.
- The current runtime ships a small host-provided plugin UI component kit through `@paperclipai/plugin-sdk/ui`, but does not support plugin asset uploads/reads yet. Treat plugin asset APIs as future-scope ideas, not current implementation promises.
- Scoped plugin API routes are JSON-only and must be declared in `apiRoutes`.
They mount under `/api/plugins/:pluginId/api/*`; plugins cannot shadow core
API routes.
In practice, that means the current implementation is a good fit for local development and self-hosted persistent deployments, but not yet for multi-instance cloud plugin distribution.
@@ -316,7 +319,10 @@ export interface PaperclipPluginManifestV1 {
version: string;
displayName: string;
description: string;
author: string;
categories: Array<"connector" | "workspace" | "automation" | "ui">;
minimumHostVersion?: string;
/** @deprecated Use `minimumHostVersion` instead. Retained for backwards compatibility. */
minimumPaperclipVersion?: string;
capabilities: string[];
entrypoints: {
@@ -332,15 +338,42 @@ export interface PaperclipPluginManifestV1 {
description: string;
parametersSchema: JsonSchema;
}>;
database?: PluginDatabaseDeclaration;
apiRoutes?: PluginApiRouteDeclaration[];
environmentDrivers?: PluginEnvironmentDriverDeclaration[];
agents?: PluginManagedAgentDeclaration[];
projects?: PluginManagedProjectDeclaration[];
routines?: PluginManagedRoutineDeclaration[];
skills?: PluginManagedSkillDeclaration[];
localFolders?: PluginLocalFolderDeclaration[];
/** Legacy top-level launcher declarations. Prefer `ui.launchers` for new manifests. */
launchers?: PluginLauncherDeclaration[];
ui?: {
launchers?: PluginLauncherDeclaration[];
slots: Array<{
type: "page" | "detailTab" | "dashboardWidget" | "sidebar" | "settingsPage";
type: "page"
| "detailTab"
| "taskDetailView"
| "dashboardWidget"
| "sidebar"
| "routeSidebar"
| "sidebarPanel"
| "projectSidebarItem"
| "globalToolbarButton"
| "toolbarButton"
| "contextMenuItem"
| "commentAnnotation"
| "commentContextMenuItem"
| "settingsPage"
| "companySettingsPage";
id: string;
displayName: string;
/** Which export name in the UI bundle provides this component */
exportName: string;
/** For detailTab: which entity types this tab appears on */
entityTypes?: Array<"project" | "issue" | "agent" | "goal" | "run">;
/** For page and companySettingsPage: single route segment */
routePath?: string;
}>;
};
}
@@ -351,10 +384,17 @@ Rules:
- `id` must be globally unique
- `id` should normally equal the npm package name
- `apiVersion` must match the host-supported plugin API version
- `minimumHostVersion` is preferred, with `minimumPaperclipVersion` retained for
backwards compatibility
- `capabilities` must be static and install-time visible
- config schema must be JSON Schema compatible
- `entrypoints.ui` points to the directory containing the built UI bundle
- `ui.slots` declares which extension slots the plugin fills, so the host knows what to mount without loading the bundle eagerly; each slot references an `exportName` from the UI bundle
- declare managed declarations with the matching `*.managed` capability:
- `agents``agents.managed`
- `projects``projects.managed`
- `routines``routines.managed`
- `skills``skills.managed`
## 11. Agent Tools
@@ -624,7 +664,62 @@ Required SDK clients:
Plugins that need filesystem, git, terminal, or process operations handle those directly using standard Node APIs or libraries. The host provides project workspace metadata through `ctx.projects` so plugins can resolve workspace paths, but the host does not proxy low-level OS operations.
## 14.1 Example SDK Shape
## 14.1 Issue Orchestration APIs
Trusted orchestration plugins can create and update Paperclip issues through `ctx.issues` instead of importing server internals. The public issue contract includes parent/project/goal links, board or agent assignees, blocker IDs, labels, billing code, request depth, execution workspace inheritance, and plugin origin metadata.
Plugins that perform durable work should declare managed Paperclip resources rather than using private plugin state:
- `agents` + `ctx.agents.managed.*` for named, invokable operators (`agents.managed` required)
- `projects` + `ctx.projects.managed.*` for stable, scoped issue/workspace ownership (`projects.managed` required)
- `routines` + `ctx.routines.managed.*` for schedule/webhook/manual execution with issue trails (`routines.managed` required)
- `skills` + `ctx.skills.managed.*` for reusable agent capabilities (`skills.managed` required)
The LLM Wiki plugin is the current reference for this pattern: it declares managed
agents, projects, routines, and skills in manifest, reconciles them per company,
and uses managed routines for periodic wiki maintenance and ingest operations.
Content-oriented plugins should follow the same model instead of running
unmanaged background loops: make the LLM-facing worker an operator-visible
managed agent, attach reusable prompt/tool guidance as managed skills, keep
operation issues in a managed project, and drive recurring work through managed
routines.
Origin rules:
- Built-in core issues keep built-in origins such as `manual` and `routine_execution`.
- Plugin-managed issues use `plugin:<pluginKey>` or a sub-kind such as `plugin:<pluginKey>:feature`.
- The host derives the default plugin origin from the installed plugin key and rejects attempts to set `plugin:<otherPluginKey>` origins.
- `originId` is plugin-defined and should be stable for idempotent generated work.
Relation and read helpers:
- `ctx.issues.relations.get(issueId, companyId)`
- `ctx.issues.relations.setBlockedBy(issueId, blockerIssueIds, companyId)`
- `ctx.issues.relations.addBlockers(issueId, blockerIssueIds, companyId)`
- `ctx.issues.relations.removeBlockers(issueId, blockerIssueIds, companyId)`
- `ctx.issues.getSubtree(issueId, companyId, options)`
- `ctx.issues.summaries.getOrchestration({ issueId, companyId, includeSubtree, billingCode })`
Governance helpers:
- `ctx.issues.assertCheckoutOwner({ issueId, companyId, actorAgentId, actorRunId })` lets plugin actions preserve agent-run checkout ownership.
- `ctx.issues.requestWakeup(issueId, companyId, options)` requests assignment wakeups through host heartbeat semantics, including terminal-status, blocker, assignee, and budget hard-stop checks.
- `ctx.issues.requestWakeups(issueIds, companyId, options)` applies the same host-owned wakeup semantics to a batch and may use an idempotency key prefix for stable coordinator retries.
Plugin-originated issue, relation, document, comment, and wakeup mutations must write activity entries with `actorType: "plugin"` and details fields for `sourcePluginId`, `sourcePluginKey`, `initiatingActorType`, `initiatingActorId`, and `initiatingRunId` when a user or agent run initiated the plugin work.
Scoped API routes:
- `apiRoutes[]` declares `routeKey`, `method`, plugin-local `path`, `auth`,
`capability`, optional checkout policy, and company resolution.
- The host enforces auth, company access, `api.routes.register`, route matching,
and checkout policy before worker dispatch.
- The worker implements `onApiRequest(input)` and returns a JSON response shape
`{ status?, headers?, body? }`.
- Only safe request headers are forwarded; auth/cookie headers are never passed
to the worker.
## 14.2 Example SDK Shape
```ts
/** Top-level helper for defining a plugin with type checking */
@@ -696,20 +791,46 @@ The host enforces capabilities in the SDK layer and refuses calls outside the gr
- `project.workspaces.read`
- `issues.read`
- `issue.comments.read`
- `issue.documents.read`
- `issue.relations.read`
- `issue.subtree.read`
- `agents.read`
- `goals.read`
- `activity.read`
- `costs.read`
- `issues.orchestration.read`
- `database.namespace.read`
### Data Write
- `issues.create`
- `issues.update`
- `issue.comments.create`
- `assets.write`
- `assets.read`
- `issue.interactions.create`
- `issue.documents.write`
- `issue.relations.write`
- `issues.checkout`
- `issues.wakeup`
- `activity.log.write`
- `metrics.write`
- `telemetry.track`
- `assets.read`
- `assets.write`
- `database.namespace.migrate`
- `database.namespace.write`
- `goals.create`
- `goals.update`
- `projects.managed`
- `routines.managed`
- `skills.managed`
- `agents.managed`
- `agents.pause`
- `agents.resume`
- `agents.invoke`
- `agent.sessions.create`
- `agent.sessions.list`
- `agent.sessions.send`
- `agent.sessions.close`
### Plugin State
@@ -722,8 +843,10 @@ The host enforces capabilities in the SDK layer and refuses calls outside the gr
- `events.emit`
- `jobs.schedule`
- `webhooks.receive`
- `local.folders`
- `http.outbound`
- `secrets.read-ref`
- `environment.drivers.register`
### Agent Tools
@@ -736,6 +859,7 @@ The host enforces capabilities in the SDK layer and refuses calls outside the gr
- `ui.page.register`
- `ui.detailTab.register`
- `ui.dashboardWidget.register`
- `ui.commentAnnotation.register`
- `ui.action.register`
## 15.2 Forbidden Capabilities
@@ -772,6 +896,13 @@ Minimum event set:
- `issue.created`
- `issue.updated`
- `issue.comment.created`
- `issue.document.created`
- `issue.document.updated`
- `issue.document.deleted`
- `issue.relations.updated`
- `issue.checked_out`
- `issue.released`
- `issue.assignment_wakeup_requested`
- `agent.created`
- `agent.updated`
- `agent.status_changed`
@@ -781,6 +912,8 @@ Minimum event set:
- `agent.run.cancelled`
- `approval.created`
- `approval.decided`
- `budget.incident.opened`
- `budget.incident.resolved`
- `cost_event.created`
- `activity.logged`
@@ -835,6 +968,7 @@ Job rules:
3. The host prevents overlapping execution of the same plugin/job combination unless explicitly allowed later.
4. Every job run is recorded in Postgres.
5. Failed jobs are retryable.
6. For recurring business workflows that should create visible Paperclip work, prefer managed routines and managed resources over jobs. Jobs remain useful for private plugin-runtime maintenance tasks.
## 18. Webhooks
@@ -917,13 +1051,23 @@ export function DashboardWidget({ context }: PluginWidgetProps) {
The SDK includes a `ui` subpath export that plugin frontends import. This subpath provides:
- **Bridge hooks**: `usePluginData(key, params)`, `usePluginAction(key)`, `useHostContext()`
- **Bridge hooks**: `usePluginData(key, params)`, `usePluginAction(key)`, `useHostContext()`, `useHostNavigation()`
- **Design tokens**: colors, spacing, typography, shadows matching the host theme
- **Shared components**: `MetricCard`, `StatusBadge`, `DataTable`, `LogView`, `ActionBar`, `Spinner`, etc.
- **Type definitions**: `PluginPageProps`, `PluginWidgetProps`, `PluginDetailTabProps`
Plugins are encouraged but not required to use the shared components. A plugin may render entirely custom UI as long as it communicates through the bridge.
`useHostNavigation()` is the supported way for plugin UI to navigate to
Paperclip-internal pages. It exposes `resolveHref(to)`, `navigate(to,
options?)`, and `linkProps(to, options?)`. Plugin links should prefer
`linkProps()` so anchors keep real `href` values for copy-link, modifier-click,
middle-click, and open-in-new-tab behavior while plain left-clicks route through
the host SPA router. The host resolves company-scoped paths against the active
company prefix without double-prefixing already-prefixed paths. Plugin UI should
not use raw same-origin `href`s or `window.location.assign()` for internal
Paperclip navigation because those can force a full document reload.
### 19.0.2 Bundle Isolation
Plugin UI bundles are loaded as standard ES modules, not iframed. This gives plugins full rendering performance and access to the host's design tokens.
@@ -1003,6 +1147,11 @@ The host SDK ships shared components that plugins can import to quickly build UI
| `LogView` | Scrollable log output with timestamps | Webhook deliveries, job output, process logs |
| `JsonTree` | Collapsible JSON tree for debugging | Raw API responses, plugin state inspection |
| `Spinner` | Loading indicator | Data fetch states |
| `FileTree` | Host-styled file/directory tree | Wiki pages, workspace files, import previews |
| `IssuesList` | Host issue list | Plugin pages that need a native issue view |
| `AssigneePicker` | Host assignee picker for agents and board users | Creating issues, assigning routines, filtering work |
| `ProjectPicker` | Host project picker | Creating issues, scoping dashboards, filtering work |
| `ManagedRoutinesList` | Host routine list | Plugin settings pages that manage routines |
Plugins may also use entirely custom components. The shared components exist to reduce boilerplate and keep visual consistency, not to limit what plugins can render.
@@ -1060,6 +1209,8 @@ For plugins that need richer settings UX beyond what JSON Schema can express, th
Both approaches coexist: a plugin can use the auto-generated form for simple config and add a custom settings page slot for advanced configuration or operational dashboards.
For plugins that need a company-scoped settings surface, declare a `companySettingsPage` slot with a `routePath`. The host renders a sidebar item under Company Settings and mounts the component at `/:companyPrefix/company/settings/:routePath`. The page receives `companyId` and `companyPrefix` in its host context. Core settings routes such as `access`, `invites`, `environments`, and `secrets` are reserved and cannot be shadowed by plugin declarations.
## 20. Local Tooling
Plugins that need filesystem, git, terminal, or process operations implement those directly. The host does not wrap or proxy these operations.
@@ -1238,6 +1389,8 @@ Plugin-originated mutations should write:
- `actor_type = plugin`
- `actor_id = <plugin-id>`
- details include `sourcePluginId` and `sourcePluginKey`
- details include `initiatingActorType`, `initiatingActorId`, and `initiatingRunId` when a user or agent run triggered the plugin work
## 21.5 Plugin Migrations
@@ -1307,6 +1460,14 @@ Each plugin may expose a company-context main page:
This page is where board users do most day-to-day work.
## 24.4 Company Settings Plugin Page
Each ready plugin may expose a company settings page:
- `/:companyPrefix/company/settings/:routePath`
The host adds a matching Company Settings sidebar item using the slot `displayName`. Plugin settings route segments are single-segment slugs and must not collide with core company settings pages.
## 25. Uninstall And Data Lifecycle
When a plugin is uninstalled, the host must handle plugin-owned data explicitly.

Some files were not shown because too many files have changed in this diff Show More