paperclip

Author	SHA1	Message	Date
Devin Foley	aea35fe695	exe.dev config UX: advanced-options disclosure, form-default fix, SSH key handling (PAPA-407) (#7025 ) ## Thinking Path > - Paperclip orchestrates AI agents and provisions sandboxed execution environments for them; one of those provisioners is the exe.dev plugin, which runs each agent inside a long-lived VM reached over SSH. > - The instance-config form for that plugin is rendered generically by `JsonSchemaForm` from the plugin's `instanceConfigSchema`, so any UX problem with the form is split between the shared form component and the plugin's schema/runtime code. > - Users coming in cold hit a 12-field flat config they couldn't reason about (PAPA-407), a form that silently submitted `cpu: 0` for untouched optional fields (PAPA-407 root cause), a `sshPrivateKey` textarea that truncated RSA-4096 keys at 4096 chars (PAPA-449), a save flow that accepted clearly-malformed keys and only blew up at lease time with raw SSH stderr (PAPA-450, PAPA-451), and a manifest that didn't distinguish "essential" from "advanced" knobs (PAPA-410 / PAPA-411 — duplicate sub-issues with identical scope; PAPA-418 reconciliation kept PAPA-410 canonical). > - These problems all point at the same surface (exe.dev sandbox config) and are tightly coupled in code — PAPA-449/450/451 patch fields that PAPA-410/411 introduce — so they get reviewed together. > - This pull request lands the shared-form changes (advanced-options disclosure, optional-scalar defaults) and the exe.dev-specific changes (manifest restructure, longer `maxLength`, stderr translation, save-time key validation) as five focused commits stacked on `master`. > - The benefit is a config form that defaults to the two fields a new user actually needs (API key + SSH private key) with a collapsible disclosure for the rest, no silent truncation or zero-default submissions, and SSH key problems surfaced at save time with actionable messages instead of cryptic post-provision failures. ## What Changed - JsonSchemaForm advanced-options disclosure (PAPA-410, PAPA-411 — same scope, see note above): adds `x-paperclip-advanced` / `x-paperclip-group` schema annotations and renders flagged fields behind a collapsible "Advanced options" disclosure that auto-opens when a hidden field has a validation error. Exe.dev manifest is restructured to use the new annotations, so essentials (`apiKey`, `sshPrivateKey`) show by default while the long tail of optional knobs is grouped under "SSH access" / "VM resources" / "More options" headings. - Omit optional scalar defaults (PAPA-407): `getDefaultForSchema` no longer materialises `0` / `""` for optional `number`/`integer`/`string`/`secret-ref` fields without an explicit `default`. Object recursion drops properties whose default is `undefined`. Fields that declare a `default` (e.g. `sshPort: 22`) still round-trip. Adds a regression test against `getDefaultValues`. - Raise `sshPrivateKey` `maxLength` (PAPA-449): bumps the exe.dev manifest cap from 4096 to 8192 so RSA-4096 OpenSSH private keys (which can exceed 4 KB with comments/metadata) aren't silently truncated at submit. - Translate `invalid format` SSH stderr (PAPA-450): `formatSshFailure` now recognises `Load key … invalid format` in combined stderr/stdout and returns a specific message naming the key-format problem ("isn't an OpenSSH/PEM private key — confirm the secret starts with `-----BEGIN … PRIVATE KEY-----` and isn't the `.pub` or a PuTTY `.ppk` export") instead of dumping the raw stderr. - Save-time SSH key validation (PAPA-451): `onEnvironmentValidateConfig` inline-parses `sshPrivateKey` and rejects common failure modes — pasted public keys, PuTTY `.ppk` format, missing `-----END-----` footer, non-base64 body — so the form surfaces an inline error before any VM is provisioned. Secret-ref bindings (UUIDs) are still passed through unchanged. ## Verification CI gates (`pnpm typecheck`, `pnpm test`, the targeted vitest suites below) all pass. Run locally: ```bash # Shared form pnpm --filter @paperclipai/ui exec vitest run src/components/JsonSchemaForm # 9 tests pass — includes the new "omits optional scalar fields" regression # and the three advanced-options-disclosure tests. # exe.dev plugin cd packages/plugins/sandbox-providers/exe-dev && pnpm test # 32 tests pass — includes the new sshPrivateKey-validation cases # and the new "invalid format" stderr-translation case. ``` Manual smoke (after reinstalling the plugin so the DB manifest refreshes): 1. Open the exe.dev environment config page. Default view shows API Key + SSH Private Key only, with an "Advanced options" disclosure for everything else (PAPA-410 / PAPA-411). 2. Paste a `.pub` file's contents into SSH Private Key, click Save. Inline error rejecting the wrong-format key (PAPA-451). 3. Re-paste a valid OpenSSH/PEM private key longer than 4096 bytes — saves cleanly (PAPA-449). 4. Save the form with everything optional left blank — server no longer rejects with `"cpu must be greater than 0 when provided"` (PAPA-407). 5. Force a bad key through via a stored secret-ref binding and lease a VM — failure message names the key-format problem instead of dumping raw SSH stderr (PAPA-450). ## Risks - PAPA-410 / PAPA-411 manifest restructure is the largest surface here. Schemas using `x-paperclip-` extensions are forward-compatible with stricter JSON Schema validators (extensions are ignored by default), and the form gracefully renders a flat layout when no field opts in. - PAPA-407* changes form-default behaviour: optional scalar fields that previously round-tripped as `""` / `0` will now be `undefined` and absent from the submitted payload. Downstream consumers that expected the empty-string/zero shape need to treat the field as optional. Spot-checked the existing exe.dev driver — it already uses `parseOptionalString` / `parseOptionalInteger`, which treat missing fields as `null` rather than `0`/`""`. - PAPA-451 adds a save-time check, so a previously-saved-but-malformed `sshPrivateKey` raw value will now fail to re-save. Bound secret-refs are unaffected, matching how the user reaches the bad-key state today (via the secrets picker). - PAPA-449 simply raises a cap; no semantic risk. - PAPA-450 only kicks in on the "invalid format" code path; existing onboarding-marker branch is untouched. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (`claude-opus-4-7`) - Capabilities used: code reading, code editing, test execution, git/PR mechanics, Paperclip API for issue coordination ## Checklist - [x] PR body sections present (Thinking Path, What Changed, Verification, Risks, Model Used, Checklist) - [x] Unit tests added for the new behaviours (JsonSchemaForm default-value omission + advanced disclosure; exe.dev plugin validation + stderr translation) - [x] Existing tests still pass locally (`vitest run` on both packages) - [x] No raw secrets, IP addresses, or machine-local config in commits or PR body - [x] Commits are atomic per linked issue (PAPA-410 / PAPA-411, PAPA-407, PAPA-449, PAPA-450, PAPA-451) - [x] Branch is up-to-date with `origin/master` --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-29 18:19:37 -07:00
Dotta	5153b01ada	[codex] Add Claude model refresh (#6953 ) ## Thinking Path > - Paperclip orchestrates AI-agent companies through adapter-backed local and external runtimes. > - The agent configuration UI lets operators choose adapter models and refresh model lists when adapters support live discovery. > - Codex already had a live refresh path, but Claude Local only exposed static fallback models and the UI hid the refresh action for Claude. > - A newly available Claude Opus model should not require a code release every time the model catalog changes. > - This pull request adds Anthropic model discovery for Claude Local, keeps the static fallback current with Claude Opus 4.8, and exposes the existing refresh button in the Claude Local dropdown. > - The benefit is that operators can refresh Claude models from the same model selector flow they already use for Codex. ## What Changed - Added `claude-opus-4-8` to the Claude Local fallback model list. - Added Claude model discovery through Anthropic-compatible `GET /v1/models` when `ANTHROPIC_API_KEY` is available. - Added normal cache reuse, forced refresh support, a SHA-256-based API-key fingerprint for cache keys, and warning logging for discovery errors before fallback. - Wired `claude_local.refreshModels` into the server adapter registry. - Enabled the existing `Refresh models` dropdown action for `claude_local` in `AgentConfigForm`. - Added tests for Claude fallback, live discovery, API-failure fallback, forced refresh, and the UI refresh-button gate. ## Verification - `pnpm exec vitest run server/src/__tests__/adapter-models.test.ts` - `pnpm exec vitest run ui/src/components/AgentConfigForm.test.ts` - `pnpm --filter @paperclipai/adapter-claude-local typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` - Greptile review reached Confidence Score: 5/5 on commit `b796cf4f1` with addressed threads resolved. UI note: the visible change is a conditional action row inside the existing model dropdown; the regression test covers that `claude_local` now receives the refresh action. ## Risks - Low risk. Without `ANTHROPIC_API_KEY`, Claude Local still uses the static fallback list. - If Anthropic model discovery fails or times out, Paperclip falls back to the existing cached or static list. - Bedrock environments remain on Bedrock-native model IDs. ## Model Used OpenAI GPT-5 via Codex local coding agent, with repository file access, shell command execution, git operations, and targeted test/typecheck verification. Exact context window is not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-29 07:03:07 -10:00
Devin Foley	1f70fd9a22	PAPA-430: workspace finalize gates + no-remote-git enforcement (#6969 ) ## Thinking Path > - Paperclip orchestrates AI agents across isolated execution workspaces; the local cwd is the only persistence boundary between runs. > - Workspace lifecycle (worktree_prepare → execute → workspace_finalize) and the wake/accept flow are what guarantee that dependent issues see a consistent worktree. > - PAPA-380 / PAPA-431 / PAPA-432 / PAPA-440 surfaced three holes in that contract: silent env reuse across assignees, dependent wakes firing before finalize, and `issue.interaction.accept` advancing before finalize landed. > - PAPA-441 / PAPA-442 then needed to document the "no remote git" contract and prevent future adapter/runtime code from quietly reintroducing `git push` as a backdoor sync. > - This pull request lands those server fixes, the static `check-no-git-push` enforcement, the AUTHORING.md cross-link, and the Cody-review follow-ups on the PAPA-430 thread. > - The benefit is that finalize is a real barrier — board accepts, dependent wakes, and operator-set env all respect it — and adapter code can't bypass it via raw `git push`. ## What Changed - server (PAPA-380, PAPA-431): `execution-workspace-policy` refuses silent env reuse when the assignee's resolved env disagrees with the workspace it would inherit. The inheritance protection is now scoped to the actual inheritance signal — explicit issue-level `environmentId` is honored even when the agent's default env is `null`. - server (PAPA-432): `heartbeat.ts` gates dependent wakes on `listUnfinalizedExecutionWorkspaceIds`, and writes a `workspace_finalize` row on the succeeded path. Write failures now surface instead of being swallowed so dependents aren't silently stranded behind a missing row. - server (PAPA-440): `issue-thread-interactions.acceptInteraction` adds a workspace_finalize precondition for `request_confirmation` (not `suggest_tasks`). Accept returns 409 if finalize hasn't succeeded for the latest workspace operation. - ci (PAPA-442): new `scripts/check-no-git-push.mjs` static check scans `packages/adapters/`, `packages/adapter-utils/`, `server/src/`, and `cli/src/` for any `git push` invocation (string or args-array). Wired into the `policy` PR job and `test:release-registry`. Operators can opt in per-call with `// paperclip:allow-git-push: <reason>`. Release scripts are out of scope by design. - docs (PAPA-441): `AUTHORING.md` documents the no-remote-git contract and cross-links the static check so adapter authors learn the rule and the enforcement together. - review follow-up (PAPA-430, Cody): three fixes — env resolver bug, accept-gate scope (request_confirmation only), and finalize record write on the succeeded path. ## Verification - `pnpm exec vitest run server/src/__tests__/execution-workspace-policy.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts` → 33/33 pass - `node scripts/check-no-git-push.test.mjs` → check covers string form, args-array form, comment exclusions, and per-line allow-comment. - Manual: server compiles; the policy job runs the check in <1s before heavier jobs. ## Risks - Behavioral shift in accept: boards accepting `request_confirmation` while finalize is in-flight now get 409s. This is intentional — they can retry — but it changes timing on a hot path. `suggest_tasks` is unaffected. - Workspace policy: the env-reuse refusal is a new error path. Issues that previously silently reused an env from a different-assignee workspace will now fail-loud; the resolver still honors explicit issue-level `executionWorkspaceSettings.environmentId`. - CI rule: any future legitimate `git push` in scoped dirs must be marked with the allow-comment, which is the intended ergonomic. ## Model Used - Claude Opus 4.7 (`claude-opus-4-7`, extended thinking), via Claude Code in the Paperclip executor adapter. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots (N/A — server/CI/docs only) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Closes related issues: PAPA-430, PAPA-380, PAPA-431, PAPA-432, PAPA-440, PAPA-441, PAPA-442 --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-29 08:25:29 -07:00
Devin Foley	d9f91576a0	Add accepted-plan decomposition exact-once guards and UI state (#6831 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so planning approvals and child-issue fan-out are part of the core control-plane loop. > - Accepted plans are supposed to be a safe bridge from planning into execution, especially when agents wake from review decisions and reuse isolated workspaces. > - The duplicate-subtask incident showed that an accepted plan revision could be interpreted more than once across overlapping runs, which broke the single-source-of-truth model for issue decomposition. > - Fixing that required tightening the backend contract first: accepted-plan decomposition needs an exact-once fingerprint, durable claim state, and retry-safe child creation. > - Once that backend behavior existed, the board still needed visibility into what happened, so the issue detail view needed a dedicated decomposition section instead of forcing operators to reconstruct child creation from raw activity. > - This pull request adds the exact-once decomposition primitive, hardens wake routing and regressions around the incident, and surfaces decomposition state in the UI so future incidents are both prevented and easier to inspect. ## What Changed - Added accepted-plan decomposition semantics to `doc/execution-semantics.md`, including the exact-once fingerprint, durable claim/result expectations, and retry/resume behavior. - Added persistent accepted-plan decomposition claims in the backend, including schema, shared types/validators, service logic, and issue routes for creating and listing decomposition state. - Hardened heartbeat routing so an accepted-plan continuation stays scoped to the relevant planning issue instead of opportunistically re-decomposing another accepted issue on the same assignee. - Added regression coverage for the original failure modes: concurrent same-parent retries, cross-issue accepted-plan isolation, and partial child recreation under the same fingerprint. - Added the `Plan decomposition` issue-detail section plus supporting API/query-key/activity formatting updates so operators can see revision status, owner, child counts, and the linked child issues directly in the UI. - Included the small follow-up UI fix so the decomposition section still renders when the issue work mode is no longer `planning`. ## Verification - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` - `pnpm --filter @paperclipai/db typecheck` - `pnpm exec vitest run server/src/__tests__/issues-service.test.ts` - `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t "lists persisted decompositions with child issue summaries"` - `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t "accepted plan decomposition" server/src/__tests__/heartbeat-accepted-plan-workspace-refresh.test.ts server/src/__tests__/heartbeat-context-summary.test.ts` - Manual UI path: create a planning issue without an isolated execution workspace, add a `plan` document, accept the `request_confirmation`, let Paperclip create child issues, then reopen the parent issue detail page and confirm the `Plan decomposition` section shows the accepted revision, status, idempotent-claim badge, and child links. - Separate follow-up bug noted during manual UI validation: accepting a plan on an issue whose run never records `workspace_finalize` is tracked in `PAPA-445` and is not part of this PR’s fix scope. ## Risks - This adds a new migration and a large Drizzle snapshot update; reviewers should confirm the schema shape and generated metadata match the intended decomposition table. - The exact-once claim changes sit on the accepted-plan fan-out path, so regressions there could block legitimate child creation or mis-handle retries if the claim state machine is wrong. - The new UI only appears when decomposition records exist; reviewers should use the manual verification path above rather than expecting existing issues on a stale local instance to show the section automatically. - `PAPA-445` remains an open follow-up for the `workspace_finalize` accept gate when a planning handoff never records finalize; that bug can interfere with reproducing the UI flow on isolated workspaces but does not change the correctness of the exact-once decomposition feature itself. > Checked `ROADMAP.md`: this PR is a bug fix / control-plane hardening change for accepted-plan decomposition, not a new uncoordinated roadmap feature. ## Model Used - OpenAI Codex via Paperclip `codex_local` (GPT-5-based coding agent; exact backend model ID/context window not exposed in the run context), with repository tool use, shell execution, and code-editing capabilities. <img width="806" height="1069" alt="Screenshot 2026-05-27 at 11 05 48 PM" src="https://github.com/user-attachments/assets/5b00b670-96cd-4470-b0a3-581743bcae28" /> ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-28 23:30:18 -07:00
Dotta	9eac727cf1	[codex] Add skills CLI and catalog management (#6782 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies through company-scoped control-plane workflows. > - Agents need reusable, inspectable skills that can be installed, reset, audited, exported, and assigned without bespoke local setup. > - The existing skill truth model needed cleanup so bundled skills, optional catalog skills, runtime skills, and adapter-provided skills have clear provenance. > - Operators also need a practical CLI and board UI for discovering and managing company skills. > - This pull request adds the skills CLI, packaged skills catalog, company skills APIs, and catalog-aware board UI. > - The benefit is a more reusable Paperclip company setup where skills are portable, auditable, and easier for operators and agents to manage. ## What Changed - Added `paperclipai skills` CLI commands and coverage for catalog listing, installing, resetting, and inspecting company skills. - Added a packaged `@paperclipai/skills-catalog` workspace with bundled and optional skill content plus validation/build tests. - Added shared company-skill types and validators used across CLI, server, and UI contracts. - Added server catalog APIs/services for company skill catalog operations, reset semantics, audit behavior, and portability provenance. - Updated adapter skill handling so runtime/catalog provenance remains explicit across local adapters. - Added board UI support for browsing and managing catalog-backed company skills. - Updated docs for the skills CLI/catalog flow and the company skills Paperclip skill reference. - Rebased the branch onto current `paperclipai/paperclip:master`; no `pnpm-lock.yaml`, `.github/workflows`, or migration files are included in the final PR diff. ## Verification - Passed: `pnpm run preflight:workspace-links && pnpm exec vitest run cli/src/__tests__/skills.test.ts packages/skills-catalog/src/catalog-builder.test.ts packages/skills-catalog/src/shipped-catalog.test.ts packages/shared/src/validators/company-skill.test.ts packages/adapter-utils/src/server-utils.test.ts packages/plugins/create-paperclip-plugin/src/entrypoints.test.ts server/src/__tests__/company-skills-catalog-service.test.ts server/src/__tests__/company-skills-routes.test.ts server/src/__tests__/company-portability.test.ts`. - Passed: `pnpm exec vitest run server/src/__tests__/workspace-runtime.test.ts -t "default branch\|origin/master\|symbolic-ref"`. - Attempted: full `server/src/__tests__/workspace-runtime.test.ts`. Four provisioning tests failed while seeding an isolated worktree database from the local Paperclip instance because the local plugin schema dump contains a duplicate-column foreign key (`plugin_content_machine_18a7bc327b.content_case_signals`). The default-branch tests touched by the rebase conflict passed in the focused run above. - Checked final diff: no `pnpm-lock.yaml`, no `.github/workflows`, and no migration-file changes relative to `master`. ## Risks - Medium: this is a broad skills/catalog change touching CLI, server APIs, shared contracts, adapter skill sync, and UI. - Catalog validation and reset semantics need careful reviewer attention because they affect reusable company setup and portability. - No database migrations are included in this PR, so there is no migration ordering/idempotency risk in the final diff. - No lockfile is included by design; dependency resolution will be handled by the repository lockfile workflow. ## Model Used - OpenAI Codex coding agent based on GPT-5, running in Paperclip via the `codex_local` adapter with shell, git, GitHub CLI, and code-editing tool access. Exact hosted model build/context-window metadata is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run targeted tests locally and documented the local workspace-runtime seed failure above - [x] I have added or updated tests where applicable - [x] If this change affects the UI, screenshots were intentionally omitted per PAP-10124 instructions; UI behavior is covered by tests and reviewer inspection - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-28 07:33:51 -10:00
Dotta	b7545823be	[codex] Add document annotations and comments (#6733 ) ## Thinking Path > - Paperclip orchestrates AI-agent companies through issues, documents, runs, and durable company-scoped state. > - Issue documents are where agents and operators capture plans, handoffs, and work products. > - Before this change, document collaboration could only happen through whole-document edits and detached issue comments. > - Inline document annotations need stable anchors, revision-aware persistence, and UI affordances that do not break existing document editing. > - This pull request adds company-scoped document annotation threads, comments, anchor snapshots, API routes, and board UI. > - The benefit is that operators and agents can discuss specific document passages without losing context as documents evolve. ## What Changed - Added document annotation tables, schema exports, shared types, validators, anchor hashing, and text-anchor helpers. - Added server-side document annotation services and issue routes for listing, creating, commenting, resolving, and reopening annotation threads. - Included annotation summaries in relevant issue document reads and backup/recovery document workspace behavior. - Added React UI for inline document highlights, comment panels, mobile sheet behavior, deep-link focus, and resolved/open filtering. - Added annotation design artifacts, Storybook coverage, screenshots, and a screenshot helper script. - Rebased the branch onto current `paperclipai/paperclip` `master` and renumbered the annotation migration from `0085_old_swarm` to `0091_old_swarm`; the SQL uses `IF NOT EXISTS` guards so environments that previously applied the old migration number can safely apply the new one. - Adjusted the new annotation UI tests to use a local async flush helper because this workspace's React 19.2.4 export does not expose `React.act`. ## Verification - `pnpm run preflight:workspace-links && pnpm exec vitest run packages/shared/src/document-anchors.test.ts server/src/__tests__/document-annotation-routes.test.ts server/src/__tests__/document-annotations-service.test.ts ui/src/components/DocumentAnnotationLayer.test.tsx ui/src/components/IssueDocumentAnnotations.test.tsx ui/src/lib/document-annotation-hash.test.ts ui/src/lib/document-annotation-selection.test.ts` - Confirmed `git diff --check` passes. - Confirmed no `pnpm-lock.yaml` or `.github/workflows/*` files are included in the PR diff. ## Risks - Medium risk: this adds new persisted annotation tables and routes across db/shared/server/ui. - Migration risk is reduced by moving the branch migration to `0091_old_swarm` after upstream `0090_resource_memberships` and keeping the SQL idempotent for old `0085_old_swarm` adopters. - UI risk is mostly around text range anchoring and panel positioning across long documents, folded content, and mobile layouts; the PR includes focused unit coverage and design screenshots. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-using software engineering mode. Context window size is not exposed in this Paperclip runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-26 06:41:23 -07:00
Dotta	9aea3e3d35	[codex] Add resource membership controls (#6677 ) ## Thinking Path > - Paperclip orchestrates AI-agent companies through company-scoped issues, projects, agents, and board-visible workflows. > - The board sidebar and project list are the daily navigation surface for that control plane. > - Users need to keep all projects and agents accessible while hiding resources they have intentionally left from their own sidebar. > - That requires user-scoped resource membership state backed by company-scoped API and database contracts. > - The branch also needed to preserve HTTP worktree login sessions and keep the project list easier to scan after membership grouping. > - This pull request adds resource membership controls, sidebar leave actions, grouped/sortable project listings, and focused tests. > - The benefit is a cleaner personal workspace view without weakening company-scoped access to the underlying project or agent detail pages. ## What Changed - Added `project_memberships` and `agent_memberships` tables with API/shared/server contracts for current-user join/leave state. - Renumbered the membership migration to `0090_resource_memberships` after rebasing onto current `master`, and made it idempotent for anyone who had applied the old branch-local `0087` migration. - Added project and agent sidebar leave actions, plus list filtering that waits for membership state before hiding resources. - Added grouped project listing, project sorting controls, and reserved row subtitle height for cleaner scanning. - Fixed HTTP auth cookie security handling so HTTP worktree sessions can persist. - Updated focused server and UI tests for the new membership, sidebar, project list, and auth behavior. ## Verification - `pnpm exec vitest run server/src/__tests__/better-auth.test.ts server/src/__tests__/resource-memberships-routes.test.ts ui/src/pages/Projects.test.tsx ui/src/components/SidebarProjects.test.tsx ui/src/components/SidebarAgents.test.tsx ui/src/components/MembershipAction.test.tsx ui/src/components/EntityRow.test.tsx` - Confirmed the branch is rebased on current `origin/master`. - Confirmed the PR diff does not include `pnpm-lock.yaml` or `.github/workflows` changes. ## Risks - Migration safety: low to medium. The migration now uses `IF NOT EXISTS` / guarded constraints and is numbered after current master migrations, but it should still get CI coverage against fresh databases. - UI behavior: low. Left resources are hidden from sidebar only after membership state loads; direct detail access remains available. - Auth behavior: low. Cookie security is relaxed only for HTTP/private local-style origins where secure cookies would prevent login persistence. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex coding agent, tool-enabled shell/git workflow, context window not exposed by runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Screenshot note: no browser screenshots were captured in this heartbeat; the UI changes are covered by focused component tests above. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-25 13:12:41 -05:00
Dotta	ece8a51e22	[codex] Bundle local branch fixes from PAP-10032 (#6604 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - This branch accumulated multiple already-tested control-plane, adapter runtime, invite, workspace, plugin, and UI quality fixes on the primary Paperclip checkout. > - `origin/master` advanced while those commits were still local, so the branch needed to be preserved and reconciled before review. > - Splitting the branch commit-by-commit against the new base produced overlapping conflicts with recently merged upstream PRs. > - This pull request keeps the remaining branch as one standalone PR because the final diff is 38 files after removing screenshot artifacts, under Greptile's 100-file cap, and can be merged independently after review. > - The benefit is that none of the local work is lost, the branch is now based on current `origin/master`, and reviewers can evaluate the reconciled changes in one place. ## What Changed - Merged the local accumulated branch with current `origin/master` and resolved the invite-flow overlaps from the newer upstream companies query helper. - Preserved the local fixes for invite existing-member behavior, invite link copy fallback, reusable workspace selection, worktree auth, static SPA fallback, markdown wrapping, plugin slot registration, cloud upstream UX/server polish, project sorting, and related tests. - Removed screenshot artifacts from the PR per review request. - Kept the PR under the requested file limit: 38 files changed, with no `pnpm-lock.yaml` or `.github/workflows/` changes. ## Verification - `NODE_ENV=test pnpm exec vitest run ui/src/pages/CompanyInvites.test.tsx ui/src/pages/InviteLanding.test.tsx ui/src/pages/Projects.test.tsx ui/src/plugins/slots.test.ts ui/src/components/MarkdownBody.test.tsx server/src/__tests__/invite-accept-existing-member.test.ts server/src/__tests__/static-index-html.test.ts server/src/__tests__/execution-workspaces-service.test.ts server/src/__tests__/better-auth.test.ts server/src/__tests__/worktree-config.test.ts` - `NODE_ENV=test pnpm --filter @paperclipai/ui typecheck` - `NODE_ENV=test pnpm --filter @paperclipai/server typecheck` - Confirmed `git diff --name-only origin/master...HEAD \| wc -l` is `38`. - Confirmed no PR diff entries match `pnpm-lock.yaml`, `.github/workflows/`, or `screenshots/*`. ## Risks - Medium review risk because this is a bundled rescue PR rather than several narrow feature PRs. - Invite flow and company cache behavior overlapped with newer upstream changes; the merge resolution intentionally keeps the shared `companiesListQueryOptions` helper while preserving local existing-member invite behavior. - Visual review evidence is no longer attached in-repo because screenshots were removed from this PR per review request. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, with repository tool access, terminal execution, and git/GitHub CLI operations. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] UI screenshots were intentionally removed from this PR per review request - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: CodexCoder <codexcoder@paperclip.local>	2026-05-25 07:25:26 -05:00
Devin Foley	96f0279e08	Make ACPX-Claude adapter work seamlessly (PAPA-388) (#6590 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so when an adapter fails, the platform must surface enough detail for the next agent (or human reviewer) to act > - The `acpx_local` adapter wraps `claude-agent-acp`, which in turn drives the Claude Code SDK — three layers, three different permission and error-handling models > - A user created a `Claude Local ACPX` agent in PAPA-387 and it failed instantly with the generic `acpx.error / "Internal error"` log, stranding the work and triggering an opaque `stranded_assigned_issue` recovery to the CTO > - Once the diagnostic blackbox was opened, the underlying cause turned out to be two SDK-level mismatches: a model-name allowlist that rejects bare IDs like `claude-opus-4-7`, and a Claude Code permission/Read-sandbox configuration that silently denies every non-allowlisted tool when the user's `~/.claude/settings.json` has `defaultMode: "dontAsk"` > - This pull request fixes both classes of failure in the adapter itself so new ACPX agents work seamlessly without per-host configuration, and widens the diagnostic surface so the next failure of any kind is actionable > - The benefit is that ACPX-Claude can join the regular agent roster — verified end to end on PAPA-401, where the agent successfully reached the Paperclip API, opened a worktree, surveyed existing notification PRs, and posted a structured plan ## What Changed - Widen ACPX failure diagnostics (`packages/adapters/acpx-local/src/server/execute.ts`): - Capture `err.name`, ACP code, `cause.message`, retryable flag, and a 5-frame stack preview into `errorMeta`. - Promote phase-specific error codes: `ensure_session → acpx_session_init_failed`, `configure_session → acpx_session_config_failed`, `turn → acpx_turn_failed`, plus mapping for `ACP_BACKEND_MISSING` / `ACP_BACKEND_UNAVAILABLE`. - Set `verbose: true` on the ACPX runtime so its session-event log flows through `ctx.onLog`. - Capture child-process stderr via a wrapper-script tee into `<stateDir>/run-stderr/<runId>.log`, inline the tail into the `acpx.error` payload as `childStderrTail`, and forward it through `ctx.onLog("stderr", …)` so it lands in the heartbeat `stderrExcerpt` column (existing redaction applies). - Set the model via `ANTHROPIC_MODEL` env for the `claude` agent instead of `set_config_option(model, …)`. The ACP server's `set_config_option` handler validates against an internal allowlist and rejects bare IDs like `claude-opus-4-7`. `ANTHROPIC_MODEL` is read during initialization and bypasses that check. - Seed `<worktree>/.claude/settings.local.json` before spawning `claude-agent-acp` (the seamless-API fix). Since `claude-agent-acp` hard-codes `settingSources: ["user", "project", "local"]` and "local" has the highest precedence: - Set `permissions.defaultMode: "default"`, but only if the user's value is missing or `"dontAsk"` (the broken case). Other modes like `acceptEdits`/`plan` are preserved. - Pre-allow Paperclip's Bash surface (`Bash(curl:)`, `Bash(env:)`, `Bash(<cwd>/scripts/paperclip-issue-update.sh:)`, `Bash(<cwd>/scripts/paperclip:)`). - Widen `permissions.additionalDirectories` to include `stateDir`, `agentHome`, and the per-company instance root (`~/.paperclip/instances/<id>/companies/<companyId>`). Scoped to this company only — does not expose other tenants. - Existing user entries are merged, not replaced. The resolved roots are folded into the session fingerprint so warm-session handles invalidate when they change. - Sync the existing server-side integration test (`server/src/__tests__/acpx-local-execute.test.ts`) to assert `acpx_session_init_failed` instead of the now-removed `acpx_protocol_error` for `ACP_SESSION_INIT_FAILED` (a follow-up to commit 1). ## Verification - `pnpm --filter "@paperclipai/adapter-acpx-local" run typecheck` — passes. - `pnpm vitest run` in `packages/adapters/acpx-local` — 35/35 pass, includes 4 new tests covering the settings.local.json write path (claude only, merge with pre-existing content, `dontAsk` override, codex no-op). - `pnpm vitest run src/__tests__/acpx-local-execute.test.ts` in `server/` — 15/15 pass after the test-sync commit. - End-to-end manual verification (PAPA-401): the `Claude Local ACPX` agent that previously hit "restricted environment" now successfully reaches the Paperclip API, opens its worktree, posts structured plan comments, and flips the issue to `in_review` without any external configuration. ## Risks - Low, scoped to the `acpx_local` adapter. The settings.local.json write is per-worktree (worktrees live under `.paperclip/worktrees/<issue>/`) and only triggers when `acpxAgent === "claude"`. Existing user content is merged with `[...existing, ...paperclip]` and deduped — nothing is overwritten outright. - The `defaultMode` override is intentionally narrow: it only flips `"dontAsk"` (which silently denies every tool and is the root cause) to `"default"`. Users who explicitly picked `acceptEdits`, `plan`, or any other mode keep their choice. - Stderr capture goes through the existing `log-redaction` pass before persisting, so `PAPERCLIP_API_KEY` and similar secrets in the wrapper env don't leak into heartbeat logs. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - Claude Opus 4.7 (`claude-opus-4-7`), running in the `claude_local` adapter via Paperclip's harness. Extended thinking enabled, tool use enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A (adapter-only) - [ ] I have updated relevant documentation to reflect my changes — no user-facing docs changed; internal commentary in the code change explains the SDK constraints - [x] I have considered and documented any risks above - [ ] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-23 13:01:27 -07:00
Devin Foley	e3c875c1c7	fix(sandbox): prevent E2B workspace upload + lease idle failures (PAPA-382) (#6560 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Heartbeats run inside managed sandboxes (E2B, Cloudflare Sandbox), and each run begins by uploading the agent's workspace as a tar archive > - PAPA-381's E2B runs were failing at 5 and 11 minutes — two distinct failure modes were entangled: workspace tar extraction errors on Linux, and sandbox idle/lease timeouts during normal heartbeat gaps > - Workspace tar extraction failed because macOS bsdtar embeds `LIBARCHIVE.xattr.` PAX headers that GNU tar on Linux rejects with "This does not look like a tar archive"; the existing `COPYFILE_DISABLE=1` only suppresses AppleDouble `._` sidecars, not inline PAX xattr entries > - E2B sandboxes also expired between heartbeats because `timeoutMs` defaulted to a short window and was never refreshed per execute, and Cloudflare sandboxes idled out because `sleepAfter` defaulted to 10 minutes > - This pull request adds `--no-xattrs` to the workspace tar invocation, refreshes the E2B sandbox lifetime on each execute and bumps the default `timeoutMs` to 1h, and raises the Cloudflare `sleepAfter` default to 1h > - The benefit is that long-running heartbeat-driven runs (Claude, Codex, etc.) survive across both their initial workspace upload and the natural idle gaps between executes on both E2B and Cloudflare ## What Changed - `packages/adapter-utils/src/sandbox-managed-runtime.ts`: added `--no-xattrs` to `createTarballFromDirectory` so macOS bsdtar produces a clean POSIX tar that GNU tar on Linux can extract, with an inline comment explaining why `COPYFILE_DISABLE=1` alone is insufficient. - `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: refresh the sandbox lifetime on every execute (so long runs don't expire mid-job) and raised the default `timeoutMs` to 1h. - `packages/plugins/sandbox-providers/e2b/src/manifest.ts` and `plugin.test.ts`: updated manifest defaults and added regression coverage for the new behavior. - `packages/plugins/sandbox-providers/cloudflare/src/config.ts`, `manifest.ts`, `plugin.test.ts`: raised default `sleepAfter` from 10m to 1h, mirroring the E2B 1h default, and added a regression test asserting the acquire-lease request body sends `sleepAfter: "1h"` when not overridden. ## Verification - `pnpm --filter @paperclipai/plugin-e2b test` - `pnpm --filter @paperclipai/plugin-cloudflare-sandbox test` - Locally cherry-picked the `--no-xattrs` fix onto master and confirmed end-to-end via a real PAPA-381-style heartbeat-driven E2B run that the workspace upload now extracts cleanly on Linux. The user (board operator) tested this on master and reported "Ok, that worked." - Manual reviewer steps: trigger an E2B heartbeat from a macOS host (this is where the bsdtar xattr headers come from), confirm the workspace tar extracts on the Linux sandbox side; run a long (>15 min) Cloudflare sandbox flow and confirm no lost-lease/idle errors between executes. ## Risks - Low risk overall. - `--no-xattrs` is widely supported by both macOS bsdtar and GNU tar (Linux). Worst case it silently no-ops on a future host that doesn't support it; in that case the existing failure mode reappears, it doesn't introduce a new one. - Raising default `timeoutMs` (E2B) and `sleepAfter` (Cloudflare) from short values to 1h means sandboxes stay alive longer between executes by default. This is the intended behavior — operators that want a tighter idle window can still override via plugin config. - E2B per-execute sandbox lifetime refresh adds a small API call per execute; it is bounded by the same client that already handles execute traffic, so no new dependencies or retry semantics. ## Model Used - Claude (Anthropic), `claude-opus-4-7`, extended thinking enabled, tool use enabled (file/grep/git tools and Paperclip control-plane API). Used to diagnose the dual failure mode (workspace tar PAX xattr headers + sandbox lifetime), write the fixes and tests, and drive the verification loop with the board operator. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots (N/A — no UI changes) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-22 13:34:11 -07:00
Dotta	ad6effa65c	[codex] Improve runtime and import reliability (#6549 ) ## Thinking Path > - Paperclip coordinates autonomous company work through local and hosted runtime surfaces. > - Local embedded Postgres and tenant import/export paths are foundational reliability pieces. > - A runtime failure in either path can stop agents or imports before useful work begins. > - The branch included remaining fixes for embedded native library bootstrap and async tenant import handling. > - This pull request groups those runtime/import reliability changes into one standalone PR. > - The benefit is a more robust local runtime and safer cloud tenant import behavior. ## What Changed - Prepared embedded Postgres native runtime before startup in CLI/server/test entrypoints. - Added embedded Postgres native bootstrap coverage. - Added async tenant import job handling and deferred validation coverage. - Kept the runtime/import changes based directly on current `origin/master` after related upstream PRs had already merged. ## Verification - `pnpm --filter @paperclipai/plugin-sdk build` - `NODE_ENV=test pnpm exec vitest run packages/db/src/embedded-postgres-native.test.ts server/src/__tests__/company-portability-routes.test.ts` ## Risks - Medium-low: this touches startup/import paths, but the branch is small and covered by targeted tests. - The embedded Postgres change depends on platform-specific native-library behavior, so CI and follow-up checks should still verify supported runners. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session; exact context window not exposed by this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-22 09:57:22 -05:00
Dotta	e43b392a79	[codex] Add local Cloud Upstream sync (#6548 ) ## Thinking Path > - Paperclip is the control plane for AI-agent companies. > - Operators need a path to move local company state toward Paperclip Cloud without losing local-first control. > - The Cloud Upstream flow needs API, persistence, CLI, and board UI surfaces that agree on the same manifest/run model. > - The existing branch had the feature work plus UX and error-handling follow-ups. > - This pull request packages the remaining Cloud Upstream sync work into one standalone branch. > - The benefit is an inspectable local-to-cloud sync workflow with preview, conflicts, activation, and captured UX review states. ## What Changed - Added Cloud Upstream shared types, server routes/services, and persisted run schema/migration. - Added Paperclip Cloud CLI sync helpers and local connection storage. - Added the Cloud Upstream board UI, settings entry points, query keys, and UX lab page. - Added preview/activation checklist behavior, redirect handling, manifest-only preview support, friendly errors, in-flight hints, and entity count summaries. ## Verification - `pnpm --filter @paperclipai/plugin-sdk build` - `NODE_ENV=test pnpm exec vitest run cli/src/__tests__/cloud.test.ts server/src/__tests__/instance-settings-routes.test.ts server/src/__tests__/instance-settings-service.test.ts ui/src/pages/CloudUpstream.test.tsx ui/src/components/CompanySettingsSidebar.test.tsx` - `NODE_ENV=test pnpm exec vitest run server/src/__tests__/cloud-upstreams.test.ts` Worktree setup note: the isolated worktree install skipped native sqlite build scripts, so I copied the already-built local sqlite binding from the main checkout before running `server/src/__tests__/cloud-upstreams.test.ts`. The test then passed. ## Risks - Medium: this adds a database migration and a broad feature path across CLI/server/UI. - Merge order: this is the only PR in this split with a DB migration; merge it before any future Cloud Upstream migration follow-up. - Mitigation: the PR is based directly on current `origin/master`, has targeted route/service/UI tests, and keeps the feature behind existing experimental Cloud Sync settings. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session; exact context window not exposed by this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, screenshot artifacts are intentionally omitted per reviewer request - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-22 09:56:22 -05:00
Dotta	a1835cfa5e	[codex] Harden plugin runtime invocation scope (#6547 ) ## Thinking Path > - Paperclip orchestrates AI-agent companies through a company-scoped control plane. > - Plugins extend that control plane, but plugin workers still call back into host APIs. > - Those worker-to-host calls need the same company boundary guarantees as normal API routes. > - Plugin action handlers also need authenticated actor context from the host instead of trusting caller-supplied params. > - This pull request hardens plugin bridge/action scope and keeps plugin operation issues out of normal issue surfaces. > - The benefit is safer plugin execution with clearer authorization boundaries and better test coverage. ## What Changed - Added host-owned invocation context plumbing for nested plugin worker calls. - Added actor context to plugin `performAction` calls and test harness helpers. - Enforced company invocation scope on worker-to-host calls and filtered company lists to the active invocation scope. - Extended plugin action route tests for board and agent actor context, spoofed company params, and cross-company rejection. - Extended plugin worker manager coverage for invocation-scope propagation. - Filtered typed and legacy plugin operation issue origins from default issue/inbox lists. ## Verification - `pnpm --filter @paperclipai/plugin-sdk build` - `NODE_ENV=test pnpm exec vitest run packages/plugins/sdk/tests/host-client-factory.test.ts packages/plugins/sdk/tests/testing-actions.test.ts server/src/__tests__/plugin-routes-authz.test.ts server/src/__tests__/plugin-worker-manager.test.ts server/src/__tests__/issues-service.test.ts` Note: embedded Postgres issue-service tests reported host-level Postgres init skip for 47 tests; the non-embedded targeted tests passed. ## Risks - Medium: plugin host authorization paths are sensitive, and external plugins may rely on previously loose company params. - Mitigation: the change only tightens calls when the host attached a company invocation scope and includes explicit tests for board, agent, and nested worker calls. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session; exact context window not exposed by this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-22 09:16:24 -05:00
Dotta	38c185fb8b	[codex] Add agent permissions and controls plan (#6386 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies by keeping task ownership, approvals, and operator control inside one control plane. > - Agent permissions and plugin-hosted company settings sit on the boundary between autonomy and governance. > - V1 needs scoped task assignment rules, plugin extension points, and clearer company access surfaces without weakening company boundaries. > - The branch builds the core authorization service, plugin SDK/host APIs, and UI simplifications needed to support those controls. > - Paperclip EE plugin surfaces were intentionally moved out of this core PR per review direction, so this PR now carries only the public core/plugin infrastructure work. > - The latest updates preserve the PAP-9937 branch changes that belong in this PR, remove the `design/` artifacts, and exclude the experimental `plugin-briefs` package. > - Greptile feedback was applied through the authorization/audit paths and the final cleanup commit was re-reviewed at 5/5 with no unresolved Greptile threads. > - The benefit is safer assignment control with extension hooks for richer permission products while preserving simple defaults for normal operators. ## What Changed - Added scoped task-assignment authorization decisions and routed issue/agent assignment mutations through the authorization service. - Added plugin SDK and host APIs for company settings slots, authorization policy/grant management, assignment previews, and bridge invocation scope propagation. - Simplified core company access UI and moved advanced controls behind plugin-provided settings surfaces. - Added retry-now affordances for blocked issue next-step notices. - Added protected-assignment enforcement for persisted agent/project/issue policies, including explicit-grant fallback behavior. - Added incremental principal-access compatibility backfill for active agent memberships and role-default human permission grants. - Added the Markdown code block wrap action fix from the latest branch changes. - Removed `design/` artifacts from the PR and removed `packages/plugins/plugin-briefs` from the final diff. - Addressed Greptile feedback for plugin actor sanitization, legacy membership handling, audit pagination, unknown grant-scope metadata, and startup test mocks. ## Verification - `pnpm exec vitest run server/src/__tests__/access-service.test.ts server/src/__tests__/company-portability.test.ts` -> 2 files passed, 54 tests passed. - `pnpm exec vitest run server/src/__tests__/server-startup-feedback-export.test.ts server/src/__tests__/access-service.test.ts server/src/__tests__/company-portability.test.ts` -> 3 files passed, 62 tests passed. - `pnpm exec vitest run server/src/__tests__/authorization-service.test.ts server/src/__tests__/plugin-access-authorization-host-services.test.ts server/src/__tests__/server-startup-feedback-export.test.ts` -> 3 files passed, 28 tests passed. - `pnpm --filter @paperclipai/server typecheck` -> passed. - `git diff --check` -> passed. - `node ./scripts/check-docker-deps-stage.mjs` -> passed. - `CI=true pnpm install --frozen-lockfile --ignore-scripts` -> passed with no lockfile update. - `pnpm exec vitest run ui/src/components/MarkdownBody.interaction.test.tsx` -> 1 test passed. - `git ls-files design packages/plugins/plugin-briefs \| wc -l` -> 0. - GitHub CI on `40cd83b53` -> all checks passed, merge state `CLEAN`. - Greptile on `40cd83b53` -> 5/5, 102 files reviewed, 0 comments/annotations added, 0 unresolved review threads. - Confirmed the PR diff contains no `design/`, `packages/plugins/plugin-briefs`, `pnpm-lock.yaml`, or `.github/workflows` changes. ## Risks - Medium: task assignment authorization paths are behaviorally stricter for protected/private policy data, so existing plugin-authored policies may block assignment until explicit grants or approval flows are configured. - Medium: plugin-host authorization APIs expand the surface area available to trusted plugins and need careful review for company scoping. - Low: startup now performs a principal-access compatibility backfill, but the migration and runtime backfill use conflict-tolerant inserts. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled workflow with shell, git, and GitHub CLI access. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-22 08:12:52 -05:00
Dotta	c91a062326	[codex] Runtime control-plane fixes (#6380 ) ## Thinking Path > - Paperclip orchestrates AI agents through a server-side control plane > - That control plane depends on reliable issue state transitions, plugin lifecycle behavior, import limits, and startup/shutdown handling > - Several small runtime fixes had accumulated on the working branch and were mixed with larger feature work > - Keeping them separate makes the correctness fixes reviewable and mergeable without waiting for cloud-sync UI work > - This pull request groups the server/runtime control-plane fixes into one standalone branch > - The benefit is a tighter, safer runtime baseline for retries, imports, plugin migrations, feedback flushing, and trusted cloud import handling ## What Changed - Fixed updated issue list pagination sorting and scheduled retry comment handling. - Re-applied pending plugin migrations during hot reload and fixed plugin-schema worktree seed restore. - Hardened public tenant DB startup, portable import body limits, trusted cloud import errors, and trusted cloud tenant import mutation access. - Expired stale request confirmations after user comments. - Added feedback export shutdown hardening so database-unavailable flush loops stop cleanly. - Guarded plugin worker `error` event emission when no listener is registered. ## Verification - `pnpm install --frozen-lockfile --ignore-scripts` - `pnpm --filter @paperclipai/plugin-sdk build` - `npm run install --prefix node_modules/.pnpm/sqlite3@5.1.7/node_modules/sqlite3` - `pnpm exec vitest run server/src/__tests__/issues-service.test.ts server/src/__tests__/plugin-lifecycle-restart.test.ts server/src/__tests__/server-startup-feedback-export.test.ts server/src/__tests__/issue-comment-reopen-routes.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/__tests__/issue-thread-interaction-routes.test.ts server/src/__tests__/body-limits.test.ts server/src/__tests__/feedback-flush-controller.test.ts server/src/__tests__/error-handler.test.ts server/src/__tests__/board-mutation-guard.test.ts packages/db/src/backup-lib.test.ts` initially exposed local setup issues and two 5s test timeouts. - Rerun after local prereq build: `pnpm exec vitest run --testTimeout 15000 server/src/__tests__/issue-comment-reopen-routes.test.ts server/src/__tests__/issue-thread-interaction-routes.test.ts server/src/__tests__/feedback-flush-controller.test.ts server/src/__tests__/server-startup-feedback-export.test.ts` passed. - Some embedded Postgres-backed tests skipped on this host because local Postgres init was unavailable. ## Risks - Runtime-touching branch: startup/shutdown and issue interaction behavior should be reviewed carefully. - The feedback export change disables repeated flush attempts only for database connection-refused failures; other upload failures still log normally. - The plugin worker error guard avoids process crashes from unhandled EventEmitter errors but may hide errors from code paths that expected an emitted listener. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent with local shell/git/tool use. Exact hosted model ID and context-window size are not exposed by the local Paperclip adapter runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-20 10:37:11 -05:00
Dotta	43c5bb81b6	[codex] Workspace diff polish (#6383 ) ## Thinking Path > - Paperclip gives operators a workspace diff plugin so they can inspect agent changes before review > - The diff view needs reliable base-ref defaults and controls that stay usable while scrolling large diffs > - The working branch mixed those plugin improvements with unrelated server and cloud work > - Keeping the workspace diff plugin changes isolated makes them easy to test and review > - This pull request polishes the workspace diff plugin controls, base-ref behavior, and sticky headers > - The benefit is a more predictable diff review surface for agent workspaces ## What Changed - Fixed workspace diff default base-ref resolution. - Improved split/unified and working-tree/against-ref pane controls. - Made workspace diff headers stay sticky while scrolling. - Added a review screenshot at `screenshots/PAP-9841-workspace-diff.png`. ## Verification - `pnpm install --frozen-lockfile --ignore-scripts` - `pnpm --filter @paperclipai/plugin-sdk build` - `pnpm --filter @paperclipai/plugin-workspace-diff exec vitest run tests/plugin.spec.ts` - Result: 9 tests passed. ## Risks - UI-only plugin branch with low data risk. - The default base-ref inference should be reviewed against unusual worktree/upstream combinations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent with local shell/git/tool use. Exact hosted model ID and context-window size are not exposed by the local Paperclip adapter runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-19 15:51:13 -05:00
Dotta	d67347be77	[codex] Provider vault secrets UX (#6381 ) ## Thinking Path > - Paperclip orchestrates AI agents that need scoped, auditable access to secrets > - Hosted and external deployments need provider vault configuration without exposing secret values in Paperclip metadata > - AWS Secrets Manager vault setup previously required too much manual operator knowledge > - Provider vault discovery and removal belong together as an independent secrets-management improvement > - This pull request adds AWS provider vault discovery/prefill plus vault removal flows > - The benefit is a safer operator path for configuring external secret storage before higher-level cloud workflows depend on it ## What Changed - Added shared validators/types for AWS provider vault discovery payloads and safe provider metadata. - Implemented AWS provider vault discovery preview on the server. - Added provider vault removal service/route behavior. - Added Secrets page UI for discovery prefill, removal messaging, and related rendering coverage. - Added Storybook provider-vault fixtures and captured screenshots for the new UX states. ## Verification - `pnpm install --frozen-lockfile --ignore-scripts` - `pnpm exec vitest run packages/shared/src/validators/secret.test.ts server/src/__tests__/aws-secrets-manager-provider.test.ts server/src/__tests__/secrets-routes.test.ts server/src/__tests__/secrets-service.test.ts ui/src/pages/Secrets.render.test.tsx` - Result: 4 files passed, 1 embedded Postgres-backed file skipped on this host because local Postgres init was unavailable. - `pnpm --filter @paperclipai/ui exec vitest run src/pages/Secrets.render.test.tsx` - `pnpm --filter @paperclipai/ui typecheck` - Storybook screenshot capture against `Product/Secrets` on `http://127.0.0.1:60381/iframe.html?id=product-secrets--secrets-inventory&viewMode=story&globals=theme:dark` ## Screenshots Provider vaults tab after this change: ![Provider vaults tab](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-provider-vault-secrets/doc/screenshots/pr-6381/provider-vaults-tab.png) AWS discovery candidate flow: ![AWS discovery candidate flow](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-provider-vault-secrets/doc/screenshots/pr-6381/aws-discovery-candidates.png) Provider vault removal confirmation: ![Provider vault removal confirmation](https://raw.githubusercontent.com/paperclipai/paperclip/pap-9861-provider-vault-secrets/doc/screenshots/pr-6381/remove-provider-vault-confirmation.png) ## Risks - Secret provider metadata handling must remain non-sensitive; validators reject credential-bearing Vault URLs and sensitive AWS discovery keys. - AWS discovery depends on deployment credentials being configured correctly outside Paperclip-managed company secrets. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent with local shell/git/tool use. Exact hosted model ID and context-window size are not exposed by the local Paperclip adapter runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-19 15:50:23 -05:00
Devin Foley	4b1e92a588	feat(plugins): add Modal sandbox provider plugin (#6245 ) ## Thinking Path > - Paperclip orchestrates AI agents through company-scoped control-plane workflows and extensible runtime integrations. > - Sandbox providers are part of that extension surface because they let agents execute isolated work without baking each provider into the core server. > - Modal already offers managed sandboxes with filesystem, process, timeout, and networking controls that map onto Paperclip's sandbox provider contract. > - The repo did not have a Modal provider plugin, so teams wanting Modal-backed sandboxes had no first-party integration path. > - This pull request adds a standalone `packages/plugins/sandbox-providers/modal` plugin that implements the provider contract, worker entrypoint, docs, and tests. > - The benefit is that Modal can now be installed as a provider plugin without expanding the core control-plane surface area. ## What Changed - Added a new `packages/plugins/sandbox-providers/modal` package with the plugin manifest, worker entrypoint, and exported plugin surface. - Implemented Modal-backed sandbox lifecycle support for creation, command execution, file operations, networking options, termination, and metadata translation. - Added focused Vitest coverage for config validation, env handling, lifecycle flows, networking behavior, and error mapping. - Documented installation, configuration, and usage requirements in the plugin README. - Removed misleading `MODAL_TOKEN_*` fallback behavior so authentication relies on supported Modal credentials only. ## Verification - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - `cd packages/plugins/sandbox-providers/modal && pnpm test` ## Risks - Low to medium risk: this is isolated to a new plugin package, but runtime behavior still depends on live Modal account credentials and service-side sandbox semantics. - Modal's current docs target a newer Node baseline than the repo default, so the first live install should confirm credential loading and sandbox startup behavior in a real Modal workspace. - No UI or schema changes are included in this PR. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent (GPT-5-class Codex coding model; exact backend model ID is not exposed by the runtime), with tool use, shell execution, and code-editing capabilities enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-18 08:36:34 -07:00
Dotta	5071c4c776	[codex] Add workspace diff viewer plugin (#6071 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators need to inspect what agents changed inside execution and project workspaces. > - The existing workspace detail views did not provide a first-party rich diff surface for staged, unstaged, head, renamed, binary, oversized, and untracked changes. > - The plugin system is the intended extension point for optional rich UI surfaces. > - This pull request adds a workspace diff plugin plus host services and shared contracts so Changes tabs can render workspace diffs through plugin slots. > - The diff-renderer dependency should stay owned by the plugin package rather than the core UI app. > - The dependency surface must stay aligned with repository PR policy, including intentionally omitting `pnpm-lock.yaml` from the PR. > - The benefit is a more reviewable workspace surface without hard-coding the renderer into every page. ## What Changed - Added `@paperclipai/plugin-workspace-diff`, including diff normalization, plugin manifest/worker/UI entrypoints, and focused plugin tests. - Kept `@pierre/diffs` scoped to `@paperclipai/plugin-workspace-diff`; removed the core UI lab diff-renderer surface and direct UI package dependency. - Added shared workspace diff types and validators, plus plugin SDK surface for workspace diff host services. - Added server workspace diff service support and route coverage for execution/project workspace diff flows. - Wired Execution Workspace and Project Workspace Changes tabs to load the diff plugin, including loading/error fallback behavior. - Added UI tests and fixtures for the Changes tabs and plugin bridge behavior. - Added the new plugin package manifest to the Docker deps stage so PR policy can validate dependency coverage. - Addressed review hardening around empty untracked patches, workspace path exposure, project workspace read capability checks, and default base refs. ## Verification - `pnpm --filter @paperclipai/plugin-workspace-diff test` - `pnpm exec vitest run packages/shared/src/validators/workspace-diff.test.ts server/src/__tests__/workspace-diff-service.test.ts ui/src/pages/ProjectWorkspaceDetail.test.tsx ui/src/pages/ExecutionWorkspaceDetail.test.tsx` - `pnpm exec vitest run ui/src/plugins/bridge.test.ts server/src/__tests__/workspace-runtime-routes-authz.test.ts` - `pnpm --filter @paperclipai/shared typecheck` - `pnpm --filter @paperclipai/plugin-workspace-diff typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` - `node ./scripts/check-docker-deps-stage.mjs` - Browser screenshot captured from the local worktree dev server: https://files.catbox.moe/ofdpsp.png - Confirmed branch is rebased onto `public-gh/master`, `.github/workflows/pr.yml` is not included in the PR diff, `ui/package.json` is not included in the PR diff, and `pnpm-lock.yaml` is not included in the PR diff. ## Risks - Medium UI integration risk: the Changes tab depends on the plugin slot and host diff service path. - Medium dependency risk: this adds `@pierre/diffs` in the plugin package, but `pnpm-lock.yaml` is intentionally omitted per packaging instructions because repository automation manages lockfile updates. - Current CI blocker: downstream frozen installs fail until the repository policy path for new plugin package dependencies is chosen. - Diff rendering edge cases are covered for common working-tree and head diff states, but very large repositories may still expose performance limits. - No migrations are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 class coding model, tool-enabled local execution environment. Exact context window was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-18 08:50:06 -05:00
Dotta	d734bd43d1	[codex] Roll up May 17 branch changes (#6210 ) ## Thinking Path > - Paperclip is the control plane for autonomous AI companies, so agent work needs visible ownership, recovery, and operator controls. > - This local branch had accumulated several related control-plane reliability and operator-experience fixes across recovery actions, watchdog folding, model-profile defaults, mentions, markdown editing, plugin launchers, and small UI polish. > - The branch needed to be converted into a PR against the current `origin/master` without losing dirty work or including lockfile/workflow churn. > - The safest standalone shape is a single rollup PR because the recovery/server/UI files overlap heavily across the local commits and splitting would create avoidable conflicts. > - This pull request replays the local branch onto latest `origin/master`, preserves the uncommitted work as logical commits, and adds a Zod 4 validator compatibility fix found during verification. > - The benefit is that the May 17 local branch can be reviewed and merged as one coherent, conflict-free branch under the 100-file Greptile limit. ## What Changed - Rebased the local May 17 branch work onto current `origin/master` in a dedicated worktree. - Preserved and committed previously dirty changes for recovery retry handling, plugin/sidebar launcher polish, and `.herenow` ignores. - Added recovery-action behavior for returning source issues to `todo` when retrying source-scoped recovery. - Included the existing local recovery/liveness/watchdog fold, Codex cheap-profile, markdown/mention, duplicate-agent, and UI polish commits from the branch. - Normalized shared validator `z.record(...)` schemas to explicit string-key records for Zod 4 compatibility. - Confirmed the PR has no `pnpm-lock.yaml` or `.github/workflows/*` changes and stays below the 100-file Greptile limit. ## Verification - `pnpm install --frozen-lockfile --ignore-scripts` - `npm run install` in `node_modules/.pnpm/sqlite3@5.1.7/node_modules/sqlite3` to build the local native sqlite3 binding after installing with scripts disabled - `pnpm exec vitest run packages/shared/src/validators/issue.test.ts packages/shared/src/project-mentions.test.ts packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/heartbeat-model-profile.test.ts server/src/__tests__/issue-recovery-actions.test.ts server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts server/src/__tests__/heartbeat-active-run-output-watchdog.test.ts server/src/__tests__/plugin-local-folders.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/components/Sidebar.test.tsx ui/src/components/SidebarAccountMenu.test.tsx ui/src/components/IssueProperties.test.tsx ui/src/components/MarkdownEditor.test.tsx ui/src/components/MarkdownBody.test.tsx ui/src/lib/duplicate-agent-payload.test.ts ui/src/pages/Routines.test.tsx` - First pass: 13 files passed with 201 passing tests; 3 server files failed before sqlite3 native binding was built. - After rebuilding sqlite3: `server/src/__tests__/heartbeat-model-profile.test.ts`, `server/src/__tests__/issue-recovery-actions.test.ts`, and `server/src/__tests__/heartbeat-active-run-output-watchdog.test.ts` passed/loaded; embedded Postgres tests were skipped by the local host guard. - `pnpm --filter @paperclipai/shared typecheck` - `pnpm --filter @paperclipai/adapter-utils typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` ## Risks - Medium risk: this is a broad rollup PR across recovery semantics, server tests, shared validators, and UI surfaces. - Some embedded Postgres tests skipped locally due the host guard, so CI should provide the stronger database-backed signal. - UI changes were covered by component tests, but no browser screenshot was captured in this PR creation pass. - This branch may overlap with existing recovery/liveness PR work; merge this PR independently or restack/close overlapping branches rather than merging duplicate implementations together. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled local repository and GitHub workflow, medium reasoning effort. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-17 17:15:06 -05:00
Dotta	705c1b8d81	[codex] Add routine env secrets support (#6212 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Scheduled routines are the control-plane path for recurring agent work. > - Routines already had dispatch/history, but their runtime environment did not carry routine-owned secret bindings through execution. > - Operators need routine-specific secrets that can override project/agent env without exposing secret values in history, logs, or access events. > - This pull request adds the routine env runtime contract, wires it into execution, and makes the routine UI/history surfaces show safe secret metadata. > - The benefit is that routine executions can use scoped secret refs predictably while preserving company boundaries and auditability. ## What Changed - Added routine env persistence/runtime support, including `routines.env`, `routine_runs.routine_revision_id`, revision snapshots, and idempotent migration `0086_routine_env_runtime_contract`. - Resolved routine env during heartbeat adapter config assembly with precedence `agent < project < routine` and secret access events recorded against the routine consumer. - Added secret binding synchronization for routine create/update/restore flows and guarded cross-company, missing, disabled, and deleted secret cases. - Added a Secrets tab to routine detail, env/secret history diff rendering, and Storybook coverage for the new UI states. - Added server/UI regression tests, including an embedded-Postgres QA path for routine secret execution and restore behavior. - Updated implementation/database docs for routine env and secret-binding behavior. ## Verification - `pnpm install --frozen-lockfile` after rebasing onto `public-gh/master` to refresh workspace links for the newly-added upstream Grok adapter package. - `pnpm exec vitest run server/src/__tests__/heartbeat-project-env.test.ts server/src/__tests__/routines-service.test.ts server/src/__tests__/secrets-service.test.ts server/src/__tests__/qa-routine-secrets-e2e.test.ts ui/src/components/RoutineHistoryTab.test.tsx` passed: 5 files, 92 tests. - `pnpm -r typecheck` passed across the workspace. - `pnpm build` passed. Vite emitted the existing large-chunk/dynamic-import warnings. - UI screenshots were captured locally during QA in `artifacts/pap-9521/` and `artifacts/pap-9522/`; generated screenshots are not committed to avoid adding binary artifacts to the repo. ## Risks - Migration risk is limited by `IF NOT EXISTS` guards for the new columns, FK, and index, and the migration is ordered as `0086` immediately after upstream `0085`. - Runtime behavior changes env precedence for routine executions by adding routine env as the highest-precedence layer; tests cover agent/project/routine precedence. - Secret handling is security-sensitive; tests cover value-free manifests/events/errors, disabled/missing/deleted secrets, and cross-company rejection. - UI history now renders routine env/secret diffs; tests and Storybook stories cover the main rendering paths. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, with shell/tool use and medium reasoning effort. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-17 16:30:34 -05:00
Devin Foley	573e9ec909	fix(grok-local): restore turn boundaries in streaming reasoning text (#6142 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The `grok-local` adapter streams reasoning text to the issue "Working..." panel as the grok CLI runs > - The `grok` CLI's `--output-format streaming-json` mode silently drops the `\n` separator between reasoning turns around tool calls > - Consecutive `thought` chunks (e.g. `` "`" `` followed by `"The"`) arrive with no intervening whitespace event, so the UI's `delta: true` concatenator merged them into run-on text like `"…planningGreat, now I have the issue descriptionThe only co"` > - This PR adds a small turn-boundary helper that detects sentence boundaries in the upstream `thought` stream and inserts a single `\n` only when the previous chunk ended with sentence punctuation (or a balanced closing backtick) AND the next chunk begins a new uppercase sentence > - The benefit is readable streaming reasoning in the UI without changing how completed messages are stored ## What Changed - Added `packages/adapters/grok-local/src/shared/turn-boundary.ts` with per-stream state (last chunk + backtick parity) and a `restoreTurnBoundary()` helper that inserts `\n` only between balanced, sentence-terminated `thought` chunks - Wired the helper into `parseGrokJsonl` (server) and added a new `createGrokStdoutParser` factory used by `grokLocalUIAdapter` for the live "Working..." panel - Added focused tests in `shared/turn-boundary.test.ts`, plus regression assertions in `server/parse.test.ts` and `ui/parse-stdout.test.ts` ## Verification - `pnpm --filter @paperclip/grok-local test` — 23/23 adapter tests pass - `pnpm --filter @paperclip/grok-local typecheck` and UI typecheck — clean - Replayed an actual broken `grok 0.1.210` stream from the report; previously-merged boundaries (`` `ls`The ``, `returned:Confirmed`) now render with a separating newline; chunks inside un-closed backtick spans are left alone ## Risks - Low risk. Boundary insertion only fires when prev ends with `.`/`!`/`?`/balanced `` ` `` and next begins with an uppercase ≥2-char word, with no whitespace on either side. Worst case: a rare missed split or a misplaced newline inside reasoning — both purely cosmetic and confined to the live streaming panel. ## Model Used - Claude Opus 4.7 (claude-opus-4-7), Anthropic, extended thinking + tool use via Claude Code ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-16 11:48:51 -07:00
Devin Foley	ab8b471685	Add built-in grok_local adapter (#6087 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so adapter quality directly affects what runtimes the control plane can supervise. > - Local CLI adapters are one of the core execution surfaces because they turn real coding tools into Paperclip-managed employees with heartbeats, transcripts, and reviewability. > - Grok Build was installed on the Paperclip host, but Paperclip had no built-in `grok_local` adapter, so the runtime could not be configured through the normal server/UI/CLI adapter path. > - That gap needed to be closed with the same built-in registry, environment diagnostics, transcript parsing, and skill/instructions behavior that the other local adapters already rely on. > - After the initial adapter landed, a real follow-up run showed that Grok streaming text was being rendered one fragment per line, which made transcripts harder to read even though the runtime itself was working. > - This pull request adds the built-in `grok_local` adapter end-to-end and then fixes the transcript parser so streamed Grok output is coalesced into readable assistant/thinking blocks. > - The benefit is that Grok Build becomes a first-class Paperclip runtime with a usable operator experience instead of a partially wired runtime with noisy transcript output. ## What Changed - Added a new built-in `@paperclipai/adapter-grok-local` package with server, UI, and CLI entrypoints. - Implemented Grok execution, session handling, environment diagnostics, config building, skill syncing, and parser coverage inside the new adapter package. - Registered `grok_local` across the built-in adapter inventories and capability/display metadata in server, UI, CLI, and shared constants. - Added adapter route coverage for the new built-in type. - Fixed Grok transcript readability by emitting streamed `text` and `thought` fragments as deltas so the shared transcript builder coalesces them into readable message blocks. - Added regression tests for the Grok parser and transcript coalescing behavior. ## Verification - `pnpm vitest run packages/adapters/grok-local/src/ui/parse-stdout.test.ts ui/src/adapters/transcript.test.ts` - `pnpm --filter @paperclipai/adapter-grok-local build` - Manual runtime verification on the Paperclip host during implementation and follow-up review: - confirmed the Grok CLI was installed and authenticated - confirmed the worktree dev server could be restarted cleanly and health-checked after the parser follow-up - No screenshots attached. This change is primarily adapter plumbing plus transcript formatting behavior; reviewers can verify via the Grok-backed run surfaces directly. ## Risks - This adds a new built-in adapter, so any missed registration surface could create inconsistencies between server, UI, and CLI behavior. - The adapter depends on Grok Build's current event/output shape; if upstream Grok streaming JSON changes, transcript parsing or session extraction may need follow-up updates. - The transcript readability fix intentionally changes how Grok fragments are grouped, so any downstream code that implicitly expected one entry per fragment would behave differently. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime. - GPT-5-class coding model with tool use, shell execution, file editing, and repo inspection enabled. - Exact backend model ID/context window were not surfaced to the agent in this Paperclip session. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-16 09:51:09 -07:00
Dotta	4c47eb46c3	[codex] Add multilingual issue preservation coverage (#6069 ) ## Thinking Path > - Paperclip orchestrates AI agents for autonomous companies. > - Agents and board operators coordinate through company-scoped issues, comments, documents, and heartbeat wake payloads. > - Chinese, Japanese, and Hindi text needs to survive the full issue lifecycle without normalization or prompt serialization damage. > - The riskiest paths are board issue creation, server issue/comment/document round-tripping, and scoped wake prompt rendering. > - This pull request adds focused regression coverage across those surfaces. > - The benefit is higher confidence that multilingual operators and agents can create, search, comment on, complete, and wake on issues using non-Latin text. ## What Changed - Added adapter-utils wake payload and prompt rendering coverage for Chinese, Japanese, and Hindi issue/comment text. - Added UI New Issue dialog coverage proving multilingual title and description text is submitted unchanged. - Added server route coverage that round-trips multilingual issue text through create, search, comments, documents, completion comments, and heartbeat context. - Addressed Greptile feedback by using a typed storage mock and splitting the server route integration path into smaller ordered assertions. ## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts ui/src/components/NewIssueDialog.test.tsx server/src/__tests__/multilingual-issues-routes.test.ts` - Result: 3 test files passed, 51 tests passed. ## Risks - Low risk: this PR adds regression coverage only and does not change runtime behavior. - The new server test uses embedded Postgres support and skips on unsupported hosts using the existing helper pattern. - No migrations are included. - No `pnpm-lock.yaml` changes are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected - check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 based coding agent, with shell, git, Vitest, and GitHub connector/CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-15 12:49:57 -05:00
Dotta	eb38b226c2	Fix LLM Wiki package and migration validation (#6010 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Plugins extend the control plane with optional capabilities such as LLM Wiki. > - LLM Wiki needs its package assets and plugin-owned database migrations to work when installed from the packaged plugin. > - The bundled spaces migration used validation-hostile dynamic SQL, and the packaged plugin could omit non-dist runtime assets. > - This pull request makes the LLM Wiki package include its required assets and cuts the spaces migration over to explicit, idempotent SQL that passes the production plugin database validator. > - The benefit is a simpler plugin install path that validates and applies the bundled LLM Wiki migrations without adding plugin-specific legacy handling to Paperclip core. ## What Changed - Added the LLM Wiki package asset allowlist so agents, migrations, skills, templates, dist output, and README are included when packaged. - Renamed the bootstrap `.gitignore` template to `gitignore.template` and updated the runtime lookup so package tooling does not drop the hidden template file. - Relaxed plugin migration validation to allow namespace-scoped `INSERT`/`UPDATE` backfills and `CREATE INDEX` statements while continuing to reject destructive or cross-namespace SQL. - Replaced the LLM Wiki spaces migration's dynamic constraint-drop DO block with explicit `DROP CONSTRAINT IF EXISTS` statements. - Replaced fragile regex-source dispatch in SQL reference extraction with explicit capture-group descriptors. - Added regression coverage that applies the bundled LLM Wiki migrations through the production validator and checks the expected constraints. ## Verification - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-database.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm --filter @paperclipai/plugin-llm-wiki build` - `git diff --check` - Confirmed `pnpm-lock.yaml` is not included in the branch diff. ## Risks - Low migration risk for current users: LLM Wiki spaces are new, so this intentionally cuts over the plugin migration instead of adding legacy handling in core. - Validator behavior is broader than before, but still requires fully qualified plugin namespace targets, blocks deletes/destructive DDL, and keeps public table access read-only and allowlisted. > Checked [`ROADMAP.md`](ROADMAP.md); this is a targeted plugin packaging/migration fix and does not duplicate planned core feature work. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 based coding agent, tool-enabled local repo access, reasoning mode managed by the Paperclip/Codex runtime. Exact context window was not surfaced in this session. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-15 10:20:02 -05:00
Dotta	03ad5c5bea	[codex] Add issue document locking (#6009 ) ## Thinking Path > - Paperclip orchestrates AI-agent companies through company-scoped issues, comments, and issue documents. > - Issue documents are the durable place where plans, handoffs, and other work artifacts are revised over time. > - Some documents need to be preserved as operator-approved snapshots while agents continue working on the same issue. > - Without document locking, a later board or agent write can overwrite the document key that reviewers expected to remain stable. > - This pull request adds board-managed issue document locks and makes agent writes to locked keys create a derived document instead of mutating the locked document. > - The benefit is safer document handoffs: approved or frozen issue documents stay immutable until the board explicitly unlocks them. ## What Changed - Added `locked_at`, `locked_by_agent_id`, and `locked_by_user_id` document fields plus migration `0085_tranquil_the_executioner.sql`. - Added document lock/unlock service behavior, route endpoints, activity events, and locked-document write protections. - Made agent document writes to locked keys create a new derived key such as `plan-2` rather than overwriting the locked document. - Surfaced lock state through shared issue document types, UI API methods, document header lock controls, and activity formatting. - Added server and UI tests for lock/unlock behavior, locked document immutability, and UI action visibility. - Updated `doc/SPEC-implementation.md` with the V1 document lock contract and endpoints. ## Verification - `git rebase public-gh/master` completed cleanly after committing the branch changes. - `git diff --check` passed before commit. - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts ui/src/components/IssueDocumentsSection.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/lib/document-revisions.test.ts` passed: 5 files, 32 tests. ## Risks - Medium risk because this changes the document persistence contract and adds a migration. - The migration uses `ADD COLUMN IF NOT EXISTS` and guarded foreign-key creation so it remains safe for users who may have already applied an earlier copy of the migration. - Locked documents intentionally reject board edits/deletes/restores until unlocked; any existing workflows that expected direct overwrite need to unlock first. - Agent writes to locked keys now create derived documents, which may create extra issue documents when agents retry locked writes. ## Model Used - OpenAI Codex coding agent based on GPT-5, with tool use and local code execution in the Paperclip worktree. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-15 08:54:55 -05:00
Devin Foley	1bd44c8a0d	Harden Cloudflare sandbox execution (#5967 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Remote-managed adapters need sandbox/environment execution to behave like real agent runs, not just local host probes. > - The Cloudflare sandbox path was the weakest leg in the SSH + Cloudflare QA matrix because bridge execution could truncate output, time out long-running installs, and under-provision the worker instance. > - That made several adapters fail for reasons unrelated to their actual business logic, which blocks confidence in Paperclip's non-local environment model. > - This pull request hardens the Cloudflare bridge/runtime path and adjusts sandbox probe budgets so adapter verification matches the measured behavior of the fixed environment. > - It also corrects the Pi sandbox install command so the QA matrix exercises a real, supported install path. > - The benefit is a materially more reliable SSH + Cloudflare adapter matrix with fewer false negatives and clearer failure boundaries. ## What Changed - Switched the Cloudflare bridge worker instance type to `standard-2` for the QA-matrix execution path. - Raised Cloudflare bridge/plugin-worker timeout budgets and added SSE keepalives so long-running install/exec calls can complete instead of dying at the transport layer. - Fixed Cloudflare bridge-channel command handling to avoid dropped final stdout chunks on short-lived execs. - Made Claude, OpenCode, and Cursor sandbox probe timeouts configurable/sandbox-aware, then tightened the defaults to the measured post-fix range. - Updated the Pi sandbox install command to use the package currently installed by the official `pi.dev` installer, pinned to a specific npm version. - Added/updated tests around Cloudflare bridge behavior and adapter sandbox probe paths. ## Verification - `pnpm --filter @paperclipai/adapter-claude-local typecheck` - `pnpm --filter @paperclipai/adapter-opencode-local typecheck` - `pnpm --filter @paperclipai/adapter-cursor-local typecheck` - `pnpm vitest run packages/adapters/cursor-local packages/adapters/claude-local packages/adapters/opencode-local packages/adapters/pi-local packages/plugins/sandbox-providers/cloudflare server/src/services/__tests__/plugin-worker-manager.test.ts` - Manual QA on the dedicated dev instance using the SSH + Cloudflare environment matrix (`ENV-29` through `ENV-40`). Clean end-to-end passes: SSH `claude_local`, `codex_local`, `cursor`, `gemini_local`; Cloudflare `claude_local`, `codex_local`, `cursor`, `gemini_local`. ## Risks - Cloudflare sandbox cost increases because the bridge worker now runs on `standard-2` instead of `lite`. - Higher timeout ceilings can delay surfacing truly hung Cloudflare bridge calls, even though they remove transport-level false negatives. - The manual heartbeat matrix still exposed follow-on execution/sync/disposition bugs in `opencode_local` and `pi_local`; those are not fixed by this PR. ## Model Used - OpenAI `gpt-5.4` via Paperclip `codex_local`, reasoning effort `high`, tool use enabled, repo search enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (not applicable) - [x] I have updated relevant documentation to reflect my changes (not applicable) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-13 22:00:10 -07:00
Dotta	4142559c37	[codex] Add blocked inbox attention view (#5603 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies through company-scoped issues, comments, approvals, and execution workspaces. > - Operators need the Inbox to show not only active work, but also blocked work that may need human or agent attention. > - The existing inbox experience did not have a dedicated blocked-work surface, so blocked tasks were harder to triage and resume deliberately. > - Backend consumers also needed a compact attention signal that distinguishes actionable blockers from covered or waiting blocker states. > - This pull request adds a Blocked Inbox tab backed by issue blocker-attention metadata, shared validators, and UI helpers. > - The benefit is a clearer triage path for stalled or blocked Paperclip work without exposing external wait internals in the operator-facing UI. ## What Changed - Added shared issue blocker-attention types, validators, and exports for the API/UI contract. - Added backend blocker-attention computation and issue route support for blocked inbox data. - Added the Blocked Inbox tab, blocked reason chips, filtering/search UI, responsive layouts, and Storybook stories. - Updated inbox helpers and page behavior so toolbar controls only appear where they apply. - Added coverage for shared validators, server blocker-attention behavior, blocked inbox UI helpers/components, and the Inbox page. - Added a screenshot helper script for the blocked inbox Storybook stories. - Addressed Greptile feedback by making urgency sorting deterministic for null stop times, avoiding full blocked-inbox list enrichment for counts, and hardening the screenshot helper. ## Verification - Rebased the branch cleanly onto `public-gh/master`. - Confirmed the diff does not include `pnpm-lock.yaml`. - Confirmed the diff does not include database migration files. - Ran `pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/issue-blocker-attention.test.ts ui/src/components/BlockedInboxView.test.tsx ui/src/components/BlockedReasonChip.test.tsx ui/src/lib/blockedInbox.test.ts ui/src/lib/inbox.test.ts ui/src/pages/Inbox.test.tsx`. - Ran `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck`. - Checked `ROADMAP.md`; this is scoped inbox/operator triage work and does not duplicate a listed roadmap feature. - Greptile Review is green on the latest head and all four Greptile review threads are resolved. - GitHub PR checks are green on the latest head: policy, security/snyk, e2e, verify, Canary Dry Run, Greptile Review, and serialized server suites 1/4 through 4/4. ## Risks - Medium review surface because this touches the shared issue contract, server issue services, and the Inbox UI together. - Blocker-attention classification may need product tuning after operators use it on real blocked queues. - UI screenshots were not attached in this PR-opening pass; the branch includes `scripts/screenshot-blocked-inbox.mjs` and Storybook stories for visual capture. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used OpenAI Codex, GPT-5-based coding agent with shell, git, GitHub CLI, GitHub connector, and Paperclip API tool use. Reasoning mode: medium. Context window: not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-13 16:41:36 -05:00
Dotta	d1a8c873b2	fix(remote-sandbox): harden host workspace resumes (#5922 ) ## Thinking Path > - Paperclip orchestrates AI agents through a control plane while adapters execute work in local, remote, or sandboxed runtimes. > - Remote sandbox execution depends on a strict host-versus-remote workspace boundary: the host prepares/restores files, while the adapter command runs inside the sandbox cwd. > - Jannes' PR #5823 identified host-side failure modes that were not covered by replacement PR #5822. > - Persisting a remote pod cwd in session params could poison the next host heartbeat resume and make Paperclip inspect or upload system temp roots. > - Plugin sandbox providers also need a narrow way to receive model-provider API keys without exposing the full server environment to every plugin worker. > - This pull request ports the host-side fixes from #5823 in the current codebase style, with focused regression coverage. > - The benefit is safer remote sandbox resumes and plugin worker environment handling without broadening core plugin privileges. ## What Changed - Persist host workspace cwd, not remote sandbox cwd, in `claude_local` session params while retaining remote execution identity metadata. - Reject saved session cwds that point at system roots before heartbeat falls back to agent home workspace. - Skip sockets, FIFOs, devices, and other non-file entries during workspace restore snapshot capture/comparison. - Pass a small model-provider API-key allowlist only to plugins declaring `environment.drivers.register`. - Added focused regression tests for remote Claude session params, unsafe session cwd detection, plugin worker env filtering, and non-file snapshot entries. Credits: ports host-side fixes from Jannes' #5823. ## Verification - `pnpm vitest run packages/adapter-utils/src/workspace-restore-merge.test.ts server/src/services/session-workspace-cwd.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/plugin-database.test.ts` (25 passed, 7 skipped by existing embedded-Postgres host guard) - `pnpm --filter @paperclipai/adapter-utils typecheck` - `pnpm --filter @paperclipai/adapter-claude-local typecheck` - `pnpm --filter @paperclipai/server typecheck` ## Risks - Low risk: changes are scoped to remote sandbox/session metadata, workspace snapshot filtering, and plugin worker env setup. - Sandbox-provider plugins now receive only the explicit model-provider key allowlist; any provider needing another key name will need a deliberate allowlist update. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled local code execution and repository editing. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-13 16:23:04 -05:00
Dotta	b947a7d76c	[codex] Improve local plugin development workflow (#5821 ) ## Thinking Path > - Paperclip is the control plane for autonomous AI-agent companies. > - Plugins are the extension point for adding capabilities without expanding the core product surface. > - Local plugin development needed a tighter CLI-first loop so plugin authors can scaffold, run, install, inspect, and reload plugins without reaching into internal package paths. > - The server plugin install path also needed local-path handling that keeps plugin identity, dashboard routes, and development watchers coherent. > - This pull request adds the CLI scaffold/install workflow, fixes the server and SDK edge cases that blocked that loop, and updates the agent-facing plugin creation skill and docs. > - The benefit is that contributors can develop plugins from local folders with a documented, repeatable happy path. ## What Changed - Added `paperclipai plugin init` coverage and CLI wiring for local plugin scaffolding. - Improved local plugin install handling, plugin key route resolution, dashboard capability behavior, and dev watcher startup/reload behavior. - Fixed plugin SDK worker entrypoint validation for symlinked package layouts. - Added targeted tests for plugin init, server plugin authz/watcher behavior, SDK worker host validation, and the authoring smoke example. - Added a short local plugin development guide and refreshed the plugin authoring guide plus `paperclip-create-plugin` skill instructions. ## Verification - `pnpm run preflight:workspace-links && pnpm --filter @paperclipai/plugin-sdk build && pnpm --filter @paperclipai/create-paperclip-plugin typecheck && pnpm --filter paperclipai typecheck && pnpm --filter @paperclipai/plugin-sdk typecheck && pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run --project paperclipai cli/src/__tests__/plugin-init.test.ts` - `pnpm exec vitest run --project @paperclipai/plugin-sdk packages/plugins/sdk/tests/worker-rpc-host.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-dev-watcher.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-routes-authz.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm --dir packages/plugins/examples/plugin-authoring-smoke-example test` - Confirmed `pnpm-lock.yaml` is not included in the PR diff. ## Risks - Medium risk: this touches plugin install routing, CLI command behavior, and the local development watcher. - Local path plugin installs execute trusted local code by design; the new docs call out that trust boundary. - No database migrations are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled local shell and git workflow, medium reasoning effort. Context window details were not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge UI screenshots: not applicable; this PR changes CLI/server/plugin docs and tests, not board UI rendering. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-12 17:38:24 -05:00
Dotta	0808b388ee	[codex] Add source-scoped recovery actions (#5599 ) ## Thinking Path > - Paperclip is a control plane for autonomous AI companies, where work must end with a clear disposition rather than ambiguous agent liveness. > - Recovery currently detects stalled or missing-next-step issues, but source issue recovery can become split across child recovery issues, blockers, and comments. > - That makes it harder for operators and agents to see who owns recovery and what exact action is needed on the original issue. > - Source-scoped recovery actions give the original issue a first-class active recovery state with owner, evidence, wake policy, and resolution outcome. > - This pull request adds the recovery-action data model, backend reconciliation and resolution APIs, and board UI indicators/actions. > - The benefit is clearer stalled-work recovery without losing source issue context or relying on comments as the liveness path. ## What Changed - Added the `issue_recovery_actions` schema, shared types/constants/validators, and an idempotent `0084_issue_recovery_actions` migration ordered after current `master` migrations. - Updated stranded/missing-disposition recovery to create source-scoped recovery actions, wake the recovery owner on the source issue, and avoid locking the source issue for recovery-action wakes. - Added API support for reading active recovery actions on issue detail/list surfaces and resolving them with restored, blocked, cancelled, or false-positive outcomes. - Require blocked recovery resolutions to have an unresolved first-class blocker, and removed the UI shortcut that could mark recovery blocked without a blocker selection path. - Surfaced recovery indicators/actions in the issue UI, blocker notices, active run panels, issue rows, and Storybook coverage. - Updated docs and focused tests for recovery semantics, ownership, races, stale comments, and UI behavior. ## Verification - `pnpm exec vitest run server/src/__tests__/issue-recovery-actions.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/components/IssueBlockedNotice.test.tsx ui/src/api/issues.test.ts` — 5 files, 72 tests passed. - `pnpm --filter @paperclipai/shared typecheck` — passed. - `pnpm --filter @paperclipai/db typecheck` — passed, including migration numbering check. - `pnpm --filter @paperclipai/server typecheck` — passed. - `pnpm --filter @paperclipai/ui typecheck` — passed. - Follow-up verification after blocker-resolution guard: `pnpm exec vitest run server/src/__tests__/issue-recovery-actions.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/api/issues.test.ts` — 3 files, 27 tests passed. - Follow-up `pnpm --filter @paperclipai/server typecheck` — passed. - Follow-up `pnpm --filter @paperclipai/ui typecheck` — passed. - UI states are available in `ui/storybook/stories/source-issue-recovery.stories.tsx`; screenshot capture helper is `scripts/screenshot-recovery-card.cjs`. ## Risks - Medium: recovery behavior changes from child recovery issue ownership toward source-scoped actions, so operators may see stalled-work state in new places. - Migration risk is mitigated by using the next migration slot after `master` and making the table/constraints/index creation idempotent for anyone who previously applied the old branch-local `0082_dizzy_master_mold` migration. - Existing child recovery issue paths are still guarded for already-created recovery issues, but new source-scoped flows should be watched in CI and Greptile review. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool use enabled for shell, Git, GitHub, and local test execution. Context window not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-12 09:37:15 -05:00
Devin Foley	c445e59256	fix(ui): fix message attribution for agent-posted comments with user author IDs (#5780 ) ## Thinking Path > - Paperclip’s issue chat is an audit surface: reviewers need to trust who actually authored a message. > - Some historical agent comments were persisted with `authorUserId` and no surviving `createdByRunId`, so the UI rendered real agent output as if it came from the board user. > - A pure timestamp-window fallback is too risky because human reviewers can comment while agents are running. > - The safe recovery path is to derive attribution only when the server can prove it from same-issue run logs that include the exact posted comment id, then let the chat renderer prefer that recovered agent attribution. > - This keeps historical threads trustworthy without mutating old database rows or guessing in ambiguous cases. ## What Changed - Added shared `IssueComment` fields for derived attribution so server and UI can carry recovered `derivedAuthorAgentId`, `derivedCreatedByRunId`, and `derivedAuthorSource` consistently. - Added server-side attribution recovery in `server/src/services/issues.ts` that reads same-issue run logs and only derives agent authorship when a run log contains the exact `comment id: ...` emitted during posting. - Updated issue chat rendering in `ui/src/lib/issue-chat-messages.ts` to prefer direct agent authorship, then activity-log `runAgentId`, then the server-derived attribution. - Removed the unsafe UI-only run-window fallback from `ui/src/pages/IssueDetail.tsx` so human comments posted during an active run are not silently relabeled as agent output. - Added regression coverage for both the run-log derivation path and the chat-rendering fallback behavior. - Bounded server-side run-log enrichment to 8 concurrent reads per request and removed the unused `issueCommentSchema` declaration during PR cleanup. ## Verification - `pnpm exec vitest run ui/src/lib/issue-chat-messages.test.ts server/src/__tests__/issues-service.test.ts` - `pnpm test:run:general` - Live validation on May 12, 2026 in `PAPA-322`: confirmed the previously misattributed historical comments on `PAPA-316` now render as Claude-authored on `http://goldie.gerbil-company.ts.net:3100`. - Reviewer check: open `PAPA-316` in the running instance and confirm historical comments such as `## Investigation: exe.dev 422 + codex re-test` render under Claude instead of the board user. ## Risks - Low risk. The change is scoped to comment attribution recovery and rendering. - Derived attribution is intentionally conservative: if there is no exact run-log proof, the comment remains user-authored instead of guessing. - Run-log recovery depends on retained same-issue logs, so older comments without that evidence remain unchanged. ## Model Used - OpenAI Codex via the Paperclip `codex_local` adapter (GPT-5-class coding agent with tool use in the local Paperclip runtime; the exact deployment/model ID is not surfaced by this workspace). ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-12 01:20:49 -07:00
Dotta	563413ecd4	Fix LLM wiki type contracts (#5758 ) ## Thinking Path > - Paperclip is the control plane for autonomous AI companies, and plugins extend that control plane without bloating core. > - The LLM Wiki plugin adds a knowledge surface through the plugin runtime and shared plugin UI components. > - After the LLM Wiki work merged to `master`, CI exposed TypeScript contract drift between plugin code, SDK component types, and update settings types. > - The ingestion settings update path intentionally accepts partial source toggles, but its type intersected with the full settings shape and required every source key. > - The LLM Wiki UI also passes managed routine default-drift metadata through the shared routine list item shape, but that metadata was missing from the public item type. > - This pull request narrows those type contracts to match the existing runtime behavior. > - The benefit is restoring typecheck on `master` with a small, non-behavioral follow-up. ## What Changed - Added a `WikiEventIngestionSettingsUpdate` type that permits partial source updates without weakening normalized stored settings. - Added managed routine default-drift metadata to the plugin SDK `ManagedRoutinesListItem` type. - Mirrored that managed routine default-drift type in the host UI component item type. ## Verification - `pnpm --filter @paperclipai/plugin-llm-wiki typecheck` - `pnpm --filter @paperclipai/plugin-sdk typecheck` - `pnpm --filter @paperclipai/ui typecheck` - `git diff --check` ## Risks - Low risk. This is a TypeScript type-contract fix only; no runtime behavior or database schema changes. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled local repository editing and command execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Notes on checklist applicability: no screenshots are included because the UI change is a shared type-only contract update with no visual behavior change; no docs were required because no behavior or commands changed. Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-11 21:07:06 -05:00
Dotta	508355b8fc	[codex] Add LLM Wiki plugin package to master (#5716 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The plugin system is the extension surface for optional product capabilities without baking every workflow into core. > - The LLM Wiki plugin package was reviewed in stacked PR #5592, which targeted `pap-9173-llm-wiki-rest`. > - The stack base PR #5597 merged to `master` before #5592 was merged into that branch, so the plugin package never reached `master`. > - A direct PR from `pap-9173-llm-wiki-rest` back to `master` would be noisy because that branch has diverged from current `master`. > - This pull request reapplies the reviewed `packages/plugins/plugin-llm-wiki/` package onto current `master` and updates Docker deps-stage manifest coverage. > - The branch intentionally no longer changes `pnpm-workspace.yaml` after maintainer feedback; because the new package is now a root workspace importer, the remaining integration question is how maintainers want the root lockfile handled under the current PR policy. ## What Changed - Added the LLM Wiki plugin package under `packages/plugins/plugin-llm-wiki/` from the merged PR #5592 head. - Preserved the post-review cleanup from #5592: generated design/screenshot artifacts are not committed, and `src/ui/index.tsx` / `src/wiki.ts` are small public entrypoints. - Added the new plugin package manifest to the Docker deps stage so policy can validate package manifest coverage. - Removed the earlier `pnpm-workspace.yaml` exclusion per maintainer request, so the plugin is included by the existing `packages/plugins/*` workspace glob. ## Verification Current head: - PGlite migration harness: ran migrations 001-003, verified old non-space distillation unique constraints were removed, inserted duplicate cursor and work-item keys in a second space, then reran migration 003 successfully - `node ./scripts/check-docker-deps-stage.mjs` - `git diff --check` Known current-head install result after removing the workspace exclusion: - `pnpm install --frozen-lockfile` fails because `pnpm-lock.yaml` has no importer for `packages/plugins/plugin-llm-wiki/package.json`. Previously verified on the same plugin source before the workspace-exclusion removal: - `pnpm --filter @paperclipai/plugin-sdk build` - `cd packages/plugins/plugin-llm-wiki && pnpm install --lockfile=false && pnpm test` ## Risks - The branch now includes `packages/plugins/plugin-llm-wiki` in the root workspace but does not update `pnpm-lock.yaml`. Root frozen install will fail until maintainers choose a lockfile path that fits repo policy. - Committing `pnpm-lock.yaml` directly on this PR conflicts with the current PR policy check, while excluding the package from `pnpm-workspace.yaml` was rejected in maintainer feedback. - The package includes UI code already reviewed in #5592; generated screenshot/design artifacts were intentionally removed per maintainer request, so visual review should regenerate screenshots locally if needed. - The package depends on plugin host support from #5597, which is already merged to `master`. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution enabled; context window not exposed. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run the targeted checks listed above - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Stack context: #5592 was merged into `pap-9173-llm-wiki-rest` after #5597 had already merged that branch to `master`, so this follow-up PR is needed to carry the plugin package itself into `master`. Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-11 20:45:41 -05:00
Devin Foley	ad0bb57350	Fix exe.dev sandbox installs for gemini/opencode local adapters (#5737 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, including running adapter CLIs inside remote sandboxes > - The QA matrix in PAPA-316 spins up local-runtime adapters (claude/gemini/opencode) against both SSH and the new exe.dev sandbox provider, and "Test" exercises the same install + probe path the real runtime uses > - On exe.dev the QA matrix failed at three different points: SSH/sandbox secret refs would not resolve, gemini-local could not find npm, and opencode-local installed a binary that was not on the probe-shell PATH > - These are all environment-shape issues the runtime should handle, not regressions in any individual adapter, so they need to be fixed in the shared install/resolve layer before the matrix can pass > - This pull request wires the environment id through to secret-ref resolution, bootstraps npm from a portable Node tarball when the sandbox image lacks Node, and symlinks the opencode binary into a directory that non-login shells see > - The benefit is that the QA matrix passes end-to-end on exe.dev, and any future sandbox provider that ships without Node or relies on rc-file PATH wiring gets the same fixes for free ## What Changed - `server/src/services/environment-execution-target.ts`: pass the environment `id` into `resolveEnvironmentDriverConfigForRuntime` for both the sandbox and SSH branches, so `privateKeySecretRef` / sandbox-provider secret refs (e.g. exe.dev `apiKey`) can resolve against the secret store at runtime instead of throwing `Runtime secret resolution requires an environment id`. - `packages/adapter-utils/src/sandbox-install-command.ts`: extend `buildSandboxNpmInstallCommand` with an `ENSURE_NPM_PREAMBLE` that, when `npm` is missing, downloads a portable Node v22 tarball into `$HOME/.local` and sets `PAPERCLIP_NPM_BOOTSTRAPPED=1` so the install step skips sudo (sudo's `secure_path` would lose the freshly-installed `npm` in `$HOME/.local/bin`). Distro-packaged Node from apt-get is intentionally avoided because it tends to be too old to parse modern JS syntax used by `@google/gemini-cli`. - `packages/adapters/gemini-local/src/index.ts`: switch the hardcoded `npm install -g @google/gemini-cli` to `buildSandboxNpmInstallCommand`, so gemini-local picks up the same sudo-aware + npm-bootstrap behavior as the other local adapters. - `packages/adapters/opencode-local/src/index.ts`: append a step to the install command that symlinks `$HOME/.opencode/bin/opencode` into `$HOME/.local/bin`. The upstream installer only adds `~/.opencode/bin` to PATH via `~/.bashrc`, which non-login `sh -c` probe invocations do not source. - `packages/adapter-utils/src/sandbox-install-command.test.ts`: cover the new preamble plus the unchanged root/sudo/user-prefix branches. ## Verification - `cd packages/adapter-utils && npm test -- sandbox-install-command` (passes; new "bootstraps npm from a portable Node tarball when missing" case is included). - Manual: ran the in-app `Test` action against the QA matrix dev instance for `QA exe.dev Claude`, `QA exe.dev Gemini`, and `QA exe.dev OpenCode` — all three now report `status=pass` including the hello probe. `QA SSH Claude` also passes; without the environment-id fix, SSH resolution threw before the wrapper / install fixes could run. - Suggested reviewer check: re-run the matrix on a fresh exe.dev environment and confirm the install step no longer hits `npm: command not found` for gemini and the opencode probe no longer hits `opencode: command not found`. ## Risks - Low/medium. The npm bootstrap pins Node `v22.11.0` from `nodejs.org/dist`; if that URL becomes unreachable the install will fail with a clear `curl` error rather than corrupting state. The bootstrap path is only taken when `npm` is genuinely missing, so existing sandbox images that ship with Node are unaffected. - The opencode symlink uses `ln -sf` into `$HOME/.local/bin`, which is created with `mkdir -p`; idempotent on re-install. - The `id` change is a strict additive: callers previously got `undefined` and only the secret-ref code paths actually read it. No behavior change for environments without secret refs. ## Model Used - Claude (Anthropic), `claude-opus-4-7`, with extended thinking and tool use enabled. Iterated through the Paperclip QA matrix harness; no other model assisted. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots (n/a — runtime/install path only) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-11 14:28:22 -07:00
Devin Foley	5a64cf52a1	Add exe.dev sandbox provider plugin (#5688 ) > _Stacked on top of #5685 → #5686 → #5687. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Add long-secret textarea variant to JsonSchemaForm SecretField` + `Add exe.dev sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose the provider — today E2B, Daytona, and (in this stack) Cloudflare > - exe.dev offers per-VM sandboxes via a small CLI / HTTP API — useful for operators who want full Linux VMs (vs container/runtime-only sandboxes) > - The plugin shape mirrors the e2b plugin: lifecycle hooks (`new`, `ls`, `rm`) drive exe.dev's CLI; SSH plumbing handles direct VM access for adapters that need it > - exe.dev VMs come up bare — `node` is not preinstalled, so the Paperclip sandbox callback bridge (a Node script) needs Node 20 installed at VM init via `--setup-script`. The plugin defaults the setup script to a Nodesource install > - The auth field accepts long SSH private keys, which need a textarea variant of the existing `SecretField` in `JsonSchemaForm` — added behind a `maxLength > THRESHOLD` opt-in so other secret fields are unaffected > - The benefit is that operators get exe.dev as a fully working sandbox provider out of the box, with no manual VM provisioning required ## What Changed Shared UI support (`Add long-secret textarea variant to JsonSchemaForm SecretField`): - `ui/src/components/JsonSchemaForm.tsx` + new `JsonSchemaForm.test.tsx`: when a secret-formatted field declares `maxLength` larger than the existing single-line threshold, render a monospace textarea instead of the masked input. Short secrets (API keys, tokens) keep the existing masked-input + show/hide toggle behavior. The exe.dev plugin (`Add exe.dev sandbox provider plugin`): - `packages/plugins/sandbox-providers/exe-dev/`: plugin entry, manifest, plugin runtime, README, and 19-test Vitest suite. - Manifest fields: API token (with `secret-ref` + `/exec` permission notes — needs `new`, `ls`, `rm`), API URL override, optional SSH username, optional SSH private key (uses the new `JsonSchemaForm` textarea variant via `maxLength: 4096`), optional SSH identity-file path, optional setup script. - Default `--setup-script` is a Nodesource Node 20 install. exe.dev VMs come up bare and the Paperclip sandbox callback bridge is a Node script, so without Node preinstalled the bridge can't start. Operators can override by supplying their own setup script. - `runLifecycleCommand` redacts env values from the executed command before surfacing it in error messages, so secrets passed via `--env=KEY=VALUE` don't leak into operator-visible failures. - The plugin distinguishes exe.dev's SSH onboarding failures (`Please complete registration by running: ssh exe.dev`) from general SSH failures and surfaces a clear remediation message. - `scripts/release-package-manifest.json`: register the new plugin for CI publish alongside the existing daytona / e2b providers. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage ui/src/components/JsonSchemaForm.test.tsx` - `(cd packages/plugins/sandbox-providers/exe-dev && pnpm test)` — 19 passing For an operator-side smoke test: 1. Get an exe.dev API token with `/exec` permission for `new`, `ls`, `rm`. 2. Register the plugin in your Paperclip instance, configure an environment with the token. 3. Create a sandbox env whose provider is `exe-dev`, then run a Codex or Claude job against it. The default Node 20 setup script should bring the VM up automatically. ## Risks - Adds a new sandbox provider plugin that follows the existing daytona / e2b shape; behavior on existing providers is unchanged. - The `JsonSchemaForm` textarea variant only engages for fields that opt in via `maxLength` larger than the existing threshold. All existing secret fields (which don't declare a `maxLength`) keep their current rendering. Test coverage pins both paths. - The redaction in `runLifecycleCommand` is a defense-in-depth measure; the test suite exercises the redaction path. If the redaction misses a future env-arg shape, the worst case is restored behavior (secrets in error messages), which is what the existing daytona / e2b plugins also do today. - Default setup script downloads from `deb.nodesource.com` over HTTPS at VM init. Operators on air-gapped networks or with a different package strategy can override the setup script. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — UI change is a textarea variant of an existing secret field; will attach screenshots before requesting merge - [x] I have updated relevant documentation to reflect my changes (plugin README, manifest descriptions) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-11 07:42:18 -07:00
Devin Foley	486fb88a15	Add Cloudflare sandbox provider plugin (#5687 ) > _Stacked on top of #5685 → #5686. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Extend sandbox callback bridge for Worker-hosted plugins` + `Add Cloudflare sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose which provider backs that sandbox — today E2B and Daytona are bundled with the platform > - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a credible new option: globally distributed, cheap idle, and operator-deployable as a single Worker > - To plug it in, Paperclip needs (a) a provider plugin that speaks the `PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed Worker — the bridge — that adapts Paperclip's runtime RPCs to the Cloudflare Sandbox SDK > - The plugin extends the existing sandbox-callback-bridge with a `bridge.transport: "worker"` discriminator so the platform routes runtime RPCs through the Worker bridge instead of the in-process runner > - This pull request adds the plugin, the bridge Worker template, and the supporting adapter-utils + server hooks the new transport needs > - The benefit is that operators can run sandboxes on Cloudflare's edge with no new platform code beyond installing the plugin and deploying the Worker ## What Changed Shared support (`Extend sandbox callback bridge for Worker-hosted plugins`): - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: expose `expectedHostHeader` so plugin-side bridge clients can verify the canonical request envelope before forwarding. - `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`: relax the always-fresh runner construction so callers can re-use a runner across exec calls (Worker-hosted bridges hold the runner inside a Durable Object). - `server/src/services/environment-runtime.ts` + `environment-runtime.test.ts`: route Worker-hosted bridges through the same env-shaping path as E2B and pin the `requestEnv` contract. - `server/src/services/plugin-environment-driver.ts`: thread an optional `issueId` through the runtime descriptor so bridges can scope leases to the originating issue (used by Cloudflare to map a sandbox to the issue/workflow for billing and audit). - `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to `PluginEnvironmentDriverBaseParams` and the new `bridge.transport: "worker"` discriminator that the new plugin declares. - `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the heartbeat path against the new runtime descriptor. The Cloudflare plugin itself (`Add Cloudflare sandbox provider plugin`): - `packages/plugins/sandbox-providers/cloudflare/`: plugin entry, manifest, plugin runtime (lifecycle + bridge client), config parsing, and Vitest coverage. Manifest declares `bridge.transport: "worker"` so the platform routes runtime RPCs through the bridge client. - `bridge-template/`: a Worker template the operator deploys with `wrangler`. Owns Durable Object-backed sessions (`sessions.ts`), exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer (`auth.ts`) that pins the `Host` header surface. Includes the SDK-contract-correct exec implementation, lease recovery, and chunked stdout/stderr streaming. - Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`, `routes.test.ts`), bridge client request shaping (`src/bridge-client.test.ts`), and end-to-end plugin behavior (`src/plugin.test.ts`) including streamed exec output. 27 tests in total. - `README.md` walks the operator through deploying the bridge Worker, registering the plugin, and configuring the runtime. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/sandbox-callback-bridge.test.ts packages/adapter-utils/src/command-managed-runtime.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/heartbeat-plugin-environment.test.ts` - `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27 passing For an operator-side smoke test: 1. Deploy the bridge: `cd packages/plugins/sandbox-providers/cloudflare/bridge-template && wrangler deploy` 2. Register the plugin in your Paperclip instance, point its bridge URL at the deployed Worker, set the HMAC shared secret. 3. Create a sandbox environment whose provider is `cloudflare`, then run a Codex or Claude job against it. ## Risks - Adds a new `bridge.transport: "worker"` code path, but the existing E2B / Daytona transports go through the same shaped helpers and have explicit test coverage that pins their behavior unchanged. - The Worker bridge stores session state in a Durable Object; operator instances must be aware of the corresponding Cloudflare costs (DO requests, storage). Documented in the README. - The `issueId` plumbing is optional throughout — existing plugins that don't supply it continue to work. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes (plugin README, bridge-template README) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-11 07:33:13 -07:00
Devin Foley	0fe39a2d5c	fix(cursor-local): resolve sandbox agent installs from cursor bin (#5686 ) > _Stacked on top of #5685 (Harden remote sandbox runtime). Diff against master includes commits from earlier PRs in the stack — review focuses on the new commit only._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The cursor-local adapter wraps the Cursor Agent CLI so a Paperclip workflow can drive it inside a sandbox > - When the adapter runs in a remote sandbox, the Cursor Agent CLI installs under `$HOME/.local/bin/cursor-agent` (or wherever `$XDG_BIN_HOME` points), not on the global PATH > - The existing post-install resolution assumed `cursor-agent` would resolve via the sandbox's login shell PATH after `npm install -g`, which fails on sandboxes where the install lands in a user-prefixed directory that isn't on PATH at probe time > - This pull request resolves the agent CLI from the cursor binary's own directory (`dirname "$(command -v cursor)"`) so the install probe and execute path agree on a real binary location > - The benefit is that cursor-local works correctly on any sandbox provider where `npm install` lands in a user-prefixed directory ## What Changed - `packages/adapters/cursor-local/src/server/remote-command.ts`: resolve the cursor-agent binary from the cursor bin directory after install, instead of relying on PATH. - `packages/adapters/cursor-local/src/server/test.ts`: corresponding probe tweak. - `packages/adapters/cursor-local/src/server/test.test.ts` (new) + `remote-command.test.ts`: focused coverage that exercises the install + resolve path against a sandbox runner that places the binary in a user-prefixed directory. ## Verification - `pnpm exec vitest run --no-coverage packages/adapters/cursor-local/src/server/test.test.ts packages/adapters/cursor-local/src/server/remote-command.test.ts packages/adapters/cursor-local/src/server/execute.test.ts` All passing locally. ## Risks - Local cursor-local runs are unaffected — the resolution change only kicks in for the sandbox install path. - Low risk; isolated to one adapter. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: tool use (Read/Edit/Bash), no code execution beyond local repo commands ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-11 00:41:20 -07:00
Devin Foley	b24c6909e8	Harden remote sandbox runtime probes, timeouts, and installs (#5685 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs inside a sandbox environment so its CLI is isolated from the host > - Sandbox-backed adapter runs go through a small set of shared helpers — `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings > - When standing up new sandbox provider plugins, the existing helpers timed out, missed install fallbacks, or leaned on assumptions that only held for E2B > - Local adapters (`claude-local`, `codex-local`, `gemini-local`, `opencode-local`) needed slightly hardened probes so they could install themselves and validate inside any remote sandbox transport, not just E2B > - This pull request bundles those runtime fixes so future sandbox provider plugins inherit a working baseline > - The benefit is that adding a new sandbox provider plugin no longer requires touching adapter-utils or each local-adapter probe — the supporting infra is already correct ## What Changed - `packages/adapter-utils/src/execution-target.ts`: introduce `DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and `resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters keep the historical "0 means no adapter timeout" behavior; sandbox-backed runs without an explicit `timeoutSec` get an explicit 30-minute default so remote installs and warm-up don't time out at the per-RPC default. Plumbed `timeoutSec` through `ensureAdapterExecutionTargetCommandResolvable` so install probes inside a sandbox honor adapter-level overrides instead of the bridge's 5-minute default. - `packages/adapters/opencode-local/src/index.ts`: switch `SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl -fsSL https://opencode.ai/install \| bash`. The npm package reifies four large prebuilt-binary subpackages in parallel even though only one matches the host arch; on bandwidth-constrained sandboxes that blew through the 240s install budget. The official installer fetches one arch-specific binary and adds `$HOME/.opencode/bin` to PATH via `~/.bashrc`, which the sandbox-callback-bridge login-shell script already sources. - `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden remote-target probes — pass `--skip-git-repo-check` for Codex when probing outside a repo, normalize permission flags for Claude, and add `.remote.test.ts` coverage that exercises the remote-sandbox path explicitly for each adapter. - `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}` (new): add `buildSandboxNpmInstallCommand` helper. `server/src/adapters/registry.ts` + new `server/src/__tests__/adapter-registry.test.ts`: wire adapter install commands so they fall back to a writable `$HOME/.local` prefix when global install isn't available. - `server/src/__tests__/plugin-worker-manager.test.ts` + new `server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call timeout overrides so plugin worker exec calls honor the caller's timeout instead of the worker's default. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/execution-target-sandbox.test.ts packages/adapter-utils/src/sandbox-install-command.test.ts` - `pnpm exec vitest run --no-coverage server/src/__tests__/plugin-worker-manager.test.ts server/src/__tests__/adapter-registry.test.ts server/src/__tests__/claude-local-adapter-environment.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/gemini-local-adapter-environment.test.ts` - `pnpm exec vitest run --no-coverage packages/adapters/codex-local/src/server/test.remote.test.ts packages/adapters/opencode-local/src/server/test.remote.test.ts packages/adapters/codex-local/src/server/codex-args.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts` All passing locally. ## Risks - Touches shared `adapter-utils` and several `-local` adapters. The 30-minute default applies only when both (a) the target is `remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH paths are unchanged. New test coverage was added alongside each behavior change to pin the contracts. - Switching OpenCode's install command to the official installer is a behavior change for any operator running OpenCode inside a remote sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND` only runs when an adapter is being installed inside a sandbox). - Low risk overall — no migrations, no API surface change. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep), no code execution beyond local repo commands ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-11 00:31:54 -07:00
Devin Foley	534aee66ae	Add cursor_cloud adapter for Cursor SDK + Cloud Agents API v1 (#5664 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - There are many adapter types, one per agent-runtime product (Claude, Codex, OpenCode, Cursor local CLI, etc.) > - Cursor shipped a public TypeScript SDK on 2026-04-29 that exposes Cursor's full hosted-agent platform (cloud VMs, harness, MCP, skills, hooks) > - Paperclip had no first-class adapter for this — agents that wanted to use Cursor's managed cloud runtime had to fall back to the local CLI adapter, which loses the cloud session, streaming, and durable run model > - This PR adds a new `cursor_cloud` adapter built directly on `@cursor/sdk`, with Paperclip's heartbeat mapped to Cursor's durable-agent + per-run model > - The benefit is that any Paperclip agent can now drive a Cursor cloud agent across heartbeats with native session reuse, streaming, and cancellation, while Paperclip remains the source of truth for issue/task state ## What Changed - New built-in adapter package `packages/adapters/cursor-cloud` (15 files, ~1.7k LOC) backed by `@cursor/sdk` ^1.0.12 - `src/server/execute.ts` — SDK-first lifecycle: `Agent.create` / `Agent.resume` / `Agent.getRun` / `agent.send` / `run.stream` / `run.wait`, with session reuse keyed on the (runtime env type, env name, repo set) tuple - `src/server/session.ts` — codec for `cursorAgentId` + `latestRunId` + repo metadata, persisted in `runtime.sessionParams` - `src/server/test.ts` — environment probe via `Cursor.me()` and optional model validation via `Cursor.models.list()` - `src/ui/parse-stdout.ts` + `src/cli/format-event.ts` — normalize Cursor SDK message types (`status`, `thinking`, `assistant`, `user`, `tool_call`, `tool_result`, `result`) into Paperclip transcript events for the UI and CLI - Registrations: `packages/shared/src/constants.ts`, `packages/adapter-utils/src/session-compaction.ts`, `server/src/adapters/{registry,builtin-adapter-types}.ts`, `ui/src/adapters/{registry,adapter-display-registry}.ts` + `ui/src/adapters/cursor-cloud/index.ts`, `cli/src/adapters/registry.ts`, plus workspace deps in `cli`/`server`/`ui` `package.json` - `ui/src/components/AgentConfigForm.tsx` — hide local-Cursor `mode`/thinking-effort field for `cursor_cloud` (different config surface) - 11 vitest tests covering execute paths (fresh create, matching-resume, active-run reattach, non-finished result), session codec round-trip, transcript parsing, and config building ## Verification Reviewer steps: ```bash pnpm install pnpm --filter @paperclipai/adapter-cursor-cloud typecheck # → clean pnpm vitest run packages/adapters/cursor-cloud # → 11/11 passing ``` End-to-end check against a real Cursor cloud agent (requires `CURSOR_API_KEY` and Cursor GitHub-app install on the target repo): 1. Create a `cursor_cloud` agent in Paperclip with `repoUrl` set to the test repo, `repoStartingRef: main`, and `env.CURSOR_API_KEY` set 2. Trigger a heartbeat → adapter calls `Agent.create({ cloud: { env: { type: "cloud" }, repos: [...] } })`, streams events, terminates on `finished` 3. Trigger a second heartbeat → adapter calls `Agent.resume` or `agent.send` follow-up depending on prior-run state, reusing `cursorAgentId` 4. The Paperclip UI/CLI transcript reflects Cursor `status` / `thinking` / `assistant` events as they stream 5. Cancellation from Paperclip maps to `run.cancel()` or Cloud API v1 `cancelRun` for cross-heartbeat cancellation A direct-SDK smoke run against a real repo (devinfoley/my_test_project @ main) confirmed: `Cursor.me()` ok → `Agent.create` → `agent.send` → `run.stream()` (30 events) → terminal status `finished` in ~11s. ## Risks - New adapter, additive only. No existing adapter or registry is replaced; current `cursor` local-CLI adapter is untouched. Default behavior of any existing agent is unchanged. - External dependency on `@cursor/sdk`. Cursor's SDK is v1.0.x and may evolve. Mocked unit tests cover the public surface used here; if the SDK breaks compatibility we update the adapter independently. - Cost/budget. `cursor_cloud` runs on Cursor's billed cloud VMs; operators must understand they are spending money outside Paperclip's budget controls when they enable this adapter. Same shape as other API-billed adapters. - No webhook support in V1. The SDK already provides stream/wait/cancel/reattach, so V1 does not require a public callback URL. If a future use case needs out-of-band wakes, we add a Cloud API v1 webhook bridge as a separate change. This is called out in the issue plan document. - Lockfile. Per repo policy, `pnpm-lock.yaml` is intentionally not in this PR — CI's lockfile workflow will update it on merge given the manifest changes. ## Model Used - Provider: Anthropic Claude (via Claude Code / Paperclip `claude_local` adapter) - Model: `claude-opus-4-7` (Claude Opus 4.7), knowledge cutoff January 2026 - Mode: standard tool-use with extended reasoning - Context: ~200k token window - Capabilities used: code generation, multi-file edits, shell/test execution, GitHub PR workflow ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass (11/11 in `packages/adapters/cursor-cloud`) - [x] I have added or updated tests where applicable (4 new test files, 11 cases) - [ ] If this change affects the UI, I have included before/after screenshots (the only UI change is hiding the local-Cursor mode field on the `cursor_cloud` adapter — happy to attach a screenshot if the reviewer wants one) - [x] I have updated relevant documentation to reflect my changes (issue plan document supersedes the pre-SDK design; tracked in PAPA-203) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-10 17:21:04 -07:00
Dotta	0096b56a1c	[codex] Add LLM Wiki plugin host support (#5597 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The plugin system needs host contracts and runtime support before large plugins can integrate cleanly. > - The source branch mixed the LLM Wiki package with supporting host/runtime work, managed plugin skills, root-level storage spaces, and a bookmarks reference plugin. > - [PAP-9173](/PAP/issues/PAP-9173) asked for the current branch to be split by file boundary: plugin package separately from everything else. > - [PAP-9188](/PAP/issues/PAP-9188) clarified that LLM Wiki may have plugin-local spaces, but Paperclip core should not reorganize top-level local storage into spaces. > - Follow-up review clarified that the bookmarks example should not ship in this PR either. > - This pull request contains the non-`packages/plugins/plugin-llm-wiki/` host/runtime work, keeps runtime state under the selected Paperclip instance root, and no longer includes the bookmarks example. ## What Changed - Added/updated plugin host contracts, SDK types, worker RPC plumbing, managed plugin skill support, and related server tests. - Removed the bookmarks example plugin package and its bundled-example/workspace references. - Removed the root-level local spaces CLI/migration surface and restored instance-root runtime defaults for config, db, logs, storage, secrets, workspaces, projects, and adapter homes. - Replaced shared root `space-paths` helpers with `home-paths` helpers for core runtime storage. - Tightened stranded recovery unique-conflict detection so concurrent recovery scans reuse the raced recovery issue when Postgres errors are wrapped. - Kept `packages/plugins/plugin-llm-wiki/` out of this PR diff; plugin-local spaces remain in the stacked plugin-only PR. ## Verification - `pnpm exec vitest run cli/src/__tests__/data-dir.test.ts cli/src/__tests__/home-paths.test.ts cli/src/__tests__/onboard.test.ts packages/shared/src/home-paths.test.ts packages/db/src/runtime-config.test.ts server/src/__tests__/agent-instructions-service.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/codex-local-execute.test.ts` - `pnpm exec vitest run packages/db/src/runtime-config.test.ts` - `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts` - `pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "reuses the raced stranded recovery issue"` skipped locally because embedded Postgres did not initialize on this macOS temp host; the code path was typechecked and is covered by Linux CI. - Boundary check: no core references remain for `PAPERCLIP_SPACE_ID`, `spaces migrate-default`, `@paperclipai/shared/space-paths`, `registerSpacesCommands`, or the removed bookmarks example. - Previous PR head `4f23e034` had green GitHub checks: `verify`, all four serialized server shards, `e2e`, `Canary Dry Run`, `policy`, Snyk, and `Greptile Review`. Current head `582f466d` is re-running checks after the bookmarks deletion. ## Risks - Plugin host changes touch shared runtime paths, so regressions would most likely appear in adapter startup, plugin loading, or local dev path defaults. - Removing the bookmarks example also removes one demonstration of plugin database namespaces plus local-folder persistence; remaining plugin examples still cover bundled example discovery and plugin host flows. - The plugin package itself is intentionally deferred to the stacked plugin-only PR, where LLM Wiki plugin-local spaces live. - Existing installs that tested the transient root-level spaces CLI should stop using it; this PR intentionally removes that unsupported migration surface before merge. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution enabled; context window not exposed. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass, except where noted above for host-specific embedded Postgres initialization - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Stacked follow-up: PR #5592 contains only `packages/plugins/plugin-llm-wiki/` and targets this branch. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-10 07:34:12 -05:00
Devin Foley	2f72cb29ea	chore: update drizzle-orm to 0.45.2 (#5589 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The server, DB package, and CLI all rely on the shared Drizzle ORM dependency for core persistence flows. > - A published install was still resolving nested `drizzle-orm@0.38.4`, which left the production package graph behind the intended security update. > - The repo’s documented dependency policy says GitHub Actions owns `pnpm-lock.yaml`, so the correct maintainer workflow is to update dependency manifests in the feature PR and let the lockfile refresh happen separately after merge. > - This pull request therefore keeps the Drizzle upgrade to the package manifests only and leaves lockfile regeneration to the existing `Refresh Lockfile` automation. ## What Changed - Updated `drizzle-orm` dependency declarations in `cli/package.json`, `packages/db/package.json`, and `server/package.json` from `0.38.4` / `^0.38.4` to `0.45.2` / `^0.45.2`. - Re-verified the packed `@paperclipai/db` and `@paperclipai/server` publish payloads to confirm their generated `package.json` files advertise `drizzle-orm ^0.45.2`. - Removed the temporary lockfile/CI follow-up commits so the branch now matches the intended manifest-only protocol. ## Verification - `pnpm list drizzle-orm -r --depth 0` - `pnpm exec vitest run packages/db/src/client.test.ts server/src/__tests__/issues-service.test.ts` - `pnpm run test:release-registry` - Packed `@paperclipai/db` and `@paperclipai/server` locally and inspected the tarball `package.json` files to confirm they advertise `drizzle-orm ^0.45.2`. ## Risks - Low to moderate risk: the runtime code paths are unchanged, but downstream lockfile refresh now depends on the existing post-merge GitHub automation working as documented. - A separate packaging/versioning issue around unpublished `@paperclipai/plugin-sdk@1.0.0` showed up during a raw local tarball install experiment; that is called out for reviewers but is not part of this Drizzle bump. ## Model Used - OpenAI Codex via the `codex_local` adapter, using a GPT-5-based coding agent with terminal tool use and code execution. The adapter does not expose a public exact model ID or context-window value in this environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-09 21:31:57 -07:00
Dotta	778e775c35	Add secrets provider vaults and remote import (#5429 ) ## Thinking Path > - Paperclip orchestrates AI-agent companies and needs secrets handling to work across local development, hosted operators, and governed agent execution. > - The affected subsystem is the company-scoped secrets control plane: database schema, server services/routes, CLI workflows, and the Secrets settings UI. > - The gap was that secrets were local-only and operators could not manage provider vaults or import existing remote references without exposing plaintext. > - This branch adds provider vault configuration plus an AWS Secrets Manager remote-import path while preserving company boundaries, binding context, and audit trails. > - I kept the PR to a single branch PR, removed unrelated lockfile/package drift, rebased the full branch onto the current `public-gh/master`, and addressed fresh Greptile findings. > - The benefit is a reviewable implementation of provider-backed secrets with focused tests covering provider selection, import conflicts, deleted secret reuse, rotation guards, and AWS signing behavior. ## What Changed - Added provider vault support for company secrets, including provider config storage, default vault handling, health checks, binding usage, access events, and remote import preview/commit. - Added an AWS Secrets Manager provider using SigV4 request signing, bounded request timeouts, namespace guardrails, cached runtime credential resolution, and external-reference linking without plaintext reads. - Added Secrets UI surfaces for vault management and remote import, plus CLI/API documentation for setup and operations. - Stabilized routine webhook secret binding paths and SSH environment-driver fixture bindings discovered during verification. - Addressed Greptile and CI findings: no lockfile/package drift, monotonic migration metadata, disabled-vault default races, soft-deleted secret hiding/recreate behavior, remove behavior with disabled vaults, soft-deleted external-reference re-import, non-active rotation guards, managed-secret soft deletion through PATCH, and per-call AWS SDK credential client churn. - Rebased this branch onto `public-gh/master` at `0e1a5828` and force-pushed with lease to keep this as the single PR for the branch. ## Verification - `git fetch public-gh master` - `git rebase public-gh/master` - `git diff --name-only public-gh/master...HEAD \| grep '^pnpm-lock\.yaml$' \|\| true` confirmed `pnpm-lock.yaml` is not in the PR diff. - Confirmed migration ordering: master ends at `0081_optimal_dormammu`; this PR adds `0082_dry_vision` and `0083_company_secret_provider_configs`. - Inspected migrations for repeat safety: new tables/indexes use `IF NOT EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column additions use `ADD COLUMN IF NOT EXISTS`. - `pnpm -r typecheck` passed before the Greptile follow-up commits. - `pnpm test:run` ran the full stable Vitest path before the Greptile follow-up commits; it completed with 3 timing-related failures under parallel load: `codex-local-execute.test.ts`, `cursor-local-execute.test.ts`, and `environment-service.test.ts`. - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/codex-local-execute.test.ts src/__tests__/cursor-local-execute.test.ts src/__tests__/environment-service.test.ts` passed on targeted rerun (`24/24`). - `pnpm build` passed before the Greptile follow-up commits. Vite reported existing chunk-size/dynamic-import warnings. - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts src/__tests__/secrets-service.test.ts` passed (`39/39`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server typecheck` passed. - Captured Storybook screenshots from `ui/storybook-static` for visual review. - Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites 1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review` pass; aggregate `verify` is still registering the completed child checks. - Greptile review loop continued through the latest requested pass; all Greptile review threads are resolved and the latest `Greptile Review` check on `5ca3a5cf` passed with 0 comments added. ## Screenshots Before: the provider-vault and remote-import surfaces did not exist on `master`; these are after-state screenshots from the Storybook fixtures. ![Secrets inventory](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secrets-inventory.png) ![Secret binding picker](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/secret-binding-picker.png) ![Environment editor with secrets](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2339-secrets-make-a-plan/doc/pr/5429/env-editor-with-secrets.png) ## Risks - Migration risk: this adds new secret provider tables and extends existing secret rows. The migrations were checked for monotonic ordering and idempotent guards, but reviewers should still inspect upgrade behavior carefully. - Provider risk: AWS support uses direct SigV4 requests. Automated tests cover signing, request timeouts, vault-config selection, namespace guardrails, pending-version archival, sanitized provider errors, and service-level cleanup paths. A real-vault AWS smoke test remains deployment validation for an operator with AWS credentials rather than an unverified merge blocker in this local branch. - UI risk: the Secrets page and import dialog are large new surfaces; screenshots are included above for reviewer inspection. - Verification risk: the full local stable test command hit parallel-load timing failures, although the exact failed files passed when rerun directly. - Operational risk: remote import intentionally avoids plaintext reads; operators must understand that imported external references resolve at runtime and may fail if AWS permissions change. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent with local shell/tool use in the Paperclip worktree. Exact context-window size was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-09 18:22:17 -05:00
Devin Foley	06e6ee25cd	Add Daytona sandbox provider plugin (#5580 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents need isolated sandbox environments to execute work safely; Paperclip already supports E2B as a sandbox provider plugin > - Users want to use Daytona (https://www.daytona.io/) as an alternative sandbox backend, but no plugin existed for it > - Without a Daytona plugin, teams that prefer Daytona's pricing/regions/runtime can't run Paperclip agents on it > - This pull request adds a `@paperclip/sandbox-provider-daytona` plugin that mirrors the existing E2B plugin shape and wires up Daytona's `@daytonaio/sdk` for sandbox lifecycle, command execution, and shell detection > - The benefit is that operators can pick Daytona as a first-class sandbox provider without touching core code, broadening Paperclip's runtime options ## What Changed - New plugin package `packages/plugins/sandbox-providers/daytona` with manifest, worker entry, and provider implementation backed by `@daytonaio/sdk` - Implements sandbox create/destroy/exec/upload/download lifecycle, shell command detection, and config/env wiring consistent with the E2B plugin - Adds unit tests under `src/plugin.test.ts` and a README documenting setup and the `DAYTONA_API_KEY` requirement - Minor adjustments in `scripts/paperclip-issue-update.sh`, `packages/shared/src/issue-thread-interactions.test.ts`, and `packages/shared/src/validators/issue.ts` to support the integration ## Verification - Re-ran the full sandbox provider matrix on the QA Paperclip instance using Daytona as the runtime — all 6 adapters executed inside the Daytona sandbox with zero `environmentExecute` timeouts - 5/6 adapters pass cleanly (or with informational warns); the only failure is `codex_local`, which is an OpenAI quota/billing issue unrelated to Daytona - `pnpm --filter @paperclip/sandbox-provider-daytona test` runs the plugin unit tests ## Risks - New optional plugin; no behavior change for users who don't enable it - Requires `DAYTONA_API_KEY` for runtime use — documented in the plugin README - Daytona SDK is a new external dependency; tracked in the plugin's own package.json so it doesn't affect the core install footprint ## Model Used - Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use enabled ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots (N/A — backend plugin) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-09 11:50:12 -07:00
Devin Foley	0e1a582831	Revert "Add experimental newest-first issue thread" (#5460 ) This is actually bad. Glad it was under experiments.	2026-05-07 16:50:31 -07:00
Devin Foley	a904effb96	Add experimental newest-first issue thread (#5455 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so issue threads are a core operator surface for reviewing work. > - The issue detail page is the place where humans read agent messages, user comments, and execution context together. > - That thread originally rendered oldest-first, which made recent activity harder to see during active review. > - Reversing the thread order changes navigation expectations, timestamp placement, and the "Jump to latest" affordance, so the UI behavior needed to move as a coherent set. > - Because this is a visible core-product behavior shift, it also needed a safe rollout path instead of becoming the default immediately. > - This pull request adds the newest-first issue thread behavior behind an Experimental setting, updates the thread UI to match that mode, and keeps the legacy oldest-first experience unchanged by default. > - The benefit is that reviewers can opt into a more recent-first issue workflow without forcing a global behavior change on every Paperclip instance. ## What Changed - Reversed issue thread rendering so the newest comments and messages appear first when the experiment is enabled. - Moved the plain comment timestamp into the card header in newest-first mode and kept the legacy timestamp placement for oldest-first mode. - Moved the `Jump to latest` control to the bottom of the thread in newest-first mode while leaving the existing top placement for the legacy mode. - Added the `Enable Newest-First Issue Thread` experimental instance setting and wired issue detail to read that toggle. - Added regression coverage for thread order, timestamp placement, jump-button placement, and the issue-detail experiment toggle behavior. ## Verification - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - Focused checks that also passed during issue review: - `pnpm vitest run src/components/IssueChatThread.test.tsx src/pages/IssueDetail.test.tsx` in `ui/` - `pnpm vitest run src/__tests__/instance-settings-routes.test.ts` in `server/` - Manual review path: - Enable `Instance Settings > Experimental > Enable Newest-First Issue Thread` - Open an issue with comments/messages and confirm newest activity renders first, timestamps move into the header, and `Jump to latest` sits below the thread - Disable the experiment and confirm the legacy oldest-first behavior returns ## Risks - Low risk: the behavioral change is gated behind an instance-level experimental toggle and defaults off. - The main regression risk is thread navigation drift between the two modes, especially around anchor scrolling and the `Jump to latest` affordance. - There is some UI coupling between issue-detail query state and experimental settings fetches, so future changes in that area should keep both modes covered. - Screenshots are not attached in this PR body; verification is described with automated coverage and manual steps instead. > I checked [`ROADMAP.md`](ROADMAP.md). This is a scoped issue-thread UX improvement and rollout gate, not a duplicate of a roadmap-level planned core feature. ## Model Used - OpenAI Codex via the local `codex_local` Paperclip adapter, GPT-5-based coding agent with terminal tool use and local code execution in this repository worktree. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-07 16:45:12 -07:00
Devin Foley	4269545b19	Stabilize Cursor sandbox runtime resolution (#5446 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Cursor adapter spawns the Cursor CLI against local, SSH, and sandbox execution targets; on a fresh sandbox lease, it has to resolve where Cursor was installed > - The previous resolver only looked for `~/.local/bin/cursor-agent` even though the official installer (and the adapter's own `SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as `~/.local/bin/agent`, so a sandbox where the install ran successfully would still fail to find the CLI > - This pull request lets the resolver accept either basename and lets the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't pay the cost of a remote `printf $HOME` round-trip when the home directory is already known > - The benefit is sandboxed Cursor runs find the binary that the install actually produced, and runtime probes are cheaper when the home dir is already resolved ## What Changed - `packages/adapters/cursor-local/src/server/remote-command.ts`: accept either `agent` or `cursor-agent` as the preferred basename; new optional `remoteSystemHomeDirHint` short-circuits the home-dir probe - `packages/adapters/cursor-local/src/server/execute.ts`: thread the home-dir hint through, prefer the resolved binary path, and shift the effective execution cwd to the per-run managed subdirectory once the runtime is prepared - New `remote-command.test.ts` and `execute.test.ts` cover both basenames, the hint short-circuit, and the cwd shift - `packages/adapters/cursor-local/src/index.ts`: update doc string to reflect the broader resolution - `execute.remote.test.ts` updated to expect the managed-subdirectory cwd shape introduced by the cwd shift ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-cursor-local` — 6/6 passing - `pnpm typecheck` clean - Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor (binary lands as `~/.local/bin/agent`) now runs cleanly through the adapter ## Risks Low. Resolver is strictly broader (matches a superset of paths); existing setups with `~/.local/bin/cursor-agent` continue to work. The home-dir hint is opt-in; callers that don't pass it get the existing probe behavior. Cursor's effective execution cwd now matches the rest of the adapters (per-run managed subdirectory) — sessions previously rooted at the workspace root will land in the new subdirectory. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover both basenames + hint short-circuit + cwd shift - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --- > Stacked PR. Sits on top of #5445 (which sits on #5444). Cumulative diff against `master` includes both of those PRs' content; the files touched by this PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once the prerequisite PRs merge.	2026-05-07 15:00:28 -07:00
Devin Foley	fe3904f434	Stabilize runtime probes and Codex env tests (#5445 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Adapters expose a Test action that probes the configured runtime — install, resolvability, hello — to give operators a fast yes/no on whether an environment is healthy > - The Codex test path was running its hello probe directly without going through the managed-runtime preparation that production runs use, so a healthy production setup could still report a probe failure > - The plugin worker manager wasn't surfacing terminated workers cleanly, leaving the runtime probe waiting on a dead worker until the request timed out > - This pull request routes the Codex test probe through `prepareAdapterExecutionTargetRuntime` (so it sees the same managed Codex home production sees), exposes `commandCwd` on `createCommandManagedRuntimeClient` so callers can target a per-probe directory without leaking the workspace `remoteCwd`, and propagates plugin-worker termination as a usable error instead of a hang > - The benefit is the Codex Test action mirrors production behavior end-to-end, and probes against a terminated plugin worker fail fast instead of timing out ## What Changed - `packages/adapter-utils/src/command-managed-runtime.ts`: rename the `remoteCwd` knob to `commandCwd` so callers can target a per-probe directory without inheriting the workspace cwd; matching test coverage in `command-managed-runtime.test.ts` - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: small fixes to keep callback bridge stop semantics deterministic - `packages/adapters/codex-local/src/server/test.ts`: thread the Codex hello probe through `prepareAdapterExecutionTargetRuntime` + `prepareManagedCodexHome` so the probe sees the same managed home production sees; new `test.remote.test.ts` covers the remote probe path - `packages/adapters/cursor-local/src/server/execute.ts`: small probe-side cleanup that aligns with the new commandCwd contract - `server/src/services/plugin-worker-manager.ts`: surface plugin-worker termination as a structured error so callers fail fast; new `plugin-worker-terminated.cjs` fixture and `plugin-worker-manager.test.ts` cases pin the behavior ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils --project @paperclipai/adapter-codex-local --project @paperclipai/adapter-cursor-local --project @paperclipai/server` — 1749/1750 passing (1 unrelated skip) - `pnpm typecheck` clean ## Risks Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming on an internal helper used only by adapter test/execute paths in this repo. The plugin-worker-terminated path was previously a hang; failing fast may surface latent timeouts as explicit termination errors in callers that already expected them. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover commandCwd, plugin-worker termination, and Codex remote test path - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --- > Stacked PR. Sits on top of #5444 which adds the per-run runtime API surface this PR builds on. Cumulative diff against `master` includes that PR's content; the files touched by this PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once #5444 merges.	2026-05-07 14:52:31 -07:00
Devin Foley	12cb7b40fd	Harden remote workspace sync and restore flows (#5444 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When an agent runs against a remote target, Paperclip syncs the workspace out to the remote at run start and restores changes back to the local workspace at run end > - The previous restore flow naïvely overwrote local files with whatever the remote returned, so files that the remote run never touched but had timestamp/mode drift could be needlessly rewritten — and a single static `refs/paperclip/ssh-sync/imported` ref made concurrent SSH workspace exports race on the same git ref > - This pull request adds a `workspace-restore-merge` module that diffs a pre-run snapshot against the post-run remote state and only writes back files the remote actually changed; SSH workspace exports now use a per-import unique ref so concurrent runs can't trample each other > - Every adapter's execute path threads the snapshot through `prepareAdapterExecutionTargetRuntime` so the merge has the baseline it needs > - The benefit is workspace restores no longer churn untouched files, and concurrent SSH runs no longer collide on the import ref ## What Changed - `packages/adapter-utils/src/workspace-restore-merge.{ts,test.ts}`: new module — directory snapshot (kind/mode/sha256/symlink target) plus snapshot-aware merge that writes only the files the remote changed - `packages/adapter-utils/src/ssh.ts`: SSH workspace export uses a per-import unique ref (`refs/paperclip/ssh-sync/imported/<uuid>`); restore goes through the new merge helper; `ssh-fixture.test.ts` covers the unique-ref + merge paths - `packages/adapter-utils/src/sandbox-managed-runtime.ts` + `remote-managed-runtime.ts`: thread the snapshot/merge through the sandbox and SSH paths - `packages/adapter-utils/src/server-utils.{ts,test.ts}` + `execution-target.ts`: helpers for capturing the pre-run snapshot; `prepareAdapterExecutionTargetRuntime` gains required `runId` and optional `workspaceRemoteDir`, and returns the realized `workspaceRemoteDir` - Each adapter's `execute.ts` (acpx, claude, codex, cursor, gemini, opencode, pi) takes the snapshot at run start and passes it through to the runtime restore - Remote execute test mocks updated to match the new `prepareWorkspaceForSshExecution` return shape and the per-run `${managedRemoteWorkspace}` cwd subdirectory ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils --project @paperclipai/adapter-acpx-local --project @paperclipai/adapter-claude-local --project @paperclipai/adapter-codex-local --project @paperclipai/adapter-cursor-local --project @paperclipai/adapter-gemini-local --project @paperclipai/adapter-opencode-local --project @paperclipai/adapter-pi-local` — 196/196 passing - `pnpm typecheck` clean across the workspace ## Risks Medium. The restore path now writes a strict subset of what it previously did — files the remote did not touch are no longer rewritten. If any flow was relying on a touch-without-content-change being copied back (timestamp or permission propagation only), that behavior is now skipped. Snapshot capture adds an O(N-files-in-workspace) hash pass at run start; the cost is bounded by the existing exclude list. The `runId` parameter on `prepareAdapterExecutionTargetRuntime` is now required — every in-tree caller is updated; out-of-tree adapter authors need to pass it. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new module + every adapter execute path covered - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-05-07 14:44:45 -07:00
Dotta	e400315cbf	Guard assigned backlog liveness (#5428 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The issue graph and liveness recovery system decide whether assigned work is executable or parked > - Assigned issues created without an explicit status could silently land in backlog, making parents look blocked with no productive wake path > - The server, shared validators, recovery analysis, and UI all need to agree on that execution semantic > - This pull request makes assigned issue creation default to `todo`, flags assigned backlog blockers, and surfaces the state in the board > - The benefit is that parked assigned work becomes intentional and visible instead of creating silent liveness stalls ## What Changed - Adds contract tests for assigned issue creation defaults. - Defaults assigned issue creation to `todo` when status is omitted while preserving explicit `backlog` parking. - Exposes `resolveCreateIssueStatusDefault` through shared validators. - Teaches liveness/blocker attention paths to distinguish assigned backlog blockers. - Adds UI notices, row/header badges, and issue detail safeguards for assigned backlog blockers. - Adds Storybook fixtures and execution-semantics documentation for the assigned-backlog behavior. ## Verification - `pnpm run preflight:workspace-links && pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/issue-assigned-backlog-contract-routes.test.ts server/src/__tests__/issue-blocker-attention.test.ts server/src/__tests__/issue-liveness.test.ts server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts ui/src/components/IssueAssignedBacklogNotice.test.tsx ui/src/components/IssueRow.test.tsx` — 50 passed, 23 skipped. - Skipped tests were embedded Postgres suites on this host with the repo skip message: `Postgres init script exited with code null. Please check the logs for extra info. The data directory might already exist.` - Pairwise merge check against the issue-controls PR branch completed without conflicts via `git merge --no-commit --no-ff` in a temporary worktree. - Screenshots for assigned-backlog UI states: [light](docs/pr-screenshots/pr-5428/assigned-backlog-light.png), [dark](docs/pr-screenshots/pr-5428/assigned-backlog-dark.png). - Follow-up checks: `pnpm --filter /ui typecheck`; `pnpm --filter /mcp-server build`; `pnpm --filter /mcp-server test`; `pnpm exec vitest run packages/shared/src/validators/issue.test.ts`; focused UI component tests. - Remote PR checks on head `6300b3c`: policy, verify, serialized server shards 1/4-4/4, Canary Dry Run, e2e, Greptile Review, and Snyk all passed. ## Risks - Medium: changes status defaulting for assigned issue creation when the caller omits status. Explicit `backlog` remains supported, and server/shared tests cover both paths. - Medium: liveness classification changes can affect blocker attention labels; focused service and UI tests cover the new assigned-backlog state. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled Paperclip heartbeat environment. Context window and internal reasoning mode are not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-07 12:25:26 -05:00

1 2 3 4 5 ...

623 Commits