forked from farhoodlabs/paperclip
bfe6369ef58cce7bbfa2dcbd09414d4809fe7bf7
606 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
4b1e92a588 |
feat(plugins): add Modal sandbox provider plugin (#6245)
## Thinking Path > - Paperclip orchestrates AI agents through company-scoped control-plane workflows and extensible runtime integrations. > - Sandbox providers are part of that extension surface because they let agents execute isolated work without baking each provider into the core server. > - Modal already offers managed sandboxes with filesystem, process, timeout, and networking controls that map onto Paperclip's sandbox provider contract. > - The repo did not have a Modal provider plugin, so teams wanting Modal-backed sandboxes had no first-party integration path. > - This pull request adds a standalone `packages/plugins/sandbox-providers/modal` plugin that implements the provider contract, worker entrypoint, docs, and tests. > - The benefit is that Modal can now be installed as a provider plugin without expanding the core control-plane surface area. ## What Changed - Added a new `packages/plugins/sandbox-providers/modal` package with the plugin manifest, worker entrypoint, and exported plugin surface. - Implemented Modal-backed sandbox lifecycle support for creation, command execution, file operations, networking options, termination, and metadata translation. - Added focused Vitest coverage for config validation, env handling, lifecycle flows, networking behavior, and error mapping. - Documented installation, configuration, and usage requirements in the plugin README. - Removed misleading `MODAL_TOKEN_*` fallback behavior so authentication relies on supported Modal credentials only. ## Verification - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - `cd packages/plugins/sandbox-providers/modal && pnpm test` ## Risks - Low to medium risk: this is isolated to a new plugin package, but runtime behavior still depends on live Modal account credentials and service-side sandbox semantics. - Modal's current docs target a newer Node baseline than the repo default, so the first live install should confirm credential loading and sandbox startup behavior in a real Modal workspace. - No UI or schema changes are included in this PR. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent (GPT-5-class Codex coding model; exact backend model ID is not exposed by the runtime), with tool use, shell execution, and code-editing capabilities enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
5071c4c776 |
[codex] Add workspace diff viewer plugin (#6071)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators need to inspect what agents changed inside execution and project workspaces. > - The existing workspace detail views did not provide a first-party rich diff surface for staged, unstaged, head, renamed, binary, oversized, and untracked changes. > - The plugin system is the intended extension point for optional rich UI surfaces. > - This pull request adds a workspace diff plugin plus host services and shared contracts so Changes tabs can render workspace diffs through plugin slots. > - The diff-renderer dependency should stay owned by the plugin package rather than the core UI app. > - The dependency surface must stay aligned with repository PR policy, including intentionally omitting `pnpm-lock.yaml` from the PR. > - The benefit is a more reviewable workspace surface without hard-coding the renderer into every page. ## What Changed - Added `@paperclipai/plugin-workspace-diff`, including diff normalization, plugin manifest/worker/UI entrypoints, and focused plugin tests. - Kept `@pierre/diffs` scoped to `@paperclipai/plugin-workspace-diff`; removed the core UI lab diff-renderer surface and direct UI package dependency. - Added shared workspace diff types and validators, plus plugin SDK surface for workspace diff host services. - Added server workspace diff service support and route coverage for execution/project workspace diff flows. - Wired Execution Workspace and Project Workspace Changes tabs to load the diff plugin, including loading/error fallback behavior. - Added UI tests and fixtures for the Changes tabs and plugin bridge behavior. - Added the new plugin package manifest to the Docker deps stage so PR policy can validate dependency coverage. - Addressed review hardening around empty untracked patches, workspace path exposure, project workspace read capability checks, and default base refs. ## Verification - `pnpm --filter @paperclipai/plugin-workspace-diff test` - `pnpm exec vitest run packages/shared/src/validators/workspace-diff.test.ts server/src/__tests__/workspace-diff-service.test.ts ui/src/pages/ProjectWorkspaceDetail.test.tsx ui/src/pages/ExecutionWorkspaceDetail.test.tsx` - `pnpm exec vitest run ui/src/plugins/bridge.test.ts server/src/__tests__/workspace-runtime-routes-authz.test.ts` - `pnpm --filter @paperclipai/shared typecheck` - `pnpm --filter @paperclipai/plugin-workspace-diff typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` - `node ./scripts/check-docker-deps-stage.mjs` - Browser screenshot captured from the local worktree dev server: https://files.catbox.moe/ofdpsp.png - Confirmed branch is rebased onto `public-gh/master`, `.github/workflows/pr.yml` is not included in the PR diff, `ui/package.json` is not included in the PR diff, and `pnpm-lock.yaml` is not included in the PR diff. ## Risks - Medium UI integration risk: the Changes tab depends on the plugin slot and host diff service path. - Medium dependency risk: this adds `@pierre/diffs` in the plugin package, but `pnpm-lock.yaml` is intentionally omitted per packaging instructions because repository automation manages lockfile updates. - Current CI blocker: downstream frozen installs fail until the repository policy path for new plugin package dependencies is chosen. - Diff rendering edge cases are covered for common working-tree and head diff states, but very large repositories may still expose performance limits. - No migrations are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 class coding model, tool-enabled local execution environment. Exact context window was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
d734bd43d1 |
[codex] Roll up May 17 branch changes (#6210)
## Thinking Path > - Paperclip is the control plane for autonomous AI companies, so agent work needs visible ownership, recovery, and operator controls. > - This local branch had accumulated several related control-plane reliability and operator-experience fixes across recovery actions, watchdog folding, model-profile defaults, mentions, markdown editing, plugin launchers, and small UI polish. > - The branch needed to be converted into a PR against the current `origin/master` without losing dirty work or including lockfile/workflow churn. > - The safest standalone shape is a single rollup PR because the recovery/server/UI files overlap heavily across the local commits and splitting would create avoidable conflicts. > - This pull request replays the local branch onto latest `origin/master`, preserves the uncommitted work as logical commits, and adds a Zod 4 validator compatibility fix found during verification. > - The benefit is that the May 17 local branch can be reviewed and merged as one coherent, conflict-free branch under the 100-file Greptile limit. ## What Changed - Rebased the local May 17 branch work onto current `origin/master` in a dedicated worktree. - Preserved and committed previously dirty changes for recovery retry handling, plugin/sidebar launcher polish, and `.herenow` ignores. - Added recovery-action behavior for returning source issues to `todo` when retrying source-scoped recovery. - Included the existing local recovery/liveness/watchdog fold, Codex cheap-profile, markdown/mention, duplicate-agent, and UI polish commits from the branch. - Normalized shared validator `z.record(...)` schemas to explicit string-key records for Zod 4 compatibility. - Confirmed the PR has no `pnpm-lock.yaml` or `.github/workflows/*` changes and stays below the 100-file Greptile limit. ## Verification - `pnpm install --frozen-lockfile --ignore-scripts` - `npm run install` in `node_modules/.pnpm/sqlite3@5.1.7/node_modules/sqlite3` to build the local native sqlite3 binding after installing with scripts disabled - `pnpm exec vitest run packages/shared/src/validators/issue.test.ts packages/shared/src/project-mentions.test.ts packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/heartbeat-model-profile.test.ts server/src/__tests__/issue-recovery-actions.test.ts server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts server/src/__tests__/heartbeat-active-run-output-watchdog.test.ts server/src/__tests__/plugin-local-folders.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/components/Sidebar.test.tsx ui/src/components/SidebarAccountMenu.test.tsx ui/src/components/IssueProperties.test.tsx ui/src/components/MarkdownEditor.test.tsx ui/src/components/MarkdownBody.test.tsx ui/src/lib/duplicate-agent-payload.test.ts ui/src/pages/Routines.test.tsx` - First pass: 13 files passed with 201 passing tests; 3 server files failed before sqlite3 native binding was built. - After rebuilding sqlite3: `server/src/__tests__/heartbeat-model-profile.test.ts`, `server/src/__tests__/issue-recovery-actions.test.ts`, and `server/src/__tests__/heartbeat-active-run-output-watchdog.test.ts` passed/loaded; embedded Postgres tests were skipped by the local host guard. - `pnpm --filter @paperclipai/shared typecheck` - `pnpm --filter @paperclipai/adapter-utils typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` ## Risks - Medium risk: this is a broad rollup PR across recovery semantics, server tests, shared validators, and UI surfaces. - Some embedded Postgres tests skipped locally due the host guard, so CI should provide the stronger database-backed signal. - UI changes were covered by component tests, but no browser screenshot was captured in this PR creation pass. - This branch may overlap with existing recovery/liveness PR work; merge this PR independently or restack/close overlapping branches rather than merging duplicate implementations together. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled local repository and GitHub workflow, medium reasoning effort. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
705c1b8d81 |
[codex] Add routine env secrets support (#6212)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Scheduled routines are the control-plane path for recurring agent work. > - Routines already had dispatch/history, but their runtime environment did not carry routine-owned secret bindings through execution. > - Operators need routine-specific secrets that can override project/agent env without exposing secret values in history, logs, or access events. > - This pull request adds the routine env runtime contract, wires it into execution, and makes the routine UI/history surfaces show safe secret metadata. > - The benefit is that routine executions can use scoped secret refs predictably while preserving company boundaries and auditability. ## What Changed - Added routine env persistence/runtime support, including `routines.env`, `routine_runs.routine_revision_id`, revision snapshots, and idempotent migration `0086_routine_env_runtime_contract`. - Resolved routine env during heartbeat adapter config assembly with precedence `agent < project < routine` and secret access events recorded against the routine consumer. - Added secret binding synchronization for routine create/update/restore flows and guarded cross-company, missing, disabled, and deleted secret cases. - Added a Secrets tab to routine detail, env/secret history diff rendering, and Storybook coverage for the new UI states. - Added server/UI regression tests, including an embedded-Postgres QA path for routine secret execution and restore behavior. - Updated implementation/database docs for routine env and secret-binding behavior. ## Verification - `pnpm install --frozen-lockfile` after rebasing onto `public-gh/master` to refresh workspace links for the newly-added upstream Grok adapter package. - `pnpm exec vitest run server/src/__tests__/heartbeat-project-env.test.ts server/src/__tests__/routines-service.test.ts server/src/__tests__/secrets-service.test.ts server/src/__tests__/qa-routine-secrets-e2e.test.ts ui/src/components/RoutineHistoryTab.test.tsx` passed: 5 files, 92 tests. - `pnpm -r typecheck` passed across the workspace. - `pnpm build` passed. Vite emitted the existing large-chunk/dynamic-import warnings. - UI screenshots were captured locally during QA in `artifacts/pap-9521/` and `artifacts/pap-9522/`; generated screenshots are not committed to avoid adding binary artifacts to the repo. ## Risks - Migration risk is limited by `IF NOT EXISTS` guards for the new columns, FK, and index, and the migration is ordered as `0086` immediately after upstream `0085`. - Runtime behavior changes env precedence for routine executions by adding routine env as the highest-precedence layer; tests cover agent/project/routine precedence. - Secret handling is security-sensitive; tests cover value-free manifests/events/errors, disabled/missing/deleted secrets, and cross-company rejection. - UI history now renders routine env/secret diffs; tests and Storybook stories cover the main rendering paths. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, with shell/tool use and medium reasoning effort. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
573e9ec909 |
fix(grok-local): restore turn boundaries in streaming reasoning text (#6142)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The `grok-local` adapter streams reasoning text to the issue "Working..." panel as the grok CLI runs > - The `grok` CLI's `--output-format streaming-json` mode silently drops the `\n` separator between reasoning turns around tool calls > - Consecutive `thought` chunks (e.g. `` "`" `` followed by `"The"`) arrive with no intervening whitespace event, so the UI's `delta: true` concatenator merged them into run-on text like `"…planningGreat, now I have the issue descriptionThe only co"` > - This PR adds a small turn-boundary helper that detects sentence boundaries in the upstream `thought` stream and inserts a single `\n` only when the previous chunk ended with sentence punctuation (or a balanced closing backtick) AND the next chunk begins a new uppercase sentence > - The benefit is readable streaming reasoning in the UI without changing how completed messages are stored ## What Changed - Added `packages/adapters/grok-local/src/shared/turn-boundary.ts` with per-stream state (last chunk + backtick parity) and a `restoreTurnBoundary()` helper that inserts `\n` only between balanced, sentence-terminated `thought` chunks - Wired the helper into `parseGrokJsonl` (server) and added a new `createGrokStdoutParser` factory used by `grokLocalUIAdapter` for the live "Working..." panel - Added focused tests in `shared/turn-boundary.test.ts`, plus regression assertions in `server/parse.test.ts` and `ui/parse-stdout.test.ts` ## Verification - `pnpm --filter @paperclip/grok-local test` — 23/23 adapter tests pass - `pnpm --filter @paperclip/grok-local typecheck` and UI typecheck — clean - Replayed an actual broken `grok 0.1.210` stream from the report; previously-merged boundaries (`` `ls`The ``, `returned:Confirmed`) now render with a separating newline; chunks inside un-closed backtick spans are left alone ## Risks - Low risk. Boundary insertion only fires when prev ends with `.`/`!`/`?`/balanced `` ` `` and next begins with an uppercase ≥2-char word, with no whitespace on either side. Worst case: a rare missed split or a misplaced newline inside reasoning — both purely cosmetic and confined to the live streaming panel. ## Model Used - Claude Opus 4.7 (claude-opus-4-7), Anthropic, extended thinking + tool use via Claude Code ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
ab8b471685 |
Add built-in grok_local adapter (#6087)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so adapter quality directly affects what runtimes the control plane can supervise. > - Local CLI adapters are one of the core execution surfaces because they turn real coding tools into Paperclip-managed employees with heartbeats, transcripts, and reviewability. > - Grok Build was installed on the Paperclip host, but Paperclip had no built-in `grok_local` adapter, so the runtime could not be configured through the normal server/UI/CLI adapter path. > - That gap needed to be closed with the same built-in registry, environment diagnostics, transcript parsing, and skill/instructions behavior that the other local adapters already rely on. > - After the initial adapter landed, a real follow-up run showed that Grok streaming text was being rendered one fragment per line, which made transcripts harder to read even though the runtime itself was working. > - This pull request adds the built-in `grok_local` adapter end-to-end and then fixes the transcript parser so streamed Grok output is coalesced into readable assistant/thinking blocks. > - The benefit is that Grok Build becomes a first-class Paperclip runtime with a usable operator experience instead of a partially wired runtime with noisy transcript output. ## What Changed - Added a new built-in `@paperclipai/adapter-grok-local` package with server, UI, and CLI entrypoints. - Implemented Grok execution, session handling, environment diagnostics, config building, skill syncing, and parser coverage inside the new adapter package. - Registered `grok_local` across the built-in adapter inventories and capability/display metadata in server, UI, CLI, and shared constants. - Added adapter route coverage for the new built-in type. - Fixed Grok transcript readability by emitting streamed `text` and `thought` fragments as deltas so the shared transcript builder coalesces them into readable message blocks. - Added regression tests for the Grok parser and transcript coalescing behavior. ## Verification - `pnpm vitest run packages/adapters/grok-local/src/ui/parse-stdout.test.ts ui/src/adapters/transcript.test.ts` - `pnpm --filter @paperclipai/adapter-grok-local build` - Manual runtime verification on the Paperclip host during implementation and follow-up review: - confirmed the Grok CLI was installed and authenticated - confirmed the worktree dev server could be restarted cleanly and health-checked after the parser follow-up - No screenshots attached. This change is primarily adapter plumbing plus transcript formatting behavior; reviewers can verify via the Grok-backed run surfaces directly. ## Risks - This adds a new built-in adapter, so any missed registration surface could create inconsistencies between server, UI, and CLI behavior. - The adapter depends on Grok Build's current event/output shape; if upstream Grok streaming JSON changes, transcript parsing or session extraction may need follow-up updates. - The transcript readability fix intentionally changes how Grok fragments are grouped, so any downstream code that implicitly expected one entry per fragment would behave differently. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime. - GPT-5-class coding model with tool use, shell execution, file editing, and repo inspection enabled. - Exact backend model ID/context window were not surfaced to the agent in this Paperclip session. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
4c47eb46c3 |
[codex] Add multilingual issue preservation coverage (#6069)
## Thinking Path > - Paperclip orchestrates AI agents for autonomous companies. > - Agents and board operators coordinate through company-scoped issues, comments, documents, and heartbeat wake payloads. > - Chinese, Japanese, and Hindi text needs to survive the full issue lifecycle without normalization or prompt serialization damage. > - The riskiest paths are board issue creation, server issue/comment/document round-tripping, and scoped wake prompt rendering. > - This pull request adds focused regression coverage across those surfaces. > - The benefit is higher confidence that multilingual operators and agents can create, search, comment on, complete, and wake on issues using non-Latin text. ## What Changed - Added adapter-utils wake payload and prompt rendering coverage for Chinese, Japanese, and Hindi issue/comment text. - Added UI New Issue dialog coverage proving multilingual title and description text is submitted unchanged. - Added server route coverage that round-trips multilingual issue text through create, search, comments, documents, completion comments, and heartbeat context. - Addressed Greptile feedback by using a typed storage mock and splitting the server route integration path into smaller ordered assertions. ## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts ui/src/components/NewIssueDialog.test.tsx server/src/__tests__/multilingual-issues-routes.test.ts` - Result: 3 test files passed, 51 tests passed. ## Risks - Low risk: this PR adds regression coverage only and does not change runtime behavior. - The new server test uses embedded Postgres support and skips on unsupported hosts using the existing helper pattern. - No migrations are included. - No `pnpm-lock.yaml` changes are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected - check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 based coding agent, with shell, git, Vitest, and GitHub connector/CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
eb38b226c2 |
Fix LLM Wiki package and migration validation (#6010)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Plugins extend the control plane with optional capabilities such as LLM Wiki. > - LLM Wiki needs its package assets and plugin-owned database migrations to work when installed from the packaged plugin. > - The bundled spaces migration used validation-hostile dynamic SQL, and the packaged plugin could omit non-dist runtime assets. > - This pull request makes the LLM Wiki package include its required assets and cuts the spaces migration over to explicit, idempotent SQL that passes the production plugin database validator. > - The benefit is a simpler plugin install path that validates and applies the bundled LLM Wiki migrations without adding plugin-specific legacy handling to Paperclip core. ## What Changed - Added the LLM Wiki package asset allowlist so agents, migrations, skills, templates, dist output, and README are included when packaged. - Renamed the bootstrap `.gitignore` template to `gitignore.template` and updated the runtime lookup so package tooling does not drop the hidden template file. - Relaxed plugin migration validation to allow namespace-scoped `INSERT`/`UPDATE` backfills and `CREATE INDEX` statements while continuing to reject destructive or cross-namespace SQL. - Replaced the LLM Wiki spaces migration's dynamic constraint-drop DO block with explicit `DROP CONSTRAINT IF EXISTS` statements. - Replaced fragile regex-source dispatch in SQL reference extraction with explicit capture-group descriptors. - Added regression coverage that applies the bundled LLM Wiki migrations through the production validator and checks the expected constraints. ## Verification - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-database.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm --filter @paperclipai/plugin-llm-wiki build` - `git diff --check` - Confirmed `pnpm-lock.yaml` is not included in the branch diff. ## Risks - Low migration risk for current users: LLM Wiki spaces are new, so this intentionally cuts over the plugin migration instead of adding legacy handling in core. - Validator behavior is broader than before, but still requires fully qualified plugin namespace targets, blocks deletes/destructive DDL, and keeps public table access read-only and allowlisted. > Checked [`ROADMAP.md`](ROADMAP.md); this is a targeted plugin packaging/migration fix and does not duplicate planned core feature work. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 based coding agent, tool-enabled local repo access, reasoning mode managed by the Paperclip/Codex runtime. Exact context window was not surfaced in this session. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
03ad5c5bea |
[codex] Add issue document locking (#6009)
## Thinking Path > - Paperclip orchestrates AI-agent companies through company-scoped issues, comments, and issue documents. > - Issue documents are the durable place where plans, handoffs, and other work artifacts are revised over time. > - Some documents need to be preserved as operator-approved snapshots while agents continue working on the same issue. > - Without document locking, a later board or agent write can overwrite the document key that reviewers expected to remain stable. > - This pull request adds board-managed issue document locks and makes agent writes to locked keys create a derived document instead of mutating the locked document. > - The benefit is safer document handoffs: approved or frozen issue documents stay immutable until the board explicitly unlocks them. ## What Changed - Added `locked_at`, `locked_by_agent_id`, and `locked_by_user_id` document fields plus migration `0085_tranquil_the_executioner.sql`. - Added document lock/unlock service behavior, route endpoints, activity events, and locked-document write protections. - Made agent document writes to locked keys create a new derived key such as `plan-2` rather than overwriting the locked document. - Surfaced lock state through shared issue document types, UI API methods, document header lock controls, and activity formatting. - Added server and UI tests for lock/unlock behavior, locked document immutability, and UI action visibility. - Updated `doc/SPEC-implementation.md` with the V1 document lock contract and endpoints. ## Verification - `git rebase public-gh/master` completed cleanly after committing the branch changes. - `git diff --check` passed before commit. - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts ui/src/components/IssueDocumentsSection.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/lib/document-revisions.test.ts` passed: 5 files, 32 tests. ## Risks - Medium risk because this changes the document persistence contract and adds a migration. - The migration uses `ADD COLUMN IF NOT EXISTS` and guarded foreign-key creation so it remains safe for users who may have already applied an earlier copy of the migration. - Locked documents intentionally reject board edits/deletes/restores until unlocked; any existing workflows that expected direct overwrite need to unlock first. - Agent writes to locked keys now create derived documents, which may create extra issue documents when agents retry locked writes. ## Model Used - OpenAI Codex coding agent based on GPT-5, with tool use and local code execution in the Paperclip worktree. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
1bd44c8a0d |
Harden Cloudflare sandbox execution (#5967)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Remote-managed adapters need sandbox/environment execution to behave like real agent runs, not just local host probes. > - The Cloudflare sandbox path was the weakest leg in the SSH + Cloudflare QA matrix because bridge execution could truncate output, time out long-running installs, and under-provision the worker instance. > - That made several adapters fail for reasons unrelated to their actual business logic, which blocks confidence in Paperclip's non-local environment model. > - This pull request hardens the Cloudflare bridge/runtime path and adjusts sandbox probe budgets so adapter verification matches the measured behavior of the fixed environment. > - It also corrects the Pi sandbox install command so the QA matrix exercises a real, supported install path. > - The benefit is a materially more reliable SSH + Cloudflare adapter matrix with fewer false negatives and clearer failure boundaries. ## What Changed - Switched the Cloudflare bridge worker instance type to `standard-2` for the QA-matrix execution path. - Raised Cloudflare bridge/plugin-worker timeout budgets and added SSE keepalives so long-running install/exec calls can complete instead of dying at the transport layer. - Fixed Cloudflare bridge-channel command handling to avoid dropped final stdout chunks on short-lived execs. - Made Claude, OpenCode, and Cursor sandbox probe timeouts configurable/sandbox-aware, then tightened the defaults to the measured post-fix range. - Updated the Pi sandbox install command to use the package currently installed by the official `pi.dev` installer, pinned to a specific npm version. - Added/updated tests around Cloudflare bridge behavior and adapter sandbox probe paths. ## Verification - `pnpm --filter @paperclipai/adapter-claude-local typecheck` - `pnpm --filter @paperclipai/adapter-opencode-local typecheck` - `pnpm --filter @paperclipai/adapter-cursor-local typecheck` - `pnpm vitest run packages/adapters/cursor-local packages/adapters/claude-local packages/adapters/opencode-local packages/adapters/pi-local packages/plugins/sandbox-providers/cloudflare server/src/services/__tests__/plugin-worker-manager.test.ts` - Manual QA on the dedicated dev instance using the SSH + Cloudflare environment matrix (`ENV-29` through `ENV-40`). Clean end-to-end passes: SSH `claude_local`, `codex_local`, `cursor`, `gemini_local`; Cloudflare `claude_local`, `codex_local`, `cursor`, `gemini_local`. ## Risks - Cloudflare sandbox cost increases because the bridge worker now runs on `standard-2` instead of `lite`. - Higher timeout ceilings can delay surfacing truly hung Cloudflare bridge calls, even though they remove transport-level false negatives. - The manual heartbeat matrix still exposed follow-on execution/sync/disposition bugs in `opencode_local` and `pi_local`; those are not fixed by this PR. ## Model Used - OpenAI `gpt-5.4` via Paperclip `codex_local`, reasoning effort `high`, tool use enabled, repo search enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (not applicable) - [x] I have updated relevant documentation to reflect my changes (not applicable) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
4142559c37 |
[codex] Add blocked inbox attention view (#5603)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies through company-scoped issues, comments, approvals, and execution workspaces. > - Operators need the Inbox to show not only active work, but also blocked work that may need human or agent attention. > - The existing inbox experience did not have a dedicated blocked-work surface, so blocked tasks were harder to triage and resume deliberately. > - Backend consumers also needed a compact attention signal that distinguishes actionable blockers from covered or waiting blocker states. > - This pull request adds a Blocked Inbox tab backed by issue blocker-attention metadata, shared validators, and UI helpers. > - The benefit is a clearer triage path for stalled or blocked Paperclip work without exposing external wait internals in the operator-facing UI. ## What Changed - Added shared issue blocker-attention types, validators, and exports for the API/UI contract. - Added backend blocker-attention computation and issue route support for blocked inbox data. - Added the Blocked Inbox tab, blocked reason chips, filtering/search UI, responsive layouts, and Storybook stories. - Updated inbox helpers and page behavior so toolbar controls only appear where they apply. - Added coverage for shared validators, server blocker-attention behavior, blocked inbox UI helpers/components, and the Inbox page. - Added a screenshot helper script for the blocked inbox Storybook stories. - Addressed Greptile feedback by making urgency sorting deterministic for null stop times, avoiding full blocked-inbox list enrichment for counts, and hardening the screenshot helper. ## Verification - Rebased the branch cleanly onto `public-gh/master`. - Confirmed the diff does not include `pnpm-lock.yaml`. - Confirmed the diff does not include database migration files. - Ran `pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/issue-blocker-attention.test.ts ui/src/components/BlockedInboxView.test.tsx ui/src/components/BlockedReasonChip.test.tsx ui/src/lib/blockedInbox.test.ts ui/src/lib/inbox.test.ts ui/src/pages/Inbox.test.tsx`. - Ran `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck`. - Checked `ROADMAP.md`; this is scoped inbox/operator triage work and does not duplicate a listed roadmap feature. - Greptile Review is green on the latest head and all four Greptile review threads are resolved. - GitHub PR checks are green on the latest head: policy, security/snyk, e2e, verify, Canary Dry Run, Greptile Review, and serialized server suites 1/4 through 4/4. ## Risks - Medium review surface because this touches the shared issue contract, server issue services, and the Inbox UI together. - Blocker-attention classification may need product tuning after operators use it on real blocked queues. - UI screenshots were not attached in this PR-opening pass; the branch includes `scripts/screenshot-blocked-inbox.mjs` and Storybook stories for visual capture. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used OpenAI Codex, GPT-5-based coding agent with shell, git, GitHub CLI, GitHub connector, and Paperclip API tool use. Reasoning mode: medium. Context window: not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
d1a8c873b2 |
fix(remote-sandbox): harden host workspace resumes (#5922)
## Thinking Path > - Paperclip orchestrates AI agents through a control plane while adapters execute work in local, remote, or sandboxed runtimes. > - Remote sandbox execution depends on a strict host-versus-remote workspace boundary: the host prepares/restores files, while the adapter command runs inside the sandbox cwd. > - Jannes' PR #5823 identified host-side failure modes that were not covered by replacement PR #5822. > - Persisting a remote pod cwd in session params could poison the next host heartbeat resume and make Paperclip inspect or upload system temp roots. > - Plugin sandbox providers also need a narrow way to receive model-provider API keys without exposing the full server environment to every plugin worker. > - This pull request ports the host-side fixes from #5823 in the current codebase style, with focused regression coverage. > - The benefit is safer remote sandbox resumes and plugin worker environment handling without broadening core plugin privileges. ## What Changed - Persist host workspace cwd, not remote sandbox cwd, in `claude_local` session params while retaining remote execution identity metadata. - Reject saved session cwds that point at system roots before heartbeat falls back to agent home workspace. - Skip sockets, FIFOs, devices, and other non-file entries during workspace restore snapshot capture/comparison. - Pass a small model-provider API-key allowlist only to plugins declaring `environment.drivers.register`. - Added focused regression tests for remote Claude session params, unsafe session cwd detection, plugin worker env filtering, and non-file snapshot entries. Credits: ports host-side fixes from Jannes' #5823. ## Verification - `pnpm vitest run packages/adapter-utils/src/workspace-restore-merge.test.ts server/src/services/session-workspace-cwd.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/plugin-database.test.ts` (25 passed, 7 skipped by existing embedded-Postgres host guard) - `pnpm --filter @paperclipai/adapter-utils typecheck` - `pnpm --filter @paperclipai/adapter-claude-local typecheck` - `pnpm --filter @paperclipai/server typecheck` ## Risks - Low risk: changes are scoped to remote sandbox/session metadata, workspace snapshot filtering, and plugin worker env setup. - Sandbox-provider plugins now receive only the explicit model-provider key allowlist; any provider needing another key name will need a deliberate allowlist update. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled local code execution and repository editing. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
b947a7d76c |
[codex] Improve local plugin development workflow (#5821)
## Thinking Path > - Paperclip is the control plane for autonomous AI-agent companies. > - Plugins are the extension point for adding capabilities without expanding the core product surface. > - Local plugin development needed a tighter CLI-first loop so plugin authors can scaffold, run, install, inspect, and reload plugins without reaching into internal package paths. > - The server plugin install path also needed local-path handling that keeps plugin identity, dashboard routes, and development watchers coherent. > - This pull request adds the CLI scaffold/install workflow, fixes the server and SDK edge cases that blocked that loop, and updates the agent-facing plugin creation skill and docs. > - The benefit is that contributors can develop plugins from local folders with a documented, repeatable happy path. ## What Changed - Added `paperclipai plugin init` coverage and CLI wiring for local plugin scaffolding. - Improved local plugin install handling, plugin key route resolution, dashboard capability behavior, and dev watcher startup/reload behavior. - Fixed plugin SDK worker entrypoint validation for symlinked package layouts. - Added targeted tests for plugin init, server plugin authz/watcher behavior, SDK worker host validation, and the authoring smoke example. - Added a short local plugin development guide and refreshed the plugin authoring guide plus `paperclip-create-plugin` skill instructions. ## Verification - `pnpm run preflight:workspace-links && pnpm --filter @paperclipai/plugin-sdk build && pnpm --filter @paperclipai/create-paperclip-plugin typecheck && pnpm --filter paperclipai typecheck && pnpm --filter @paperclipai/plugin-sdk typecheck && pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run --project paperclipai cli/src/__tests__/plugin-init.test.ts` - `pnpm exec vitest run --project @paperclipai/plugin-sdk packages/plugins/sdk/tests/worker-rpc-host.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-dev-watcher.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-routes-authz.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm --dir packages/plugins/examples/plugin-authoring-smoke-example test` - Confirmed `pnpm-lock.yaml` is not included in the PR diff. ## Risks - Medium risk: this touches plugin install routing, CLI command behavior, and the local development watcher. - Local path plugin installs execute trusted local code by design; the new docs call out that trust boundary. - No database migrations are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled local shell and git workflow, medium reasoning effort. Context window details were not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge UI screenshots: not applicable; this PR changes CLI/server/plugin docs and tests, not board UI rendering. --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
0808b388ee |
[codex] Add source-scoped recovery actions (#5599)
## Thinking Path > - Paperclip is a control plane for autonomous AI companies, where work must end with a clear disposition rather than ambiguous agent liveness. > - Recovery currently detects stalled or missing-next-step issues, but source issue recovery can become split across child recovery issues, blockers, and comments. > - That makes it harder for operators and agents to see who owns recovery and what exact action is needed on the original issue. > - Source-scoped recovery actions give the original issue a first-class active recovery state with owner, evidence, wake policy, and resolution outcome. > - This pull request adds the recovery-action data model, backend reconciliation and resolution APIs, and board UI indicators/actions. > - The benefit is clearer stalled-work recovery without losing source issue context or relying on comments as the liveness path. ## What Changed - Added the `issue_recovery_actions` schema, shared types/constants/validators, and an idempotent `0084_issue_recovery_actions` migration ordered after current `master` migrations. - Updated stranded/missing-disposition recovery to create source-scoped recovery actions, wake the recovery owner on the source issue, and avoid locking the source issue for recovery-action wakes. - Added API support for reading active recovery actions on issue detail/list surfaces and resolving them with restored, blocked, cancelled, or false-positive outcomes. - Require blocked recovery resolutions to have an unresolved first-class blocker, and removed the UI shortcut that could mark recovery blocked without a blocker selection path. - Surfaced recovery indicators/actions in the issue UI, blocker notices, active run panels, issue rows, and Storybook coverage. - Updated docs and focused tests for recovery semantics, ownership, races, stale comments, and UI behavior. ## Verification - `pnpm exec vitest run server/src/__tests__/issue-recovery-actions.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/components/IssueBlockedNotice.test.tsx ui/src/api/issues.test.ts` — 5 files, 72 tests passed. - `pnpm --filter @paperclipai/shared typecheck` — passed. - `pnpm --filter @paperclipai/db typecheck` — passed, including migration numbering check. - `pnpm --filter @paperclipai/server typecheck` — passed. - `pnpm --filter @paperclipai/ui typecheck` — passed. - Follow-up verification after blocker-resolution guard: `pnpm exec vitest run server/src/__tests__/issue-recovery-actions.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/api/issues.test.ts` — 3 files, 27 tests passed. - Follow-up `pnpm --filter @paperclipai/server typecheck` — passed. - Follow-up `pnpm --filter @paperclipai/ui typecheck` — passed. - UI states are available in `ui/storybook/stories/source-issue-recovery.stories.tsx`; screenshot capture helper is `scripts/screenshot-recovery-card.cjs`. ## Risks - Medium: recovery behavior changes from child recovery issue ownership toward source-scoped actions, so operators may see stalled-work state in new places. - Migration risk is mitigated by using the next migration slot after `master` and making the table/constraints/index creation idempotent for anyone who previously applied the old branch-local `0082_dizzy_master_mold` migration. - Existing child recovery issue paths are still guarded for already-created recovery issues, but new source-scoped flows should be watched in CI and Greptile review. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool use enabled for shell, Git, GitHub, and local test execution. Context window not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
c445e59256 |
fix(ui): fix message attribution for agent-posted comments with user author IDs (#5780)
## Thinking Path > - Paperclip’s issue chat is an audit surface: reviewers need to trust who actually authored a message. > - Some historical agent comments were persisted with `authorUserId` and no surviving `createdByRunId`, so the UI rendered real agent output as if it came from the board user. > - A pure timestamp-window fallback is too risky because human reviewers can comment while agents are running. > - The safe recovery path is to derive attribution only when the server can prove it from same-issue run logs that include the exact posted comment id, then let the chat renderer prefer that recovered agent attribution. > - This keeps historical threads trustworthy without mutating old database rows or guessing in ambiguous cases. ## What Changed - Added shared `IssueComment` fields for derived attribution so server and UI can carry recovered `derivedAuthorAgentId`, `derivedCreatedByRunId`, and `derivedAuthorSource` consistently. - Added server-side attribution recovery in `server/src/services/issues.ts` that reads same-issue run logs and only derives agent authorship when a run log contains the exact `comment id: ...` emitted during posting. - Updated issue chat rendering in `ui/src/lib/issue-chat-messages.ts` to prefer direct agent authorship, then activity-log `runAgentId`, then the server-derived attribution. - Removed the unsafe UI-only run-window fallback from `ui/src/pages/IssueDetail.tsx` so human comments posted during an active run are not silently relabeled as agent output. - Added regression coverage for both the run-log derivation path and the chat-rendering fallback behavior. - Bounded server-side run-log enrichment to 8 concurrent reads per request and removed the unused `issueCommentSchema` declaration during PR cleanup. ## Verification - `pnpm exec vitest run ui/src/lib/issue-chat-messages.test.ts server/src/__tests__/issues-service.test.ts` - `pnpm test:run:general` - Live validation on May 12, 2026 in `PAPA-322`: confirmed the previously misattributed historical comments on `PAPA-316` now render as Claude-authored on `http://goldie.gerbil-company.ts.net:3100`. - Reviewer check: open `PAPA-316` in the running instance and confirm historical comments such as `## Investigation: exe.dev 422 + codex re-test` render under Claude instead of the board user. ## Risks - Low risk. The change is scoped to comment attribution recovery and rendering. - Derived attribution is intentionally conservative: if there is no exact run-log proof, the comment remains user-authored instead of guessing. - Run-log recovery depends on retained same-issue logs, so older comments without that evidence remain unchanged. ## Model Used - OpenAI Codex via the Paperclip `codex_local` adapter (GPT-5-class coding agent with tool use in the local Paperclip runtime; the exact deployment/model ID is not surfaced by this workspace). ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
563413ecd4 |
Fix LLM wiki type contracts (#5758)
## Thinking Path > - Paperclip is the control plane for autonomous AI companies, and plugins extend that control plane without bloating core. > - The LLM Wiki plugin adds a knowledge surface through the plugin runtime and shared plugin UI components. > - After the LLM Wiki work merged to `master`, CI exposed TypeScript contract drift between plugin code, SDK component types, and update settings types. > - The ingestion settings update path intentionally accepts partial source toggles, but its type intersected with the full settings shape and required every source key. > - The LLM Wiki UI also passes managed routine default-drift metadata through the shared routine list item shape, but that metadata was missing from the public item type. > - This pull request narrows those type contracts to match the existing runtime behavior. > - The benefit is restoring typecheck on `master` with a small, non-behavioral follow-up. ## What Changed - Added a `WikiEventIngestionSettingsUpdate` type that permits partial source updates without weakening normalized stored settings. - Added managed routine default-drift metadata to the plugin SDK `ManagedRoutinesListItem` type. - Mirrored that managed routine default-drift type in the host UI component item type. ## Verification - `pnpm --filter @paperclipai/plugin-llm-wiki typecheck` - `pnpm --filter @paperclipai/plugin-sdk typecheck` - `pnpm --filter @paperclipai/ui typecheck` - `git diff --check` ## Risks - Low risk. This is a TypeScript type-contract fix only; no runtime behavior or database schema changes. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled local repository editing and command execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Notes on checklist applicability: no screenshots are included because the UI change is a shared type-only contract update with no visual behavior change; no docs were required because no behavior or commands changed. Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
508355b8fc |
[codex] Add LLM Wiki plugin package to master (#5716)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The plugin system is the extension surface for optional product capabilities without baking every workflow into core. > - The LLM Wiki plugin package was reviewed in stacked PR #5592, which targeted `pap-9173-llm-wiki-rest`. > - The stack base PR #5597 merged to `master` before #5592 was merged into that branch, so the plugin package never reached `master`. > - A direct PR from `pap-9173-llm-wiki-rest` back to `master` would be noisy because that branch has diverged from current `master`. > - This pull request reapplies the reviewed `packages/plugins/plugin-llm-wiki/` package onto current `master` and updates Docker deps-stage manifest coverage. > - The branch intentionally no longer changes `pnpm-workspace.yaml` after maintainer feedback; because the new package is now a root workspace importer, the remaining integration question is how maintainers want the root lockfile handled under the current PR policy. ## What Changed - Added the LLM Wiki plugin package under `packages/plugins/plugin-llm-wiki/` from the merged PR #5592 head. - Preserved the post-review cleanup from #5592: generated design/screenshot artifacts are not committed, and `src/ui/index.tsx` / `src/wiki.ts` are small public entrypoints. - Added the new plugin package manifest to the Docker deps stage so policy can validate package manifest coverage. - Removed the earlier `pnpm-workspace.yaml` exclusion per maintainer request, so the plugin is included by the existing `packages/plugins/*` workspace glob. ## Verification Current head: - PGlite migration harness: ran migrations 001-003, verified old non-space distillation unique constraints were removed, inserted duplicate cursor and work-item keys in a second space, then reran migration 003 successfully - `node ./scripts/check-docker-deps-stage.mjs` - `git diff --check` Known current-head install result after removing the workspace exclusion: - `pnpm install --frozen-lockfile` fails because `pnpm-lock.yaml` has no importer for `packages/plugins/plugin-llm-wiki/package.json`. Previously verified on the same plugin source before the workspace-exclusion removal: - `pnpm --filter @paperclipai/plugin-sdk build` - `cd packages/plugins/plugin-llm-wiki && pnpm install --lockfile=false && pnpm test` ## Risks - The branch now includes `packages/plugins/plugin-llm-wiki` in the root workspace but does not update `pnpm-lock.yaml`. Root frozen install will fail until maintainers choose a lockfile path that fits repo policy. - Committing `pnpm-lock.yaml` directly on this PR conflicts with the current PR policy check, while excluding the package from `pnpm-workspace.yaml` was rejected in maintainer feedback. - The package includes UI code already reviewed in #5592; generated screenshot/design artifacts were intentionally removed per maintainer request, so visual review should regenerate screenshots locally if needed. - The package depends on plugin host support from #5597, which is already merged to `master`. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution enabled; context window not exposed. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run the targeted checks listed above - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Stack context: #5592 was merged into `pap-9173-llm-wiki-rest` after #5597 had already merged that branch to `master`, so this follow-up PR is needed to carry the plugin package itself into `master`. Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
ad0bb57350 |
Fix exe.dev sandbox installs for gemini/opencode local adapters (#5737)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, including running adapter CLIs inside remote sandboxes > - The QA matrix in PAPA-316 spins up local-runtime adapters (claude/gemini/opencode) against both SSH and the new exe.dev sandbox provider, and "Test" exercises the same install + probe path the real runtime uses > - On exe.dev the QA matrix failed at three different points: SSH/sandbox secret refs would not resolve, gemini-local could not find npm, and opencode-local installed a binary that was not on the probe-shell PATH > - These are all environment-shape issues the runtime should handle, not regressions in any individual adapter, so they need to be fixed in the shared install/resolve layer before the matrix can pass > - This pull request wires the environment id through to secret-ref resolution, bootstraps npm from a portable Node tarball when the sandbox image lacks Node, and symlinks the opencode binary into a directory that non-login shells see > - The benefit is that the QA matrix passes end-to-end on exe.dev, and any future sandbox provider that ships without Node or relies on rc-file PATH wiring gets the same fixes for free ## What Changed - `server/src/services/environment-execution-target.ts`: pass the environment `id` into `resolveEnvironmentDriverConfigForRuntime` for both the sandbox and SSH branches, so `privateKeySecretRef` / sandbox-provider secret refs (e.g. exe.dev `apiKey`) can resolve against the secret store at runtime instead of throwing `Runtime secret resolution requires an environment id`. - `packages/adapter-utils/src/sandbox-install-command.ts`: extend `buildSandboxNpmInstallCommand` with an `ENSURE_NPM_PREAMBLE` that, when `npm` is missing, downloads a portable Node v22 tarball into `$HOME/.local` and sets `PAPERCLIP_NPM_BOOTSTRAPPED=1` so the install step skips sudo (sudo's `secure_path` would lose the freshly-installed `npm` in `$HOME/.local/bin`). Distro-packaged Node from apt-get is intentionally avoided because it tends to be too old to parse modern JS syntax used by `@google/gemini-cli`. - `packages/adapters/gemini-local/src/index.ts`: switch the hardcoded `npm install -g @google/gemini-cli` to `buildSandboxNpmInstallCommand`, so gemini-local picks up the same sudo-aware + npm-bootstrap behavior as the other local adapters. - `packages/adapters/opencode-local/src/index.ts`: append a step to the install command that symlinks `$HOME/.opencode/bin/opencode` into `$HOME/.local/bin`. The upstream installer only adds `~/.opencode/bin` to PATH via `~/.bashrc`, which non-login `sh -c` probe invocations do not source. - `packages/adapter-utils/src/sandbox-install-command.test.ts`: cover the new preamble plus the unchanged root/sudo/user-prefix branches. ## Verification - `cd packages/adapter-utils && npm test -- sandbox-install-command` (passes; new "bootstraps npm from a portable Node tarball when missing" case is included). - Manual: ran the in-app `Test` action against the QA matrix dev instance for `QA exe.dev Claude`, `QA exe.dev Gemini`, and `QA exe.dev OpenCode` — all three now report `status=pass` including the hello probe. `QA SSH Claude` also passes; without the environment-id fix, SSH resolution threw before the wrapper / install fixes could run. - Suggested reviewer check: re-run the matrix on a fresh exe.dev environment and confirm the install step no longer hits `npm: command not found` for gemini and the opencode probe no longer hits `opencode: command not found`. ## Risks - Low/medium. The npm bootstrap pins Node `v22.11.0` from `nodejs.org/dist`; if that URL becomes unreachable the install will fail with a clear `curl` error rather than corrupting state. The bootstrap path is only taken when `npm` is genuinely missing, so existing sandbox images that ship with Node are unaffected. - The opencode symlink uses `ln -sf` into `$HOME/.local/bin`, which is created with `mkdir -p`; idempotent on re-install. - The `id` change is a strict additive: callers previously got `undefined` and only the secret-ref code paths actually read it. No behavior change for environments without secret refs. ## Model Used - Claude (Anthropic), `claude-opus-4-7`, with extended thinking and tool use enabled. Iterated through the Paperclip QA matrix harness; no other model assisted. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots (n/a — runtime/install path only) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
5a64cf52a1 |
Add exe.dev sandbox provider plugin (#5688)
> _Stacked on top of #5685 → #5686 → #5687. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Add long-secret textarea variant to JsonSchemaForm SecretField` + `Add exe.dev sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose the provider — today E2B, Daytona, and (in this stack) Cloudflare > - exe.dev offers per-VM sandboxes via a small CLI / HTTP API — useful for operators who want full Linux VMs (vs container/runtime-only sandboxes) > - The plugin shape mirrors the e2b plugin: lifecycle hooks (`new`, `ls`, `rm`) drive exe.dev's CLI; SSH plumbing handles direct VM access for adapters that need it > - exe.dev VMs come up bare — `node` is not preinstalled, so the Paperclip sandbox callback bridge (a Node script) needs Node 20 installed at VM init via `--setup-script`. The plugin defaults the setup script to a Nodesource install > - The auth field accepts long SSH private keys, which need a textarea variant of the existing `SecretField` in `JsonSchemaForm` — added behind a `maxLength > THRESHOLD` opt-in so other secret fields are unaffected > - The benefit is that operators get exe.dev as a fully working sandbox provider out of the box, with no manual VM provisioning required ## What Changed **Shared UI support (`Add long-secret textarea variant to JsonSchemaForm SecretField`):** - `ui/src/components/JsonSchemaForm.tsx` + new `JsonSchemaForm.test.tsx`: when a secret-formatted field declares `maxLength` larger than the existing single-line threshold, render a monospace textarea instead of the masked input. Short secrets (API keys, tokens) keep the existing masked-input + show/hide toggle behavior. **The exe.dev plugin (`Add exe.dev sandbox provider plugin`):** - `packages/plugins/sandbox-providers/exe-dev/`: plugin entry, manifest, plugin runtime, README, and 19-test Vitest suite. - Manifest fields: API token (with `secret-ref` + `/exec` permission notes — needs `new`, `ls`, `rm`), API URL override, optional SSH username, optional SSH private key (uses the new `JsonSchemaForm` textarea variant via `maxLength: 4096`), optional SSH identity-file path, optional setup script. - Default `--setup-script` is a Nodesource Node 20 install. exe.dev VMs come up bare and the Paperclip sandbox callback bridge is a Node script, so without Node preinstalled the bridge can't start. Operators can override by supplying their own setup script. - `runLifecycleCommand` redacts env values from the executed command before surfacing it in error messages, so secrets passed via `--env=KEY=VALUE` don't leak into operator-visible failures. - The plugin distinguishes exe.dev's SSH onboarding failures (`Please complete registration by running: ssh exe.dev`) from general SSH failures and surfaces a clear remediation message. - `scripts/release-package-manifest.json`: register the new plugin for CI publish alongside the existing daytona / e2b providers. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage ui/src/components/JsonSchemaForm.test.tsx` - `(cd packages/plugins/sandbox-providers/exe-dev && pnpm test)` — 19 passing For an operator-side smoke test: 1. Get an exe.dev API token with `/exec` permission for `new`, `ls`, `rm`. 2. Register the plugin in your Paperclip instance, configure an environment with the token. 3. Create a sandbox env whose provider is `exe-dev`, then run a Codex or Claude job against it. The default Node 20 setup script should bring the VM up automatically. ## Risks - Adds a new sandbox provider plugin that follows the existing daytona / e2b shape; behavior on existing providers is unchanged. - The `JsonSchemaForm` textarea variant only engages for fields that opt in via `maxLength` larger than the existing threshold. All existing secret fields (which don't declare a `maxLength`) keep their current rendering. Test coverage pins both paths. - The redaction in `runLifecycleCommand` is a defense-in-depth measure; the test suite exercises the redaction path. If the redaction misses a future env-arg shape, the worst case is restored behavior (secrets in error messages), which is what the existing daytona / e2b plugins also do today. - Default setup script downloads from `deb.nodesource.com` over HTTPS at VM init. Operators on air-gapped networks or with a different package strategy can override the setup script. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — UI change is a textarea variant of an existing secret field; will attach screenshots before requesting merge - [x] I have updated relevant documentation to reflect my changes (plugin README, manifest descriptions) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
486fb88a15 |
Add Cloudflare sandbox provider plugin (#5687)
> _Stacked on top of #5685 → #5686. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Extend sandbox callback bridge for Worker-hosted plugins` + `Add Cloudflare sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose which provider backs that sandbox — today E2B and Daytona are bundled with the platform > - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a credible new option: globally distributed, cheap idle, and operator-deployable as a single Worker > - To plug it in, Paperclip needs (a) a provider plugin that speaks the `PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the Cloudflare Sandbox SDK > - The plugin extends the existing sandbox-callback-bridge with a `bridge.transport: "worker"` discriminator so the platform routes runtime RPCs through the Worker bridge instead of the in-process runner > - This pull request adds the plugin, the bridge Worker template, and the supporting adapter-utils + server hooks the new transport needs > - The benefit is that operators can run sandboxes on Cloudflare's edge with no new platform code beyond installing the plugin and deploying the Worker ## What Changed **Shared support (`Extend sandbox callback bridge for Worker-hosted plugins`):** - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: expose `expectedHostHeader` so plugin-side bridge clients can verify the canonical request envelope before forwarding. - `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`: relax the always-fresh runner construction so callers can re-use a runner across exec calls (Worker-hosted bridges hold the runner inside a Durable Object). - `server/src/services/environment-runtime.ts` + `environment-runtime.test.ts`: route Worker-hosted bridges through the same env-shaping path as E2B and pin the `requestEnv` contract. - `server/src/services/plugin-environment-driver.ts`: thread an optional `issueId` through the runtime descriptor so bridges can scope leases to the originating issue (used by Cloudflare to map a sandbox to the issue/workflow for billing and audit). - `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to `PluginEnvironmentDriverBaseParams` and the new `bridge.transport: "worker"` discriminator that the new plugin declares. - `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the heartbeat path against the new runtime descriptor. **The Cloudflare plugin itself (`Add Cloudflare sandbox provider plugin`):** - `packages/plugins/sandbox-providers/cloudflare/`: plugin entry, manifest, plugin runtime (lifecycle + bridge client), config parsing, and Vitest coverage. Manifest declares `bridge.transport: "worker"` so the platform routes runtime RPCs through the bridge client. - `bridge-template/`: a Worker template the operator deploys with `wrangler`. Owns Durable Object-backed sessions (`sessions.ts`), exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer (`auth.ts`) that pins the `Host` header surface. Includes the SDK-contract-correct exec implementation, lease recovery, and chunked stdout/stderr streaming. - Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`, `routes.test.ts`), bridge client request shaping (`src/bridge-client.test.ts`), and end-to-end plugin behavior (`src/plugin.test.ts`) including streamed exec output. 27 tests in total. - `README.md` walks the operator through deploying the bridge Worker, registering the plugin, and configuring the runtime. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/sandbox-callback-bridge.test.ts packages/adapter-utils/src/command-managed-runtime.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/heartbeat-plugin-environment.test.ts` - `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27 passing For an operator-side smoke test: 1. Deploy the bridge: `cd packages/plugins/sandbox-providers/cloudflare/bridge-template && wrangler deploy` 2. Register the plugin in your Paperclip instance, point its bridge URL at the deployed Worker, set the HMAC shared secret. 3. Create a sandbox environment whose provider is `cloudflare`, then run a Codex or Claude job against it. ## Risks - Adds a new `bridge.transport: "worker"` code path, but the existing E2B / Daytona transports go through the same shaped helpers and have explicit test coverage that pins their behavior unchanged. - The Worker bridge stores session state in a Durable Object; operator instances must be aware of the corresponding Cloudflare costs (DO requests, storage). Documented in the README. - The `issueId` plumbing is optional throughout — existing plugins that don't supply it continue to work. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes (plugin README, bridge-template README) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
0fe39a2d5c |
fix(cursor-local): resolve sandbox agent installs from cursor bin (#5686)
> _Stacked on top of #5685 (Harden remote sandbox runtime). Diff against master includes commits from earlier PRs in the stack — review focuses on the new commit only._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The cursor-local adapter wraps the Cursor Agent CLI so a Paperclip workflow can drive it inside a sandbox > - When the adapter runs in a remote sandbox, the Cursor Agent CLI installs under `$HOME/.local/bin/cursor-agent` (or wherever `$XDG_BIN_HOME` points), not on the global PATH > - The existing post-install resolution assumed `cursor-agent` would resolve via the sandbox's login shell PATH after `npm install -g`, which fails on sandboxes where the install lands in a user-prefixed directory that isn't on PATH at probe time > - This pull request resolves the agent CLI from the cursor binary's own directory (`dirname "$(command -v cursor)"`) so the install probe and execute path agree on a real binary location > - The benefit is that cursor-local works correctly on any sandbox provider where `npm install` lands in a user-prefixed directory ## What Changed - `packages/adapters/cursor-local/src/server/remote-command.ts`: resolve the cursor-agent binary from the cursor bin directory after install, instead of relying on PATH. - `packages/adapters/cursor-local/src/server/test.ts`: corresponding probe tweak. - `packages/adapters/cursor-local/src/server/test.test.ts` (new) + `remote-command.test.ts`: focused coverage that exercises the install + resolve path against a sandbox runner that places the binary in a user-prefixed directory. ## Verification - `pnpm exec vitest run --no-coverage packages/adapters/cursor-local/src/server/test.test.ts packages/adapters/cursor-local/src/server/remote-command.test.ts packages/adapters/cursor-local/src/server/execute.test.ts` All passing locally. ## Risks - Local cursor-local runs are unaffected — the resolution change only kicks in for the sandbox install path. - Low risk; isolated to one adapter. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: tool use (Read/Edit/Bash), no code execution beyond local repo commands ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
b24c6909e8 |
Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs inside a sandbox environment so its CLI is isolated from the host > - Sandbox-backed adapter runs go through a small set of shared helpers — `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings > - When standing up new sandbox provider plugins, the existing helpers timed out, missed install fallbacks, or leaned on assumptions that only held for E2B > - Local adapters (`claude-local`, `codex-local`, `gemini-local`, `opencode-local`) needed slightly hardened probes so they could install themselves and validate inside *any* remote sandbox transport, not just E2B > - This pull request bundles those runtime fixes so future sandbox provider plugins inherit a working baseline > - The benefit is that adding a new sandbox provider plugin no longer requires touching adapter-utils or each local-adapter probe — the supporting infra is already correct ## What Changed - `packages/adapter-utils/src/execution-target.ts`: introduce `DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and `resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters keep the historical "0 means no adapter timeout" behavior; sandbox-backed runs without an explicit `timeoutSec` get an explicit 30-minute default so remote installs and warm-up don't time out at the per-RPC default. Plumbed `timeoutSec` through `ensureAdapterExecutionTargetCommandResolvable` so install probes inside a sandbox honor adapter-level overrides instead of the bridge's 5-minute default. - `packages/adapters/opencode-local/src/index.ts`: switch `SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl -fsSL https://opencode.ai/install | bash`. The npm package reifies four large prebuilt-binary subpackages in parallel even though only one matches the host arch; on bandwidth-constrained sandboxes that blew through the 240s install budget. The official installer fetches one arch-specific binary and adds `$HOME/.opencode/bin` to PATH via `~/.bashrc`, which the sandbox-callback-bridge login-shell script already sources. - `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden remote-target probes — pass `--skip-git-repo-check` for Codex when probing outside a repo, normalize permission flags for Claude, and add `*.remote.test.ts` coverage that exercises the remote-sandbox path explicitly for each adapter. - `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}` (new): add `buildSandboxNpmInstallCommand` helper. `server/src/adapters/registry.ts` + new `server/src/__tests__/adapter-registry.test.ts`: wire adapter install commands so they fall back to a writable `$HOME/.local` prefix when global install isn't available. - `server/src/__tests__/plugin-worker-manager.test.ts` + new `server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call timeout overrides so plugin worker exec calls honor the caller's timeout instead of the worker's default. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/execution-target-sandbox.test.ts packages/adapter-utils/src/sandbox-install-command.test.ts` - `pnpm exec vitest run --no-coverage server/src/__tests__/plugin-worker-manager.test.ts server/src/__tests__/adapter-registry.test.ts server/src/__tests__/claude-local-adapter-environment.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/gemini-local-adapter-environment.test.ts` - `pnpm exec vitest run --no-coverage packages/adapters/codex-local/src/server/test.remote.test.ts packages/adapters/opencode-local/src/server/test.remote.test.ts packages/adapters/codex-local/src/server/codex-args.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts` All passing locally. ## Risks - Touches shared `adapter-utils` and several `*-local` adapters. The 30-minute default applies only when both (a) the target is `remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH paths are unchanged. New test coverage was added alongside each behavior change to pin the contracts. - Switching OpenCode's install command to the official installer is a behavior change for any operator running OpenCode inside a remote sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND` only runs when an adapter is being installed inside a sandbox). - Low risk overall — no migrations, no API surface change. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep), no code execution beyond local repo commands ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
534aee66ae |
Add cursor_cloud adapter for Cursor SDK + Cloud Agents API v1 (#5664)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - There are many adapter types, one per agent-runtime product (Claude,
Codex, OpenCode, Cursor local CLI, etc.)
> - Cursor shipped a public TypeScript SDK on 2026-04-29 that exposes
Cursor's full hosted-agent platform (cloud VMs, harness, MCP, skills,
hooks)
> - Paperclip had no first-class adapter for this — agents that wanted
to use Cursor's managed cloud runtime had to fall back to the local CLI
adapter, which loses the cloud session, streaming, and durable run model
> - This PR adds a new `cursor_cloud` adapter built directly on
`@cursor/sdk`, with Paperclip's heartbeat mapped to Cursor's
durable-agent + per-run model
> - The benefit is that any Paperclip agent can now drive a Cursor cloud
agent across heartbeats with native session reuse, streaming, and
cancellation, while Paperclip remains the source of truth for issue/task
state
## What Changed
- New built-in adapter package `packages/adapters/cursor-cloud` (15
files, ~1.7k LOC) backed by `@cursor/sdk` ^1.0.12
- `src/server/execute.ts` — SDK-first lifecycle: `Agent.create` /
`Agent.resume` / `Agent.getRun` / `agent.send` / `run.stream` /
`run.wait`, with session reuse keyed on the (runtime env type, env name,
repo set) tuple
- `src/server/session.ts` — codec for `cursorAgentId` + `latestRunId` +
repo metadata, persisted in `runtime.sessionParams`
- `src/server/test.ts` — environment probe via `Cursor.me()` and
optional model validation via `Cursor.models.list()`
- `src/ui/parse-stdout.ts` + `src/cli/format-event.ts` — normalize
Cursor SDK message types (`status`, `thinking`, `assistant`, `user`,
`tool_call`, `tool_result`, `result`) into Paperclip transcript events
for the UI and CLI
- Registrations: `packages/shared/src/constants.ts`,
`packages/adapter-utils/src/session-compaction.ts`,
`server/src/adapters/{registry,builtin-adapter-types}.ts`,
`ui/src/adapters/{registry,adapter-display-registry}.ts` +
`ui/src/adapters/cursor-cloud/index.ts`, `cli/src/adapters/registry.ts`,
plus workspace deps in `cli`/`server`/`ui` `package.json`
- `ui/src/components/AgentConfigForm.tsx` — hide local-Cursor
`mode`/thinking-effort field for `cursor_cloud` (different config
surface)
- 11 vitest tests covering execute paths (fresh create, matching-resume,
active-run reattach, non-finished result), session codec round-trip,
transcript parsing, and config building
## Verification
Reviewer steps:
```bash
pnpm install
pnpm --filter @paperclipai/adapter-cursor-cloud typecheck # → clean
pnpm vitest run packages/adapters/cursor-cloud # → 11/11 passing
```
End-to-end check against a real Cursor cloud agent (requires
`CURSOR_API_KEY` and Cursor GitHub-app install on the target repo):
1. Create a `cursor_cloud` agent in Paperclip with `repoUrl` set to the
test repo, `repoStartingRef: main`, and `env.CURSOR_API_KEY` set
2. Trigger a heartbeat → adapter calls `Agent.create({ cloud: { env: {
type: "cloud" }, repos: [...] } })`, streams events, terminates on
`finished`
3. Trigger a second heartbeat → adapter calls `Agent.resume` or
`agent.send` follow-up depending on prior-run state, reusing
`cursorAgentId`
4. The Paperclip UI/CLI transcript reflects Cursor `status` / `thinking`
/ `assistant` events as they stream
5. Cancellation from Paperclip maps to `run.cancel()` or Cloud API v1
`cancelRun` for cross-heartbeat cancellation
A direct-SDK smoke run against a real repo (devinfoley/my_test_project @
main) confirmed: `Cursor.me()` ok → `Agent.create` → `agent.send` →
`run.stream()` (30 events) → terminal status `finished` in ~11s.
## Risks
- **New adapter, additive only.** No existing adapter or registry is
replaced; current `cursor` local-CLI adapter is untouched. Default
behavior of any existing agent is unchanged.
- **External dependency on `@cursor/sdk`.** Cursor's SDK is v1.0.x and
may evolve. Mocked unit tests cover the public surface used here; if the
SDK breaks compatibility we update the adapter independently.
- **Cost/budget.** `cursor_cloud` runs on Cursor's billed cloud VMs;
operators must understand they are spending money outside Paperclip's
budget controls when they enable this adapter. Same shape as other
API-billed adapters.
- **No webhook support in V1.** The SDK already provides
stream/wait/cancel/reattach, so V1 does not require a public callback
URL. If a future use case needs out-of-band wakes, we add a Cloud API v1
webhook bridge as a separate change. This is called out in the issue
plan document.
- **Lockfile.** Per repo policy, `pnpm-lock.yaml` is intentionally not
in this PR — CI's lockfile workflow will update it on merge given the
manifest changes.
## Model Used
- Provider: Anthropic Claude (via Claude Code / Paperclip `claude_local`
adapter)
- Model: `claude-opus-4-7` (Claude Opus 4.7), knowledge cutoff January
2026
- Mode: standard tool-use with extended reasoning
- Context: ~200k token window
- Capabilities used: code generation, multi-file edits, shell/test
execution, GitHub PR workflow
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (11/11 in
`packages/adapters/cursor-cloud`)
- [x] I have added or updated tests where applicable (4 new test files,
11 cases)
- [ ] If this change affects the UI, I have included before/after
screenshots (the only UI change is hiding the local-Cursor mode field on
the `cursor_cloud` adapter — happy to attach a screenshot if the
reviewer wants one)
- [x] I have updated relevant documentation to reflect my changes (issue
plan document supersedes the pre-SDK design; tracked in PAPA-203)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
|
||
|
|
0096b56a1c |
[codex] Add LLM Wiki plugin host support (#5597)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The plugin system needs host contracts and runtime support before large plugins can integrate cleanly. > - The source branch mixed the LLM Wiki package with supporting host/runtime work, managed plugin skills, root-level storage spaces, and a bookmarks reference plugin. > - [PAP-9173](/PAP/issues/PAP-9173) asked for the current branch to be split by file boundary: plugin package separately from everything else. > - [PAP-9188](/PAP/issues/PAP-9188) clarified that LLM Wiki may have plugin-local spaces, but Paperclip core should not reorganize top-level local storage into spaces. > - Follow-up review clarified that the bookmarks example should not ship in this PR either. > - This pull request contains the non-`packages/plugins/plugin-llm-wiki/` host/runtime work, keeps runtime state under the selected Paperclip instance root, and no longer includes the bookmarks example. ## What Changed - Added/updated plugin host contracts, SDK types, worker RPC plumbing, managed plugin skill support, and related server tests. - Removed the bookmarks example plugin package and its bundled-example/workspace references. - Removed the root-level local spaces CLI/migration surface and restored instance-root runtime defaults for config, db, logs, storage, secrets, workspaces, projects, and adapter homes. - Replaced shared root `space-paths` helpers with `home-paths` helpers for core runtime storage. - Tightened stranded recovery unique-conflict detection so concurrent recovery scans reuse the raced recovery issue when Postgres errors are wrapped. - Kept `packages/plugins/plugin-llm-wiki/` out of this PR diff; plugin-local spaces remain in the stacked plugin-only PR. ## Verification - `pnpm exec vitest run cli/src/__tests__/data-dir.test.ts cli/src/__tests__/home-paths.test.ts cli/src/__tests__/onboard.test.ts packages/shared/src/home-paths.test.ts packages/db/src/runtime-config.test.ts server/src/__tests__/agent-instructions-service.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/codex-local-execute.test.ts` - `pnpm exec vitest run packages/db/src/runtime-config.test.ts` - `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts` - `pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "reuses the raced stranded recovery issue"` skipped locally because embedded Postgres did not initialize on this macOS temp host; the code path was typechecked and is covered by Linux CI. - Boundary check: no core references remain for `PAPERCLIP_SPACE_ID`, `spaces migrate-default`, `@paperclipai/shared/space-paths`, `registerSpacesCommands`, or the removed bookmarks example. - Previous PR head `4f23e034` had green GitHub checks: `verify`, all four serialized server shards, `e2e`, `Canary Dry Run`, `policy`, Snyk, and `Greptile Review`. Current head `582f466d` is re-running checks after the bookmarks deletion. ## Risks - Plugin host changes touch shared runtime paths, so regressions would most likely appear in adapter startup, plugin loading, or local dev path defaults. - Removing the bookmarks example also removes one demonstration of plugin database namespaces plus local-folder persistence; remaining plugin examples still cover bundled example discovery and plugin host flows. - The plugin package itself is intentionally deferred to the stacked plugin-only PR, where LLM Wiki plugin-local spaces live. - Existing installs that tested the transient root-level spaces CLI should stop using it; this PR intentionally removes that unsupported migration surface before merge. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution enabled; context window not exposed. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass, except where noted above for host-specific embedded Postgres initialization - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Stacked follow-up: PR #5592 contains only `packages/plugins/plugin-llm-wiki/` and targets this branch. --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
2f72cb29ea |
chore: update drizzle-orm to 0.45.2 (#5589)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The server, DB package, and CLI all rely on the shared Drizzle ORM dependency for core persistence flows. > - A published install was still resolving nested `drizzle-orm@0.38.4`, which left the production package graph behind the intended security update. > - The repo’s documented dependency policy says GitHub Actions owns `pnpm-lock.yaml`, so the correct maintainer workflow is to update dependency manifests in the feature PR and let the lockfile refresh happen separately after merge. > - This pull request therefore keeps the Drizzle upgrade to the package manifests only and leaves lockfile regeneration to the existing `Refresh Lockfile` automation. ## What Changed - Updated `drizzle-orm` dependency declarations in `cli/package.json`, `packages/db/package.json`, and `server/package.json` from `0.38.4` / `^0.38.4` to `0.45.2` / `^0.45.2`. - Re-verified the packed `@paperclipai/db` and `@paperclipai/server` publish payloads to confirm their generated `package.json` files advertise `drizzle-orm ^0.45.2`. - Removed the temporary lockfile/CI follow-up commits so the branch now matches the intended manifest-only protocol. ## Verification - `pnpm list drizzle-orm -r --depth 0` - `pnpm exec vitest run packages/db/src/client.test.ts server/src/__tests__/issues-service.test.ts` - `pnpm run test:release-registry` - Packed `@paperclipai/db` and `@paperclipai/server` locally and inspected the tarball `package.json` files to confirm they advertise `drizzle-orm ^0.45.2`. ## Risks - Low to moderate risk: the runtime code paths are unchanged, but downstream lockfile refresh now depends on the existing post-merge GitHub automation working as documented. - A separate packaging/versioning issue around unpublished `@paperclipai/plugin-sdk@1.0.0` showed up during a raw local tarball install experiment; that is called out for reviewers but is not part of this Drizzle bump. ## Model Used - OpenAI Codex via the `codex_local` adapter, using a GPT-5-based coding agent with terminal tool use and code execution. The adapter does not expose a public exact model ID or context-window value in this environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
778e775c35 |
Add secrets provider vaults and remote import (#5429)
## Thinking Path > - Paperclip orchestrates AI-agent companies and needs secrets handling to work across local development, hosted operators, and governed agent execution. > - The affected subsystem is the company-scoped secrets control plane: database schema, server services/routes, CLI workflows, and the Secrets settings UI. > - The gap was that secrets were local-only and operators could not manage provider vaults or import existing remote references without exposing plaintext. > - This branch adds provider vault configuration plus an AWS Secrets Manager remote-import path while preserving company boundaries, binding context, and audit trails. > - I kept the PR to a single branch PR, removed unrelated lockfile/package drift, rebased the full branch onto the current `public-gh/master`, and addressed fresh Greptile findings. > - The benefit is a reviewable implementation of provider-backed secrets with focused tests covering provider selection, import conflicts, deleted secret reuse, rotation guards, and AWS signing behavior. ## What Changed - Added provider vault support for company secrets, including provider config storage, default vault handling, health checks, binding usage, access events, and remote import preview/commit. - Added an AWS Secrets Manager provider using SigV4 request signing, bounded request timeouts, namespace guardrails, cached runtime credential resolution, and external-reference linking without plaintext reads. - Added Secrets UI surfaces for vault management and remote import, plus CLI/API documentation for setup and operations. - Stabilized routine webhook secret binding paths and SSH environment-driver fixture bindings discovered during verification. - Addressed Greptile and CI findings: no lockfile/package drift, monotonic migration metadata, disabled-vault default races, soft-deleted secret hiding/recreate behavior, remove behavior with disabled vaults, soft-deleted external-reference re-import, non-active rotation guards, managed-secret soft deletion through PATCH, and per-call AWS SDK credential client churn. - Rebased this branch onto `public-gh/master` at `0e1a5828` and force-pushed with lease to keep this as the single PR for the branch. ## Verification - `git fetch public-gh master` - `git rebase public-gh/master` - `git diff --name-only public-gh/master...HEAD | grep '^pnpm-lock\.yaml$' || true` confirmed `pnpm-lock.yaml` is not in the PR diff. - Confirmed migration ordering: master ends at `0081_optimal_dormammu`; this PR adds `0082_dry_vision` and `0083_company_secret_provider_configs`. - Inspected migrations for repeat safety: new tables/indexes use `IF NOT EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column additions use `ADD COLUMN IF NOT EXISTS`. - `pnpm -r typecheck` passed before the Greptile follow-up commits. - `pnpm test:run` ran the full stable Vitest path before the Greptile follow-up commits; it completed with 3 timing-related failures under parallel load: `codex-local-execute.test.ts`, `cursor-local-execute.test.ts`, and `environment-service.test.ts`. - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/codex-local-execute.test.ts src/__tests__/cursor-local-execute.test.ts src/__tests__/environment-service.test.ts` passed on targeted rerun (`24/24`). - `pnpm build` passed before the Greptile follow-up commits. Vite reported existing chunk-size/dynamic-import warnings. - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts src/__tests__/secrets-service.test.ts` passed (`39/39`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server typecheck` passed. - Captured Storybook screenshots from `ui/storybook-static` for visual review. - Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites 1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review` pass; aggregate `verify` is still registering the completed child checks. - Greptile review loop continued through the latest requested pass; all Greptile review threads are resolved and the latest `Greptile Review` check on `5ca3a5cf` passed with 0 comments added. ## Screenshots Before: the provider-vault and remote-import surfaces did not exist on `master`; these are after-state screenshots from the Storybook fixtures.    ## Risks - Migration risk: this adds new secret provider tables and extends existing secret rows. The migrations were checked for monotonic ordering and idempotent guards, but reviewers should still inspect upgrade behavior carefully. - Provider risk: AWS support uses direct SigV4 requests. Automated tests cover signing, request timeouts, vault-config selection, namespace guardrails, pending-version archival, sanitized provider errors, and service-level cleanup paths. A real-vault AWS smoke test remains deployment validation for an operator with AWS credentials rather than an unverified merge blocker in this local branch. - UI risk: the Secrets page and import dialog are large new surfaces; screenshots are included above for reviewer inspection. - Verification risk: the full local stable test command hit parallel-load timing failures, although the exact failed files passed when rerun directly. - Operational risk: remote import intentionally avoids plaintext reads; operators must understand that imported external references resolve at runtime and may fail if AWS permissions change. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent with local shell/tool use in the Paperclip worktree. Exact context-window size was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
06e6ee25cd |
Add Daytona sandbox provider plugin (#5580)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents need isolated sandbox environments to execute work safely; Paperclip already supports E2B as a sandbox provider plugin > - Users want to use Daytona (https://www.daytona.io/) as an alternative sandbox backend, but no plugin existed for it > - Without a Daytona plugin, teams that prefer Daytona's pricing/regions/runtime can't run Paperclip agents on it > - This pull request adds a `@paperclip/sandbox-provider-daytona` plugin that mirrors the existing E2B plugin shape and wires up Daytona's `@daytonaio/sdk` for sandbox lifecycle, command execution, and shell detection > - The benefit is that operators can pick Daytona as a first-class sandbox provider without touching core code, broadening Paperclip's runtime options ## What Changed - New plugin package `packages/plugins/sandbox-providers/daytona` with manifest, worker entry, and provider implementation backed by `@daytonaio/sdk` - Implements sandbox create/destroy/exec/upload/download lifecycle, shell command detection, and config/env wiring consistent with the E2B plugin - Adds unit tests under `src/plugin.test.ts` and a README documenting setup and the `DAYTONA_API_KEY` requirement - Minor adjustments in `scripts/paperclip-issue-update.sh`, `packages/shared/src/issue-thread-interactions.test.ts`, and `packages/shared/src/validators/issue.ts` to support the integration ## Verification - Re-ran the full sandbox provider matrix on the QA Paperclip instance using Daytona as the runtime — all 6 adapters executed inside the Daytona sandbox with zero `environmentExecute` timeouts - 5/6 adapters pass cleanly (or with informational warns); the only failure is `codex_local`, which is an OpenAI quota/billing issue unrelated to Daytona - `pnpm --filter @paperclip/sandbox-provider-daytona test` runs the plugin unit tests ## Risks - New optional plugin; no behavior change for users who don't enable it - Requires `DAYTONA_API_KEY` for runtime use — documented in the plugin README - Daytona SDK is a new external dependency; tracked in the plugin's own package.json so it doesn't affect the core install footprint ## Model Used - Claude Opus 4.7 (`claude-opus-4-7`), extended thinking, tool use enabled ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots (N/A — backend plugin) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
0e1a582831 |
Revert "Add experimental newest-first issue thread" (#5460)
This is actually bad. Glad it was under experiments. |
||
|
|
a904effb96 |
Add experimental newest-first issue thread (#5455)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so issue threads are a core operator surface for reviewing work. > - The issue detail page is the place where humans read agent messages, user comments, and execution context together. > - That thread originally rendered oldest-first, which made recent activity harder to see during active review. > - Reversing the thread order changes navigation expectations, timestamp placement, and the "Jump to latest" affordance, so the UI behavior needed to move as a coherent set. > - Because this is a visible core-product behavior shift, it also needed a safe rollout path instead of becoming the default immediately. > - This pull request adds the newest-first issue thread behavior behind an Experimental setting, updates the thread UI to match that mode, and keeps the legacy oldest-first experience unchanged by default. > - The benefit is that reviewers can opt into a more recent-first issue workflow without forcing a global behavior change on every Paperclip instance. ## What Changed - Reversed issue thread rendering so the newest comments and messages appear first when the experiment is enabled. - Moved the plain comment timestamp into the card header in newest-first mode and kept the legacy timestamp placement for oldest-first mode. - Moved the `Jump to latest` control to the bottom of the thread in newest-first mode while leaving the existing top placement for the legacy mode. - Added the `Enable Newest-First Issue Thread` experimental instance setting and wired issue detail to read that toggle. - Added regression coverage for thread order, timestamp placement, jump-button placement, and the issue-detail experiment toggle behavior. ## Verification - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - Focused checks that also passed during issue review: - `pnpm vitest run src/components/IssueChatThread.test.tsx src/pages/IssueDetail.test.tsx` in `ui/` - `pnpm vitest run src/__tests__/instance-settings-routes.test.ts` in `server/` - Manual review path: - Enable `Instance Settings > Experimental > Enable Newest-First Issue Thread` - Open an issue with comments/messages and confirm newest activity renders first, timestamps move into the header, and `Jump to latest` sits below the thread - Disable the experiment and confirm the legacy oldest-first behavior returns ## Risks - Low risk: the behavioral change is gated behind an instance-level experimental toggle and defaults off. - The main regression risk is thread navigation drift between the two modes, especially around anchor scrolling and the `Jump to latest` affordance. - There is some UI coupling between issue-detail query state and experimental settings fetches, so future changes in that area should keep both modes covered. - Screenshots are not attached in this PR body; verification is described with automated coverage and manual steps instead. > I checked [`ROADMAP.md`](ROADMAP.md). This is a scoped issue-thread UX improvement and rollout gate, not a duplicate of a roadmap-level planned core feature. ## Model Used - OpenAI Codex via the local `codex_local` Paperclip adapter, GPT-5-based coding agent with terminal tool use and local code execution in this repository worktree. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
4269545b19 |
Stabilize Cursor sandbox runtime resolution (#5446)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Cursor adapter spawns the Cursor CLI against local, SSH, and sandbox execution targets; on a fresh sandbox lease, it has to resolve where Cursor was installed > - The previous resolver only looked for `~/.local/bin/cursor-agent` even though the official installer (and the adapter's own `SANDBOX_INSTALL_COMMAND`) sometimes lays the binary down as `~/.local/bin/agent`, so a sandbox where the install ran successfully would still fail to find the CLI > - This pull request lets the resolver accept either basename and lets the caller pass an optional `remoteSystemHomeDirHint` so a probe doesn't pay the cost of a remote `printf $HOME` round-trip when the home directory is already known > - The benefit is sandboxed Cursor runs find the binary that the install actually produced, and runtime probes are cheaper when the home dir is already resolved ## What Changed - `packages/adapters/cursor-local/src/server/remote-command.ts`: accept either `agent` or `cursor-agent` as the preferred basename; new optional `remoteSystemHomeDirHint` short-circuits the home-dir probe - `packages/adapters/cursor-local/src/server/execute.ts`: thread the home-dir hint through, prefer the resolved binary path, and shift the effective execution cwd to the per-run managed subdirectory once the runtime is prepared - New `remote-command.test.ts` and `execute.test.ts` cover both basenames, the hint short-circuit, and the cwd shift - `packages/adapters/cursor-local/src/index.ts`: update doc string to reflect the broader resolution - `execute.remote.test.ts` updated to expect the managed-subdirectory cwd shape introduced by the cwd shift ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-cursor-local` — 6/6 passing - `pnpm typecheck` clean - Manual: a fresh sandbox lease with `npm install -g …`-installed Cursor (binary lands as `~/.local/bin/agent`) now runs cleanly through the adapter ## Risks Low. Resolver is strictly broader (matches a superset of paths); existing setups with `~/.local/bin/cursor-agent` continue to work. The home-dir hint is opt-in; callers that don't pass it get the existing probe behavior. Cursor's effective execution cwd now matches the rest of the adapters (per-run managed subdirectory) — sessions previously rooted at the workspace root will land in the new subdirectory. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover both basenames + hint short-circuit + cwd shift - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --- > **Stacked PR.** Sits on top of #5445 (which sits on #5444). Cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are listed under "What Changed" above. Will rebase onto `master` and force-push once the prerequisite PRs merge. |
||
|
|
fe3904f434 |
Stabilize runtime probes and Codex env tests (#5445)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters expose a Test action that probes the configured runtime —
install, resolvability, hello — to give operators a fast yes/no on
whether an environment is healthy
> - The Codex test path was running its hello probe directly without
going through the managed-runtime preparation that production runs use,
so a healthy production setup could still report a probe failure
> - The plugin worker manager wasn't surfacing terminated workers
cleanly, leaving the runtime probe waiting on a dead worker until the
request timed out
> - This pull request routes the Codex test probe through
`prepareAdapterExecutionTargetRuntime` (so it sees the same managed
Codex home production sees), exposes `commandCwd` on
`createCommandManagedRuntimeClient` so callers can target a per-probe
directory without leaking the workspace `remoteCwd`, and propagates
plugin-worker termination as a usable error instead of a hang
> - The benefit is the Codex Test action mirrors production behavior
end-to-end, and probes against a terminated plugin worker fail fast
instead of timing out
## What Changed
- `packages/adapter-utils/src/command-managed-runtime.ts`: rename the
`remoteCwd` knob to `commandCwd` so callers can target a per-probe
directory without inheriting the workspace cwd; matching test coverage
in `command-managed-runtime.test.ts`
- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
small fixes to keep callback bridge stop semantics deterministic
- `packages/adapters/codex-local/src/server/test.ts`: thread the Codex
hello probe through `prepareAdapterExecutionTargetRuntime` +
`prepareManagedCodexHome` so the probe sees the same managed home
production sees; new `test.remote.test.ts` covers the remote probe path
- `packages/adapters/cursor-local/src/server/execute.ts`: small
probe-side cleanup that aligns with the new commandCwd contract
- `server/src/services/plugin-worker-manager.ts`: surface plugin-worker
termination as a structured error so callers fail fast; new
`plugin-worker-terminated.cjs` fixture and
`plugin-worker-manager.test.ts` cases pin the behavior
## Verification
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project @paperclipai/server` —
1749/1750 passing (1 unrelated skip)
- `pnpm typecheck` clean
## Risks
Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming
on an internal helper used only by adapter test/execute paths in this
repo. The plugin-worker-terminated path was previously a hang; failing
fast may surface latent timeouts as explicit termination errors in
callers that already expected them.
## Model Used
Claude Opus 4.7 (1M context)
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
commandCwd, plugin-worker termination, and Codex remote test path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---
> **Stacked PR.** Sits on top of #5444 which adds the per-run runtime
API surface this PR builds on. Cumulative diff against `master` includes
that PR's content; the files touched by *this* PR's commit are listed
under "What Changed" above. Will rebase onto `master` and force-push
once #5444 merges.
|
||
|
|
12cb7b40fd |
Harden remote workspace sync and restore flows (#5444)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - When an agent runs against a remote target, Paperclip syncs the
workspace out to the remote at run start and restores changes back to
the local workspace at run end
> - The previous restore flow naïvely overwrote local files with
whatever the remote returned, so files that the remote run never touched
but had timestamp/mode drift could be needlessly rewritten — and a
single static `refs/paperclip/ssh-sync/imported` ref made concurrent SSH
workspace exports race on the same git ref
> - This pull request adds a `workspace-restore-merge` module that diffs
a pre-run snapshot against the post-run remote state and only writes
back files the remote actually changed; SSH workspace exports now use a
per-import unique ref so concurrent runs can't trample each other
> - Every adapter's execute path threads the snapshot through
`prepareAdapterExecutionTargetRuntime` so the merge has the baseline it
needs
> - The benefit is workspace restores no longer churn untouched files,
and concurrent SSH runs no longer collide on the import ref
## What Changed
- `packages/adapter-utils/src/workspace-restore-merge.{ts,test.ts}`: new
module — directory snapshot (kind/mode/sha256/symlink target) plus
snapshot-aware merge that writes only the files the remote changed
- `packages/adapter-utils/src/ssh.ts`: SSH workspace export uses a
per-import unique ref (`refs/paperclip/ssh-sync/imported/<uuid>`);
restore goes through the new merge helper; `ssh-fixture.test.ts` covers
the unique-ref + merge paths
- `packages/adapter-utils/src/sandbox-managed-runtime.ts` +
`remote-managed-runtime.ts`: thread the snapshot/merge through the
sandbox and SSH paths
- `packages/adapter-utils/src/server-utils.{ts,test.ts}` +
`execution-target.ts`: helpers for capturing the pre-run snapshot;
`prepareAdapterExecutionTargetRuntime` gains required `runId` and
optional `workspaceRemoteDir`, and returns the realized
`workspaceRemoteDir`
- Each adapter's `execute.ts` (acpx, claude, codex, cursor, gemini,
opencode, pi) takes the snapshot at run start and passes it through to
the runtime restore
- Remote execute test mocks updated to match the new
`prepareWorkspaceForSshExecution` return shape and the per-run
`${managedRemoteWorkspace}` cwd subdirectory
## Verification
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-acpx-local --project
@paperclipai/adapter-claude-local --project
@paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project
@paperclipai/adapter-gemini-local --project
@paperclipai/adapter-opencode-local --project
@paperclipai/adapter-pi-local` — 196/196 passing
- `pnpm typecheck` clean across the workspace
## Risks
Medium. The restore path now writes a strict subset of what it
previously did — files the remote did not touch are no longer rewritten.
If any flow was relying on a touch-without-content-change being copied
back (timestamp or permission propagation only), that behavior is now
skipped. Snapshot capture adds an O(N-files-in-workspace) hash pass at
run start; the cost is bounded by the existing exclude list. The `runId`
parameter on `prepareAdapterExecutionTargetRuntime` is now required —
every in-tree caller is updated; out-of-tree adapter authors need to
pass it.
## Model Used
Claude Opus 4.7 (1M context)
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new module +
every adapter execute path covered
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
|
||
|
|
e400315cbf |
Guard assigned backlog liveness (#5428)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The issue graph and liveness recovery system decide whether assigned work is executable or parked > - Assigned issues created without an explicit status could silently land in backlog, making parents look blocked with no productive wake path > - The server, shared validators, recovery analysis, and UI all need to agree on that execution semantic > - This pull request makes assigned issue creation default to `todo`, flags assigned backlog blockers, and surfaces the state in the board > - The benefit is that parked assigned work becomes intentional and visible instead of creating silent liveness stalls ## What Changed - Adds contract tests for assigned issue creation defaults. - Defaults assigned issue creation to `todo` when status is omitted while preserving explicit `backlog` parking. - Exposes `resolveCreateIssueStatusDefault` through shared validators. - Teaches liveness/blocker attention paths to distinguish assigned backlog blockers. - Adds UI notices, row/header badges, and issue detail safeguards for assigned backlog blockers. - Adds Storybook fixtures and execution-semantics documentation for the assigned-backlog behavior. ## Verification - `pnpm run preflight:workspace-links && pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/issue-assigned-backlog-contract-routes.test.ts server/src/__tests__/issue-blocker-attention.test.ts server/src/__tests__/issue-liveness.test.ts server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts ui/src/components/IssueAssignedBacklogNotice.test.tsx ui/src/components/IssueRow.test.tsx` — 50 passed, 23 skipped. - Skipped tests were embedded Postgres suites on this host with the repo skip message: `Postgres init script exited with code null. Please check the logs for extra info. The data directory might already exist.` - Pairwise merge check against the issue-controls PR branch completed without conflicts via `git merge --no-commit --no-ff` in a temporary worktree. - Screenshots for assigned-backlog UI states: [light](docs/pr-screenshots/pr-5428/assigned-backlog-light.png), [dark](docs/pr-screenshots/pr-5428/assigned-backlog-dark.png). - Follow-up checks: `pnpm --filter /ui typecheck`; `pnpm --filter /mcp-server build`; `pnpm --filter /mcp-server test`; `pnpm exec vitest run packages/shared/src/validators/issue.test.ts`; focused UI component tests. - Remote PR checks on head `6300b3c`: policy, verify, serialized server shards 1/4-4/4, Canary Dry Run, e2e, Greptile Review, and Snyk all passed. ## Risks - Medium: changes status defaulting for assigned issue creation when the caller omits status. Explicit `backlog` remains supported, and server/shared tests cover both paths. - Medium: liveness classification changes can affect blocker attention labels; focused service and UI tests cover the new assigned-backlog state. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled Paperclip heartbeat environment. Context window and internal reasoning mode are not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
772fc92619 |
Add issue controls and retry-now recovery (#5426)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Issue operators need clear controls for execution settings, model overrides, and recovery retries > - Existing issue properties hid useful adapter override state and did not expose a board-triggered retry for scheduled heartbeat recovery > - Scheduled retries also need to respect the same safety gates as normal execution instead of bypassing budget, review, pause, dependency, or terminal-state checks > - This pull request adds the issue property controls and retry-now surfaces together because they share the issue details/properties UI > - The benefit is that operators can inspect and adjust issue execution settings and safely trigger pending scheduled recovery without hidden control-plane behavior ## What Changed - Adds editable issue assignee model override controls in `IssueProperties`, with focused coverage. - Removes the stale workspace tasks link from issue properties. - Adds a scheduled retry `retry-now` backend path and shared response types. - Adds main-pane and properties-pane scheduled retry UI, backed by a shared `useRetryNowMutation` hook. - Adds suppression coverage for budget hard stops, review participant changes, subtree pause holds, unresolved blockers, terminal issues, and company scoping. - Updates the `IssueProperties` test harness with toast actions required by the retry-now hook. ## Verification - `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx ui/src/components/IssueScheduledRetryCard.test.tsx` — 31 passed. - `pnpm exec vitest run server/src/__tests__/issue-scheduled-retry-routes.test.ts` — exited 0, but this host skipped the embedded Postgres route tests with: `Postgres init script exited with code null. Please check the logs for extra info. The data directory might already exist.` - Pairwise merge check against the assigned-backlog PR branch completed without conflicts via `git merge --no-commit --no-ff` in a temporary worktree. ### Visual verification screenshots Storybook story: `Product/Issue Scheduled retry surfaces / ScheduledRetrySurfaces`.   ## Risks - Medium: this touches issue execution/retry behavior, so CI should run the embedded Postgres route tests on a host that can initialize Postgres. - Low-to-medium UI risk around duplicated retry-now entry points; both surfaces share one mutation hook to keep behavior consistent. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled Paperclip heartbeat environment. Context window and internal reasoning mode are not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
d0e9cc76f2 |
Show workspace changes and stale notices in issue threads (#5356)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The issue thread is the operator's durable audit trail for what changed and why > - Workspace changes and stale disposition notices need to be visible in that same timeline without noisy or misleading rendering > - The local branch already contained backend activity details, timeline conversion, and UI rendering work for those events > - This pull request isolates the issue-thread activity work into a standalone branch against `origin/master` > - The benefit is a focused audit-trail PR that can merge independently of the sidebar/operator UI polish branch ## What Changed - Adds readable workspace-change activity details to issue update activity events. - Surfaces workspace-change events in issue chat/timeline rendering. - Makes the existing issue comment migration idempotent. - Folds and renders stale disposition notices inline so they match activity-log styling and spacing. - Adds focused route, timeline, and issue-thread system notice coverage. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run server/src/__tests__/issue-activity-events-routes.test.ts ui/src/lib/issue-timeline-events.test.ts ui/src/components/IssueChatThreadSystemNotice.test.tsx` — 3 files passed, 22 tests passed. - Confirmed the PR changes 9 files and does not include `pnpm-lock.yaml` or `.github/workflows/*`. - `pnpm exec vitest run server/src/__tests__/issue-closed-workspace-routes.test.ts` — 1 file passed, 4 tests passed. - `pnpm exec vitest run server/src/__tests__/issue-activity-events-routes.test.ts ui/src/lib/issue-timeline-events.test.ts ui/src/components/IssueChatThreadSystemNotice.test.tsx server/src/services/recovery/successful-run-handoff.test.ts packages/shared/src/validators/issue.test.ts` — 5 files passed, 54 tests passed. - `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck`. - `pnpm --filter @paperclipai/ui typecheck` after adding the Storybook screenshot fixture. - Captured Storybook screenshots for the new UI rendering paths: - Collapsed stale notice + workspace-change row: `docs/pr-screenshots/pr-5356/issue-thread-notices-collapsed.png` - Expanded stale notice details: `docs/pr-screenshots/pr-5356/issue-thread-notices-expanded.png` ### Screenshots Collapsed stale notice with workspace-change row:  Expanded stale notice details:  ## Risks - Moderate risk: this touches issue activity serialization and issue-thread rendering, both of which are central operator surfaces. - Migration risk is low: the only migration change makes an existing migration idempotent. - No new migrations are introduced, so there is no cross-PR migration ordering requirement. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, shell/tool-use enabled, used to split the existing branch, verify the isolated PR branch, and create this PR. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
68f69975a4 |
Harden control-plane safety and issue identifiers (#5292)
## Thinking Path > - Paperclip relies on issue identifiers, execution policies, and agent heartbeat rules to keep autonomous work auditable. > - Safety checks need to reject ambiguous agent handoffs, and identifier parsing needs to support Cloud tenant prefixes. > - Agent instructions also need to make final-disposition rules explicit so work does not stall in vague states. > - This pull request isolates backend correctness and governance hardening from the UI and recovery-system-notice branches. > - The benefit is safer in-review transitions, better identifier compatibility, and clearer agent operating contracts. ## What Changed - Fixed run-aware confirmation ordering and interrupted-run state cleanup. - Added Cloud tenant identity bootstrap and alphanumeric issue identifier support across shared parsing and server routes. - Guarded agent-authored `in_review` updates unless a real review path exists. - Tightened heartbeat disposition instructions in adapter utilities/default AGENTS/Paperclip skill. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run packages/shared/src/issue-references.test.ts server/src/__tests__/issue-identifier-routes.test.ts server/src/__tests__/issue-execution-policy-routes.test.ts packages/adapter-utils/src/server-utils.test.ts` initially had the first execution-policy test hit Vitest's 5s timeout under the parallel bundle while the rest passed. - `pnpm exec vitest run server/src/__tests__/issue-execution-policy-routes.test.ts --testTimeout=20000` passed with 10/10 tests. - Follow-up: `pnpm run typecheck:build-gaps` passed. - Follow-up: `pnpm --filter @paperclipai/ui typecheck` passed. - Follow-up: `pnpm vitest run server/src/__tests__/issue-comment-reopen-routes.test.ts server/src/__tests__/company-portability.test.ts server/src/__tests__/costs-service.test.ts` passed. - Follow-up: `pnpm vitest run ui/src/context/LiveUpdatesProvider.test.ts ui/src/lib/issue-chat-messages.test.ts ui/src/lib/issue-reference.test.ts ui/src/lib/issue-timeline-events.test.ts` passed. ## Risks - Medium control-plane risk: in-review update validation changes agent behavior. The error message is explicit and tests cover allowed review paths. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
a1b30c9f35 |
Add planning mode for issue work (#5353)
## Thinking Path > - Paperclip is a control plane for autonomous AI companies. > - Issues are the core unit of work, and issue comments are how board users and agents coordinate execution. > - Some issue conversations need to produce plans and approvals instead of immediate implementation work. > - The existing issue contract did not distinguish standard execution comments from planning-oriented issue work. > - This pull request adds an issue work-mode contract and board UI affordances for standard vs planning mode. > - The benefit is that planning-mode issues can be created, displayed, discussed, and carried through agent heartbeat context without losing the normal issue workflow. ## What Changed - Added `standard` / `planning` issue work-mode contracts across DB, shared validators/types, server issue flows, plugin protocol, and adapter heartbeat payloads. - Added an idempotent `0081_optimal_dormammu` migration for `issues.work_mode`, ordered after current `public-gh/master` migrations. - Updated heartbeat/context summaries and issue-thread interaction behavior so planning work mode is preserved when creating suggested follow-up issues. - Added UI support for planning-mode issue creation, issue rows, detail composer styling, and composer work-mode toggles. - Added focused server/shared/UI tests plus a Playwright visual verification spec for planning-mode surfaces. - Rebased the branch onto current `public-gh/master` and added durable planning-mode screenshots under `doc/assets/pap-3368/`. ## Verification - `pnpm --filter @paperclipai/db run check:migrations` - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/issue.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-context-summary.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueChatThread.test.tsx ui/src/components/NewIssueDialog.test.tsx ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx` - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config tests/e2e/playwright.config.ts tests/e2e/planning-mode-visual-verification.spec.ts` ## Screenshots Desktop planning detail:  Desktop planning row:  Desktop staged standard toggle:  Mobile planning detail:  Mobile planning row:  ## Risks - Medium migration risk: this adds a non-null issue column. The migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied an older branch-local migration number can still apply the final numbered migration safely. - Medium contract risk: issue payloads, plugin payloads, and adapter heartbeat payloads now include work mode; compatibility is handled by defaulting missing values to `standard`. - UI risk is moderate because composer controls changed; focused component tests and visual e2e coverage exercise standard vs planning display and toggle behavior. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with shell/tool use. Exact context-window size is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
320fd5d23b |
Add full company search page (#5293)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators need to find work, documents, agents, projects, comments, and activity across a company without jumping through separate surfaces. > - The existing Command-K flow was useful for fast navigation but not enough for deeper company-wide discovery. > - Search also needs company-scoped backend contracts, query cost controls, and indexed document matching so it stays safe as company data grows. > - This pull request adds a full company search API and a dedicated board search page that Command-K can hand off to. > - The benefit is a single searchable control-plane surface with richer result context, recents, highlights, and test coverage across server and UI behavior. ## What Changed - Added a company-scoped search endpoint/service with query validation, rate limiting, text matching, fuzzy title matching, and result typing shared through `@paperclipai/shared`. - Added idempotent search migrations for document search indexes and fuzzy matching support. - Added the full `/companies/:companyKey/search` UI, search result row components, highlighted snippets, recent searches, and sidebar/Command-K handoff. - Added Storybook coverage for search surfaces and Vitest coverage for server search behavior, rate limiting, route generation, Command-K behavior, and the search page. - Addressed Greptile findings by renaming the no-match SQL helper, applying search pagination after cross-type merge sorting, and lazy-initializing the default search service so unrelated route-test mocks do not need to know about it. - Merged current `public-gh/master` and renumbered the search migrations behind upstream `0078_white_darwin`: search indexes are now `0079_company_search_document_indexes` and fuzzy matching is `0080_company_search_fuzzystrmatch`. ## Verification - `git fetch public-gh master` - `git diff --check public-gh/master...HEAD` - `git diff --name-only public-gh/master...HEAD | rg '^pnpm-lock\.yaml$' || true` produced no output before opening the PR. - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts ui/src/pages/Search.test.tsx ui/src/components/CommandPalette.test.tsx ui/src/lib/company-routes.test.ts` passed: 5 files, 25 tests. - `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck` passed. - `pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm --filter @paperclipai/server typecheck` passed after Greptile pagination fixes. - `pnpm exec vitest run server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts server/src/__tests__/company-search-service.test.ts && pnpm --filter @paperclipai/server typecheck` passed after the CI mock fix. - After resolving the migration conflict with current `public-gh/master`: `pnpm --filter @paperclipai/db typecheck && pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm --filter @paperclipai/server typecheck` passed. - DB migration numbering check passed as part of `@paperclipai/db` typecheck. - UI states are covered by the added Storybook stories in `ui/storybook/stories/search.stories.tsx`. - GitHub reports the PR merge state as `CLEAN` on head `18e54fa8`. - GitHub PR checks are green on head `18e54fa8`: policy, verify, serialized server shards 1/4 through 4/4, e2e, canary dry run, Snyk, and Greptile Review. ## Risks - Search ranking and snippets are new user-facing behavior, so reviewers should check whether result ordering feels right on real company data. - Search touches broad company data, so company scoping and query cost/rate-limit behavior should be reviewed carefully. - The migrations add search indexes/extensions; they are idempotent with `IF NOT EXISTS` for users who may have applied an earlier branch migration number. > ROADMAP.md checked. This PR adds a focused board search surface and does not duplicate an open roadmap item. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub CLI session with medium reasoning effort. Existing branch commits were produced across prior agent sessions; this packaging pass verified, opened the PR, addressed Greptile findings, resolved migration conflicts after upstream PRs landed, and got PR checks green. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
424e81d087 |
Improve operator workflow QoL (#5291)
## Thinking Path > - Paperclip is a control plane operators use repeatedly to supervise agent companies. > - Common operator workflows depend on fast scanning of inboxes, issue sidebars, workspaces, cost totals, and runtime services. > - Several small UI and service gaps made those workflows slower or less clear. > - This pull request groups the operator-facing QoL changes that can stand alone from recovery and adapter work. > - The benefit is a denser, clearer board experience for issue triage and workspace operation. ## What Changed - Added inbox assignee/project grouping and issue list token/runtime totals. - Improved issue properties with removable blocker chips and workspace task links. - Improved execution workspace layout, runtime controls, issues tab default, and stopped-port reuse behavior. - Added mobile markdown/routine dialog fixes, page title company names, sidebar polish, and dashboard run task label cleanup. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run ui/src/lib/inbox.test.ts ui/src/components/IssueProperties.test.tsx ui/src/components/WorkspaceRuntimeControls.test.tsx server/src/__tests__/workspace-runtime.test.ts server/src/__tests__/costs-service.test.ts` ## Risks - Medium UI risk because this touches several operator surfaces. The branch is intentionally grouped around workflow/QoL files and keeps the file count below the Greptile limit. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
11ffd6f2c5 |
Improve ACPX adapter configuration (#5290)
## Thinking Path > - Paperclip orchestrates AI agents across several adapter implementations. > - ACPX is a local adapter path that can proxy Claude and Codex-style execution. > - Its configuration needed stronger schema defaults, provider-aware model handling, and better UI support. > - Plugin authors also need clear docs for managed resources. > - This pull request improves ACPX adapter configuration and documents plugin-managed resources. > - The benefit is a more predictable adapter setup path without changing unrelated control-plane behavior. ## What Changed - Improved ACPX config schema, execution config handling, UI build config, and route coverage. - Added ACPX model filtering support and tests. - Updated the agent config form and storybook coverage for ACPX model/provider behavior. - Expanded plugin authoring documentation for managed resources. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run server/src/__tests__/acpx-local-execute.test.ts server/src/__tests__/adapter-routes.test.ts ui/src/lib/acpx-model-filter.test.ts` ## Risks - Low-to-medium risk: adapter configuration behavior changes can affect ACPX users, but the change is isolated to ACPX/plugin-doc surfaces and covered by targeted adapter tests. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
454edfe81e |
Add recovery handoff system notices (#5289)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Agent runs can end productively while the source issue still lacks a durable final disposition. > - That leaves the control plane unsure whether to resume, escalate, or close the work. > - Issue comments also need a presentation contract so system-authored recovery notices can render as first-class thread messages without overloading normal comments. > - This pull request adds successful-run handoff recovery, comment presentation metadata, and system notice rendering. > - The benefit is stricter task liveness with clearer operator-facing recovery state. ## What Changed - Added successful-run handoff decisions, wake payloads, escalation behavior, and recovery tests. - Added issue comment presentation metadata with migration `0078_white_darwin.sql` and shared/server/company portability support. - Rendered recovery/system notices in issue chat with dedicated UI components, fixtures, tests, and storybook/lab coverage. - Included the current recovery model-profile hint patch so automatic recovery follow-ups use the cheap profile. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run server/src/services/recovery/successful-run-handoff.test.ts ui/src/components/SystemNotice.test.tsx ui/src/lib/system-notice-comment.test.ts ui/src/components/IssueChatThreadSystemNotice.test.tsx` ## Risks - Migration-bearing PR: merge this before any other branch that might later add a migration. - The branch touches both recovery services and issue-thread rendering, so review should pay attention to recovery wake idempotency and comment metadata compatibility. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
50db8c01d2 |
Serialize sandbox callback bridge against concurrent heartbeats (#5326)
> **Stacked PR.** This PR's branch carries cumulative content from #5324 (bridge allowlist expand) and #5325 (env sanitization) — the mutex/sha256 logic in this PR sits on top of both. Reviewers should focus on the files this PR's commit touches: `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`, `packages/adapter-utils/src/ssh.ts`, and `packages/adapter-utils/src/ssh-fixture.test.ts`. Will rebase onto `master` and force-push once both prerequisite PRs are merged. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent that runs in a sandbox or via SSH talks back to the Paperclip server through a per-lease callback bridge whose entrypoint script is uploaded to the remote > - When two heartbeats target the same agent on the same machine concurrently, both upload the bridge entrypoint and both write to the same response files — producing torn-write races: `SyntaxError: Identifier 'randomUUID' has already been declared` from a concatenated upload, `mv: cannot stat …` from colliding `.json.tmp` writes, and 0-byte commits from a truncated stdin > - This pull request serializes those operations with a POSIX `mkdir`-mutex (PID liveness check + atomic rename) at the bridge entrypoint upload, applies the same lock to the bridge response writer, forwards stdin into remote ssh commands so the entrypoint payload arrives intact, and verifies a sha256 of the upload before promoting it > - The benefit is concurrent heartbeats no longer corrupt each other's bridge state ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: serialize entrypoint upload and response writes via POSIX `mkdir`-mutex with PID liveness; sha256 the upload before promoting via `mv`; content-skip when the existing entrypoint already matches - `packages/adapter-utils/src/ssh.ts`: forward stdin into remote ssh commands through the SSH managed runtime so `cat > "$remote_upload"` actually receives the base64-encoded entrypoint - `packages/adapter-utils/src/ssh-fixture.test.ts`: cover the stdin-forwarded SSH path - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: cover the mutex, content-skip, sha256-verify, and atomic-rename paths ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: two parallel heartbeats targeting the same SSH agent no longer race on the bridge entrypoint or response files ## Risks Medium. Serializing previously-parallel operations adds latency on the contended path (one heartbeat waits on another), bounded by the entrypoint upload time. The mutex includes PID liveness so a crashed heartbeat doesn't deadlock subsequent ones. Sha256-verify gives a clear "torn upload" failure mode instead of silent 0-byte commits. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — tests cover mutex + sha256-verify + stdin-forwarded ssh - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
f6bad8f6bf |
Sanitize remote execution envs at the boundary (#5325)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Adapters spawn CLIs against local, SSH, and sandbox targets, threading a runtime env through `runAdapterExecutionTargetProcess` and the SSH/sandbox runners > - Host identity vars (HOME, TMPDIR, XDG_*, NVM_DIR, PATH) routinely leak into the env we send to remote targets — sometimes via test probes, sometimes via runtime config — and break sandboxed/SSH'd CLIs whose own profiles set those values correctly > - The sanitization logic existed but lived alongside other helpers in `server-utils.ts` and was applied piecemeal at adapter callsites, so it was easy to bypass > - This pull request lifts the sanitization into a standalone `remote-execution-env.ts`, applies it at the SSH and sandbox runtime boundary so every remote spawn goes through it, and removes the duplicated callsite-level filtering > - The benefit is identity-bound host env stops leaking across SSH/sandbox transports regardless of which adapter calls in ## What Changed - `packages/adapter-utils/src/remote-execution-env.ts`: new module — single source of truth for which env keys are identity-bound and how to strip them when the value matches the host's value - `packages/adapter-utils/src/server-utils.ts`: remove the inline sanitization (now in `remote-execution-env.ts`) - `packages/adapter-utils/src/execution-target.ts`: apply sanitization at the sandbox runtime boundary - `packages/adapter-utils/src/ssh.ts`: apply sanitization at the SSH spawn boundary - `packages/adapters/opencode-local/src/server/test.ts`: drop now-redundant callsite filtering - `packages/adapters/pi-local/src/server/test.ts`: drop now-redundant callsite filtering - New tests `execution-target.test.ts` and `execution-target-sandbox.test.ts` cover the sanitizer flow at both transports, including positive cases (host-shaped path stripped) and explicit-override preservation ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils --project @paperclipai/adapter-opencode-local --project @paperclipai/adapter-pi-local` - `pnpm typecheck` clean ## Risks Low–medium. The sanitization is now applied at one layer (boundary) instead of N (callsites), so behavior is more consistent. Any adapter that previously relied on a leaked host var landing on the remote shell would now see it stripped — but those reliances were what this change exists to fix. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests at both transports - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
36eaf9778f |
Expand sandbox callback bridge allowlist to cover the documented heartbeat surface (#5324)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When an agent runs in an e2b sandbox or other non-managed environment, it talks back to the Paperclip server through a per-lease callback bridge that proxies HTTP requests > - The bridge has an allowlist of method/path patterns it will forward; anything outside the list is rejected to keep the bridge tight > - The allowlist had drifted behind what the heartbeat documentation describes as the supported callback surface — several documented endpoints (issue updates, agent-side log emit, work-status writes) were being rejected at the bridge > - This pull request expands the allowlist to cover the documented heartbeat surface and adds tests that pin every newly-allowed pattern, so the doc and the bridge stay in sync > - The benefit is sandboxed runs no longer hit "method not allowed" / "path not allowed" rejections on the documented set of callbacks ## What Changed - `packages/adapter-utils/src/sandbox-callback-bridge.ts`: expand the method/path allowlist to match the documented heartbeat callback surface - `packages/adapter-utils/src/sandbox-callback-bridge.test.ts`: add coverage for every newly-allowed pattern, plus negative cases for patterns that should still be rejected ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` - `pnpm typecheck` clean - Manual: previously-rejected callbacks from sandboxed runs now succeed end-to-end ## Risks Low. The allowlist only grows; nothing previously allowed is now blocked. Tests pin both the new allowed patterns and that out-of-doc patterns stay rejected. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover added patterns + still-rejected negatives - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
9fb0c73e0a |
Raise gemini-local hello probe timeout to 60s for SSH and E2B targets (#5322)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Gemini adapter's environment Test surfaces a hello probe so operators can confirm the CLI runs end-to-end on the configured target > - On SSH and E2B sandbox targets the round-trip cost (login-shell sourcing, network, model warm-up) routinely exceeds the existing 10s probe timeout, so the probe spuriously fails on environments that are actually healthy > - This pull request raises the gemini-local hello probe timeout to 60s, matching the timeout we use for slower-bootstrapping adapters > - The benefit is the Gemini Test action no longer reports false negatives on remote targets that need a longer first-run window ## What Changed - `packages/adapters/gemini-local/src/server/test.ts`: hello probe timeout raised from 10s to 60s ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-gemini-local` - Manual: SSH and E2B Gemini hello probes now complete cleanly without spurious timeouts ## Risks Low. A 60s ceiling on a non-blocking probe is consistent with sibling adapters; the only behavior change is a longer worst-case wait when the probe genuinely hangs. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — N/A (one-line timeout change) - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
d6d7a7cea6 |
Add routine revision history and restore flow (#5285)
## Thinking Path > - Paperclip is the control plane for autonomous AI companies. > - Routines are the scheduled/recurring work surface that keeps a company operating without manual kicks. > - Operators need routine edits to be auditable and recoverable, especially when routines control assignments, prompts, triggers, and webhook secrets. > - Documents already have revision-style safety, but routines did not have equivalent history or restore semantics. > - This pull request adds append-only routine revisions across the database, shared contracts, server routes, and board UI. > - The benefit is safer routine iteration: users can inspect history, compare changes, restore older definitions, and avoid overwriting newer edits. ## What Changed - Added `routine_revisions` storage, latest revision pointers on routines, shared types, validators, and API docs for routine revision history. - Added server service/route support for listing routine revisions, conflict-aware routine saves, and append-only restore operations. - Added a History tab on routine detail with revision preview, structured change summaries, description line diffs, dirty-edit blocking, restore confirmation, and restored webhook secret surfacing. - Extracted the line diff helper from `DocumentDiffModal` into `ui/src/lib/line-diff.ts` for reuse. - Rebased the branch onto current `public-gh/master` and renumbered the routine revision migration to `0077_unusual_karnak` after upstream `0076_useful_elektra`. - Made the `0077` routine revision migration idempotent so installs that already applied the branch-local `0076_unusual_karnak` can safely advance. - Updated the plugin SDK test harness routine fixture with the new revision fields required by the shared `Routine` contract. ## Verification - `pnpm --filter @paperclipai/db run check:migrations` passed. - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/routine.test.ts` passed. - `pnpm exec vitest run --project @paperclipai/ui ui/src/lib/line-diff.test.ts ui/src/components/RoutineHistoryTab.test.tsx ui/src/lib/workspace-routines.test.ts ui/src/pages/Routines.test.tsx` passed. - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/routines-service.test.ts --pool=forks --poolOptions.forks.isolate=true` passed. - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/routines-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` passed. - `pnpm --filter @paperclipai/plugin-sdk typecheck` passed after updating the SDK test harness fixture. - `pnpm --filter @paperclipai/plugin-sdk build` passed; this refreshed local generated SDK output needed by plugin example typechecks. - `pnpm -r typecheck` passed. ## Risks - Medium migration risk: this adds routine revision storage and backfills existing routines. The migration is ordered after upstream `0076` and uses `IF NOT EXISTS` / duplicate-object guards to tolerate earlier branch-local migration application. - Restore behavior intentionally appends a new revision instead of mutating history; callers expecting an in-place rollback need to follow the new latest revision pointer. - Restoring webhook triggers recreates webhook secret material, so users must copy newly surfaced secrets after restore. - Conflict-aware saves now reject stale routine edits when the client sends an older `baseRevisionId`. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, with shell/tool use in a local git worktree. Exact context-window size is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Screenshots: not attached in this draft PR; the new UI flow is covered by component tests listed above. --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
9578dc3da7 |
Wire per-adapter sandbox install commands through test and execute paths (#5280)
> **Stacked PR.** Sits on top of the e2b sandbox chain — #5278 (stdin staging) and #5279 (honest-resolvability + login-profiles). The cumulative diff against `master` includes both of those PRs' content; the files touched by *this* PR's commit are the new `maybeRunSandboxInstallCommand` helper in `packages/adapter-utils/src/execution-target.ts` and the per-adapter `index.ts`/`server/test.ts`/`server/execute.ts` wiring under `packages/adapters/{claude,codex,cursor,gemini,opencode,pi}-local/`. The honest resolvability check from #5279 is what gives this PR's install command a meaningful "did it actually land on PATH" follow-up. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox execution targets are ephemeral — each fresh lease starts from a template image that may or may not have the agent CLIs preinstalled > - When a CLI isn't preinstalled, the resolvability probe fails at `command -v` and the hello probe never runs > - There's no shared mechanism for "before you probe or provision, install the CLI on this sandbox" > - This pull request adds a `SANDBOX_INSTALL_COMMAND` constant per adapter and a `maybeRunSandboxInstallCommand` helper that runs it via the existing sandbox login shell, captures structured output, and never throws (so the resolvability + hello probe still run after); each adapter's `test()` and `execute()` share the constant so the two callsites can't drift > - The benefit is a fresh sandbox lease without a preinstalled CLI now installs it once via `sh -lc` before the resolvability probe and before managed-runtime provisioning, with a uniform `<adapter>_install_command_run` check on the test report ## What Changed - `packages/adapter-utils/src/execution-target.ts`: add `AdapterSandboxInstallCommandCheck` and `maybeRunSandboxInstallCommand` (runs the install via existing sandbox shell, captures exit/stdout/stderr, returns a structured info/warn check, never throws) - Add `SANDBOX_INSTALL_COMMAND` to each adapter's `index.ts` so `test()` and `execute()` share a single source of truth - Wire each of the 6 affected adapter `testEnvironment()`s to call `maybeRunSandboxInstallCommand` before `ensureAdapterExecutionTargetCommandResolvable` - Pass `installCommand: SANDBOX_INSTALL_COMMAND` through `prepareAdapterExecutionTargetRuntime` in each adapter's `execute()` - Per-adapter install commands use npm globals where possible so binaries land on a PATH segment the template already exports: - claude → `npm install -g @anthropic-ai/claude-code` - codex → `npm install -g @openai/codex` - cursor → `curl https://cursor.com/install -fsS | bash` - gemini → `npm install -g @google/gemini-cli` - opencode → `npm install -g opencode-ai` - pi → `npm install -g @mariozechner/pi-coding-agent` SSH and local targets ignore `installCommand` (SSH runtime takes no such param; local short-circuits before runtime prep), so this is a no-op for non-sandbox environments. ## Verification - `pnpm typecheck` clean - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` and per-adapter projects pass - Manual sandbox matrix (claude, codex, cursor, gemini, opencode, pi) — each goes `install_command_run → resolvable → hello_probe_passed` (Codex and Pi land on `hello_probe_auth_required`, which is the configured-credentials problem, not an install issue) - SSH no-regression: SSH Claude still passes; the helper short-circuits on non-sandbox targets ## Risks Medium — adds a network/CPU cost (npm install / curl) on every fresh sandbox lease. Cost is bounded (one-time per lease, typically tens of seconds for npm globals), and the helper never throws so a failing install still lets the report run resolvability and hello probes. If a sandbox image already has the CLI, the install is an idempotent reinstall. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
af9386f879 |
Run a real command-v probe and source login profiles before exec in e2b sandboxes (#5279)
> **Stacked PR.** Sits on top of #5278 (`e2b/stage-stdin-to-temp-file`) which ships the stdin-staging fix this builds on. The cumulative diff against `master` includes that PR's content; the files touched by *this* PR's commit are `packages/adapter-utils/src/execution-target.ts`, `packages/plugins/sandbox-providers/e2b/src/plugin.ts`, and `packages/plugins/sandbox-providers/e2b/src/plugin.test.ts`. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The adapter Test flow does an "is the command resolvable?" probe before running the hello probe so the report distinguishes "binary not installed" from "binary errored" > - For sandbox targets, that resolvability check was a no-op early-return — every sandboxed adapter test reported "Command is executable" regardless of whether the binary existed > - That made the resolvability check disagree with the hello probe in a way that looked like a PATH bug, when it was actually a missing CLI > - Separately, the e2b spawn used `sandbox.commands.run` with a non-login non-interactive shell whose PATH did not include npm-globals, nvm shims, or anything else the template installs via `.profile`/`.bashrc` > - This pull request makes the resolvability check honest by running a real `command -v` invocation through the sandbox runner, and aligns the e2b spawn with SSH by sourcing login profiles before `exec env KEY=val <cmd>` > - The benefit is the e2b sandbox spawn agrees with the hello probe and finds CLIs at template-installed paths ## What Changed - `packages/adapter-utils/src/execution-target.ts`: add `ensureSandboxCommandResolvable` that runs `command -v <cli>` through the sandbox runner; replace the early-return in `ensureAdapterExecutionTargetCommandResolvable` for sandbox targets - `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: replace `buildCommandLine` with `buildLoginShellScript` (sources `/etc/profile`, `~/.profile`, `~/.bash_profile`, `~/.bashrc`, `~/.zprofile`, and nvm.sh before `exec env KEY=val <cmd>`); env vars are interpolated inline so user-configured adapter env always wins over profile-exported values; drop the now-unused `envs:` SDK option - `plugin.test.ts` updated for the login-shell wrapping ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/sandbox-e2b` — 17/17 plugin tests pass - `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils` clean - `pnpm typecheck` clean - Manual: previously every sandboxed adapter said "Command is executable" then the hello probe failed with "exec: not found". After this change, missing CLIs surface honestly at the resolvability step. SSH no-regression: SSH Claude probe still passes. ## Risks Medium — sandbox adapter Test reports will start failing at the resolvability step for environments where the CLI was never actually installed. This was always the real state; the previous "Command is executable" message was incorrect. Operators should expect previously-green-but-broken sandbox environments to report accurately. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — `plugin.test.ts` updated for the login-shell wrapping - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
cb6af7c2cc |
Stage stdin to a temp file so the e2b sandbox executor delivers it reliably (#5278)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - The e2b sandbox provider implements `onEnvironmentExecute` so
adapters can spawn CLIs in an e2b sandbox
> - For commands that need stdin (e.g. piping a hello prompt to a CLI),
the previous implementation awaited a foreground `commands.run({ stdin:
true, ... })` and then tried to call `sendStdin(pid)` on the now-dead
PID
> - That call resolves only after the process exits, so stdin was never
delivered and e2b raised "process not found"
> - This pull request stages stdin to `/tmp/paperclip-stdin-<uuid>`
inside the sandbox and shell-redirects it (`exec '<cmd>' '<args>' <
'<file>'`), making the command synchronous regardless of whether stdin
is supplied
> - The benefit is adapter Test probes that pipe a hello prompt to a CLI
inside an e2b sandbox now actually deliver the prompt
## What Changed
- `packages/plugins/sandbox-providers/e2b/src/plugin.ts`: replace the
broken async `commands.run` + `sendStdin` flow with stdin-staging to a
sandbox temp file and shell-redirection
- Staged file is removed in a `finally` block; write failures propagate
after best-effort cleanup
## Verification
- `pnpm vitest run --no-coverage --project @paperclipai/sandbox-e2b` —
all 17 unit tests pass
- `pnpm typecheck` clean
- Manual: a sandboxed adapter Test probe that pipes a hello prompt now
receives the prompt
## Risks
Low risk — `plugin.test.ts` already encodes the temp-file design; the
change brings the implementation in line with the test.
## Model Used
Claude Opus 4.7 (1M context)
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — existing tests
already encode the new design
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
|
||
|
|
9042b8d042 |
Write apikey-mode auth.json so Codex CLI 0.122+ can authenticate via OPENAI_API_KEY (#5276)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The Codex adapter spawns the OpenAI Codex CLI to drive the model > - Codex CLI 0.122 changed how it reads credentials: it ignores `OPENAI_API_KEY` from the environment and reads only `$CODEX_HOME/auth.json` > - Without auth.json, Codex 0.122+ returns 401 "Missing bearer or basic authentication" on `/v1/responses` even when `OPENAI_API_KEY` is forwarded into the sandbox or remote shell > - This pull request materializes an apikey-mode `auth.json` in the managed Codex home (or per-run for the test probe) when an `OPENAI_API_KEY` is configured > - The benefit is configured Codex API keys authenticate correctly with current Codex CLI versions across local, SSH, and sandbox targets ## What Changed - `codex-home.ts`: add `writeApiKeyAuthJson()` and let `prepareManagedCodexHome` accept an `apiKey` override that replaces the symlinked host auth.json with an apikey-mode file - `execute.ts`: pass `envConfig.OPENAI_API_KEY` into `prepareManagedCodexHome` so the managed (and synced-to-remote) Codex home authenticates via the configured key - `test.ts`: when `OPENAI_API_KEY` is available, wrap the hello probe with a small shell that materializes a per-run `$CODEX_HOME/auth.json` before exec'ing codex; key content rides through env to avoid leaking into process listings - Update the `codex_hello_probe_auth_required` hint to explain Codex CLI does not read `OPENAI_API_KEY` from env ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-codex-local` - `pnpm typecheck` clean - Manual: Codex 0.122.0 with empty `CODEX_HOME` returns 401 with env-only auth; with this change it authenticates cleanly ## Risks Low risk — when no API key is configured, behavior is unchanged (no auth.json written, existing chatgpt-mode flow preserved). Apikey-mode `auth.json` is the upstream-supported format. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |