forked from farhoodlabs/paperclip
ab8b471685bf17efd7d170cd43187c0318dcdeaa
903 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ab8b471685 |
Add built-in grok_local adapter (#6087)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so adapter quality directly affects what runtimes the control plane can supervise. > - Local CLI adapters are one of the core execution surfaces because they turn real coding tools into Paperclip-managed employees with heartbeats, transcripts, and reviewability. > - Grok Build was installed on the Paperclip host, but Paperclip had no built-in `grok_local` adapter, so the runtime could not be configured through the normal server/UI/CLI adapter path. > - That gap needed to be closed with the same built-in registry, environment diagnostics, transcript parsing, and skill/instructions behavior that the other local adapters already rely on. > - After the initial adapter landed, a real follow-up run showed that Grok streaming text was being rendered one fragment per line, which made transcripts harder to read even though the runtime itself was working. > - This pull request adds the built-in `grok_local` adapter end-to-end and then fixes the transcript parser so streamed Grok output is coalesced into readable assistant/thinking blocks. > - The benefit is that Grok Build becomes a first-class Paperclip runtime with a usable operator experience instead of a partially wired runtime with noisy transcript output. ## What Changed - Added a new built-in `@paperclipai/adapter-grok-local` package with server, UI, and CLI entrypoints. - Implemented Grok execution, session handling, environment diagnostics, config building, skill syncing, and parser coverage inside the new adapter package. - Registered `grok_local` across the built-in adapter inventories and capability/display metadata in server, UI, CLI, and shared constants. - Added adapter route coverage for the new built-in type. - Fixed Grok transcript readability by emitting streamed `text` and `thought` fragments as deltas so the shared transcript builder coalesces them into readable message blocks. - Added regression tests for the Grok parser and transcript coalescing behavior. ## Verification - `pnpm vitest run packages/adapters/grok-local/src/ui/parse-stdout.test.ts ui/src/adapters/transcript.test.ts` - `pnpm --filter @paperclipai/adapter-grok-local build` - Manual runtime verification on the Paperclip host during implementation and follow-up review: - confirmed the Grok CLI was installed and authenticated - confirmed the worktree dev server could be restarted cleanly and health-checked after the parser follow-up - No screenshots attached. This change is primarily adapter plumbing plus transcript formatting behavior; reviewers can verify via the Grok-backed run surfaces directly. ## Risks - This adds a new built-in adapter, so any missed registration surface could create inconsistencies between server, UI, and CLI behavior. - The adapter depends on Grok Build's current event/output shape; if upstream Grok streaming JSON changes, transcript parsing or session extraction may need follow-up updates. - The transcript readability fix intentionally changes how Grok fragments are grouped, so any downstream code that implicitly expected one entry per fragment would behave differently. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime. - GPT-5-class coding model with tool use, shell execution, file editing, and repo inspection enabled. - Exact backend model ID/context window were not surfaced to the agent in this Paperclip session. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
4c47eb46c3 |
[codex] Add multilingual issue preservation coverage (#6069)
## Thinking Path > - Paperclip orchestrates AI agents for autonomous companies. > - Agents and board operators coordinate through company-scoped issues, comments, documents, and heartbeat wake payloads. > - Chinese, Japanese, and Hindi text needs to survive the full issue lifecycle without normalization or prompt serialization damage. > - The riskiest paths are board issue creation, server issue/comment/document round-tripping, and scoped wake prompt rendering. > - This pull request adds focused regression coverage across those surfaces. > - The benefit is higher confidence that multilingual operators and agents can create, search, comment on, complete, and wake on issues using non-Latin text. ## What Changed - Added adapter-utils wake payload and prompt rendering coverage for Chinese, Japanese, and Hindi issue/comment text. - Added UI New Issue dialog coverage proving multilingual title and description text is submitted unchanged. - Added server route coverage that round-trips multilingual issue text through create, search, comments, documents, completion comments, and heartbeat context. - Addressed Greptile feedback by using a typed storage mock and splitting the server route integration path into smaller ordered assertions. ## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts ui/src/components/NewIssueDialog.test.tsx server/src/__tests__/multilingual-issues-routes.test.ts` - Result: 3 test files passed, 51 tests passed. ## Risks - Low risk: this PR adds regression coverage only and does not change runtime behavior. - The new server test uses embedded Postgres support and skips on unsupported hosts using the existing helper pattern. - No migrations are included. - No `pnpm-lock.yaml` changes are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected - check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 based coding agent, with shell, git, Vitest, and GitHub connector/CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
eb38b226c2 |
Fix LLM Wiki package and migration validation (#6010)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Plugins extend the control plane with optional capabilities such as LLM Wiki. > - LLM Wiki needs its package assets and plugin-owned database migrations to work when installed from the packaged plugin. > - The bundled spaces migration used validation-hostile dynamic SQL, and the packaged plugin could omit non-dist runtime assets. > - This pull request makes the LLM Wiki package include its required assets and cuts the spaces migration over to explicit, idempotent SQL that passes the production plugin database validator. > - The benefit is a simpler plugin install path that validates and applies the bundled LLM Wiki migrations without adding plugin-specific legacy handling to Paperclip core. ## What Changed - Added the LLM Wiki package asset allowlist so agents, migrations, skills, templates, dist output, and README are included when packaged. - Renamed the bootstrap `.gitignore` template to `gitignore.template` and updated the runtime lookup so package tooling does not drop the hidden template file. - Relaxed plugin migration validation to allow namespace-scoped `INSERT`/`UPDATE` backfills and `CREATE INDEX` statements while continuing to reject destructive or cross-namespace SQL. - Replaced the LLM Wiki spaces migration's dynamic constraint-drop DO block with explicit `DROP CONSTRAINT IF EXISTS` statements. - Replaced fragile regex-source dispatch in SQL reference extraction with explicit capture-group descriptors. - Added regression coverage that applies the bundled LLM Wiki migrations through the production validator and checks the expected constraints. ## Verification - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-database.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm --filter @paperclipai/plugin-llm-wiki build` - `git diff --check` - Confirmed `pnpm-lock.yaml` is not included in the branch diff. ## Risks - Low migration risk for current users: LLM Wiki spaces are new, so this intentionally cuts over the plugin migration instead of adding legacy handling in core. - Validator behavior is broader than before, but still requires fully qualified plugin namespace targets, blocks deletes/destructive DDL, and keeps public table access read-only and allowlisted. > Checked [`ROADMAP.md`](ROADMAP.md); this is a targeted plugin packaging/migration fix and does not duplicate planned core feature work. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 based coding agent, tool-enabled local repo access, reasoning mode managed by the Paperclip/Codex runtime. Exact context window was not surfaced in this session. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
03ad5c5bea |
[codex] Add issue document locking (#6009)
## Thinking Path > - Paperclip orchestrates AI-agent companies through company-scoped issues, comments, and issue documents. > - Issue documents are the durable place where plans, handoffs, and other work artifacts are revised over time. > - Some documents need to be preserved as operator-approved snapshots while agents continue working on the same issue. > - Without document locking, a later board or agent write can overwrite the document key that reviewers expected to remain stable. > - This pull request adds board-managed issue document locks and makes agent writes to locked keys create a derived document instead of mutating the locked document. > - The benefit is safer document handoffs: approved or frozen issue documents stay immutable until the board explicitly unlocks them. ## What Changed - Added `locked_at`, `locked_by_agent_id`, and `locked_by_user_id` document fields plus migration `0085_tranquil_the_executioner.sql`. - Added document lock/unlock service behavior, route endpoints, activity events, and locked-document write protections. - Made agent document writes to locked keys create a new derived key such as `plan-2` rather than overwriting the locked document. - Surfaced lock state through shared issue document types, UI API methods, document header lock controls, and activity formatting. - Added server and UI tests for lock/unlock behavior, locked document immutability, and UI action visibility. - Updated `doc/SPEC-implementation.md` with the V1 document lock contract and endpoints. ## Verification - `git rebase public-gh/master` completed cleanly after committing the branch changes. - `git diff --check` passed before commit. - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/documents-service.test.ts server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts ui/src/components/IssueDocumentsSection.test.tsx ui/src/components/IssueContinuationHandoff.test.tsx ui/src/lib/document-revisions.test.ts` passed: 5 files, 32 tests. ## Risks - Medium risk because this changes the document persistence contract and adds a migration. - The migration uses `ADD COLUMN IF NOT EXISTS` and guarded foreign-key creation so it remains safe for users who may have already applied an earlier copy of the migration. - Locked documents intentionally reject board edits/deletes/restores until unlocked; any existing workflows that expected direct overwrite need to unlock first. - Agent writes to locked keys now create derived documents, which may create extra issue documents when agents retry locked writes. ## Model Used - OpenAI Codex coding agent based on GPT-5, with tool use and local code execution in the Paperclip worktree. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
901c088e14 |
fix: propagate projectId into wakeup context and support identifier lookup (#6026)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The server's heartbeat/wakeup pipeline resolves which project workspace an agent run should bind to > - `enqueueWakeup` resolves an issue (and therefore a project) before scheduling a run, but the resolved `projectId` was never written back into `enrichedContextSnapshot.projectId`, so `resolveWorkspaceForRun` always saw `contextProjectId === null` > - When the `issueProjectRef` DB lookup also returned null (e.g. identifier-style id like `ENV-13`, not a UUID), workspace resolution fell through to the `agent_home` fallback instead of the correct project workspace > - Surfaced while running the QA matrix on sandbox/SSH — runs were ending up in the wrong workspace > - This pull request stores the resolved `projectId` back into context and replaces the raw UUID-only DB query with `issuesSvc.getById`, which accepts both UUIDs and identifiers and canonicalizes `context.issueId` / `context.taskId` to the UUID on identifier hits > - The benefit is that wakeups triggered with identifier-style ids correctly bind to their project workspace instead of silently degrading to `agent_home` ## What Changed - In `enqueueWakeup`, after the issue resolves, write `projectId` back into `enrichedContextSnapshot.projectId` so downstream workspace resolution can use it. - Replace the raw UUID-only DB query for the issue with `issuesSvc.getById`, which handles both UUIDs and identifiers (e.g. `ENV-13`). - On an identifier hit, canonicalize `context.issueId` and `context.taskId` to the resolved UUID. ## Verification - Trigger a wakeup with an identifier-style id (`ENV-13`) on the dev instance and confirm the run binds to the correct project workspace instead of `agent_home`. - Confirm UUID-style wakeups still resolve to the same project workspace as before. ## Risks - Low risk. Scope is a single function in `server/src/services/heartbeat.ts` (+20/-7). Failure mode if regressed is the prior behavior (fallback to `agent_home`). ## Model Used - Claude (Anthropic), `claude-opus-4-7`, via Claude Code / Paperclip `claude_local` adapter. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [ ] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
333a16b035 |
Fix company export with missing run logs (#5960)
## Thinking Path > - Paperclip is the control plane for autonomous AI companies. > - Company export/import lets operators move company state, including issue threads and agent execution context, between Paperclip instances. > - Issue comments can be enriched by nearby heartbeat run logs so exported threads preserve useful agent/run attribution metadata. > - Some local instances can have heartbeat run database rows whose local log files were deleted or never copied into the current workspace. > - The export path should still include the original user comments instead of failing because optional run-log metadata is unavailable. > - This pull request makes comment run-log metadata derivation tolerate missing local log files, logs the missing-file condition for operators, and adds a regression test. > - The benefit is safer company exports for real instances with incomplete local run-log storage. ## What Changed - Treat missing local heartbeat run logs as absent optional metadata while listing issue comments. - Emit a structured warning with `runId` and `logRef` when optional comment-attribution log content is missing. - Preserve the existing error behavior for non-404 run-log read failures. - Added a regression test proving user comments still list when a candidate attribution run has a missing local log reference. ## Verification - `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t "candidate attribution run log is missing"` passed: 1 selected test passed, 47 skipped. - `pnpm --filter @paperclipai/server typecheck` passed. - Greptile Review passed with Confidence Score 5/5 and zero unresolved threads on commit `f68cac02bf98d7d31e7831e5bdfa95cffa85e254`. - GitHub PR workflow run succeeded: `policy`, `verify`, four serialized server suites, `e2e`, and `Canary Dry Run` all passed. - `security/snyk (cryppadotta)` passed. - Confirmed this branch is on top of `public-gh/master` and `pnpm-lock.yaml` is not in the PR diff. ## Risks - Low risk. The change only softens optional comment metadata derivation for 404/missing local log files; other log read errors still throw. - Exported comments in this edge case may lack derived run metadata, but they remain visible/exportable instead of failing the request. - Operators may see new warnings when historical run-log references point to missing local files; those warnings indicate degraded optional metadata, not data loss. ## Model Used - OpenAI Codex, GPT-5 coding agent in this Paperclip heartbeat, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
1bd44c8a0d |
Harden Cloudflare sandbox execution (#5967)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Remote-managed adapters need sandbox/environment execution to behave like real agent runs, not just local host probes. > - The Cloudflare sandbox path was the weakest leg in the SSH + Cloudflare QA matrix because bridge execution could truncate output, time out long-running installs, and under-provision the worker instance. > - That made several adapters fail for reasons unrelated to their actual business logic, which blocks confidence in Paperclip's non-local environment model. > - This pull request hardens the Cloudflare bridge/runtime path and adjusts sandbox probe budgets so adapter verification matches the measured behavior of the fixed environment. > - It also corrects the Pi sandbox install command so the QA matrix exercises a real, supported install path. > - The benefit is a materially more reliable SSH + Cloudflare adapter matrix with fewer false negatives and clearer failure boundaries. ## What Changed - Switched the Cloudflare bridge worker instance type to `standard-2` for the QA-matrix execution path. - Raised Cloudflare bridge/plugin-worker timeout budgets and added SSE keepalives so long-running install/exec calls can complete instead of dying at the transport layer. - Fixed Cloudflare bridge-channel command handling to avoid dropped final stdout chunks on short-lived execs. - Made Claude, OpenCode, and Cursor sandbox probe timeouts configurable/sandbox-aware, then tightened the defaults to the measured post-fix range. - Updated the Pi sandbox install command to use the package currently installed by the official `pi.dev` installer, pinned to a specific npm version. - Added/updated tests around Cloudflare bridge behavior and adapter sandbox probe paths. ## Verification - `pnpm --filter @paperclipai/adapter-claude-local typecheck` - `pnpm --filter @paperclipai/adapter-opencode-local typecheck` - `pnpm --filter @paperclipai/adapter-cursor-local typecheck` - `pnpm vitest run packages/adapters/cursor-local packages/adapters/claude-local packages/adapters/opencode-local packages/adapters/pi-local packages/plugins/sandbox-providers/cloudflare server/src/services/__tests__/plugin-worker-manager.test.ts` - Manual QA on the dedicated dev instance using the SSH + Cloudflare environment matrix (`ENV-29` through `ENV-40`). Clean end-to-end passes: SSH `claude_local`, `codex_local`, `cursor`, `gemini_local`; Cloudflare `claude_local`, `codex_local`, `cursor`, `gemini_local`. ## Risks - Cloudflare sandbox cost increases because the bridge worker now runs on `standard-2` instead of `lite`. - Higher timeout ceilings can delay surfacing truly hung Cloudflare bridge calls, even though they remove transport-level false negatives. - The manual heartbeat matrix still exposed follow-on execution/sync/disposition bugs in `opencode_local` and `pi_local`; those are not fixed by this PR. ## Model Used - OpenAI `gpt-5.4` via Paperclip `codex_local`, reasoning effort `high`, tool use enabled, repo search enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (not applicable) - [x] I have updated relevant documentation to reflect my changes (not applicable) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
4142559c37 |
[codex] Add blocked inbox attention view (#5603)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies through company-scoped issues, comments, approvals, and execution workspaces. > - Operators need the Inbox to show not only active work, but also blocked work that may need human or agent attention. > - The existing inbox experience did not have a dedicated blocked-work surface, so blocked tasks were harder to triage and resume deliberately. > - Backend consumers also needed a compact attention signal that distinguishes actionable blockers from covered or waiting blocker states. > - This pull request adds a Blocked Inbox tab backed by issue blocker-attention metadata, shared validators, and UI helpers. > - The benefit is a clearer triage path for stalled or blocked Paperclip work without exposing external wait internals in the operator-facing UI. ## What Changed - Added shared issue blocker-attention types, validators, and exports for the API/UI contract. - Added backend blocker-attention computation and issue route support for blocked inbox data. - Added the Blocked Inbox tab, blocked reason chips, filtering/search UI, responsive layouts, and Storybook stories. - Updated inbox helpers and page behavior so toolbar controls only appear where they apply. - Added coverage for shared validators, server blocker-attention behavior, blocked inbox UI helpers/components, and the Inbox page. - Added a screenshot helper script for the blocked inbox Storybook stories. - Addressed Greptile feedback by making urgency sorting deterministic for null stop times, avoiding full blocked-inbox list enrichment for counts, and hardening the screenshot helper. ## Verification - Rebased the branch cleanly onto `public-gh/master`. - Confirmed the diff does not include `pnpm-lock.yaml`. - Confirmed the diff does not include database migration files. - Ran `pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/issue-blocker-attention.test.ts ui/src/components/BlockedInboxView.test.tsx ui/src/components/BlockedReasonChip.test.tsx ui/src/lib/blockedInbox.test.ts ui/src/lib/inbox.test.ts ui/src/pages/Inbox.test.tsx`. - Ran `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck`. - Checked `ROADMAP.md`; this is scoped inbox/operator triage work and does not duplicate a listed roadmap feature. - Greptile Review is green on the latest head and all four Greptile review threads are resolved. - GitHub PR checks are green on the latest head: policy, security/snyk, e2e, verify, Canary Dry Run, Greptile Review, and serialized server suites 1/4 through 4/4. ## Risks - Medium review surface because this touches the shared issue contract, server issue services, and the Inbox UI together. - Blocker-attention classification may need product tuning after operators use it on real blocked queues. - UI screenshots were not attached in this PR-opening pass; the branch includes `scripts/screenshot-blocked-inbox.mjs` and Storybook stories for visual capture. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used OpenAI Codex, GPT-5-based coding agent with shell, git, GitHub CLI, GitHub connector, and Paperclip API tool use. Reasoning mode: medium. Context window: not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
d1a8c873b2 |
fix(remote-sandbox): harden host workspace resumes (#5922)
## Thinking Path > - Paperclip orchestrates AI agents through a control plane while adapters execute work in local, remote, or sandboxed runtimes. > - Remote sandbox execution depends on a strict host-versus-remote workspace boundary: the host prepares/restores files, while the adapter command runs inside the sandbox cwd. > - Jannes' PR #5823 identified host-side failure modes that were not covered by replacement PR #5822. > - Persisting a remote pod cwd in session params could poison the next host heartbeat resume and make Paperclip inspect or upload system temp roots. > - Plugin sandbox providers also need a narrow way to receive model-provider API keys without exposing the full server environment to every plugin worker. > - This pull request ports the host-side fixes from #5823 in the current codebase style, with focused regression coverage. > - The benefit is safer remote sandbox resumes and plugin worker environment handling without broadening core plugin privileges. ## What Changed - Persist host workspace cwd, not remote sandbox cwd, in `claude_local` session params while retaining remote execution identity metadata. - Reject saved session cwds that point at system roots before heartbeat falls back to agent home workspace. - Skip sockets, FIFOs, devices, and other non-file entries during workspace restore snapshot capture/comparison. - Pass a small model-provider API-key allowlist only to plugins declaring `environment.drivers.register`. - Added focused regression tests for remote Claude session params, unsafe session cwd detection, plugin worker env filtering, and non-file snapshot entries. Credits: ports host-side fixes from Jannes' #5823. ## Verification - `pnpm vitest run packages/adapter-utils/src/workspace-restore-merge.test.ts server/src/services/session-workspace-cwd.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/plugin-database.test.ts` (25 passed, 7 skipped by existing embedded-Postgres host guard) - `pnpm --filter @paperclipai/adapter-utils typecheck` - `pnpm --filter @paperclipai/adapter-claude-local typecheck` - `pnpm --filter @paperclipai/server typecheck` ## Risks - Low risk: changes are scoped to remote sandbox/session metadata, workspace snapshot filtering, and plugin worker env setup. - Sandbox-provider plugins now receive only the explicit model-provider key allowlist; any provider needing another key name will need a deliberate allowlist update. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled local code execution and repository editing. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
eb452fba30 |
Fix comment date binding regression (#5919)
## Thinking Path > - Paperclip is the control plane for autonomous AI companies, and issue comments are the primary durable communication surface between operators and agents. > - Commit `c445e592` (`fix(ui): fix message attribution for agent-posted comments with user author IDs (#5780)`) added server-side derived attribution for historical comments by scanning heartbeat runs near comment timestamps. > - That scan accidentally bound JavaScript `Date` objects directly into postgres-js SQL fragments for the run timestamp window. > - On real Postgres, that can fail while listing issue comments with `ERR_INVALID_ARG_TYPE`, which makes comments disappear from issue pages such as `PAP-9284`. > - This pull request keeps the attribution behavior intact while changing only the broken timestamp binding path. > - The benefit is that comments load again without weakening the conservative attribution recovery introduced by `c445e592`. ## What Changed - Convert the derived-attribution heartbeat-run window bounds to ISO timestamp strings before binding them into SQL, with explicit `::timestamptz` casts. - Add an embedded Postgres regression that inserts a heartbeat run and user-authored comment, then verifies `issueService.listComments()` returns the comment while the attribution scan runs. - Delete `heartbeat_runs` during the issue service test cleanup before deleting agents so the new test data does not leak across cases. ## Verification - `pnpm exec vitest run server/src/__tests__/issues-service.test.ts -t "lists user comments when derived run attribution scans a timestamp window"` - `pnpm --filter @paperclipai/server typecheck` - `git diff --check` ## Risks - Low risk. The change is limited to how timestamp parameters are bound for an existing query. - The derived attribution logic remains conservative and still requires exact run-log proof before relabeling a comment. - The regression uses embedded Postgres so it covers the postgres-js binding path that failed in production-like local runs. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via the Paperclip `codex_local` adapter; GPT-5 coding-agent family with local terminal, file-editing, and git/GitHub CLI tool use. Exact hosted model deployment ID is not exposed by this local adapter runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (not applicable: server-side comment API bugfix) - [x] I have updated relevant documentation to reflect my changes (not applicable: no documented behavior or command changed) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
b947a7d76c |
[codex] Improve local plugin development workflow (#5821)
## Thinking Path > - Paperclip is the control plane for autonomous AI-agent companies. > - Plugins are the extension point for adding capabilities without expanding the core product surface. > - Local plugin development needed a tighter CLI-first loop so plugin authors can scaffold, run, install, inspect, and reload plugins without reaching into internal package paths. > - The server plugin install path also needed local-path handling that keeps plugin identity, dashboard routes, and development watchers coherent. > - This pull request adds the CLI scaffold/install workflow, fixes the server and SDK edge cases that blocked that loop, and updates the agent-facing plugin creation skill and docs. > - The benefit is that contributors can develop plugins from local folders with a documented, repeatable happy path. ## What Changed - Added `paperclipai plugin init` coverage and CLI wiring for local plugin scaffolding. - Improved local plugin install handling, plugin key route resolution, dashboard capability behavior, and dev watcher startup/reload behavior. - Fixed plugin SDK worker entrypoint validation for symlinked package layouts. - Added targeted tests for plugin init, server plugin authz/watcher behavior, SDK worker host validation, and the authoring smoke example. - Added a short local plugin development guide and refreshed the plugin authoring guide plus `paperclip-create-plugin` skill instructions. ## Verification - `pnpm run preflight:workspace-links && pnpm --filter @paperclipai/plugin-sdk build && pnpm --filter @paperclipai/create-paperclip-plugin typecheck && pnpm --filter paperclipai typecheck && pnpm --filter @paperclipai/plugin-sdk typecheck && pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run --project paperclipai cli/src/__tests__/plugin-init.test.ts` - `pnpm exec vitest run --project @paperclipai/plugin-sdk packages/plugins/sdk/tests/worker-rpc-host.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-dev-watcher.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/plugin-routes-authz.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm --dir packages/plugins/examples/plugin-authoring-smoke-example test` - Confirmed `pnpm-lock.yaml` is not included in the PR diff. ## Risks - Medium risk: this touches plugin install routing, CLI command behavior, and the local development watcher. - Local path plugin installs execute trusted local code by design; the new docs call out that trust boundary. - No database migrations are included. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled local shell and git workflow, medium reasoning effort. Context window details were not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge UI screenshots: not applicable; this PR changes CLI/server/plugin docs and tests, not board UI rendering. --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
0808b388ee |
[codex] Add source-scoped recovery actions (#5599)
## Thinking Path > - Paperclip is a control plane for autonomous AI companies, where work must end with a clear disposition rather than ambiguous agent liveness. > - Recovery currently detects stalled or missing-next-step issues, but source issue recovery can become split across child recovery issues, blockers, and comments. > - That makes it harder for operators and agents to see who owns recovery and what exact action is needed on the original issue. > - Source-scoped recovery actions give the original issue a first-class active recovery state with owner, evidence, wake policy, and resolution outcome. > - This pull request adds the recovery-action data model, backend reconciliation and resolution APIs, and board UI indicators/actions. > - The benefit is clearer stalled-work recovery without losing source issue context or relying on comments as the liveness path. ## What Changed - Added the `issue_recovery_actions` schema, shared types/constants/validators, and an idempotent `0084_issue_recovery_actions` migration ordered after current `master` migrations. - Updated stranded/missing-disposition recovery to create source-scoped recovery actions, wake the recovery owner on the source issue, and avoid locking the source issue for recovery-action wakes. - Added API support for reading active recovery actions on issue detail/list surfaces and resolving them with restored, blocked, cancelled, or false-positive outcomes. - Require blocked recovery resolutions to have an unresolved first-class blocker, and removed the UI shortcut that could mark recovery blocked without a blocker selection path. - Surfaced recovery indicators/actions in the issue UI, blocker notices, active run panels, issue rows, and Storybook coverage. - Updated docs and focused tests for recovery semantics, ownership, races, stale comments, and UI behavior. ## Verification - `pnpm exec vitest run server/src/__tests__/issue-recovery-actions.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/components/IssueBlockedNotice.test.tsx ui/src/api/issues.test.ts` — 5 files, 72 tests passed. - `pnpm --filter @paperclipai/shared typecheck` — passed. - `pnpm --filter @paperclipai/db typecheck` — passed, including migration numbering check. - `pnpm --filter @paperclipai/server typecheck` — passed. - `pnpm --filter @paperclipai/ui typecheck` — passed. - Follow-up verification after blocker-resolution guard: `pnpm exec vitest run server/src/__tests__/issue-recovery-actions.test.ts ui/src/components/IssueRecoveryActionCard.test.tsx ui/src/api/issues.test.ts` — 3 files, 27 tests passed. - Follow-up `pnpm --filter @paperclipai/server typecheck` — passed. - Follow-up `pnpm --filter @paperclipai/ui typecheck` — passed. - UI states are available in `ui/storybook/stories/source-issue-recovery.stories.tsx`; screenshot capture helper is `scripts/screenshot-recovery-card.cjs`. ## Risks - Medium: recovery behavior changes from child recovery issue ownership toward source-scoped actions, so operators may see stalled-work state in new places. - Migration risk is mitigated by using the next migration slot after `master` and making the table/constraints/index creation idempotent for anyone who previously applied the old branch-local `0082_dizzy_master_mold` migration. - Existing child recovery issue paths are still guarded for already-created recovery issues, but new source-scoped flows should be watched in CI and Greptile review. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool use enabled for shell, Git, GitHub, and local test execution. Context window not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
c445e59256 |
fix(ui): fix message attribution for agent-posted comments with user author IDs (#5780)
## Thinking Path > - Paperclip’s issue chat is an audit surface: reviewers need to trust who actually authored a message. > - Some historical agent comments were persisted with `authorUserId` and no surviving `createdByRunId`, so the UI rendered real agent output as if it came from the board user. > - A pure timestamp-window fallback is too risky because human reviewers can comment while agents are running. > - The safe recovery path is to derive attribution only when the server can prove it from same-issue run logs that include the exact posted comment id, then let the chat renderer prefer that recovered agent attribution. > - This keeps historical threads trustworthy without mutating old database rows or guessing in ambiguous cases. ## What Changed - Added shared `IssueComment` fields for derived attribution so server and UI can carry recovered `derivedAuthorAgentId`, `derivedCreatedByRunId`, and `derivedAuthorSource` consistently. - Added server-side attribution recovery in `server/src/services/issues.ts` that reads same-issue run logs and only derives agent authorship when a run log contains the exact `comment id: ...` emitted during posting. - Updated issue chat rendering in `ui/src/lib/issue-chat-messages.ts` to prefer direct agent authorship, then activity-log `runAgentId`, then the server-derived attribution. - Removed the unsafe UI-only run-window fallback from `ui/src/pages/IssueDetail.tsx` so human comments posted during an active run are not silently relabeled as agent output. - Added regression coverage for both the run-log derivation path and the chat-rendering fallback behavior. - Bounded server-side run-log enrichment to 8 concurrent reads per request and removed the unused `issueCommentSchema` declaration during PR cleanup. ## Verification - `pnpm exec vitest run ui/src/lib/issue-chat-messages.test.ts server/src/__tests__/issues-service.test.ts` - `pnpm test:run:general` - Live validation on May 12, 2026 in `PAPA-322`: confirmed the previously misattributed historical comments on `PAPA-316` now render as Claude-authored on `http://goldie.gerbil-company.ts.net:3100`. - Reviewer check: open `PAPA-316` in the running instance and confirm historical comments such as `## Investigation: exe.dev 422 + codex re-test` render under Claude instead of the board user. ## Risks - Low risk. The change is scoped to comment attribution recovery and rendering. - Derived attribution is intentionally conservative: if there is no exact run-log proof, the comment remains user-authored instead of guessing. - Run-log recovery depends on retained same-issue logs, so older comments without that evidence remain unchanged. ## Model Used - OpenAI Codex via the Paperclip `codex_local` adapter (GPT-5-class coding agent with tool use in the local Paperclip runtime; the exact deployment/model ID is not surfaced by this workspace). ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
ad0bb57350 |
Fix exe.dev sandbox installs for gemini/opencode local adapters (#5737)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, including running adapter CLIs inside remote sandboxes > - The QA matrix in PAPA-316 spins up local-runtime adapters (claude/gemini/opencode) against both SSH and the new exe.dev sandbox provider, and "Test" exercises the same install + probe path the real runtime uses > - On exe.dev the QA matrix failed at three different points: SSH/sandbox secret refs would not resolve, gemini-local could not find npm, and opencode-local installed a binary that was not on the probe-shell PATH > - These are all environment-shape issues the runtime should handle, not regressions in any individual adapter, so they need to be fixed in the shared install/resolve layer before the matrix can pass > - This pull request wires the environment id through to secret-ref resolution, bootstraps npm from a portable Node tarball when the sandbox image lacks Node, and symlinks the opencode binary into a directory that non-login shells see > - The benefit is that the QA matrix passes end-to-end on exe.dev, and any future sandbox provider that ships without Node or relies on rc-file PATH wiring gets the same fixes for free ## What Changed - `server/src/services/environment-execution-target.ts`: pass the environment `id` into `resolveEnvironmentDriverConfigForRuntime` for both the sandbox and SSH branches, so `privateKeySecretRef` / sandbox-provider secret refs (e.g. exe.dev `apiKey`) can resolve against the secret store at runtime instead of throwing `Runtime secret resolution requires an environment id`. - `packages/adapter-utils/src/sandbox-install-command.ts`: extend `buildSandboxNpmInstallCommand` with an `ENSURE_NPM_PREAMBLE` that, when `npm` is missing, downloads a portable Node v22 tarball into `$HOME/.local` and sets `PAPERCLIP_NPM_BOOTSTRAPPED=1` so the install step skips sudo (sudo's `secure_path` would lose the freshly-installed `npm` in `$HOME/.local/bin`). Distro-packaged Node from apt-get is intentionally avoided because it tends to be too old to parse modern JS syntax used by `@google/gemini-cli`. - `packages/adapters/gemini-local/src/index.ts`: switch the hardcoded `npm install -g @google/gemini-cli` to `buildSandboxNpmInstallCommand`, so gemini-local picks up the same sudo-aware + npm-bootstrap behavior as the other local adapters. - `packages/adapters/opencode-local/src/index.ts`: append a step to the install command that symlinks `$HOME/.opencode/bin/opencode` into `$HOME/.local/bin`. The upstream installer only adds `~/.opencode/bin` to PATH via `~/.bashrc`, which non-login `sh -c` probe invocations do not source. - `packages/adapter-utils/src/sandbox-install-command.test.ts`: cover the new preamble plus the unchanged root/sudo/user-prefix branches. ## Verification - `cd packages/adapter-utils && npm test -- sandbox-install-command` (passes; new "bootstraps npm from a portable Node tarball when missing" case is included). - Manual: ran the in-app `Test` action against the QA matrix dev instance for `QA exe.dev Claude`, `QA exe.dev Gemini`, and `QA exe.dev OpenCode` — all three now report `status=pass` including the hello probe. `QA SSH Claude` also passes; without the environment-id fix, SSH resolution threw before the wrapper / install fixes could run. - Suggested reviewer check: re-run the matrix on a fresh exe.dev environment and confirm the install step no longer hits `npm: command not found` for gemini and the opencode probe no longer hits `opencode: command not found`. ## Risks - Low/medium. The npm bootstrap pins Node `v22.11.0` from `nodejs.org/dist`; if that URL becomes unreachable the install will fail with a clear `curl` error rather than corrupting state. The bootstrap path is only taken when `npm` is genuinely missing, so existing sandbox images that ship with Node are unaffected. - The opencode symlink uses `ln -sf` into `$HOME/.local/bin`, which is created with `mkdir -p`; idempotent on re-install. - The `id` change is a strict additive: callers previously got `undefined` and only the secret-ref code paths actually read it. No behavior change for environments without secret refs. ## Model Used - Claude (Anthropic), `claude-opus-4-7`, with extended thinking and tool use enabled. Iterated through the Paperclip QA matrix harness; no other model assisted. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots (n/a — runtime/install path only) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
486fb88a15 |
Add Cloudflare sandbox provider plugin (#5687)
> _Stacked on top of #5685 → #5686. Diff against master includes commits from earlier PRs in the stack — review focuses on the two new commits (`Extend sandbox callback bridge for Worker-hosted plugins` + `Add Cloudflare sandbox provider plugin`)._ ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs in a sandbox environment, and operators choose which provider backs that sandbox — today E2B and Daytona are bundled with the platform > - Cloudflare Workers + Durable Objects + the Sandbox SDK offer a credible new option: globally distributed, cheap idle, and operator-deployable as a single Worker > - To plug it in, Paperclip needs (a) a provider plugin that speaks the `PaperclipPluginManifestV1` lifecycle and (b) a small operator-deployed Worker — the **bridge** — that adapts Paperclip's runtime RPCs to the Cloudflare Sandbox SDK > - The plugin extends the existing sandbox-callback-bridge with a `bridge.transport: "worker"` discriminator so the platform routes runtime RPCs through the Worker bridge instead of the in-process runner > - This pull request adds the plugin, the bridge Worker template, and the supporting adapter-utils + server hooks the new transport needs > - The benefit is that operators can run sandboxes on Cloudflare's edge with no new platform code beyond installing the plugin and deploying the Worker ## What Changed **Shared support (`Extend sandbox callback bridge for Worker-hosted plugins`):** - `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`: expose `expectedHostHeader` so plugin-side bridge clients can verify the canonical request envelope before forwarding. - `packages/adapter-utils/src/command-managed-runtime.{ts,test.ts}`: relax the always-fresh runner construction so callers can re-use a runner across exec calls (Worker-hosted bridges hold the runner inside a Durable Object). - `server/src/services/environment-runtime.ts` + `environment-runtime.test.ts`: route Worker-hosted bridges through the same env-shaping path as E2B and pin the `requestEnv` contract. - `server/src/services/plugin-environment-driver.ts`: thread an optional `issueId` through the runtime descriptor so bridges can scope leases to the originating issue (used by Cloudflare to map a sandbox to the issue/workflow for billing and audit). - `packages/plugins/sdk/src/protocol.ts`: add `issueId?` to `PluginEnvironmentDriverBaseParams` and the new `bridge.transport: "worker"` discriminator that the new plugin declares. - `server/__tests__/heartbeat-plugin-environment.test.ts`: pin the heartbeat path against the new runtime descriptor. **The Cloudflare plugin itself (`Add Cloudflare sandbox provider plugin`):** - `packages/plugins/sandbox-providers/cloudflare/`: plugin entry, manifest, plugin runtime (lifecycle + bridge client), config parsing, and Vitest coverage. Manifest declares `bridge.transport: "worker"` so the platform routes runtime RPCs through the bridge client. - `bridge-template/`: a Worker template the operator deploys with `wrangler`. Owns Durable Object-backed sessions (`sessions.ts`), exec/stream routes (`exec.ts`, `routes.ts`), and an HMAC auth layer (`auth.ts`) that pins the `Host` header surface. Includes the SDK-contract-correct exec implementation, lease recovery, and chunked stdout/stderr streaming. - Tests cover lease/session handoff (`bridge-template/src/exec.test.ts`, `routes.test.ts`), bridge client request shaping (`src/bridge-client.test.ts`), and end-to-end plugin behavior (`src/plugin.test.ts`) including streamed exec output. 27 tests in total. - `README.md` walks the operator through deploying the bridge Worker, registering the plugin, and configuring the runtime. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/sandbox-callback-bridge.test.ts packages/adapter-utils/src/command-managed-runtime.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/heartbeat-plugin-environment.test.ts` - `(cd packages/plugins/sandbox-providers/cloudflare && pnpm test)` — 27 passing For an operator-side smoke test: 1. Deploy the bridge: `cd packages/plugins/sandbox-providers/cloudflare/bridge-template && wrangler deploy` 2. Register the plugin in your Paperclip instance, point its bridge URL at the deployed Worker, set the HMAC shared secret. 3. Create a sandbox environment whose provider is `cloudflare`, then run a Codex or Claude job against it. ## Risks - Adds a new `bridge.transport: "worker"` code path, but the existing E2B / Daytona transports go through the same shaped helpers and have explicit test coverage that pins their behavior unchanged. - The Worker bridge stores session state in a Durable Object; operator instances must be aware of the corresponding Cloudflare costs (DO requests, storage). Documented in the README. - The `issueId` plumbing is optional throughout — existing plugins that don't supply it continue to work. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes (plugin README, bridge-template README) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
b24c6909e8 |
Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent runs inside a sandbox environment so its CLI is isolated from the host > - Sandbox-backed adapter runs go through a small set of shared helpers — `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings > - When standing up new sandbox provider plugins, the existing helpers timed out, missed install fallbacks, or leaned on assumptions that only held for E2B > - Local adapters (`claude-local`, `codex-local`, `gemini-local`, `opencode-local`) needed slightly hardened probes so they could install themselves and validate inside *any* remote sandbox transport, not just E2B > - This pull request bundles those runtime fixes so future sandbox provider plugins inherit a working baseline > - The benefit is that adding a new sandbox provider plugin no longer requires touching adapter-utils or each local-adapter probe — the supporting infra is already correct ## What Changed - `packages/adapter-utils/src/execution-target.ts`: introduce `DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and `resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters keep the historical "0 means no adapter timeout" behavior; sandbox-backed runs without an explicit `timeoutSec` get an explicit 30-minute default so remote installs and warm-up don't time out at the per-RPC default. Plumbed `timeoutSec` through `ensureAdapterExecutionTargetCommandResolvable` so install probes inside a sandbox honor adapter-level overrides instead of the bridge's 5-minute default. - `packages/adapters/opencode-local/src/index.ts`: switch `SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl -fsSL https://opencode.ai/install | bash`. The npm package reifies four large prebuilt-binary subpackages in parallel even though only one matches the host arch; on bandwidth-constrained sandboxes that blew through the 240s install budget. The official installer fetches one arch-specific binary and adds `$HOME/.opencode/bin` to PATH via `~/.bashrc`, which the sandbox-callback-bridge login-shell script already sources. - `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden remote-target probes — pass `--skip-git-repo-check` for Codex when probing outside a repo, normalize permission flags for Claude, and add `*.remote.test.ts` coverage that exercises the remote-sandbox path explicitly for each adapter. - `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}` (new): add `buildSandboxNpmInstallCommand` helper. `server/src/adapters/registry.ts` + new `server/src/__tests__/adapter-registry.test.ts`: wire adapter install commands so they fall back to a writable `$HOME/.local` prefix when global install isn't available. - `server/src/__tests__/plugin-worker-manager.test.ts` + new `server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call timeout overrides so plugin worker exec calls honor the caller's timeout instead of the worker's default. ## Verification - `pnpm typecheck` - `pnpm exec vitest run --no-coverage packages/adapter-utils/src/execution-target-sandbox.test.ts packages/adapter-utils/src/sandbox-install-command.test.ts` - `pnpm exec vitest run --no-coverage server/src/__tests__/plugin-worker-manager.test.ts server/src/__tests__/adapter-registry.test.ts server/src/__tests__/claude-local-adapter-environment.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/gemini-local-adapter-environment.test.ts` - `pnpm exec vitest run --no-coverage packages/adapters/codex-local/src/server/test.remote.test.ts packages/adapters/opencode-local/src/server/test.remote.test.ts packages/adapters/codex-local/src/server/codex-args.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts` All passing locally. ## Risks - Touches shared `adapter-utils` and several `*-local` adapters. The 30-minute default applies only when both (a) the target is `remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH paths are unchanged. New test coverage was added alongside each behavior change to pin the contracts. - Switching OpenCode's install command to the official installer is a behavior change for any operator running OpenCode inside a remote sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND` only runs when an adapter is being installed inside a sandbox). - Low risk overall — no migrations, no API surface change. ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 (1M context) - Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep), no code execution beyond local repo commands ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A, no UI change - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
534aee66ae |
Add cursor_cloud adapter for Cursor SDK + Cloud Agents API v1 (#5664)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - There are many adapter types, one per agent-runtime product (Claude,
Codex, OpenCode, Cursor local CLI, etc.)
> - Cursor shipped a public TypeScript SDK on 2026-04-29 that exposes
Cursor's full hosted-agent platform (cloud VMs, harness, MCP, skills,
hooks)
> - Paperclip had no first-class adapter for this — agents that wanted
to use Cursor's managed cloud runtime had to fall back to the local CLI
adapter, which loses the cloud session, streaming, and durable run model
> - This PR adds a new `cursor_cloud` adapter built directly on
`@cursor/sdk`, with Paperclip's heartbeat mapped to Cursor's
durable-agent + per-run model
> - The benefit is that any Paperclip agent can now drive a Cursor cloud
agent across heartbeats with native session reuse, streaming, and
cancellation, while Paperclip remains the source of truth for issue/task
state
## What Changed
- New built-in adapter package `packages/adapters/cursor-cloud` (15
files, ~1.7k LOC) backed by `@cursor/sdk` ^1.0.12
- `src/server/execute.ts` — SDK-first lifecycle: `Agent.create` /
`Agent.resume` / `Agent.getRun` / `agent.send` / `run.stream` /
`run.wait`, with session reuse keyed on the (runtime env type, env name,
repo set) tuple
- `src/server/session.ts` — codec for `cursorAgentId` + `latestRunId` +
repo metadata, persisted in `runtime.sessionParams`
- `src/server/test.ts` — environment probe via `Cursor.me()` and
optional model validation via `Cursor.models.list()`
- `src/ui/parse-stdout.ts` + `src/cli/format-event.ts` — normalize
Cursor SDK message types (`status`, `thinking`, `assistant`, `user`,
`tool_call`, `tool_result`, `result`) into Paperclip transcript events
for the UI and CLI
- Registrations: `packages/shared/src/constants.ts`,
`packages/adapter-utils/src/session-compaction.ts`,
`server/src/adapters/{registry,builtin-adapter-types}.ts`,
`ui/src/adapters/{registry,adapter-display-registry}.ts` +
`ui/src/adapters/cursor-cloud/index.ts`, `cli/src/adapters/registry.ts`,
plus workspace deps in `cli`/`server`/`ui` `package.json`
- `ui/src/components/AgentConfigForm.tsx` — hide local-Cursor
`mode`/thinking-effort field for `cursor_cloud` (different config
surface)
- 11 vitest tests covering execute paths (fresh create, matching-resume,
active-run reattach, non-finished result), session codec round-trip,
transcript parsing, and config building
## Verification
Reviewer steps:
```bash
pnpm install
pnpm --filter @paperclipai/adapter-cursor-cloud typecheck # → clean
pnpm vitest run packages/adapters/cursor-cloud # → 11/11 passing
```
End-to-end check against a real Cursor cloud agent (requires
`CURSOR_API_KEY` and Cursor GitHub-app install on the target repo):
1. Create a `cursor_cloud` agent in Paperclip with `repoUrl` set to the
test repo, `repoStartingRef: main`, and `env.CURSOR_API_KEY` set
2. Trigger a heartbeat → adapter calls `Agent.create({ cloud: { env: {
type: "cloud" }, repos: [...] } })`, streams events, terminates on
`finished`
3. Trigger a second heartbeat → adapter calls `Agent.resume` or
`agent.send` follow-up depending on prior-run state, reusing
`cursorAgentId`
4. The Paperclip UI/CLI transcript reflects Cursor `status` / `thinking`
/ `assistant` events as they stream
5. Cancellation from Paperclip maps to `run.cancel()` or Cloud API v1
`cancelRun` for cross-heartbeat cancellation
A direct-SDK smoke run against a real repo (devinfoley/my_test_project @
main) confirmed: `Cursor.me()` ok → `Agent.create` → `agent.send` →
`run.stream()` (30 events) → terminal status `finished` in ~11s.
## Risks
- **New adapter, additive only.** No existing adapter or registry is
replaced; current `cursor` local-CLI adapter is untouched. Default
behavior of any existing agent is unchanged.
- **External dependency on `@cursor/sdk`.** Cursor's SDK is v1.0.x and
may evolve. Mocked unit tests cover the public surface used here; if the
SDK breaks compatibility we update the adapter independently.
- **Cost/budget.** `cursor_cloud` runs on Cursor's billed cloud VMs;
operators must understand they are spending money outside Paperclip's
budget controls when they enable this adapter. Same shape as other
API-billed adapters.
- **No webhook support in V1.** The SDK already provides
stream/wait/cancel/reattach, so V1 does not require a public callback
URL. If a future use case needs out-of-band wakes, we add a Cloud API v1
webhook bridge as a separate change. This is called out in the issue
plan document.
- **Lockfile.** Per repo policy, `pnpm-lock.yaml` is intentionally not
in this PR — CI's lockfile workflow will update it on merge given the
manifest changes.
## Model Used
- Provider: Anthropic Claude (via Claude Code / Paperclip `claude_local`
adapter)
- Model: `claude-opus-4-7` (Claude Opus 4.7), knowledge cutoff January
2026
- Mode: standard tool-use with extended reasoning
- Context: ~200k token window
- Capabilities used: code generation, multi-file edits, shell/test
execution, GitHub PR workflow
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (11/11 in
`packages/adapters/cursor-cloud`)
- [x] I have added or updated tests where applicable (4 new test files,
11 cases)
- [ ] If this change affects the UI, I have included before/after
screenshots (the only UI change is hiding the local-Cursor mode field on
the `cursor_cloud` adapter — happy to attach a screenshot if the
reviewer wants one)
- [x] I have updated relevant documentation to reflect my changes (issue
plan document supersedes the pre-SDK design; tracked in PAPA-203)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
|
||
|
|
0096b56a1c |
[codex] Add LLM Wiki plugin host support (#5597)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The plugin system needs host contracts and runtime support before large plugins can integrate cleanly. > - The source branch mixed the LLM Wiki package with supporting host/runtime work, managed plugin skills, root-level storage spaces, and a bookmarks reference plugin. > - [PAP-9173](/PAP/issues/PAP-9173) asked for the current branch to be split by file boundary: plugin package separately from everything else. > - [PAP-9188](/PAP/issues/PAP-9188) clarified that LLM Wiki may have plugin-local spaces, but Paperclip core should not reorganize top-level local storage into spaces. > - Follow-up review clarified that the bookmarks example should not ship in this PR either. > - This pull request contains the non-`packages/plugins/plugin-llm-wiki/` host/runtime work, keeps runtime state under the selected Paperclip instance root, and no longer includes the bookmarks example. ## What Changed - Added/updated plugin host contracts, SDK types, worker RPC plumbing, managed plugin skill support, and related server tests. - Removed the bookmarks example plugin package and its bundled-example/workspace references. - Removed the root-level local spaces CLI/migration surface and restored instance-root runtime defaults for config, db, logs, storage, secrets, workspaces, projects, and adapter homes. - Replaced shared root `space-paths` helpers with `home-paths` helpers for core runtime storage. - Tightened stranded recovery unique-conflict detection so concurrent recovery scans reuse the raced recovery issue when Postgres errors are wrapped. - Kept `packages/plugins/plugin-llm-wiki/` out of this PR diff; plugin-local spaces remain in the stacked plugin-only PR. ## Verification - `pnpm exec vitest run cli/src/__tests__/data-dir.test.ts cli/src/__tests__/home-paths.test.ts cli/src/__tests__/onboard.test.ts packages/shared/src/home-paths.test.ts packages/db/src/runtime-config.test.ts server/src/__tests__/agent-instructions-service.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/codex-local-execute.test.ts` - `pnpm exec vitest run packages/db/src/runtime-config.test.ts` - `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts` - `pnpm --filter @paperclipai/server typecheck` - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts -t "reuses the raced stranded recovery issue"` skipped locally because embedded Postgres did not initialize on this macOS temp host; the code path was typechecked and is covered by Linux CI. - Boundary check: no core references remain for `PAPERCLIP_SPACE_ID`, `spaces migrate-default`, `@paperclipai/shared/space-paths`, `registerSpacesCommands`, or the removed bookmarks example. - Previous PR head `4f23e034` had green GitHub checks: `verify`, all four serialized server shards, `e2e`, `Canary Dry Run`, `policy`, Snyk, and `Greptile Review`. Current head `582f466d` is re-running checks after the bookmarks deletion. ## Risks - Plugin host changes touch shared runtime paths, so regressions would most likely appear in adapter startup, plugin loading, or local dev path defaults. - Removing the bookmarks example also removes one demonstration of plugin database namespaces plus local-folder persistence; remaining plugin examples still cover bundled example discovery and plugin host flows. - The plugin package itself is intentionally deferred to the stacked plugin-only PR, where LLM Wiki plugin-local spaces live. - Existing installs that tested the transient root-level spaces CLI should stop using it; this PR intentionally removes that unsupported migration surface before merge. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI GPT-5 Codex via Codex CLI, tool use and local code execution enabled; context window not exposed. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass, except where noted above for host-specific embedded Postgres initialization - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Stacked follow-up: PR #5592 contains only `packages/plugins/plugin-llm-wiki/` and targets this branch. --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
a72731f118 |
fix: harden release registry verification against npm lag (#4816)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Its release automation publishes canary packages to npm and then validates the published registry state before considering the release healthy > - The failing canary run `25139465018` showed that npm can expose a newly published version through version-specific endpoints before the root package document has fully converged > - That made a successful canary publish look like a failed release because the verifier trusted stale root metadata too early > - This pull request hardens the registry verification path by preferring version-specific manifest checks, retrying convergence-sensitive failures, and distinguishing permanent failures from propagation lag > - While validating that change in CI, a separate teardown race in `heartbeat-stale-queue-invalidation.test.ts` surfaced and was hardened so the PR could pass reliably > - The benefit is that transient npm propagation lag no longer fails a successful canary publish, while genuine registry-state and dependency-integrity failures still stop the release flow promptly ## What Changed - Hardened `scripts/verify-release-registry-state.mjs` so it prefers version-specific manifest resolution over stale root metadata, adds bounded registry-fetch timeouts, and classifies failures as retriable vs non-retriable. - Updated `scripts/release-lib.sh` and `scripts/release.sh` so post-publish registry verification retries only convergence-sensitive failures and reports immediate permanent failures clearly. - Expanded `scripts/verify-release-registry-state.test.mjs` with regression coverage for stale root metadata, fetch timeout behavior, peer dependency range handling, non-retriable canary-latest cases, and related verifier edge cases. - Hardened `server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts` teardown to tolerate the late-comment foreign-key race that CI exposed while validating this branch. ## Verification - `pnpm run test:release-registry` - `node --check scripts/verify-release-registry-state.mjs` - `bash -n scripts/release.sh && bash -n scripts/release-lib.sh` - PR checks passed on head `5c422600fc12acac61f6b7c267a4dc915df622b1`: `policy`, `verify`, `e2e`, `security/snyk`, and `Greptile Review` ## Risks - Low risk. The main behavioral changes are limited to release automation and verifier retry semantics, plus a test-only teardown hardening for a CI race. > I checked [`ROADMAP.md`](ROADMAP.md). This is a narrow release bugfix and does not overlap planned core feature work. ## Model Used - OpenAI Codex via Paperclip `codex_local` with tool use and local code execution enabled. This agent session runs on a GPT-5-class coding model; the exact backend model ID/context window is not exposed by the local adapter runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I have addressed all Greptile and reviewer comments before requesting merge |
||
|
|
2f72cb29ea |
chore: update drizzle-orm to 0.45.2 (#5589)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The server, DB package, and CLI all rely on the shared Drizzle ORM dependency for core persistence flows. > - A published install was still resolving nested `drizzle-orm@0.38.4`, which left the production package graph behind the intended security update. > - The repo’s documented dependency policy says GitHub Actions owns `pnpm-lock.yaml`, so the correct maintainer workflow is to update dependency manifests in the feature PR and let the lockfile refresh happen separately after merge. > - This pull request therefore keeps the Drizzle upgrade to the package manifests only and leaves lockfile regeneration to the existing `Refresh Lockfile` automation. ## What Changed - Updated `drizzle-orm` dependency declarations in `cli/package.json`, `packages/db/package.json`, and `server/package.json` from `0.38.4` / `^0.38.4` to `0.45.2` / `^0.45.2`. - Re-verified the packed `@paperclipai/db` and `@paperclipai/server` publish payloads to confirm their generated `package.json` files advertise `drizzle-orm ^0.45.2`. - Removed the temporary lockfile/CI follow-up commits so the branch now matches the intended manifest-only protocol. ## Verification - `pnpm list drizzle-orm -r --depth 0` - `pnpm exec vitest run packages/db/src/client.test.ts server/src/__tests__/issues-service.test.ts` - `pnpm run test:release-registry` - Packed `@paperclipai/db` and `@paperclipai/server` locally and inspected the tarball `package.json` files to confirm they advertise `drizzle-orm ^0.45.2`. ## Risks - Low to moderate risk: the runtime code paths are unchanged, but downstream lockfile refresh now depends on the existing post-merge GitHub automation working as documented. - A separate packaging/versioning issue around unpublished `@paperclipai/plugin-sdk@1.0.0` showed up during a raw local tarball install experiment; that is called out for reviewers but is not part of this Drizzle bump. ## Model Used - OpenAI Codex via the `codex_local` adapter, using a GPT-5-based coding agent with terminal tool use and code execution. The adapter does not expose a public exact model ID or context-window value in this environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
778e775c35 |
Add secrets provider vaults and remote import (#5429)
## Thinking Path > - Paperclip orchestrates AI-agent companies and needs secrets handling to work across local development, hosted operators, and governed agent execution. > - The affected subsystem is the company-scoped secrets control plane: database schema, server services/routes, CLI workflows, and the Secrets settings UI. > - The gap was that secrets were local-only and operators could not manage provider vaults or import existing remote references without exposing plaintext. > - This branch adds provider vault configuration plus an AWS Secrets Manager remote-import path while preserving company boundaries, binding context, and audit trails. > - I kept the PR to a single branch PR, removed unrelated lockfile/package drift, rebased the full branch onto the current `public-gh/master`, and addressed fresh Greptile findings. > - The benefit is a reviewable implementation of provider-backed secrets with focused tests covering provider selection, import conflicts, deleted secret reuse, rotation guards, and AWS signing behavior. ## What Changed - Added provider vault support for company secrets, including provider config storage, default vault handling, health checks, binding usage, access events, and remote import preview/commit. - Added an AWS Secrets Manager provider using SigV4 request signing, bounded request timeouts, namespace guardrails, cached runtime credential resolution, and external-reference linking without plaintext reads. - Added Secrets UI surfaces for vault management and remote import, plus CLI/API documentation for setup and operations. - Stabilized routine webhook secret binding paths and SSH environment-driver fixture bindings discovered during verification. - Addressed Greptile and CI findings: no lockfile/package drift, monotonic migration metadata, disabled-vault default races, soft-deleted secret hiding/recreate behavior, remove behavior with disabled vaults, soft-deleted external-reference re-import, non-active rotation guards, managed-secret soft deletion through PATCH, and per-call AWS SDK credential client churn. - Rebased this branch onto `public-gh/master` at `0e1a5828` and force-pushed with lease to keep this as the single PR for the branch. ## Verification - `git fetch public-gh master` - `git rebase public-gh/master` - `git diff --name-only public-gh/master...HEAD | grep '^pnpm-lock\.yaml$' || true` confirmed `pnpm-lock.yaml` is not in the PR diff. - Confirmed migration ordering: master ends at `0081_optimal_dormammu`; this PR adds `0082_dry_vision` and `0083_company_secret_provider_configs`. - Inspected migrations for repeat safety: new tables/indexes use `IF NOT EXISTS`; foreign keys are guarded by `DO $$ ... IF NOT EXISTS`; column additions use `ADD COLUMN IF NOT EXISTS`. - `pnpm -r typecheck` passed before the Greptile follow-up commits. - `pnpm test:run` ran the full stable Vitest path before the Greptile follow-up commits; it completed with 3 timing-related failures under parallel load: `codex-local-execute.test.ts`, `cursor-local-execute.test.ts`, and `environment-service.test.ts`. - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/codex-local-execute.test.ts src/__tests__/cursor-local-execute.test.ts src/__tests__/environment-service.test.ts` passed on targeted rerun (`24/24`). - `pnpm build` passed before the Greptile follow-up commits. Vite reported existing chunk-size/dynamic-import warnings. - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/secrets-service.test.ts` passed (`26/26`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server exec vitest run src/__tests__/aws-secrets-manager-provider.test.ts src/__tests__/secrets-service.test.ts` passed (`39/39`). - After Greptile follow-up commits: `pnpm --filter @paperclipai/server typecheck` passed. - Captured Storybook screenshots from `ui/storybook-static` for visual review. - Latest PR checks on `5ca3a5cf`: `policy`, serialized server suites 1/4-4/4, `Canary Dry Run`, `e2e`, `security/snyk`, and `Greptile Review` pass; aggregate `verify` is still registering the completed child checks. - Greptile review loop continued through the latest requested pass; all Greptile review threads are resolved and the latest `Greptile Review` check on `5ca3a5cf` passed with 0 comments added. ## Screenshots Before: the provider-vault and remote-import surfaces did not exist on `master`; these are after-state screenshots from the Storybook fixtures.    ## Risks - Migration risk: this adds new secret provider tables and extends existing secret rows. The migrations were checked for monotonic ordering and idempotent guards, but reviewers should still inspect upgrade behavior carefully. - Provider risk: AWS support uses direct SigV4 requests. Automated tests cover signing, request timeouts, vault-config selection, namespace guardrails, pending-version archival, sanitized provider errors, and service-level cleanup paths. A real-vault AWS smoke test remains deployment validation for an operator with AWS credentials rather than an unverified merge blocker in this local branch. - UI risk: the Secrets page and import dialog are large new surfaces; screenshots are included above for reviewer inspection. - Verification risk: the full local stable test command hit parallel-load timing failures, although the exact failed files passed when rerun directly. - Operational risk: remote import intentionally avoids plaintext reads; operators must understand that imported external references resolve at runtime and may fail if AWS permissions change. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent with local shell/tool use in the Paperclip worktree. Exact context-window size was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
0e1a582831 |
Revert "Add experimental newest-first issue thread" (#5460)
This is actually bad. Glad it was under experiments. |
||
|
|
a904effb96 |
Add experimental newest-first issue thread (#5455)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, so issue threads are a core operator surface for reviewing work. > - The issue detail page is the place where humans read agent messages, user comments, and execution context together. > - That thread originally rendered oldest-first, which made recent activity harder to see during active review. > - Reversing the thread order changes navigation expectations, timestamp placement, and the "Jump to latest" affordance, so the UI behavior needed to move as a coherent set. > - Because this is a visible core-product behavior shift, it also needed a safe rollout path instead of becoming the default immediately. > - This pull request adds the newest-first issue thread behavior behind an Experimental setting, updates the thread UI to match that mode, and keeps the legacy oldest-first experience unchanged by default. > - The benefit is that reviewers can opt into a more recent-first issue workflow without forcing a global behavior change on every Paperclip instance. ## What Changed - Reversed issue thread rendering so the newest comments and messages appear first when the experiment is enabled. - Moved the plain comment timestamp into the card header in newest-first mode and kept the legacy timestamp placement for oldest-first mode. - Moved the `Jump to latest` control to the bottom of the thread in newest-first mode while leaving the existing top placement for the legacy mode. - Added the `Enable Newest-First Issue Thread` experimental instance setting and wired issue detail to read that toggle. - Added regression coverage for thread order, timestamp placement, jump-button placement, and the issue-detail experiment toggle behavior. ## Verification - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - Focused checks that also passed during issue review: - `pnpm vitest run src/components/IssueChatThread.test.tsx src/pages/IssueDetail.test.tsx` in `ui/` - `pnpm vitest run src/__tests__/instance-settings-routes.test.ts` in `server/` - Manual review path: - Enable `Instance Settings > Experimental > Enable Newest-First Issue Thread` - Open an issue with comments/messages and confirm newest activity renders first, timestamps move into the header, and `Jump to latest` sits below the thread - Disable the experiment and confirm the legacy oldest-first behavior returns ## Risks - Low risk: the behavioral change is gated behind an instance-level experimental toggle and defaults off. - The main regression risk is thread navigation drift between the two modes, especially around anchor scrolling and the `Jump to latest` affordance. - There is some UI coupling between issue-detail query state and experimental settings fetches, so future changes in that area should keep both modes covered. - Screenshots are not attached in this PR body; verification is described with automated coverage and manual steps instead. > I checked [`ROADMAP.md`](ROADMAP.md). This is a scoped issue-thread UX improvement and rollout gate, not a duplicate of a roadmap-level planned core feature. ## Model Used - OpenAI Codex via the local `codex_local` Paperclip adapter, GPT-5-based coding agent with terminal tool use and local code execution in this repository worktree. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
fe3904f434 |
Stabilize runtime probes and Codex env tests (#5445)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters expose a Test action that probes the configured runtime —
install, resolvability, hello — to give operators a fast yes/no on
whether an environment is healthy
> - The Codex test path was running its hello probe directly without
going through the managed-runtime preparation that production runs use,
so a healthy production setup could still report a probe failure
> - The plugin worker manager wasn't surfacing terminated workers
cleanly, leaving the runtime probe waiting on a dead worker until the
request timed out
> - This pull request routes the Codex test probe through
`prepareAdapterExecutionTargetRuntime` (so it sees the same managed
Codex home production sees), exposes `commandCwd` on
`createCommandManagedRuntimeClient` so callers can target a per-probe
directory without leaking the workspace `remoteCwd`, and propagates
plugin-worker termination as a usable error instead of a hang
> - The benefit is the Codex Test action mirrors production behavior
end-to-end, and probes against a terminated plugin worker fail fast
instead of timing out
## What Changed
- `packages/adapter-utils/src/command-managed-runtime.ts`: rename the
`remoteCwd` knob to `commandCwd` so callers can target a per-probe
directory without inheriting the workspace cwd; matching test coverage
in `command-managed-runtime.test.ts`
- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
small fixes to keep callback bridge stop semantics deterministic
- `packages/adapters/codex-local/src/server/test.ts`: thread the Codex
hello probe through `prepareAdapterExecutionTargetRuntime` +
`prepareManagedCodexHome` so the probe sees the same managed home
production sees; new `test.remote.test.ts` covers the remote probe path
- `packages/adapters/cursor-local/src/server/execute.ts`: small
probe-side cleanup that aligns with the new commandCwd contract
- `server/src/services/plugin-worker-manager.ts`: surface plugin-worker
termination as a structured error so callers fail fast; new
`plugin-worker-terminated.cjs` fixture and
`plugin-worker-manager.test.ts` cases pin the behavior
## Verification
- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project @paperclipai/server` —
1749/1750 passing (1 unrelated skip)
- `pnpm typecheck` clean
## Risks
Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming
on an internal helper used only by adapter test/execute paths in this
repo. The plugin-worker-terminated path was previously a hang; failing
fast may surface latent timeouts as explicit termination errors in
callers that already expected them.
## Model Used
Claude Opus 4.7 (1M context)
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
commandCwd, plugin-worker termination, and Codex remote test path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---
> **Stacked PR.** Sits on top of #5444 which adds the per-run runtime
API surface this PR builds on. Cumulative diff against `master` includes
that PR's content; the files touched by *this* PR's commit are listed
under "What Changed" above. Will rebase onto `master` and force-push
once #5444 merges.
|
||
|
|
e400315cbf |
Guard assigned backlog liveness (#5428)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The issue graph and liveness recovery system decide whether assigned work is executable or parked > - Assigned issues created without an explicit status could silently land in backlog, making parents look blocked with no productive wake path > - The server, shared validators, recovery analysis, and UI all need to agree on that execution semantic > - This pull request makes assigned issue creation default to `todo`, flags assigned backlog blockers, and surfaces the state in the board > - The benefit is that parked assigned work becomes intentional and visible instead of creating silent liveness stalls ## What Changed - Adds contract tests for assigned issue creation defaults. - Defaults assigned issue creation to `todo` when status is omitted while preserving explicit `backlog` parking. - Exposes `resolveCreateIssueStatusDefault` through shared validators. - Teaches liveness/blocker attention paths to distinguish assigned backlog blockers. - Adds UI notices, row/header badges, and issue detail safeguards for assigned backlog blockers. - Adds Storybook fixtures and execution-semantics documentation for the assigned-backlog behavior. ## Verification - `pnpm run preflight:workspace-links && pnpm exec vitest run packages/shared/src/validators/issue.test.ts server/src/__tests__/issue-assigned-backlog-contract-routes.test.ts server/src/__tests__/issue-blocker-attention.test.ts server/src/__tests__/issue-liveness.test.ts server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts ui/src/components/IssueAssignedBacklogNotice.test.tsx ui/src/components/IssueRow.test.tsx` — 50 passed, 23 skipped. - Skipped tests were embedded Postgres suites on this host with the repo skip message: `Postgres init script exited with code null. Please check the logs for extra info. The data directory might already exist.` - Pairwise merge check against the issue-controls PR branch completed without conflicts via `git merge --no-commit --no-ff` in a temporary worktree. - Screenshots for assigned-backlog UI states: [light](docs/pr-screenshots/pr-5428/assigned-backlog-light.png), [dark](docs/pr-screenshots/pr-5428/assigned-backlog-dark.png). - Follow-up checks: `pnpm --filter /ui typecheck`; `pnpm --filter /mcp-server build`; `pnpm --filter /mcp-server test`; `pnpm exec vitest run packages/shared/src/validators/issue.test.ts`; focused UI component tests. - Remote PR checks on head `6300b3c`: policy, verify, serialized server shards 1/4-4/4, Canary Dry Run, e2e, Greptile Review, and Snyk all passed. ## Risks - Medium: changes status defaulting for assigned issue creation when the caller omits status. Explicit `backlog` remains supported, and server/shared tests cover both paths. - Medium: liveness classification changes can affect blocker attention labels; focused service and UI tests cover the new assigned-backlog state. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled Paperclip heartbeat environment. Context window and internal reasoning mode are not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
772fc92619 |
Add issue controls and retry-now recovery (#5426)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Issue operators need clear controls for execution settings, model overrides, and recovery retries > - Existing issue properties hid useful adapter override state and did not expose a board-triggered retry for scheduled heartbeat recovery > - Scheduled retries also need to respect the same safety gates as normal execution instead of bypassing budget, review, pause, dependency, or terminal-state checks > - This pull request adds the issue property controls and retry-now surfaces together because they share the issue details/properties UI > - The benefit is that operators can inspect and adjust issue execution settings and safely trigger pending scheduled recovery without hidden control-plane behavior ## What Changed - Adds editable issue assignee model override controls in `IssueProperties`, with focused coverage. - Removes the stale workspace tasks link from issue properties. - Adds a scheduled retry `retry-now` backend path and shared response types. - Adds main-pane and properties-pane scheduled retry UI, backed by a shared `useRetryNowMutation` hook. - Adds suppression coverage for budget hard stops, review participant changes, subtree pause holds, unresolved blockers, terminal issues, and company scoping. - Updates the `IssueProperties` test harness with toast actions required by the retry-now hook. ## Verification - `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx ui/src/components/IssueScheduledRetryCard.test.tsx` — 31 passed. - `pnpm exec vitest run server/src/__tests__/issue-scheduled-retry-routes.test.ts` — exited 0, but this host skipped the embedded Postgres route tests with: `Postgres init script exited with code null. Please check the logs for extra info. The data directory might already exist.` - Pairwise merge check against the assigned-backlog PR branch completed without conflicts via `git merge --no-commit --no-ff` in a temporary worktree. ### Visual verification screenshots Storybook story: `Product/Issue Scheduled retry surfaces / ScheduledRetrySurfaces`.   ## Risks - Medium: this touches issue execution/retry behavior, so CI should run the embedded Postgres route tests on a host that can initialize Postgres. - Low-to-medium UI risk around duplicated retry-now entry points; both surfaces share one mutation hook to keep behavior consistent. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent, GPT-5 model family (`gpt-5`), tool-enabled Paperclip heartbeat environment. Context window and internal reasoning mode are not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
d0e9cc76f2 |
Show workspace changes and stale notices in issue threads (#5356)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The issue thread is the operator's durable audit trail for what changed and why > - Workspace changes and stale disposition notices need to be visible in that same timeline without noisy or misleading rendering > - The local branch already contained backend activity details, timeline conversion, and UI rendering work for those events > - This pull request isolates the issue-thread activity work into a standalone branch against `origin/master` > - The benefit is a focused audit-trail PR that can merge independently of the sidebar/operator UI polish branch ## What Changed - Adds readable workspace-change activity details to issue update activity events. - Surfaces workspace-change events in issue chat/timeline rendering. - Makes the existing issue comment migration idempotent. - Folds and renders stale disposition notices inline so they match activity-log styling and spacing. - Adds focused route, timeline, and issue-thread system notice coverage. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run server/src/__tests__/issue-activity-events-routes.test.ts ui/src/lib/issue-timeline-events.test.ts ui/src/components/IssueChatThreadSystemNotice.test.tsx` — 3 files passed, 22 tests passed. - Confirmed the PR changes 9 files and does not include `pnpm-lock.yaml` or `.github/workflows/*`. - `pnpm exec vitest run server/src/__tests__/issue-closed-workspace-routes.test.ts` — 1 file passed, 4 tests passed. - `pnpm exec vitest run server/src/__tests__/issue-activity-events-routes.test.ts ui/src/lib/issue-timeline-events.test.ts ui/src/components/IssueChatThreadSystemNotice.test.tsx server/src/services/recovery/successful-run-handoff.test.ts packages/shared/src/validators/issue.test.ts` — 5 files passed, 54 tests passed. - `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck`. - `pnpm --filter @paperclipai/ui typecheck` after adding the Storybook screenshot fixture. - Captured Storybook screenshots for the new UI rendering paths: - Collapsed stale notice + workspace-change row: `docs/pr-screenshots/pr-5356/issue-thread-notices-collapsed.png` - Expanded stale notice details: `docs/pr-screenshots/pr-5356/issue-thread-notices-expanded.png` ### Screenshots Collapsed stale notice with workspace-change row:  Expanded stale notice details:  ## Risks - Moderate risk: this touches issue activity serialization and issue-thread rendering, both of which are central operator surfaces. - Migration risk is low: the only migration change makes an existing migration idempotent. - No new migrations are introduced, so there is no cross-PR migration ordering requirement. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, shell/tool-use enabled, used to split the existing branch, verify the isolated PR branch, and create this PR. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
68f69975a4 |
Harden control-plane safety and issue identifiers (#5292)
## Thinking Path > - Paperclip relies on issue identifiers, execution policies, and agent heartbeat rules to keep autonomous work auditable. > - Safety checks need to reject ambiguous agent handoffs, and identifier parsing needs to support Cloud tenant prefixes. > - Agent instructions also need to make final-disposition rules explicit so work does not stall in vague states. > - This pull request isolates backend correctness and governance hardening from the UI and recovery-system-notice branches. > - The benefit is safer in-review transitions, better identifier compatibility, and clearer agent operating contracts. ## What Changed - Fixed run-aware confirmation ordering and interrupted-run state cleanup. - Added Cloud tenant identity bootstrap and alphanumeric issue identifier support across shared parsing and server routes. - Guarded agent-authored `in_review` updates unless a real review path exists. - Tightened heartbeat disposition instructions in adapter utilities/default AGENTS/Paperclip skill. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run packages/shared/src/issue-references.test.ts server/src/__tests__/issue-identifier-routes.test.ts server/src/__tests__/issue-execution-policy-routes.test.ts packages/adapter-utils/src/server-utils.test.ts` initially had the first execution-policy test hit Vitest's 5s timeout under the parallel bundle while the rest passed. - `pnpm exec vitest run server/src/__tests__/issue-execution-policy-routes.test.ts --testTimeout=20000` passed with 10/10 tests. - Follow-up: `pnpm run typecheck:build-gaps` passed. - Follow-up: `pnpm --filter @paperclipai/ui typecheck` passed. - Follow-up: `pnpm vitest run server/src/__tests__/issue-comment-reopen-routes.test.ts server/src/__tests__/company-portability.test.ts server/src/__tests__/costs-service.test.ts` passed. - Follow-up: `pnpm vitest run ui/src/context/LiveUpdatesProvider.test.ts ui/src/lib/issue-chat-messages.test.ts ui/src/lib/issue-reference.test.ts ui/src/lib/issue-timeline-events.test.ts` passed. ## Risks - Medium control-plane risk: in-review update validation changes agent behavior. The error message is explicit and tests cover allowed review paths. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
a1b30c9f35 |
Add planning mode for issue work (#5353)
## Thinking Path > - Paperclip is a control plane for autonomous AI companies. > - Issues are the core unit of work, and issue comments are how board users and agents coordinate execution. > - Some issue conversations need to produce plans and approvals instead of immediate implementation work. > - The existing issue contract did not distinguish standard execution comments from planning-oriented issue work. > - This pull request adds an issue work-mode contract and board UI affordances for standard vs planning mode. > - The benefit is that planning-mode issues can be created, displayed, discussed, and carried through agent heartbeat context without losing the normal issue workflow. ## What Changed - Added `standard` / `planning` issue work-mode contracts across DB, shared validators/types, server issue flows, plugin protocol, and adapter heartbeat payloads. - Added an idempotent `0081_optimal_dormammu` migration for `issues.work_mode`, ordered after current `public-gh/master` migrations. - Updated heartbeat/context summaries and issue-thread interaction behavior so planning work mode is preserved when creating suggested follow-up issues. - Added UI support for planning-mode issue creation, issue rows, detail composer styling, and composer work-mode toggles. - Added focused server/shared/UI tests plus a Playwright visual verification spec for planning-mode surfaces. - Rebased the branch onto current `public-gh/master` and added durable planning-mode screenshots under `doc/assets/pap-3368/`. ## Verification - `pnpm --filter @paperclipai/db run check:migrations` - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/issue.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-context-summary.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/__tests__/issues-goal-context-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueChatThread.test.tsx ui/src/components/NewIssueDialog.test.tsx ui/src/components/IssueRow.test.tsx ui/src/pages/IssueDetail.test.tsx` - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `PAPERCLIP_E2E_SKIP_LLM=true npx playwright test --config tests/e2e/playwright.config.ts tests/e2e/planning-mode-visual-verification.spec.ts` ## Screenshots Desktop planning detail:  Desktop planning row:  Desktop staged standard toggle:  Mobile planning detail:  Mobile planning row:  ## Risks - Medium migration risk: this adds a non-null issue column. The migration uses `ADD COLUMN IF NOT EXISTS` so installations that applied an older branch-local migration number can still apply the final numbered migration safely. - Medium contract risk: issue payloads, plugin payloads, and adapter heartbeat payloads now include work mode; compatibility is handled by defaulting missing values to `standard`. - UI risk is moderate because composer controls changed; focused component tests and visual e2e coverage exercise standard vs planning display and toggle behavior. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent in a local Paperclip worktree, with shell/tool use. Exact context-window size is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
320fd5d23b |
Add full company search page (#5293)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators need to find work, documents, agents, projects, comments, and activity across a company without jumping through separate surfaces. > - The existing Command-K flow was useful for fast navigation but not enough for deeper company-wide discovery. > - Search also needs company-scoped backend contracts, query cost controls, and indexed document matching so it stays safe as company data grows. > - This pull request adds a full company search API and a dedicated board search page that Command-K can hand off to. > - The benefit is a single searchable control-plane surface with richer result context, recents, highlights, and test coverage across server and UI behavior. ## What Changed - Added a company-scoped search endpoint/service with query validation, rate limiting, text matching, fuzzy title matching, and result typing shared through `@paperclipai/shared`. - Added idempotent search migrations for document search indexes and fuzzy matching support. - Added the full `/companies/:companyKey/search` UI, search result row components, highlighted snippets, recent searches, and sidebar/Command-K handoff. - Added Storybook coverage for search surfaces and Vitest coverage for server search behavior, rate limiting, route generation, Command-K behavior, and the search page. - Addressed Greptile findings by renaming the no-match SQL helper, applying search pagination after cross-type merge sorting, and lazy-initializing the default search service so unrelated route-test mocks do not need to know about it. - Merged current `public-gh/master` and renumbered the search migrations behind upstream `0078_white_darwin`: search indexes are now `0079_company_search_document_indexes` and fuzzy matching is `0080_company_search_fuzzystrmatch`. ## Verification - `git fetch public-gh master` - `git diff --check public-gh/master...HEAD` - `git diff --name-only public-gh/master...HEAD | rg '^pnpm-lock\.yaml$' || true` produced no output before opening the PR. - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts ui/src/pages/Search.test.tsx ui/src/components/CommandPalette.test.tsx ui/src/lib/company-routes.test.ts` passed: 5 files, 25 tests. - `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck` passed. - `pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm --filter @paperclipai/server typecheck` passed after Greptile pagination fixes. - `pnpm exec vitest run server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts server/src/__tests__/company-search-service.test.ts && pnpm --filter @paperclipai/server typecheck` passed after the CI mock fix. - After resolving the migration conflict with current `public-gh/master`: `pnpm --filter @paperclipai/db typecheck && pnpm exec vitest run server/src/__tests__/company-search-service.test.ts server/src/__tests__/company-search-rate-limit-routes.test.ts && pnpm --filter @paperclipai/server typecheck` passed. - DB migration numbering check passed as part of `@paperclipai/db` typecheck. - UI states are covered by the added Storybook stories in `ui/storybook/stories/search.stories.tsx`. - GitHub reports the PR merge state as `CLEAN` on head `18e54fa8`. - GitHub PR checks are green on head `18e54fa8`: policy, verify, serialized server shards 1/4 through 4/4, e2e, canary dry run, Snyk, and Greptile Review. ## Risks - Search ranking and snippets are new user-facing behavior, so reviewers should check whether result ordering feels right on real company data. - Search touches broad company data, so company scoping and query cost/rate-limit behavior should be reviewed carefully. - The migrations add search indexes/extensions; they are idempotent with `IF NOT EXISTS` for users who may have applied an earlier branch migration number. > ROADMAP.md checked. This PR adds a focused board search surface and does not duplicate an open roadmap item. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub CLI session with medium reasoning effort. Existing branch commits were produced across prior agent sessions; this packaging pass verified, opened the PR, addressed Greptile findings, resolved migration conflicts after upstream PRs landed, and got PR checks green. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
424e81d087 |
Improve operator workflow QoL (#5291)
## Thinking Path > - Paperclip is a control plane operators use repeatedly to supervise agent companies. > - Common operator workflows depend on fast scanning of inboxes, issue sidebars, workspaces, cost totals, and runtime services. > - Several small UI and service gaps made those workflows slower or less clear. > - This pull request groups the operator-facing QoL changes that can stand alone from recovery and adapter work. > - The benefit is a denser, clearer board experience for issue triage and workspace operation. ## What Changed - Added inbox assignee/project grouping and issue list token/runtime totals. - Improved issue properties with removable blocker chips and workspace task links. - Improved execution workspace layout, runtime controls, issues tab default, and stopped-port reuse behavior. - Added mobile markdown/routine dialog fixes, page title company names, sidebar polish, and dashboard run task label cleanup. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run ui/src/lib/inbox.test.ts ui/src/components/IssueProperties.test.tsx ui/src/components/WorkspaceRuntimeControls.test.tsx server/src/__tests__/workspace-runtime.test.ts server/src/__tests__/costs-service.test.ts` ## Risks - Medium UI risk because this touches several operator surfaces. The branch is intentionally grouped around workflow/QoL files and keeps the file count below the Greptile limit. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
11ffd6f2c5 |
Improve ACPX adapter configuration (#5290)
## Thinking Path > - Paperclip orchestrates AI agents across several adapter implementations. > - ACPX is a local adapter path that can proxy Claude and Codex-style execution. > - Its configuration needed stronger schema defaults, provider-aware model handling, and better UI support. > - Plugin authors also need clear docs for managed resources. > - This pull request improves ACPX adapter configuration and documents plugin-managed resources. > - The benefit is a more predictable adapter setup path without changing unrelated control-plane behavior. ## What Changed - Improved ACPX config schema, execution config handling, UI build config, and route coverage. - Added ACPX model filtering support and tests. - Updated the agent config form and storybook coverage for ACPX model/provider behavior. - Expanded plugin authoring documentation for managed resources. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run server/src/__tests__/acpx-local-execute.test.ts server/src/__tests__/adapter-routes.test.ts ui/src/lib/acpx-model-filter.test.ts` ## Risks - Low-to-medium risk: adapter configuration behavior changes can affect ACPX users, but the change is isolated to ACPX/plugin-doc surfaces and covered by targeted adapter tests. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
454edfe81e |
Add recovery handoff system notices (#5289)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Agent runs can end productively while the source issue still lacks a durable final disposition. > - That leaves the control plane unsure whether to resume, escalate, or close the work. > - Issue comments also need a presentation contract so system-authored recovery notices can render as first-class thread messages without overloading normal comments. > - This pull request adds successful-run handoff recovery, comment presentation metadata, and system notice rendering. > - The benefit is stricter task liveness with clearer operator-facing recovery state. ## What Changed - Added successful-run handoff decisions, wake payloads, escalation behavior, and recovery tests. - Added issue comment presentation metadata with migration `0078_white_darwin.sql` and shared/server/company portability support. - Rendered recovery/system notices in issue chat with dedicated UI components, fixtures, tests, and storybook/lab coverage. - Included the current recovery model-profile hint patch so automatic recovery follow-ups use the cheap profile. ## Verification - `pnpm install --frozen-lockfile` - `pnpm exec vitest run server/src/services/recovery/successful-run-handoff.test.ts ui/src/components/SystemNotice.test.tsx ui/src/lib/system-notice-comment.test.ts ui/src/components/IssueChatThreadSystemNotice.test.tsx` ## Risks - Migration-bearing PR: merge this before any other branch that might later add a migration. - The branch touches both recovery services and issue-thread rendering, so review should pay attention to recovery wake idempotency and comment metadata compatibility. ## Model Used - OpenAI GPT-5 Codex via Paperclip `codex_local` adapter, with shell/git/GitHub CLI tool use. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
83e7ecc58e |
Preserve scope on manual heartbeat invokes (#5323)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The agent live-run route lets operators trigger a manual heartbeat invocation so an agent can pick up a specific issue or step out of band > - The current route flow drops the caller's scope (issue/run context) when forwarding the manual invoke into the heartbeat service, so the resulting run loses the targeting the operator specified > - This pull request threads the operator-supplied scope through the manual invoke path on both the server route and the UI client, with a regression test that confirms the scope round-trips > - The benefit is manual heartbeat invokes from the live-run UI actually pick up the scoped issue/run instead of falling through to the agent's default routine ## What Changed - `server/src/routes/agents.ts`: forward the operator-supplied scope into the manual invoke heartbeat service call - `server/src/__tests__/agent-live-run-routes.test.ts`: new test verifying the manual invoke path preserves scope - `ui/src/api/agents.ts`: pass scope through the live-run client API ## Verification - `pnpm vitest run --no-coverage server/src/__tests__/agent-live-run-routes.test.ts` - `pnpm typecheck` clean ## Risks Low. The change is purely additive on the route surface — handlers that did not previously pass scope continue to work; handlers that did pass it now have it preserved instead of dropped. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new test covers the preserved-scope path - [x] If this change affects the UI, I have included before/after screenshots — N/A (internal API change, no visible UI shift) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
d6d7a7cea6 |
Add routine revision history and restore flow (#5285)
## Thinking Path > - Paperclip is the control plane for autonomous AI companies. > - Routines are the scheduled/recurring work surface that keeps a company operating without manual kicks. > - Operators need routine edits to be auditable and recoverable, especially when routines control assignments, prompts, triggers, and webhook secrets. > - Documents already have revision-style safety, but routines did not have equivalent history or restore semantics. > - This pull request adds append-only routine revisions across the database, shared contracts, server routes, and board UI. > - The benefit is safer routine iteration: users can inspect history, compare changes, restore older definitions, and avoid overwriting newer edits. ## What Changed - Added `routine_revisions` storage, latest revision pointers on routines, shared types, validators, and API docs for routine revision history. - Added server service/route support for listing routine revisions, conflict-aware routine saves, and append-only restore operations. - Added a History tab on routine detail with revision preview, structured change summaries, description line diffs, dirty-edit blocking, restore confirmation, and restored webhook secret surfacing. - Extracted the line diff helper from `DocumentDiffModal` into `ui/src/lib/line-diff.ts` for reuse. - Rebased the branch onto current `public-gh/master` and renumbered the routine revision migration to `0077_unusual_karnak` after upstream `0076_useful_elektra`. - Made the `0077` routine revision migration idempotent so installs that already applied the branch-local `0076_unusual_karnak` can safely advance. - Updated the plugin SDK test harness routine fixture with the new revision fields required by the shared `Routine` contract. ## Verification - `pnpm --filter @paperclipai/db run check:migrations` passed. - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/routine.test.ts` passed. - `pnpm exec vitest run --project @paperclipai/ui ui/src/lib/line-diff.test.ts ui/src/components/RoutineHistoryTab.test.tsx ui/src/lib/workspace-routines.test.ts ui/src/pages/Routines.test.tsx` passed. - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/routines-service.test.ts --pool=forks --poolOptions.forks.isolate=true` passed. - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/routines-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` passed. - `pnpm --filter @paperclipai/plugin-sdk typecheck` passed after updating the SDK test harness fixture. - `pnpm --filter @paperclipai/plugin-sdk build` passed; this refreshed local generated SDK output needed by plugin example typechecks. - `pnpm -r typecheck` passed. ## Risks - Medium migration risk: this adds routine revision storage and backfills existing routines. The migration is ordered after upstream `0076` and uses `IF NOT EXISTS` / duplicate-object guards to tolerate earlier branch-local migration application. - Restore behavior intentionally appends a new revision instead of mutating history; callers expecting an in-place rollback need to follow the new latest revision pointer. - Restoring webhook triggers recreates webhook secret material, so users must copy newly surfaced secrets after restore. - Conflict-aware saves now reject stale routine edits when the client sends an older `baseRevisionId`. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, with shell/tool use in a local git worktree. Exact context-window size is not exposed in this runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Screenshots: not attached in this draft PR; the new UI flow is covered by component tests listed above. --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
5c2f9aba9d |
Run explicit-environment adapter tests on the requested target instead of falling back to the host (#5277)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When a user clicks "Test" on a configured environment (SSH or sandbox), the agent-test route exercises the adapter against that target > - The route previously fell back to running the probe on the Paperclip host whenever an explicit environment target couldn't be resolved, with the test report still saying "passed" > - That hid two real failure modes: misconfigured environments looked green, and sandbox environments were never actually exercised > - This pull request acquires an ad-hoc lease and realizes a workspace for sandbox/plugin test environments, resolves a sandbox execution target wired to the environment runtime, and returns synthesized diagnostics instead of running a host probe when an explicit env target can't be resolved > - The benefit is the Test action surfaces the real environment state and never silently exercises the wrong machine ## What Changed - `server/routes/agents.ts`: acquire an ad-hoc lease and realize a workspace for sandbox/plugin test environments; resolve a sandbox execution target wired to the environment runtime - Return synthesized diagnostics (no host fallback) when an explicit env target can't be resolved - `server/services/environment-runtime.ts`: small adjustments to support the explicit-env-target case - Clarify test-route messages so they no longer claim a host fallback in explicit env flows - New `agent-test-environment-routes.test.ts` covers the guard and missing-environment path ## Verification - `pnpm vitest run --no-coverage server/src/__tests__/agent-test-environment-routes.test.ts` - `pnpm typecheck` clean - Manual: a deliberately misconfigured sandbox environment now reports diagnostics instead of a misleading host-pass ## Risks Medium — Test route behavior change. Explicit environments that previously appeared to pass via host fallback will now report their real state. This is the desired behavior, but operators should expect to see new failures for environments that were never actually working. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable — new tests cover guard + missing-env paths - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
ea7f53fd7d |
Handle Gemini CLI v0.38 stream-json wire format across parser, UI, and CLI formatter (#5273)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Each agent uses an adapter that drives a CLI (Claude, Gemini, Codex, etc.) > - The Gemini adapter parses a JSONL transcript stream the CLI emits to learn what the model said > - Gemini CLI v0.38 changed the transcript shape: assistant text now comes through `type=message` with `role`/`content` and terminal status comes through `type=status` / `type=stats` > - The existing parser was written against the older `type=assistant` / `type=result` shape, so post-v0.38 outputs left the parsed summary empty and downgraded the SSH hello probe to "unexpected output" > - This pull request updates every Gemini consumer (server parser, UI parser, CLI formatter) to accept the v0.38 shape while keeping the legacy shape working > - The benefit is the Gemini adapter handles current upstream output without losing backward compatibility, with explicit test coverage for both shapes ## What Changed - `packages/adapters/gemini-local/src/server/parse.ts` recognizes `type=message` events with role/content and stops downgrading them - `packages/adapters/gemini-local/src/ui/parse-stdout.ts` mirrors the parser changes for the live UI transcript - `packages/adapters/gemini-local/src/cli/format-event.ts` formats the new event shape correctly for CLI output - `parse.test.ts` and `parse-stdout.test.ts` add v0.38 coverage; `gemini-local-adapter.test.ts` and `execute.remote.test.ts` switch happy-path fixtures to the current real wire format and keep dedicated tests for the older schema ## Verification - `pnpm vitest run --no-coverage --project @paperclipai/adapter-gemini-local` — full suite passes including new v0.38 cases and preserved legacy cases - `pnpm typecheck` clean ## Risks Low risk — additive event handling. Legacy event shape path is preserved with its own tests, so existing fixtures continue to parse identically. ## Model Used Claude Opus 4.7 (1M context) ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots — N/A (no UI) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
3c73ed26b5 |
Expand plugin host surface (#5205)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The plugin system is the extension boundary for optional product capabilities > - Rich plugins need more than a worker entrypoint: they need scoped database storage, local project folders, managed agents/routines, host navigation, and reusable UI components > - The LLM Wiki work exposed those missing host surfaces while keeping plugin code outside the core control plane > - This pull request expands the core plugin host, SDK, server APIs, and UI bridge so plugins can declare and use those surfaces > - The benefit is that future plugins can integrate with Paperclip through documented, validated contracts instead of bespoke server or UI imports ## What Changed - Added plugin-managed database namespaces and migration tracking, including Drizzle schema/migration files and SQL validation for namespace isolation. - Added server support for plugin local folders, managed agents, managed routines, scoped plugin APIs, and plugin operation visibility. - Expanded shared plugin manifest/types/validators and SDK host/testing/UI exports for richer plugin surfaces. - Added reusable UI pieces for file trees, managed routines, resizable sidebars, route sidebars, and plugin bridge initialization. - Updated plugin docs and example plugins to use the expanded host and SDK surface. ## Verification - `pnpm install --frozen-lockfile` - `pnpm run preflight:workspace-links && pnpm exec vitest run packages/shared/src/validators/plugin.test.ts server/src/__tests__/plugin-database.test.ts server/src/__tests__/plugin-local-folders.test.ts server/src/__tests__/plugin-managed-agents.test.ts server/src/__tests__/plugin-managed-routines.test.ts server/src/__tests__/plugin-orchestration-apis.test.ts ui/src/api/plugins.test.ts ui/src/components/FileTree.test.tsx ui/src/components/ResizableSidebarPane.test.tsx ui/src/pages/PluginPage.test.tsx ui/src/plugins/bridge.test.ts` passed: 11 files, 67 tests. - Confirmed this PR changes 89 files and does not include `pnpm-lock.yaml` or `.github/workflows/*`. ## Risks - Medium: this expands plugin host contracts across db/shared/server/ui and includes a new core migration (`0076_useful_elektra.sql`). - The plugin database namespace validator is intentionally restrictive; plugin authors may need follow-up affordances for SQL patterns that remain blocked. - Merge this before the LLM Wiki plugin PR so the plugin can resolve the new SDK and host APIs. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub workflow. Context window size was not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
d6bee62f02 |
Fix Cloud tenant issue identifier routes (#5196)
## Summary - Allow Cloud tenant issue identifiers with alphanumeric prefixes, such as `PC1897-1`, to normalize as issue references. - Resolve those identifiers through issue detail/update routes, active run/live run polling, activity, costs, and `issueService.getById`. - Keep UI issue-link parsing aligned so tenant links normalize back to `/issues/<IDENTIFIER>`. ## Root Cause Cloud tenant issue prefixes include digits from the stack-id hash. The app-side route normalization still accepted only all-letter prefixes, so `/api/issues/PC1897-1` skipped identifier lookup and fell through as a non-UUID id. ## Verification - `pnpm exec vitest run packages/shared/src/issue-references.test.ts ui/src/lib/issue-reference.test.ts server/src/__tests__/issue-identifier-routes.test.ts server/src/__tests__/activity-routes.test.ts server/src/__tests__/costs-service.test.ts server/src/__tests__/agent-live-run-routes.test.ts server/src/__tests__/issues-service.test.ts` - `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/server typecheck` - `git diff --check` Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
ae23e02526 |
Support Cloud tenant identity bootstrap
Co-Authored-By: Paperclip <noreply@paperclip.ing> |
||
|
|
29401b231b |
fix(ci): gate new release packages on npm bootstrap (#5146)
## Thinking Path > - Paperclip is a control plane for autonomous agent companies, so its release automation is part of the core operator trust boundary. > - The affected subsystem is npm/GitHub Actions release publishing for the public monorepo packages. > - The concrete failure was that a newly added package reached `master`, the canary workflow attempted its first publish, and npm trusted publishing was not yet bootstrapped for that package. > - That means the problem is not just one broken run; it is a missing pre-merge guard that lets release-ineligible packages land and only fail once `publish_canary` runs. > - This pull request makes release enrollment explicit, validates that enrollment in CI, and adds a PR-time bootstrap check against npm for changed release-enabled package manifests. > - The result is that we keep trusted publishing, avoid teaching CI to `npm adduser`, and move this class of failure from post-merge canary time to pre-merge review time. ## What Changed - Added `scripts/release-package-manifest.json` so release-managed public packages are explicitly enrolled instead of being inferred from every non-private workspace package. - Hardened `scripts/release-package-map.mjs` to validate the manifest before release workflows rewrite versions or assemble publish payloads. - Added `scripts/check-release-package-bootstrap.mjs` and wired it into `.github/workflows/pr.yml` so PRs that change a release-enabled package manifest fail if that package does not already exist on npm. - Added release-package manifest coverage tests to `scripts/release-package-map.test.mjs` and included them in `pnpm run test:release-registry`. - Wired manifest validation into `.github/workflows/release.yml` and documented the first-publish bootstrap policy in `doc/PUBLISHING.md` and `doc/RELEASE-AUTOMATION-SETUP.md`. ## Verification - `pnpm run test:release-registry` - `./scripts/release.sh canary --skip-verify --dry-run` - Confirmed the committed diff contains no obvious PII/secrets via targeted pattern scan before pushing. ## Risks - Low risk overall: this is CI/release-policy code, not product runtime logic. - The new PR bootstrap check depends on npm metadata availability, so a transient npm outage could block a PR that changes a release-enabled package manifest. - The manifest introduces a new source of truth that must stay aligned with public package additions, but that is intentional and now enforced. ## Model Used - OpenAI Codex via the `codex_local` Paperclip adapter; GPT-5-based coding agent with tool use, terminal execution, git, and GitHub CLI. Exact served model ID/context window are not exposed by the local runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
90631b09b3 |
Let adapters declare runtime command spec for remote provisioning (#5141)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies, running
adapter
> commands like `claude`, `codex`, `pi` either locally or on remote
runtimes
> (SSH hosts, sandboxes, etc.)
> - On a fresh remote runtime — particularly an ephemeral sandbox — the
> adapter's CLI may not be installed yet. Today operators handle this
via
> external configuration (e.g. a project-level `provisionCommand` shell
> script) that has to know about every adapter the operator might want
to use
> - This means every adapter has its own well-known npm package, but
operators
> end up writing duplicate provision shell scripts that paste together
> `npm install -g @anthropic-ai/claude-code`, `npm install -g
@openai/codex`,
> etc. — knowledge the adapter itself already has
> - This PR moves that knowledge into the adapter modules: each adapter
declares
> how its runtime command should be detected and (if applicable)
installed
> via `getRuntimeCommandSpec(config)`. The execution path runs the
adapter's
> own install command on remote sandbox targets before launching, so a
fresh
> sandbox bootstraps itself instead of requiring a hand-written
provision script
> - The benefit is fewer footguns for operators provisioning remote
runtimes,
> and a clean place for new adapters to plug in their install recipe
## What Changed
- New types in `packages/adapter-utils/src/types.ts`:
- `AdapterRuntimeCommandSpec` describing `command`, optional
`detectCommand`, and optional `installCommand`
- Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule`
- Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters
receive the resolved spec at execute time
- New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)`
in
`packages/adapter-utils/src/execution-target.ts` that runs the install
command
on remote targets when `transport === "sandbox"`. SSH and local targets
are
no-ops. Throws on timeout or non-zero exit so failures surface early.
- Each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`,
`opencode-local`, `pi-local`'s `execute.ts` now reads
`ctx.runtimeCommandSpec?.installCommand` and calls the helper before
launching
the adapter command.
- `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for
each
adapter:
- claude/codex/gemini/opencode/pi-local: `npm install -g <package>`
recipe via
a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard
that
only auto-installs when the configured `command` matches the well-known
fallback (custom binaries are left alone).
- cursor-local: declares `command` only; no auto-install (no public npm
package), preserving the existing manual setup.
- `server/src/services/heartbeat.ts` resolves the spec via
`adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through
to
`AdapterExecutionContext`.
- Tests added in `execution-target.test.ts` (~75 lines), e2b
`plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts`
(~76 lines).
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-run-orchestrator`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: run an adapter (claude/codex/etc.) against a fresh
sandbox-backed
environment that does NOT have the adapter CLI pre-installed. Confirm
the
install runs once at the start of the agent run and the adapter then
launches
successfully. Re-run on the same sandbox; confirm the install command is
idempotent and the second run starts faster.
- Confirm SSH and local execution paths are unaffected (gated by
`transport === "sandbox"`).
## Risks
- Behavioural shift on sandbox runs: a new install step now runs at the
start
of every sandbox agent run for adapters with `installCommand` set. The
install commands are idempotent (`if ! command -v X >/dev/null 2>&1;
then
npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold
sandbox, the first run takes longer.
- Operators who used the legacy project-level `provisionCommand` to
install
adapter CLIs can drop that part of their script; the adapter handles it
now.
Existing scripts continue to work — installs are idempotent.
- The cursor-local adapter has no auto-install (no public npm package).
Behaviour for cursor-local on sandboxes is unchanged.
- New optional surface on `ServerAdapterModule`. Plugins that don't
implement
`getRuntimeCommandSpec` retain previous behaviour (no auto-install).
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
|
||
|
|
0e51fa2b0d |
Honor reuse-existing preference and assignee default environment in issue runs (#5139)
## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents run inside execution workspaces (a per-issue cwd + env), and an issue > can prefer to reuse an existing workspace or get a fresh one each time > - The heartbeat service was reading the existing workspace's config to derive > environment selection regardless of whether the issue actually wanted to reuse > it. So fresh-run issues were inheriting stale config from a workspace that was > about to be discarded > - Separately, when an issue is assigned to an agent, the issue's execution > workspace settings weren't picking up the agent's `defaultEnvironmentId`, > even though the agent's choice is the natural default for that issue > - This PR makes both selection paths honor the obvious source of truth: > workspace config flows only when the issue actually wants `reuse_existing`, > and the assignee agent's default environment is applied at assignment time if > nothing else is set on the issue > - The benefit is that re-running a flaky issue picks up the right environment > instead of inheriting the previous run's config, and assigning an agent to an > issue does the obvious thing without operator intervention ## What Changed - `server/src/services/heartbeat.ts`: introduce `reusableExecutionWorkspaceConfig` that is non-null only when `shouldReuseExisting` is true. Both `resolveExecutionWorkspaceEnvironmentId(...)` and `applyPersistedExecutionWorkspaceConfig(...)` now read from it instead of unconditionally consulting `existingExecutionWorkspace?.config`. Fresh-run issues no longer inherit stale environment config from an in-flight workspace about to be discarded. - `server/src/services/issues.ts`: when an issue update sets a new `assigneeAgentId` and isolated workspaces are enabled, populate `executionWorkspaceSettings.environmentId` from the assignee agent's `defaultEnvironmentId` if the issue doesn't have an explicit `environmentId` set yet. - Tests added in `heartbeat-plugin-environment.test.ts` (~216 lines) and `issues-service.test.ts` (~85 lines) covering both paths. ## Verification - `pnpm --filter @paperclipai/server test -- heartbeat-plugin-environment issues-service` - Manual QA: assign an issue to an agent that has a non-default `defaultEnvironmentId`, confirm the issue's workspace settings now include that environment id without operator intervention. Trigger a rerun on an issue whose existing workspace points at a stale environment, confirm the rerun uses the freshly-resolved environment. ## Risks - Behavioural shift on assignment: previously assigning an agent didn't propagate the agent's default environment to the issue. Now it does. Callers that explicitly want the issue to keep its existing/null environment must set `executionWorkspaceSettings.environmentId` themselves; the new logic only fires when no explicit value is set. - Behavioural shift on rerun: stale workspace config is no longer applied to fresh runs. Operators who relied on this implicit inheritance may see different environment selection on the first rerun after deploy. Mitigation: the explicit isssue settings and project policy are still honored as before. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A (no UI changes) - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
bb7d040894 |
Switch OpenCode to explicit static/local-aware model selection (#5117)
> **Stacked PR (part 4 of 7).** Depends on: - PR #5114 - PR #5115 - PR #5116 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - When creating an OpenCode-local agent, Paperclip currently validates > `adapterConfig.model` against the *Paperclip host's* `opencode models` output > - SSH testing surfaced that this blocks creating an OpenCode agent for an SSH > environment: the model that exists on the SSH target isn't visible to the > host, so creation fails with "OpenCode requires `adapterConfig.model` in > provider/model format" even when the operator picked a real remote model > - The initial direction was environment-aware model discovery; the final > decision was to keep OpenCode on the same explicit-model pattern as other > adapters (default + curated list + manual override) and stop blocking > creation on host-side discovery > - This PR does both: the adapter-models endpoint now accepts `environmentId` and > probes against the target environment, and the create-time hard gate is > replaced by `requireOpenCodeModelId` which validates `provider/model` *format* > without requiring host-local discovery. Test/run-time still surfaces real > auth/availability problems > - The benefit is that operators can create OpenCode agents for remote > environments without out-of-band setup, and the model picker in the UI > reflects the actually-targeted environment ## What Changed - Added `requireOpenCodeModelId(input)` in `opencode-local/src/server/models.ts`, exported it from the adapter index - `ensureOpenCodeModelConfiguredAndAvailable` now delegates the format check to `requireOpenCodeModelId` - `agentsApi.adapterModels(companyId, adapterType, { environmentId })` now accepts an environment ID and passes it as a query parameter - `queryKeys.agents.adapterModels` now keys on `(companyId, adapterType, environmentId)` - `server/src/routes/agents.ts` reads and validates the new query parameter, forwarding it to the adapter's model probe - `AgentConfigForm.tsx` and `OnboardingWizard.tsx` build the model query key from the currently selected default environment ID and disable autodetect for `opencode_local` (model selection is explicit) - `NewAgent.tsx` simplified — no longer special-cases OpenCode autodetect - `company-portability.ts` no longer needs OpenCode-specific autodetect handling - Tests added/updated: `adapter-model-refresh-routes.test.ts`, `adapter-models.test.ts`, `agent-permissions-routes.test.ts`, `opencode-local/src/server/models.test.ts` ## Verification - `pnpm --filter @paperclipai/server test -- adapter-models adapter-model-refresh agent-permissions` - `pnpm --filter @paperclipai/adapter-opencode-local test` - `pnpm --filter @paperclipai/ui test -- AgentConfigForm OnboardingWizard NewAgent` - Manual QA in browser: 1. Boot Paperclip on Tailscale-bound port (so it's reachable from another machine), create an OpenCode-local agent, switch the default environment between two installed sandboxes, and confirm the model list refreshes per-environment 2. Submit with a malformed `provider/model` string and verify the new `requireOpenCodeModelId` error surfaces - Before/after screenshots attached for `AgentConfigForm` model picker ## Risks - Behavioural shift: switching default environment now triggers a model refetch. Should be cheap but introduces a new UI loading state for OpenCode users. - Removing dynamic autodetect for OpenCode: if any user configured an agent without specifying `model` and relied on autodetect populating it, that agent will now fail at submit time. Mitigation: validation error is explicit and actionable. - New query string parameter on `/api/companies/:id/adapter-models` — older clients that omit it still work (parameter is optional and defaults to null). ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
076067865f |
Migrate SSH environment callback to bridge (#5116)
> **Stacked PR (part 3 of 7).** Depends on: - PR #5114 - PR #5115 > Diff against `master` includes commits from earlier PRs in the stack — the new commit in this PR is the topmost one. ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents executing on a remote SSH-backed environment need a way to call back into > the Paperclip control plane (run events, log streaming, signals) > - When the SSH host can't reach the Paperclip host (NAT, firewalls, or simply not > on the same network), the run silently fails or hangs — a recurring class of > failure during SSH testing > - In sandboxed environments we already solved this with a callback bridge that > tunnels back through the existing connection; SSH was the odd one out > - This PR migrates SSH execution to use the same callback bridge, so every > adapter's remote run uses one consistent reverse-channel. Per-adapter SSH glue > is deleted in favour of a shared `CommandManagedRuntimeRunner` built from the > SSH spec > - The benefit is fewer SSH-specific failure modes, a smaller code surface, and > one place to evolve the callback contract going forward ## What Changed - Added `createSshCommandManagedRuntimeRunner` in `packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a generic command-managed-runtime runner (with cwd, env, and timeout handling) - Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge URL now flows through the shared runner - Reworked `execution-target.ts` to use the SSH runner alongside sandbox runners via a unified `CommandManagedRuntimeRunner` interface - Simplified `remote-managed-runtime.ts` and `sandbox-managed-runtime.ts` to consume the shared runner abstraction - Deleted per-adapter SSH callback wiring from claude-local, codex-local, cursor-local, gemini-local, opencode-local, pi-local execute.ts files - Removed `environment-runtime-driver-contract.test.ts` (the contract is now enforced by `environment-execution-target.test.ts`) - Added/updated `execute.remote.test.ts` cases for each adapter to cover the SSH runner path ## Verification - `pnpm --filter @paperclipai/adapter-utils test` - `pnpm test -- execute.remote` (covers all six local adapters' SSH paths) - Manual QA: ran a claude-local agent against an SSH-backed environment, confirmed the agent successfully called back to `/api/agent-callback/*` endpoints during the run ## Risks - Refactor touches all six local adapters. If any adapter had subtle SSH-specific behaviour that wasn't captured in tests, it could regress. Mitigation: each adapter's `execute.remote.test.ts` was extended. - `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking type change for any internal consumer. Verified no external plugins consume this type. - The new `CommandManagedRuntimeRunner` shape is a public surface in `@paperclipai/adapter-utils`; downstream plugins implementing custom runners may need updates, but no such plugins exist in this repo. ## Model Used - OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI - Provider: OpenAI - Used to author the code changes in this PR ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots — N/A - [ ] I have updated relevant documentation to reflect my changes — N/A - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge |
||
|
|
a7b45938b7 |
Let sandbox providers declare shell defaults (#5114)
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
> providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
> provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
> provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
> helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
> preference
## What Changed
- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
`shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
(validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
`execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
`environment-execution-target.test.ts`
## Verification
- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
shell semantics flow through the callback bridge end-to-end)
## Risks
- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.
## Model Used
- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
|
||
|
|
15eac43b43 |
[codex] Retry max-turn exhausted heartbeats (#5096)
## Thinking Path > - Paperclip orchestrates AI agents for autonomous companies, and heartbeat execution is the control-plane loop that keeps assigned work moving. > - Max-turn exhaustion is a recoverable local-adapter stop condition for Claude and Gemini agents when a run needs another heartbeat to continue safely. > - The previous behavior could leave max-turn continuation details hard to inspect, and duplicate/stale continuation wakes could keep running after issue state changed. > - The adapter layer also needed to avoid trusting arbitrary stdout/stderr text as scheduler control metadata. > - This pull request adds bounded max-turn continuation scheduling, visible retry state, structured stop metadata handling, and stale/duplicate continuation guards. > - The benefit is safer automatic continuation after max-turn stops, clearer operator visibility, and fewer duplicate or stale agent runs. ## What Changed - Replaces closed PR #4952, whose head repository was deleted. - Rebases the recovered max-turn continuation branch onto current `paperclipai/paperclip:master`. - Adds max-turn continuation scheduling and retry-state plumbing for heartbeat runs. - Adds stale/duplicate continuation suppression when issue status, ownership, or execution locks change. - Normalizes Claude/Gemini max-turn detection around structured stop metadata instead of unstructured stdout/stderr text. - Surfaces max-turn continuation settings and retry visibility in the board UI. - Adds focused server, adapter, and UI tests for max-turn stop metadata, retry scheduling, stale queued-run invalidation, adapter parsing/execution, run ledger display, and agent config patching. ## Verification - `pnpm install --no-frozen-lockfile` to refresh local dependencies after rebasing onto current `master`. - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/claude-local-adapter.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/gemini-local-adapter.test.ts server/src/__tests__/gemini-local-execute.test.ts server/src/__tests__/heartbeat-retry-scheduling.test.ts server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts server/src/services/heartbeat-stop-metadata.test.ts ui/src/components/IssueRunLedger.test.tsx ui/src/lib/agent-config-patch.test.ts ui/src/lib/runRetryState.test.ts --testTimeout=20000` - `pnpm --filter @paperclipai/adapter-claude-local typecheck && pnpm --filter @paperclipai/adapter-gemini-local typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck` - UI screenshot note: the UI changes are limited to config/ledger state rendering rather than layout changes; component/unit coverage above verifies the rendered behavior. ## Risks - Medium behavior risk: heartbeat retry gating now suppresses max-turn continuations when issue state or execution locks drift, so any callers that relied on stale continuations running will now see cancellation instead. - Low adapter risk: Claude/Gemini unstructured text no longer triggers max-turn scheduler metadata, so only structured stop signals and Gemini exit code 53 are trusted. - No database migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent, GPT-5-class model, tool-enabled local repository editing and command execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (not applicable: state/default rendering only; covered by component/unit tests) - [x] I have updated relevant documentation to reflect my changes (not applicable: no user-facing command or docs contract changed) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
57229d0f24 |
[codex] Add issue monitor liveness controls (#4988)
## Thinking Path > - Paperclip is a control plane for autonomous AI companies where work must stay observable, governable, and recoverable. > - The task/heartbeat subsystem owns agent execution continuity, issue state transitions, and visible recovery behavior. > - Waiting on an external service is not the same as being blocked when the assignee still owns a future check. > - The gap was that agents had no first-class one-shot monitor state for external-service waits, so recovery could look stalled or require ad hoc comments. > - This pull request adds bounded issue monitors that can wake the owner, clear exhausted waits, and produce explicit recovery behavior. > - It also surfaces monitor status in the board UI and documents when to use monitors versus `blocked`. > - The benefit is clearer liveness semantics for asynchronous waits without weakening single-assignee task ownership. ## What Changed - Added issue monitor fields, shared types, validators, constants, and an idempotent `0075` migration for scheduled monitor state. - Added server-side monitor scheduling, dispatch, recovery bounds, activity logging, and external-ref redaction. - Added board/agent route coverage for monitor permissions and child monitor scheduling. - Added issue detail/property UI for monitor state, a monitor activity card, and Storybook stories for review surfaces. - Documented monitor semantics and recovery policy behavior in `doc/execution-semantics.md`. - Addressed Greptile review feedback by preserving monitor state in skipped-stage builders and making board monitor saves send `scheduledBy: "board"`. ## Verification - `pnpm install --frozen-lockfile` - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/issue-execution-policy-routes.test.ts server/src/__tests__/issue-execution-policy.test.ts server/src/__tests__/issue-monitor-scheduler.test.ts server/src/__tests__/recovery-classifiers.test.ts ui/src/components/IssueMonitorActivityCard.test.tsx ui/src/components/IssueProperties.test.tsx ui/src/lib/activity-format.test.ts` - First run passed 5 files and failed to collect 2 server suites because the worktree was missing the optional `acpx/runtime` dependency. - After `pnpm install --frozen-lockfile`, reran the 2 failed suites successfully. - `pnpm exec vitest run server/src/__tests__/issue-monitor-scheduler.test.ts server/src/__tests__/recovery-classifiers.test.ts` - `pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck` - `pnpm exec vitest run server/src/__tests__/issue-execution-policy.test.ts ui/src/components/IssueProperties.test.tsx` - `pnpm --filter @paperclipai/server typecheck && pnpm --filter @paperclipai/ui typecheck` - `pnpm exec vitest run ui/src/components/IssueMonitorActivityCard.test.tsx ui/src/components/IssueProperties.test.tsx` - `pnpm --filter @paperclipai/ui typecheck` - Storybook screenshot captured from `http://127.0.0.1:6006/iframe.html?viewMode=story&id=product-issue-monitor-surfaces--monitor-surfaces` with Playwright. ## Screenshots  ## Risks - Medium: this changes heartbeat recovery behavior for scheduled external-service waits, so regressions could affect wake timing or recovery issue creation. - Migration risk is reduced by using `IF NOT EXISTS` for the new issue monitor columns and index. - External monitor references are treated as secret-adjacent and are intentionally omitted from visible activity/wake payloads. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent with repository tool use and terminal execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots or Storybook review surfaces - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
2d72292ad6 |
[codex] Add workspace routine run tab (#4958)
## Thinking Path > - Paperclip orchestrates AI agents through reusable execution workspaces and routines > - Operators need a fast way to run workspace-aware routines against a specific execution workspace > - The existing workspace detail surface showed configuration, runtime logs, and linked issues, but not routines that depend on workspace variables > - Routine runs also needed to prefill the selected execution workspace so branch variables resolve correctly > - This pull request adds a workspace routines tab and prefilled routine-run dialog support > - The benefit is a tighter workflow for rerunning reviews, smoke checks, and other workspace-specific routines ## What Changed - Added an execution workspace `Routines` tab and company-prefixed routes. - Listed routines that declare or reference workspace-specific variables. - Added `Run now` support that preselects the current execution workspace in `RoutineRunVariablesDialog`. - Centralized reusable execution workspace ordering/deduplication for issue creation and workspace cards. - Added focused UI helper and dialog regression tests. ## Verification - `pnpm exec vitest run ui/src/lib/reusable-execution-workspaces.test.ts ui/src/lib/workspace-routines.test.ts ui/src/components/RoutineRunVariablesDialog.test.tsx ui/src/lib/company-routes.test.ts` - Screenshots were not captured in this PR split; the visible flow is covered by focused component/helper tests and should get browser QA in the follow-up issue. ## Risks - Medium risk: this adds a new workspace detail tab and routine-run path. It is isolated to workspace-scoped routines and uses existing routine run APIs. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool use and local command execution. Exact context window was not exposed in the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |
||
|
|
570a4206da |
[codex] Recover productive terminal continuations (#4956)
## Thinking Path > - Paperclip orchestrates AI agents through issue-scoped heartbeat runs > - Recovery logic decides whether in-progress work still has a live path after a terminal run > - A productive terminal continuation can still leave an issue stranded when no active run or wake remains > - Treating that state as healthy leaves work stuck despite evidence that more action is needed > - This pull request re-enqueues recovery for productive terminal continuations that left no live path > - The benefit is fewer silently stranded in-progress issues after agents make partial progress ## What Changed - Reclassified successful-but-productive terminal continuations as recoverable when no live path remains. - Enqueue a follow-up recovery wake with the original run id and continuation metadata. - Added regression tests covering productive terminal continuation recovery and advanced liveness handoff. ## Verification - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts server/src/__tests__/run-continuations.test.ts` ## Risks - Medium risk: recovery may schedule one more follow-up where Paperclip previously considered the work observed. The existing uniqueness, budget, and escalation checks still constrain retry loops. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool use and local command execution. Exact context window was not exposed in the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> |