paperclip

Author	SHA1	Message	Date
Dotta	fda296ee4f	[codex] Add configurable liveness auto-recovery controls (#4587 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Heartbeat liveness recovery decides when stalled issue trees need manager-visible follow-up. > - Automatic recovery issue creation is useful, but operators need instance-level controls for how aggressive it is. > - Without controls, recovery behavior is harder to tune for local development, production operations, and noisy edge cases. > - This pull request adds configurable liveness auto-recovery settings across shared contracts, API routes, services, and the instance experimental settings UI. > - The benefit is that operators can keep liveness findings advisory or enable bounded recovery automation with explicit intervals and lookback windows. ## What Changed - Added shared types and validators for liveness auto-recovery settings. - Extended instance settings routes and services to persist and validate the new controls. - Wired heartbeat/recovery services to honor enablement, minimum interval, and lookback settings. - Added UI controls for liveness recovery under instance experimental settings. - Covered the new server behavior with instance settings and liveness escalation tests. ## Verification - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts server/src/__tests__/instance-settings-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` - `pnpm --filter @paperclipai/shared typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` ## Risks - Moderate behavioral risk because recovery automation timing changes when enabled; defaults keep existing advisory behavior unless the setting is turned on. - No database migration in this PR; settings are stored through the existing instance settings path. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, `gpt-5`, coding model with tool use and local command execution; context window not exposed by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-27 08:46:44 -05:00
Neeraj Kumar Singh B	f0f9460d1d	docs: AWS ECS Fargate deployment runbook (#3897 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies and ships a > "local-first, cloud-ready" deployment model > - The deploy docs currently cover local/Docker but not a production > cloud target, so teams asking "how do I put this behind a real domain" > have no canonical path > - We already support Docker images, RDS-compatible Postgres, and an EFS > storage profile, so AWS ECS Fargate is a natural fit > - Without a runbook, each team reinvents VPC, security groups, TLS, and > secrets wiring and usually gets at least one step wrong > - This pull request adds `docs/deploy/aws-ecs.md`, an ECS task-definition > template, and an `.env.aws.example`, cross-linked from the deploy overview > - The benefit is a single, reproducible ~$110/mo path to a production > deployment, plus a full teardown for throwaway environments ## What Changed - New `docs/deploy/aws-ecs.md` — an 11-step ECS Fargate runbook covering ECR, VPC, RDS, EFS, Secrets Manager, IAM, ALB, and ECS service with the deployment circuit breaker enabled - New `docker/ecs-task-definition.json` — Fargate-ready task definition with `<ACCOUNT_ID>`, `<REGION>`, `<EFS_ID>`, `<DOMAIN>` placeholder tokens - New `docker/.env.aws.example` — documents every non-secret env var the ECS deployment needs - `docs/deploy/overview.md` — one-line cross-reference to the new guide - Greptile feedback addressed in follow-up commits: - `containerName` in the service-create call now matches `paperclip-server` in the task definition - HTTP :80 listener added that 301-redirects to :443 - Dedicated RDS DB subnet group created before `create-db-instance` - EFS teardown polls on mount-target deletion instead of `sleep 30` ## Verification - Walked every step of the runbook against the task definition to confirm variable names (`$ALB_SG`, `$ECS_SG`, `$RDS_SG`, `$EFS_SG`, `$TG_ARN`, `$LISTENER_ARN`, `$HTTP_LISTENER_ARN`, `$EFS_ID`, `$RDS_ENDPOINT`, etc.) are defined before they are referenced - Confirmed the `containerName` in Step 10 (`paperclip-server`) matches `docker/ecs-task-definition.json` line 11 - Confirmed the `sed` placeholder substitution in Step 8 matches the tokens in the task definition template - Teardown order was checked in reverse-dependency order: ECS service → listeners → target group → ALB → RDS (waits for deletion) → DB subnet group → EFS mount targets (polled) → EFS → secrets → SGs → ECR → IAM → log group ## Risks - Low risk for the repo. Docs-only change plus two template files under `docker/`; no runtime code paths are touched and nothing is imported by the build. - Risk for users who follow the runbook: AWS bills accrue immediately once RDS/ALB/EFS exist. The runbook calls this out and includes a full teardown procedure. Placeholder tokens (`<ACCOUNT_ID>`, `<REGION>`, `<EFS_ID>`, `<DOMAIN>`) are documented so nothing is silently hard-coded. ## Model Used - Claude (Anthropic), model `claude-opus-4-6`, ~200K context window, extended thinking mode on, used with tool access (file edit, shell) via Claude Code. The Greptile follow-up commits were authored the same way. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have run tests locally and they pass — N/A for docs/config templates; validated by reading - [x] I have added or updated tests where applicable — N/A for docs - [x] If this change affects the UI, I have included before/after screenshots — N/A, no UI - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-27 08:41:47 -05:00
Dotta	1d8c7a09b8	[codex] Add security role route regression (#4586 ) ## Thinking Path > - Paperclip orchestrates AI agents through company-scoped control-plane workflows. > - Agent creation is one of the core board/operator surfaces for defining who works in a company. > - The shared taxonomy now includes a first-class `security` agent role. > - Direct agent creation must preserve that role through default instruction materialization and telemetry. > - A prior replacement PR covered this path, but Greptile identified that the route-test mock could let a future patch object shadow the regression. > - This pull request reopens the narrow regression coverage from current `master` with the mock ordering fixed. > - The benefit is a focused guardrail that keeps `security` role creation observable without expanding the production diff. ## What Changed - Added a direct agent creation route regression test for `role: "security"`. - Verified telemetry receives `agentRole: "security"` after the default instruction materialization update path. - Ordered the regression mock as `...patch` before `role: "security"` so future patch fields cannot shadow the asserted role. ## Verification - `pnpm install --frozen-lockfile` to link dependencies in the fresh worktree; it completed with existing plugin SDK bin warnings. - `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts packages/shared/src/adapter-types.test.ts` ## Risks - Low risk. This is test-only coverage and does not change runtime behavior. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 based coding agent, tool-enabled with local shell and repository editing capabilities. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (N/A: no UI changes) - [x] I have updated relevant documentation to reflect my changes (N/A: test-only regression) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-27 08:11:52 -05:00
Devin Foley	d2cbe2cb23	Prefer pushing feature branches to a user fork in paperclip-dev skill (#4572 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The `paperclip-dev` skill is the canonical reference agents read before doing development work on the Paperclip repo itself > - Today the skill assumes feature branches get pushed to `origin` (= `paperclipai/paperclip`), which clutters the upstream branch list when contributors actually have personal forks > - This is the standard open-source contribution pattern (push to fork, PR upstream) and the skill should reflect it > - This pull request adds a "Forks — Prefer Pushing to a User Fork" section that teaches agents to detect a fork remote, push there, and only fall back to `origin` when no fork is configured > - The benefit is cleaner upstream branch hygiene and behavior that matches typical contributor workflows without any code/runtime change ## What Changed - Added a new Forks — Prefer Pushing to a User Fork section to `skills/paperclip-dev/SKILL.md` covering: - How to detect a user fork via `git remote -v` (treat any non-`paperclipai` GitHub remote as the fork) - How to push to the fork (`git push -u <fork-remote> HEAD`) - How to create the PR from the fork (`gh pr create --repo paperclipai/paperclip --head <fork-owner>:<branch>`) - The no-fork fallback (push to `origin`, do not auto-create a fork — ask first) - Keeping the fork's `master` in sync - Added a reinforcing entry to the Common Mistakes table linking back to the new section ## Verification - Docs-only change to a single markdown skill file. Reviewer can confirm by reading the diff in `skills/paperclip-dev/SKILL.md`: - New `## Forks — Prefer Pushing to a User Fork` section sits between `## Worktrees` and `## Pull Requests` - New row appended to the `## Common Mistakes` table - No tests, no build, no runtime behavior affected. ## Risks - Low risk. Documentation-only edit. The instructions are advisory — they only change agent behavior on future runs that read the skill. ## Model Used - Provider: Anthropic (Claude) - Model ID: `claude-opus-4-7` (Claude Opus 4) - Capabilities: tool use (file read/edit, shell, git, gh CLI), extended reasoning - Context: invoked via Claude Code / Paperclip heartbeat for issue PAPA-139 ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass (N/A — docs-only change; no test surface) - [x] I have added or updated tests where applicable (N/A) - [x] If this change affects the UI, I have included before/after screenshots (N/A — no UI change) - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 22:19:07 -07:00
Dotta	82e257c7ba	Cancel stale queued heartbeats when issue graph changes (PAP-2314) (#4534 ) Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-26 21:17:38 -05:00
Devin Foley	868d08903e	test: isolate CLI company import e2e state (#4560 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, and its CLI import/export path is part of how operators move company state safely between environments. > - The `paperclipai company import/export` e2e test is supposed to validate that portability flow inside a hermetic harness, not against a developer's live Paperclip home. > - This regression showed nested CLI subprocesses could silently fall back to ambient `PAPERCLIP_*` state and mutate a real local instance by creating extra companies such as `CLI-1-Roundtrip-Test`. > - The first job was to pin the test subprocesses to isolated config, home, instance, auth, and context paths, and to add a regression assertion that proves the nested CLI writes stay inside the test-owned state. > - Once the PR was up, CI and Greptile exposed two follow-on issues that were blocking merge: plugin SDK typecheck bootstrap was racing across packages in fresh CI, and the new lock helper needed one more fix to release its lock on failure. > - This pull request therefore ends up doing two tightly related things: fixing the original CLI isolation leak, and hardening the supporting typecheck/bootstrap path enough for the fix to verify cleanly in CI. > - The benefit is that the portability e2e test is now actually isolated, and the PR verification path is stable enough to catch regressions instead of introducing its own nondeterministic failures. ## What Changed - Hardened `cli/src/__tests__/company-import-export-e2e.test.ts` so nested CLI subprocesses re-seed isolated `PAPERCLIP_CONFIG`, `PAPERCLIP_HOME`, `PAPERCLIP_INSTANCE_ID`, `PAPERCLIP_CONTEXT`, `PAPERCLIP_AUTH_STORE`, and throwaway `HOME` values instead of falling back to ambient machine state. - Added a regression assertion around `paperclipai context set --json`, then cleared the temporary `context.json` so the isolation check and the later export/import flow stay independent. - Passed the same isolated `HOME` into the server subprocess so both sides of the e2e harness are symmetric. - Introduced locking in `scripts/ensure-plugin-build-deps.mjs` and switched the server/plugin example `typecheck` scripts to use that helper instead of launching concurrent raw `@paperclipai/plugin-sdk` builds. - Fixed the helper failure path so it releases the lock before exiting non-zero, which prevents stale-lock timeouts during parallel typecheck runs. ## Verification - `pnpm vitest run cli/src/__tests__/company-import-export-e2e.test.ts --project paperclipai` - `pnpm --filter paperclipai typecheck` - `pnpm -r typecheck` - PR checks now pass on the current head, including `policy`, `verify`, `e2e`, `security/snyk`, and `Greptile Review`. ## Risks - Low risk. The product-facing behavior change is scoped to test harness code in the CLI e2e suite. - The CI stabilization changes only affect bootstrap/typecheck helper paths for the server and plugin/example packages, but they do touch shared verification plumbing; the main risk is changing how fresh build artifacts are prepared in local/CI typecheck runs. ## Model Used - Anthropic Claude via Paperclip `claude_local`, model `claude-opus-4-7`, high-effort local coding agent, used for the initial implementation and first peer-reviewed verification. - OpenAI Codex via Paperclip `codex_local`, model `gpt-5.4`, high reasoning-effort local coding agent with tool use, used for CI triage, Greptile follow-up fixes, verification, and PR maintenance. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 19:10:01 -07:00
Devin Foley	1d9f7a5149	Fix flaky heartbeat recovery teardown CI failure (#4559 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The linked CI job is in the server test/recovery path, where heartbeat runs and issue cleanup need to leave the control plane in a consistent state even when retries fail. > - In this case the failure was not runtime product behavior but test teardown behavior inside `heartbeat-process-recovery.test.ts`. > - The failing GitHub Actions job showed a foreign-key race on `company_skills_company_id_companies_id_fk` while the test tried to delete the parent company record. > - The surrounding teardown code already uses bounded retry cleanup for other dependent tables (`issues`, `heartbeatRuns`, and `agents`) because this test file intentionally exercises asynchronous recovery flows. > - This pull request applies that same retry pattern to the final `db.delete(companies)` step, re-clearing `companySkills` before each retry. > - The benefit is a targeted fix for the CI flake without changing runtime behavior or expanding the scope beyond the failing teardown path. ## What Changed - Wrapped the final `db.delete(companies)` call in `server/src/__tests__/heartbeat-process-recovery.test.ts` with the same 5-attempt retry pattern already used elsewhere in that teardown. - Re-cleared `companySkills` before each company-delete retry so late-arriving FK-dependent rows do not mask the real test result. - Verified the fix against the originally failing `heartbeat-process-recovery` test file and the broader `pnpm test:run` command under CI-like env conditions. ## Verification - `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts` - Re-ran `pnpm exec vitest run server/src/__tests__/heartbeat-process-recovery.test.ts` multiple times locally; the previously failing teardown stayed green. - `env -u PAPERCLIP_API_URL -u PAPERCLIP_RUNTIME_API_URL -u PAPERCLIP_RUN_ID -u PAPERCLIP_TASK_ID -u PAPERCLIP_AGENT_ID -u PAPERCLIP_COMPANY_ID -u PAPERCLIP_API_KEY -u PAPERCLIP_WAKE_REASON -u PAPERCLIP_WAKE_COMMENT_ID -u PAPERCLIP_WAKE_PAYLOAD_JSON -u PAPERCLIP_APPROVAL_ID -u PAPERCLIP_APPROVAL_STATUS pnpm test:run` ## Risks - Low risk. The change is test-only and scoped to teardown retry behavior in a single server test file. - If the underlying async cleanup behavior changes again, this test could still become flaky in a different way, but this PR addresses the specific FK race seen in the linked CI job. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI `gpt-5.4` via Paperclip `codex_local`, high reasoning mode, with tool use for shell, git, HTTP API calls, and patch application. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 17:30:20 -07:00
Devin Foley	8145141c55	Fix external issue URL rewriting in markdown (#4558 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Issue and comment rendering is part of the board UI where humans supervise and inspect agent work. > - External Paperclip issue URLs can appear in comments as references to other runs, review threads, or remote test environments. > - Those links must preserve their full destination, including origin, port, and `#comment-...` fragments, or the operator is taken to the wrong place. > - The bug here was that absolute `http(s)` issue URLs were being normalized into internal `/issues/...` routes in the markdown path. > - This pull request stops rewriting absolute URLs while keeping internal issue-reference behavior for relative paths and identifiers. > - The benefit is that authored external links now navigate exactly where the operator expects, especially for remote test and comment-deep-link workflows. ## What Changed - Stopped `ui/src/lib/issue-reference.ts` from treating absolute `http(s)` URLs as internal issue paths. - Added defense-in-depth in `ui/src/lib/mention-chips.ts` so absolute `http(s)` URLs are never reclassified as issue mention chips. - Updated `ui/src/lib/issue-reference.test.ts` to cover absolute Paperclip URLs with preserved origin, port, and comment hash. - Updated `ui/src/components/MarkdownBody.test.tsx` to assert the reported URL renders as an external link, not an internal `/issues/...` href. ## Verification - `pnpm exec vitest run ui/src/lib/issue-reference.test.ts ui/src/components/MarkdownBody.test.tsx` - Expected result: `2` files passed, `37` tests passed. - Manual spot-check from the issue report path: a URL like `http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-...` should remain an external link with its full destination preserved. ## Risks - Low risk. The change narrows when Paperclip rewrites URLs, so the main risk is if some existing workflow depended on absolute `http(s)` Paperclip URLs being converted into internal issue links. The added regression coverage is aimed at preventing that from regressing silently. ## Model Used - OpenAI Codex local agent via Paperclip `codex_local` - Backing model family: GPT-5-based Codex runtime - Exact backend model ID/version: not exposed by this adapter/runtime surface - Context window: not exposed by this adapter/runtime surface - Capabilities used: tool use, shell command execution, code editing, git operations, and local test execution ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [ ] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 17:19:23 -07:00
Devin Foley	54ab0d24cd	Fix disappearing issue comments (#4557 ) ## Thinking Path > - Paperclip is a control plane for AI-agent companies, so issue detail pages are a primary surface for understanding agent work and human feedback. > - The relevant subsystem here is the issue comments/chat experience across the React issue detail page and the server comment pagination API. > - Long issue threads were only surfacing the newest page of comments at first render, which hid earlier human and agent messages behind extra pagination. > - The first UI fix exposed that the descending cursor path on the server could also fail for older-page fetches, leaving the chat tab stuck on an infinite "Loading earlier comments..." state. > - This needed to be addressed in both layers so the chat tab can surface earlier conversation history without manual recovery and without server errors. > - This pull request auto-loads earlier comment pages in the issue detail chat view and fixes the descending cursor predicate used by issue comment pagination. > - The benefit is that long-running issues like `PAPA-103` now show the missing conversation history near the top of the chat surface instead of hiding it or failing to load it. ## What Changed - Auto-load earlier issue comment pages in the issue detail chat tab until the thread reaches a 150-comment cap or there are no older comments left. - Add UI-side guard logic and regression coverage for optimistic issue comment pagination so the autoload behavior stops cleanly. - Replace the raw SQL descending cursor predicate in `issueService.listComments` with typed Drizzle comparisons for the `(createdAt, id)` anchor tuple. - Add a server regression test that paginates earlier comments in descending order from an anchor comment. - Smoke-test the exact previously failing seeded `PAPA-103` cursor path on the isolated dev instance used for review. ## Verification - `pnpm --filter @paperclipai/server exec vitest run src/__tests__/issues-service.test.ts` - `pnpm --filter @paperclipai/server typecheck` - Manual smoke against seeded `PAPA-103` data on the isolated dev server: - `GET /api/issues/PAPA-103/comments?order=desc&limit=50` returns `200` - `GET /api/issues/PAPA-103/comments?after=765d3609-edc6-4d11-a8fe-d466affbe85d&order=desc&limit=50` now returns `200` with 50 comments instead of `500` ## Risks - Moderate UI/perf risk on very large threads because the chat tab now prefetches multiple earlier pages on mount; the cap is intentionally limited to 150 comments to bound that work. - Low API risk because the server fix only changes the cursor predicate construction for anchor-based comment pagination, but any mistake there would affect older-comment paging order. > I checked `ROADMAP.md` before opening this PR and this bug fix does not duplicate planned core work. ## Model Used - OpenAI Codex coding agent in the Paperclip local adapter environment. The exact backend model ID and context window were not exposed in-session. Tool-assisted workflow included shell execution, git/GitHub CLI, local test execution, and targeted code edits. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 16:23:53 -07:00
Devin Foley	b2496c8067	fix(auth): trust allowed hostname port variants on detected listen port (#4554 ) ## Thinking Path > - Paperclip is the control plane for autonomous AI companies, so authenticated board access has to be predictable across local and worktree deployments. > - This change sits in the authenticated-mode server startup and Better Auth origin-trust wiring. > - The original auth branch fixed one real gap by adding port-qualified trusted origins for allowed hostnames on non-default ports. > - Review of that branch found a second-order bug: trusted origins were still derived from the configured port before startup detected the actual listen port. > - In isolated worktrees, that meant a common `3100 -> 3101` port shift could still leave Better Auth trusting the stale origin. > - This pull request keeps the original allowed-hostname port-variant fix, then moves trust derivation onto the resolved listen port and adds regression coverage around startup wiring. > - The benefit is that authenticated sessions keep working on allowed private hostnames even when Paperclip has to auto-shift to a different local port. ## What Changed - Added `:port` trusted-origin variants for authenticated-mode `allowedHostnames` when Paperclip runs on non-default ports. - Changed authenticated startup so `listenPort` is detected before Better Auth initialization, and explicit auth base URLs are rewritten before auth startup. - Updated `deriveAuthTrustedOrigins()` to accept the resolved listen port so Better Auth trusts the actual browser origin instead of the stale configured port. - Added focused regression coverage in `server/src/__tests__/better-auth.test.ts` and `server/src/__tests__/server-startup-feedback-export.test.ts`. ## Verification - `pnpm exec vitest run server/src/__tests__/better-auth.test.ts server/src/__tests__/server-startup-feedback-export.test.ts` - Reviewer re-check: reviewed commits `380f5b9f` and `092bb34c` after the follow-up fix landed and found no remaining issues. ## Risks - Low risk: this only affects authenticated-mode origin derivation and startup ordering around detected listen ports. - Main behavioral shift: startup no longer mutates `config.port` to the selected port; it now carries `requestedListenPort` separately and uses `listenPort` where runtime behavior needs the resolved value. - If another path was implicitly relying on `config.port` being overwritten during startup, that path would need follow-up, though the current startup/test coverage did not reveal one. > I checked `ROADMAP.md` and did not find an overlapping planned core work item for this auth trusted-origin port handling fix. ## Model Used - OpenAI Codex via Paperclip `codex_local` agents for implementation and review. Exact backend model ID/context window were not surfaced in this run context; work was performed through the Codex local adapter with tool use, code execution, and review passes. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 15:40:39 -07:00
Devin Foley	08af830430	Tighten publicBaseUrl port rewriting (#4553 ) ## Thinking Path > - Paperclip is a control plane for autonomous agent companies, so its local and authenticated deployment behavior has to stay predictable under port rebinding and worktree isolation. > - This change sits in the server/worktree configuration path that derives runtime URLs and auth origins from `auth.publicBaseUrl`. > - The original hostname-port rewrite change fixed one real gap for private/tailnet host:port worktree setups, but it widened the rewrite rule too far. > - Rewriting every explicit `auth.publicBaseUrl` can corrupt public or reverse-proxy URLs by turning a stable origin like `https://paperclip.example` into a local listen-port URL. > - Paperclip's auth and trusted-origin handling depend on that URL staying semantically correct, so this had to be narrowed before merge. > - This pull request tightens the rewrite rule to explicit-port URLs only and adds regression coverage across the CLI helper, worktree config persistence, and server startup path. > - The benefit is that private host:port worktree flows still work, while public/default-port URLs remain stable and safe. ## What Changed - Tightened `rewriteLocalUrlPort` in `cli/src/commands/worktree-lib.ts`, `server/src/worktree-config.ts`, and `server/src/index.ts` so it only rewrites URLs that already include an explicit port. - Removed the old loopback-only hostname gate from the CLI/worktree helpers and replaced it with the more precise `parsed.port` guard. - Updated CLI helper coverage to assert that explicit-port non-loopback URLs still rewrite while no-port public URLs stay unchanged. - Expanded `server/src/__tests__/worktree-config.test.ts` to cover explicit-port rewrite and no-port stability for both persisted worktree config and in-memory runtime port selection. - Added startup-path coverage in `server/src/__tests__/server-startup-feedback-export.test.ts` for `detect-port` rebinding with both explicit-port and no-port `auth.publicBaseUrl` values. ## Verification - `pnpm --filter @paperclipai/plugin-sdk build` - `npx vitest run server/src/__tests__/server-startup-feedback-export.test.ts` - `npx vitest run cli/src/__tests__/worktree.test.ts server/src/__tests__/worktree-config.test.ts` - All of the above were run locally in this issue worktree and passed. ## Risks - Low risk. The behavior change is deliberately narrower than the reviewed broad-host rewrite and is guarded by regression coverage for both the explicit-port and no-port cases. - The main remaining risk is behavioral only if another code path starts depending on port rewriting for URLs that never declared a port, which would be a separate bug. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex local agent using `gpt-5.4` with high reasoning effort, tool use, shell execution, and file editing. - Anthropic Claude local agent using `claude-opus-4-6` for follow-up code review approval on the implementation issue. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 14:29:22 -07:00
Devin Foley	d47ffa87f0	Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The local adapter layer is responsible for turning Paperclip runtime context into the environment seen by the child agent process. > - The CEO onboarding bundle tells the agent where to read and write its persistent memory and fact files. > - That bundle was using `./memory/...` and `./life/...`, which only works when the process cwd happens to equal the agent home directory. > - At the same time, six local adapters each duplicated the same workspace-env propagation logic, including `AGENT_HOME`, which makes this contract easy to drift. > - This pull request fixes the CEO instructions to use `$AGENT_HOME/...` and centralizes workspace-env propagation in one shared helper with shared tests. > - The benefit is a real bug fix for agent memory paths plus a single tested contract that makes future built-in adapter work less likely to forget `AGENT_HOME`. ## What Changed - Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use `$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of cwd-relative `./memory/...` and `./life/...`. - Added `applyPaperclipWorkspaceEnv(...)` in `packages/adapter-utils/src/server-utils.ts` to centralize `PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation. - Added shared helper coverage in `packages/adapter-utils/src/server-utils.test.ts` for both populated and skip-empty cases. - Switched the built-in local adapters (`claude_local`, `codex_local`, `cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to the shared helper instead of inline env assignment blocks. ## Verification - `pnpm install` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/codex-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - Result: 7 test files passed, 31 tests passed, 0 failures. ## Risks - Low risk. - The only behavioral surface is the shared env propagation refactor across six adapters; if the helper diverged from prior semantics, an adapter could miss a workspace env var. - The shared helper test plus the affected adapter execute tests reduce that risk, and the helper preserves the prior "set only non-empty strings" behavior. ## Model Used - OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted coding workflow with shell execution, file patching, git operations, and API interaction. The exact backend model identifier and context window are not surfaced by this local runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 13:57:35 -07:00
Devin Foley	d1484551ee	Add open-source hygiene note to paperclip-dev skill (#4541 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The `paperclip-dev` skill is part of the contributor and agent workflow layer that tells developers how to work in this repository safely. > - That skill already references the public upstream `origin`, but it did not explicitly say that pushes there must be treated as publishable open-source output. > - Without that reminder, contributors are more likely to leak secrets, PII, private logs, machine-local config, or noisy throwaway git history into the public repo. > - This pull request adds a prominent `OPEN SOURCE HYGIENE` callout near the top of the skill, before the git workflow guidance. > - The benefit is clearer safety guidance for contributors and less accidental disclosure or branch/commit noise on the upstream project. ## What Changed - Added an `OPEN SOURCE HYGIENE` callout near the top of `skills/paperclip-dev/SKILL.md`. - Explicitly warned that anything pushed to `origin` must be publishable. - Called out avoiding secrets, API keys, PII, private logs, machine-local config, and noisy throwaway branches or checkpoint commits. ## Verification - N/a ## Risks - Low risk. This is a docs-only change in a skill file; the main risk is wording tone or placement, not runtime behavior. ## Model Used - OpenAI Codex via the `codex_local` Paperclip adapter, GPT-5-based coding agent runtime. Exact backend serving model ID is not exposed in this heartbeat environment. Tool use, shell execution, and patch application were enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [ ] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 12:14:49 -07:00
Devin Foley	91333ec86f	feat: add paperclip-dev skill with optional bundled skill support (#3854 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Agents working on the Paperclip codebase itself need guidance on dev workflows: server lifecycle, worktrees, builds, database ops, diagnostics > - There was no bundled skill covering these workflows — agents had to figure it out from scratch each time > - Additionally, not every skill should be force-installed on every agent — a dev-focused skill should be opt-in > - This PR adds a `paperclip-dev` skill with `required: false` frontmatter so it ships with Paperclip but isn't auto-installed > - The skill's PR section references canonical files (`.github/PULL_REQUEST_TEMPLATE.md`, `CONTRIBUTING.md`) instead of duplicating their content, with gated instructions that force agents to read those files before creating any PR > - The benefit is that developers (human or agent) can opt in to structured dev guidance without polluting the default agent skill set or creating drift between duplicated docs ## What Changed - Added `skills/paperclip-dev/SKILL.md` covering server management, worktree lifecycle, builds, database ops, diagnostics, agent operations, and common mistakes - The Pull Requests section uses gated, reference-based instructions — agents MUST read `.github/PULL_REQUEST_TEMPLATE.md` and `CONTRIBUTING.md` before running `gh pr create`, with a brief checklist of required section names (no content duplication) - Updated `packages/adapter-utils/src/server-utils.ts` to respect `required: false` frontmatter — optional skills are bundled but not auto-installed on agents - Added test in `server/src/__tests__/paperclip-skill-utils.test.ts` verifying that optional skills are excluded from the default install set ## Verification ```bash # Run tests pnpm test # Manual verification: create a fresh worktree without seeding npx paperclipai worktree:make test-optional-skill --no-seed cd ~/paperclip-test-optional-skill eval "$(npx paperclipai worktree env)" npx paperclipai run # Verify paperclip-dev appears in company skill library but is NOT auto-assigned # Call listPaperclipSkillEntries() — paperclip-dev should show required: false # Call resolvePaperclipDesiredSkillNames() — paperclip-dev should NOT be in the default set # Cleanup npx paperclipai worktree:cleanup test-optional-skill ``` ## Risks - Low risk. The `required` field defaults to `true` when absent, so all existing skills behave identically. Only the new `paperclip-dev` skill sets `required: false`. ## Model Used Claude Opus 4.6 (`claude-opus-4-6`) via Claude Code, with tool use and extended context. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-26 11:06:13 -07:00
Dotta	c036bbfa98	Add first-class security agent role to taxonomy (#4532 ) ## Thinking Path > - Paperclip is the control plane for AI-agent companies, so agent metadata is part of the platform's governance and audit surface. > - The shared agent taxonomy in `@paperclipai/shared` is the source of truth for allowed agent roles and their UI labels. > - The current taxonomy lacks a `security` role, which causes Security Engineer hires to collapse into `engineer`. > - That breaks separation-of-duties evidence in telemetry and weakens role-level audit fidelity even though it does not directly change permissions. > - This pull request adds `security` as a first-class shared role and covers the prior rejection path with a regression test. > - The benefit is that Security Engineer agents can now be persisted and rendered under the correct role without schema or permission churn. ## What Changed - Added `security` to `AGENT_ROLES` in `packages/shared/src/constants.ts`. - Added the `Security` display label to `AGENT_ROLE_LABELS` so existing UI consumers render the new role automatically. - Added a shared validator regression test proving `createAgentSchema` accepts `role: "security"` and that the label stays stable. ## Verification - `pnpm --filter @paperclipai/shared typecheck` - `pnpm --filter @paperclipai/shared exec vitest run src/adapter-types.test.ts` ## Risks - Low risk. This is a shared enum expansion with no database migration and no permission-model change. - Residual risk: this PR does not backfill existing agents already persisted as `engineer`; it only fixes new validations and labels going forward. > I checked `ROADMAP.md`/`doc` for overlap and did not find an existing planned item covering this taxonomy fix. ## Model Used - OpenAI GPT-5.4 via the Codex local adapter, with tool use and local code execution enabled. The runtime did not surface a separate context-window identifier in agent metadata. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-26 07:52:05 -05:00
Dotta	df425fde96	Present ordered sub-issues as a workflow checklist (#4523 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators use issue detail pages and child issue lists to understand multi-step execution plans. > - Ordered sub-issues currently read like a flat table, so dependency chains and current next steps are harder to scan. > - The branch work adds a workflow-oriented presentation for child issues without changing the single-assignee task model. > - This pull request makes ordered sub-issues read more like a progress checklist while preserving normal issue list controls. > - The benefit is that operators can see completed steps, active work, blocked follow-ups, and dependency order at a glance. ## What Changed - Added workflow sorting utilities and tests for dependency-aware child issue ordering. - Added sub-issue progress summary, checklist numbering, current-step affordances, blocker context, and done-state de-emphasis in the issue list UI. - Wired issue detail sub-issue panels to use the workflow sort/progress checklist presentation. - Updated issue service behavior/tests for child issue ordering inputs used by the UI. - Added a Storybook visual review fixture and screenshot helper for the sub-issue workflow checklist surface. ## Verification - `pnpm run preflight:workspace-links && pnpm exec vitest run server/src/__tests__/issues-service.test.ts ui/src/components/IssueRow.test.tsx ui/src/components/IssuesList.test.tsx ui/src/pages/IssueDetail.test.tsx ui/src/lib/issue-detail-subissues.test.ts ui/src/lib/workflow-sort.test.ts` - Result: 6 test files passed, 55 tests passed, 34 embedded Postgres issue-service tests skipped because `@embedded-postgres/darwin-x64` is unavailable on this host. - Visual review: generated Storybook screenshots from the existing local Storybook server on port 6006 with `node scripts/screenshot-subissues.mjs /tmp/pap-2189-subissues-screens http://localhost:6006`. - Screenshot artifacts: - Desktop dark: ![Desktop dark](doc/assets/pap-2189/desktop-1440x900-dark.png) - Desktop light: ![Desktop light](doc/assets/pap-2189/desktop-1440x900-light.png) - Mobile dark: ![Mobile dark](doc/assets/pap-2189/mobile-390x844-dark.png) - Mobile light: ![Mobile light](doc/assets/pap-2189/mobile-390x844-light.png) - Local Storybook note: starting a second Storybook process selected port 6008 because 6006 was occupied, then Vite failed with an esbuild host/binary version mismatch (`0.25.12` host vs `0.27.3` binary). The already-running Storybook server on 6006 served the fixture successfully for screenshots. ## Risks - Medium UI risk: the issue list now has additional sub-issue-specific visual states, so dense lists should be checked for spacing and scanability. - Low ordering risk: workflow sorting is covered by focused unit tests, but unusual dependency topologies may still need reviewer attention. - No migration risk: this PR does not add database migrations or touch `pnpm-lock.yaml`. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub workflow. Context window is runtime-provided and not exposed in this environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-26 07:36:49 -05:00
Devin Foley	40782f703d	Fix release packaging for standalone public packages (#4494 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies, and the sandbox-provider work just moved E2B into a standalone publishable plugin package. > - That plugin is intentionally excluded from the root pnpm workspace so it can model third-party install behavior without forcing lockfile churn in the main repo. > - The merged architecture change exposed a follow-up release problem: the canary publish workflow tried to publish `@paperclipai/plugin-e2b`, but the tarball had no `dist/` payload because standalone public packages were not being built in the release path. > - That means the release pipeline needed a packaging fix in core release tooling, not another architectural change in the sandbox provider itself. > - This pull request adds a generic release step for public packages that live outside the pnpm workspace, instead of hardcoding E2B-specific behavior into the release script. > - The benefit is that standalone publishable packages can be built and packed correctly during release, including future sandbox-provider plugins that follow the same pattern. ## What Changed - Added `scripts/build-standalone-public-packages.mjs` to discover public packages outside the pnpm workspace, run a clean package-local install, and build them before publish. - Updated `scripts/release.sh` to invoke that helper immediately after the normal workspace build step. - Kept the behavior generic by driving off the existing public package map and pnpm workspace patterns rather than special-casing `@paperclipai/plugin-e2b`. ## Verification - `rm -rf packages/plugins/sandbox-providers/e2b/dist` - `node ./scripts/build-standalone-public-packages.mjs` - `cd packages/plugins/sandbox-providers/e2b && npm pack --dry-run` - Confirm the tarball now includes the rebuilt `dist/` files instead of only `README.md` / `package.json` ## Risks - Low risk: this only changes the release build path for public packages outside the pnpm workspace. - The helper performs a clean package-local install for each standalone public package, so release time may increase slightly as more such packages are added. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via `codex_local` - Model ID: `gpt-5.4` - Reasoning effort: `high` - Context window observed in runtime session metadata: `258400` tokens - Capabilities used: terminal tool execution, git, GitHub CLI, and local build/test inspection ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-25 12:16:23 -07:00
Devin Foley	4ef969f084	Add E2B sandbox provider plugin (#4452 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Sandbox environments are part of that execution layer, and the recent core refactor moved provider-specific behavior to a generic plugin seam > - This pull request adds a dedicated `@paperclipai/plugin-e2b` package so E2B can live entirely outside core host code > - Because the feature is still unreleased, the plugin should model third-party packaging directly instead of carrying extra backward-compatibility complexity in core or the workspace lockfile > - This branch therefore makes the E2B provider a standalone publishable package, documents the package-local dev flow, and keeps the publish manifest/runtime dependency story correct > - The benefit is that E2B becomes a true plugin reference implementation that can be installed by package name without reopening core Paperclip code ## What Changed - Added `packages/plugins/paperclip-plugin-e2b` as the E2B sandbox provider plugin package - Implemented config validation, lease acquire/resume/release/destroy handlers, workspace realization, and command execution for E2B sandboxes - Excluded the E2B plugin package from the root workspace so the repo no longer needs `pnpm-lock.yaml` churn for its third-party dependency graph - Added package-local development/install support plus a prepack manifest generator so the published tarball still declares `@paperclipai/plugin-sdk` and `e2b` runtime dependencies - Addressed review feedback by fixing sandbox cleanup on acquire failures, rejecting blank templates, normalizing fractional `timeoutMs`, and always passing the configured template name to the E2B SDK - Updated focused Vitest coverage for config normalization, validation, acquire cleanup, command execution, and lease release behavior - Updated the Dockerfile deps stage to copy the E2B package manifest so the policy check stays in sync ## Verification - `cd packages/plugins/paperclip-plugin-e2b && pnpm install --ignore-workspace --no-lockfile` - `cd packages/plugins/paperclip-plugin-e2b && pnpm build` - `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace test` - `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace typecheck` - `cd packages/plugins/paperclip-plugin-e2b && npm pack --dry-run` ## Risks - The package now relies on a prepack manifest rewrite so the publish-time dependency list stays correct while the repo-local dev manifest stays workspace-light - The current repo snapshot is still unreleased, so the generated publish manifest points at the repo SDK version until the normal release flow rewrites versions before publish - Real-world E2B environments may still expose edge cases around lifecycle timing or sandbox metadata beyond the mocked unit coverage > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex via `codex_local` - Model ID: `gpt-5.4` - Reasoning effort: `high` - Context window observed in runtime session metadata: `258400` tokens - Capabilities used: terminal tool execution, git, GitHub CLI, and local build/test inspection ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-25 11:01:11 -07:00
Devin Foley	5bd0f578fd	Generalize sandbox provider core for plugin-only providers (#4449 ) ## Thinking Path > - Paperclip is a control plane, so optional execution providers should sit at the plugin edge instead of hardcoding provider-specific behavior into core shared/server/ui layers. > - Sandbox environments are already first-class, and the fake provider proves the built-in path; the remaining gap was that real providers still leaked provider-specific config and runtime assumptions into core. > - That coupling showed up in config normalization, secret persistence, capabilities reporting, lease reconstruction, and the board UI form fields. > - As long as core knew about those provider-shaped details, shipping a provider as a pure third-party plugin meant every new provider would still require host changes. > - This pull request generalizes the sandbox provider seam around schema-driven plugin metadata and generic secret-ref handling. > - The runtime and UI now consume provider metadata generically, so core only special-cases the built-in fake provider while third-party providers can live entirely in plugins. ## What Changed - Added generic sandbox-provider capability metadata so plugin-backed providers can expose `configSchema` through shared environment support and the environments capabilities API. - Reworked sandbox config normalization/persistence/runtime resolution to handle schema-declared secret-ref fields generically, storing them as Paperclip secrets and resolving them for probe/execute/release flows. - Generalized plugin sandbox runtime handling so provider validation, reusable-lease matching, lease reconstruction, and plugin worker calls all operate on provider-agnostic config instead of provider-shaped branches. - Replaced hardcoded sandbox provider form fields in Company Settings with schema-driven rendering and blocked agent environment selection from the built-in fake provider. - Added regression coverage for the generic seam across shared support helpers plus environment config, probe, routes, runtime, and sandbox-provider runtime tests. ## Verification - `pnpm vitest --run packages/shared/src/environment-support.test.ts server/src/__tests__/environment-config.test.ts server/src/__tests__/environment-probe.test.ts server/src/__tests__/environment-routes.test.ts server/src/__tests__/environment-runtime.test.ts server/src/__tests__/sandbox-provider-runtime.test.ts` - `pnpm -r typecheck` ## Risks - Plugin sandbox providers now depend more heavily on accurate `configSchema` declarations; incorrect schemas can misclassify secret-bearing fields or omit required config. - Reusable lease matching is now metadata-driven for plugin-backed providers, so providers that fail to persist stable metadata may reprovision instead of resuming an existing lease. - The UI form is now fully schema-driven for plugin-backed sandbox providers; provider manifests without good defaults or descriptions may produce a rougher operator experience. ## Model Used - OpenAI Codex via `codex_local` - Model ID: `gpt-5.4` - Reasoning effort: `high` - Context window observed in runtime session metadata: `258400` tokens - Capabilities used: terminal tool execution, git, and local code/test inspection ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-24 18:03:41 -07:00
Dotta	deba60ebb2	Stabilize serialized server route tests (#4448 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The server route suite is a core confidence layer for auth, issue context, and workspace runtime behavior > - Some route tests were doing extra module/server isolation work that made local runs slower and more fragile > - The stable Vitest runner also needs to pass server-relative exclude paths to avoid accidentally re-including serialized suites > - This pull request tightens route test isolation and runner serialization behavior > - The benefit is more reliable targeted and stable-route test execution without product behavior changes ## What Changed - Updated `run-vitest-stable.mjs` to exclude serialized server tests using server-relative paths. - Forced the server Vitest config to use a single worker in addition to isolated forks. - Simplified agent permission route tests to create per-request test servers without shared server lifecycle state. - Stabilized issue goal context route mocks by using static mocked services and a sequential suite. - Re-registered workspace runtime route mocks before cache-busted route imports. ## Verification - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/agent-permissions-routes.test.ts server/src/__tests__/issues-goal-context-routes.test.ts server/src/__tests__/workspace-runtime-routes-authz.test.ts --pool=forks --poolOptions.forks.isolate=true` - `node --check scripts/run-vitest-stable.mjs` ## Risks - Low risk. This is test infrastructure only. - The stable runner path fix changes which tests are excluded from the non-serialized server batch, matching the server project root that Vitest applies internally. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled with shell/GitHub/Paperclip API access. Context window was not reported by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 19:27:00 -05:00
Dotta	f68e9caa9a	Polish markdown external link wrapping (#4447 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The board UI renders agent comments, PR links, issue links, and operational markdown throughout issue threads > - Long GitHub and external links can wrap awkwardly, leaving icons orphaned from the text they describe > - Small inbox visual polish also helps repeated board scanning without changing behavior > - This pull request glues markdown link icons to adjacent link characters and removes a redundant inbox list border > - The benefit is cleaner, more stable markdown and inbox rendering for day-to-day operator review ## What Changed - Added an external-link indicator for external markdown links. - Kept the GitHub icon attached to the first link character so it does not wrap onto a separate line. - Kept the external-link icon attached to the final link character so it does not wrap away from the URL/text. - Added markdown rendering regressions for GitHub and external link icon wrapping. - Removed the extra border around the inbox list card. ## Verification - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/MarkdownBody.test.tsx` - `pnpm --filter @paperclipai/ui typecheck` ## Risks - Low risk. The markdown change is limited to link child rendering and preserves existing href/target/rel behavior. - Visual-only inbox polish. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled with shell/GitHub/Paperclip API access. Context window was not reported by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 19:26:13 -05:00
Dotta	73fbdf36db	Gate stale-run watchdog decisions by board access (#4446 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The run ledger surfaces stale-run watchdog evaluation issues and recovery actions > - Viewer-level board users should be able to inspect status without getting controls that the server will reject > - The UI also needs enough board-access context to know when to hide those decision actions > - This pull request exposes board memberships in the current board access snapshot and gates watchdog action controls for known viewer contexts > - The benefit is clearer least-privilege UI behavior around recovery controls ## What Changed - Included memberships in `/api/cli-auth/me` so the board UI can distinguish active viewer memberships from operator/admin access. - Added the stale-run evaluation issue assignee to output silence summaries. - Hid stale-run watchdog decision buttons for known non-owner viewer contexts. - Surfaced watchdog decision failures through toast and inline error text. - Threaded `companyId` through the issue activity run ledger so access checks are company-scoped. - Added IssueRunLedger coverage for non-owner viewers. ## Verification - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueRunLedger.test.tsx` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` ## Risks - Medium-low risk. This is a UI gating change backed by existing server authorization. - Local implicit and instance-admin board contexts continue to show watchdog decision controls. - No migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled with shell/GitHub/Paperclip API access. Context window was not reported by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 19:25:23 -05:00
Dotta	6916e30f8e	Cancel stale retries when issue ownership changes (#4445 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Issue execution is guarded by run locks and bounded retry scheduling > - A failed run can schedule a retry, but the issue may be reassigned before that retry becomes due > - The old assignee's scheduled retry should not continue to hold or reclaim execution for the issue > - This pull request cancels stale scheduled retries when ownership changes and cancels live work when an issue is explicitly cancelled > - The benefit is cleaner issue handoff semantics and fewer stranded or incorrect execution locks ## What Changed - Cancel scheduled retry runs when their issue has been reassigned before the retry is promoted. - Clear stale issue execution locks and cancel the associated wakeup request when a stale retry is cancelled. - Avoid deferring a new assignee behind a previous assignee's scheduled retry. - Cancel an active run when an issue status is explicitly changed to `cancelled`, while leaving `done` transitions alone. - Added route and heartbeat regressions for reassignment and cancellation behavior. ## Verification - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/heartbeat-retry-scheduling.test.ts server/src/__tests__/issue-comment-reopen-routes.test.ts --pool=forks --poolOptions.forks.isolate=true` - `issue-comment-reopen-routes.test.ts`: 28 passed. - `heartbeat-retry-scheduling.test.ts`: skipped by the existing embedded Postgres host guard (`Postgres init script exited with code null`). - `pnpm --filter @paperclipai/server typecheck` ## Risks - Medium risk because this changes heartbeat retry lifecycle behavior. - The cancellation path is scoped to scheduled retries whose issue assignee no longer matches the retrying agent, and logs a lifecycle event for auditability. - No migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled with shell/GitHub/Paperclip API access. Context window was not reported by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 19:24:13 -05:00
Dotta	0c6961a03e	Normalize escaped multiline issue and approval text (#4444 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The board and agent APIs accept multiline issue, approval, interaction, and document text > - Some callers accidentally send literal escaped newline sequences like `\n` instead of JSON-decoded line breaks > - That makes comments, descriptions, documents, and approval notes render as flattened text instead of readable markdown > - This pull request centralizes multiline text normalization in shared validators > - The benefit is newline-preserving API behavior across issue and approval workflows without route-specific fixes ## What Changed - Added a shared `multilineTextSchema` helper that normalizes escaped `\n`, `\r\n`, and `\r` sequences to real line breaks. - Applied the helper to issue descriptions, issue update comments, issue comment bodies, suggested task descriptions, interaction summaries, issue documents, approval comments, and approval decision notes. - Added shared validator regressions for issue and approval multiline inputs. ## Verification - `pnpm exec vitest run --project @paperclipai/shared packages/shared/src/validators/approval.test.ts packages/shared/src/validators/issue.test.ts` - `pnpm --filter @paperclipai/shared typecheck` ## Risks - Low risk. This only changes text fields that are explicitly multiline user/operator content. - If a caller intentionally wanted literal backslash-n text in these fields, it will now render as a real line break. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled with shell/GitHub/Paperclip API access. Context window was not reported by the runtime. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 18:02:45 -05:00
Dotta	5a0c1979cf	[codex] Add runtime lifecycle recovery and live issue visibility (#4419 )	2026-04-24 15:50:32 -05:00
Dotta	9a8d219949	[codex] Stabilize tests and local maintenance assets (#4423 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - A fast-moving control plane needs stable local tests and repeatable local maintenance tools so contributors can safely split and review work > - Several route suites needed stronger isolation, Codex manual model selection needed a faster-mode option, and local browser cleanup missed Playwright's headless shell binary > - Storybook static output also needed to be preserved as a generated review artifact from the working branch > - This pull request groups the test/local-dev maintenance pieces so they can be reviewed separately from product runtime changes > - The benefit is more predictable contributor verification and cleaner local maintenance without mixing these changes into feature PRs ## What Changed - Added stable Vitest runner support and serialized route/authz test isolation. - Fixed workspace runtime authz route mocks and stabilized Claude/company-import related assertions. - Allowed Codex fast mode for manually selected models. - Broadened the agent browser cleanup script to detect `chrome-headless-shell` as well as Chrome for Testing. - Preserved generated Storybook static output from the source branch. ## Verification - `pnpm exec vitest run src/__tests__/workspace-runtime-routes-authz.test.ts src/__tests__/claude-local-execute.test.ts --config vitest.config.ts` from `server/` passed: 2 files, 19 tests. - `pnpm exec vitest run src/server/codex-args.test.ts --config vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file, 3 tests. - `bash -n scripts/kill-agent-browsers.sh && scripts/kill-agent-browsers.sh --dry` passed; dry-run detected `chrome-headless-shell` processes without killing them. - `test -f ui/storybook-static/index.html && test -f ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed. - `git diff --check public-gh/master..pap-2228-test-local-maintenance -- . ':(exclude)ui/storybook-static'` passed. - `pnpm exec vitest run cli/src/__tests__/company-import-export-e2e.test.ts --config cli/vitest.config.ts` did not complete in the isolated split worktree because `paperclipai run` exited during build prep with `TS2688: Cannot find type definition file for 'react'`; this appears to be caused by the worktree dependency symlink setup, not the code under test. - Confirmed this PR does not include `pnpm-lock.yaml`. ## Risks - Medium risk: the stable Vitest runner changes how route/authz tests are scheduled. - Generated `ui/storybook-static` files are large and contain minified third-party output; `git diff --check` reports whitespace inside those generated assets, so reviewers may choose to drop or regenerate that artifact before merge. - No database migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip API, and GitHub CLI tool use in the local Paperclip workspace. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Note: screenshot checklist item is not applicable to source UI behavior; the included Storybook static output is generated artifact preservation from the source branch. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 15:11:42 -05:00
Devin Foley	70679a3321	Add sandbox environment support (#4415 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The environment/runtime layer decides where agent work executes and how the control plane reaches those runtimes. > - Today Paperclip can run locally and over SSH, but sandboxed execution needs a first-class environment model instead of one-off adapter behavior. > - We also want sandbox providers to be pluggable so the core does not hardcode every provider implementation. > - This branch adds the Sandbox environment path, the provider contract, and a deterministic fake provider plugin. > - That required synchronized changes across shared contracts, plugin SDK surfaces, server runtime orchestration, and the UI environment/workspace flows. > - The result is that sandbox execution becomes a core control-plane capability while keeping provider implementations extensible and testable. ## What Changed - Added sandbox runtime support to the environment execution path, including runtime URL discovery, sandbox execution targeting, orchestration, and heartbeat integration. - Added plugin-provider support for sandbox environments so providers can be supplied via plugins instead of hardcoded server logic. - Added the fake sandbox provider plugin with deterministic behavior suitable for local and automated testing. - Updated shared types, validators, plugin protocol definitions, and SDK helpers to carry sandbox provider and workspace-runtime contracts across package boundaries. - Updated server routes and services so companies can create sandbox environments, select them for work, and execute work through the sandbox runtime path. - Updated the UI environment and workspace surfaces to expose sandbox environment configuration and selection. - Added test coverage for sandbox runtime behavior, provider seams, environment route guards, orchestration, and the fake provider plugin. ## Verification - Ran locally before the final fixture-only scrub: - `pnpm -r typecheck` - `pnpm test:run` - `pnpm build` - Ran locally after the final scrub amend: - `pnpm vitest run server/src/__tests__/runtime-api.test.ts` - Reviewer spot checks: - create a sandbox environment backed by the fake provider plugin - run work through that environment - confirm sandbox provider execution does not inherit host secrets implicitly ## Risks - This touches shared contracts, plugin SDK plumbing, server runtime orchestration, and UI environment/workspace flows, so regressions would likely show up as cross-layer mismatches rather than isolated type errors. - Runtime URL discovery and sandbox callback selection are sensitive to host/bind configuration; if that logic is wrong, sandbox-backed callbacks may fail even when execution succeeds. - The fake provider plugin is intentionally deterministic and test-oriented; future providers may expose capability gaps that this branch does not yet cover. ## Model Used - OpenAI Codex coding agent on a GPT-5-class backend in the Paperclip/Codex harness. Exact backend model ID is not exposed in-session. Tool-assisted workflow with shell execution, file editing, git history inspection, and local test execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-24 12:15:53 -07:00
Dotta	641eb44949	[codex] Harden create-agent skill governance (#4422 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Hiring agents is a governance-sensitive workflow because it grants roles, adapter config, skills, and execution capability > - The create-agent skill needs explicit templates and review guidance so hires are auditable and not over-permissioned > - Skill sync also needs to recognize bundled Paperclip skills consistently for Codex local agents > - This pull request expands create-agent role templates, adds a security-engineer template, and documents capability/secret-handling review requirements > - The benefit is safer, more repeatable agent creation with clearer approval payloads and less permission sprawl ## What Changed - Expanded `paperclip-create-agent` guidance for template selection, adjacent-template drafting, and role-specific review bars. - Added a Security Engineer agent template and collaboration/safety sections for Coder, QA, and UX Designer templates. - Hardened draft-review guidance around desired skills, external-system access, secrets, and confidential advisory handling. - Updated LLM agent-configuration guidance to point hiring workflows at the create-agent skill. - Added tests for bundled skill sync, create-agent skill injection, hire approval payloads, and LLM route guidance. ## Verification - `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts server/src/__tests__/codex-local-skill-injection.test.ts server/src/__tests__/codex-local-skill-sync.test.ts server/src/__tests__/llms-routes.test.ts server/src/__tests__/paperclip-skill-utils.test.ts --config server/vitest.config.ts` passed: 5 files, 23 tests. - `git diff --check public-gh/master..pap-2228-create-agent-governance -- . ':(exclude)ui/storybook-static'` passed. - Confirmed this PR does not include `pnpm-lock.yaml`. ## Risks - Low-to-medium risk: this primarily changes skills/docs and tests, but it affects future hiring guidance and approval expectations. - Reviewers should check whether the new Security Engineer template is too broad for default company installs. - No database migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip API, and GitHub CLI tool use in the local Paperclip workspace. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Note: screenshot checklist item is not applicable; this PR changes skills, docs, and server tests. --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 14:15:28 -05:00
Dotta	77a72e28c2	[codex] Polish issue composer and long document display (#4420 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Issue comments and documents are the main working surface where operators and agents collaborate > - File drops, markdown editing, and long issue descriptions need to feel predictable because they sit directly in the task execution loop > - The composer had edge cases around drag targets, attachment feedback, image drops, and long markdown content crowding the page > - This pull request polishes the issue composer, hardens markdown editor regressions, and adds a fold curtain for long issue descriptions/documents > - The benefit is a calmer issue detail surface that handles uploads and long work products without hiding state or breaking layout ## What Changed - Scoped issue-composer drag/drop behavior so the composer owns file drops without turning the whole thread into a competing drop target. - Added clearer attachment upload feedback for non-image files and image-drop stability coverage. - Hardened markdown editor and markdown body handling around HTML-like tag regressions. - Added `FoldCurtain` and wired it into issue descriptions and issue documents so long markdown previews can expand/collapse. - Added Storybook coverage for the fold curtain state. ## Verification - `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx ui/src/components/MarkdownEditor.test.tsx ui/src/components/MarkdownBody.test.tsx --config ui/vitest.config.ts` passed: 3 files, 75 tests. - `git diff --check public-gh/master..pap-2228-editor-composer-polish -- . ':(exclude)ui/storybook-static'` passed. - Confirmed this PR does not include `pnpm-lock.yaml`. ## Risks - Low-to-medium risk: this changes user-facing composer/drop behavior and long markdown display. - The fold curtain uses DOM measurement and `ResizeObserver`; reviewers should check browser behavior for very long descriptions and documents. - No database migrations. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip API, and GitHub CLI tool use in the local Paperclip workspace. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge Note: screenshots were not newly captured during branch splitting; the UI states are covered by component tests and a Storybook story. --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 14:12:41 -05:00
Dotta	8f1cd0474f	[codex] Improve transient recovery and Codex model refresh (#4383 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Adapter execution and retry classification decide whether agent work pauses, retries, or recovers automatically > - Transient provider failures need to be classified precisely so Paperclip does not convert retryable upstream conditions into false hard failures > - At the same time, operators need an up-to-date model list for Codex-backed agents and prompts should nudge agents toward targeted verification instead of repo-wide sweeps > - This pull request tightens transient recovery classification for Claude and Codex, updates the agent prompt guidance, and adds Codex model refresh support end-to-end > - The benefit is better automatic retry behavior plus fresher operator-facing model configuration ## What Changed - added Codex usage-limit retry-window parsing and Claude extra-usage transient classification - normalized the heartbeat transient-recovery contract across adapter executions and heartbeat scheduling - documented that deferred comment wakes only reopen completed issues for human/comment-reopen interactions, while system follow-ups leave closed work closed - updated adapter-utils prompt guidance to prefer targeted verification - added Codex model refresh support in the server route, registry, shared types, and agent config form - added adapter/server tests covering the new parsing, retry scheduling, and model-refresh behavior ## Verification - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `pnpm exec vitest run --project @paperclipai/adapter-claude-local packages/adapters/claude-local/src/server/parse.test.ts` - `pnpm exec vitest run --project @paperclipai/adapter-codex-local packages/adapters/codex-local/src/server/parse.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/adapter-model-refresh-routes.test.ts server/src/__tests__/adapter-models.test.ts server/src/__tests__/claude-local-execute.test.ts server/src/__tests__/codex-local-execute.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts server/src/__tests__/heartbeat-retry-scheduling.test.ts` ## Risks - Moderate behavior risk: retry classification affects whether runs auto-recover or block, so mistakes here could either suppress needed retries or over-retry real failures - Low workflow risk: deferred comment wake reopening is intentionally scoped to human/comment-reopen interactions so system follow-ups do not revive completed issues unexpectedly > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex GPT-5-based coding agent with tool use and code execution in the Codex CLI environment ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 09:40:40 -05:00
Dotta	4fdbbeced3	[codex] Refine markdown issue reference rendering (#4382 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Task references are a core part of how operators understand issue relationships across the UI > - Those references appear both in markdown bodies and in sidebar relationship panels > - The rendering had drifted between surfaces, and inline markdown pills were reading awkwardly inside prose and lists > - This pull request unifies the underlying issue-reference treatment, routes issue descriptions through `MarkdownBody`, and switches inline markdown references to a cleaner text-link presentation > - The benefit is more consistent issue-reference UX with better readability in markdown-heavy views ## What Changed - unified sidebar and markdown issue-reference rendering around the shared issue-reference components - routed resting issue descriptions through `MarkdownBody` so description previews inherit the richer issue-reference treatment - replaced inline markdown pill chrome with a cleaner inline reference presentation for prose contexts - added and updated UI tests for `MarkdownBody` and `InlineEditor` ## Verification - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/MarkdownBody.test.tsx ui/src/components/InlineEditor.test.tsx` ## Risks - Moderate UI risk: issue-reference rendering now differs intentionally between inline markdown and relationship sidebars, so regressions would show up as styling or hover-preview mismatches > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex GPT-5-based coding agent with tool use and code execution in the Codex CLI environment ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 09:39:21 -05:00
Dotta	7ad225a198	[codex] Improve issue thread review flow (#4381 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Issue detail is where operators coordinate review, approvals, and follow-up work with active runs > - That thread UI needs to surface blockers, descendants, review handoffs, and reply ergonomics clearly enough for humans to guide agent work > - Several small gaps in the issue-thread flow were making review and navigation clunkier than necessary > - This pull request improves the reply composer, descendant/blocker presentation, interaction folding, and review-request handoff plumbing together as one cohesive issue-thread workflow slice > - The benefit is a cleaner operator review loop without changing the broader task model ## What Changed - restored and refined the floating reply composer behavior in the issue thread - folded expired confirmation interactions and improved post-submit thread scrolling behavior - surfaced descendant issue context and inline blocker/paused-assignee notices on the issue detail view - tightened large-board first paint behavior in `IssuesList` - added loose review-request handoffs through the issue execution-policy/update path and covered them with tests ## Verification - `pnpm vitest run ui/src/pages/IssueDetail.test.tsx` - `pnpm vitest run server/src/__tests__/issues-service.test.ts server/src/__tests__/issue-execution-policy.test.ts` - `pnpm exec vitest run --project @paperclipai/ui ui/src/components/IssueChatThread.test.tsx ui/src/components/IssueProperties.test.tsx ui/src/components/IssuesList.test.tsx ui/src/lib/issue-tree.test.ts ui/src/api/issues.test.ts` - `pnpm exec vitest run --project @paperclipai/adapter-utils packages/adapter-utils/src/server-utils.test.ts` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/issue-comment-reopen-routes.test.ts -t "coerces executor handoff patches into workflow-controlled review wakes\|wakes the return assignee with execution_changes_requested"` - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/issue-execution-policy.test.ts server/src/__tests__/issues-service.test.ts` ## Visual Evidence - UI layout changes are covered by the focused issue-thread component and issue-detail tests listed above. Browser screenshots were not attachable from this automated greploop environment, so reviewers should use the running preview for final visual confirmation. ## Risks - Moderate UI-flow risk: these changes touch the issue detail experience in multiple spots, so regressions would most likely show up as thread-layout quirks or incorrect review-handoff behavior > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex GPT-5-based coding agent with tool use and code execution in the Codex CLI environment ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots or documented the visual verification path - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-24 08:02:45 -05:00
Dotta	35a9dc37b0	[codex] Speed up company skill detail loading (#4380 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Company skills are part of the control plane for distributing reusable capabilities > - Board flows that inspect company skill detail should stay responsive because they are operator-facing control-plane reads > - The existing detail path was doing broader work than needed for the specific detail screen > - This pull request narrows that company-skill detail loading path and adds a regression test around it > - The benefit is faster company skill detail reads without changing the external API contract ## What Changed - tightened the company-skill detail loading path in `server/src/services/company-skills.ts` - added `server/src/__tests__/company-skills-detail.test.ts` to verify the detail route only pulls the required data ## Verification - `pnpm exec vitest run --project @paperclipai/server server/src/__tests__/company-skills-detail.test.ts` ## Risks - Low risk: this only changes the company-skill detail query path, but any missed assumption in the detail consumer would surface when loading that screen > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex GPT-5-based coding agent with tool use and code execution in the Codex CLI environment ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-24 07:37:13 -05:00
Devin Foley	e4995bbb1c	Add SSH environment support (#4358 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The environments subsystem already models execution environments, but before this branch there was no end-to-end SSH-backed runtime path for agents to actually run work against a remote box > - That meant agents could be configured around environment concepts without a reliable way to execute adapter sessions remotely, sync workspace state, and preserve run context across supported adapters > - We also need environment selection to participate in normal Paperclip control-plane behavior: agent defaults, project/issue selection, route validation, and environment probing > - Because this capability is still experimental, the UI surface should be easy to hide and easy to remove later without undoing the underlying implementation > - This pull request adds SSH environment execution support across the runtime, adapters, routes, schema, and tests, then puts the visible environment-management UI behind an experimental flag > - The benefit is that we can validate real SSH-backed agent execution now while keeping the user-facing controls safely gated until the feature is ready to come out of experimentation ## What Changed - Added SSH-backed execution target support in the shared adapter runtime, including remote workspace preparation, skill/runtime asset sync, remote session handling, and workspace restore behavior after runs. - Added SSH execution coverage for supported local adapters, plus remote execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi. - Added environment selection and environment-management backend support needed for SSH execution, including route/service work, validation, probing, and agent default environment persistence. - Added CLI support for SSH environment lab verification and updated related docs/tests. - Added the `enableEnvironments` experimental flag and gated the environment UI behind it on company settings, agent configuration, and project configuration surfaces. ## Verification - `pnpm exec vitest run packages/adapters/claude-local/src/server/execute.remote.test.ts packages/adapters/cursor-local/src/server/execute.remote.test.ts packages/adapters/gemini-local/src/server/execute.remote.test.ts packages/adapters/opencode-local/src/server/execute.remote.test.ts packages/adapters/pi-local/src/server/execute.remote.test.ts` - `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts` - `pnpm exec vitest run server/src/__tests__/instance-settings-routes.test.ts` - `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts ui/src/lib/new-agent-runtime-config.test.ts` - `pnpm -r typecheck` - `pnpm build` - Manual verification on a branch-local dev server: - enabled the experimental flag - created an SSH environment - created a Linux Claude agent using that environment - confirmed a run executed on the Linux box and synced workspace changes back ## Risks - Medium: this touches runtime execution flow across multiple adapters, so regressions would likely show up in remote session setup, workspace sync, or environment selection precedence. - The UI flag reduces exposure, but the underlying runtime and route changes are still substantial and rely on migration correctness. - The change set is broad across adapters, control-plane services, migrations, and UI gating, so review should pay close attention to environment-selection precedence and remote workspace lifecycle behavior. ## Model Used - OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding model with tool use and code execution in the local repo workspace. The local adapter does not surface a more specific public model version string in this branch workflow. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-23 19:15:22 -07:00
Dotta	f98c348e2b	[codex] Add issue subtree pause, cancel, and restore controls (#4332 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - This branch extends the issue control-plane so board operators can pause, cancel, and later restore whole issue subtrees while keeping descendant execution and wake behavior coherent. > - That required new hold state in the database, shared contracts, server routes/services, and issue detail UI controls so subtree actions are durable and auditable instead of ad hoc. > - While this branch was in flight, `master` advanced with new environment lifecycle work, including a new `0065_environments` migration. > - Before opening the PR, this branch had to be rebased onto `paperclipai/paperclip:master` without losing the existing subtree-control work or leaving conflicting migration numbering behind. > - This pull request rebases the subtree pause/cancel/restore feature cleanly onto current `master`, renumbers the hold migration to `0066_issue_tree_holds`, and preserves the full branch diff in a single PR. > - The benefit is that reviewers get one clean, mergeable PR for the subtree-control feature instead of stale branch history with migration conflicts. ## What Changed - Added durable issue subtree hold data structures, shared API/types/validators, server routes/services, and UI flows for subtree pause, cancel, and restore operations. - Added server and UI coverage for subtree previewing, hold creation/release, dependency-aware scheduling under holds, and issue detail subtree controls. - Rebased the branch onto current `paperclipai/paperclip:master` and renumbered the branch migration from `0065_issue_tree_holds` to `0066_issue_tree_holds` so it no longer conflicts with upstream `0065_environments`. - Added a small follow-up commit that makes restore requests return `200 OK` explicitly while keeping pause/cancel hold creation at `201 Created`, and updated the route test to match that contract. ## Verification - `pnpm --filter @paperclipai/db typecheck` - `pnpm --filter @paperclipai/shared typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck` - `cd server && pnpm exec vitest run src/__tests__/issue-tree-control-routes.test.ts src/__tests__/issue-tree-control-service.test.ts src/__tests__/issue-tree-control-service-unit.test.ts src/__tests__/heartbeat-dependency-scheduling.test.ts` - `cd ui && pnpm exec vitest run src/components/IssueChatThread.test.tsx src/pages/IssueDetail.test.tsx` ## Risks - This is a broad cross-layer change touching DB/schema, shared contracts, server orchestration, and UI; regressions are most likely around subtree status restoration or wake suppression/resume edge cases. - The migration was renumbered during PR prep to avoid the new upstream `0065_environments` conflict. Reviewers should confirm the final `0066_issue_tree_holds` ordering is the only hold-related migration that lands. - The issue-tree restore endpoint now responds with `200` instead of relying on implicit behavior, which is semantically better for a restore operation but still changes an API detail that clients or tests could have assumed. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex coding agent in the Paperclip Codex runtime (GPT-5-class tool-using coding model; exact deployment ID/context window is not exposed inside this session). ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [ ] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-23 14:51:46 -05:00
Russell Dempsey	854fa81757	fix(pi-local): prepend installed skill bin/ dirs to child PATH (#4331 ) ## Thinking Path > - Paperclip orchestrates AI agents; each agent runs under an adapter that spawns a model CLI as a child process. > - The pi-local adapter (`packages/adapters/pi-local`) spawns `pi` and inherits the child's shell environment — including `PATH`, which determines what the child's bash tool can execute by name. > - Paperclip skills ship executable helpers under `<skill>/bin/` (e.g. `paperclip-get-issue`) and Reviewer/QA-style `AGENTS.md` files invoke them by name via the agent's bash tool. > - Pi-local builds its runtime env with `ensurePathInEnv({ ...process.env, ...env })` only — it never adds the installed skills' `bin/` dirs to PATH. The pi CLI's `--skill` arg loads each skill's SKILL.md but does not augment PATH. > - Consequence: every bash invocation of a skill helper fails with `exit 127: command not found`. The agent then spends its heartbeat guessing (re-reading SKILL.md, trying `find`, inventing command paths) and either times out or gives up. > - This PR prepends each injected skill's `bin/` directory to the child PATH immediately before runtimeEnv is constructed. > - The benefit: pi_local agents whose AGENTS.md uses any `paperclip-*` skill helper can actually run those helpers. ## What Changed - `packages/adapters/pi-local/src/server/execute.ts`: compute `skillBinDirs` from the already-resolved `piSkillEntries`, dedupe against the existing PATH, prepend them to whichever of `PATH` / `Path` the merged env uses, then build `runtimeEnv`. No new helpers, no adapter-utils changes. ## Verification Manual repro before the fix: 1. Create a pi_local agent wired to a paperclip skill (e.g. paperclip-control). 2. Wake the agent on an in_review issue with an AGENTS.md that starts with `paperclip-get-issue "$PAPERCLIP_TASK_ID"`. 3. Session file: `{ "role": "toolResult", "isError": true, "content": [{ "text": "/bin/bash: paperclip-get-issue: command not found\n\nCommand exited with code 127" }] }`. After the fix: same wake; `paperclip-get-issue` resolves and returns the issue JSON; agent proceeds. Local commands: ``` pnpm --filter @paperclipai/adapter-pi-local typecheck # clean pnpm --filter @paperclipai/adapter-pi-local build # clean pnpm --filter @paperclipai/server exec vitest run \ src/__tests__/pi-local-execute.test.ts \ src/__tests__/pi-local-adapter-environment.test.ts \ src/__tests__/pi-local-skill-sync.test.ts # 5/5 passing ``` No new tests: the existing `pi-local-skill-sync.test.ts` covers skill symlink injection (upstream of the PATH step), and `pi-local-execute.test.ts` covers the spawn path; this change only augments env on the same spawn path. ## Risks Low. Pure PATH augmentation on the child env. Edge cases: - Zero skills installed → no PATH change (guarded by `skillBinDirs.length > 0`). - Duplicate bin dirs already on PATH → deduped; no pollution on re-runs. - Windows `Path` casing → falls back correctly when merged env uses `Path` instead of `PATH`. - Skill dir without `bin/` subdir → joined path simply won't resolve; harmless. No behavioral change for pi_local agents that don't use skill-provided commands. ## Model Used - Claude, `claude-opus-4-7` (1M context), extended thinking enabled, tool use enabled. Walked pi-local/cursor-local/claude-local and adapter-utils to isolate the gap, wrote the inlined fix, and ran typecheck/build/test locally. ## Checklist - [x] Thinking path from project context to this change - [x] Model used specified - [x] Checked ROADMAP.md — no overlap - [x] Tests run locally, passing - [x] Tests added — new case in `server/src/__tests__/pi-local-execute.test.ts`; verified it fails when the fix is reverted - [ ] UI screenshots — N/A (backend adapter change) - [x] Docs updated — N/A (internal adapter, no user-facing docs) - [x] Risks documented - [x] Will address reviewer comments before merge	2026-04-23 10:15:10 -05:00
Dotta	fe14de504c	[codex] Document README architecture systems (#4250 ) ## Thinking Path > - Paperclip is the control plane for autonomous AI companies. > - The public README is the first place many operators and contributors learn what the product already includes. > - The existing README explained the product promise but did not give a compact, concrete tour of the major systems behind it. > - This made Paperclip easier to underestimate as a wrapper around agents instead of a full control plane with identity, work, execution, governance, budgets, plugins, and portability. > - This pull request adds an under-the-hood README section that names those systems and shows how adapters connect into the server. > - Greptile caught consistency gaps between the diagram and prose, so the final version aligns the system labels and adapter examples across both surfaces. > - The benefit is a clearer first-read model of Paperclip's architecture and shipped capabilities without changing runtime behavior. ## What Changed - Added a `What's Under the Hood` section to `README.md`. - Added an ASCII architecture diagram for the Paperclip server and external agent adapters. - Added a systems table covering identity, org charts, tasks, heartbeat execution, workspaces, governance, budgets, routines, plugins, secrets/storage, activity/events, and company portability. - Addressed Greptile feedback by aligning diagram labels with table rows and grouping adapter examples consistently. ## Verification - `git diff --check public-gh/master...HEAD` - Attempted `pnpm exec prettier --check README.md`, but this checkout does not expose a `prettier` binary through `pnpm exec`. - Greptile review rerun passed after addressing its two comments; review threads are resolved. - Remote PR checks passed on the latest head: `policy`, `verify`, `e2e`, `security/snyk (cryppadotta)`, and `Greptile Review`. - Not run locally: Vitest/build suites, because this is a README-only documentation change and the PR's remote `verify` job ran typecheck, tests, build, and release canary dry run. ## Risks - Low runtime risk: documentation-only change. - The main risk is wording drift if the README overstates or underspecifies evolving product capabilities; the section was aligned against the current product/spec docs and roadmap. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex / GPT-5 coding agent in a Paperclip heartbeat, with shell and GitHub CLI tool use. Exact runtime model identifier and context window were not exposed by the adapter. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-23 09:48:19 -05:00
Michel Tomas	3d15798c22	fix(adapters/routes): apply resolveExternalAdapterRegistration on hot-install (#4324 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - The external adapter plugin system (#2218) lets adapters ship as npm modules loaded via `server/src/adapters/plugin-loader.ts`; since #4296 merged, each `ServerAdapterModule` can declare `sessionManagement` (`supportsSessionResume`, `nativeContextManagement`, `defaultSessionCompaction`) and have it preserved through the init-time load via the new `resolveExternalAdapterRegistration` helper > - #4296 fixed the init-time IIFE path at `server/src/adapters/registry.ts:363-369` but noted that the hot-install path at `server/src/routes/adapters.ts:174 registerWithSessionManagement` still unconditionally overwrites module-provided `sessionManagement` during `POST /api/adapters/install` > - Practical impact today: an external adapter installed via the API needs a Paperclip restart before its declared `sessionManagement` takes effect — the IIFE runs on next boot and preserves it, but until then the hot-install overwrite wins > - This PR closes that parity gap: `registerWithSessionManagement` delegates to the same `resolveExternalAdapterRegistration` helper introduced by #4296, unifying both load paths behind one resolver > - The benefit is consistent behaviour between cold-start and hot-install: no "install then restart" ritual; declared `sessionManagement` on an external module is honoured the moment `POST /api/adapters/install` returns 201 ## What Changed - `server/src/routes/adapters.ts`: `registerWithSessionManagement` delegates to the exported `resolveExternalAdapterRegistration` helper (added in #4296). Honours module-provided `sessionManagement` first, falls back to host registry lookup, defaults `undefined`. Updated the section comment to document the parity-with-IIFE intent. - `server/src/routes/adapters.ts`: dropped the now-unused `getAdapterSessionManagement` import. - `server/src/adapters/registry.ts`: updated the JSDoc on `resolveExternalAdapterRegistration` — previously said "Exported for unit tests; runtime callers use the IIFE below", now says the helper is used by both the init-time IIFE and the hot-install path in `routes/adapters.ts`. Addresses Greptile C1. - `server/src/__tests__/adapter-routes.test.ts`: new integration test — installs a mocked external adapter module carrying a non-trivial `sessionManagement` declaration and asserts `findServerAdapter(type).sessionManagement` preserves it after `POST /api/adapters/install` returns 201. - `server/src/__tests__/adapter-routes.test.ts`: added `findServerAdapter` to the shared test-scope variable set so the new test can inspect post-install registry state. ## Verification Targeted test runs from a clean tree on `fix/external-session-management-hot-install` (rebased onto current `upstream/master` now that #4296 has merged): - `pnpm test server/src/__tests__/adapter-routes.test.ts` — 6 passed (new test + 5 pre-existing) - `pnpm test server/src/__tests__/adapter-registry.test.ts` — 15 passed (ensures the IIFE path from #4296 continues to behave correctly) - `pnpm -w run test` full workspace suite — 1923 passed / 1 skipped (unrelated skip) End-to-end smoke on file: [`@superbiche/cline-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/cline-paperclip-adapter) and [`@superbiche/qwen-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/qwen-paperclip-adapter), both public on npm, both declare `sessionManagement`. With this PR in place, the "restart after install" step disappears — the declared compaction policy is active immediately after the install response. ## Risks - Low risk. The change replaces an inline mutation with a call to a helper that already has dedicated unit coverage (#4296 added three tests for `resolveExternalAdapterRegistration` covering module-provided, registry-fallback, and undefined paths). Behaviour is a strict superset of the prior path — externals that did not declare `sessionManagement` continue to get the hardcoded-registry lookup; externals that did declare it now have those values preserved instead of overwritten. - No migration impact. The stored plugin records (`~/.paperclip/adapter-plugins.json`) are unchanged. Existing hot-installed adapters behave correctly before and after. - No behavioural change for builtin adapters; they hit `registerServerAdapter` directly and never flow through `registerWithSessionManagement`. ## Model Used - Provider and model: Claude (Anthropic) via Claude Code - Model ID: `claude-opus-4-7` (1M context) - Reasoning mode: standard (no extended thinking on this PR) - Tool use: yes — file edits, subprocess invocations for builds/tests/git via the Claude Code harness ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (N/A — server-only change) - [x] I have updated relevant documentation to reflect my changes (the JSDoc on `resolveExternalAdapterRegistration` and the section comment above `registerWithSessionManagement` now document the parity-with-IIFE intent) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-23 09:45:24 -05:00
Michel Tomas	24232078fd	fix(adapters/registry): honor module-provided sessionManagement for external adapters (#4296 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Adapters are how paperclip hands work off to specific agent runtimes; since #2218, external adapter packages can ship as npm modules loaded via `server/src/adapters/plugin-loader.ts` > - Each `ServerAdapterModule` can declare `sessionManagement` (`supportsSessionResume`, `nativeContextManagement`, `defaultSessionCompaction`) — but the init-time load at `registry.ts:363-369` hard-overwrote it with a hardcoded-registry lookup that has no entries for external types, so modules could not actually set these fields > - The hot-install path at `routes/adapters.ts:179` → `registerServerAdapter` preserves module-provided `sessionManagement`, so externals worked after `POST /api/adapters/install` — until the next server restart, when the init-time IIFE wiped it back to `undefined` > - #2218 explicitly deferred this: "Adapter execution model, heartbeat protocol, and session management are untouched." This PR is the natural follow-up for session management on the plugin-loader path > - This PR aligns init-time registration with the hot-install path: honor module-provided `sessionManagement` first, fall back to the hardcoded registry when absent (so externals overriding a built-in type still inherit its policy). Extracted as a testable helper with three unit tests > - The benefit is external adapters can declare session-resume capabilities consistently across cold-start and hot-install, without requiring upstream additions to the hardcoded registry for each new plugin ## What Changed - `server/src/adapters/registry.ts`: extracted the merge logic into a new exported helper `resolveExternalAdapterRegistration()` — honors module-provided `sessionManagement` first, falls back to `getAdapterSessionManagement(type)`, else `undefined`. The init-time IIFE calls the helper instead of inlining an overwrite. - `server/src/adapters/registry.ts`: updated the section comment (lines 331–340) to reflect the new semantics and cross-reference the hot-install path's behavior. - `server/src/__tests__/adapter-registry.test.ts`: new `describe("resolveExternalAdapterRegistration")` block with three tests — module-provided value preserved, registry fallback when module omits, `undefined` when neither provides. ## Verification Targeted test run from a clean tree on `fix/external-session-management`: ``` cd server && pnpm exec vitest run src/__tests__/adapter-registry.test.ts # 1 test file, 15 tests passed, 0 failed (12 pre-existing + 3 new) ``` Full server suite via the independent review pass noted under Model Used: 1,156 tests passed, 0 failed. Typecheck note: `pnpm --filter @paperclipai/server exec tsc --noEmit` surfaces two errors in `src/services/plugin-host-services.ts:1510` (`createInteraction` + implicit-any). Verified by `git stash` + re-run on clean `upstream/master` — they reproduce without this PR's changes. Pre-existing, out of scope. ## Risks - Low behavioral risk. Strictly additive: externals that do NOT provide `sessionManagement` continue to receive exactly the same value as before (registry lookup → `undefined` for pure externals, or the builtin's entry for externals overriding a built-in type). Only a new capability is unlocked; no existing behavior changes for existing adapters. - No breaking change. `ServerAdapterModule.sessionManagement` was already optional at the type level. Externals that never set it see no difference on either path. - Consistency verified. Init-time IIFE now matches the post-`POST /api/adapters/install` behavior — a server restart no longer regresses the field. ## Note This is part of a broader effort to close the parity gap between external and built-in adapters. Once externals reach 1:1 capability coverage with internals, new-adapter contributions can increasingly be steered toward the external-plugin path instead of the core product — a trajectory CONTRIBUTING.md already encourages ("If the idea fits as an extension, prefer building it with the plugin system"). ## Model Used - Provider: Anthropic - Model: Claude Opus 4.7 - Exact model ID: `claude-opus-4-7` (1M-context variant: `claude-opus-4-7[1m]`) - Context window: 1,000,000 tokens - Harness: Claude Code (Anthropic's official CLI), orchestrated by @superbiche as human-in-the-loop. Full file-editing, shell, and `gh` tool use, plus parallel research subagents for fact-finding against paperclip internals (plugin-loader contract, sessionCodec reachability, UI parser surface, Cline CLI JSON schema). - Independent local review: Gemini 3.1 Pro (Google) performed a separate verification pass on the committed branch — confirmed the approach & necessity, ran the full workspace build, and executed the complete server test suite (1,156 tests, all passing). Not used for authoring; second-opinion pass only. - Authoring split: @superbiche identified the gap (while mapping the external-adapter surface for a downstream adapter build) and shaped the plan — categorising the surface into `works / acceptable / needs-upstream` buckets, directing the surgical-diff approach on a fresh branch from `upstream/master`, and calling the framing ("alignment bug between init-time IIFE and hot-install path" rather than "missing capability"). Opus 4.7 executed the fact-finding, the diff, the tests, and drafted this PR body — all under direct review. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work (convention-aligned bug fix on the external-adapter plugin path introduced by #2218) - [x] I have run tests locally and they pass (15/15 in the touched file; 1,156/1,156 full server suite via the independent Gemini 3.1 Pro review) - [x] I have added tests where applicable (3 new for the extracted helper) - [x] If this change affects the UI, I have included before/after screenshots (no UI touched) - [x] I have updated relevant documentation to reflect my changes (in-file comment reflects new semantics) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-23 07:39:43 -05:00
Devin Foley	13551b2bac	Add local environment lifecycle (#4297 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Every heartbeat run needs a concrete place where the agent's adapter process executes. > - Today that execution location is implicitly the local machine, which makes it hard to track, audit, and manage as a first-class runtime concern. > - The first step is to represent the current local execution path explicitly without changing how users experience agent runs. > - This pull request adds core Environment and Environment Lease records, then routes existing local heartbeat execution through a default `Local` environment. > - The benefit is that local runs remain behavior-preserving while the system now has durable environment identity, lease lifecycle tracking, and activity records for execution placement. ## What Changed - Added `environments` and `environment_leases` database tables, schema exports, and migration `0065_environments.sql`. - Added shared environment constants, TypeScript types, and validators for environment drivers, statuses, lease policies, lease statuses, and cleanup states. - Added `environmentService` for listing, reading, creating, updating, and ensuring company-scoped environments. - Added environment lease lifecycle operations for acquire, metadata update, single-lease release, and run-wide release. - Updated heartbeat execution to lazily ensure a company-scoped default `Local` environment before adapter execution. - Updated heartbeat execution to acquire an ephemeral local environment lease, write `paperclipEnvironment` into the run context snapshot, and release active leases during run finalization. - Added activity log events for environment lease acquisition and release. - Added tests for environment service behavior and the local heartbeat environment lifecycle. - Added a CI-follow-up heartbeat guard so deferred issue comment wakes are promoted before automatic missing-comment retries, with focused batching test coverage. ## Verification Local verification run for this branch: - `pnpm -r typecheck` - `pnpm build` - `pnpm exec vitest run server/src/__tests__/environment-service.test.ts server/src/__tests__/heartbeat-local-environment.test.ts --pool=forks` Additional reviewer/CI verification: - Confirm `pnpm-lock.yaml` is not modified. - Confirm `pnpm test:run` passes in CI. - Confirm `PAPERCLIP_E2E_SKIP_LLM=true pnpm run test:e2e` passes in CI. - Confirm a local heartbeat run creates one active `Local` environment when needed, records one lease for the run, releases the lease when the run finishes, and includes `paperclipEnvironment` in the run context snapshot. Screenshots: not applicable; this PR has no UI changes. ## Risks - Migration risk: introduces two new tables and a new migration journal entry. Review should verify company scoping, indexes, foreign keys, and enum defaults are correct. - Lifecycle risk: heartbeat finalization now releases environment leases in addition to existing runtime cleanup. A finalization bug could leave stale active leases or mark a failed run's lease incorrectly. - Behavior-preservation risk: local adapter execution should remain unchanged apart from environment bookkeeping. Review should pay attention to the heartbeat path around context snapshot updates and final cleanup ordering. - Activity volume risk: each heartbeat run now logs lease acquisition and release events, increasing activity log volume by two records per run. ## Model Used OpenAI GPT-5.4 via Codex CLI. Capabilities used: repository inspection, TypeScript implementation review, local test/build execution, and PR-description drafting. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (N/A: no UI changes) - [x] I have updated relevant documentation to reflect my changes (N/A: no user-facing docs or commands changed) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-22 20:07:41 -07:00
Dotta	b69b563aa8	[codex] Fix stale issue execution run locks (#4258 ) ## Thinking Path > - Paperclip is a control plane for AI-agent companies, so issue checkout and execution ownership are core safety contracts. > - The affected subsystem is the issue service and route layer that gates agent writes by `checkoutRunId` and `executionRunId`. > - PAP-1982 exposed a stale-lock failure mode where a terminal heartbeat run could leave `executionRunId` pinned after checkout ownership had moved or been cleared. > - That stale execution lock could reject legitimate PATCH/comment/release requests from the rightful assignee after a harness restart. > - This pull request centralizes terminal-run cleanup, applies it before ownership-gated writes, and adds a board-only recovery endpoint for operator intervention. > - The benefit is that crashed or terminal runs no longer strand issues behind stale execution locks, while live execution locks still block conflicting writes. ## What Changed - Added `issueService.clearExecutionRunIfTerminal()` to atomically lock the issue/run rows and clear terminal or missing execution-run locks. - Reused stale execution-lock cleanup from checkout, `assertCheckoutOwner()`, and `release()`. - Allowed the same assigned agent/current run to adopt an unowned `in_progress` checkout after stale execution-lock cleanup. - Updated release to clear `executionRunId`, `executionAgentNameKey`, and `executionLockedAt`. - Added board-only `POST /api/issues/:id/admin/force-release` with company access checks, optional `clearAssignee=true`, and `issue.admin_force_release` audit logging. - Added embedded Postgres service tests and route integration tests for stale-lock recovery, release behavior, and admin force-release authorization/audit behavior. - Documented the new force-release API in `doc/SPEC-implementation.md`. ## Verification - `pnpm vitest run server/src/__tests__/issues-service.test.ts server/src/__tests__/issue-stale-execution-lock-routes.test.ts` passed. - `pnpm vitest run server/src/__tests__/issue-stale-execution-lock-routes.test.ts server/src/__tests__/approval-routes-idempotency.test.ts server/src/__tests__/issue-comment-reopen-routes.test.ts server/src/__tests__/issue-telemetry-routes.test.ts` passed. - `pnpm -r typecheck` passed. - `pnpm build` passed. - `git diff --check` passed. - `pnpm lint` could not run because this repo has no `lint` command. - Full `pnpm test:run` completed with 4 failures in existing route suites: `approval-routes-idempotency.test.ts` (2), `issue-comment-reopen-routes.test.ts` (1), and `issue-telemetry-routes.test.ts` (1). Those same files pass when run isolated and when run together with the new stale-lock route test, so this appears to be a whole-suite ordering/mock-isolation issue outside this patch path. ## Risks - Medium: this changes ownership-gated write behavior. The new adoption path is limited to the current run, the current assignee, `in_progress` issues, and rows with no checkout owner after terminal-lock cleanup. - Low: the admin force-release endpoint is board-only and company-scoped, but misuse can intentionally clear a live lock. It writes an audit event with prior lock IDs. - No schema or migration changes. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent (`gpt-5`), agentic coding with terminal/tool use and local test execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-22 10:43:38 -05:00
Dotta	a957394420	[codex] Add structured issue-thread interactions (#4244 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators supervise that work through issues, comments, approvals, and the board UI. > - Some agent proposals need structured board/user decisions, not hidden markdown conventions or heavyweight governed approvals. > - Issue-thread interactions already provide a natural thread-native surface for proposed tasks and questions. > - This pull request extends that surface with request confirmations, richer interaction cards, and agent/plugin/MCP helpers. > - The benefit is that plan approvals and yes/no decisions become explicit, auditable, and resumable without losing the single-issue workflow. ## What Changed - Added persisted issue-thread interactions for suggested tasks, structured questions, and request confirmations. - Added board UI cards for interaction review, selection, question answers, and accept/reject confirmation flows. - Added MCP and plugin SDK helpers for creating interaction cards from agents/plugins. - Updated agent wake instructions, onboarding assets, Paperclip skill docs, and public docs to prefer structured confirmations for issue-scoped decisions. - Rebased the branch onto `public-gh/master` and renumbered branch migrations to `0063` and `0064`; the idempotency migration uses `ADD COLUMN IF NOT EXISTS` for old branch users. ## Verification - `git diff --check public-gh/master..HEAD` - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts packages/mcp-server/src/tools.test.ts packages/shared/src/issue-thread-interactions.test.ts ui/src/lib/issue-thread-interactions.test.ts ui/src/lib/issue-chat-messages.test.ts ui/src/components/IssueThreadInteractionCard.test.tsx ui/src/components/IssueChatThread.test.tsx server/src/__tests__/issue-thread-interaction-routes.test.ts server/src/__tests__/issue-thread-interactions-service.test.ts server/src/services/issue-thread-interactions.test.ts` -> 9 files / 79 tests passed - `pnpm -r typecheck` -> passed, including `packages/db` migration numbering check ## Risks - Medium: this adds a new issue-thread interaction model across db/shared/server/ui/plugin surfaces. - Migration risk is reduced by placing this branch after current master migrations (`0063`, `0064`) and making the idempotency column add idempotent for users who applied the old branch numbering. - UI interaction behavior is covered by component tests, but this PR does not include browser screenshots. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-class coding agent runtime. Exact model ID and context window are not exposed in this Paperclip run; tool use and local shell/code execution were enabled. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-21 20:15:11 -05:00
Dotta	014aa0eb2d	[codex] Clear stale queued comment targets (#4234 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators interact with agent work through issue threads and queued comments. > - When the selected comment target becomes stale, the composer can keep pointing at an invalid target after thread state changes. > - That makes follow-up comments easier to misroute and harder to reason about. > - This pull request clears stale queued comment targets and covers the behavior with tests. > - The benefit is more predictable issue-thread commenting during live agent work. ## What Changed - Clears queued comment targets when they no longer match the current issue thread state. - Adjusts issue detail comment-target handling to avoid stale target reuse. - Adds regression tests for optimistic issue comment target behavior. ## Verification - `pnpm exec vitest run ui/src/lib/optimistic-issue-comments.test.ts` ## Risks - Low risk; scoped to comment-target state handling in the issue UI. - No migrations. > Checked `ROADMAP.md`; this is a focused UI reliability fix, not a new roadmap-level feature. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled repository editing and local test execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-21 16:50:26 -05:00
Dotta	bcbbb41a4b	[codex] Harden heartbeat runtime cleanup (#4233 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The heartbeat runtime is the control-plane path that turns issue assignments into agent runs and recovers after process exits. > - Several edge cases could leave high-volume reads unbounded, stale runtime services visible, blocked dependency wakes too eager, or terminal adapter processes still around after output finished. > - These problems make operator views noisy and make long-running agent work less predictable. > - This pull request tightens the runtime/read paths and adds focused regression coverage. > - The benefit is safer heartbeat execution and cleaner runtime state without changing the public task model. ## What Changed - Bounded high-volume issue/log reads in runtime code paths. - Hardened heartbeat handling for blocked dependency wakes and terminal run cleanup. - Added adapter process cleanup coverage for terminal output cases. - Added workspace runtime control tests for stale command matching and stopped services. ## Verification - `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts server/src/__tests__/heartbeat-dependency-scheduling.test.ts ui/src/components/WorkspaceRuntimeControls.test.tsx` ## Risks - Medium risk because heartbeat cleanup and runtime filtering affect active agent execution paths. - No migrations. > Checked `ROADMAP.md`; this is runtime hardening and bug-fix work, not a new roadmap-level feature. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-enabled repository editing and local test execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-21 16:48:47 -05:00
Dotta	73ef40e7be	[codex] Sandbox dynamic adapter UI parsers (#4225 ) ## Thinking Path > - Paperclip is a control plane for AI-agent companies. > - External adapters can provide UI parser code that the board loads dynamically for run transcript rendering. > - Running adapter-provided parser code directly in the board page gives that parser access to same-origin browser state. > - This PR narrows that surface by evaluating dynamically loaded external adapter UI parser code in a dedicated browser Web Worker with a constrained postMessage protocol. > - The worker here is a frontend isolation boundary for adapter UI parser JavaScript; it is not Paperclip's server plugin-worker system and it is not a server-side job runner. ## What Changed - Runs dynamically loaded external adapter UI parsers inside a dedicated Web Worker instead of importing/evaluating them directly in the board page. - Adds a narrow postMessage protocol for parser initialization and line parsing. - Caches completed async parse results and notifies the adapter registry so transcript recomputation can synchronously drain the final parsed line. - Disables common worker network, persistence, child worker, Blob/object URL, and WebRTC escape APIs inside the parser worker bootstrap. - Handles worker error messages after initialization and drains pending callbacks on worker termination or mid-session worker error. - Adds focused regression coverage for the parser worker lockdown and unused protocol removal. ## Verification - `pnpm exec vitest run --config ui/vitest.config.ts ui/src/adapters/sandboxed-parser-worker.test.ts` - `pnpm exec tsc --noEmit --target es2021 --moduleResolution bundler --module esnext --jsx react-jsx --lib dom,es2021 --skipLibCheck ui/src/adapters/dynamic-loader.ts ui/src/adapters/sandboxed-parser-worker.ts ui/src/adapters/sandboxed-parser-worker.test.ts` - `pnpm --filter @paperclipai/ui typecheck` was attempted; it reached existing unrelated failures in HeartbeatRun test/storybook fixtures and missing Storybook type resolution, with no adapter-module errors surfaced. - PR #4225 checks on current head `34c9da00`: `policy`, `e2e`, `verify`, `security/snyk`, and `Greptile Review` are all `SUCCESS`. - Greptile Review on current head `34c9da00` reached 5/5. ## Risks - Medium risk: parser execution is now asynchronous through a worker while the existing parser interface is synchronous, so transcript updates should be watched with external adapters. - Some adapter parser bundles may rely on direct ESM `export` syntax or browser APIs that are no longer available inside the worker lockdown. - The worker lockdown is a hardening layer around external parser code, not a complete browser security sandbox for arbitrary untrusted applications. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use enabled. Exact hosted model build and context window are not exposed in this Paperclip heartbeat environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-21 13:42:44 -05:00
Dotta	a26e1288b6	[codex] Polish issue board workflows (#4224 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies > - Human operators supervise that work through issue lists, issue detail, comments, inbox groups, markdown references, and profile/activity surfaces > - The branch had many small UI fixes that improve the operator loop but do not need to ship with backend runtime migrations > - These changes belong together as board workflow polish because they affect scanning, navigation, issue context, comment state, and markdown clarity > - This pull request groups the UI-only slice so it can merge independently from runtime/backend changes > - The benefit is a clearer board experience with better issue context, steadier optimistic updates, and more predictable keyboard navigation ## What Changed - Improves issue properties, sub-issue actions, blocker chips, and issue list/detail refresh behavior. - Adds blocker context above the issue composer and stabilizes queued/interrupted comment UI state. - Improves markdown issue/GitHub link rendering and opens external markdown links in a new tab. - Adds inbox group keyboard navigation and fold/unfold support. - Polishes activity/avatar/profile/settings/workspace presentation details. ## Verification - `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx ui/src/components/IssueChatThread.test.tsx ui/src/components/MarkdownBody.test.tsx ui/src/lib/inbox.test.ts ui/src/lib/optimistic-issue-comments.test.ts` ## Risks - Low to medium risk: changes are UI-focused but cover high-traffic issue and inbox surfaces. - This branch intentionally does not include the backend runtime changes from the companion PR; where UI calls newer API filters, unsupported servers should continue to fail visibly through existing API error handling. - Visual screenshots were not captured in this heartbeat; targeted component/helper tests cover the changed behavior. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use enabled. Exact hosted model build and context window are not exposed in this Paperclip heartbeat environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-21 12:25:34 -05:00
Dotta	09d0678840	[codex] Harden heartbeat scheduling and runtime controls (#4223 ) ## Thinking Path > - Paperclip orchestrates AI agents through issue checkout, heartbeat runs, routines, and auditable control-plane state > - The runtime path has to recover from lost local processes, transient adapter failures, blocked dependencies, and routine coalescing without stranding work > - The existing branch carried several reliability fixes across heartbeat scheduling, issue runtime controls, routine dispatch, and operator-facing run state > - These changes belong together because they share backend contracts, migrations, and runtime status semantics > - This pull request groups the control-plane/runtime slice so it can merge independently from board UI polish and adapter sandbox work > - The benefit is safer heartbeat recovery, clearer runtime controls, and more predictable recurring execution behavior ## What Changed - Adds bounded heartbeat retry scheduling, scheduled retry state, and Codex transient failure recovery handling. - Tightens heartbeat process recovery, blocker wake behavior, issue comment wake handling, routine dispatch coalescing, and activity/dashboard bounds. - Adds runtime-control MCP tools and Paperclip skill docs for issue workspace runtime management. - Adds migrations `0061_lively_thor_girl.sql` and `0062_routine_run_dispatch_fingerprint.sql`. - Surfaces retry state in run ledger/agent UI and keeps related shared types synchronized. ## Verification - `pnpm exec vitest run server/src/__tests__/heartbeat-retry-scheduling.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts server/src/__tests__/routines-service.test.ts` - `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server` ## Risks - Medium risk: this touches heartbeat recovery and routine dispatch, which are central execution paths. - Migration order matters if split branches land out of order: merge this PR before branches that assume the new runtime/routine fields. - Runtime retry behavior should be watched in CI and in local operator smoke tests because it changes how transient failures are resumed. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use enabled. Exact hosted model build and context window are not exposed in this Paperclip heartbeat environment. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [ ] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-21 12:24:11 -05:00
Dotta	ab9051b595	Add first-class issue references (#4214 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - Operators and agents coordinate through company-scoped issues, comments, documents, and task relationships. > - Issue text can mention other tickets, but those references were previously plain markdown/text without durable relationship data. > - That made it harder to understand related work, surface backlinks, and keep cross-ticket context visible in the board. > - This pull request adds first-class issue reference extraction, storage, API responses, and UI surfaces. > - The benefit is that issue references become queryable, navigable, and visible without relying on ad hoc text scanning. ## What Changed - Added shared issue-reference parsing utilities and exported reference-related types/constants. - Added an `issue_reference_mentions` table, idempotent migration DDL, schema exports, and database documentation. - Added server-side issue reference services, route integration, activity summaries, and a backfill command for existing issue content. - Added UI reference pills, related-work panels, markdown/editor mention handling, and issue detail/property rendering updates. - Added focused shared, server, and UI tests for parsing, persistence, display, and related-work behavior. - Rebased `PAP-735-first-class-task-references` cleanly onto `public-gh/master`; no `pnpm-lock.yaml` changes are included. ## Verification - `pnpm -r typecheck` - `pnpm test:run packages/shared/src/issue-references.test.ts server/src/__tests__/issue-references-service.test.ts ui/src/components/IssueRelatedWorkPanel.test.tsx ui/src/components/IssueProperties.test.tsx ui/src/components/MarkdownBody.test.tsx` ## Risks - Medium risk because this adds a new issue-reference persistence path that touches shared parsing, database schema, server routes, and UI rendering. - Migration risk is mitigated by `CREATE TABLE IF NOT EXISTS`, guarded foreign-key creation, and `CREATE INDEX IF NOT EXISTS` statements so users who have applied an older local version of the numbered migration can re-run safely. - UI risk is limited by focused component coverage, but reviewers should still manually inspect issue detail pages containing ticket references before merge. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent, tool-using shell workflow with repository inspection, git rebase/push, typecheck, and focused Vitest verification. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: dotta <dotta@example.com> Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-04-21 10:02:52 -05:00
Dotta	1954eb3048	[codex] Detect issue graph liveness deadlocks (#4209 ) ## Thinking Path > - Paperclip orchestrates AI agents for zero-human companies. > - The heartbeat harness is responsible for waking agents, reconciling issue state, and keeping execution moving. > - Some dependency graphs can become live-locks when a blocked issue depends on an unassigned, cancelled, or otherwise uninvokable issue. > - Review and approval stages can also stall when the recorded participant can no longer be resolved. > - This pull request adds issue graph liveness classification plus heartbeat reconciliation that creates durable escalation work for those cases. > - The benefit is that harness-level deadlocks become visible, assigned, logged, and recoverable instead of silently leaving task sequences blocked. ## What Changed - Added an issue graph liveness classifier for blocked dependency and invalid review participant states. - Added heartbeat reconciliation that creates one stable escalation issue per liveness incident, links it as a blocker, comments on the affected issue, wakes the recommended owner, and logs activity. - Wired startup and periodic server reconciliation for issue graph liveness incidents. - Added focused tests for classifier behavior, heartbeat escalation creation/deduplication, and queued dependency wake promotion. - Fixed queued issue wakes so a coalesced wake re-runs queue selection, allowing dependency-unblocked work to start immediately. ## Verification - `pnpm exec vitest run server/src/__tests__/heartbeat-dependency-scheduling.test.ts server/src/__tests__/issue-liveness.test.ts server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts` - Passed locally: `server/src/__tests__/issue-liveness.test.ts` (5 tests) - Skipped locally: embedded Postgres suites because optional package `@embedded-postgres/darwin-x64` is not installed on this host - `pnpm --filter @paperclipai/server typecheck` - `git diff --check` - Greptile review loop: ran 3 times as requested; the final Greptile-reviewed head `0a864eab` had 0 comments and all Greptile threads were resolved. Later commits are CI/test-stability fixes after the requested max Greptile pass count. - GitHub PR checks on head `87493ed4`: `policy`, `verify`, `e2e`, and `security/snyk (cryppadotta)` all passed. ## Risks - Moderate operational risk: the reconciler creates escalation issues automatically, so incorrect classification could create noise. Stable incident keys and deduplication limit repeated escalation. - Low schema risk: this uses existing issue, relation, comment, wake, and activity log tables with no migration. - No UI screenshots included because this change is server-side harness behavior only. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5-based coding agent. Exact runtime model ID and context window were not exposed in this session. Used tool execution for git, tests, typecheck, Greptile review handling, and GitHub CLI operations. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge	2026-04-21 09:11:12 -05:00
Robin van Duiven	8d0c3d2fe6	fix(hermes): inject agent JWT into Hermes adapter env to fix identity attribution (#3608 ) ## Thinking Path > - Paperclip orchestrates AI agents and records their actions through auditable issue comments and API writes. > - The local adapter registry is responsible for adapting each agent runtime to Paperclip's server-side execution context. > - The Hermes local adapter delegated directly to `hermes-paperclip-adapter`, whose current execution context type predates the server `authToken` field. > - Without explicitly passing the run-scoped agent token and run id into Hermes, Hermes could inherit a server or board-user `PAPERCLIP_API_KEY` and lack a usable `PAPERCLIP_RUN_ID` for mutating API calls. > - That made Paperclip writes from Hermes agents risk appearing under the wrong identity or without the correct run-scoped attribution. > - This pull request wraps the Hermes execution call so Hermes receives the agent run JWT as `PAPERCLIP_API_KEY` and the current execution id as `PAPERCLIP_RUN_ID` while preserving explicit adapter configuration where appropriate. > - Follow-up review fixes preserve Hermes' built-in prompt when no custom prompt template exists and document the intentional type cast. > - The benefit is reliable agent attribution for the covered local Hermes path without clobbering Hermes' default heartbeat/task instructions. ## What Changed - Wrapped `hermesLocalAdapter.execute` so `ctx.authToken` is injected into `adapterConfig.env.PAPERCLIP_API_KEY` when no explicit Paperclip API key is already configured. - Injected `ctx.runId` into `adapterConfig.env.PAPERCLIP_RUN_ID` so the auth guard's `X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID` instruction resolves to the current run id. - Added a Paperclip API auth guard to existing custom Hermes `promptTemplate` values without creating a replacement prompt when no custom template exists. - Documented the intentional `as unknown as` cast needed until `hermes-paperclip-adapter` ships an `AdapterExecutionContext` type that includes `authToken`. - Added registry tests for JWT injection, run-id injection, explicit key preservation, default prompt preservation, and the no-`authToken` early-return path. ## Verification - [x] `pnpm --filter "./server" exec vitest run adapter-registry` - 8 tests passed. - [x] `pnpm --filter "./server" typecheck` - passed. - [x] Trigger a Hermes agent heartbeat and verify Paperclip writes appear under the agent identity rather than a shared board-user identity, with the correct run id on mutating requests. ## Risks - Low migration risk: this changes only the Hermes local adapter wrapper and tests. - Existing explicit `adapterConfig.env.PAPERCLIP_API_KEY` values are preserved to avoid breaking intentionally configured agents. - `PAPERCLIP_RUN_ID` is set from `ctx.runId` for each execution so mutating API calls use the current run id instead of a stale or literal placeholder value. - Prompt behavior is intentionally conservative: the auth guard is only prepended when a custom prompt template already exists, so Hermes' built-in default prompt remains intact for unconfigured agents. - Remaining operational risk: the identity and run-id behavior should still be verified with a live Hermes heartbeat before relying on it in production. ## Model Used - OpenAI Codex, GPT-5 family coding agent, tool use enabled for local shell, GitHub CLI, and test execution. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots (not applicable: backend-only change) - [x] I have updated relevant documentation to reflect my changes (not applicable: no product docs changed; PR description updated) - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing> Co-authored-by: Dotta <bippadotta@protonmail.com>	2026-04-21 07:18:11 -05:00

1 2 3 4 5 ...

2333 Commits