Commit Graph

2333 Commits

Author SHA1 Message Date
Dotta fda296ee4f [codex] Add configurable liveness auto-recovery controls (#4587)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat liveness recovery decides when stalled issue trees need
manager-visible follow-up.
> - Automatic recovery issue creation is useful, but operators need
instance-level controls for how aggressive it is.
> - Without controls, recovery behavior is harder to tune for local
development, production operations, and noisy edge cases.
> - This pull request adds configurable liveness auto-recovery settings
across shared contracts, API routes, services, and the instance
experimental settings UI.
> - The benefit is that operators can keep liveness findings advisory or
enable bounded recovery automation with explicit intervals and lookback
windows.

## What Changed

- Added shared types and validators for liveness auto-recovery settings.
- Extended instance settings routes and services to persist and validate
the new controls.
- Wired heartbeat/recovery services to honor enablement, minimum
interval, and lookback settings.
- Added UI controls for liveness recovery under instance experimental
settings.
- Covered the new server behavior with instance settings and liveness
escalation tests.
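
As a rough illustration of how enablement, minimum interval, and lookback could gate recovery, here is a minimal sketch. The type name, field names, and default values (60 minutes, 24 hours) are hypothetical and are not taken from the shared contracts in this PR; only the advisory-by-default behavior comes from the description above.

```typescript
// Hypothetical settings shape; the real shared contract may use different names and units.
interface LivenessAutoRecoverySettings {
  enabled: boolean;
  minIntervalMinutes: number;
  lookbackHours: number;
}

// Assumed defaults: disabled keeps liveness findings advisory.
const DEFAULT_SETTINGS: LivenessAutoRecoverySettings = {
  enabled: false,
  minIntervalMinutes: 60,
  lookbackHours: 24,
};

// Merge partial input over the defaults and reject non-positive or fractional values.
function validateSettings(
  input: Partial<LivenessAutoRecoverySettings>,
): LivenessAutoRecoverySettings {
  const merged = { ...DEFAULT_SETTINGS, ...input };
  if (!Number.isInteger(merged.minIntervalMinutes) || merged.minIntervalMinutes < 1) {
    throw new Error("minIntervalMinutes must be a positive integer");
  }
  if (!Number.isInteger(merged.lookbackHours) || merged.lookbackHours < 1) {
    throw new Error("lookbackHours must be a positive integer");
  }
  return merged;
}

// Decide whether a stalled-issue finding may trigger an automatic recovery issue.
function shouldCreateRecoveryIssue(
  settings: LivenessAutoRecoverySettings,
  lastRecoveryAt: Date | null,
  findingAt: Date,
  now: Date,
): boolean {
  if (!settings.enabled) return false; // advisory only
  const lookbackMs = settings.lookbackHours * 3600000;
  if (now.getTime() - findingAt.getTime() > lookbackMs) return false; // too old
  if (lastRecoveryAt === null) return true;
  const minIntervalMs = settings.minIntervalMinutes * 60000;
  return now.getTime() - lastRecoveryAt.getTime() >= minIntervalMs;
}
```

Keeping the gate a pure function of the settings and timestamps would make it straightforward to cover in the escalation tests.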

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/instance-settings-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Moderate behavioral risk because recovery automation timing changes
when enabled; defaults keep existing advisory behavior unless the
setting is turned on.
- No database migration in this PR; settings are stored through the
existing instance settings path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:46:44 -05:00
Neeraj Kumar Singh B f0f9460d1d docs: AWS ECS Fargate deployment runbook (#3897)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies and ships a
>   "local-first, cloud-ready" deployment model
> - The deploy docs currently cover local/Docker but not a production
>   cloud target, so teams asking "how do I put this behind a real domain"
>   have no canonical path
> - We already support Docker images, RDS-compatible Postgres, and an EFS
>   storage profile, so AWS ECS Fargate is a natural fit
> - Without a runbook, each team reinvents VPC, security groups, TLS, and
>   secrets wiring and usually gets at least one step wrong
> - This pull request adds `docs/deploy/aws-ecs.md`, an ECS task-definition
>   template, and an `.env.aws.example`, cross-linked from the deploy overview
> - The benefit is a single, reproducible ~$110/mo path to a production
>   deployment, plus a full teardown for throwaway environments

## What Changed

- New `docs/deploy/aws-ecs.md`: an 11-step ECS Fargate runbook covering ECR,
  VPC, RDS, EFS, Secrets Manager, IAM, ALB, and ECS service with the
  deployment circuit breaker enabled
- New `docker/ecs-task-definition.json`: Fargate-ready task definition with
  `<ACCOUNT_ID>`, `<REGION>`, `<EFS_ID>`, `<DOMAIN>` placeholder tokens
- New `docker/.env.aws.example`: documents every non-secret env var the
  ECS deployment needs
- `docs/deploy/overview.md`: one-line cross-reference to the new guide
- Greptile feedback addressed in follow-up commits:
  - `containerName` in the service-create call now matches
    `paperclip-server` in the task definition
  - HTTP :80 listener added that 301-redirects to :443
  - Dedicated RDS DB subnet group created before `create-db-instance`
  - EFS teardown polls on mount-target deletion instead of `sleep 30`

## Verification

- Walked every step of the runbook against the task definition to confirm
  variable names (`$ALB_SG`, `$ECS_SG`, `$RDS_SG`, `$EFS_SG`, `$TG_ARN`,
  `$LISTENER_ARN`, `$HTTP_LISTENER_ARN`, `$EFS_ID`, `$RDS_ENDPOINT`, etc.)
  are defined before they are referenced
- Confirmed the `containerName` in Step 10 (`paperclip-server`) matches
  `docker/ecs-task-definition.json` line 11
- Confirmed the `sed` placeholder substitution in Step 8 matches the tokens
  in the task definition template
- Teardown order was checked in reverse-dependency order: ECS service →
  listeners → target group → ALB → RDS (waits for deletion) → DB subnet
  group → EFS mount targets (polled) → EFS → secrets → SGs → ECR → IAM →
  log group

## Risks

- **Low risk for the repo.** Docs-only change plus two template files under
  `docker/`; no runtime code paths are touched and nothing is imported by
  the build.
- **Risk for users who follow the runbook:** AWS bills accrue immediately
  once RDS/ALB/EFS exist. The runbook calls this out and includes a full
  teardown procedure. Placeholder tokens (`<ACCOUNT_ID>`, `<REGION>`,
  `<EFS_ID>`, `<DOMAIN>`) are documented so nothing is silently hard-coded.

## Model Used

- Claude (Anthropic), model `claude-opus-4-6`, ~200K context window,
extended thinking mode on, used with tool access (file edit, shell) via
Claude Code. The Greptile follow-up commits were authored the same way.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass — N/A for docs/config
templates; validated by reading
- [x] I have added or updated tests where applicable — N/A for docs
- [x] If this change affects the UI, I have included before/after
screenshots — N/A, no UI
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 08:41:47 -05:00
Dotta 1d8c7a09b8 [codex] Add security role route regression (#4586)
## Thinking Path

> - Paperclip orchestrates AI agents through company-scoped
control-plane workflows.
> - Agent creation is one of the core board/operator surfaces for
defining who works in a company.
> - The shared taxonomy now includes a first-class `security` agent
role.
> - Direct agent creation must preserve that role through default
instruction materialization and telemetry.
> - A prior replacement PR covered this path, but Greptile identified
that the route-test mock could let a future patch object shadow the
regression.
> - This pull request reopens the narrow regression coverage from
current `master` with the mock ordering fixed.
> - The benefit is a focused guardrail that keeps `security` role
creation observable without expanding the production diff.

## What Changed

- Added a direct agent creation route regression test for `role:
"security"`.
- Verified telemetry receives `agentRole: "security"` after the default
instruction materialization update path.
- Ordered the regression mock as `...patch` before `role: "security"` so
future patch fields cannot shadow the asserted role.
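
The ordering point can be shown with plain object spreads. This is a minimal sketch, not the test's actual mock; the function names and the `AgentRecord` type are hypothetical.

```typescript
// In an object literal, later properties win, so where `role` sits
// relative to `...patch` decides which value survives.
type AgentRecord = Record<string, unknown>;

// Role written after the spread: the asserted role cannot be shadowed.
function applyPatchThenRole(base: AgentRecord, patch: AgentRecord): AgentRecord {
  return { ...base, ...patch, role: "security" };
}

// Role written before the spread: a patch that carries `role` overrides it.
function applyRoleThenPatch(base: AgentRecord, patch: AgentRecord): AgentRecord {
  return { ...base, role: "security", ...patch };
}
```

With a patch of `{ role: "engineer" }`, the first form still yields `role: "security"`, while the second yields `role: "engineer"`.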

## Verification

- `pnpm install --frozen-lockfile` to link dependencies in the fresh
worktree; it completed with existing plugin SDK bin warnings.
- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
packages/shared/src/adapter-types.test.ts`

## Risks

- Low risk. This is test-only coverage and does not change runtime
behavior.

## Model Used

- OpenAI Codex, GPT-5 based coding agent, tool-enabled with local shell
and repository editing capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
test-only regression)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:11:52 -05:00
Devin Foley d2cbe2cb23 Prefer pushing feature branches to a user fork in paperclip-dev skill (#4572)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The `paperclip-dev` skill is the canonical reference agents read
before doing development work on the Paperclip repo itself
> - Today the skill assumes feature branches get pushed to `origin` (=
`paperclipai/paperclip`), which clutters the upstream branch list when
contributors actually have personal forks
> - This is the standard open-source contribution pattern (push to fork,
PR upstream) and the skill should reflect it
> - This pull request adds a "Forks — Prefer Pushing to a User Fork"
section that teaches agents to detect a fork remote, push there, and
only fall back to `origin` when no fork is configured
> - The benefit is cleaner upstream branch hygiene and behavior that
matches typical contributor workflows without any code/runtime change

## What Changed

- Added a new **Forks — Prefer Pushing to a User Fork** section to
`skills/paperclip-dev/SKILL.md` covering:
  - How to detect a user fork via `git remote -v` (treat any
    non-`paperclipai` GitHub remote as the fork)
  - How to push to the fork (`git push -u <fork-remote> HEAD`)
  - How to create the PR from the fork (`gh pr create --repo
    paperclipai/paperclip --head <fork-owner>:<branch>`)
  - The no-fork fallback (push to `origin`, do not auto-create a fork;
    ask first)
  - Keeping the fork's `master` in sync
- Added a reinforcing entry to the **Common Mistakes** table linking
back to the new section
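
The detection rule (treat any non-`paperclipai` GitHub remote as the fork) could be sketched as a parser over `git remote -v` output. The function names below are hypothetical, and the skill itself describes the rule in prose rather than shipping code like this.

```typescript
interface Remote {
  name: string;
  url: string;
}

// Parse `git remote -v` output, keeping one entry per remote (its push URL).
function parseRemotes(output: string): Remote[] {
  const byName = new Map<string, string>();
  for (const line of output.split("\n")) {
    const match = line.trim().match(/^(\S+)\s+(\S+)\s+\((fetch|push)\)$/);
    if (match && match[3] === "push") byName.set(match[1], match[2]);
  }
  return Array.from(byName.entries()).map(([name, url]) => ({ name, url }));
}

// Extract the owner from an HTTPS or SSH GitHub URL, or null for other hosts.
function githubOwner(url: string): string | null {
  const match = url.match(/github\.com[:/]([^/]+)\/[^/]+?(?:\.git)?$/);
  return match ? match[1] : null;
}

// Return the first GitHub remote not owned by the upstream org, or null
// when no fork is configured (the caller then falls back to `origin`).
function pickForkRemote(output: string, upstreamOwner = "paperclipai"): Remote | null {
  for (const remote of parseRemotes(output)) {
    const owner = githubOwner(remote.url);
    if (owner !== null && owner.toLowerCase() !== upstreamOwner) return remote;
  }
  return null;
}
```

A `null` result corresponds to the no-fork fallback: push to `origin` and ask before creating a fork.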

## Verification

- Docs-only change to a single markdown skill file. Reviewer can confirm
by reading the diff in `skills/paperclip-dev/SKILL.md`:
  - New `## Forks — Prefer Pushing to a User Fork` section sits between
    `## Worktrees` and `## Pull Requests`
  - New row appended to the `## Common Mistakes` table
- No tests, no build, no runtime behavior affected.

## Risks

- Low risk. Documentation-only edit. The instructions are advisory —
they only change agent behavior on future runs that read the skill.

## Model Used

- Provider: Anthropic (Claude)
- Model ID: `claude-opus-4-7` (Claude Opus 4)
- Capabilities: tool use (file read/edit, shell, git, gh CLI), extended
reasoning
- Context: invoked via Claude Code / Paperclip heartbeat for issue
PAPA-139

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (N/A — docs-only change; no
test surface)
- [x] I have added or updated tests where applicable (N/A)
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — no UI change)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 22:19:07 -07:00
Dotta 82e257c7ba Cancel stale queued heartbeats when issue graph changes (PAP-2314) (#4534)
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:17:38 -05:00
Devin Foley 868d08903e test: isolate CLI company import e2e state (#4560)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and its
CLI import/export path is part of how operators move company state
safely between environments.
> - The `paperclipai company import/export` e2e test is supposed to
validate that portability flow inside a hermetic harness, not against a
developer's live Paperclip home.
> - This regression showed nested CLI subprocesses could silently fall
back to ambient `PAPERCLIP_*` state and mutate a real local instance by
creating extra companies such as `CLI-1-Roundtrip-Test`.
> - The first job was to pin the test subprocesses to isolated config,
home, instance, auth, and context paths, and to add a regression
assertion that proves the nested CLI writes stay inside the test-owned
state.
> - Once the PR was up, CI and Greptile exposed two follow-on issues
that were blocking merge: plugin SDK typecheck bootstrap was racing
across packages in fresh CI, and the new lock helper needed one more fix
to release its lock on failure.
> - This pull request therefore ends up doing two tightly related
things: fixing the original CLI isolation leak, and hardening the
supporting typecheck/bootstrap path enough for the fix to verify cleanly
in CI.
> - The benefit is that the portability e2e test is now actually
isolated, and the PR verification path is stable enough to catch
regressions instead of introducing its own nondeterministic failures.

## What Changed

- Hardened `cli/src/__tests__/company-import-export-e2e.test.ts` so
nested CLI subprocesses re-seed isolated `PAPERCLIP_CONFIG`,
`PAPERCLIP_HOME`, `PAPERCLIP_INSTANCE_ID`, `PAPERCLIP_CONTEXT`,
`PAPERCLIP_AUTH_STORE`, and throwaway `HOME` values instead of falling
back to ambient machine state.
- Added a regression assertion around `paperclipai context set --json`,
then cleared the temporary `context.json` so the isolation check and the
later export/import flow stay independent.
- Passed the same isolated `HOME` into the server subprocess so both
sides of the e2e harness are symmetric.
- Introduced locking in `scripts/ensure-plugin-build-deps.mjs` and
switched the server/plugin example `typecheck` scripts to use that
helper instead of launching concurrent raw `@paperclipai/plugin-sdk`
builds.
- Fixed the helper failure path so it releases the lock before exiting
non-zero, which prevents stale-lock timeouts during parallel typecheck
runs.
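
The isolation idea for nested CLI subprocesses can be sketched as building the child environment from scratch instead of inheriting ambient `PAPERCLIP_*` values. The helper name, the file names under `root`, and the instance ID are hypothetical; only the variable names come from the change described above.

```typescript
// Build an environment for a nested CLI subprocess that cannot fall back
// to the developer's real Paperclip state.
function buildIsolatedEnv(
  ambient: Record<string, string | undefined>,
  root: string,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(ambient)) {
    // Drop every ambient PAPERCLIP_* variable; keep unrelated ones such as PATH.
    if (value !== undefined && !key.startsWith("PAPERCLIP_")) env[key] = value;
  }
  // Re-seed each path under the test-owned root (file names are assumptions).
  env.HOME = `${root}/home`;
  env.PAPERCLIP_HOME = `${root}/paperclip-home`;
  env.PAPERCLIP_CONFIG = `${root}/config.json`;
  env.PAPERCLIP_INSTANCE_ID = "e2e-test";
  env.PAPERCLIP_CONTEXT = `${root}/context.json`;
  env.PAPERCLIP_AUTH_STORE = `${root}/auth.json`;
  return env;
}
```

Passing the same object to both the server and CLI subprocesses would keep the two sides of the harness symmetric.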

## Verification

- `pnpm vitest run cli/src/__tests__/company-import-export-e2e.test.ts
--project paperclipai`
- `pnpm --filter paperclipai typecheck`
- `pnpm -r typecheck`
- PR checks now pass on the current head, including `policy`, `verify`,
`e2e`, `security/snyk`, and `Greptile Review`.

## Risks

- Low risk. The product-facing behavior change is scoped to test harness
code in the CLI e2e suite.
- The CI stabilization changes only affect bootstrap/typecheck helper
paths for the server and plugin/example packages, but they do touch
shared verification plumbing; the main risk is changing how fresh build
artifacts are prepared in local/CI typecheck runs.

## Model Used

- Anthropic Claude via Paperclip `claude_local`, model
`claude-opus-4-7`, high-effort local coding agent, used for the initial
implementation and first peer-reviewed verification.
- OpenAI Codex via Paperclip `codex_local`, model `gpt-5.4`, high
reasoning-effort local coding agent with tool use, used for CI triage,
Greptile follow-up fixes, verification, and PR maintenance.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 19:10:01 -07:00
Devin Foley 1d9f7a5149 Fix flaky heartbeat recovery teardown CI failure (#4559)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The linked CI job is in the server test/recovery path, where
heartbeat runs and issue cleanup need to leave the control plane in a
consistent state even when retries fail.
> - In this case the failure was not runtime product behavior but test
teardown behavior inside `heartbeat-process-recovery.test.ts`.
> - The failing GitHub Actions job showed a foreign-key race on
`company_skills_company_id_companies_id_fk` while the test tried to
delete the parent company record.
> - The surrounding teardown code already uses bounded retry cleanup for
other dependent tables (`issues`, `heartbeatRuns`, and `agents`) because
this test file intentionally exercises asynchronous recovery flows.
> - This pull request applies that same retry pattern to the final
`db.delete(companies)` step, re-clearing `companySkills` before each
retry.
> - The benefit is a targeted fix for the CI flake without changing
runtime behavior or expanding the scope beyond the failing teardown
path.

## What Changed

- Wrapped the final `db.delete(companies)` call in
`server/src/__tests__/heartbeat-process-recovery.test.ts` with the same
5-attempt retry pattern already used elsewhere in that teardown.
- Re-cleared `companySkills` before each company-delete retry so
late-arriving FK-dependent rows do not mask the real test result.
- Verified the fix against the originally failing
`heartbeat-process-recovery` test file and the broader `pnpm test:run`
command under CI-like env conditions.
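
A bounded retry of this kind might look like the sketch below. The helper name, signature, and delay are assumptions; the test file's actual retry code is not shown in this description.

```typescript
// Retry a teardown step a bounded number of times, re-running a cleanup
// hook before every attempt so late-arriving dependent rows are cleared.
async function retryCleanup(
  step: () => Promise<void>,
  beforeEachAttempt: () => Promise<void>,
  attempts = 5,
  delayMs = 50,
): Promise<void> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      await beforeEachAttempt(); // e.g. clear rows that reference the parent
      await step(); // e.g. delete the parent row
      return;
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Surface the final failure instead of hiding it behind the retries.
  throw lastError;
}
```

In the teardown described here, `beforeEachAttempt` would correspond to re-clearing `companySkills` and `step` to the final `db.delete(companies)` call.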

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`
- Re-ran `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts` multiple times
locally; the previously failing teardown stayed green.
- `env -u PAPERCLIP_API_URL -u PAPERCLIP_RUNTIME_API_URL -u
PAPERCLIP_RUN_ID -u PAPERCLIP_TASK_ID -u PAPERCLIP_AGENT_ID -u
PAPERCLIP_COMPANY_ID -u PAPERCLIP_API_KEY -u PAPERCLIP_WAKE_REASON -u
PAPERCLIP_WAKE_COMMENT_ID -u PAPERCLIP_WAKE_PAYLOAD_JSON -u
PAPERCLIP_APPROVAL_ID -u PAPERCLIP_APPROVAL_STATUS pnpm test:run`

## Risks

- Low risk. The change is test-only and scoped to teardown retry
behavior in a single server test file.
- If the underlying async cleanup behavior changes again, this test
could still become flaky in a different way, but this PR addresses the
specific FK race seen in the linked CI job.

## Model Used

- OpenAI `gpt-5.4` via Paperclip `codex_local`, high reasoning mode,
with tool use for shell, git, HTTP API calls, and patch application.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:30:20 -07:00
Devin Foley 8145141c55 Fix external issue URL rewriting in markdown (#4558)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Issue and comment rendering is part of the board UI where humans
supervise and inspect agent work.
> - External Paperclip issue URLs can appear in comments as references
to other runs, review threads, or remote test environments.
> - Those links must preserve their full destination, including origin,
port, and `#comment-...` fragments, or the operator is taken to the
wrong place.
> - The bug here was that absolute `http(s)` issue URLs were being
normalized into internal `/issues/...` routes in the markdown path.
> - This pull request stops rewriting absolute URLs while keeping
internal issue-reference behavior for relative paths and identifiers.
> - The benefit is that authored external links now navigate exactly
where the operator expects, especially for remote test and
comment-deep-link workflows.

## What Changed

- Stopped `ui/src/lib/issue-reference.ts` from treating absolute
`http(s)` URLs as internal issue paths.
- Added defense-in-depth in `ui/src/lib/mention-chips.ts` so absolute
`http(s)` URLs are never reclassified as issue mention chips.
- Updated `ui/src/lib/issue-reference.test.ts` to cover absolute
Paperclip URLs with preserved origin, port, and comment hash.
- Updated `ui/src/components/MarkdownBody.test.tsx` to assert the
reported URL renders as an external link, not an internal `/issues/...`
href.
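
The narrowed rule can be sketched as a guard that returns early for absolute URLs. The function name and the issue-identifier pattern are hypothetical; this is not the code in `ui/src/lib/issue-reference.ts`.

```typescript
const ABSOLUTE_HTTP_URL = /^https?:\/\//i;

// Map a relative issue path to an internal route, or return null when the
// href should be left exactly as authored.
function toInternalIssueHref(href: string): string | null {
  const trimmed = href.trim();
  // Absolute http(s) links keep their origin, port, and fragment untouched.
  if (ABSOLUTE_HTTP_URL.test(trimmed)) return null;
  // Assumed identifier shape: uppercase prefix, dash, number (e.g. PAPA-115).
  const match = trimmed.match(/(?:^|\/)issues\/([A-Z][A-Z0-9]*-\d+)(#.*)?$/);
  return match ? `/issues/${match[1]}${match[2] ?? ""}` : null;
}
```

Under this sketch, `http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-...` yields `null` and stays an external link, while a relative path to the same issue is still mapped to an internal route.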

## Verification

- `pnpm exec vitest run ui/src/lib/issue-reference.test.ts
ui/src/components/MarkdownBody.test.tsx`
- Expected result: `2` files passed, `37` tests passed.
- Manual spot-check from the issue report path: a URL like
`http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-...`
should remain an external link with its full destination preserved.

## Risks

- Low risk. The change narrows when Paperclip rewrites URLs, so the main
risk is if some existing workflow depended on absolute `http(s)`
Paperclip URLs being converted into internal issue links. The added
regression coverage is aimed at keeping the new behavior from regressing
silently.

## Model Used

- OpenAI Codex local agent via Paperclip `codex_local`
- Backing model family: GPT-5-based Codex runtime
- Exact backend model ID/version: not exposed by this adapter/runtime
surface
- Context window: not exposed by this adapter/runtime surface
- Capabilities used: tool use, shell command execution, code editing,
git operations, and local test execution

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:19:23 -07:00
Devin Foley 54ab0d24cd Fix disappearing issue comments (#4557)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue detail
pages are a primary surface for understanding agent work and human
feedback.
> - The relevant subsystem here is the issue comments/chat experience
across the React issue detail page and the server comment pagination
API.
> - Long issue threads were only surfacing the newest page of comments
at first render, which hid earlier human and agent messages behind extra
pagination.
> - The first UI fix exposed that the descending cursor path on the
server could also fail for older-page fetches, leaving the chat tab
stuck on an infinite "Loading earlier comments..." state.
> - This needed to be addressed in both layers so the chat tab can
surface earlier conversation history without manual recovery and without
server errors.
> - This pull request auto-loads earlier comment pages in the issue
detail chat view and fixes the descending cursor predicate used by issue
comment pagination.
> - The benefit is that long-running issues like `PAPA-103` now show the
missing conversation history near the top of the chat surface instead of
hiding it or failing to load it.

## What Changed

- Auto-load earlier issue comment pages in the issue detail chat tab
until the thread reaches a 150-comment cap or there are no older
comments left.
- Add UI-side guard logic and regression coverage for optimistic issue
comment pagination so the autoload behavior stops cleanly.
- Replace the raw SQL descending cursor predicate in
`issueService.listComments` with typed Drizzle comparisons for the
`(createdAt, id)` anchor tuple.
- Add a server regression test that paginates earlier comments in
descending order from an anchor comment.
- Smoke-test the exact previously failing seeded `PAPA-103` cursor path
on the isolated dev instance used for review.
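
The `(createdAt, id)` anchor comparison behind descending pagination can be illustrated in memory. This sketch uses plain arrays rather than Drizzle, the names are hypothetical, and comparing `id` as a string is an assumption that may not match the database's ordering for that column.

```typescript
interface CommentRow {
  id: string;
  createdAt: Date;
}

// A row is older than the anchor when its timestamp is earlier, or when
// the timestamps tie and its id sorts before the anchor's id.
function isBeforeAnchor(row: CommentRow, anchor: CommentRow): boolean {
  const rowTime = row.createdAt.getTime();
  const anchorTime = anchor.createdAt.getTime();
  return rowTime < anchorTime || (rowTime === anchorTime && row.id < anchor.id);
}

// Newest first, with id as the tie-breaker so the order is total.
function compareDescending(a: CommentRow, b: CommentRow): number {
  const byTime = b.createdAt.getTime() - a.createdAt.getTime();
  if (byTime !== 0) return byTime;
  return a.id < b.id ? 1 : a.id > b.id ? -1 : 0;
}

// Return the next page of comments strictly older than the anchor.
function pageDescending(rows: CommentRow[], anchor: CommentRow, limit: number): CommentRow[] {
  return rows
    .filter((row) => isBeforeAnchor(row, anchor))
    .sort(compareDescending)
    .slice(0, limit);
}
```

The tie-breaker matters because comments can share a `createdAt` value; comparing on the timestamp alone would skip or repeat rows at page boundaries.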

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- Manual smoke against seeded `PAPA-103` data on the isolated dev
server:
  - `GET /api/issues/PAPA-103/comments?order=desc&limit=50` returns `200`
  - `GET /api/issues/PAPA-103/comments?after=765d3609-edc6-4d11-a8fe-d466affbe85d&order=desc&limit=50`
    now returns `200` with 50 comments instead of `500`

## Risks

- Moderate UI/perf risk on very large threads because the chat tab now
prefetches multiple earlier pages on mount; the cap is intentionally
limited to 150 comments to bound that work.
- Low API risk because the server fix only changes the cursor predicate
construction for anchor-based comment pagination, but any mistake there
would affect older-comment paging order.

> I checked `ROADMAP.md` before opening this PR and this bug fix does
not duplicate planned core work.

## Model Used

- OpenAI Codex coding agent in the Paperclip local adapter environment.
The exact backend model ID and context window were not exposed
in-session. Tool-assisted workflow included shell execution, git/GitHub
CLI, local test execution, and targeted code edits.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 16:23:53 -07:00
Devin Foley b2496c8067 fix(auth): trust allowed hostname port variants on detected listen port (#4554)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
authenticated board access has to be predictable across local and
worktree deployments.
> - This change sits in the authenticated-mode server startup and Better
Auth origin-trust wiring.
> - The original auth branch fixed one real gap by adding port-qualified
trusted origins for allowed hostnames on non-default ports.
> - Review of that branch found a second-order bug: trusted origins were
still derived from the configured port before startup detected the
actual listen port.
> - In isolated worktrees, that meant a common `3100 -> 3101` port shift
could still leave Better Auth trusting the stale origin.
> - This pull request keeps the original allowed-hostname port-variant
fix, then moves trust derivation onto the resolved listen port and adds
regression coverage around startup wiring.
> - The benefit is that authenticated sessions keep working on allowed
private hostnames even when Paperclip has to auto-shift to a different
local port.

## What Changed

- Added `:port` trusted-origin variants for authenticated-mode
`allowedHostnames` when Paperclip runs on non-default ports.
- Changed authenticated startup so `listenPort` is detected before
Better Auth initialization, and explicit auth base URLs are rewritten
before auth startup.
- Updated `deriveAuthTrustedOrigins()` to accept the resolved listen
port so Better Auth trusts the actual browser origin instead of the
stale configured port.
- Added focused regression coverage in
`server/src/__tests__/better-auth.test.ts` and
`server/src/__tests__/server-startup-feedback-export.test.ts`.
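
Deriving origins from the resolved listen port, rather than the configured one, might look like the sketch below. The function name and signature are hypothetical and are not the actual `deriveAuthTrustedOrigins()` implementation.

```typescript
// Build the set of browser origins to trust for each allowed hostname,
// using the port the server actually bound to.
function deriveTrustedOrigins(allowedHostnames: string[], listenPort: number): string[] {
  const origins = new Set<string>();
  for (const raw of allowedHostnames) {
    const host = raw.trim().toLowerCase();
    if (host === "") continue;
    for (const scheme of ["http", "https"]) {
      origins.add(`${scheme}://${host}`);
      // Browsers omit default ports from the Origin header, so only add a
      // port-qualified variant when the port is not the scheme's default.
      const isDefaultPort =
        (scheme === "http" && listenPort === 80) ||
        (scheme === "https" && listenPort === 443);
      if (!isDefaultPort) origins.add(`${scheme}://${host}:${listenPort}`);
    }
  }
  return Array.from(origins);
}
```

If startup shifts from `3100` to `3101`, calling this with the resolved port yields `:3101` origins and no stale `:3100` entries.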

## Verification

- `pnpm exec vitest run server/src/__tests__/better-auth.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts`
- Reviewer re-check: reviewed commits `380f5b9f` and `092bb34c` after
the follow-up fix landed and found no remaining issues.

## Risks

- Low risk: this only affects authenticated-mode origin derivation and
startup ordering around detected listen ports.
- Main behavioral shift: startup no longer mutates `config.port` to the
selected port; it now carries `requestedListenPort` separately and uses
`listenPort` where runtime behavior needs the resolved value.
- If another path was implicitly relying on `config.port` being
overwritten during startup, that path would need follow-up, though the
current startup/test coverage did not reveal one.

> I checked `ROADMAP.md` and did not find an overlapping planned core
work item for this auth trusted-origin port handling fix.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agents for implementation and
review. Exact backend model ID/context window were not surfaced in this
run context; work was performed through the Codex local adapter with
tool use, code execution, and review passes.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 15:40:39 -07:00
Devin Foley 08af830430 Tighten publicBaseUrl port rewriting (#4553)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
local and authenticated deployment behavior has to stay predictable
under port rebinding and worktree isolation.
> - This change sits in the server/worktree configuration path that
derives runtime URLs and auth origins from `auth.publicBaseUrl`.
> - The original hostname-port rewrite change fixed one real gap for
private/tailnet host:port worktree setups, but it widened the rewrite
rule too far.
> - Rewriting every explicit `auth.publicBaseUrl` can corrupt public or
reverse-proxy URLs by turning a stable origin like
`https://paperclip.example` into a local listen-port URL.
> - Paperclip's auth and trusted-origin handling depend on that URL
staying semantically correct, so this had to be narrowed before merge.
> - This pull request tightens the rewrite rule to explicit-port URLs
only and adds regression coverage across the CLI helper, worktree config
persistence, and server startup path.
> - The benefit is that private host:port worktree flows still work,
while public/default-port URLs remain stable and safe.

## What Changed

- Tightened `rewriteLocalUrlPort` in `cli/src/commands/worktree-lib.ts`,
`server/src/worktree-config.ts`, and `server/src/index.ts` so it only
rewrites URLs that already include an explicit port.
- Removed the old loopback-only hostname gate from the CLI/worktree
helpers and replaced it with the more precise `parsed.port` guard.
- Updated CLI helper coverage to assert that explicit-port non-loopback
URLs still rewrite while no-port public URLs stay unchanged.
- Expanded `server/src/__tests__/worktree-config.test.ts` to cover
explicit-port rewrite and no-port stability for both persisted worktree
config and in-memory runtime port selection.
- Added startup-path coverage in
`server/src/__tests__/server-startup-feedback-export.test.ts` for
`detect-port` rebinding with both explicit-port and no-port
`auth.publicBaseUrl` values.
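
The explicit-port guard described in this list can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: only the name `rewriteLocalUrlPort` and the `parsed.port` guard come from the description above, while the signature, the `undefined` handling, and the fallback for unparseable input are hypothetical.

```typescript
// Hypothetical sketch: the real helper's signature and error handling may differ.
function rewriteLocalUrlPort(
  rawUrl: string | undefined,
  port: number,
): string | undefined {
  if (!rawUrl) return rawUrl;

  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    // Assumption: values that are not valid URLs are left untouched.
    return rawUrl;
  }

  // The WHATWG URL API reports an empty string when the URL carries no
  // port (or only the scheme's default port), so such URLs stay stable.
  if (!parsed.port) return rawUrl;

  parsed.port = String(port);
  return parsed.toString();
}
```

Under this sketch, `http://100.64.0.5:3100` would follow the listen port, while `https://paperclip.example` would be returned unchanged. Because the URL parser drops a scheme's default port, `https://paperclip.example:443` would also count as a no-port URL here.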

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `npx vitest run
server/src/__tests__/server-startup-feedback-export.test.ts`
- `npx vitest run cli/src/__tests__/worktree.test.ts
server/src/__tests__/worktree-config.test.ts`
- All of the above were run locally in this issue worktree and passed.

## Risks

- Low risk. The behavior change is deliberately narrower than the
reviewed broad-host rewrite and is guarded by regression coverage for
both the explicit-port and no-port cases.
- The main remaining risk arises only if another code path starts
depending on port rewriting for URLs that never declared a port, which
would be a separate bug.

## Model Used

- OpenAI Codex local agent using `gpt-5.4` with high reasoning effort,
tool use, shell execution, and file editing.
- Anthropic Claude local agent using `claude-opus-4-6` for follow-up
code review approval on the implementation issue.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 14:29:22 -07:00
Devin Foley d47ffa87f0 Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The local adapter layer is responsible for turning Paperclip runtime
context into the environment seen by the child agent process.
> - The CEO onboarding bundle tells the agent where to read and write
its persistent memory and fact files.
> - That bundle was using `./memory/...` and `./life/...`, which only
works when the process cwd happens to equal the agent home directory.
> - At the same time, six local adapters each duplicated the same
workspace-env propagation logic, including `AGENT_HOME`, which makes
this contract easy to drift.
> - This pull request fixes the CEO instructions to use
`$AGENT_HOME/...` and centralizes workspace-env propagation in one
shared helper with shared tests.
> - The benefit is a real bug fix for agent memory paths plus a single
tested contract that makes future built-in adapter work less likely to
forget `AGENT_HOME`.

## What Changed

- Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use
`$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of
cwd-relative `./memory/...` and `./life/...`.
- Added `applyPaperclipWorkspaceEnv(...)` in
`packages/adapter-utils/src/server-utils.ts` to centralize
`PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation.
- Added shared helper coverage in
`packages/adapter-utils/src/server-utils.test.ts` for both populated and
skip-empty cases.
- Switched the built-in local adapters (`claude_local`, `codex_local`,
`cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to
the shared helper instead of inline env assignment blocks.
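
A shared helper with "set only non-empty strings" semantics might look roughly like the sketch below. The function name, the input shape, and the two `PAPERCLIP_WORKSPACE_*` variable names are hypothetical; this PR text only confirms the `PAPERCLIP_WORKSPACE_` prefix, `AGENT_HOME`, and the skip-empty behavior.

```typescript
// Hypothetical input shape; the real helper's parameters are not shown in this PR.
interface WorkspaceEnvInput {
  workspaceCwd?: string | null;
  workspaceSource?: string | null;
  agentHome?: string | null;
}

// Copies workspace context into the child-process env, skipping empty values.
function applyWorkspaceEnvSketch(
  env: Record<string, string>,
  input: WorkspaceEnvInput,
): Record<string, string> {
  const candidates: Array<[string, string | null | undefined]> = [
    ["PAPERCLIP_WORKSPACE_CWD", input.workspaceCwd], // hypothetical variable name
    ["PAPERCLIP_WORKSPACE_SOURCE", input.workspaceSource], // hypothetical variable name
    ["AGENT_HOME", input.agentHome],
  ];

  for (const [key, value] of candidates) {
    // Assumption: "non-empty" means a string of length > 0; whitespace is not trimmed.
    if (typeof value === "string" && value.length > 0) {
      env[key] = value;
    }
  }
  return env;
}
```

Keeping the variable list in one table is what lets a single test cover all six adapters: an adapter that calls the helper cannot forget `AGENT_HOME` on its own.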

## Verification

- `pnpm install`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- Result: 7 test files passed, 31 tests passed, 0 failures.

## Risks

- Low risk.
- The only behavioral surface is the shared env propagation refactor
across six adapters; if the helper diverged from prior semantics, an
adapter could miss a workspace env var.
- The shared helper test plus the affected adapter execute tests reduce
that risk, and the helper preserves the prior "set only non-empty
strings" behavior.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted
coding workflow with shell execution, file patching, git operations, and
API interaction. The exact backend model identifier and context window
are not surfaced by this local runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 13:57:35 -07:00
Devin Foley d1484551ee Add open-source hygiene note to paperclip-dev skill (#4541)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The `paperclip-dev` skill is part of the contributor and agent
workflow layer that tells developers how to work in this repository
safely.
> - That skill already references the public upstream `origin`, but it
did not explicitly say that pushes there must be treated as publishable
open-source output.
> - Without that reminder, contributors are more likely to leak secrets,
PII, private logs, machine-local config, or noisy throwaway git history
into the public repo.
> - This pull request adds a prominent `OPEN SOURCE HYGIENE` callout
near the top of the skill, before the git workflow guidance.
> - The benefit is clearer safety guidance for contributors and less
accidental disclosure or branch/commit noise on the upstream project.

## What Changed

- Added an `OPEN SOURCE HYGIENE` callout near the top of
`skills/paperclip-dev/SKILL.md`.
- Explicitly warned that anything pushed to `origin` must be
publishable.
- Called out avoiding secrets, API keys, PII, private logs,
machine-local config, and noisy throwaway branches or checkpoint
commits.

## Verification

- N/A

## Risks

- Low risk. This is a docs-only change in a skill file; the main risk is
wording tone or placement, not runtime behavior.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter, GPT-5-based
coding agent runtime. Exact backend serving model ID is not exposed in
this heartbeat environment. Tool use, shell execution, and patch
application were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 12:14:49 -07:00
Devin Foley 91333ec86f feat: add paperclip-dev skill with optional bundled skill support (#3854)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents working on the Paperclip codebase itself need guidance on dev
workflows: server lifecycle, worktrees, builds, database ops,
diagnostics
> - There was no bundled skill covering these workflows — agents had to
figure it out from scratch each time
> - Additionally, not every skill should be force-installed on every
agent — a dev-focused skill should be opt-in
> - This PR adds a `paperclip-dev` skill with `required: false`
frontmatter so it ships with Paperclip but isn't auto-installed
> - The skill's PR section references canonical files
(`.github/PULL_REQUEST_TEMPLATE.md`, `CONTRIBUTING.md`) instead of
duplicating their content, with gated instructions that force agents to
read those files before creating any PR
> - The benefit is that developers (human or agent) can opt in to
structured dev guidance without polluting the default agent skill set or
creating drift between duplicated docs

## What Changed

- Added `skills/paperclip-dev/SKILL.md` covering server management,
worktree lifecycle, builds, database ops, diagnostics, agent operations,
and common mistakes
- The Pull Requests section uses gated, reference-based instructions —
agents MUST read `.github/PULL_REQUEST_TEMPLATE.md` and
`CONTRIBUTING.md` before running `gh pr create`, with a brief checklist
of required section names (no content duplication)
- Updated `packages/adapter-utils/src/server-utils.ts` to respect
`required: false` frontmatter — optional skills are bundled but not
auto-installed on agents
- Added test in `server/src/__tests__/paperclip-skill-utils.test.ts`
verifying that optional skills are excluded from the default install set
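
The opt-in rule can be illustrated with a small filter. This is a sketch under assumptions: the entry shape and the function name are hypothetical and are not the signatures of the real skill utilities. The one rule taken from this PR is that `required` counts as `true` unless the frontmatter sets it to `false`.

```typescript
// Hypothetical entry shape for a bundled skill after frontmatter parsing.
interface SkillEntrySketch {
  name: string;
  required?: boolean; // absent means required
}

// Returns the names that would be installed by default.
function defaultSkillNamesSketch(entries: SkillEntrySketch[]): string[] {
  return entries
    .filter((entry) => entry.required !== false)
    .map((entry) => entry.name);
}
```

Comparing against `false` rather than testing for truthiness is what keeps existing skills, which have no `required` field, in the default set.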

## Verification

```bash
# Run tests
pnpm test

# Manual verification: create a fresh worktree without seeding
npx paperclipai worktree:make test-optional-skill --no-seed
cd ~/paperclip-test-optional-skill
eval "$(npx paperclipai worktree env)"
npx paperclipai run

# Verify paperclip-dev appears in company skill library but is NOT auto-assigned
# Call listPaperclipSkillEntries() — paperclip-dev should show required: false
# Call resolvePaperclipDesiredSkillNames() — paperclip-dev should NOT be in the default set

# Cleanup
npx paperclipai worktree:cleanup test-optional-skill
```

## Risks

- Low risk. The `required` field defaults to `true` when absent, so all
existing skills behave identically. Only the new `paperclip-dev` skill
sets `required: false`.

## Model Used

Claude Opus 4.6 (`claude-opus-4-6`) via Claude Code, with tool use and
extended context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 11:06:13 -07:00
Dotta c036bbfa98 Add first-class security agent role to taxonomy (#4532)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies, so agent
metadata is part of the platform's governance and audit surface.
> - The shared agent taxonomy in `@paperclipai/shared` is the source of
truth for allowed agent roles and their UI labels.
> - The current taxonomy lacks a `security` role, which causes Security
Engineer hires to collapse into `engineer`.
> - That breaks separation-of-duties evidence in telemetry and weakens
role-level audit fidelity even though it does not directly change
permissions.
> - This pull request adds `security` as a first-class shared role and
covers the prior rejection path with a regression test.
> - The benefit is that Security Engineer agents can now be persisted
and rendered under the correct role without schema or permission churn.

## What Changed

- Added `security` to `AGENT_ROLES` in
`packages/shared/src/constants.ts`.
- Added the `Security` display label to `AGENT_ROLE_LABELS` so existing
UI consumers render the new role automatically.
- Added a shared validator regression test proving `createAgentSchema`
accepts `role: "security"` and that the label stays stable.
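
The shape of the enum expansion can be sketched as below. The role list is abbreviated to the two roles named in this PR, the `Engineer` label is an assumption, and the plain type guard stands in for the real `createAgentSchema` validator, whose implementation is not shown here.

```typescript
// Abbreviated, illustrative role list; the real constant contains more roles.
const AGENT_ROLES = ["engineer", "security"] as const;
type AgentRole = (typeof AGENT_ROLES)[number];

// Typing the labels as Record<AgentRole, string> makes the compiler flag
// any role that is added without a matching label.
const AGENT_ROLE_LABELS: Record<AgentRole, string> = {
  engineer: "Engineer", // assumed label
  security: "Security",
};

// Stand-in for schema validation: accepts only known roles.
function isAgentRole(value: string): value is AgentRole {
  return (AGENT_ROLES as readonly string[]).includes(value);
}
```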

## Verification

- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/shared exec vitest run
src/adapter-types.test.ts`

## Risks

- Low risk. This is a shared enum expansion with no database migration
and no permission-model change.
- Residual risk: this PR does not backfill existing agents already
persisted as `engineer`; it only fixes new validations and labels going
forward.

> I checked `ROADMAP.md`/`doc` for overlap and did not find an existing
planned item covering this taxonomy fix.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, with tool use and local
code execution enabled. The runtime did not surface a separate
context-window identifier in agent metadata.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 07:52:05 -05:00
Dotta df425fde96 Present ordered sub-issues as a workflow checklist (#4523)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators use issue detail pages and child issue lists to understand
multi-step execution plans.
> - Ordered sub-issues currently read like a flat table, so dependency
chains and current next steps are harder to scan.
> - The branch work adds a workflow-oriented presentation for child
issues without changing the single-assignee task model.
> - This pull request makes ordered sub-issues read more like a progress
checklist while preserving normal issue list controls.
> - The benefit is that operators can see completed steps, active work,
blocked follow-ups, and dependency order at a glance.

## What Changed

- Added workflow sorting utilities and tests for dependency-aware child
issue ordering.
- Added sub-issue progress summary, checklist numbering, current-step
affordances, blocker context, and done-state de-emphasis in the issue
list UI.
- Wired issue detail sub-issue panels to use the workflow sort/progress
checklist presentation.
- Updated issue service behavior/tests for child issue ordering inputs
used by the UI.
- Added a Storybook visual review fixture and screenshot helper for the
sub-issue workflow checklist surface.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issues-service.test.ts
ui/src/components/IssueRow.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/pages/IssueDetail.test.tsx
ui/src/lib/issue-detail-subissues.test.ts
ui/src/lib/workflow-sort.test.ts`
- Result: 6 test files passed, 55 tests passed, 34 embedded Postgres
issue-service tests skipped because `@embedded-postgres/darwin-x64` is
unavailable on this host.
- Visual review: generated Storybook screenshots from the existing local
Storybook server on port 6006 with `node
scripts/screenshot-subissues.mjs /tmp/pap-2189-subissues-screens
http://localhost:6006`.
- Screenshot artifacts:
- Desktop dark: ![Desktop
dark](doc/assets/pap-2189/desktop-1440x900-dark.png)
- Desktop light: ![Desktop
light](doc/assets/pap-2189/desktop-1440x900-light.png)
- Mobile dark: ![Mobile
dark](doc/assets/pap-2189/mobile-390x844-dark.png)
- Mobile light: ![Mobile
light](doc/assets/pap-2189/mobile-390x844-light.png)
- Local Storybook note: starting a second Storybook process selected
port 6008 because 6006 was occupied, then Vite failed with an esbuild
host/binary version mismatch (`0.25.12` host vs `0.27.3` binary). The
already-running Storybook server on 6006 served the fixture successfully
for screenshots.

## Risks

- Medium UI risk: the issue list now has additional sub-issue-specific
visual states, so dense lists should be checked for spacing and
scanability.
- Low ordering risk: workflow sorting is covered by focused unit tests,
but unusual dependency topologies may still need reviewer attention.
- No migration risk: this PR does not add database migrations or touch
`pnpm-lock.yaml`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window is runtime-provided and not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 07:36:49 -05:00
Devin Foley 40782f703d Fix release packaging for standalone public packages (#4494)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and the
sandbox-provider work just moved E2B into a standalone publishable
plugin package.
> - That plugin is intentionally excluded from the root pnpm workspace
so it can model third-party install behavior without forcing lockfile
churn in the main repo.
> - The merged architecture change exposed a follow-up release problem:
the canary publish workflow tried to publish `@paperclipai/plugin-e2b`,
but the tarball had no `dist/` payload because standalone public
packages were not being built in the release path.
> - That means the release pipeline needed a packaging fix in core
release tooling, not another architectural change in the sandbox
provider itself.
> - This pull request adds a generic release step for public packages
that live outside the pnpm workspace, instead of hardcoding E2B-specific
behavior into the release script.
> - The benefit is that standalone publishable packages can be built and
packed correctly during release, including future sandbox-provider
plugins that follow the same pattern.

## What Changed

- Added `scripts/build-standalone-public-packages.mjs` to discover
public packages outside the pnpm workspace, run a clean package-local
install, and build them before publish.
- Updated `scripts/release.sh` to invoke that helper immediately after
the normal workspace build step.
- Kept the behavior generic by driving off the existing public package
map and pnpm workspace patterns rather than special-casing
`@paperclipai/plugin-e2b`.
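
The discovery step, finding public packages that no workspace pattern covers, can be sketched as a pure function. Everything here is illustrative: the map shape (package name to directory), the function names, and the simplified glob handling (`*`, `**`, and leading `!` negation only) are assumptions, and the real script may read these inputs differently.

```typescript
// Escapes regular-expression metacharacters in a literal path fragment.
function escapeRegExpSketch(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Simplified glob: "*" matches within one path segment, "**" matches across segments.
function patternToRegExpSketch(pattern: string): RegExp {
  const source = pattern
    .split("**")
    .map((part) => part.split("*").map((p) => escapeRegExpSketch(p)).join("[^/]*"))
    .join(".*");
  return new RegExp(`^${source}$`);
}

// Returns directories of public packages not covered by the workspace patterns.
function findStandalonePackagesSketch(
  publicPackages: Record<string, string>, // hypothetical: name -> directory
  workspacePatterns: string[],
): string[] {
  const include = workspacePatterns
    .filter((p) => !p.startsWith("!"))
    .map((p) => patternToRegExpSketch(p));
  const exclude = workspacePatterns
    .filter((p) => p.startsWith("!"))
    .map((p) => patternToRegExpSketch(p.slice(1)));

  const inWorkspace = (dir: string): boolean =>
    include.some((re) => re.test(dir)) && !exclude.some((re) => re.test(dir));

  return Object.values(publicPackages).filter((dir) => !inWorkspace(dir));
}
```

Driving the result from the two existing inputs means a future standalone plugin is picked up without touching the release script, which is the generic behavior this PR describes.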

## Verification

- `rm -rf packages/plugins/sandbox-providers/e2b/dist`
- `node ./scripts/build-standalone-public-packages.mjs`
- `cd packages/plugins/sandbox-providers/e2b && npm pack --dry-run`
- Confirm the tarball now includes the rebuilt `dist/` files instead of
only `README.md` / `package.json`

## Risks

- Low risk: this only changes the release build path for public packages
outside the pnpm workspace.
- The helper performs a clean package-local install for each standalone
public package, so release time may increase slightly as more such
packages are added.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 12:16:23 -07:00
Devin Foley 4ef969f084 Add E2B sandbox provider plugin (#4452)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Sandbox environments are part of that execution layer, and the
recent core refactor moved provider-specific behavior to a generic
plugin seam
> - This pull request adds a dedicated `@paperclipai/plugin-e2b` package
so E2B can live entirely outside core host code
> - Because the feature is still unreleased, the plugin should model
third-party packaging directly instead of carrying extra
backward-compatibility complexity in core or the workspace lockfile
> - This branch therefore makes the E2B provider a standalone
publishable package, documents the package-local dev flow, and keeps the
publish manifest/runtime dependency story correct
> - The benefit is that E2B becomes a true plugin reference
implementation that can be installed by package name without reopening
core Paperclip code

## What Changed

- Added `packages/plugins/paperclip-plugin-e2b` as the E2B sandbox
provider plugin package
- Implemented config validation, lease acquire/resume/release/destroy
handlers, workspace realization, and command execution for E2B sandboxes
- Excluded the E2B plugin package from the root workspace so the repo no
longer needs `pnpm-lock.yaml` churn for its third-party dependency graph
- Added package-local development/install support plus a prepack
manifest generator so the published tarball still declares
`@paperclipai/plugin-sdk` and `e2b` runtime dependencies
- Addressed review feedback by fixing sandbox cleanup on acquire
failures, rejecting blank templates, normalizing fractional `timeoutMs`,
and always passing the configured template name to the E2B SDK
- Updated focused Vitest coverage for config normalization, validation,
acquire cleanup, command execution, and lease release behavior
- Updated the Dockerfile deps stage to copy the E2B package manifest so
the policy check stays in sync
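
The config normalization mentioned in the review-feedback item can be sketched as follows. This is a hedged illustration: the field names `template` and `timeoutMs` appear in this PR, but the function name, the default timeout, treating a missing template like a blank one, and rounding down are all assumptions.

```typescript
interface E2bConfigSketch {
  template: string;
  timeoutMs: number;
}

// Hypothetical default; the plugin's real default is not stated in this PR.
const DEFAULT_TIMEOUT_MS_SKETCH = 300_000;

function normalizeE2bConfigSketch(raw: {
  template?: unknown;
  timeoutMs?: unknown;
}): E2bConfigSketch {
  // Assumption: a missing template is treated the same as a blank one.
  const template = typeof raw.template === "string" ? raw.template.trim() : "";
  if (template.length === 0) {
    throw new Error("template must be a non-blank string");
  }

  const candidate = raw.timeoutMs;
  const timeoutMs =
    typeof candidate === "number" && Number.isFinite(candidate) && candidate > 0
      ? Math.max(1, Math.floor(candidate)) // assumption: fractional values round down
      : DEFAULT_TIMEOUT_MS_SKETCH;

  return { template, timeoutMs };
}
```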

## Verification

- `cd packages/plugins/paperclip-plugin-e2b && pnpm install
--ignore-workspace --no-lockfile`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm build`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
test`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
typecheck`
- `cd packages/plugins/paperclip-plugin-e2b && npm pack --dry-run`

## Risks

- The package now relies on a prepack manifest rewrite so the
publish-time dependency list stays correct while the repo-local dev
manifest stays workspace-light
- The current repo snapshot is still unreleased, so the generated
publish manifest points at the repo SDK version until the normal release
flow rewrites versions before publish
- Real-world E2B environments may still expose edge cases around
lifecycle timing or sandbox metadata beyond the mocked unit coverage

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 11:01:11 -07:00
Devin Foley 5bd0f578fd Generalize sandbox provider core for plugin-only providers (#4449)
## Thinking Path

> - Paperclip is a control plane, so optional execution providers should
sit at the plugin edge instead of hardcoding provider-specific behavior
into core shared/server/ui layers.
> - Sandbox environments are already first-class, and the fake provider
proves the built-in path; the remaining gap was that real providers
still leaked provider-specific config and runtime assumptions into core.
> - That coupling showed up in config normalization, secret persistence,
capabilities reporting, lease reconstruction, and the board UI form
fields.
> - As long as core knew about those provider-shaped details, shipping a
provider as a pure third-party plugin meant every new provider would
still require host changes.
> - This pull request generalizes the sandbox provider seam around
schema-driven plugin metadata and generic secret-ref handling.
> - The runtime and UI now consume provider metadata generically, so
core only special-cases the built-in fake provider while third-party
providers can live entirely in plugins.

## What Changed

- Added generic sandbox-provider capability metadata so plugin-backed
providers can expose `configSchema` through shared environment support
and the environments capabilities API.
- Reworked sandbox config normalization/persistence/runtime resolution
to handle schema-declared secret-ref fields generically, storing them as
Paperclip secrets and resolving them for probe/execute/release flows.
- Generalized plugin sandbox runtime handling so provider validation,
reusable-lease matching, lease reconstruction, and plugin worker calls
all operate on provider-agnostic config instead of provider-shaped
branches.
- Replaced hardcoded sandbox provider form fields in Company Settings
with schema-driven rendering and blocked agent environment selection
from the built-in fake provider.
- Added regression coverage for the generic seam across shared support
helpers plus environment config, probe, routes, runtime, and
sandbox-provider runtime tests.
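
Generic handling of schema-declared secret fields might work along these lines. The sketch assumes a JSON-Schema-like `configSchema` in which a field is marked secret with `format: "secret-ref"`. That marker, the type names, and the function name are hypothetical, since this PR does not show how providers actually declare such fields.

```typescript
// Hypothetical, simplified schema shapes.
interface FieldSchemaSketch {
  type: string;
  format?: string; // assumption: "secret-ref" marks a secret-bearing field
}

interface ConfigSchemaSketch {
  properties: Record<string, FieldSchemaSketch>;
}

// Splits provider config into plain values and values to store as secrets,
// using only the schema, with no provider-specific branches.
function splitSecretFieldsSketch(
  schema: ConfigSchemaSketch,
  config: Record<string, unknown>,
): { plain: Record<string, unknown>; secrets: Record<string, string> } {
  const plain: Record<string, unknown> = {};
  const secrets: Record<string, string> = {};

  for (const [key, value] of Object.entries(config)) {
    const field = schema.properties[key];
    if (field?.format === "secret-ref" && typeof value === "string") {
      secrets[key] = value;
    } else {
      plain[key] = value;
    }
  }
  return { plain, secrets };
}
```

A sketch like this also makes the first risk below concrete: a field the schema fails to mark would fall into `plain` and be persisted as ordinary config.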

## Verification

- `pnpm vitest --run packages/shared/src/environment-support.test.ts
server/src/__tests__/environment-config.test.ts
server/src/__tests__/environment-probe.test.ts
server/src/__tests__/environment-routes.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/sandbox-provider-runtime.test.ts`
- `pnpm -r typecheck`

## Risks

- Plugin sandbox providers now depend more heavily on accurate
`configSchema` declarations; incorrect schemas can misclassify
secret-bearing fields or omit required config.
- Reusable lease matching is now metadata-driven for plugin-backed
providers, so providers that fail to persist stable metadata may
reprovision instead of resuming an existing lease.
- The UI form is now fully schema-driven for plugin-backed sandbox
providers; provider manifests without good defaults or descriptions may
produce a rougher operator experience.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, and local code/test
inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 18:03:41 -07:00
Dotta deba60ebb2 Stabilize serialized server route tests (#4448)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The server route suite is a core confidence layer for auth, issue
context, and workspace runtime behavior
> - Some route tests were doing extra module/server isolation work that
made local runs slower and more fragile
> - The stable Vitest runner also needs to pass server-relative exclude
paths to avoid accidentally re-including serialized suites
> - This pull request tightens route test isolation and runner
serialization behavior
> - The benefit is more reliable targeted and stable-route test
execution without product behavior changes

## What Changed

- Updated `run-vitest-stable.mjs` to exclude serialized server tests
using server-relative paths.
- Forced the server Vitest config to use a single worker in addition to
isolated forks.
- Simplified agent permission route tests to create per-request test
servers without shared server lifecycle state.
- Stabilized issue goal context route mocks by using static mocked
services and a sequential suite.
- Re-registered workspace runtime route mocks before cache-busted route
imports.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts
server/src/__tests__/workspace-runtime-routes-authz.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `node --check scripts/run-vitest-stable.mjs`

## Risks

- Low risk. This is test infrastructure only.
- The stable runner path fix changes which tests are excluded from the
non-serialized server batch, matching the server project root that
Vitest applies internally.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:27:00 -05:00
Dotta f68e9caa9a Polish markdown external link wrapping (#4447)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI renders agent comments, PR links, issue links, and
operational markdown throughout issue threads
> - Long GitHub and external links can wrap awkwardly, leaving icons
orphaned from the text they describe
> - Small inbox visual polish also helps repeated board scanning without
changing behavior
> - This pull request glues markdown link icons to adjacent link
characters and removes a redundant inbox list border
> - The benefit is cleaner, more stable markdown and inbox rendering for
day-to-day operator review

## What Changed

- Added an external-link indicator for external markdown links.
- Kept the GitHub icon attached to the first link character so it does
not wrap onto a separate line.
- Kept the external-link icon attached to the final link character so it
does not wrap away from the URL/text.
- Added markdown rendering regressions for GitHub and external link icon
wrapping.
- Removed the extra border around the inbox list card.
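
One way to keep an icon attached to a link character is to split the link text so the first and last characters can each be rendered together with their icon in a non-wrapping inline element (for example with CSS `white-space: nowrap`). The sketch below shows only the splitting step; `splitForIconGlue` and `GluedLinkParts` are hypothetical names, and the real `MarkdownBody` implementation may take a different approach.

```typescript
// Hypothetical sketch: split link text into the character that stays with a
// leading icon, the freely wrapping middle, and the character that stays
// with a trailing icon.
interface GluedLinkParts {
  first: string;
  middle: string;
  last: string;
}

function splitForIconGlue(text: string): GluedLinkParts {
  // Array.from iterates by code point, so surrogate pairs are not split.
  const chars = Array.from(text);
  if (chars.length === 0) return { first: "", middle: "", last: "" };
  if (chars.length === 1) return { first: chars[0] ?? "", middle: "", last: "" };
  return {
    first: chars[0] ?? "",
    middle: chars.slice(1, -1).join(""),
    last: chars[chars.length - 1] ?? "",
  };
}

console.log(splitForIconGlue("github.com/x"));
```

A renderer could then emit the leading icon plus `first` in one no-wrap span, `middle` as ordinary text, and `last` plus the trailing icon in another no-wrap span.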

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Low risk. The markdown change is limited to link child rendering and
preserves existing href/target/rel behavior.
- Visual-only inbox polish.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:26:13 -05:00
Dotta 73fbdf36db Gate stale-run watchdog decisions by board access (#4446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The run ledger surfaces stale-run watchdog evaluation issues and
recovery actions
> - Viewer-level board users should be able to inspect status without
getting controls that the server will reject
> - The UI also needs enough board-access context to know when to hide
those decision actions
> - This pull request exposes board memberships in the current board
access snapshot and gates watchdog action controls for known viewer
contexts
> - The benefit is clearer least-privilege UI behavior around recovery
controls

## What Changed

- Included memberships in `/api/cli-auth/me` so the board UI can
distinguish active viewer memberships from operator/admin access.
- Added the stale-run evaluation issue assignee to output silence
summaries.
- Hid stale-run watchdog decision buttons for known non-owner viewer
contexts.
- Surfaced watchdog decision failures through toast and inline error
text.
- Threaded `companyId` through the issue activity run ledger so access
checks are company-scoped.
- Added IssueRunLedger coverage for non-owner viewers.
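
The gating rule described above can be sketched as a small predicate. Every type and field name here (`BoardAccessSnapshot`, `isLocalImplicit`, `evaluationAssigneeUserId`, and so on) is an assumption made for illustration and is not the actual shape returned by `/api/cli-auth/me`. The sketch also assumes that "owner" means the assignee of the evaluation issue and that unknown contexts keep the controls visible, leaving the server as the final authority.

```typescript
// All names below are hypothetical; the PR description does not show the
// real types.
type BoardRole = "viewer" | "operator" | "admin";

interface BoardMembership {
  companyId: string;
  role: BoardRole;
  status: "active" | "inactive";
}

interface BoardAccessSnapshot {
  userId: string | null;
  isInstanceAdmin: boolean;
  isLocalImplicit: boolean;
  memberships: BoardMembership[];
}

function canShowWatchdogActions(
  access: BoardAccessSnapshot,
  companyId: string,
  evaluationAssigneeUserId: string | null,
): boolean {
  // Local implicit and instance-admin contexts keep the controls.
  if (access.isLocalImplicit || access.isInstanceAdmin) return true;
  const membership = access.memberships.find(
    (m) => m.companyId === companyId && m.status === "active",
  );
  // Only a known active viewer membership hides the controls.
  if (!membership || membership.role !== "viewer") return true;
  // A viewer who owns the evaluation issue still sees them (assumption).
  return access.userId !== null && access.userId === evaluationAssigneeUserId;
}
```

Hiding the buttons is a usability measure only; the server-side authorization mentioned under Risks remains what actually rejects unauthorized decisions.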

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Medium-low risk. This is a UI gating change backed by existing server
authorization.
- Local implicit and instance-admin board contexts continue to show
watchdog decision controls.
- No migrations.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:25:23 -05:00
Dotta 6916e30f8e Cancel stale retries when issue ownership changes (#4445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue execution is guarded by run locks and bounded retry scheduling
> - A failed run can schedule a retry, but the issue may be reassigned
before that retry becomes due
> - The old assignee's scheduled retry should not continue to hold or
reclaim execution for the issue
> - This pull request cancels stale scheduled retries when ownership
changes and cancels live work when an issue is explicitly cancelled
> - The benefit is cleaner issue handoff semantics and fewer stranded or
incorrect execution locks

## What Changed

- Cancel scheduled retry runs when their issue has been reassigned
before the retry is promoted.
- Clear stale issue execution locks and cancel the associated wakeup
request when a stale retry is cancelled.
- Avoid deferring a new assignee behind a previous assignee's scheduled
retry.
- Cancel an active run when an issue status is explicitly changed to
`cancelled`, while leaving `done` transitions alone.
- Added route and heartbeat regressions for reassignment and
cancellation behavior.
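
The two cancellation rules above can be expressed as small predicates. This is a minimal sketch under assumed names (`ScheduledRetry`, `IssueSnapshot`, `isStaleRetry`, `shouldCancelActiveRun`); the real heartbeat service also clears locks, cancels wakeup requests, and logs lifecycle events, none of which is modeled here.

```typescript
// Hypothetical shapes; the actual heartbeat and issue models are not shown
// in this PR description.
interface ScheduledRetry {
  runId: string;
  issueId: string;
  agentId: string;
}

interface IssueSnapshot {
  id: string;
  assigneeAgentId: string | null;
}

// A scheduled retry is stale when its issue is no longer assigned to the
// agent that scheduled the retry.
function isStaleRetry(retry: ScheduledRetry, issue: IssueSnapshot): boolean {
  return issue.id === retry.issueId && issue.assigneeAgentId !== retry.agentId;
}

// An active run is cancelled only on an explicit transition to `cancelled`;
// a transition to `done` leaves the run alone.
function shouldCancelActiveRun(previousStatus: string, nextStatus: string): boolean {
  return nextStatus === "cancelled" && previousStatus !== "cancelled";
}
```

Checking staleness at promotion time, rather than at reassignment time, would match the description's wording that the retry is cancelled "before the retry is promoted".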

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
  - `issue-comment-reopen-routes.test.ts`: 28 passed.
  - `heartbeat-retry-scheduling.test.ts`: skipped by the existing embedded
Postgres host guard (`Postgres init script exited with code null`).
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Medium risk because this changes heartbeat retry lifecycle behavior.
- The cancellation path is scoped to scheduled retries whose issue
assignee no longer matches the retrying agent, and logs a lifecycle
event for auditability.
- No migrations.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:24:13 -05:00
Dotta 0c6961a03e Normalize escaped multiline issue and approval text (#4444)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board and agent APIs accept multiline issue, approval,
interaction, and document text
> - Some callers accidentally send literal escaped newline sequences
like `\n` instead of JSON-decoded line breaks
> - That makes comments, descriptions, documents, and approval notes
render as flattened text instead of readable markdown
> - This pull request centralizes multiline text normalization in shared
validators
> - The benefit is newline-preserving API behavior across issue and
approval workflows without route-specific fixes

## What Changed

- Added a shared `multilineTextSchema` helper that normalizes escaped
`\n`, `\r\n`, and `\r` sequences to real line breaks.
- Applied the helper to issue descriptions, issue update comments, issue
comment bodies, suggested task descriptions, interaction summaries,
issue documents, approval comments, and approval decision notes.
- Added shared validator regressions for issue and approval multiline
inputs.
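
The normalization step can be sketched as a plain string transform. The function name `normalizeEscapedNewlines` is an assumption; the PR's `multilineTextSchema` presumably wraps similar logic inside a shared validator, but its implementation is not shown here.

```typescript
// Minimal sketch: turn literal backslash escape sequences into real line
// breaks. The regex matches the two-character text "\n" or "\r" (and the
// four-character text "\r\n"), not actual control characters.
function normalizeEscapedNewlines(input: string): string {
  // The `\r\n` alternative comes first so it collapses to a single break.
  return input.replace(/\\r\\n|\\n|\\r/g, "\n");
}

const raw = "Line one\\nLine two\\r\\nLine three";
console.log(normalizeEscapedNewlines(raw));
```

Text that already contains real line breaks passes through unchanged, which is why the transform is safe to apply to every multiline field. It also shows the trade-off noted under Risks: an intentional literal backslash-n cannot be preserved.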

## Verification

- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/approval.test.ts
packages/shared/src/validators/issue.test.ts`
- `pnpm --filter @paperclipai/shared typecheck`

## Risks

- Low risk. This only changes text fields that are explicitly multiline
user/operator content.
- If a caller intentionally wanted literal backslash-n text in these
fields, it will now render as a real line break.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 18:02:45 -05:00
Dotta 5a0c1979cf [codex] Add runtime lifecycle recovery and live issue visibility (#4419) 2026-04-24 15:50:32 -05:00
Dotta 9a8d219949 [codex] Stabilize tests and local maintenance assets (#4423)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - A fast-moving control plane needs stable local tests and repeatable
local maintenance tools so contributors can safely split and review work
> - Several route suites needed stronger isolation, Codex manual model
selection needed a faster-mode option, and local browser cleanup missed
Playwright's headless shell binary
> - Storybook static output also needed to be preserved as a generated
review artifact from the working branch
> - This pull request groups the test/local-dev maintenance pieces so
they can be reviewed separately from product runtime changes
> - The benefit is more predictable contributor verification and cleaner
local maintenance without mixing these changes into feature PRs

## What Changed

- Added stable Vitest runner support and serialized route/authz test
isolation.
- Fixed workspace runtime authz route mocks and stabilized
Claude/company-import related assertions.
- Allowed Codex fast mode for manually selected models.
- Broadened the agent browser cleanup script to detect
`chrome-headless-shell` as well as Chrome for Testing.
- Preserved generated Storybook static output from the source branch.

## Verification

- `pnpm exec vitest run
src/__tests__/workspace-runtime-routes-authz.test.ts
src/__tests__/claude-local-execute.test.ts --config vitest.config.ts`
from `server/` passed: 2 files, 19 tests.
- `pnpm exec vitest run src/server/codex-args.test.ts --config
vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file,
3 tests.
- `bash -n scripts/kill-agent-browsers.sh &&
scripts/kill-agent-browsers.sh --dry` passed; dry-run detected
`chrome-headless-shell` processes without killing them.
- `test -f ui/storybook-static/index.html && test -f
ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed.
- `git diff --check public-gh/master..pap-2228-test-local-maintenance --
. ':(exclude)ui/storybook-static'` passed.
- `pnpm exec vitest run
cli/src/__tests__/company-import-export-e2e.test.ts --config
cli/vitest.config.ts` did not complete in the isolated split worktree
because `paperclipai run` exited during build prep with `TS2688: Cannot
find type definition file for 'react'`; this appears to be caused by the
worktree dependency symlink setup, not the code under test.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: the stable Vitest runner changes how route/authz tests
are scheduled.
- Generated `ui/storybook-static` files are large and contain minified
third-party output; `git diff --check` reports whitespace inside those
generated assets, so reviewers may choose to drop or regenerate that
artifact before merge.
- No database migrations.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: the screenshot checklist item does not apply because this PR does
not change source UI behavior; the included Storybook static output is a
generated artifact preserved from the source branch.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 15:11:42 -05:00
Devin Foley 70679a3321 Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.

## What Changed

- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.

## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
  - confirm sandbox provider execution does not inherit host secrets
implicitly

## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
Dotta 641eb44949 [codex] Harden create-agent skill governance (#4422)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring agents is a governance-sensitive workflow because it grants
roles, adapter config, skills, and execution capability
> - The create-agent skill needs explicit templates and review guidance
so hires are auditable and not over-permissioned
> - Skill sync also needs to recognize bundled Paperclip skills
consistently for Codex local agents
> - This pull request expands create-agent role templates, adds a
security-engineer template, and documents capability/secret-handling
review requirements
> - The benefit is safer, more repeatable agent creation with clearer
approval payloads and less permission sprawl

## What Changed

- Expanded `paperclip-create-agent` guidance for template selection,
adjacent-template drafting, and role-specific review bars.
- Added a Security Engineer agent template and collaboration/safety
sections for Coder, QA, and UX Designer templates.
- Hardened draft-review guidance around desired skills, external-system
access, secrets, and confidential advisory handling.
- Updated LLM agent-configuration guidance to point hiring workflows at
the create-agent skill.
- Added tests for bundled skill sync, create-agent skill injection, hire
approval payloads, and LLM route guidance.

## Verification

- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
server/src/__tests__/codex-local-skill-injection.test.ts
server/src/__tests__/codex-local-skill-sync.test.ts
server/src/__tests__/llms-routes.test.ts
server/src/__tests__/paperclip-skill-utils.test.ts --config
server/vitest.config.ts` passed: 5 files, 23 tests.
- `git diff --check public-gh/master..pap-2228-create-agent-governance
-- . ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this primarily changes skills/docs and tests, but
it affects future hiring guidance and approval expectations.
- Reviewers should check whether the new Security Engineer template is
too broad for default company installs.
- No database migrations.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable; this PR changes
skills, docs, and server tests.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:15:28 -05:00
Dotta 77a72e28c2 [codex] Polish issue composer and long document display (#4420)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue comments and documents are the main working surface where
operators and agents collaborate
> - File drops, markdown editing, and long issue descriptions need to
feel predictable because they sit directly in the task execution loop
> - The composer had edge cases around drag targets, attachment
feedback, image drops, and long markdown content crowding the page
> - This pull request polishes the issue composer, hardens markdown
editor regressions, and adds a fold curtain for long issue
descriptions/documents
> - The benefit is a calmer issue detail surface that handles uploads
and long work products without hiding state or breaking layout

## What Changed

- Scoped issue-composer drag/drop behavior so the composer owns file
drops without turning the whole thread into a competing drop target.
- Added clearer attachment upload feedback for non-image files and
image-drop stability coverage.
- Hardened markdown editor and markdown body handling around HTML-like
tag regressions.
- Added `FoldCurtain` and wired it into issue descriptions and issue
documents so long markdown previews can expand/collapse.
- Added Storybook coverage for the fold curtain state.

## Verification

- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/components/MarkdownBody.test.tsx --config ui/vitest.config.ts`
passed: 3 files, 75 tests.
- `git diff --check public-gh/master..pap-2228-editor-composer-polish --
. ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this changes user-facing composer/drop behavior
and long markdown display.
- The fold curtain uses DOM measurement and `ResizeObserver`; reviewers
should check browser behavior for very long descriptions and documents.
- No database migrations.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshots were not newly captured during branch splitting; the
UI states are covered by component tests and a Storybook story.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:12:41 -05:00
Dotta 8f1cd0474f [codex] Improve transient recovery and Codex model refresh (#4383)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapter execution and retry classification decide whether agent work
pauses, retries, or recovers automatically
> - Transient provider failures need to be classified precisely so
Paperclip does not convert retryable upstream conditions into false hard
failures
> - At the same time, operators need an up-to-date model list for
Codex-backed agents and prompts should nudge agents toward targeted
verification instead of repo-wide sweeps
> - This pull request tightens transient recovery classification for
Claude and Codex, updates the agent prompt guidance, and adds Codex
model refresh support end-to-end
> - The benefit is better automatic retry behavior plus fresher
operator-facing model configuration

## What Changed

- added Codex usage-limit retry-window parsing and Claude extra-usage
transient classification
- normalized the heartbeat transient-recovery contract across adapter
executions and heartbeat scheduling
- documented that deferred comment wakes only reopen completed issues
for human/comment-reopen interactions, while system follow-ups leave
closed work closed
- updated adapter-utils prompt guidance to prefer targeted verification
- added Codex model refresh support in the server route, registry,
shared types, and agent config form
- added adapter/server tests covering the new parsing, retry scheduling,
and model-refresh behavior
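
The deferred comment wake rule documented above can be sketched as a single predicate. The names `WakeSource` and `shouldReopenOnDeferredWake`, and the assumption that "completed" corresponds to a `done` status, are illustrative only and are not taken from the heartbeat code.

```typescript
// Hypothetical wake sources; the real interaction kinds may be named
// differently.
type WakeSource = "human_comment" | "comment_reopen" | "system_followup";

// A deferred wake reopens a completed issue only when it came from a human
// or comment-reopen interaction; system follow-ups leave closed work closed.
function shouldReopenOnDeferredWake(issueStatus: string, source: WakeSource): boolean {
  const isCompleted = issueStatus === "done"; // assumption: "completed" means `done`
  return isCompleted && source !== "system_followup";
}
```

Keeping the rule in one predicate makes it harder for a new system-generated wake path to revive completed issues by accident.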

## Verification

- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-claude-local
packages/adapters/claude-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-codex-local
packages/adapters/codex-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/adapter-model-refresh-routes.test.ts
server/src/__tests__/adapter-models.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts`

## Risks

- Moderate behavior risk: retry classification affects whether runs
auto-recover or block, so mistakes here could either suppress needed
retries or over-retry real failures
- Low workflow risk: deferred comment wake reopening is intentionally
scoped to human/comment-reopen interactions so system follow-ups do not
revive completed issues unexpectedly

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:40:40 -05:00
Dotta 4fdbbeced3 [codex] Refine markdown issue reference rendering (#4382)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Task references are a core part of how operators understand issue
relationships across the UI
> - Those references appear both in markdown bodies and in sidebar
relationship panels
> - The rendering had drifted between surfaces, and inline markdown
pills were reading awkwardly inside prose and lists
> - This pull request unifies the underlying issue-reference treatment,
routes issue descriptions through `MarkdownBody`, and switches inline
markdown references to a cleaner text-link presentation
> - The benefit is more consistent issue-reference UX with better
readability in markdown-heavy views

## What Changed

- unified sidebar and markdown issue-reference rendering around the
shared issue-reference components
- routed resting issue descriptions through `MarkdownBody` so
description previews inherit the richer issue-reference treatment
- replaced inline markdown pill chrome with a cleaner inline reference
presentation for prose contexts
- added and updated UI tests for `MarkdownBody` and `InlineEditor`

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx
ui/src/components/InlineEditor.test.tsx`

## Risks

- Moderate UI risk: issue-reference rendering now differs intentionally
between inline markdown and relationship sidebars, so regressions would
show up as styling or hover-preview mismatches

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:39:21 -05:00
Dotta 7ad225a198 [codex] Improve issue thread review flow (#4381)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue detail is where operators coordinate review, approvals, and
follow-up work with active runs
> - That thread UI needs to surface blockers, descendants, review
handoffs, and reply ergonomics clearly enough for humans to guide agent
work
> - Several small gaps in the issue-thread flow were making review and
navigation clunkier than necessary
> - This pull request improves the reply composer, descendant/blocker
presentation, interaction folding, and review-request handoff plumbing
together as one cohesive issue-thread workflow slice
> - The benefit is a cleaner operator review loop without changing the
broader task model

## What Changed

- restored and refined the floating reply composer behavior in the issue
thread
- folded expired confirmation interactions and improved post-submit
thread scrolling behavior
- surfaced descendant issue context and inline blocker/paused-assignee
notices on the issue detail view
- tightened large-board first paint behavior in `IssuesList`
- added loose review-request handoffs through the issue
execution-policy/update path and covered them with tests

## Verification

- `pnpm vitest run ui/src/pages/IssueDetail.test.tsx`
- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-execution-policy.test.ts`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/lib/issue-tree.test.ts
ui/src/api/issues.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-comment-reopen-routes.test.ts -t "coerces
executor handoff patches into workflow-controlled review wakes|wakes the
return assignee with execution_changes_requested"`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issues-service.test.ts`

## Visual Evidence

- UI layout changes are covered by the focused issue-thread component
and issue-detail tests listed above. Browser screenshots were not
attachable from this automated greploop environment, so reviewers should
use the running preview for final visual confirmation.

## Risks

- Moderate UI-flow risk: these changes touch the issue detail experience
in multiple spots, so regressions would most likely show up as
thread-layout quirks or incorrect review-handoff behavior

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or documented the visual verification path
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 08:02:45 -05:00
Dotta 35a9dc37b0 [codex] Speed up company skill detail loading (#4380)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the control plane for distributing
reusable capabilities
> - Board flows that inspect company skill detail should stay responsive
because they are operator-facing control-plane reads
> - The existing detail path was doing broader work than needed for the
specific detail screen
> - This pull request narrows that company-skill detail loading path and
adds a regression test around it
> - The benefit is faster company skill detail reads without changing
the external API contract

## What Changed

- tightened the company-skill detail loading path in
`server/src/services/company-skills.ts`
- added `server/src/__tests__/company-skills-detail.test.ts` to verify
the detail route only pulls the required data

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-detail.test.ts`

## Risks

- Low risk: this only changes the company-skill detail query path, but
any missed assumption in the detail consumer would surface when loading
that screen

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 07:37:13 -05:00
Devin Foley e4995bbb1c Add SSH environment support (#4358)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation

## What Changed

- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.

## Verification

- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
  - enabled the experimental flag
  - created an SSH environment
  - created a Linux Claude agent using that environment
  - confirmed a run executed on the Linux box and synced workspace changes
    back

## Risks

- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.

## Model Used

- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
Dotta f98c348e2b [codex] Add issue subtree pause, cancel, and restore controls (#4332)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - This branch extends the issue control-plane so board operators can
pause, cancel, and later restore whole issue subtrees while keeping
descendant execution and wake behavior coherent.
> - That required new hold state in the database, shared contracts,
server routes/services, and issue detail UI controls so subtree actions
are durable and auditable instead of ad hoc.
> - While this branch was in flight, `master` advanced with new
environment lifecycle work, including a new `0065_environments`
migration.
> - Before opening the PR, this branch had to be rebased onto
`paperclipai/paperclip:master` without losing the existing
subtree-control work or leaving conflicting migration numbering behind.
> - This pull request rebases the subtree pause/cancel/restore feature
cleanly onto current `master`, renumbers the hold migration to
`0066_issue_tree_holds`, and preserves the full branch diff in a single
PR.
> - The benefit is that reviewers get one clean, mergeable PR for the
subtree-control feature instead of stale branch history with migration
conflicts.

## What Changed

- Added durable issue subtree hold data structures, shared
API/types/validators, server routes/services, and UI flows for subtree
pause, cancel, and restore operations.
- Added server and UI coverage for subtree previewing, hold
creation/release, dependency-aware scheduling under holds, and issue
detail subtree controls.
- Rebased the branch onto current `paperclipai/paperclip:master` and
renumbered the branch migration from `0065_issue_tree_holds` to
`0066_issue_tree_holds` so it no longer conflicts with upstream
`0065_environments`.
- Added a small follow-up commit that makes restore requests return `200
OK` explicitly while keeping pause/cancel hold creation at `201
Created`, and updated the route test to match that contract.
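
The status-code contract described in the last bullet can be summarised in a few lines. This is only a sketch under stated assumptions: the `SubtreeAction` type and the `statusForSubtreeAction` function are hypothetical names for illustration and are not taken from the route code.

```typescript
// Hypothetical names; only the 200 vs 201 split comes from the PR description.
type SubtreeAction = "pause" | "cancel" | "restore";

// Pause and cancel create a new hold, so they answer 201 Created.
// Restore releases an existing hold, so it answers 200 OK.
function statusForSubtreeAction(action: SubtreeAction): 200 | 201 {
  return action === "restore" ? 200 : 201;
}
```

A route test in this style would assert `201` for pause and cancel requests and `200` for restore requests.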

## Verification

- `pnpm --filter @paperclipai/db typecheck`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `cd server && pnpm exec vitest run
src/__tests__/issue-tree-control-routes.test.ts
src/__tests__/issue-tree-control-service.test.ts
src/__tests__/issue-tree-control-service-unit.test.ts
src/__tests__/heartbeat-dependency-scheduling.test.ts`
- `cd ui && pnpm exec vitest run src/components/IssueChatThread.test.tsx
src/pages/IssueDetail.test.tsx`

## Risks

- This is a broad cross-layer change touching DB/schema, shared
contracts, server orchestration, and UI; regressions are most likely
around subtree status restoration or wake suppression/resume edge cases.
- The migration was renumbered during PR prep to avoid the new upstream
`0065_environments` conflict. Reviewers should confirm the final
`0066_issue_tree_holds` ordering is the only hold-related migration that
lands.
- The issue-tree restore endpoint now responds with `200` instead of
relying on implicit behavior, which is semantically better for a restore
operation but still changes an API detail that clients or tests could
have assumed.
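
For the migration-numbering risk above, a reviewer could check a list of migration names for colliding numeric prefixes. The sketch below is an assumption-laden illustration, not tooling from this repository: `findDuplicateMigrationNumbers` is a hypothetical helper, and it assumes migration names follow the `NNNN_name` pattern seen in `0065_environments` and `0066_issue_tree_holds`.

```typescript
// Hypothetical helper: groups migration names by numeric prefix and
// returns only the prefixes that are used by more than one migration.
function findDuplicateMigrationNumbers(names: string[]): Map<string, string[]> {
  const byPrefix = new Map<string, string[]>();
  for (const name of names) {
    const match = /^(\d+)_/.exec(name);
    const prefix = match?.[1];
    if (prefix === undefined) continue; // skip names without a numeric prefix
    const group = byPrefix.get(prefix) ?? [];
    group.push(name);
    byPrefix.set(prefix, group);
  }
  const duplicates = new Map<string, string[]>();
  byPrefix.forEach((group, prefix) => {
    if (group.length > 1) duplicates.set(prefix, group);
  });
  return duplicates;
}
```

Before the renumbering, `0065_environments` and `0065_issue_tree_holds` would be reported under the same prefix; after it, the result is empty.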

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent in the Paperclip Codex runtime (GPT-5-class
tool-using coding model; exact deployment ID/context window is not
exposed inside this session).

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-23 14:51:46 -05:00
Russell Dempsey 854fa81757 fix(pi-local): prepend installed skill bin/ dirs to child PATH (#4331)
## Thinking Path

> - Paperclip orchestrates AI agents; each agent runs under an adapter
that spawns a model CLI as a child process.
> - The pi-local adapter (`packages/adapters/pi-local`) spawns `pi`, and
the child inherits the runtime environment the adapter builds, including
`PATH`, which determines what the child's bash tool can execute by name.
> - Paperclip skills ship executable helpers under `<skill>/bin/` (e.g.
`paperclip-get-issue`) and Reviewer/QA-style `AGENTS.md` files invoke
them by name via the agent's bash tool.
> - Pi-local builds its runtime env with `ensurePathInEnv({
...process.env, ...env })` only — it never adds the installed skills'
`bin/` dirs to PATH. The pi CLI's `--skill` arg loads each skill's
SKILL.md but does not augment PATH.
> - Consequence: every bash invocation of a skill helper fails with
`exit 127: command not found`. The agent then spends its heartbeat
guessing (re-reading SKILL.md, trying `find`, inventing command paths)
and either times out or gives up.
> - This PR prepends each injected skill's `bin/` directory to the child
PATH immediately before runtimeEnv is constructed.
> - The benefit: pi_local agents whose AGENTS.md uses any `paperclip-*`
skill helper can actually run those helpers.

## What Changed

- `packages/adapters/pi-local/src/server/execute.ts`: compute
`skillBinDirs` from the already-resolved `piSkillEntries`, dedupe
against the existing PATH, prepend them to whichever of `PATH` / `Path`
the merged env uses, then build `runtimeEnv`. No new helpers, no
adapter-utils changes.
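
The PATH step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `execute.ts`: the function name `prependSkillBinDirs`, its parameters, and the injectable `delimiter` are assumptions made for the example. Only the overall behaviour (derive each skill's `bin/` dir, dedupe against the existing PATH, prepend to whichever of `PATH` / `Path` is present) comes from the description.

```typescript
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical helper; name and signature are illustrative only.
function prependSkillBinDirs(
  env: Env,
  skillDirs: string[],
  delimiter: string = path.delimiter,
): Env {
  // Use whichever key the merged env already carries (Windows may use `Path`).
  const pathKey = "PATH" in env ? "PATH" : "Path" in env ? "Path" : "PATH";
  const existing = (env[pathKey] ?? "").split(delimiter).filter(Boolean);
  const seen = new Set(existing);

  const skillBinDirs: string[] = [];
  for (const dir of skillDirs) {
    const binDir = path.join(dir, "bin");
    if (!seen.has(binDir)) {
      seen.add(binDir); // also dedupes repeated skill dirs
      skillBinDirs.push(binDir);
    }
  }

  // Nothing new to add: return the env unchanged.
  if (skillBinDirs.length === 0) return env;

  return { ...env, [pathKey]: [...skillBinDirs, ...existing].join(delimiter) };
}
```

Prepending rather than appending means a skill helper wins over a same-named binary elsewhere on the PATH. Whether that precedence is the intended one is for the adapter to decide.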

## Verification

Manual repro before the fix:

1. Create a pi_local agent wired to a paperclip skill (e.g.
paperclip-control).
2. Wake the agent on an in_review issue with an AGENTS.md that starts
with `paperclip-get-issue "$PAPERCLIP_TASK_ID"`.
3. Session file: `{ "role": "toolResult", "isError": true, "content": [{
"text": "/bin/bash: paperclip-get-issue: command not found\n\nCommand
exited with code 127" }] }`.

After the fix: same wake; `paperclip-get-issue` resolves and returns the
issue JSON; agent proceeds.

Local commands:

```
pnpm --filter @paperclipai/adapter-pi-local typecheck   # clean
pnpm --filter @paperclipai/adapter-pi-local build       # clean
pnpm --filter @paperclipai/server exec vitest run \
  src/__tests__/pi-local-execute.test.ts \
  src/__tests__/pi-local-adapter-environment.test.ts \
  src/__tests__/pi-local-skill-sync.test.ts
# 5/5 passing
```

No new tests: the existing `pi-local-skill-sync.test.ts` covers skill
symlink injection (upstream of the PATH step), and
`pi-local-execute.test.ts` covers the spawn path; this change only
augments env on the same spawn path.

## Risks

Low. Pure PATH augmentation on the child env. Edge cases:

- Zero skills installed → no PATH change (guarded by
`skillBinDirs.length > 0`).
- Duplicate bin dirs already on PATH → deduped; no pollution on re-runs.
- Windows `Path` casing → falls back correctly when merged env uses
`Path` instead of `PATH`.
- Skill dir without `bin/` subdir → joined path simply won't resolve;
harmless.

No behavioral change for pi_local agents that don't use skill-provided
commands.

## Model Used

- Claude, `claude-opus-4-7` (1M context), extended thinking enabled,
tool use enabled. Walked pi-local/cursor-local/claude-local and
adapter-utils to isolate the gap, wrote the inlined fix, and ran
typecheck/build/test locally.

## Checklist

- [x] Thinking path from project context to this change
- [x] Model used specified
- [x] Checked ROADMAP.md — no overlap
- [x] Tests run locally, passing
- [x] Tests added — new case in
`server/src/__tests__/pi-local-execute.test.ts`; verified it fails when
the fix is reverted
- [ ] UI screenshots — N/A (backend adapter change)
- [x] Docs updated — N/A (internal adapter, no user-facing docs)
- [x] Risks documented
- [x] Will address reviewer comments before merge
2026-04-23 10:15:10 -05:00
Dotta fe14de504c [codex] Document README architecture systems (#4250)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - The public README is the first place many operators and contributors
learn what the product already includes.
> - The existing README explained the product promise but did not give a
compact, concrete tour of the major systems behind it.
> - This made Paperclip easier to underestimate as a wrapper around
agents instead of a full control plane with identity, work, execution,
governance, budgets, plugins, and portability.
> - This pull request adds an under-the-hood README section that names
those systems and shows how adapters connect into the server.
> - Greptile caught consistency gaps between the diagram and prose, so
the final version aligns the system labels and adapter examples across
both surfaces.
> - The benefit is a clearer first-read model of Paperclip's
architecture and shipped capabilities without changing runtime behavior.

## What Changed

- Added a `What's Under the Hood` section to `README.md`.
- Added an ASCII architecture diagram for the Paperclip server and
external agent adapters.
- Added a systems table covering identity, org charts, tasks, heartbeat
execution, workspaces, governance, budgets, routines, plugins,
secrets/storage, activity/events, and company portability.
- Addressed Greptile feedback by aligning diagram labels with table rows
and grouping adapter examples consistently.

## Verification

- `git diff --check public-gh/master...HEAD`
- Attempted `pnpm exec prettier --check README.md`, but this checkout
does not expose a `prettier` binary through `pnpm exec`.
- Greptile review rerun passed after addressing its two comments; review
threads are resolved.
- Remote PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`security/snyk (cryppadotta)`, and `Greptile Review`.
- Not run locally: Vitest/build suites, because this is a README-only
documentation change and the PR's remote `verify` job ran typecheck,
tests, build, and release canary dry run.

## Risks

- Low runtime risk: documentation-only change.
- The main risk is wording drift if the README overstates or
underspecifies evolving product capabilities; the section was aligned
against the current product/spec docs and roadmap.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex / GPT-5 coding agent in a Paperclip heartbeat, with shell
and GitHub CLI tool use. Exact runtime model identifier and context
window were not exposed by the adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-23 09:48:19 -05:00
Michel Tomas 3d15798c22 fix(adapters/routes): apply resolveExternalAdapterRegistration on hot-install (#4324)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The external adapter plugin system (#2218) lets adapters ship as npm
modules loaded via `server/src/adapters/plugin-loader.ts`; since #4296
merged, each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) and have it preserved through the init-time
load via the new `resolveExternalAdapterRegistration` helper
> - #4296 fixed the init-time IIFE path at
`server/src/adapters/registry.ts:363-369` but noted that the hot-install
path at `server/src/routes/adapters.ts:174`
(`registerWithSessionManagement`) still unconditionally overwrites
module-provided `sessionManagement` during `POST /api/adapters/install`
> - Practical impact today: an external adapter installed via the API
needs a Paperclip restart before its declared `sessionManagement` takes
effect — the IIFE runs on next boot and preserves it, but until then the
hot-install overwrite wins
> - This PR closes that parity gap: `registerWithSessionManagement`
delegates to the same `resolveExternalAdapterRegistration` helper
introduced by #4296, unifying both load paths behind one resolver
> - The benefit is consistent behaviour between cold-start and
hot-install: no "install then restart" ritual; declared
`sessionManagement` on an external module is honoured the moment `POST
/api/adapters/install` returns 201

## What Changed

- `server/src/routes/adapters.ts`: `registerWithSessionManagement`
delegates to the exported `resolveExternalAdapterRegistration` helper
(added in #4296). Honours module-provided `sessionManagement` first,
falls back to host registry lookup, and otherwise defaults to
`undefined`. Updated the section comment to document the
parity-with-IIFE intent.
- `server/src/routes/adapters.ts`: dropped the now-unused
`getAdapterSessionManagement` import.
- `server/src/adapters/registry.ts`: updated the JSDoc on
`resolveExternalAdapterRegistration` — previously said "Exported for
unit tests; runtime callers use the IIFE below", now says the helper is
used by both the init-time IIFE and the hot-install path in
`routes/adapters.ts`. Addresses Greptile C1.
- `server/src/__tests__/adapter-routes.test.ts`: new integration test —
installs a mocked external adapter module carrying a non-trivial
`sessionManagement` declaration and asserts
`findServerAdapter(type).sessionManagement` preserves it after `POST
/api/adapters/install` returns 201.
- `server/src/__tests__/adapter-routes.test.ts`: added
`findServerAdapter` to the shared test-scope variable set so the new
test can inspect post-install registry state.

## Verification

Targeted test runs from a clean tree on
`fix/external-session-management-hot-install` (rebased onto current
`upstream/master` now that #4296 has merged):

- `pnpm test server/src/__tests__/adapter-routes.test.ts` — 6 passed
(new test + 5 pre-existing)
- `pnpm test server/src/__tests__/adapter-registry.test.ts` — 15 passed
(ensures the IIFE path from #4296 continues to behave correctly)
- `pnpm -w run test` full workspace suite — 1923 passed / 1 skipped
(unrelated skip)

End-to-end smoke on file:
[`@superbiche/cline-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/cline-paperclip-adapter)
and
[`@superbiche/qwen-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/qwen-paperclip-adapter),
both public on npm, both declare `sessionManagement`. With this PR in
place, the "restart after install" step disappears — the declared
compaction policy is active immediately after the install response.

## Risks

- Low risk. The change replaces an inline mutation with a call to a
helper that already has dedicated unit coverage (#4296 added three tests
for `resolveExternalAdapterRegistration` covering module-provided,
registry-fallback, and undefined paths). Behaviour is a strict superset
of the prior path — externals that did not declare `sessionManagement`
continue to get the hardcoded-registry lookup; externals that did
declare it now have those values preserved instead of overwritten.
- No migration impact. The stored plugin records
(`~/.paperclip/adapter-plugins.json`) are unchanged. Existing
hot-installed adapters behave correctly before and after.
- No behavioural change for builtin adapters; they hit
`registerServerAdapter` directly and never flow through
`registerWithSessionManagement`.

## Model Used

- Provider and model: Claude (Anthropic) via Claude Code
- Model ID: `claude-opus-4-7` (1M context)
- Reasoning mode: standard (no extended thinking on this PR)
- Tool use: yes — file edits, subprocess invocations for
builds/tests/git via the Claude Code harness

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — server-only change)
- [x] I have updated relevant documentation to reflect my changes (the
JSDoc on `resolveExternalAdapterRegistration` and the section comment
above `registerWithSessionManagement` now document the parity-with-IIFE
intent)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 09:45:24 -05:00
Michel Tomas 24232078fd fix(adapters/registry): honor module-provided sessionManagement for external adapters (#4296)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters are how Paperclip hands work off to specific agent
runtimes; since #2218, external adapter packages can ship as npm modules
loaded via `server/src/adapters/plugin-loader.ts`
> - Each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) — but the init-time load at
`registry.ts:363-369` hard-overwrote it with a hardcoded-registry lookup
that has no entries for external types, so modules could not actually
set these fields
> - The hot-install path at `routes/adapters.ts:179` →
`registerServerAdapter` preserves module-provided `sessionManagement`,
so externals worked after `POST /api/adapters/install` — *until the next
server restart*, when the init-time IIFE wiped it back to `undefined`
> - #2218 explicitly deferred this: *"Adapter execution model, heartbeat
protocol, and session management are untouched."* This PR is the natural
follow-up for session management on the plugin-loader path
> - This PR aligns init-time registration with the hot-install path:
honor module-provided `sessionManagement` first, fall back to the
hardcoded registry when absent (so externals overriding a built-in type
still inherit its policy). Extracted as a testable helper with three
unit tests
> - The benefit is external adapters can declare session-resume
capabilities consistently across cold-start and hot-install, without
requiring upstream additions to the hardcoded registry for each new
plugin

## What Changed

- `server/src/adapters/registry.ts`: extracted the merge logic into a
new exported helper `resolveExternalAdapterRegistration()` — honors
module-provided `sessionManagement` first, falls back to
`getAdapterSessionManagement(type)`, else `undefined`. The init-time
IIFE calls the helper instead of inlining an overwrite.
- `server/src/adapters/registry.ts`: updated the section comment (lines
331–340) to reflect the new semantics and cross-reference the
hot-install path's behavior.
- `server/src/__tests__/adapter-registry.test.ts`: new
`describe("resolveExternalAdapterRegistration")` block with three tests
— module-provided value preserved, registry fallback when module omits,
`undefined` when neither provides.
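
The precedence described above (module-provided value, then registry fallback, then `undefined`) can be sketched as below. This is an illustration under assumptions, not the actual helper: the real signature of `resolveExternalAdapterRegistration()` is not shown in this description, so the example uses a differently named function, trimmed-down types, and guessed field types, with the registry lookup passed in as a parameter.

```typescript
// Illustrative shapes only; field types are assumptions and the real
// ServerAdapterModule carries more fields than shown here.
interface SessionManagementSketch {
  supportsSessionResume?: boolean;
  nativeContextManagement?: unknown;
  defaultSessionCompaction?: unknown;
}

interface AdapterModuleSketch {
  type: string;
  sessionManagement?: SessionManagementSketch;
}

// Stand-in for the host's hardcoded registry lookup.
type RegistryLookup = (type: string) => SessionManagementSketch | undefined;

// Hypothetical resolver: keep what the module declares, otherwise fall
// back to the registry, otherwise leave the field undefined.
function resolveRegistrationSketch(
  mod: AdapterModuleSketch,
  lookup: RegistryLookup,
): AdapterModuleSketch {
  return {
    ...mod,
    sessionManagement: mod.sessionManagement ?? lookup(mod.type),
  };
}
```

The three unit tests listed above correspond to the three branches of the `??` expression: declared value kept, registry value used, and `undefined` when both are absent.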

## Verification

Targeted test run from a clean tree on
`fix/external-session-management`:

```
cd server && pnpm exec vitest run src/__tests__/adapter-registry.test.ts
# 1 test file, 15 tests passed, 0 failed (12 pre-existing + 3 new)
```

Full server suite via the independent review pass noted under Model
Used: **1,156 tests passed, 0 failed**.

Typecheck note: `pnpm --filter @paperclipai/server exec tsc --noEmit`
surfaces two errors in `src/services/plugin-host-services.ts:1510`
(`createInteraction` + implicit-any). Verified by `git stash` + re-run
on clean `upstream/master` — they reproduce without this PR's changes.
Pre-existing, out of scope.

## Risks

- **Low behavioral risk.** Strictly additive: externals that do NOT
provide `sessionManagement` continue to receive exactly the same value
as before (registry lookup → `undefined` for pure externals, or the
builtin's entry for externals overriding a built-in type). Only a new
capability is unlocked; no existing behavior changes for existing
adapters.
- **No breaking change.** `ServerAdapterModule.sessionManagement` was
already optional at the type level. Externals that never set it see no
difference on either path.
- **Consistency verified.** Init-time IIFE now matches the post-`POST
/api/adapters/install` behavior — a server restart no longer regresses
the field.

## Note

This is part of a broader effort to close the parity gap between
external and built-in adapters. Once externals reach 1:1 capability
coverage with internals, new-adapter contributions can increasingly be
steered toward the external-plugin path instead of the core product — a
trajectory CONTRIBUTING.md already encourages ("*If the idea fits as an
extension, prefer building it with the plugin system*").

## Model Used

- **Provider**: Anthropic
- **Model**: Claude Opus 4.7
- **Exact model ID**: `claude-opus-4-7` (1M-context variant:
`claude-opus-4-7[1m]`)
- **Context window**: 1,000,000 tokens
- **Harness**: Claude Code (Anthropic's official CLI), orchestrated by
@superbiche as human-in-the-loop. Full file-editing, shell, and `gh`
tool use, plus parallel research subagents for fact-finding against
paperclip internals (plugin-loader contract, sessionCodec reachability,
UI parser surface, Cline CLI JSON schema).
- **Independent local review**: Gemini 3.1 Pro (Google) performed a
separate verification pass on the committed branch — confirmed the
approach & necessity, ran the full workspace build, and executed the
complete server test suite (1,156 tests, all passing). Not used for
authoring; second-opinion pass only.
- **Authoring split**: @superbiche identified the gap (while mapping the
external-adapter surface for a downstream adapter build) and shaped the
plan — categorising the surface into `works / acceptable /
needs-upstream` buckets, directing the surgical-diff approach on a fresh
branch from `upstream/master`, and calling the framing ("alignment bug
between init-time IIFE and hot-install path" rather than "missing
capability"). Opus 4.7 executed the fact-finding, the diff, the tests,
and drafted this PR body — all under direct review.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work (convention-aligned bug fix on the external-adapter
plugin path introduced by #2218)
- [x] I have run tests locally and they pass (15/15 in the touched file;
1,156/1,156 full server suite via the independent Gemini 3.1 Pro review)
- [x] I have added tests where applicable (3 new for the extracted
helper)
- [x] If this change affects the UI, I have included before/after
screenshots (no UI touched)
- [x] I have updated relevant documentation to reflect my changes
(in-file comment reflects new semantics)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 07:39:43 -05:00
Devin Foley 13551b2bac Add local environment lifecycle (#4297)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Every heartbeat run needs a concrete place where the agent's adapter
process executes.
> - Today that execution location is implicitly the local machine, which
makes it hard to track, audit, and manage as a first-class runtime
concern.
> - The first step is to represent the current local execution path
explicitly without changing how users experience agent runs.
> - This pull request adds core Environment and Environment Lease
records, then routes existing local heartbeat execution through a
default `Local` environment.
> - The benefit is that local runs remain behavior-preserving while the
system now has durable environment identity, lease lifecycle tracking,
and activity records for execution placement.

## What Changed

- Added `environments` and `environment_leases` database tables, schema
exports, and migration `0065_environments.sql`.
- Added shared environment constants, TypeScript types, and validators
for environment drivers, statuses, lease policies, lease statuses, and
cleanup states.
- Added `environmentService` for listing, reading, creating, updating,
and ensuring company-scoped environments.
- Added environment lease lifecycle operations for acquire, metadata
update, single-lease release, and run-wide release.
- Updated heartbeat execution to lazily ensure a company-scoped default
`Local` environment before adapter execution.
- Updated heartbeat execution to acquire an ephemeral local environment
lease, write `paperclipEnvironment` into the run context snapshot, and
release active leases during run finalization.
- Added activity log events for environment lease acquisition and
release.
- Added tests for environment service behavior and the local heartbeat
environment lifecycle.
- Added a CI-follow-up heartbeat guard so deferred issue comment wakes
are promoted before automatic missing-comment retries, with focused
batching test coverage.
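
The acquire/release lifecycle listed above can be pictured with a small in-memory sketch. It is not the `environmentService` implementation: the class, the `"active"`/`"released"` status names, and `runWithLocalLease` are assumptions made for illustration, and the real code persists leases in the `environment_leases` table.

```typescript
// In-memory sketch; every name and status value here is hypothetical.
type LeaseStatus = "active" | "released";

interface EnvironmentLease {
  id: number;
  environmentId: string;
  runId: string;
  status: LeaseStatus;
  releasedAt: Date | null;
}

class InMemoryLeaseStore {
  private readonly leases: EnvironmentLease[] = [];

  // Record a new active lease for a run on an environment.
  acquire(environmentId: string, runId: string): EnvironmentLease {
    const lease: EnvironmentLease = {
      id: this.leases.length + 1,
      environmentId,
      runId,
      status: "active",
      releasedAt: null,
    };
    this.leases.push(lease);
    return lease;
  }

  // Run-wide release: mark every still-active lease of the run as released.
  releaseAllForRun(runId: string, now: Date = new Date()): number {
    let released = 0;
    for (const lease of this.leases) {
      if (lease.runId === runId && lease.status === "active") {
        lease.status = "released";
        lease.releasedAt = now;
        released += 1;
      }
    }
    return released;
  }

  activeCount(runId: string): number {
    return this.leases.filter(
      (lease) => lease.runId === runId && lease.status === "active",
    ).length;
  }
}

// Acquire before execution and release during finalization, even on failure.
function runWithLocalLease<T>(
  store: InMemoryLeaseStore,
  environmentId: string,
  runId: string,
  execute: (lease: EnvironmentLease) => T,
): T {
  const lease = store.acquire(environmentId, runId);
  try {
    return execute(lease);
  } finally {
    store.releaseAllForRun(runId);
  }
}
```

Releasing in a `finally` block is one way to keep a failed run from leaving an active lease behind, which is the stale-lease concern raised under Risks.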

## Verification

Local verification run for this branch:

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm exec vitest run server/src/__tests__/environment-service.test.ts
server/src/__tests__/heartbeat-local-environment.test.ts --pool=forks`

Additional reviewer/CI verification:

- Confirm `pnpm-lock.yaml` is not modified.
- Confirm `pnpm test:run` passes in CI.
- Confirm `PAPERCLIP_E2E_SKIP_LLM=true pnpm run test:e2e` passes in CI.
- Confirm a local heartbeat run creates one active `Local` environment
when needed, records one lease for the run, releases the lease when the
run finishes, and includes `paperclipEnvironment` in the run context
snapshot.

Screenshots: not applicable; this PR has no UI changes.

## Risks

- Migration risk: introduces two new tables and a new migration journal
entry. Review should verify company scoping, indexes, foreign keys, and
enum defaults are correct.
- Lifecycle risk: heartbeat finalization now releases environment leases
in addition to existing runtime cleanup. A finalization bug could leave
stale active leases or mark a failed run's lease incorrectly.
- Behavior-preservation risk: local adapter execution should remain
unchanged apart from environment bookkeeping. Review should pay
attention to the heartbeat path around context snapshot updates and
final cleanup ordering.
- Activity volume risk: each heartbeat run now logs lease acquisition
and release events, increasing activity log volume by two records per
run.

## Model Used

OpenAI GPT-5.4 via Codex CLI. Capabilities used: repository inspection,
TypeScript implementation review, local test/build execution, and
PR-description drafting.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 20:07:41 -07:00
Dotta b69b563aa8 [codex] Fix stale issue execution run locks (#4258)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue
checkout and execution ownership are core safety contracts.
> - The affected subsystem is the issue service and route layer that
gates agent writes by `checkoutRunId` and `executionRunId`.
> - PAP-1982 exposed a stale-lock failure mode where a terminal
heartbeat run could leave `executionRunId` pinned after checkout
ownership had moved or been cleared.
> - That stale execution lock could reject legitimate
PATCH/comment/release requests from the rightful assignee after a
harness restart.
> - This pull request centralizes terminal-run cleanup, applies it
before ownership-gated writes, and adds a board-only recovery endpoint
for operator intervention.
> - The benefit is that crashed or terminal runs no longer strand issues
behind stale execution locks, while live execution locks still block
conflicting writes.

## What Changed

- Added `issueService.clearExecutionRunIfTerminal()` to atomically lock
the issue/run rows and clear terminal or missing execution-run locks.
- Reused stale execution-lock cleanup from checkout,
`assertCheckoutOwner()`, and `release()`.
- Allowed the same assigned agent/current run to adopt an unowned
`in_progress` checkout after stale execution-lock cleanup.
- Updated release to clear `executionRunId`, `executionAgentNameKey`,
and `executionLockedAt`.
- Added board-only `POST /api/issues/:id/admin/force-release` with
company access checks, optional `clearAssignee=true`, and
`issue.admin_force_release` audit logging.
- Added embedded Postgres service tests and route integration tests for
stale-lock recovery, release behavior, and admin force-release
authorization/audit behavior.
- Documented the new force-release API in `doc/SPEC-implementation.md`.
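
A rough in-memory sketch of the stale-lock check follows. It assumes hypothetical run status names and a plain `Map` in place of the database; the real `clearExecutionRunIfTerminal()` locks the issue and run rows atomically, which this sketch does not attempt.

```typescript
// Assumed status names; the actual heartbeat run statuses may differ.
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

const TERMINAL_STATUSES = new Set<RunStatus>(["succeeded", "failed", "cancelled"]);

interface IssueLockFields {
  executionRunId: string | null;
  executionAgentNameKey: string | null;
  executionLockedAt: Date | null;
}

// Clears the execution lock when its run is terminal or no longer exists.
// Returns true if the lock was cleared, false if it was absent or still live.
function clearExecutionRunIfTerminal(
  issue: IssueLockFields,
  runs: Map<string, RunStatus>,
): boolean {
  if (issue.executionRunId === null) return false;
  const status = runs.get(issue.executionRunId);
  // A live (non-terminal) run keeps its lock and continues to block writes.
  if (status !== undefined && !TERMINAL_STATUSES.has(status)) return false;
  issue.executionRunId = null;
  issue.executionAgentNameKey = null;
  issue.executionLockedAt = null;
  return true;
}
```

Calling a helper like this before each ownership-gated write matches the reuse from checkout, `assertCheckoutOwner()`, and `release()` listed above.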

## Verification

- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-stale-execution-lock-routes.test.ts` passed.
- `pnpm vitest run
server/src/__tests__/issue-stale-execution-lock-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-telemetry-routes.test.ts` passed.
- `pnpm -r typecheck` passed.
- `pnpm build` passed.
- `git diff --check` passed.
- `pnpm lint` could not run because this repo has no `lint` command.
- Full `pnpm test:run` completed with 4 failures in existing route
suites: `approval-routes-idempotency.test.ts` (2),
`issue-comment-reopen-routes.test.ts` (1), and
`issue-telemetry-routes.test.ts` (1). Those same files pass when run in
isolation and when run together with the new stale-lock route test, so
this appears to be a whole-suite ordering/mock-isolation issue outside
this patch path.

## Risks

- Medium: this changes ownership-gated write behavior. The new adoption
path is limited to the current run, the current assignee, `in_progress`
issues, and rows with no checkout owner after terminal-lock cleanup.
- Low: the admin force-release endpoint is board-only and
company-scoped, but misuse can intentionally clear a live lock. It
writes an audit event with prior lock IDs.
- No schema or migration changes.

## Model Used

- OpenAI Codex, GPT-5 coding agent (`gpt-5`), agentic coding with
terminal/tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 10:43:38 -05:00
Dotta a957394420 [codex] Add structured issue-thread interactions (#4244)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators supervise that work through issues, comments, approvals,
and the board UI.
> - Some agent proposals need structured board/user decisions, not
hidden markdown conventions or heavyweight governed approvals.
> - Issue-thread interactions already provide a natural thread-native
surface for proposed tasks and questions.
> - This pull request extends that surface with request confirmations,
richer interaction cards, and agent/plugin/MCP helpers.
> - The benefit is that plan approvals and yes/no decisions become
explicit, auditable, and resumable without losing the single-issue
workflow.

## What Changed

- Added persisted issue-thread interactions for suggested tasks,
structured questions, and request confirmations.
- Added board UI cards for interaction review, selection, question
answers, and accept/reject confirmation flows.
- Added MCP and plugin SDK helpers for creating interaction cards from
agents/plugins.
- Updated agent wake instructions, onboarding assets, Paperclip skill
docs, and public docs to prefer structured confirmations for
issue-scoped decisions.
- Rebased the branch onto `public-gh/master` and renumbered branch
migrations to `0063` and `0064`; the idempotency migration uses `ADD
COLUMN IF NOT EXISTS` for old branch users.

## Verification

- `git diff --check public-gh/master..HEAD`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/mcp-server/src/tools.test.ts
packages/shared/src/issue-thread-interactions.test.ts
ui/src/lib/issue-thread-interactions.test.ts
ui/src/lib/issue-chat-messages.test.ts
ui/src/components/IssueThreadInteractionCard.test.tsx
ui/src/components/IssueChatThread.test.tsx
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/services/issue-thread-interactions.test.ts` -> 9 files / 79
tests passed
- `pnpm -r typecheck` -> passed, including `packages/db` migration
numbering check

## Risks

- Medium: this adds a new issue-thread interaction model across
db/shared/server/ui/plugin surfaces.
- Migration risk is reduced by placing this branch after current master
migrations (`0063`, `0064`) and making the idempotency-column addition
safe to re-run for users who applied the old branch numbering.
- UI interaction behavior is covered by component tests, but this PR
does not include browser screenshots.

## Model Used

- OpenAI Codex, GPT-5-class coding agent runtime. Exact model ID and
context window are not exposed in this Paperclip run; tool use and local
shell/code execution were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 20:15:11 -05:00
Dotta 014aa0eb2d [codex] Clear stale queued comment targets (#4234)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators interact with agent work through issue threads and queued
comments.
> - When the selected comment target becomes stale, the composer can
keep pointing at an invalid target after thread state changes.
> - That makes follow-up comments easier to misroute and harder to
reason about.
> - This pull request clears stale queued comment targets and covers the
behavior with tests.
> - The benefit is more predictable issue-thread commenting during live
agent work.

## What Changed

- Clears queued comment targets when they no longer match the current
issue thread state.
- Adjusts issue detail comment-target handling to avoid stale target
reuse.
- Adds regression tests for optimistic issue comment target behavior.

## Verification

- `pnpm exec vitest run ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low risk; scoped to comment-target state handling in the issue UI.
- No migrations.

> Checked `ROADMAP.md`; this is a focused UI reliability fix, not a new
roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 16:50:26 -05:00
Dotta bcbbb41a4b [codex] Harden heartbeat runtime cleanup (#4233)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime is the control-plane path that turns issue
assignments into agent runs and recovers after process exits.
> - Several edge cases could leave high-volume reads unbounded, stale
runtime services visible, blocked dependency wakes too eager, or
terminal adapter processes still around after output finished.
> - These problems make operator views noisy and make long-running agent
work less predictable.
> - This pull request tightens the runtime/read paths and adds focused
regression coverage.
> - The benefit is safer heartbeat execution and cleaner runtime state
without changing the public task model.

## What Changed

- Bounded high-volume issue/log reads in runtime code paths.
- Hardened heartbeat handling for blocked dependency wakes and terminal
run cleanup.
- Added adapter process cleanup coverage for terminal output cases.
- Added workspace runtime control tests for stale command matching and
stopped services.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
ui/src/components/WorkspaceRuntimeControls.test.tsx`

## Risks

- Medium risk because heartbeat cleanup and runtime filtering affect
active agent execution paths.
- No migrations.

> Checked `ROADMAP.md`; this is runtime hardening and bug-fix work, not
a new roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 16:48:47 -05:00
Dotta 73ef40e7be [codex] Sandbox dynamic adapter UI parsers (#4225)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies.
> - External adapters can provide UI parser code that the board loads
dynamically for run transcript rendering.
> - Running adapter-provided parser code directly in the board page
gives that parser access to same-origin browser state.
> - This PR narrows that surface by evaluating dynamically loaded
external adapter UI parser code in a dedicated browser Web Worker with a
constrained postMessage protocol.
> - The worker here is a frontend isolation boundary for adapter UI
parser JavaScript; it is not Paperclip's server plugin-worker system and
it is not a server-side job runner.

## What Changed

- Runs dynamically loaded external adapter UI parsers inside a dedicated
Web Worker instead of importing/evaluating them directly in the board
page.
- Adds a narrow postMessage protocol for parser initialization and line
parsing.
- Caches completed async parse results and notifies the adapter registry
so transcript recomputation can synchronously drain the final parsed
line.
- Disables common worker network, persistence, child worker, Blob/object
URL, and WebRTC escape APIs inside the parser worker bootstrap.
- Handles worker error messages after initialization and drains pending
callbacks on worker termination or mid-session worker error.
- Adds focused regression coverage for the parser worker lockdown and
unused protocol removal.
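
A narrow message protocol of this kind might look like the sketch below. The message names (`init`, `parseLine`, `ready`, `parsed`, `error`) and the injected `loadParser` function are assumptions for illustration, not the protocol in `sandboxed-parser-worker.ts`. The handler is a plain function so it can be exercised without a real `Worker`.

```typescript
// Hypothetical message shapes for a worker-side parser host.
type ParserResponse =
  | { kind: "ready" }
  | { kind: "parsed"; id: number; entries: unknown[] }
  | { kind: "error"; id?: number; message: string };

type LineParser = (line: string) => unknown[];

// loadParser is injected so this sketch never evaluates code itself.
function createParserHost(loadParser: (source: string) => LineParser) {
  let parser: LineParser | null = null;

  return function handle(message: unknown): ParserResponse {
    if (typeof message !== "object" || message === null) {
      return { kind: "error", message: "malformed message" };
    }
    const { kind, id, line, source } = message as Record<string, unknown>;

    if (kind === "init" && typeof source === "string") {
      try {
        parser = loadParser(source);
        return { kind: "ready" };
      } catch (err) {
        return { kind: "error", message: String(err) };
      }
    }

    if (kind === "parseLine" && typeof id === "number" && typeof line === "string") {
      const active = parser;
      if (active === null) {
        return { kind: "error", id, message: "parser not initialized" };
      }
      try {
        return { kind: "parsed", id, entries: active(line) };
      } catch (err) {
        return { kind: "error", id, message: String(err) };
      }
    }

    // Anything outside the two known request kinds is rejected.
    return { kind: "error", message: "unsupported message kind" };
  };
}
```

Inside a worker, such a handler would typically be wired to `self.onmessage` with each response sent back through `postMessage`. Echoing the request `id` lets the page match asynchronous results to pending callbacks.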

## Verification

- `pnpm exec vitest run --config ui/vitest.config.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm exec tsc --noEmit --target es2021 --moduleResolution bundler
--module esnext --jsx react-jsx --lib dom,es2021 --skipLibCheck
ui/src/adapters/dynamic-loader.ts
ui/src/adapters/sandboxed-parser-worker.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm --filter @paperclipai/ui typecheck` was attempted; it reached
existing unrelated failures in HeartbeatRun test/storybook fixtures and
missing Storybook type resolution, with no adapter-module errors
surfaced.
- PR #4225 checks on current head `34c9da00`: `policy`, `e2e`, `verify`,
`security/snyk`, and `Greptile Review` are all `SUCCESS`.
- Greptile Review on current head `34c9da00` reached 5/5.

## Risks

- Medium risk: parser execution is now asynchronous through a worker
while the existing parser interface is synchronous, so reviewers should
watch transcript updates when external adapters are in use.
- Some adapter parser bundles may rely on direct ESM `export` syntax or
browser APIs that are no longer available inside the worker lockdown.
- The worker lockdown is a hardening layer around external parser code,
not a complete browser security sandbox for arbitrary untrusted
applications.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 13:42:44 -05:00
Dotta a26e1288b6 [codex] Polish issue board workflows (#4224)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Human operators supervise that work through issue lists, issue
detail, comments, inbox groups, markdown references, and
profile/activity surfaces
> - The branch had many small UI fixes that improve the operator loop
but do not need to ship with backend runtime migrations
> - These changes belong together as board workflow polish because they
affect scanning, navigation, issue context, comment state, and markdown
clarity
> - This pull request groups the UI-only slice so it can merge
independently from runtime/backend changes
> - The benefit is a clearer board experience with better issue context,
steadier optimistic updates, and more predictable keyboard navigation

## What Changed

- Improves issue properties, sub-issue actions, blocker chips, and issue
list/detail refresh behavior.
- Adds blocker context above the issue composer and stabilizes
queued/interrupted comment UI state.
- Improves markdown issue/GitHub link rendering and opens external
markdown links in a new tab.
- Adds inbox group keyboard navigation and fold/unfold support.
- Polishes activity/avatar/profile/settings/workspace presentation
details.

## Verification

- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownBody.test.tsx ui/src/lib/inbox.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low to medium risk: changes are UI-focused but cover high-traffic
issue and inbox surfaces.
- This branch intentionally does not include the backend runtime changes
from the companion PR; where UI calls newer API filters, unsupported
servers should continue to fail visibly through existing API error
handling.
- Visual screenshots were not captured in this heartbeat; targeted
component/helper tests cover the changed behavior.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:25:34 -05:00
Dotta 09d0678840 [codex] Harden heartbeat scheduling and runtime controls (#4223)
## Thinking Path

> - Paperclip orchestrates AI agents through issue checkout, heartbeat
runs, routines, and auditable control-plane state
> - The runtime path has to recover from lost local processes, transient
adapter failures, blocked dependencies, and routine coalescing without
stranding work
> - The existing branch carried several reliability fixes across
heartbeat scheduling, issue runtime controls, routine dispatch, and
operator-facing run state
> - These changes belong together because they share backend contracts,
migrations, and runtime status semantics
> - This pull request groups the control-plane/runtime slice so it can
merge independently from board UI polish and adapter sandbox work
> - The benefit is safer heartbeat recovery, clearer runtime controls,
and more predictable recurring execution behavior

## What Changed

- Adds bounded heartbeat retry scheduling, scheduled retry state, and
Codex transient failure recovery handling.
- Tightens heartbeat process recovery, blocker wake behavior, issue
comment wake handling, routine dispatch coalescing, and
activity/dashboard bounds.
- Adds runtime-control MCP tools and Paperclip skill docs for issue
workspace runtime management.
- Adds migrations `0061_lively_thor_girl.sql` and
`0062_routine_run_dispatch_fingerprint.sql`.
- Surfaces retry state in run ledger/agent UI and keeps related shared
types synchronized.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/routines-service.test.ts`
- `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server`

## Risks

- Medium risk: this touches heartbeat recovery and routine dispatch,
which are central execution paths.
- Migration order matters if split branches land out of order: merge
this PR before branches that assume the new runtime/routine fields.
- Runtime retry behavior should be watched in CI and in local operator
smoke tests because it changes how transient failures are resumed.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:24:11 -05:00
Dotta ab9051b595 Add first-class issue references (#4214)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators and agents coordinate through company-scoped issues,
comments, documents, and task relationships.
> - Issue text can mention other tickets, but those references were
previously plain markdown/text without durable relationship data.
> - That made it harder to understand related work, surface backlinks,
and keep cross-ticket context visible in the board.
> - This pull request adds first-class issue reference extraction,
storage, API responses, and UI surfaces.
> - The benefit is that issue references become queryable, navigable,
and visible without relying on ad hoc text scanning.

## What Changed

- Added shared issue-reference parsing utilities and exported
reference-related types/constants.
- Added an `issue_reference_mentions` table, idempotent migration DDL,
schema exports, and database documentation.
- Added server-side issue reference services, route integration,
activity summaries, and a backfill command for existing issue content.
- Added UI reference pills, related-work panels, markdown/editor mention
handling, and issue detail/property rendering updates.
- Added focused shared, server, and UI tests for parsing, persistence,
display, and related-work behavior.
- Rebased `PAP-735-first-class-task-references` cleanly onto
`public-gh/master`; no `pnpm-lock.yaml` changes are included.
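
Reference extraction of this kind can be sketched as below. It assumes identifiers shaped like `PAP-735` (an uppercase prefix, a hyphen, and a number), and `extractIssueReferences` is a hypothetical name; the shared parsing utilities added by this PR may recognise other forms such as links or mentions.

```typescript
// Assumed identifier shape: uppercase prefix, hyphen, digits (e.g. PAP-735).
const ISSUE_REFERENCE_SOURCE = "\\b([A-Z][A-Z0-9]*-\\d+)\\b";

// Returns unique issue identifiers in order of first appearance.
function extractIssueReferences(text: string): string[] {
  // A fresh regex per call avoids sharing lastIndex state between calls.
  const pattern = new RegExp(ISSUE_REFERENCE_SOURCE, "g");
  const seen = new Set<string>();
  const ordered: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const identifier = match[1];
    if (!seen.has(identifier)) {
      seen.add(identifier);
      ordered.push(identifier);
    }
  }
  return ordered;
}
```

Deduplicating at extraction time would let a mentions table hold one row per source and target pair, which keeps a backfill over existing issue content safe to repeat.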

## Verification

- `pnpm -r typecheck`
- `pnpm test:run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-references-service.test.ts
ui/src/components/IssueRelatedWorkPanel.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/MarkdownBody.test.tsx`

## Risks

- Medium risk because this adds a new issue-reference persistence path
that touches shared parsing, database schema, server routes, and UI
rendering.
- Migration risk is mitigated by `CREATE TABLE IF NOT EXISTS`, guarded
foreign-key creation, and `CREATE INDEX IF NOT EXISTS` statements so
users who have applied an older local version of the numbered migration
can re-run safely.
- UI risk is limited by focused component coverage, but reviewers should
still manually inspect issue detail pages containing ticket references
before merge.
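
The re-runnable migration pattern described above can be sketched as a list of PostgreSQL statements. Only the table name `issue_reference_mentions` comes from this PR; the column names, index name, constraint name, and the referenced `issues` table are hypothetical and stand in for the real schema.

```typescript
// Sketch of idempotent DDL. Column, index, and constraint names are illustrative assumptions.
const idempotentMigrationStatements: string[] = [
  // Safe to re-run: does nothing if the table already exists.
  `CREATE TABLE IF NOT EXISTS issue_reference_mentions (
     id uuid PRIMARY KEY,
     source_issue_id uuid NOT NULL,
     target_issue_id uuid NOT NULL
   );`,

  // PostgreSQL has no ADD CONSTRAINT IF NOT EXISTS, so the foreign key
  // is guarded by a catalog lookup inside a DO block.
  `DO $$
   BEGIN
     IF NOT EXISTS (
       SELECT 1 FROM pg_constraint
       WHERE conname = 'issue_reference_mentions_source_fk'
     ) THEN
       ALTER TABLE issue_reference_mentions
         ADD CONSTRAINT issue_reference_mentions_source_fk
         FOREIGN KEY (source_issue_id) REFERENCES issues (id);
     END IF;
   END $$;`,

  // Safe to re-run: does nothing if the index already exists.
  `CREATE INDEX IF NOT EXISTS issue_reference_mentions_target_idx
     ON issue_reference_mentions (target_issue_id);`,
];
```

Each statement is a no-op when its object already exists, which is what lets a database that applied an older local version of the migration run it again.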

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-using shell workflow with
repository inspection, git rebase/push, typecheck, and focused Vitest
verification.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 10:02:52 -05:00
Dotta 1954eb3048 [codex] Detect issue graph liveness deadlocks (#4209)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat harness is responsible for waking agents, reconciling
issue state, and keeping execution moving.
> - Some dependency graphs can deadlock when a blocked issue
depends on an unassigned, cancelled, or otherwise uninvokable issue.
> - Review and approval stages can also stall when the recorded
participant can no longer be resolved.
> - This pull request adds issue graph liveness classification plus
heartbeat reconciliation that creates durable escalation work for those
cases.
> - The benefit is that harness-level deadlocks become visible,
assigned, logged, and recoverable instead of silently leaving task
sequences blocked.

## What Changed

- Added an issue graph liveness classifier for blocked dependency and
invalid review participant states.
- Added heartbeat reconciliation that creates one stable escalation
issue per liveness incident, links it as a blocker, comments on the
affected issue, wakes the recommended owner, and logs activity.
- Wired startup and periodic server reconciliation for issue graph
liveness incidents.
- Added focused tests for classifier behavior, heartbeat escalation
creation/deduplication, and queued dependency wake promotion.
- Fixed queued issue wakes so a coalesced wake re-runs queue selection,
allowing dependency-unblocked work to start immediately.
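
The classification and deduplication steps above can be sketched roughly as follows. This is an illustrative approximation, not the classifier from this PR: the `IssueNode` shape, the status values, the reason names, and the incident key format are all assumptions, and review-participant checks are left out.

```typescript
// Hypothetical issue shape; the real schema has more fields and statuses.
type IssueStatus = "todo" | "in_progress" | "blocked" | "done" | "cancelled";

interface IssueNode {
  id: string;
  status: IssueStatus;
  assigneeId: string | null;
  blockedByIds: string[];
}

type IncidentReason = "blocker_missing" | "blocker_cancelled" | "blocker_unassigned";

interface LivenessIncident {
  key: string; // stable across runs so repeated reconciliation does not re-escalate
  issueId: string;
  blockerId: string;
  reason: IncidentReason;
}

// Returns why a blocker can never make progress, or null if it still can.
function classifyBlocker(blocker: IssueNode | undefined): IncidentReason | null {
  if (!blocker) return "blocker_missing";
  if (blocker.status === "cancelled") return "blocker_cancelled";
  if (blocker.status !== "done" && blocker.assigneeId === null) return "blocker_unassigned";
  return null;
}

// Finds new incidents for blocked issues, skipping keys that were already escalated.
function findLivenessIncidents(
  issues: IssueNode[],
  alreadyEscalated: ReadonlySet<string> = new Set<string>(),
): LivenessIncident[] {
  const byId = new Map<string, IssueNode>();
  for (const issue of issues) byId.set(issue.id, issue);

  const incidents: LivenessIncident[] = [];
  for (const issue of issues) {
    if (issue.status !== "blocked") continue;
    for (const blockerId of issue.blockedByIds) {
      const reason = classifyBlocker(byId.get(blockerId));
      if (reason === null) continue;
      const key = `liveness:${issue.id}:${blockerId}:${reason}`;
      if (alreadyEscalated.has(key)) continue; // one escalation per incident
      incidents.push({ key, issueId: issue.id, blockerId, reason });
    }
  }
  return incidents;
}
```

Deriving the key only from the affected issue, the blocker, and the reason means a periodic reconciler can run repeatedly and still produce a single escalation per incident.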

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issue-liveness.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts`
- Passed locally: `server/src/__tests__/issue-liveness.test.ts` (5
tests)
- Skipped locally: embedded Postgres suites because optional package
`@embedded-postgres/darwin-x64` is not installed on this host
- `pnpm --filter @paperclipai/server typecheck`
- `git diff --check`
- Greptile review loop: ran 3 times as requested; the final
Greptile-reviewed head `0a864eab` had 0 comments and all Greptile
threads were resolved. Later commits are CI/test-stability fixes after
the requested max Greptile pass count.
- GitHub PR checks on head `87493ed4`: `policy`, `verify`, `e2e`, and
`security/snyk (cryppadotta)` all passed.

## Risks

- Moderate operational risk: the reconciler creates escalation issues
automatically, so incorrect classification could create noise. Stable
incident keys and deduplication limit repeated escalation.
- Low schema risk: this uses existing issue, relation, comment, wake,
and activity log tables with no migration.
- No UI screenshots included because this change is server-side harness
behavior only.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent. Exact runtime model ID and
context window were not exposed in this session. Used tool execution for
git, tests, typecheck, Greptile review handling, and GitHub CLI
operations.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: server-side harness change only)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 09:11:12 -05:00
Robin van Duiven 8d0c3d2fe6 fix(hermes): inject agent JWT into Hermes adapter env to fix identity attribution (#3608)
## Thinking Path

> - Paperclip orchestrates AI agents and records their actions through
auditable issue comments and API writes.
> - The local adapter registry is responsible for adapting each agent
runtime to Paperclip's server-side execution context.
> - The Hermes local adapter delegated directly to
`hermes-paperclip-adapter`, whose current execution context type
predates the server `authToken` field.
> - Without explicitly passing the run-scoped agent token and run id
into Hermes, Hermes could inherit a server or board-user
`PAPERCLIP_API_KEY` and lack a usable `PAPERCLIP_RUN_ID` for mutating
API calls.
> - That made Paperclip writes from Hermes agents risk appearing under
the wrong identity or without the correct run-scoped attribution.
> - This pull request wraps the Hermes execution call so Hermes receives
the agent run JWT as `PAPERCLIP_API_KEY` and the current execution id as
`PAPERCLIP_RUN_ID` while preserving explicit adapter configuration where
appropriate.
> - Follow-up review fixes preserve Hermes' built-in prompt when no
custom prompt template exists and document the intentional type cast.
> - The benefit is reliable agent attribution for the covered local
Hermes path without clobbering Hermes' default heartbeat/task
instructions.

## What Changed

- Wrapped `hermesLocalAdapter.execute` so `ctx.authToken` is injected
into `adapterConfig.env.PAPERCLIP_API_KEY` when no explicit Paperclip
API key is already configured.
- Injected `ctx.runId` into `adapterConfig.env.PAPERCLIP_RUN_ID` so the
auth guard's `X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID` instruction
resolves to the current run id.
- Added a Paperclip API auth guard to existing custom Hermes
`promptTemplate` values without creating a replacement prompt when no
custom template exists.
- Documented the intentional `as unknown as` cast needed until
`hermes-paperclip-adapter` ships an `AdapterExecutionContext` type that
includes `authToken`.
- Added registry tests for JWT injection, run-id injection, explicit key
preservation, default prompt preservation, and the no-`authToken`
early-return path.
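
The env injection rules above can be sketched as a pure function. This is a simplified illustration, not the wrapper from this PR: the `ExecutionContextSketch` type and the `withAgentIdentity` name are hypothetical, and placing `adapterConfig` on the context object is an assumption. Prompt-template handling is omitted.

```typescript
// Hypothetical, trimmed-down context; the real AdapterExecutionContext has more fields.
interface ExecutionContextSketch {
  runId: string;
  authToken?: string;
  adapterConfig: { env?: Record<string, string> };
}

// Returns a context whose env carries the run-scoped agent identity.
function withAgentIdentity(ctx: ExecutionContextSketch): ExecutionContextSketch {
  // No run-scoped token: return early and leave the configuration untouched.
  if (!ctx.authToken) return ctx;

  // Copy rather than mutate so the caller's config object is not changed.
  const env: Record<string, string> = { ...(ctx.adapterConfig.env ?? {}) };

  // An explicitly configured API key wins over the injected JWT.
  if (!env["PAPERCLIP_API_KEY"]) {
    env["PAPERCLIP_API_KEY"] = ctx.authToken;
  }

  // The run id always reflects the current execution, never a stale value.
  env["PAPERCLIP_RUN_ID"] = ctx.runId;

  return { ...ctx, adapterConfig: { ...ctx.adapterConfig, env } };
}
```

A wrapper around `execute` would call something like this before delegating to the underlying adapter, so the delegated call sees the agent JWT and the current run id.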

## Verification

- [x] `pnpm --filter "./server" exec vitest run adapter-registry` - 8
tests passed.
- [x] `pnpm --filter "./server" typecheck` - passed.
- [x] Trigger a Hermes agent heartbeat and verify Paperclip writes
appear under the agent identity rather than a shared board-user
identity, with the correct run id on mutating requests.

## Risks

- Low migration risk: this changes only the Hermes local adapter wrapper
and tests.
- Existing explicit `adapterConfig.env.PAPERCLIP_API_KEY` values are
preserved to avoid breaking intentionally configured agents.
- `PAPERCLIP_RUN_ID` is set from `ctx.runId` for each execution so
mutating API calls use the current run id instead of a stale or literal
placeholder value.
- Prompt behavior is intentionally conservative: the auth guard is only
prepended when a custom prompt template already exists, so Hermes'
built-in default prompt remains intact for unconfigured agents.
- Remaining operational risk: the identity and run-id behavior should
still be verified with a live Hermes heartbeat before relying on it in
production.

## Model Used

- OpenAI Codex, GPT-5 family coding agent, tool use enabled for local
shell, GitHub CLI, and test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: backend-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no product docs changed; PR description updated)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Dotta <bippadotta@protonmail.com>
2026-04-21 07:18:11 -05:00