2366 Commits

Author SHA1 Message Date
Dotta 42a299fb9d [codex] Bound productivity review recovery loops (#4948)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat/productivity review subsystem detects when assigned
work is likely stuck or churning.
> - Productivity reviews are useful, but repeated reconciliation can
create noisy refresh comments or repeated review issues around the same
source issue.
> - That makes manager follow-up harder because the signal can get
buried under duplicate review activity.
> - This pull request bounds productivity review refreshes and creation
loops while preserving the existing escalation path.
> - The benefit is a quieter recovery loop that still surfaces stuck or
high-churn work for manager attention.

## What Changed

- Added refresh throttling for open productivity review issues,
including a one-hour default interval and a maximum of three refresh
comments per open review.
- Added a rolling 24-hour creation cap so completed/closed reviews
cannot immediately recreate review issues indefinitely for the same
source issue.
- Excluded cancelled productivity reviews from the creation cap so
manager cancellations do not silently suppress future legitimate
reviews.
- Preserved productivity review timestamps in deterministic test paths
and added targeted coverage for immediate refresh suppression, refresh
caps, creation caps, and cancelled-review exclusion.
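
The throttling rules above can be sketched as two pure predicates. This is a minimal illustration, not the service's actual code: the `ProductivityReview` shape, the function names, and the creation cap value (`maxPerWindow = 1`) are assumptions. Only the one-hour interval, the three-comment limit, the 24-hour window, and the cancelled-review exclusion come from the description.

```typescript
// Hypothetical review shape; the real service types are not shown in this PR.
type ReviewStatus = "open" | "done" | "closed" | "cancelled";

interface ProductivityReview {
  status: ReviewStatus;
  createdAt: Date;
  lastRefreshedAt: Date | null;
  refreshCommentCount: number;
}

const REFRESH_INTERVAL_MS = 60 * 60 * 1000; // one-hour default interval
const MAX_REFRESH_COMMENTS = 3; // maximum refresh comments per open review
const CREATION_WINDOW_MS = 24 * 60 * 60 * 1000; // rolling 24-hour window

// An open review is refreshed only when the interval has elapsed and the
// per-review comment cap has not been reached.
function shouldRefresh(review: ProductivityReview, now: Date): boolean {
  if (review.status !== "open") return false;
  if (review.refreshCommentCount >= MAX_REFRESH_COMMENTS) return false;
  const last = review.lastRefreshedAt ?? review.createdAt;
  return now.getTime() - last.getTime() >= REFRESH_INTERVAL_MS;
}

// A new review may be created only if the non-cancelled reviews created for
// the same source issue inside the window stay below the cap.
// maxPerWindow = 1 is an assumed value; the description does not state it.
function canCreateReview(
  history: ProductivityReview[],
  now: Date,
  maxPerWindow = 1,
): boolean {
  const windowStart = now.getTime() - CREATION_WINDOW_MS;
  const counted = history.filter(
    (r) => r.status !== "cancelled" && r.createdAt.getTime() >= windowStart,
  );
  return counted.length < maxPerWindow;
}
```

Keeping both checks as pure functions of `now` is one way to keep timestamps deterministic in tests, which matches the last bullet above.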

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- `pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- Greptile Review: 5/5 on commit
`bcf25832d0ffae25890b2ee7eed112d1c2d114fe` with review threads resolved.
- GitHub PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`Greptile Review`, and `security/snyk (cryppadotta)`.
- Verified the branch is rebased onto `public-gh/master` with no
conflicts.
- Verified the diff does not include `pnpm-lock.yaml`, database schema
changes, or migrations.

## Risks

- Low-to-medium risk: this changes automation cadence for productivity
reviews. A truly stuck issue may receive fewer repeated refresh
comments, but the original review issue remains open and assigned for
manager action.
- No migration risk: this is server logic and tests only.

> Checked [`ROADMAP.md`](ROADMAP.md) for overlapping planned core work;
this is a targeted recovery-loop fix and does not add a new roadmap
feature.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family, tool-using software
engineering mode. Exact context window is not exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable; server-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable; no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 08:32:04 -05:00
Devin Foley d2dd759caa plugins: make e2b template default explicit (#4901)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Remote execution environments are part of that control plane,
including sandbox-provider plugins like E2B
> - The E2B provider already normalizes config and runtime behavior
around a `base` template default
> - But the manifest still presented `template` as required, which
forces redundant operator input and makes the UI contract stricter than
runtime behavior
> - That mismatch showed up while building a repeatable QA workflow for
sandbox testing
> - This pull request makes the manifest and validation contract line up
with the existing `base` default
> - The benefit is a simpler and more accurate E2B environment setup
experience

## What Changed

- Removed the E2B manifest's `required: ["template"]` requirement so the
config schema matches runtime behavior
- Clarified the manifest description to say the template defaults to
`base` when omitted
- Added a focused unit test proving that validation normalizes a missing
template to `base`
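
The normalization that the new test covers might look roughly like the sketch below. The function and type names are hypothetical, and treating a blank string the same as a missing value is an assumption. The description only states that a missing `template` defaults to `base`.

```typescript
const DEFAULT_TEMPLATE = "base";

// Hypothetical input/output shapes for the E2B provider config.
interface E2bConfigInput {
  template?: string | null;
}

interface E2bConfig {
  template: string;
}

// Fill in the default when the operator omits the template.
// Assumption: whitespace-only values are treated like an omitted template.
function normalizeE2bConfig(input: E2bConfigInput): E2bConfig {
  const trimmed = typeof input.template === "string" ? input.template.trim() : "";
  return { template: trimmed.length > 0 ? trimmed : DEFAULT_TEMPLATE };
}
```

With this shape, a manifest schema that marks `template` as required would reject configs that the runtime accepts, which is the mismatch this PR removes.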

## Verification

- Ran the focused E2B plugin test for the new behavior:
  - `cd packages/plugins/sandbox-providers/e2b && pnpm test -- --testNamePattern "defaults a missing template to base"`

## Risks

- Low risk. This only loosens the schema to match the plugin's existing
runtime normalization and adds a test for that path.
- The broader E2B plugin suite currently has unrelated existing failures
outside this change; this PR does not modify those paths.

## Model Used

- OpenAI Codex, GPT-5 Codex via Codex CLI agent tooling, large-context
coding workflow with terminal tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 22:43:24 -07:00
Devin Foley b02e67cea5 fix(ci): diff PR workflow paths from merge base (#4903)
## Thinking Path

> - Paperclip’s PR workflow is part of the control-plane safety surface
because it decides whether a branch is allowed to merge.
> - This issue started in that workflow: the lockfile and manifest
policy checks were diffing `base.sha..head.sha`, which incorrectly
treated unrelated `master` commits as if they belonged to the PR branch.
> - The right fix there is to diff from the PR merge base
(`base...head`) so policy checks only evaluate files introduced by the
branch itself.
> - Once that workflow fix was in place, `/checkpr` exposed a second
blocker on the PR merge ref: `verify` was failing in newer `master`-side
tests that were not part of the original branch diff.
> - The actionable repeated failure came from the ACPX local adapter
test suite, where a test hard-coded the managed Codex home under
`instances/default` even though the stable Vitest runner sets a
non-default `PAPERCLIP_INSTANCE_ID`.
> - This pull request now includes both the original CI diff-scope fix
and the targeted ACPX test fix so the PR’s actual checks align with
current base-branch execution.
> - The benefit is that the original false-positive lockfile failure is
removed, and the merge-ref verify path is hardened against the
instance-id isolation used in CI.

## What Changed

- Updated `.github/workflows/pr.yml` so the lockfile policy and manifest
policy steps diff `pull_request.base.sha...pull_request.head.sha` from
the merge base instead of using a two-dot base/head diff.
- Added an inline workflow comment explaining why the three-dot diff is
required for PR-scoped file detection.
- Updated `packages/adapters/acpx-local/src/server/execute.test.ts` so
the managed Codex home assertion uses a test-specific
`PAPERCLIP_INSTANCE_ID` instead of hard-coding `default`.
- Restored `PAPERCLIP_INSTANCE_ID` after that ACPX test finishes so the
test remains isolated and does not leak process env changes.
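
The difference between the two diff forms can be modelled without Git. In the sketch below each tree is a map from path to content hash, and all paths and hashes are made up for illustration. `git diff A..B` compares the tip of `A` with `B`, while `git diff A...B` compares the merge base of `A` and `B` with `B`.

```typescript
// A tree maps file paths to content hashes (illustrative values only).
type Tree = Record<string, string>;

// Paths whose content differs between two trees, including added and removed files.
function changedPaths(from: Tree, to: Tree): string[] {
  const paths = new Set<string>(Object.keys(from).concat(Object.keys(to)));
  return Array.from(paths)
    .filter((p) => from[p] !== to[p])
    .sort();
}

// State when the PR branch was cut from master.
const mergeBaseTree: Tree = { "pnpm-lock.yaml": "lock-v1", "plugin/manifest.ts": "m-v1" };
// master moved on afterwards: an unrelated commit refreshed the lockfile.
const masterTipTree: Tree = { "pnpm-lock.yaml": "lock-v2", "plugin/manifest.ts": "m-v1" };
// The PR branch only touched the manifest.
const prHeadTree: Tree = { "pnpm-lock.yaml": "lock-v1", "plugin/manifest.ts": "m-v2" };

// Two-dot style: base tip vs. head, which also reports the lockfile.
const twoDot = changedPaths(masterTipTree, prHeadTree);
// Three-dot style: merge base vs. head, which reports only the branch's own change.
const threeDot = changedPaths(mergeBaseTree, prHeadTree);

console.log({ twoDot, threeDot });
```

In this model the two-dot comparison flags `pnpm-lock.yaml` even though the branch never touched it, which is the false positive described above.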

## Verification

- Reproduced the original false positive locally by comparing PR heads
`#4901` and `#4902` with the old `base..head` logic; both incorrectly
included `pnpm-lock.yaml` from unrelated `master` commits.
- Verified the new `base...head` logic reduces those PRs to only their
actual changed files and excludes `pnpm-lock.yaml`.
- Verified a real manifest-changing PR (`#4893`) still reports
`package.json` changes under the new logic.
- Ran `pnpm -r typecheck` successfully.
- Ran `pnpm vitest run
packages/adapters/acpx-local/src/server/execute.test.ts` successfully
after the ACPX test fix.
- Ran `pnpm vitest run packages/db/src/backup-lib.test.ts` successfully
against the merge-ref-related DB failure path observed during
`/checkpr`.
- Pushed commit `9520a976` and allowed PR `#4903` checks to rerun on the
updated branch.

## Risks

- Low risk: the workflow change only affects how PR policy checks
determine the changed file set.
- Low risk: the ACPX change is test-only and aligns the test with the
instance-isolation behavior already used by
`scripts/run-vitest-stable.mjs` in CI.
- The remaining operational risk is limited to other unrelated
merge-ref-only failures that were not reproduced in the targeted local
verification above.
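
The test-isolation pattern behind the ACPX fix (set a test-specific `PAPERCLIP_INSTANCE_ID`, then restore it) can be sketched as a small helper. The helper name, the `Env` type, and the `instances/<id>/codex-home` layout are assumptions for illustration. The real test and path resolution are not shown in this description.

```typescript
type Env = Record<string, string | undefined>;

// Run fn with env[key] set to value, then restore the previous state even if
// fn throws. A key that did not exist before is deleted rather than set to
// undefined, because process.env would store the string "undefined".
function withEnvVar<T>(env: Env, key: string, value: string, fn: () => T): T {
  const hadKey = Object.prototype.hasOwnProperty.call(env, key);
  const previous = env[key];
  env[key] = value;
  try {
    return fn();
  } finally {
    if (hadKey) {
      env[key] = previous;
    } else {
      delete env[key];
    }
  }
}

// Hypothetical path resolution: derive the managed home from the instance id
// instead of hard-coding "default".
function managedHomeFor(env: Env): string {
  const instanceId = env["PAPERCLIP_INSTANCE_ID"] ?? "default";
  return `instances/${instanceId}/codex-home`;
}
```

In a real test the `env` argument would be `process.env`. The helper is synchronous, so an async test body would need an awaited variant.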

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5-codex`, via the Codex local adapter in Paperclip.
- Tool-using coding model with shell execution, git, GitHub CLI, and
repository inspection in a local worktree.
- Context included the current repo, the Paperclip task thread, PR check
output, and the isolated execution workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 21:22:40 -07:00
github-actions[bot] 6a7cca95ef chore(lockfile): refresh pnpm-lock.yaml (#4899)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-04-30 20:00:07 -05:00
Dotta 4272c1604d Add ACPX local adapter runtime (#4893)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a control plane
that can start, supervise, and recover agent runs.
> - Local adapters are the bridge between Paperclip issues and concrete
agent runtimes such as Claude, Codex, and other ACP-compatible tools.
> - The roadmap calls out broader “bring your own agent” and claw-style
agent support, and ACPX gives Paperclip one path to normalize multiple
ACP agents behind a single adapter.
> - The branch needed to become one reviewable PR against current
`paperclipai/paperclip:master`, without carrying stale base conflicts or
generated lockfile churn.
> - This pull request adds an experimental built-in `acpx_local`
adapter, integrates it through the server/CLI/UI adapter surfaces, and
adds regression coverage for runtime execution, skill sync, stream
parsing, diagnostics, and log redaction.
> - The benefit is that Paperclip can run Claude/Codex/custom ACP agents
through ACPX while keeping operator configuration, skills, logging, and
transcript rendering inside the existing adapter model.

## What Changed

- Added `@paperclipai/adapter-acpx-local` with server execution, config
schema, ACPX session handling, CLI formatting, UI config helpers, and
stdout parsing.
- Registered `acpx_local` across CLI, server, shared constants, UI
adapter metadata, adapter capabilities, and agent creation/editing
surfaces.
- Added ACPX runtime execution support with persistent sessions,
local-agent JWT environment handling, skill snapshots, runtime skill
materialization, and isolation/security regressions.
- Added ACPX adapter diagnostics and marked the adapter experimental in
the UI.
- Added command/env secret redaction for resolved command metadata in
adapter-utils, server event storage, and the Agent Detail invocation UI.
- Added Storybook coverage for ACPX config, transcript rendering, and
skill states, plus PR screenshots under `docs/pr-screenshots/pap-2944/`.
- Rebased the branch onto current `public-gh/master`; `pnpm-lock.yaml`
is intentionally not included and there are no migration/schema changes.
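
As a rough illustration of command/env secret redaction, the sketch below masks environment values by key name and replaces known secret values inside command arguments. The key pattern, the placeholder string, and both function names are assumptions. The actual rules in adapter-utils are not described here.

```typescript
// Assumed pattern for sensitive environment variable names.
const SENSITIVE_KEY = /(token|secret|password|api[_-]?key|jwt)/i;
const REDACTED = "***REDACTED***";

// Return a copy of env with values of sensitive-looking keys masked.
function redactEnv(env: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    out[key] = SENSITIVE_KEY.test(key) ? REDACTED : env[key];
  }
  return out;
}

// Replace every occurrence of a known secret value inside command arguments.
// Empty strings are skipped so they cannot match everywhere.
function redactCommandArgs(args: string[], secrets: string[]): string[] {
  return args.map((arg) =>
    secrets.reduce(
      (acc, secret) => (secret.length > 0 ? acc.split(secret).join(REDACTED) : acc),
      arg,
    ),
  );
}
```

Returning copies rather than mutating the inputs keeps the unredacted values available to the process that actually launches the command.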

## Verification

- `pnpm exec vitest run
packages/adapters/acpx-local/src/server/execute.test.ts
packages/adapters/acpx-local/src/server/test.test.ts
packages/adapters/acpx-local/src/cli/format-event.test.ts
packages/adapters/acpx-local/src/ui/parse-stdout.test.ts
packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/redaction.test.ts
server/src/__tests__/acpx-local-execute.test.ts
server/src/__tests__/acpx-local-skill-sync.test.ts
server/src/__tests__/acpx-local-adapter-environment.test.ts
server/src/__tests__/adapter-routes.test.ts
server/src/__tests__/agent-skills-routes.test.ts
ui/src/adapters/metadata.test.ts` — 12 files, 87 tests passed.
- `pnpm --filter @paperclipai/adapter-acpx-local typecheck` — passed.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Confirmed PR diff does not include `pnpm-lock.yaml`, database schema
files, or migrations.

Screenshots:

![ACPX Claude skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-light.png?raw=true)
![ACPX Claude skills
dark](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-dark.png?raw=true)
![ACPX custom skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-custom-light.png?raw=true)

## Risks

- Medium risk: this introduces a new built-in adapter package and
touches runtime execution, adapter registration, agent config, skills,
and transcript rendering.
- ACPX and ACP agent behavior can vary by installed tool versions; the
adapter is marked experimental to set operator expectations.
- `pnpm-lock.yaml` is excluded per repository PR policy, so dependency
lock refresh must be handled by the repo’s automation or maintainers.
- No database migration risk: no schema or migration files changed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with repository tool use,
shell execution, git operations, and local verification. Exact hosted
context window was not exposed in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 19:57:05 -05:00
Dotta ad5432fece [codex] Harden issue recovery reliability (#4875)
## Thinking Path

> - Paperclip is the control plane for autonomous agent companies, so
non-terminal issue state must always have a clear live, waiting, or
recovery owner.
> - This change stays inside the server reliability and liveness
subsystem for assigned issue recovery, blocker attention, and live-run
polling.
> - Closed PR #4860 mixed this reliability work with separate
mutation-boundary policy changes, which made review and merge risk too
broad.
> - [PAP-2981](/PAP/issues/PAP-2981) asked for a replacement PR
containing only the remaining reliability slice and explicitly excluding
user-assignment and execution-policy restrictions.
> - Follow-up review also split `advanced` run-liveness continuation
behavior out of this PR so it can be reviewed separately.
> - The implementation hardens repeated recovery escalation, expands
blocker-attention coverage for explicit waiting and recovery paths, and
caps company live-run polling defaults.
> - The benefit is a smaller reliability PR that improves liveness
behavior without changing agent/user mutation authorization boundaries
or `advanced` continuation semantics.

## What Changed

- Avoid repeated liveness escalation updates when the source issue is
already blocked by the same open escalation.
- Treat open liveness escalation recovery issues, their source issues,
and their leaf blockers as covered waiting paths in blocker attention.
- Cap default company live-run polling at 50 rows for both `minCount`
and `limit`, including explicit zero values, to avoid unbounded
responses.
- Preserve the existing behavior where succeeded `advanced` runs are
considered productive/healthy for stranded-work recovery and are not
actionable bounded run-liveness continuations.
- Added focused server coverage for recovery dedupe, blocker attention,
liveness escalation, run continuations, and live-run polling.
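
The polling cap can be sketched as one normalization function applied to both `minCount` and `limit`. The function name is hypothetical. Clamping values above 50 and rejecting non-integers are assumptions, since the description only says that omitted, invalid, and non-positive values resolve to the capped default of 50.

```typescript
const LIVE_RUN_DEFAULT_CAP = 50;

// Normalize a raw query value (for example from a query string) to a row count.
function resolveCappedCount(raw: unknown): number {
  const n = typeof raw === "string" ? Number(raw) : raw;
  // Omitted, non-numeric, non-integer, zero, and negative inputs use the default.
  if (typeof n !== "number" || !Number.isInteger(n) || n <= 0) {
    return LIVE_RUN_DEFAULT_CAP;
  }
  // Assumption: explicit values above the cap are clamped down to it.
  return Math.min(n, LIVE_RUN_DEFAULT_CAP);
}
```

For example, `resolveCappedCount("0")` and `resolveCappedCount(undefined)` both resolve to 50 under these assumptions, so an explicit zero can no longer request an unbounded response.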

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/agent-live-run-routes.test.ts`
- Result: 5 files passed, 63 tests passed.
- `pnpm --filter @paperclipai/server typecheck`
- Result: passed.
- No UI changes; screenshots are not applicable.

## Risks

- Recovery and blocker-attention classification changes can affect which
blocked chains are shown as covered versus needing attention.
- Live-run polling now treats omitted, invalid, or non-positive `limit`
/ `minCount` values as the capped default of 50.
- `advanced` run-liveness continuation behavior is intentionally
excluded from this PR and split for separate review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 16:44:28 -05:00
Dotta a3de1d764d Add cheap model profiles for local adapters (#4881)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where
adapters are the boundary between the board, agents, and execution
runtimes.
> - Local adapters currently expose a primary runtime configuration, but
operators often need a cheaper model lane for routine or low-risk work.
> - That cheap lane has to stay adapter-owned: runtime profile settings
should not mutate the primary adapter config or bypass existing
auth/secret mediation.
> - Issue creation also needs an ergonomic way to request primary,
cheap, or custom model behavior for a selected assignee.
> - This pull request adds a first-class `cheap` model profile contract
across adapter capabilities, heartbeat config resolution, agent
configuration, and issue creation.
> - The benefit is that cheaper task execution can be configured and
requested explicitly while preserving adapter boundaries, secret
handling, and audit visibility.

## What Changed

- Added adapter model-profile capability metadata and a `cheap` profile
contract for supported local adapters.
- Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during
heartbeat config resolution, including requested/applied/fallback run
metadata.
- Added agent configuration UI for cheap model profile settings without
writing those settings into primary `adapterConfig`.
- Added New Issue assignee model lane controls for Primary / Cheap /
Custom and request payload handling.
- Added run ledger profile badges and Storybook stories for the new
cheap-lane UI states.
- Added tests for validators, heartbeat model profile application,
permission/secret mediation, UI payload helpers, and run ledger
rendering.
- Added committed UI verification screenshots under
`docs/pr-screenshots/pap-2837/`.
- Addressed Greptile review feedback around cheap-profile defaults,
shared profile types, and fallback test data.

## Verification

Local:

- `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/heartbeat-model-profile.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts
ui/src/lib/issue-assignee-overrides.test.ts
ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103
tests.
- `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts
ui/src/components/IssueRunLedger.test.tsx` — passed after
Greptile/rebase follow-up, 2 files / 17 tests.
- `pnpm --filter @paperclipai/ui typecheck` — passed after
Greptile/rebase follow-up.
- `pnpm -r typecheck` — passed.
- `pnpm build` — passed.
- `pnpm test:run` — did not complete successfully in this local
worktree: it stopped in pre-existing `@paperclipai/adapter-utils`
sandbox/SSH fixture suites outside this PR diff. Failures were 5s local
timeouts plus `git init -b` unsupported by this machine's Git 2.21.0.
The branch-specific targeted suites above passed.
- Branch was fetched/rebased onto `public-gh/master`; `git rev-list
--left-right --count public-gh/master...HEAD` reports `0 9`.

Remote PR checks on latest head
`e30bf399146451c86cee98ed528d51d33fa5af5a`:

- `policy` — passed.
- `verify` — passed.
- `e2e` — passed.
- `Greptile Review` — passed, confidence score 5/5; Greptile review
threads resolved.
- `security/snyk (cryppadotta)` — passed.

Screenshots:

- [New issue cheap lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png)
- [New issue custom lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png)
- [New issue unsupported adapter
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png)
- [Run ledger model profile badges
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png)
- Mobile variants are also in `docs/pr-screenshots/pap-2837/`.

## Risks

- Medium: heartbeat config mediation now merges runtime model profiles
into adapter configs, so adapter secret normalization and host-command
restrictions must keep covering nested config paths.
- Medium: the UI adds another issue creation choice; unsupported
adapters must keep hiding the cheap lane and preserve primary behavior.
- Low migration risk: no database migration is included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use
and command execution. Exact served model/context window was not exposed
by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 15:32:04 -05:00
Dotta 1fe1067361 Polish board settings and skills workflow (#4863)
## Thinking Path

> - Paperclip's board UI and bundled skills are the operator layer for
configuring agents, routines, issue workflows, and local troubleshooting
loops.
> - The prior rollup mixed this operator polish with database backups,
backend reliability, thread scale, and cost/workflow primitives.
> - This pull request isolates the remaining board QoL, settings,
issue-detail integration, adapter config cleanup, and skills smoke
tooling.
> - It includes some integration-level overlap with the thread and
workflow slices so this branch can run from `origin/master` while still
preserving the full original work.
> - Preferred merge order is the narrower primitives first, then this
integration PR last.
> - The benefit is that reviewers can inspect the user-facing
board/settings/skills layer separately from backend infrastructure
changes.

## What Changed

- Added board/settings polish for agents, routines, company settings,
project workspace detail, and issue detail controls.
- Added agent/routine UI regression tests and New Issue dialog coverage.
- Integrated issue-detail activity/cost/interaction surfaces and leaf
work pause/resume controls.
- Cleaned bundled adapter UI config defaults and onboarding copy.
- Added terminal-bench loop and work-stoppage diagnosis skills plus a
smoke test script.
- Updated attachment type handling and Paperclip skill/API guidance.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/pages/Agents.test.tsx
ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx
ui/src/pages/IssueDetail.test.tsx
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts`
- Result: 7 test files passed, 54 tests passed.
- `pnpm run smoke:terminal-bench-loop-skill`
- Result: JSON output included `"ok": true` and `"cleanup": true`.
- UI screenshots are not included because verification relies on focused
component/page test coverage for the changed board surfaces.

## Risks

- This is the integration-heavy PR in the split and intentionally
overlaps some component/API primitives with the issue-thread and
workflow PRs so it can run from `origin/master`.
- Preferred merge order: #4859, #4860, #4861, #4862, then this PR last.
If earlier branches merge first, this PR may need a straightforward
conflict refresh in shared UI files.
- The terminal-bench smoke script creates temporary mock issues and
relies on cleanup; the verified run returned `cleanup: true`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 15:28:11 -05:00
Dotta c4269bab59 Add workflow interaction cancellation and issue cost summaries (#4862)
## Thinking Path

> - Paperclip coordinates work through issue-thread interactions, run
history, and cost telemetry.
> - Operators need workflow prompts to be cancellable and costs to be
visible at the issue level.
> - The earlier rollup mixed this workflow/cost work with database
backups, reliability recovery, thread scaling, and settings polish.
> - This pull request isolates the interaction and cost surfaces into a
reviewable slice.
> - The backend now supports cancelling pending question interactions
and summarizing issue-tree costs.
> - The UI component layer can render cancelled questions and interleave
activity with run ledger rows.

## What Changed

- Added `cancelled` as an issue-thread interaction status and result
shape for question interactions.
- Added the board-only `POST
/issues/:id/interactions/:interactionId/cancel` route and service
implementation.
- Added issue-tree cost summary support in the cost service and
`/issues/:id/cost-summary` API route.
- Extended shared cost exports and UI API/query keys for issue cost
summaries.
- Updated `IssueThreadInteractionCard` and `IssueRunLedger` components
for cancelled questions, issue cost surfaces, and activity/run
interleaving.
- Added focused server and component regression coverage.
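
An issue-tree cost summary can be sketched as a traversal from a root issue through its descendants. The `IssueNode` shape, the integer `costCents` field, and the function name are assumptions. The PR describes a recursive traversal, while this sketch swaps in an explicit stack with a visited set so that very deep or accidentally cyclic trees cannot overflow the call stack.

```typescript
// Hypothetical flattened issue row with its own direct cost in cents.
interface IssueNode {
  id: string;
  parentId: string | null;
  costCents: number;
}

interface IssueTreeCostSummary {
  issueCount: number;
  totalCents: number;
}

function summarizeIssueTreeCost(issues: IssueNode[], rootId: string): IssueTreeCostSummary {
  const byId = new Map<string, IssueNode>();
  const childrenByParent = new Map<string, IssueNode[]>();
  for (const issue of issues) {
    byId.set(issue.id, issue);
    if (issue.parentId !== null) {
      const siblings = childrenByParent.get(issue.parentId) ?? [];
      siblings.push(issue);
      childrenByParent.set(issue.parentId, siblings);
    }
  }

  const root = byId.get(rootId);
  if (!root) return { issueCount: 0, totalCents: 0 };

  // Depth-first walk; each issue is counted at most once.
  const visited = new Set<string>();
  const stack: IssueNode[] = [root];
  let totalCents = 0;
  while (stack.length > 0) {
    const current = stack.pop() as IssueNode;
    if (visited.has(current.id)) continue;
    visited.add(current.id);
    totalCents += current.costCents;
    for (const child of childrenByParent.get(current.id) ?? []) {
      stack.push(child);
    }
  }
  return { issueCount: visited.size, totalCents };
}
```

The work is linear in the number of issue rows passed in, so the cost on large trees depends mostly on how many rows are loaded.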

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- Result: 4 test files passed, 45 tests passed.
- UI screenshots not included because this PR updates reusable
components and API surfaces without wiring a new page-level layout.

## Risks

- Adds a new interaction terminal status; clients that switch
exhaustively on interaction status may need to handle `cancelled`.
- Issue-tree cost summaries use recursive issue traversal and should be
watched on unusually large issue trees.
- Page-level issue detail wiring is intentionally left to the board
QoL/issue-detail branch to keep this PR narrow.
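
The first risk above is the kind of gap an exhaustive switch can surface at compile time. In this sketch only `cancelled` comes from the PR. The other status names and the label text are hypothetical.

```typescript
// Assumed status union; only "cancelled" is confirmed by this PR.
type InteractionStatus = "pending" | "answered" | "cancelled";

// Reaching this function means a status was not handled by the switch.
function assertNever(value: never): never {
  throw new Error(`Unhandled interaction status: ${String(value)}`);
}

function describeInteractionStatus(status: InteractionStatus): string {
  switch (status) {
    case "pending":
      return "Waiting for an answer";
    case "answered":
      return "Answered";
    case "cancelled":
      return "Cancelled";
    default:
      // If a new status is added to the union and not handled above,
      // this call fails to type-check.
      return assertNever(status);
  }
}
```

A client written this way stops compiling when a new terminal status is added to the union, rather than silently falling through at runtime.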

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:57:25 -05:00
Dotta 87f19cd9a6 Improve issue thread scale and markdown polish (#4861)
## Thinking Path

> - Paperclip's board UI is the operator surface for supervising
AI-agent companies.
> - Issue threads are where operators read progress, respond to agents,
inspect markdown, and jump through long histories.
> - Large threads and rich markdown had become difficult to navigate and
expensive to render.
> - The previous rollup mixed these UI scale fixes with unrelated
backend recovery, costs, backups, and settings changes.
> - This pull request isolates the issue-thread scale and markdown
polish work.
> - The benefit is a reviewable UI slice that can merge independently of
the backend reliability, database backup, workflow, and board QoL PRs.

## What Changed

- Virtualized long issue chat threads and stabilized
anchor/jump-to-latest behavior for large histories.
- Added incremental issue-list row loading and tests for
scroll-triggered pagination behavior.
- Hardened markdown body rendering and markdown editor behavior around
HTML tags, image drops, code-copy UI, and escaped newline handling.
- Added a long-thread measurement harness at
`scripts/measure-issue-chat-long-thread.mjs` plus
`perf:issue-chat-long-thread`.
- Added focused UI/lib regression coverage for thread rendering,
markdown, optimistic comments, and message building.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx
ui/src/components/MarkdownBody.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`
- Result: 6 test files passed, 170 tests passed.
- UI screenshots not included because this PR is covered by targeted
component tests and does not introduce a new page layout.

## Risks

- Virtualization changes can affect scroll anchoring in edge cases on
very long threads.
- Markdown/editor hardening changes are intentionally defensive, but
malformed content may render differently than before.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:18:01 -05:00
Dotta cd606563f6 Expand database backups to non-system schemas (#4859)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Reliable backups are part of operating that control plane safely.
> - The previous backup path was public-schema oriented and did not
clearly cover plugin-owned schemas or migration history.
> - Paperclip now has plugin database namespaces and Drizzle migration
state that must survive backup/restore.
> - This pull request expands logical database backups to non-system
schemas and documents the backup boundary.
> - The benefit is safer restore behavior for core and plugin-owned
database state without implying full filesystem disaster recovery.

## What Changed

- Include non-system database schemas in JavaScript and pg_dump backup
paths.
- Preserve enum, table, sequence, index, constraint, migration, and
plugin-schema objects across backup/restore.
- Add restore coverage for plugin-owned schemas and Drizzle migration
history.
- Clarify docs that DB backups are logical database backups, not full
instance filesystem backups.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/db/src/backup-lib.test.ts`
- Result: 1 test file passed, 4 tests passed.
- Confirmed this PR does not include `pnpm-lock.yaml` or
`.github/workflows/*` changes.

## Risks

- Medium: backup generation touches schema discovery and restore
ordering, so unusual database objects may need additional coverage
later.
- No migrations are included.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use enabled, medium reasoning
effort. Exact hosted context-window details are not exposed in this
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: no UI changes are included in this PR, so screenshots are not
applicable.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 12:54:35 -05:00
Devin Foley c0ce35d1fb Improve E2B plugin configuration UX and fix execution timeouts (#4802)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - E2B is a sandbox provider plugin that runs agent code in isolated
cloud environments
> - Operators configure E2B through the plugin settings page
> - But the E2B API key configuration was unclear — the settings field
description didn't explain that pasted keys are auto-saved as company
secrets, and the fallback to the host `E2B_API_KEY` variable wasn't
documented
> - Additionally, long-running E2B sandbox commands were timing out
because the plugin environment RPC driver used a fixed timeout, and
environment commands competed for the single foreground command slot
> - This PR clarifies the E2B configuration UX, fixes RPC timeouts for
plugin environment execution, and runs E2B environment commands in
background mode to avoid blocking the foreground slot
> - The benefit is clearer E2B setup for operators and more reliable
sandbox command execution

## What Changed

- Updated E2B plugin manifest and settings UI to clarify API key
configuration — field description now explains that pasted keys are
saved as company secrets and documents the `E2B_API_KEY` host fallback
- Added test coverage for the plugin settings page rendering
- Fixed `plugin-environment-driver.ts` to pass the configured timeout
through to RPC calls instead of using a hardcoded default
- Updated `environment-runtime.ts` to propagate timeout from the
environment lease to the plugin driver
- Changed E2B sandbox command execution to use background handles so
long-running agent commands don't block the foreground slot needed by
the callback bridge

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to plugin settings, verify E2B API key field shows
the updated description text
- Manual: run an E2B-backed agent task with a long-running command,
verify it completes without RPC timeout

## Risks

- Low risk. Configuration UX change is cosmetic. The timeout fix passes
an existing value through instead of dropping it. Background command
execution is a behavioral change but only affects E2B sandbox commands —
the foreground slot is still available for bridge health checks.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 17:12:30 -07:00
Devin Foley a4ac6ff133 Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end

## What Changed

- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
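
The request policy described above (API paths only, validated content types, size limits) could look roughly like the following. This is a sketch under stated assumptions: the `BridgeRequest` and `BridgePolicy` shapes, the `checkBridgeRequest` name, and the status codes are all hypothetical and are not taken from `sandbox-callback-bridge.ts`.

```typescript
// Hypothetical request and policy shapes; the real bridge's types and limits are not shown here.
interface BridgeRequest {
  path: string;
  contentType: string | undefined;
  bodyBytes: number;
}

interface BridgePolicy {
  apiPathPrefix: string; // assumed, for example "/api/"
  allowedContentTypes: string[]; // assumed list of media types
  maxBodyBytes: number; // assumed size limit
}

type PolicyDecision =
  | { allowed: true }
  | { allowed: false; status: number; reason: string };

function checkBridgeRequest(req: BridgeRequest, policy: BridgePolicy): PolicyDecision {
  // Reject anything outside the API prefix, including path traversal attempts.
  if (!req.path.startsWith(policy.apiPathPrefix) || req.path.includes("..")) {
    return { allowed: false, status: 404, reason: "non-API path" };
  }
  // Enforce the body size limit before looking at the payload type.
  if (req.bodyBytes > policy.maxBodyBytes) {
    return { allowed: false, status: 413, reason: "body too large" };
  }
  // Only requests that carry a body need a recognized content type.
  if (req.bodyBytes > 0) {
    const [rawType = ""] = (req.contentType ?? "").split(";");
    const mediaType = rawType.trim().toLowerCase();
    if (!policy.allowedContentTypes.includes(mediaType)) {
      return { allowed: false, status: 415, reason: "unsupported content type" };
    }
  }
  return { allowed: true };
}
```

Keeping the policy as a pure function makes it straightforward to unit test separately from the HTTP server that hosts it.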

## Verification

- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge

## Risks

- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
Devin Foley 4cf612a92d Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments that are leased, and the server
manages runtime state, workspace configuration, and plugin lifecycle
> - Several edge cases caused failures during concurrent operations: a
race condition in runtime state insertion could produce duplicate-key
errors, reused workspaces didn't sync their configuration when the
parent issue was updated, sandbox provider plugins could be queried
before registration completed, and orphaned environment leases from
failed runs were never released
> - This PR fixes these four runtime/environment issues
> - The benefit is more reliable concurrent agent execution and proper
resource cleanup

## What Changed

- `services/heartbeat.ts`: Fixed a race condition where concurrent
runtime state inserts could fail with a duplicate-key error by using an
upsert pattern
- `services/issues.ts`: Sync reused workspace configuration when an
issue is updated, so the workspace reflects the latest issue state
- `services/environment-runtime.ts`: Fixed a startup race where sandbox
provider plugins could be queried before registration completed, by
awaiting plugin readiness before resolving environment drivers
- `services/heartbeat.ts`: Release environment leases for orphaned runs
that lost their process without cleanup
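
The difference between a plain insert and the upsert pattern mentioned for runtime state can be shown with an in-memory stand-in for a table. This is an illustrative sketch only: the `RuntimeStateRow` shape, the `RuntimeStateTable` class, and the choice of `runId` as the unique key are assumptions, and the real change operates on the database rather than a `Map`.

```typescript
// Hypothetical row shape and key; the real table and columns are not shown in the PR text.
interface RuntimeStateRow {
  runId: string;
  status: string;
  updatedAt: number;
}

// In-memory stand-in for a table with a unique key on `runId`.
class RuntimeStateTable {
  private readonly rows = new Map<string, RuntimeStateRow>();

  // Plain insert: a second writer for the same key fails, like a unique-constraint violation.
  insert(row: RuntimeStateRow): void {
    if (this.rows.has(row.runId)) {
      throw new Error(`duplicate key: ${row.runId}`);
    }
    this.rows.set(row.runId, row);
  }

  // Upsert: insert when the key is new, otherwise update the existing row in place.
  upsert(row: RuntimeStateRow): void {
    const existing = this.rows.get(row.runId);
    this.rows.set(row.runId, existing ? { ...existing, ...row } : row);
  }

  get(runId: string): RuntimeStateRow | undefined {
    return this.rows.get(runId);
  }
}
```

With `insert`, whichever concurrent writer arrives second throws; with `upsert`, both writers succeed and the later write wins.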

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
runtime state upsert and process recovery lease cleanup
- `pnpm typecheck` — clean
- Manual: trigger concurrent agent runs to verify no duplicate-key
failures; verify orphaned leases are released after process loss

## Risks

- Low risk. The runtime state upsert changes insert-to-upsert behavior,
which could mask a legitimate duplicate if two different runs produce
the same key — but this is prevented by the run ID being part of the
key. The plugin startup await is bounded by the existing registration
timeout.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:10 -07:00
Devin Foley f9cf1d2f6a Add cursor sandbox support and fix SSH workspace sync (#4803)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, or on remote
hosts via SSH
> - The cursor adapter needs to resolve `cursor-agent` inside sandbox
environments where it's installed in `~/.local/bin`
> - But when using the default `agent` command on a sandbox target, the
adapter didn't know to look in `~/.local/bin/cursor-agent`, causing
"command not found" failures
> - Additionally, repeated SSH runs failed because `git checkout` during
workspace sync conflicted with leftover `.paperclip-runtime` files from
previous runs
> - This PR adds sandbox-aware command resolution for cursor and fixes
the SSH workspace sync conflict
> - The benefit is cursor works in E2B sandboxes out of the box, and
repeated SSH runs don't fail on workspace sync

## What Changed

- `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox
targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and
prefers `~/.local/bin/cursor-agent` when the default command is
requested; tightened the sandbox command probe to validate the binary
exists before launching; preserves explicit custom command overrides
- `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH
workspace sync to handle `.paperclip-runtime` untracked file conflicts
from previous runs
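
The sandbox command resolution described for `cursor-local` can be sketched as a small pure function. The `planCursorSandboxCommand` name and its parameters are hypothetical and do not reflect the real `prepareCursorSandboxCommand` signature; the sketch also omits the probe that checks the binary exists before launching.

```typescript
// Hypothetical helper; the real `prepareCursorSandboxCommand` signature is not shown in the PR text.
interface SandboxCommandPlan {
  command: string;
  path: string;
}

// The default command name mentioned in the PR description.
const DEFAULT_CURSOR_COMMAND = "agent";

function planCursorSandboxCommand(
  requestedCommand: string,
  remoteHome: string,
  remotePath: string,
): SandboxCommandPlan {
  const localBin = `${remoteHome}/.local/bin`;
  // Prepend the sandbox install directory so it wins PATH lookups.
  const path = remotePath.length > 0 ? `${localBin}:${remotePath}` : localBin;
  // Swap in the sandbox binary only for the default command; explicit overrides pass through.
  const command =
    requestedCommand === DEFAULT_CURSOR_COMMAND ? `${localBin}/cursor-agent` : requestedCommand;
  return { command, path };
}
```

For example, a default `agent` request with a remote home of `/home/user` resolves to `/home/user/.local/bin/cursor-agent`, while a custom command is left unchanged.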

## Verification

- `pnpm test` — all existing and new tests pass, including cursor
sandbox probe, sandbox execution, and custom command override tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run a cursor-local task, verify
it resolves cursor-agent from the sandbox install path

## Risks

- Low-medium. The `--force` flag on git checkout could discard
uncommitted changes in the remote workspace, but the workspace is
managed by Paperclip and should not contain user edits.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:12:06 -07:00
Devin Foley a0f5cbffd7 Harden release flow with registry verification and dist-tag checks (#4800)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Paperclip is distributed as npm packages, including plugins like
`plugin-e2b`
> - The release process publishes canary and stable builds via npm
dist-tags
> - But there was no automated verification that published packages
actually landed with the correct dist-tags, and broken canary publishes
could silently ship to users
> - This PR adds a registry verification script that checks published
packages match their expected dist-tags, and wires it into PR CI so
regressions are caught before merge
> - The benefit is release integrity is verified automatically, and
broken dist-tag states are caught early

## What Changed

- Added `scripts/verify-release-registry-state.mjs` — verifies that
published npm packages have correct dist-tag assignments and detects
orphaned or mispointed tags
- Added `scripts/verify-release-registry-state.test.mjs` — test coverage
for the verification logic
- Updated `scripts/release.sh` to include canary dist-tag safety checks
before publishing
- Updated `.github/workflows/pr.yml` to run registry verification as a
CI step
- Updated `doc/PUBLISHING.md` and `doc/RELEASING.md` with the new
verification workflow
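
The core comparison behind a dist-tag check can be sketched as follows. This is not the logic of `scripts/verify-release-registry-state.mjs`: the `findDistTagProblems` name, the input shapes, and the definition of "orphaned" (a tag that points at a version that is not published) are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real script's inputs and report format are not shown in the PR text.
type DistTags = Record<string, string>; // tag name -> version

interface TagProblem {
  tag: string;
  kind: "missing" | "mispointed" | "orphaned";
  detail: string;
}

function findDistTagProblems(
  expected: DistTags,
  actual: DistTags,
  publishedVersions: string[],
): TagProblem[] {
  const problems: TagProblem[] = [];
  // Every expected tag must exist and point at the expected version.
  for (const [tag, version] of Object.entries(expected)) {
    const found = actual[tag];
    if (found === undefined) {
      problems.push({ tag, kind: "missing", detail: `expected ${version}` });
    } else if (found !== version) {
      problems.push({ tag, kind: "mispointed", detail: `expected ${version}, found ${found}` });
    }
  }
  // Assumed meaning of "orphaned": a tag whose target version is not in the published list.
  for (const [tag, version] of Object.entries(actual)) {
    if (!publishedVersions.includes(version)) {
      problems.push({ tag, kind: "orphaned", detail: `${version} is not a published version` });
    }
  }
  return problems;
}
```

A check of this kind is read-only: it compares registry metadata with expectations and reports differences without publishing anything.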

## Verification

- `pnpm test` — all tests pass including new verification script tests
- `node scripts/verify-release-registry-state.mjs` — runs against the
live npm registry and reports current state
- CI: the new PR workflow step runs on every PR push

## Risks

- Low risk. This is additive CI and tooling — no runtime code changes.
The registry verification is read-only (queries npm, does not publish).
The release script changes add safety checks that abort before
publishing if state is unexpected.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:20 -07:00
Devin Foley 367d4cab72 Fix SSH callback URL selection for LAN and private networks (#4799)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run on remote hosts via SSH environments
> - When a remote agent needs to call back to the Paperclip API, it
needs a reachable URL
> - But the runtime API URL candidate builder did not account for
private network topologies where the server is only reachable via LAN or
VPN addresses
> - Agents on SSH hosts were failing to connect because the callback URL
pointed to localhost or an unreachable address
> - This PR fixes callback URL selection to honor `PAPERCLIP_API_URL`,
prefer LAN-reachable candidates, filter unreachable link-local
addresses, and include interface hosts in onboarding invite URLs
> - The benefit is SSH-based agents can reliably reach the Paperclip API
on private networks without manual URL configuration

## What Changed

- `runtime-api.ts`: Added `PAPERCLIP_API_URL` as a first-priority
candidate in `buildRuntimeApiCandidateUrls`; extracted
`collectReachableInterfaceHosts` to enumerate non-loopback,
non-link-local network interface IPs with IPv4 preference
- `server/src/index.ts`: Export `PAPERCLIP_API_URL` from the server
environment so it is available to callback candidate resolution
- `server/src/routes/access.ts`: Include LAN interface hosts in
onboarding invite connection candidates
- `server/src/config.ts`: Attempted auto-allowing LAN interface hosts,
then reverted to the per-instance allowlist approach (both commits
included for history clarity)
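
The candidate ordering described above (explicit `PAPERCLIP_API_URL` first, then reachable LAN hosts with IPv4 preferred, loopback last) can be sketched with pure functions. The function names, the `http` scheme, and the port handling below are assumptions and are not copied from `runtime-api.ts`; `InterfaceAddress` mirrors only the fields used from the entries Node's `os.networkInterfaces()` returns.

```typescript
// Minimal mirror of the fields used from `os.networkInterfaces()` entries.
interface InterfaceAddress {
  address: string;
  family: "IPv4" | "IPv6";
  internal: boolean;
}

// Link-local ranges: 169.254.0.0/16 for IPv4 and fe80::/10 for IPv6.
function isLinkLocal(addr: InterfaceAddress): boolean {
  if (addr.family === "IPv4") return addr.address.startsWith("169.254.");
  return /^fe[89ab][0-9a-f]:/i.test(addr.address);
}

// Collect non-loopback, non-link-local addresses, listing IPv4 before IPv6.
function collectReachableHosts(
  interfaces: Record<string, InterfaceAddress[] | undefined>,
): string[] {
  const ipv4: string[] = [];
  const ipv6: string[] = [];
  for (const addrs of Object.values(interfaces)) {
    for (const addr of addrs ?? []) {
      if (addr.internal || isLinkLocal(addr)) continue;
      (addr.family === "IPv4" ? ipv4 : ipv6).push(addr.address);
    }
  }
  return ipv4.concat(ipv6);
}

// Hypothetical candidate builder: explicit URL first, LAN hosts next, loopback last.
function buildCandidateUrls(
  explicitApiUrl: string | undefined,
  hosts: string[],
  port: number,
): string[] {
  const candidates: string[] = [];
  if (explicitApiUrl) candidates.push(explicitApiUrl);
  for (const host of hosts) {
    const urlHost = host.includes(":") ? `[${host}]` : host; // bracket IPv6 literals
    candidates.push(`http://${urlHost}:${port}`);
  }
  candidates.push(`http://localhost:${port}`);
  // Drop duplicates while keeping the first occurrence of each URL.
  return candidates.filter((url, index) => candidates.indexOf(url) === index);
}
```

Because these are only candidates for the agent to try, the server's allowed-hostname configuration still decides which origins are accepted.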

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
LAN candidate ordering and link-local filtering
- `pnpm typecheck` — clean
- Manual: start a Paperclip server on a machine with a LAN IP, create an
SSH environment pointing to another host on the same LAN, verify the
agent's callback URL uses the LAN IP rather than localhost

## Risks

- Low-medium. The candidate list now includes more addresses (all
non-loopback LAN interfaces). These are candidates for the agent to try,
not an allowlist — the server's allowed hostnames still gate which
origins are accepted. Ordering change (LAN preferred over loopback)
could affect existing setups where localhost was intentionally
preferred.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:17 -07:00
Devin Foley 9b99d30330 Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use

## What Changed

- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment

## Risks

- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
Dotta 3494e84a29 Add v2026.428.0 release changelog (#4665)
## Summary

- Adds `releases/v2026.428.0.md` covering the diff between `v2026.427.0`
and `origin/master` (seven merged PRs).
- Generated via `.agents/skills/release-changelog/SKILL.md`.
- Flags the additive migrations `0071_default_hire_approval_off` and
`0072_large_sandman` plus the new-companies hire-approval default flip
in the upgrade guide.

## Notes

- Highlight: pause/resume actions in the sidebar agents panel
([#4616](https://github.com/paperclipai/paperclip/pull/4616)).
- Improvements: assigned-todo recovery dispatch, recovery issue
hardening, hire-approval opt-in default, inline selector keyboard
handling.
- Fixes: manual routine inbox visibility, stale company skill refresh
rejection, stale stored company-selection cleanup.

## Test plan

- [ ] Reviewer confirms section coverage and PR-to-bullet attribution.
- [ ] Confirm the file lands at `releases/v2026.428.0.md`.
- [ ] Confirm no canary suffix in title or filename.

Source issue: [PAP-2599](/PAP/issues/PAP-2599)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:40:20 -05:00
Dotta 6b7f6ce4b8 [codex] Split PR #4692 UI/QoL updates (#4701)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane.
> - The affected surface is the board UI for issue threads, issue lists,
routines, dialogs, navigation, and issue review indicators.
> - Closed PR #4692 bundled backend, schema, docs, workflow, and UI/QoL
work into one oversized change set.
> - Greptile could not keep reviewing that broad PR because it exceeded
the 100-file review limit and mixed unrelated concerns.
> - This pull request extracts the UI/QoL slice into a fresh branch
under the review limit while leaving workflow and lockfile churn out.
> - The benefit is a focused review path for the board UI performance
and workflow improvements without reopening the oversized PR.

## What Changed

- Added long issue-thread virtualization, scroll-container binding,
anchor preservation, latest-comment jump targeting, and related
regression/perf fixtures.
- Improved issue list scalability with scroll-based loading, server
offset parameters, and pagination-focused UI tests.
- Reduced new issue dialog typing churn and split dialog action
subscriptions so broad layout/nav surfaces avoid unnecessary renders.
- Added routine variables help and routine description mention options
for users, agents, and projects.
- Added productivity review badge/link UI and fixed the badge to use
Paperclip's company-prefixed router link.
- Kept the split PR below Greptile's review limit and excluded
`.github/workflows/pr.yml` and `pnpm-lock.yaml`.

## Verification

- `pnpm install --no-frozen-lockfile` in the clean worktree to install
`@tanstack/react-virtual` locally without committing lockfile churn.
- `pnpm --filter @paperclipai/ui exec vitest run --config
vitest.config.ts src/components/IssueChatThread.test.tsx
src/components/IssuesList.test.tsx
src/components/NewIssueDialog.test.tsx src/pages/Routines.test.tsx
src/pages/Issues.test.tsx` passed: 5 files, 83 tests.
- `pnpm --filter @paperclipai/ui typecheck` passed.
- `git diff --check origin/master..HEAD` passed.
- Split-scope checks: 53 changed files; no `.github/workflows/pr.yml`;
no `pnpm-lock.yaml`.
- Screenshots were not captured in this heartbeat; the changes are
primarily virtualization, routing, pagination, and editor behavior
covered by focused regression tests.

## Risks

- Moderate UI risk because issue-thread virtualization changes scroll
behavior on long conversations; regression tests cover anchor jumps,
latest-comment targeting, row metadata, and short-thread fallback.
- Moderate integration risk because the issue-list offset parameter and
productivity review field depend on matching API behavior.
- Dependency risk: the UI package adds `@tanstack/react-virtual` while
repository policy keeps `pnpm-lock.yaml` out of PRs, so CI must resolve
dependency changes through the repo's normal lockfile policy.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local repository and
GitHub workflow. Exact runtime context window was not exposed by the
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:18:58 -05:00
Dotta 1991ec9d6f [codex] Split backend control-plane QoL slice (#4700)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
backend task ownership, recovery, review visibility, and company-scoped
limits need to stay enforceable without UI-only coupling.
> - Closed PR #4692 bundled those backend changes with UI workflow,
docs, skills, workflow, and lockfile churn.
> - PAP-2694 asks for a clean backend/control-plane slice from that
closed branch.
> - This branch starts from current `master` and mines only the `cli`,
`packages/db`, `packages/shared`, and `server` contracts/tests needed
for the backend behavior.
> - It explicitly excludes UI workflow/performance work,
`.github/workflows/pr.yml`, `pnpm-lock.yaml`, docs, skills,
package-script, adapter UI build-config, and perf fixture script
changes; the only UI files are fixture/test updates required by the
tightened shared `Company` contract.
> - The benefit is a smaller reviewable PR that preserves the
control-plane fixes while staying under Greptile's 100-file review
limit.

## What Changed

- Added company-scoped attachment-size limits through DB
schema/migrations, shared company portability contracts, CLI
import/export coverage, and server attachment upload enforcement.
- Added productivity review service/API behavior for no-comment streak,
long-active, and high-churn review issues, including request-depth
clamping and issue summary exposure.
- Hardened issue ownership and recovery/control-plane paths: peer-agent
mutation denial, issue tree pause/resume behavior, stranded recovery
origins, and related activity/test coverage.
- Preserved related backend contract updates for routine timestamp
variables and managed agent instruction bundles because they live in
shared/server contracts from the source branch.
- Addressed Greptile feedback by making `Company.attachmentMaxBytes`
non-optional, simplifying review request-depth clamping, fixing the
migration final newline, and enforcing the process-level attachment cap
as the final ceiling for uploads.
- Added minimal company fixtures needed for repo-wide typecheck/build
and kept the PR to 66 changed files with forbidden/non-slice paths
excluded.
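
The "final ceiling" rule for attachment uploads described above can be sketched as follows. This is a minimal sketch, not the server's actual code: only `attachmentMaxBytes` comes from the PR description, while `CompanyLimits`, `resolveAttachmentLimit`, `isUploadAllowed`, and the `processMaxBytes` parameter are hypothetical names chosen for illustration.

```typescript
// Hypothetical shape: only `attachmentMaxBytes` is named in the PR description.
interface CompanyLimits {
  attachmentMaxBytes: number;
}

// The process-level cap is the final ceiling: a company may configure a lower
// limit, but its configured value can never raise the effective limit.
function resolveAttachmentLimit(company: CompanyLimits, processMaxBytes: number): number {
  return Math.min(company.attachmentMaxBytes, processMaxBytes);
}

// An upload is accepted only when it fits under the effective limit.
function isUploadAllowed(
  sizeBytes: number,
  company: CompanyLimits,
  processMaxBytes: number,
): boolean {
  return sizeBytes <= resolveAttachmentLimit(company, processMaxBytes);
}
```

Taking the minimum of the two values keeps the company setting useful for tightening limits while leaving the process-level cap authoritative.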

## Verification

- `pnpm install --frozen-lockfile`
- `git diff --check origin/master..HEAD`
- `git diff --name-only origin/master..HEAD | wc -l` -> 66 files
- `git diff --name-only origin/master..HEAD -- .github/workflows/pr.yml
pnpm-lock.yaml package.json doc skills .agents scripts
packages/adapters` -> no output
- `pnpm exec vitest run --config vitest.config.ts
packages/shared/src/validators/issue.test.ts
packages/shared/src/routine-variables.test.ts
packages/shared/src/adapter-types.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
cli/src/__tests__/company.test.ts
server/src/__tests__/productivity-review-service.test.ts
server/src/__tests__/issue-tree-control-service.test.ts
server/src/__tests__/issue-tree-control-routes.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/issue-attachment-routes.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/issues-service.test.ts` -> 12 files, 147 tests
passed
- `pnpm exec vitest run --config vitest.config.ts
cli/src/__tests__/company-delete.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
server/src/__tests__/productivity-review-service.test.ts` -> 3 files, 18
tests passed
- `pnpm exec vitest run --config vitest.config.ts
server/src/__tests__/issue-attachment-routes.test.ts` -> 1 file, 6 tests
passed
- `pnpm --filter @paperclipai/db typecheck && pnpm --filter
@paperclipai/shared typecheck && pnpm --filter @paperclipai/server
typecheck && pnpm --filter paperclipai typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck && pnpm --filter
@paperclipai/ui build`

## Risks

- Includes migrations `0073_shiny_salo.sql` and
`0074_striped_genesis.sql`; merge ordering matters if another PR adds
migrations first.
- This is intentionally backend-only apart from fixture/test updates
forced by shared type correctness; UI affordances from PR #4692 are not
present here and should land in separate UI slices.
- The worktree install emitted plugin SDK bin-link warnings for unbuilt
plugin packages, but the targeted tests and package typechecks completed
successfully.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected; check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:46:45 -05:00
Dotta d9f540c331 [codex] Refresh docs and agent skills (#4693)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane
> - Contributors and agents need docs and skills that match the current
V1 behavior
> - The source branch included documentation updates alongside
implementation work
> - Keeping docs and skill guidance separate makes the implementation PR
easier to review
> - This pull request refreshes the V1 docs and agent-operating guidance
without changing runtime behavior
> - The benefit is current contributor guidance that can merge
independently from code changes

## What Changed

- Refreshed V1 product, goal, implementation, database, and development
documentation.
- Updated the Paperclip heartbeat skill guidance and create-agent skill
references.
- Added the Paperclip plan-to-task conversion skill.
- Updated release changelog skill guidance.

## Verification

- `git diff --check public-gh/master..HEAD` passed in the PR worktree
after the Greptile fix.
- Greptile Review passed on head `673317ed` with zero unresolved review
threads.
- GitHub PR checks passed on head `673317ed`: `policy`, `verify`, `e2e`,
and `security/snyk (cryppadotta)`.

## Risks

- Low runtime risk because this branch only changes docs and skill
guidance.
- Documentation may need follow-up wording adjustments if reviewers want
a different framing for V1 behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:12:03 -05:00
Dotta d0bdbe11a9 Stabilize inline selector keyboard handling (#4617)
## Thinking Path

> - Paperclip's board UI relies on compact selectors for frequent issue
and agent edits.
> - Inline selectors often live inside larger keyboard-aware surfaces
such as composers and popovers.
> - Arrow, enter, tab, and escape keys handled by the selector should
not leak to parent document shortcuts.
> - Stale company selection should also stay hidden until the company
list confirms it is valid.
> - This pull request tightens inline selector keyboard handling and
adds regression coverage for stale company bootstrap behavior.
> - The benefit is fewer accidental parent interactions and safer
company-scoped UI initialization.

## What Changed

- Added a stable empty `recentOptionIds` default so selector filtering
does not get a new array every render.
- Mirrored highlighted option state into a ref so Enter/Tab commits the
current highlighted option reliably after keyboard navigation.
- Stopped propagation for selector-owned navigation/commit/escape keys.
- Added jsdom regressions for inline selector keyboard handling and
CompanyProvider stale selection behavior.
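
Two of the changes above, the stable empty default and the propagation stop, can be illustrated without React. The sketch below is an assumption-laden illustration: `recentOptionIds` is the only name taken from the PR, the exact set of selector-owned keys is guessed from the "arrow, enter, tab, and escape" wording, and `SelectorKeyEvent`, `handleSelectorKey`, and `resolveRecentOptionIds` are hypothetical.

```typescript
// Module-level constant: the same array identity is reused on every call,
// so memoized filtering that depends on it is not invalidated each render.
const EMPTY_RECENT_OPTION_IDS: readonly string[] = [];

function resolveRecentOptionIds(recentOptionIds?: readonly string[]): readonly string[] {
  return recentOptionIds ?? EMPTY_RECENT_OPTION_IDS;
}

// Minimal structural stand-in for a keyboard event (assumption for this sketch).
interface SelectorKeyEvent {
  key: string;
  stopPropagation(): void;
}

// Assumed key set; the PR only says arrow, enter, tab, and escape keys.
const SELECTOR_OWNED_KEYS = new Set(["ArrowUp", "ArrowDown", "Enter", "Tab", "Escape"]);

// Returns true when the selector handled the key and stopped it from reaching
// parent listeners; any other key is left to bubble as usual.
function handleSelectorKey(event: SelectorKeyEvent): boolean {
  if (!SELECTOR_OWNED_KEYS.has(event.key)) return false;
  event.stopPropagation();
  return true;
}
```

Writing `recentOptionIds = []` as an inline default would create a fresh array on every render, which is the instability the shared constant avoids.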

## Verification

- `pnpm exec vitest run ui/src/components/InlineEntitySelector.test.tsx
ui/src/context/CompanyContext.test.tsx`
- Targeted selector and CompanyProvider tests pass cleanly without React
`act(...)` warnings.
- Screenshots not attached: this is keyboard/state behavior covered by
component tests.

## Risks

- Low risk: changes are scoped to inline selector key handling and
tests. The main behavior shift is intentionally preventing handled
selector keys from reaching parent listeners.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:04:35 -05:00
Dotta 43b0f2ae58 Add pause and resume actions to sidebar agents (#4616)
## Thinking Path

> - Paperclip operators need fast control over the agents running their
company.
> - The sidebar is the persistent place operators scan agent state while
navigating the board UI.
> - Agent pause and resume already exist as control-plane actions, but
sidebar users had to navigate away to use them.
> - This pull request adds a compact per-agent action menu in the
sidebar.
> - It keeps edit, pause, and resume close to the visible agent list
while preserving existing navigation behavior.
> - The benefit is faster operator intervention when an agent needs to
be paused or restarted.

## What Changed

- Refactored sidebar agent rows into a small item component with a
hover/focus action menu.
- Added edit, pause, and resume actions using existing agent API calls
and cache invalidation keys.
- Added success/error toasts for pause and resume mutations.
- Tracked pause/resume pending state per agent so one active mutation
does not disable every sidebar row.
- Disabled direct sidebar resume for budget-paused agents and labeled
that state clearly.
- Added jsdom coverage for active-agent pause, paused-agent resume,
per-agent pending state, and budget-paused resume protection.
- Added visual review artifacts for the default row and opened action
menu:
  - [Sidebar row](docs/pr-screenshots/pr-4616/sidebar-agent-row.png)
  - [Sidebar action menu](docs/pr-screenshots/pr-4616/sidebar-agent-actions.png)
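
The per-agent pending state and the budget-paused guard from the list above might look roughly like this. It is a sketch under assumptions: the `SidebarAgent` shape, the `pausedByBudget` flag, and both function names are hypothetical and do not come from the PR.

```typescript
// Hypothetical agent shape for illustration only.
interface SidebarAgent {
  id: string;
  status: "active" | "paused";
  pausedByBudget: boolean;
}

// Pending state is keyed by agent id, so one in-flight mutation only marks
// its own row as busy instead of disabling every sidebar row.
function isRowBusy(pendingAgentIds: ReadonlySet<string>, agentId: string): boolean {
  return pendingAgentIds.has(agentId);
}

// Resume is offered only for paused agents that were not stopped by budget
// enforcement and that have no pause/resume mutation currently in flight.
function canResumeFromSidebar(
  agent: SidebarAgent,
  pendingAgentIds: ReadonlySet<string>,
): boolean {
  return (
    agent.status === "paused" &&
    !agent.pausedByBudget &&
    !isRowBusy(pendingAgentIds, agent.id)
  );
}
```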

## Verification

- `pnpm exec vitest run ui/src/components/SidebarAgents.test.tsx`
- `pnpm --filter @paperclipai/server prepare:ui-dist`
- Browser screenshot pass against a temporary local trusted instance at
`http://127.0.0.1:3102` using Playwright.

## Risks

- Low risk: UI-only addition using existing agent pause/resume
endpoints. The main risk is layout crowding for very narrow sidebars,
mitigated by the icon-only trigger and existing truncation.
- Budget-paused agents now require a non-sidebar path to resume, which
is intentional to avoid accidentally restarting agents stopped by budget
enforcement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell/browser access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:54 -05:00
Dotta f88f538e6d Keep manual routine runs visible in the runner inbox (#4615)
## Thinking Path

> - Paperclip coordinates recurring agent work through scheduled and
manual routines.
> - Manual routine runs are board-initiated work and should stay visible
to the human who kicked them off.
> - Routine execution issues are agent-assigned, so they can be filtered
away from a board user's inbox unless the user is recorded as touching
the work.
> - Coalesced or skipped active routine runs have the same visibility
problem because they reuse an existing live issue.
> - This pull request carries the manual runner actor into routine
dispatch and touches the linked issue for that user's inbox.
> - The benefit is that manually triggered routine work stays
discoverable by the operator who started it.

## What Changed

- Passed the board or agent actor from the routine run route into the
routine service.
- Recorded manual board runners as `createdByUserId` on fresh routine
execution issues.
- Touched coalesced or skipped active routine issues for the manual
runner by updating read state and clearing that user's inbox archive.
- Added route and service regressions for manual routine run actor
propagation and inbox visibility.
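
The attribution and inbox-touch rules above can be sketched as two small functions. Only `createdByUserId` is named in the PR; the `RunActor` union, the `UserInboxState` shape, and the function names are assumptions made for illustration.

```typescript
// Hypothetical actor union passed from the run route into the service.
type RunActor =
  | { type: "board"; userId: string }
  | { type: "agent"; agentId: string };

// Fresh execution issues record the board user who triggered the run;
// agent-triggered runs leave the field empty in this sketch.
function createdByUserIdFor(actor: RunActor): string | null {
  return actor.type === "board" ? actor.userId : null;
}

// Hypothetical per-user inbox state for an existing issue.
interface UserInboxState {
  lastReadAt: Date | null;
  archivedAt: Date | null;
}

// For coalesced or skipped runs that reuse a live issue, "touching" the issue
// updates the user's read state and clears their inbox archive marker.
function touchIssueForUser(_state: UserInboxState, now: Date): UserInboxState {
  return { lastReadAt: now, archivedAt: null };
}
```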

## Verification

- `pnpm exec vitest run server/src/__tests__/routines-routes.test.ts
server/src/__tests__/routines-service.test.ts`

## Risks

- Low risk: the change is scoped to manual routine runs and only updates
issue attribution/read-state metadata for the initiating actor.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:24 -05:00
Dotta 68c37660f0 Dispatch assigned todo work during recovery sweeps (#4614)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies.
> - Agent assignments must reliably turn into heartbeat work without
board operators manually nudging stuck tasks.
> - The stranded-assignment recovery sweep already handles failed or
lost runs.
> - But assigned `todo` issues with no prior run could sit idle because
there was nothing to retry or recover.
> - This pull request dispatches those never-started assigned todos as
normal assignment wakes.
> - The benefit is that recovery fixes missed initial dispatches without
creating unnecessary recovery issues.

## What Changed

- Added an initial assigned-todo dispatch path to the recovery service
when an assigned `todo` issue has no heartbeat run yet.
- Reused invocation budget hard-stop checks before dispatching or
requeueing recovery work.
- Counted `assignmentDispatched` in startup/scheduled recovery logs.
- Added heartbeat recovery regressions for first dispatch, duplicate
queued wake prevention, budget-blocked skips, and paused-agent skips.
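
The dispatch decision for never-started todos can be sketched as a pure function. This is an illustration only: the snapshot fields and the `decideInitialDispatch` name are hypothetical, and the order of the checks is an assumption rather than the recovery service's actual logic.

```typescript
// Hypothetical snapshot of what the sweep knows about an assigned issue.
interface AssignedIssueSnapshot {
  status: string;
  hasHeartbeatRun: boolean;
  hasQueuedWake: boolean;
  agentPaused: boolean;
  budgetHardStopped: boolean;
}

type SweepDecision = "dispatch_assignment" | "skip";

function decideInitialDispatch(issue: AssignedIssueSnapshot): SweepDecision {
  // Only assigned `todo` issues with no prior run take this path; issues with
  // failed or lost runs belong to the existing stranded-recovery handling.
  if (issue.status !== "todo" || issue.hasHeartbeatRun) return "skip";
  // Paused or budget-blocked agents are never woken by the sweep.
  if (issue.agentPaused || issue.budgetHardStopped) return "skip";
  // A wake that is already queued must not be duplicated.
  if (issue.hasQueuedWake) return "skip";
  // Otherwise dispatch as a normal assignment wake, not a recovery issue.
  return "dispatch_assignment";
}
```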

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- Low to medium risk: this changes liveness recovery behavior for
assigned `todo` issues, but it stays on the existing assignment wake
path and skips paused or budget-blocked agents.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 20:02:44 -05:00
Dotta 7a9b3a6037 [codex] Harden recovery issue handling (#4600)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The control plane must recover stranded agent work without creating
new operational loops
> - Stranded recovery issues can themselves fail, and exposing raw retry
errors in comments can leak sensitive adapter details
> - New local companies also should not force a hire-approval gate
unless operators enable that policy
> - This pull request hardens recovery issue handling, redacts retry
failure details in issue copy, preserves `maxConcurrentRuns: 1`, and
flips new-hire approval to an opt-in default
> - The benefit is safer automatic recovery and smoother default company
setup without hidden migration conflicts

## What Changed

- Added migration `0071_default_hire_approval_off` and updated company
schema/import/export/docs so hire approvals default off and serialize
only when enabled.
- Added migration `0072_large_sandman` with a partial unique index
preventing duplicate active stranded recovery issues for the same source
issue.
- Blocked failed `stranded_issue_recovery` issues in place instead of
creating nested recovery issues.
- Redacted latest retry failure details from recovery issue comments
while still linking reviewers to run evidence.
- Allowed `maxConcurrentRuns: 1` to be honored by heartbeat concurrency
normalization.
- Added focused regression coverage for recovery recursion, redaction,
migration ordering, and concurrency behavior.
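
The no-nesting rule and the redacted comment copy described above can be sketched as follows. The `stranded_issue_recovery` value comes from the PR; the `originKind` field, the action union, the function names, and the comment wording are hypothetical.

```typescript
// Hypothetical issue shape; `originKind` is an assumed field name.
interface FailedIssue {
  id: string;
  originKind: string;
}

type FailureAction =
  | { kind: "block_in_place"; issueId: string }
  | { kind: "create_recovery_issue"; sourceIssueId: string };

// A failed recovery issue is blocked where it is rather than spawning a
// nested recovery issue, which keeps recovery from recursing.
function decideFailureAction(issue: FailedIssue): FailureAction {
  if (issue.originKind === "stranded_issue_recovery") {
    return { kind: "block_in_place", issueId: issue.id };
  }
  return { kind: "create_recovery_issue", sourceIssueId: issue.id };
}

// The comment deliberately takes no error text as input, so raw adapter
// details cannot leak; reviewers are pointed at the run evidence instead.
function buildRecoveryComment(runId: string): string {
  return `Latest retry failed. Details are redacted; see run ${runId} for evidence.`;
}
```

Keeping the raw error out of the function signature makes the redaction structural rather than dependent on string filtering.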

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-portability.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-process-recovery.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
--pool=forks --poolOptions.forks.isolate=true` exits 0, but this host
skipped the embedded Postgres tests with the existing init guard.

## Risks

- Migration risk is low, but this PR intentionally owns both new
migrations to avoid migration-journal conflicts between separate PRs.
- Recovery comments now require operators to inspect linked run evidence
for details instead of reading raw errors inline.
- The hire approval default changes behavior for newly created/imported
companies only; existing persisted company settings are not changed
except by the SQL default for future rows.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 15:02:47 -05:00
Dotta 6ccf80bcf2 [codex] Reject stale company skill refreshes (#4601)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the reusable agent capability layer
> - Skill inventory refresh work can outlive the company it was
requested for
> - Without an explicit company existence check, stale refreshes can
continue into bundled/local skill cleanup for deleted or missing
companies
> - This pull request makes company-skill listing fail fast when the
company no longer exists
> - The benefit is clearer API behavior and less stale background work
against missing company scope

## What Changed

- Added a company existence check before `companySkillService.list()`
refreshes bundled and local-path skill state.
- Added regression coverage asserting missing companies return `404
Company not found`.
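
The fail-fast check above can be sketched as below. This is a simplified, synchronous illustration: `HttpError`, `listCompanySkills`, and the two callback parameters are hypothetical stand-ins, and the real service is presumably asynchronous. Only the `404 Company not found` response comes from the PR.

```typescript
// Hypothetical error type carrying an HTTP status.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// The existence check runs before any refresh work, so a deleted or missing
// company never reaches bundled or local-path skill cleanup.
function listCompanySkills(
  companyId: string,
  companyExists: (id: string) => boolean,
  refreshAndList: (id: string) => string[],
): string[] {
  if (!companyExists(companyId)) {
    throw new HttpError(404, "Company not found");
  }
  return refreshAndList(companyId);
}
```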

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-service.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.

## Risks

- Low risk. Existing callers for valid companies are unchanged.
- Missing-company callers now receive an explicit 404 instead of
continuing refresh work.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 13:19:38 -05:00
Dotta d95968a9f8 [codex] Ignore stale stored company selections (#4602)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI is the operator’s control surface for selecting the
active company
> - A company id stored in localStorage can become stale across resets,
imports, or deleted companies
> - Exposing that stale id before companies load can briefly put
downstream UI in an invalid company scope
> - This pull request defers selected-company exposure until the loaded
company list validates the stored id
> - The benefit is a cleaner company-selection bootstrap path and fewer
transient invalid API requests

## What Changed

- Initialized `CompanyProvider` selection as `null` until companies
finish loading.
- Reused a stored company id only when it exists in the loaded
selectable company list.
- Cleared storage and selected state when no companies are available.
- Added jsdom regression coverage for stale stored ids before and after
company loading.
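
The bootstrap rules above can be expressed as a pure function. This is a sketch under assumptions: the names are hypothetical, `null` for `companies` stands for "still loading", and falling back to the first company when the stored id is stale is an assumption that the PR does not state.

```typescript
interface SelectableCompany {
  id: string;
}

interface SelectionResult {
  selectedCompanyId: string | null;
  clearStoredId: boolean;
}

function resolveCompanySelection(
  storedId: string | null,
  companies: readonly SelectableCompany[] | null,
): SelectionResult {
  // Still loading: expose nothing, so downstream UI never sees a stale scope.
  if (companies === null) return { selectedCompanyId: null, clearStoredId: false };

  // No companies available: clear both storage and the selected state.
  const first = companies[0];
  if (first === undefined) return { selectedCompanyId: null, clearStoredId: true };

  // Reuse the stored id only if the loaded list confirms it.
  const storedIsValid = storedId !== null && companies.some((c) => c.id === storedId);
  // Assumption: a stale id falls back to the first selectable company.
  return { selectedCompanyId: storedIsValid ? storedId : first.id, clearStoredId: false };
}
```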

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/context/CompanyContext.test.tsx`

## Risks

- Low risk. The change only affects selection bootstrap and keeps valid
stored selections intact.
- There may be a slightly longer initial `null` selected-company state
while the company list is loading.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 13:18:21 -05:00
Dotta 15c0ce3722 Add Twitter/X link to READMEs (PAP-2475) (#4593)
## Summary

- Adds [@papercliping on X](https://x.com/papercliping) next to the
existing Discord link in the top-of-file nav and the bottom
**Community** list of both `README.md` and `cli/README.md`.
- The `cli/README.md` change keeps the npm-published readme consistent
with the GitHub one.

Resolves PAP-2475.

## Test plan

- [ ] Render `README.md` on GitHub and confirm the new Twitter link in
the header strip and the Community section.
- [ ] Render `cli/README.md` (preview on GitHub or via npm) and confirm
the same.
- [ ] Click both Twitter links and verify they land on
`https://x.com/papercliping`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:35:33 -05:00
Dotta ecd92af001 release: v2026.427.0 notes (#4590)
## Summary

- Adds `releases/v2026.427.0.md` covering the 77 PRs since `v2026.416.0`
- Highlights: multi-user access + invites, structured issue-thread
interactions, run liveness continuations, sub-issues as a workflow
checklist, issue subtree pause/cancel/restore, first-class issue
references
- Beta Features: Environments + pluggable sandbox providers (incl.
`@paperclipai/plugin-e2b`)
- Notes 14 additive migrations (`0057`–`0070`) in the Upgrade Guide; no
breaking changes flagged

Source issue: [PAP-2476](/PAP/issues/PAP-2476)

## Test plan
- [ ] Spot-check that each linked PR actually landed in the v2026.427.0
range
- [ ] Confirm migration list matches `server/src/database/migrations`
- [ ] Verify rendered markdown looks right on GitHub

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:16:20 -05:00
Dotta 215b6cd161 [codex] Add security role route coverage (#4589)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Agent creation accepts roles that become part of the agent contract
and telemetry.
> - The shared role list already includes the security role.
> - Direct agent creation should preserve that role through route
handling and analytics metadata.
> - This pull request adds route coverage for creating a security-role
agent and asserting telemetry receives the same role.
> - The benefit is regression coverage for security agents without
changing the production route behavior.

## What Changed

- Added a server route test that creates an agent with `role:
"security"`.
- Asserted the create payload and telemetry metadata preserve `security`
as the agent role.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-skills-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`

## Risks

- Low risk; test-only coverage.
- No runtime behavior, schema, or API contract changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:49:59 -05:00
Dotta 53396f272a [codex] Fix sub-issue progress summary styling (#4588)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The issue list and issue detail surfaces summarize child/sub-issue
progress for operators.
> - Those summaries need to be compact and visually consistent because
they appear in dense lists.
> - The progress strip is most useful when there are multiple sub-issues
to compare, so the summary intentionally stays hidden for a single
sub-issue.
> - This pull request tightens the sub-issue progress summary styling
and updates the related tests.
> - The benefit is a cleaner, more scannable task list without changing
task ownership, status, or workflow behavior.

## What Changed

- Adjusted sub-issue progress summary copy/styling in the issue list and
detail summary helpers.
- Intentionally render the progress summary only for two or more child
issues; a single child issue still appears in the normal sub-issue list
without a redundant progress strip.
- Updated the UI tests that assert the rendered summary behavior.
- Clarified the two-plus-child threshold in code with a named constant.
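
A minimal sketch of the threshold described above, assuming hypothetical names (`MIN_CHILDREN_FOR_PROGRESS_SUMMARY`, `summarizeSubIssueProgress`) and a simplified status set; the actual helper names and types in the UI code are not shown in this PR description.

```typescript
// Hypothetical constant name; the PR only states that a named constant exists.
const MIN_CHILDREN_FOR_PROGRESS_SUMMARY = 2;

// Simplified child shape for illustration only.
type ChildIssue = { id: string; status: "todo" | "in_progress" | "done" };

type ProgressSummary = { done: number; total: number };

// Returns null when there are too few children for an aggregate strip to be useful.
function summarizeSubIssueProgress(children: ChildIssue[]): ProgressSummary | null {
  if (children.length < MIN_CHILDREN_FOR_PROGRESS_SUMMARY) return null;
  const done = children.filter((child) => child.status === "done").length;
  return { done, total: children.length };
}
```

Returning `null` rather than a zero-width summary keeps the "hide the strip" decision in one place, so list and detail views cannot disagree about the threshold.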

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssuesList.test.tsx
ui/src/lib/issue-detail-subissues.test.ts`

## Screenshots

![Before/after comparison of sub-issue progress summary
styling](https://gist.githubusercontent.com/cryppadotta/3a0aded379de3515acd3360bd54638e0/raw/cd26b5bd63ee65d01334f6c8ad88b1c831eb5d8f/pap-2449-subissue-progress-before-after.svg)

## Risks

- Low risk; this is a small UI presentation change with focused test
coverage.
- The intentional threshold change means parents with exactly one child
no longer show the aggregate progress strip, avoiding redundant summary
chrome while keeping the child visible in the list.
- No schema or API behavior changes.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:48:26 -05:00
Dotta fda296ee4f [codex] Add configurable liveness auto-recovery controls (#4587)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat liveness recovery decides when stalled issue trees need
manager-visible follow-up.
> - Automatic recovery issue creation is useful, but operators need
instance-level controls for how aggressive it is.
> - Without controls, recovery behavior is harder to tune for local
development, production operations, and noisy edge cases.
> - This pull request adds configurable liveness auto-recovery settings
across shared contracts, API routes, services, and the instance
experimental settings UI.
> - The benefit is that operators can keep liveness findings advisory or
enable bounded recovery automation with explicit intervals and lookback
windows.

## What Changed

- Added shared types and validators for liveness auto-recovery settings.
- Extended instance settings routes and services to persist and validate
the new controls.
- Wired heartbeat/recovery services to honor enablement, minimum
interval, and lookback settings.
- Added UI controls for liveness recovery under instance experimental
settings.
- Covered the new server behavior with instance settings and liveness
escalation tests.
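
As a rough illustration of how the three controls could interact, here is a sketch under stated assumptions: the field names (`enabled`, `minIntervalMinutes`, `lookbackHours`), the units, and the reading of "lookback" as "ignore stalls older than the window" are all hypothetical and are not taken from the shared contracts.

```typescript
// Hypothetical field names; the PR names the three controls but not their keys or units.
type LivenessAutoRecoverySettings = {
  enabled: boolean;
  minIntervalMinutes: number;
  lookbackHours: number;
};

// Reject malformed input before it reaches the recovery service.
function parseLivenessSettings(input: unknown): LivenessAutoRecoverySettings {
  const raw = (input ?? {}) as Record<string, unknown>;
  const { enabled, minIntervalMinutes, lookbackHours } = raw;
  if (typeof enabled !== "boolean") throw new Error("enabled must be a boolean");
  if (typeof minIntervalMinutes !== "number" || !(minIntervalMinutes > 0)) {
    throw new Error("minIntervalMinutes must be a positive number");
  }
  if (typeof lookbackHours !== "number" || !(lookbackHours > 0)) {
    throw new Error("lookbackHours must be a positive number");
  }
  return { enabled, minIntervalMinutes, lookbackHours };
}

// Decide whether a recovery issue may be created automatically right now.
function shouldCreateRecoveryIssue(
  settings: LivenessAutoRecoverySettings,
  now: Date,
  stalledSince: Date,
  lastRecoveryAt: Date | null,
): boolean {
  if (!settings.enabled) return false; // advisory only: report, never auto-create
  const lookbackMs = settings.lookbackHours * 60 * 60 * 1000;
  // Assumption: stalls older than the lookback window are left alone.
  if (now.getTime() - stalledSince.getTime() > lookbackMs) return false;
  if (lastRecoveryAt === null) return true;
  const minIntervalMs = settings.minIntervalMinutes * 60 * 1000;
  return now.getTime() - lastRecoveryAt.getTime() >= minIntervalMs;
}
```

With `enabled: false` the function always returns `false`, which matches the PR's statement that defaults keep the existing advisory behavior.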

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/instance-settings-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Moderate behavioral risk because recovery automation timing changes
when enabled; defaults keep existing advisory behavior unless the
setting is turned on.
- No database migration in this PR; settings are stored through the
existing instance settings path.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:46:44 -05:00
Neeraj Kumar Singh B f0f9460d1d docs: AWS ECS Fargate deployment runbook (#3897)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies and ships
a "local-first, cloud-ready" deployment model
> - The deploy docs currently cover local/Docker but not a production
cloud target, so teams asking "how do I put this behind a real domain"
have no canonical path
> - We already support Docker images, RDS-compatible Postgres, and an
EFS storage profile, so AWS ECS Fargate is a natural fit
> - Without a runbook, each team reinvents VPC, security groups, TLS,
and secrets wiring and usually gets at least one step wrong
> - This pull request adds `docs/deploy/aws-ecs.md`, an ECS
task-definition template, and an `.env.aws.example`, cross-linked from
the deploy overview
> - The benefit is a single, reproducible ~$110/mo path to a production
deployment, plus a full teardown for throwaway environments

## What Changed

- New `docs/deploy/aws-ecs.md`: an 11-step ECS Fargate runbook covering
  ECR, VPC, RDS, EFS, Secrets Manager, IAM, ALB, and ECS service with the
  deployment circuit breaker enabled
- New `docker/ecs-task-definition.json`: Fargate-ready task definition
  with `<ACCOUNT_ID>`, `<REGION>`, `<EFS_ID>`, `<DOMAIN>` placeholder
  tokens
- New `docker/.env.aws.example`: documents every non-secret env var the
  ECS deployment needs
- `docs/deploy/overview.md`: one-line cross-reference to the new guide
- Greptile feedback addressed in follow-up commits:
  - `containerName` in the service-create call now matches
    `paperclip-server` in the task definition
  - HTTP :80 listener added that 301-redirects to :443
  - Dedicated RDS DB subnet group created before `create-db-instance`
  - EFS teardown polls on mount-target deletion instead of `sleep 30`
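
The last item replaces a fixed sleep with polling. The runbook itself uses shell and the AWS CLI; the sketch below only illustrates the general pattern in TypeScript, with a hypothetical `pollUntil` helper and a caller-supplied check standing in for "are all mount targets gone yet".

```typescript
// Hypothetical helper: repeatedly run `check` until it reports success or time runs out.
async function pollUntil(
  check: () => Promise<boolean>,
  intervalMs: number,
  timeoutMs: number,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true; // condition met, stop immediately
    if (Date.now() + intervalMs > deadline) return false; // next wait would overshoot
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Compared with a fixed `sleep 30`, polling returns as soon as the dependency is actually gone and fails explicitly when it never disappears, instead of proceeding to a delete that is bound to error.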

## Verification

- Walked every step of the runbook against the task definition to
  confirm variable names (`$ALB_SG`, `$ECS_SG`, `$RDS_SG`, `$EFS_SG`,
  `$TG_ARN`, `$LISTENER_ARN`, `$HTTP_LISTENER_ARN`, `$EFS_ID`,
  `$RDS_ENDPOINT`, etc.) are defined before they are referenced
- Confirmed the `containerName` in Step 10 (`paperclip-server`) matches
  `docker/ecs-task-definition.json` line 11
- Confirmed the `sed` placeholder substitution in Step 8 matches the
  tokens in the task definition template
- Checked that teardown runs in reverse-dependency order: ECS service →
  listeners → target group → ALB → RDS (waits for deletion) → DB subnet
  group → EFS mount targets (polled) → EFS → secrets → SGs → ECR → IAM →
  log group

## Risks

- **Low risk for the repo.** Docs-only change plus two template files
  under `docker/`; no runtime code paths are touched and nothing is
  imported by the build.
- **Risk for users who follow the runbook:** AWS bills accrue
  immediately once RDS/ALB/EFS exist. The runbook calls this out and
  includes a full teardown procedure. Placeholder tokens (`<ACCOUNT_ID>`,
  `<REGION>`, `<EFS_ID>`, `<DOMAIN>`) are documented so nothing is
  silently hard-coded.

## Model Used

- Claude (Anthropic), model `claude-opus-4-6`, ~200K context window,
extended thinking mode on, used with tool access (file edit, shell) via
Claude Code. The Greptile follow-up commits were authored the same way.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass — N/A for docs/config
templates; validated by reading
- [x] I have added or updated tests where applicable — N/A for docs
- [x] If this change affects the UI, I have included before/after
screenshots — N/A, no UI
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 08:41:47 -05:00
Dotta 1d8c7a09b8 [codex] Add security role route regression (#4586)
## Thinking Path

> - Paperclip orchestrates AI agents through company-scoped
control-plane workflows.
> - Agent creation is one of the core board/operator surfaces for
defining who works in a company.
> - The shared taxonomy now includes a first-class `security` agent
role.
> - Direct agent creation must preserve that role through default
instruction materialization and telemetry.
> - A prior replacement PR covered this path, but Greptile identified
that the route-test mock could let a future patch object shadow the
regression.
> - This pull request reopens the narrow regression coverage from
current `master` with the mock ordering fixed.
> - The benefit is a focused guardrail that keeps `security` role
creation observable without expanding the production diff.

## What Changed

- Added a direct agent creation route regression test for `role:
"security"`.
- Verified telemetry receives `agentRole: "security"` after the default
instruction materialization update path.
- Ordered the regression mock as `...patch` before `role: "security"` so
future patch fields cannot shadow the asserted role.
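
The ordering point comes down to object-spread semantics: later properties overwrite earlier ones. A small sketch, using a hypothetical `AgentRecord` shape and mock functions rather than the real route-test mock:

```typescript
// Hypothetical shape for illustration; the real test's types are not shown here.
type AgentRecord = { id: string; name: string; role: string };

const baseAgent: AgentRecord = { id: "agent-1", name: "Sec Bot", role: "general" };

// Patch is spread first, then the asserted role is pinned last, so it always wins.
function mockUpdate(patch: Partial<AgentRecord>): AgentRecord {
  return { ...baseAgent, ...patch, role: "security" };
}

// The fragile ordering, for contrast: a patch that carries `role` overrides the pin.
function fragileMockUpdate(patch: Partial<AgentRecord>): AgentRecord {
  return { ...baseAgent, role: "security", ...patch };
}

const pinned = mockUpdate({ name: "Renamed", role: "engineer" });
const shadowed = fragileMockUpdate({ name: "Renamed", role: "engineer" });
console.log(pinned.role); // "security"
console.log(shadowed.role); // "engineer"
```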

## Verification

- `pnpm install --frozen-lockfile` to link dependencies in the fresh
worktree; it completed with existing plugin SDK bin warnings.
- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
packages/shared/src/adapter-types.test.ts`

## Risks

- Low risk. This is test-only coverage and does not change runtime
behavior.

## Model Used

- OpenAI Codex, GPT-5 based coding agent, tool-enabled with local shell
and repository editing capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
test-only regression)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:11:52 -05:00
Devin Foley d2cbe2cb23 Prefer pushing feature branches to a user fork in paperclip-dev skill (#4572)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The `paperclip-dev` skill is the canonical reference agents read
before doing development work on the Paperclip repo itself
> - Today the skill assumes feature branches get pushed to `origin` (=
`paperclipai/paperclip`), which clutters the upstream branch list when
contributors actually have personal forks
> - This is the standard open-source contribution pattern (push to fork,
PR upstream) and the skill should reflect it
> - This pull request adds a "Forks — Prefer Pushing to a User Fork"
section that teaches agents to detect a fork remote, push there, and
only fall back to `origin` when no fork is configured
> - The benefit is cleaner upstream branch hygiene and behavior that
matches typical contributor workflows without any code/runtime change

## What Changed

- Added a new **Forks — Prefer Pushing to a User Fork** section to
`skills/paperclip-dev/SKILL.md` covering:
  - How to detect a user fork via `git remote -v` (treat any
    non-`paperclipai` GitHub remote as the fork)
  - How to push to the fork (`git push -u <fork-remote> HEAD`)
  - How to create the PR from the fork (`gh pr create --repo
    paperclipai/paperclip --head <fork-owner>:<branch>`)
  - The no-fork fallback (push to `origin`, do not auto-create a fork;
    ask first)
  - Keeping the fork's `master` in sync
- Added a reinforcing entry to the **Common Mistakes** table linking
back to the new section
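
The detection rule ("treat any non-`paperclipai` GitHub remote as the fork") can be sketched as a parser over `git remote -v` output. The function name `findForkRemote` and the `Remote` type are hypothetical, and the skill itself describes this as a manual step rather than code.

```typescript
type Remote = { name: string; url: string };

// Scan `git remote -v` output and return the first GitHub push remote
// whose owner is not `paperclipai`, or null when no fork is configured.
function findForkRemote(remoteOutput: string): Remote | null {
  for (const line of remoteOutput.split("\n")) {
    // Lines look like: "<name>\t<url> (push)"
    const match = /^(\S+)\s+(\S+)\s+\(push\)$/.exec(line.trim());
    if (!match) continue;
    const [, name, url] = match;
    // Handles both https://github.com/<owner>/... and git@github.com:<owner>/...
    const owner = /github\.com[:/]([^/]+)\//.exec(url)?.[1];
    if (owner && owner.toLowerCase() !== "paperclipai") return { name, url };
  }
  return null;
}
```

A `null` result corresponds to the no-fork fallback: push to `origin` and ask before creating a fork.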

## Verification

- Docs-only change to a single markdown skill file. Reviewer can confirm
by reading the diff in `skills/paperclip-dev/SKILL.md`:
  - New `## Forks — Prefer Pushing to a User Fork` section sits between
    `## Worktrees` and `## Pull Requests`
  - New row appended to the `## Common Mistakes` table
- No tests, no build, no runtime behavior affected.

## Risks

- Low risk. Documentation-only edit. The instructions are advisory —
they only change agent behavior on future runs that read the skill.

## Model Used

- Provider: Anthropic (Claude)
- Model ID: `claude-opus-4-7` (Claude Opus 4)
- Capabilities: tool use (file read/edit, shell, git, gh CLI), extended
reasoning
- Context: invoked via Claude Code / Paperclip heartbeat for issue
PAPA-139

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (N/A — docs-only change; no
test surface)
- [x] I have added or updated tests where applicable (N/A)
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — no UI change)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 22:19:07 -07:00
Dotta 82e257c7ba Cancel stale queued heartbeats when issue graph changes (PAP-2314) (#4534)
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:17:38 -05:00
Devin Foley 868d08903e test: isolate CLI company import e2e state (#4560)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and its
CLI import/export path is part of how operators move company state
safely between environments.
> - The `paperclipai company import/export` e2e test is supposed to
validate that portability flow inside a hermetic harness, not against a
developer's live Paperclip home.
> - This regression showed nested CLI subprocesses could silently fall
back to ambient `PAPERCLIP_*` state and mutate a real local instance by
creating extra companies such as `CLI-1-Roundtrip-Test`.
> - The first job was to pin the test subprocesses to isolated config,
home, instance, auth, and context paths, and to add a regression
assertion that proves the nested CLI writes stay inside the test-owned
state.
> - Once the PR was up, CI and Greptile exposed two follow-on issues
that were blocking merge: plugin SDK typecheck bootstrap was racing
across packages in fresh CI, and the new lock helper needed one more fix
to release its lock on failure.
> - This pull request therefore ends up doing two tightly related
things: fixing the original CLI isolation leak, and hardening the
supporting typecheck/bootstrap path enough for the fix to verify cleanly
in CI.
> - The benefit is that the portability e2e test is now actually
isolated, and the PR verification path is stable enough to catch
regressions instead of introducing its own nondeterministic failures.

## What Changed

- Hardened `cli/src/__tests__/company-import-export-e2e.test.ts` so
nested CLI subprocesses re-seed isolated `PAPERCLIP_CONFIG`,
`PAPERCLIP_HOME`, `PAPERCLIP_INSTANCE_ID`, `PAPERCLIP_CONTEXT`,
`PAPERCLIP_AUTH_STORE`, and throwaway `HOME` values instead of falling
back to ambient machine state.
- Added a regression assertion around `paperclipai context set --json`,
then cleared the temporary `context.json` so the isolation check and the
later export/import flow stay independent.
- Passed the same isolated `HOME` into the server subprocess so both
sides of the e2e harness are symmetric.
- Introduced locking in `scripts/ensure-plugin-build-deps.mjs` and
switched the server/plugin example `typecheck` scripts to use that
helper instead of launching concurrent raw `@paperclipai/plugin-sdk`
builds.
- Fixed the helper failure path so it releases the lock before exiting
non-zero, which prevents stale-lock timeouts during parallel typecheck
runs.
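
To make the isolation idea concrete, here is a sketch of building a subprocess environment that cannot fall back to ambient state. The variable names come from the list above; the helper name `isolatedCliEnv`, the file layout under `root`, the instance ID value, and the choice to drop every inherited `PAPERCLIP_*` variable are assumptions for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Build a subprocess environment rooted in a test-owned directory.
// All paths under `root` are hypothetical.
function isolatedCliEnv(ambient: Env, root: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(ambient)) {
    // Assumption: drop every inherited PAPERCLIP_* variable, not only the re-seeded ones.
    if (value === undefined || key.startsWith("PAPERCLIP_")) continue;
    env[key] = value;
  }
  return {
    ...env,
    HOME: `${root}/home`, // throwaway HOME so dotfiles cannot leak in
    PAPERCLIP_CONFIG: `${root}/config.json`,
    PAPERCLIP_HOME: `${root}/paperclip-home`,
    PAPERCLIP_INSTANCE_ID: "e2e-isolated",
    PAPERCLIP_CONTEXT: `${root}/context.json`,
    PAPERCLIP_AUTH_STORE: `${root}/auth.json`,
  };
}
```

Passing the result as the complete `env` of both the CLI and server subprocesses, rather than merging it over the parent environment, is what keeps the two sides of the harness symmetric.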

## Verification

- `pnpm vitest run cli/src/__tests__/company-import-export-e2e.test.ts
--project paperclipai`
- `pnpm --filter paperclipai typecheck`
- `pnpm -r typecheck`
- PR checks now pass on the current head, including `policy`, `verify`,
`e2e`, `security/snyk`, and `Greptile Review`.

## Risks

- Low risk. There is no product-facing behavior change; the isolation
fix is scoped to test harness code in the CLI e2e suite.
- The CI stabilization changes only affect bootstrap/typecheck helper
paths for the server and plugin/example packages, but they do touch
shared verification plumbing; the main risk is changing how fresh build
artifacts are prepared in local/CI typecheck runs.

## Model Used

- Anthropic Claude via Paperclip `claude_local`, model
`claude-opus-4-7`, high-effort local coding agent, used for the initial
implementation and first peer-reviewed verification.
- OpenAI Codex via Paperclip `codex_local`, model `gpt-5.4`, high
reasoning-effort local coding agent with tool use, used for CI triage,
Greptile follow-up fixes, verification, and PR maintenance.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 19:10:01 -07:00
Devin Foley 1d9f7a5149 Fix flaky heartbeat recovery teardown CI failure (#4559)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The linked CI job is in the server test/recovery path, where
heartbeat runs and issue cleanup need to leave the control plane in a
consistent state even when retries fail.
> - In this case the failure was not runtime product behavior but test
teardown behavior inside `heartbeat-process-recovery.test.ts`.
> - The failing GitHub Actions job showed a foreign-key race on
`company_skills_company_id_companies_id_fk` while the test tried to
delete the parent company record.
> - The surrounding teardown code already uses bounded retry cleanup for
other dependent tables (`issues`, `heartbeatRuns`, and `agents`) because
this test file intentionally exercises asynchronous recovery flows.
> - This pull request applies that same retry pattern to the final
`db.delete(companies)` step, re-clearing `companySkills` before each
retry.
> - The benefit is a targeted fix for the CI flake without changing
runtime behavior or expanding the scope beyond the failing teardown
path.

## What Changed

- Wrapped the final `db.delete(companies)` call in
`server/src/__tests__/heartbeat-process-recovery.test.ts` with the same
5-attempt retry pattern already used elsewhere in that teardown.
- Re-cleared `companySkills` before each company-delete retry so
late-arriving FK-dependent rows do not mask the real test result.
- Verified the fix against the originally failing
`heartbeat-process-recovery` test file and the broader `pnpm test:run`
command under CI-like env conditions.
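
A sketch of the bounded retry shape described above, assuming a hypothetical `deleteWithRetry` helper with injected callbacks; the real teardown calls `db.delete(...)` directly and its exact delay and error handling are not shown in this description.

```typescript
// Hypothetical helper: re-clear dependent rows, then try the parent delete,
// up to `maxAttempts` times. Returns the attempt number that succeeded.
async function deleteWithRetry(
  clearDependents: () => Promise<void>,
  deleteParent: () => Promise<void>,
  maxAttempts = 5,
  delayMs = 50,
): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await clearDependents(); // e.g. remove late-arriving FK-dependent rows
      await deleteParent(); // e.g. delete the parent company row
      return attempt;
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // surface the final failure instead of hiding it
}
```

Rethrowing after the last attempt matters: the retry absorbs a transient foreign-key race, but a persistent failure still fails the test.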

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`
- Re-ran `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts` multiple times
locally; the previously failing teardown stayed green.
- `env -u PAPERCLIP_API_URL -u PAPERCLIP_RUNTIME_API_URL -u
PAPERCLIP_RUN_ID -u PAPERCLIP_TASK_ID -u PAPERCLIP_AGENT_ID -u
PAPERCLIP_COMPANY_ID -u PAPERCLIP_API_KEY -u PAPERCLIP_WAKE_REASON -u
PAPERCLIP_WAKE_COMMENT_ID -u PAPERCLIP_WAKE_PAYLOAD_JSON -u
PAPERCLIP_APPROVAL_ID -u PAPERCLIP_APPROVAL_STATUS pnpm test:run`

## Risks

- Low risk. The change is test-only and scoped to teardown retry
behavior in a single server test file.
- If the underlying async cleanup behavior changes again, this test
could still become flaky in a different way, but this PR addresses the
specific FK race seen in the linked CI job.

## Model Used

- OpenAI `gpt-5.4` via Paperclip `codex_local`, high reasoning mode,
with tool use for shell, git, HTTP API calls, and patch application.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:30:20 -07:00
Devin Foley 8145141c55 Fix external issue URL rewriting in markdown (#4558)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Issue and comment rendering is part of the board UI where humans
supervise and inspect agent work.
> - External Paperclip issue URLs can appear in comments as references
to other runs, review threads, or remote test environments.
> - Those links must preserve their full destination, including origin,
port, and `#comment-...` fragments, or the operator is taken to the
wrong place.
> - The bug here was that absolute `http(s)` issue URLs were being
normalized into internal `/issues/...` routes in the markdown path.
> - This pull request stops rewriting absolute URLs while keeping
internal issue-reference behavior for relative paths and identifiers.
> - The benefit is that authored external links now navigate exactly
where the operator expects, especially for remote test and
comment-deep-link workflows.

## What Changed

- Stopped `ui/src/lib/issue-reference.ts` from treating absolute
`http(s)` URLs as internal issue paths.
- Added defense-in-depth in `ui/src/lib/mention-chips.ts` so absolute
`http(s)` URLs are never reclassified as issue mention chips.
- Updated `ui/src/lib/issue-reference.test.ts` to cover absolute
Paperclip URLs with preserved origin, port, and comment hash.
- Updated `ui/src/components/MarkdownBody.test.tsx` to assert the
reported URL renders as an external link, not an internal `/issues/...`
href.
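
A simplified sketch of the rule, not the actual contents of `issue-reference.ts`: absolute `http(s)` URLs are returned untouched (here signalled by `null`), while bare identifiers and relative paths still resolve to internal routes. The function name, regexes, and accepted path shapes are assumptions.

```typescript
// Hypothetical patterns for illustration.
const ABSOLUTE_HTTP_URL = /^https?:\/\//i;
const ISSUE_IDENTIFIER = /^[A-Z]+-\d+$/;
const RELATIVE_ISSUE_PATH = /^\/(?:[A-Z]+\/)?issues\/([A-Z]+-\d+)(#.*)?$/;

// Returns an internal href for issue references, or null when the input
// should be left alone (including every absolute http(s) URL).
function toInternalIssueHref(ref: string): string | null {
  const trimmed = ref.trim();
  if (ABSOLUTE_HTTP_URL.test(trimmed)) return null; // keep origin, port, and hash intact
  if (ISSUE_IDENTIFIER.test(trimmed)) return `/issues/${trimmed}`;
  const match = RELATIVE_ISSUE_PATH.exec(trimmed);
  if (match) return `/issues/${match[1]}${match[2] ?? ""}`;
  return null;
}
```

Checking for the absolute-URL case first is the important part: it runs before any pattern that could match the `/issues/...` suffix of an external link.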

## Verification

- `pnpm exec vitest run ui/src/lib/issue-reference.test.ts
ui/src/components/MarkdownBody.test.tsx`
- Expected result: `2` files passed, `37` tests passed.
- Manual spot-check from the issue report path: a URL like
`http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-...`
should remain an external link with its full destination preserved.

## Risks

- Low risk. The change narrows when Paperclip rewrites URLs, so the main
risk is that some existing workflow depended on absolute `http(s)`
Paperclip URLs being converted into internal issue links. The added
regression coverage keeps the new pass-through behavior from regressing
silently.

## Model Used

- OpenAI Codex local agent via Paperclip `codex_local`
- Backing model family: GPT-5-based Codex runtime
- Exact backend model ID/version: not exposed by this adapter/runtime
surface
- Context window: not exposed by this adapter/runtime surface
- Capabilities used: tool use, shell command execution, code editing,
git operations, and local test execution

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:19:23 -07:00
Devin Foley 54ab0d24cd Fix disappearing issue comments (#4557)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue detail
pages are a primary surface for understanding agent work and human
feedback.
> - The relevant subsystem here is the issue comments/chat experience
across the React issue detail page and the server comment pagination
API.
> - Long issue threads were only surfacing the newest page of comments
at first render, which hid earlier human and agent messages behind extra
pagination.
> - The first UI fix exposed that the descending cursor path on the
server could also fail for older-page fetches, leaving the chat tab
stuck on an infinite "Loading earlier comments..." state.
> - This needed to be addressed in both layers so the chat tab can
surface earlier conversation history without manual recovery and without
server errors.
> - This pull request auto-loads earlier comment pages in the issue
detail chat view and fixes the descending cursor predicate used by issue
comment pagination.
> - The benefit is that long-running issues like `PAPA-103` now show the
missing conversation history near the top of the chat surface instead of
hiding it or failing to load it.

## What Changed

- Auto-load earlier issue comment pages in the issue detail chat tab
until the thread reaches a 150-comment cap or there are no older
comments left.
- Add UI-side guard logic and regression coverage for optimistic issue
comment pagination so the autoload behavior stops cleanly.
- Replace the raw SQL descending cursor predicate in
`issueService.listComments` with typed Drizzle comparisons for the
`(createdAt, id)` anchor tuple.
- Add a server regression test that paginates earlier comments in
descending order from an anchor comment.
- Smoke-test the exact previously failing seeded `PAPA-103` cursor path
on the isolated dev instance used for review.
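
For readers unfamiliar with tuple cursors, the sketch below shows the standard keyset predicate for "older than the anchor" over `(createdAt, id)`, plus the autoload stop condition. It runs in memory rather than through Drizzle; the helper names are hypothetical, and the exact predicate in `issueService.listComments` may differ.

```typescript
type CommentRow = { id: string; createdAt: Date };

const MAX_AUTOLOADED_COMMENTS = 150; // cap stated in this PR

// True when `row` sorts strictly after `anchor` in descending (createdAt, id) order,
// i.e. it is older, or equally old with a smaller id as the tie-breaker.
function isOlderThanAnchor(row: CommentRow, anchor: CommentRow): boolean {
  const rowTime = row.createdAt.getTime();
  const anchorTime = anchor.createdAt.getTime();
  return rowTime < anchorTime || (rowTime === anchorTime && row.id < anchor.id);
}

// Newest first, with id as the tie-breaker so paging is deterministic.
function compareDesc(a: CommentRow, b: CommentRow): number {
  const byTime = b.createdAt.getTime() - a.createdAt.getTime();
  if (byTime !== 0) return byTime;
  return a.id < b.id ? 1 : a.id > b.id ? -1 : 0;
}

function pageOlderComments(all: CommentRow[], anchor: CommentRow, limit: number): CommentRow[] {
  return all.filter((row) => isOlderThanAnchor(row, anchor)).sort(compareDesc).slice(0, limit);
}

// UI-side guard: keep fetching only while older pages exist and the cap is not reached.
function shouldAutoLoadEarlier(loadedCount: number, hasOlder: boolean): boolean {
  return hasOlder && loadedCount < MAX_AUTOLOADED_COMMENTS;
}
```

The tie-breaker on `id` is what prevents comments that share a timestamp from being skipped or repeated across pages.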

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- Manual smoke against seeded `PAPA-103` data on the isolated dev
server:
  - `GET /api/issues/PAPA-103/comments?order=desc&limit=50` returns `200`
  - `GET /api/issues/PAPA-103/comments?after=765d3609-edc6-4d11-a8fe-d466affbe85d&order=desc&limit=50`
    now returns `200` with 50 comments instead of `500`

## Risks

- Moderate UI/perf risk on very large threads because the chat tab now
prefetches multiple earlier pages on mount; the cap is intentionally
limited to 150 comments to bound that work.
- Low API risk because the server fix only changes the cursor predicate
construction for anchor-based comment pagination, but any mistake there
would affect older-comment paging order.

> I checked `ROADMAP.md` before opening this PR and this bug fix does
not duplicate planned core work.

## Model Used

- OpenAI Codex coding agent in the Paperclip local adapter environment.
The exact backend model ID and context window were not exposed
in-session. Tool-assisted workflow included shell execution, git/GitHub
CLI, local test execution, and targeted code edits.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 16:23:53 -07:00
Devin Foley b2496c8067 fix(auth): trust allowed hostname port variants on detected listen port (#4554)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
authenticated board access has to be predictable across local and
worktree deployments.
> - This change sits in the authenticated-mode server startup and Better
Auth origin-trust wiring.
> - The original auth branch fixed one real gap by adding port-qualified
trusted origins for allowed hostnames on non-default ports.
> - Review of that branch found a second-order bug: trusted origins were
still derived from the configured port before startup detected the
actual listen port.
> - In isolated worktrees, that meant a common `3100 -> 3101` port shift
could still leave Better Auth trusting the stale origin.
> - This pull request keeps the original allowed-hostname port-variant
fix, then moves trust derivation onto the resolved listen port and adds
regression coverage around startup wiring.
> - The benefit is that authenticated sessions keep working on allowed
private hostnames even when Paperclip has to auto-shift to a different
local port.

## What Changed

- Added `:port` trusted-origin variants for authenticated-mode
`allowedHostnames` when Paperclip runs on non-default ports.
- Changed authenticated startup so `listenPort` is detected before
Better Auth initialization, and explicit auth base URLs are rewritten
before auth startup.
- Updated `deriveAuthTrustedOrigins()` to accept the resolved listen
port so Better Auth trusts the actual browser origin instead of the
stale configured port.
- Added focused regression coverage in
`server/src/__tests__/better-auth.test.ts` and
`server/src/__tests__/server-startup-feedback-export.test.ts`.
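
The port-variant behavior described above can be sketched roughly as follows. This is an illustration only, not the PR's `deriveAuthTrustedOrigins()`: the function name, its signature, and the choice of 80/443 as "default" ports are all assumptions.

```typescript
// Hypothetical sketch; not the actual deriveAuthTrustedOrigins() from the PR.
// Assumption: ports 80 and 443 count as "default" and need no :port variant.
const DEFAULT_PORTS_SKETCH = new Set<number>([80, 443]);

function trustedOriginsSketch(allowedHostnames: string[], listenPort: number): string[] {
  const origins = new Set<string>();
  for (const hostname of allowedHostnames) {
    for (const scheme of ["http", "https"]) {
      origins.add(`${scheme}://${hostname}`);
      // Use the port the server actually bound to, not the configured one.
      if (!DEFAULT_PORTS_SKETCH.has(listenPort)) {
        origins.add(`${scheme}://${hostname}:${listenPort}`);
      }
    }
  }
  return Array.from(origins);
}

// Configured port was 3100, but startup auto-shifted to 3101.
const shiftedOrigins = trustedOriginsSketch(["box.tailnet.example"], 3101);
console.log(shiftedOrigins);
```

Passing the resolved port in as a parameter, rather than reading it from config inside the function, is what keeps a `3100 -> 3101` shift from leaving a stale origin in the trusted list.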

## Verification

- `pnpm exec vitest run server/src/__tests__/better-auth.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts`
- Reviewer re-check: reviewed commits `380f5b9f` and `092bb34c` after
the follow-up fix landed and found no remaining issues.

## Risks

- Low risk: this only affects authenticated-mode origin derivation and
startup ordering around detected listen ports.
- Main behavioral shift: startup no longer mutates `config.port` to the
selected port; it now carries `requestedListenPort` separately and uses
`listenPort` where runtime behavior needs the resolved value.
- If another path was implicitly relying on `config.port` being
overwritten during startup, that path would need follow-up, though the
current startup/test coverage did not reveal one.

> I checked `ROADMAP.md` and did not find an overlapping planned core
work item for this auth trusted-origin port handling fix.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agents for implementation and
review. Exact backend model ID/context window were not surfaced in this
run context; work was performed through the Codex local adapter with
tool use, code execution, and review passes.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 15:40:39 -07:00
Devin Foley 08af830430 Tighten publicBaseUrl port rewriting (#4553)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
local and authenticated deployment behavior has to stay predictable
under port rebinding and worktree isolation.
> - This change sits in the server/worktree configuration path that
derives runtime URLs and auth origins from `auth.publicBaseUrl`.
> - The original hostname-port rewrite change fixed one real gap for
private/tailnet host:port worktree setups, but it widened the rewrite
rule too far.
> - Rewriting every explicit `auth.publicBaseUrl` can corrupt public or
reverse-proxy URLs by turning a stable origin like
`https://paperclip.example` into a local listen-port URL.
> - Paperclip's auth and trusted-origin handling depend on that URL
staying semantically correct, so this had to be narrowed before merge.
> - This pull request tightens the rewrite rule to explicit-port URLs
only and adds regression coverage across the CLI helper, worktree config
persistence, and server startup path.
> - The benefit is that private host:port worktree flows still work,
while public/default-port URLs remain stable and safe.

## What Changed

- Tightened `rewriteLocalUrlPort` in `cli/src/commands/worktree-lib.ts`,
`server/src/worktree-config.ts`, and `server/src/index.ts` so it only
rewrites URLs that already include an explicit port.
- Removed the old loopback-only hostname gate from the CLI/worktree
helpers and replaced it with the more precise `parsed.port` guard.
- Updated CLI helper coverage to assert that explicit-port non-loopback
URLs still rewrite while no-port public URLs stay unchanged.
- Expanded `server/src/__tests__/worktree-config.test.ts` to cover
explicit-port rewrite and no-port stability for both persisted worktree
config and in-memory runtime port selection.
- Added startup-path coverage in
`server/src/__tests__/server-startup-feedback-export.test.ts` for
`detect-port` rebinding with both explicit-port and no-port
`auth.publicBaseUrl` values.
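
A minimal sketch of the explicit-port guard, assuming the standard WHATWG `URL` class. The name `rewritePortIfExplicit` and its signature are hypothetical; the real `rewriteLocalUrlPort` may differ in details.

```typescript
// Hypothetical sketch of an explicit-port-only rewrite; the real
// rewriteLocalUrlPort may differ in signature and details.
function rewritePortIfExplicit(rawUrl: string, listenPort: number): string {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return rawUrl; // not a parseable URL: leave untouched
  }
  // URL.port is "" when the URL declares no port (or only the scheme default).
  if (parsed.port === "") return rawUrl;
  parsed.port = String(listenPort);
  return parsed.toString();
}

// Public origin without a port stays stable.
console.log(rewritePortIfExplicit("https://paperclip.example", 3101));
// Private host:port origin follows the detected listen port.
console.log(rewritePortIfExplicit("http://box.tailnet.example:3100", 3101));
```

One caveat of this approach: `URL` normalizes an explicitly written scheme-default port (for example `https://host:443`) to an empty `port`, so such a URL is treated the same as a no-port URL here.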

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `npx vitest run
server/src/__tests__/server-startup-feedback-export.test.ts`
- `npx vitest run cli/src/__tests__/worktree.test.ts
server/src/__tests__/worktree-config.test.ts`
- All of the above were run locally in this issue worktree and passed.

## Risks

- Low risk. The behavior change is deliberately narrower than the
reviewed broad-host rewrite and is guarded by regression coverage for
both the explicit-port and no-port cases.
- The main remaining risk arises only if another code path starts
depending on port rewriting for URLs that never declared a port, which
would be a separate bug.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex local agent using `gpt-5.4` with high reasoning effort,
tool use, shell execution, and file editing.
- Anthropic Claude local agent using `claude-opus-4-6` for follow-up
code review approval on the implementation issue.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 14:29:22 -07:00
Devin Foley d47ffa87f0 Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The local adapter layer is responsible for turning Paperclip runtime
context into the environment seen by the child agent process.
> - The CEO onboarding bundle tells the agent where to read and write
its persistent memory and fact files.
> - That bundle was using `./memory/...` and `./life/...`, which only
works when the process cwd happens to equal the agent home directory.
> - At the same time, six local adapters each duplicated the same
workspace-env propagation logic, including `AGENT_HOME`, which makes
this contract easy to drift.
> - This pull request fixes the CEO instructions to use
`$AGENT_HOME/...` and centralizes workspace-env propagation in one
shared helper with shared tests.
> - The benefit is a real bug fix for agent memory paths plus a single
tested contract that makes future built-in adapter work less likely to
forget `AGENT_HOME`.

## What Changed

- Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use
`$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of
cwd-relative `./memory/...` and `./life/...`.
- Added `applyPaperclipWorkspaceEnv(...)` in
`packages/adapter-utils/src/server-utils.ts` to centralize
`PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation.
- Added shared helper coverage in
`packages/adapter-utils/src/server-utils.test.ts` for both populated and
skip-empty cases.
- Switched the built-in local adapters (`claude_local`, `codex_local`,
`cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to
the shared helper instead of inline env assignment blocks.
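
For illustration, a shared helper of this kind might look like the sketch below. It is not the actual `applyPaperclipWorkspaceEnv(...)`: the input shape and the variable name `PAPERCLIP_WORKSPACE_CWD` are invented for the example; only `AGENT_HOME` and the `PAPERCLIP_WORKSPACE_` prefix come from the description above.

```typescript
// Hypothetical input shape; only AGENT_HOME and the PAPERCLIP_WORKSPACE_ prefix
// come from the PR text. PAPERCLIP_WORKSPACE_CWD is an invented example name.
interface WorkspaceEnvInputSketch {
  workspaceCwd?: string | null;
  agentHome?: string | null;
}

function applyWorkspaceEnvSketch(
  env: Record<string, string>,
  input: WorkspaceEnvInputSketch,
): Record<string, string> {
  const candidates: Array<[string, string | null | undefined]> = [
    ["PAPERCLIP_WORKSPACE_CWD", input.workspaceCwd],
    ["AGENT_HOME", input.agentHome],
  ];
  for (const [key, value] of candidates) {
    // Set only non-empty strings so unset values never reach the child process.
    if (typeof value === "string" && value.length > 0) {
      env[key] = value;
    }
  }
  return env;
}

// An adapter would call the helper once instead of repeating inline assignments.
console.log(applyWorkspaceEnvSketch({}, { workspaceCwd: "", agentHome: "/srv/agents/ceo" }));
```

Keeping the skip-empty rule in one place is the point of the refactor: each adapter gets the same behavior without having to remember `AGENT_HOME` individually.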

## Verification

- `pnpm install`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- Result: 7 test files passed, 31 tests passed, 0 failures.

## Risks

- Low risk.
- The only behavioral surface is the shared env propagation refactor
across six adapters; if the helper diverged from prior semantics, an
adapter could miss a workspace env var.
- The shared helper test plus the affected adapter execute tests reduce
that risk, and the helper preserves the prior "set only non-empty
strings" behavior.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted
coding workflow with shell execution, file patching, git operations, and
API interaction. The exact backend model identifier and context window
are not surfaced by this local runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 13:57:35 -07:00
Devin Foley d1484551ee Add open-source hygiene note to paperclip-dev skill (#4541)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The `paperclip-dev` skill is part of the contributor and agent
workflow layer that tells developers how to work in this repository
safely.
> - That skill already references the public upstream `origin`, but it
did not explicitly say that pushes there must be treated as publishable
open-source output.
> - Without that reminder, contributors are more likely to leak secrets,
PII, private logs, machine-local config, or noisy throwaway git history
into the public repo.
> - This pull request adds a prominent `OPEN SOURCE HYGIENE` callout
near the top of the skill, before the git workflow guidance.
> - The benefit is clearer safety guidance for contributors and less
accidental disclosure or branch/commit noise on the upstream project.

## What Changed

- Added an `OPEN SOURCE HYGIENE` callout near the top of
`skills/paperclip-dev/SKILL.md`.
- Explicitly warned that anything pushed to `origin` must be
publishable.
- Called out avoiding secrets, API keys, PII, private logs,
machine-local config, and noisy throwaway branches or checkpoint
commits.

## Verification

- N/A

## Risks

- Low risk. This is a docs-only change in a skill file; the main risk is
wording tone or placement, not runtime behavior.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter, GPT-5-based
coding agent runtime. Exact backend serving model ID is not exposed in
this heartbeat environment. Tool use, shell execution, and patch
application were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 12:14:49 -07:00
Devin Foley 91333ec86f feat: add paperclip-dev skill with optional bundled skill support (#3854)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents working on the Paperclip codebase itself need guidance on dev
workflows: server lifecycle, worktrees, builds, database ops,
diagnostics
> - There was no bundled skill covering these workflows — agents had to
figure it out from scratch each time
> - Additionally, not every skill should be force-installed on every
agent — a dev-focused skill should be opt-in
> - This PR adds a `paperclip-dev` skill with `required: false`
frontmatter so it ships with Paperclip but isn't auto-installed
> - The skill's PR section references canonical files
(`.github/PULL_REQUEST_TEMPLATE.md`, `CONTRIBUTING.md`) instead of
duplicating their content, with gated instructions that force agents to
read those files before creating any PR
> - The benefit is that developers (human or agent) can opt in to
structured dev guidance without polluting the default agent skill set or
creating drift between duplicated docs

## What Changed

- Added `skills/paperclip-dev/SKILL.md` covering server management,
worktree lifecycle, builds, database ops, diagnostics, agent operations,
and common mistakes
- The Pull Requests section uses gated, reference-based instructions —
agents MUST read `.github/PULL_REQUEST_TEMPLATE.md` and
`CONTRIBUTING.md` before running `gh pr create`, with a brief checklist
of required section names (no content duplication)
- Updated `packages/adapter-utils/src/server-utils.ts` to respect
`required: false` frontmatter — optional skills are bundled but not
auto-installed on agents
- Added test in `server/src/__tests__/paperclip-skill-utils.test.ts`
verifying that optional skills are excluded from the default install set
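
The opt-in rule can be sketched as a simple filter. The types and function name below are hypothetical and do not reproduce `listPaperclipSkillEntries()` or `resolvePaperclipDesiredSkillNames()`; the second skill name is also made up for the example.

```typescript
// Hypothetical entry shape for illustration.
interface SkillEntrySketch {
  name: string;
  required?: boolean; // parsed from frontmatter; absent means required
}

function defaultSkillNamesSketch(entries: SkillEntrySketch[]): string[] {
  return entries
    // Only an explicit `required: false` opts a skill out of auto-install.
    .filter((entry) => entry.required !== false)
    .map((entry) => entry.name);
}

const bundledSkills: SkillEntrySketch[] = [
  { name: "example-core-skill" }, // hypothetical skill with no `required` field
  { name: "paperclip-dev", required: false },
];
console.log(defaultSkillNamesSketch(bundledSkills));
```

Comparing against `false` explicitly, instead of testing truthiness, is what lets skills without the field keep their existing behavior.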

## Verification

```bash
# Run tests
pnpm test

# Manual verification: create a fresh worktree without seeding
npx paperclipai worktree:make test-optional-skill --no-seed
cd ~/paperclip-test-optional-skill
eval "$(npx paperclipai worktree env)"
npx paperclipai run

# Verify paperclip-dev appears in company skill library but is NOT auto-assigned
# Call listPaperclipSkillEntries() — paperclip-dev should show required: false
# Call resolvePaperclipDesiredSkillNames() — paperclip-dev should NOT be in the default set

# Cleanup
npx paperclipai worktree:cleanup test-optional-skill
```

## Risks

- Low risk. The `required` field defaults to `true` when absent, so all
existing skills behave identically. Only the new `paperclip-dev` skill
sets `required: false`.

## Model Used

Claude Opus 4.6 (`claude-opus-4-6`) via Claude Code, with tool use and
extended context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 11:06:13 -07:00
Dotta c036bbfa98 Add first-class security agent role to taxonomy (#4532)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies, so agent
metadata is part of the platform's governance and audit surface.
> - The shared agent taxonomy in `@paperclipai/shared` is the source of
truth for allowed agent roles and their UI labels.
> - The current taxonomy lacks a `security` role, which causes Security
Engineer hires to collapse into `engineer`.
> - That breaks separation-of-duties evidence in telemetry and weakens
role-level audit fidelity even though it does not directly change
permissions.
> - This pull request adds `security` as a first-class shared role and
covers the prior rejection path with a regression test.
> - The benefit is that Security Engineer agents can now be persisted
and rendered under the correct role without schema or permission churn.

## What Changed

- Added `security` to `AGENT_ROLES` in
`packages/shared/src/constants.ts`.
- Added the `Security` display label to `AGENT_ROLE_LABELS` so existing
UI consumers render the new role automatically.
- Added a shared validator regression test proving `createAgentSchema`
accepts `role: "security"` and that the label stays stable.
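
As a rough sketch of the pattern, a role list and its label map can be kept in sync through a derived union type. The names below carry a `_SKETCH` suffix because they are illustrative; the real `AGENT_ROLES` contains more entries than the two shown, and the `Engineer` label is assumed.

```typescript
// Illustrative subset only; the real AGENT_ROLES list is longer.
const AGENT_ROLES_SKETCH = ["engineer", "security"] as const;
type AgentRoleSketch = (typeof AGENT_ROLES_SKETCH)[number];

// Record<AgentRoleSketch, string> makes a missing label a compile-time error.
const AGENT_ROLE_LABELS_SKETCH: Record<AgentRoleSketch, string> = {
  engineer: "Engineer", // label assumed for the example
  security: "Security",
};

function isAgentRoleSketch(value: string): value is AgentRoleSketch {
  return (AGENT_ROLES_SKETCH as readonly string[]).indexOf(value) !== -1;
}

const requestedRole = "security";
if (isAgentRoleSketch(requestedRole)) {
  console.log(AGENT_ROLE_LABELS_SKETCH[requestedRole]);
}
```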

## Verification

- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/shared exec vitest run
src/adapter-types.test.ts`

## Risks

- Low risk. This is a shared enum expansion with no database migration
and no permission-model change.
- Residual risk: this PR does not backfill existing agents already
persisted as `engineer`; it only fixes new validations and labels going
forward.

> I checked `ROADMAP.md`/`doc` for overlap and did not find an existing
planned item covering this taxonomy fix.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, with tool use and local
code execution enabled. The runtime did not surface a separate
context-window identifier in agent metadata.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 07:52:05 -05:00
Dotta df425fde96 Present ordered sub-issues as a workflow checklist (#4523)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators use issue detail pages and child issue lists to understand
multi-step execution plans.
> - Ordered sub-issues currently read like a flat table, so dependency
chains and current next steps are harder to scan.
> - The branch work adds a workflow-oriented presentation for child
issues without changing the single-assignee task model.
> - This pull request makes ordered sub-issues read more like a progress
checklist while preserving normal issue list controls.
> - The benefit is that operators can see completed steps, active work,
blocked follow-ups, and dependency order at a glance.

## What Changed

- Added workflow sorting utilities and tests for dependency-aware child
issue ordering.
- Added sub-issue progress summary, checklist numbering, current-step
affordances, blocker context, and done-state de-emphasis in the issue
list UI.
- Wired issue detail sub-issue panels to use the workflow sort/progress
checklist presentation.
- Updated issue service behavior/tests for child issue ordering inputs
used by the UI.
- Added a Storybook visual review fixture and screenshot helper for the
sub-issue workflow checklist surface.
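
One way to picture dependency-aware ordering is a stable topological pass over the child issues. The sketch below is an assumption about the general technique, not the PR's sorting utilities; the `blockedBy` field and the function name are hypothetical.

```typescript
interface ChildIssueSketch {
  id: string;
  blockedBy: string[]; // ids of sibling issues that must finish first (hypothetical field)
}

function workflowSortSketch(issues: ChildIssueSketch[]): ChildIssueSketch[] {
  const knownIds = new Set(issues.map((issue) => issue.id));
  const placed = new Set<string>();
  const ordered: ChildIssueSketch[] = [];
  const remaining = issues.slice();

  while (remaining.length > 0) {
    // Pick the first issue (in input order) whose known blockers are all placed.
    let index = remaining.findIndex((issue) =>
      issue.blockedBy.every((id) => !knownIds.has(id) || placed.has(id)),
    );
    // A cycle leaves no ready issue; fall back to input order so the loop terminates.
    if (index === -1) index = 0;
    const next = remaining.splice(index, 1)[0];
    if (!next) break;
    placed.add(next.id);
    ordered.push(next);
  }
  return ordered;
}

const sortedIssues = workflowSortSketch([
  { id: "deploy", blockedBy: ["build"] },
  { id: "design", blockedBy: [] },
  { id: "build", blockedBy: ["design"] },
]);
console.log(sortedIssues.map((issue) => issue.id));
```

Blockers that are not siblings are ignored here, and cycles degrade to input order rather than throwing; whether the real utilities make the same choices is not stated in the PR.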

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issues-service.test.ts
ui/src/components/IssueRow.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/pages/IssueDetail.test.tsx
ui/src/lib/issue-detail-subissues.test.ts
ui/src/lib/workflow-sort.test.ts`
- Result: 6 test files passed, 55 tests passed, 34 embedded Postgres
issue-service tests skipped because `@embedded-postgres/darwin-x64` is
unavailable on this host.
- Visual review: generated Storybook screenshots from the existing local
Storybook server on port 6006 with `node
scripts/screenshot-subissues.mjs /tmp/pap-2189-subissues-screens
http://localhost:6006`.
- Screenshot artifacts:
- Desktop dark: ![Desktop
dark](doc/assets/pap-2189/desktop-1440x900-dark.png)
- Desktop light: ![Desktop
light](doc/assets/pap-2189/desktop-1440x900-light.png)
- Mobile dark: ![Mobile
dark](doc/assets/pap-2189/mobile-390x844-dark.png)
- Mobile light: ![Mobile
light](doc/assets/pap-2189/mobile-390x844-light.png)
- Local Storybook note: starting a second Storybook process selected
port 6008 because 6006 was occupied, then Vite failed with an esbuild
host/binary version mismatch (`0.25.12` host vs `0.27.3` binary). The
already-running Storybook server on 6006 served the fixture successfully
for screenshots.

## Risks

- Medium UI risk: the issue list now has additional sub-issue-specific
visual states, so dense lists should be checked for spacing and
scanability.
- Low ordering risk: workflow sorting is covered by focused unit tests,
but unusual dependency topologies may still need reviewer attention.
- No migration risk: this PR does not add database migrations or touch
`pnpm-lock.yaml`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window is runtime-provided and not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 07:36:49 -05:00
Devin Foley 40782f703d Fix release packaging for standalone public packages (#4494)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and the
sandbox-provider work just moved E2B into a standalone publishable
plugin package.
> - That plugin is intentionally excluded from the root pnpm workspace
so it can model third-party install behavior without forcing lockfile
churn in the main repo.
> - The merged architecture change exposed a follow-up release problem:
the canary publish workflow tried to publish `@paperclipai/plugin-e2b`,
but the tarball had no `dist/` payload because standalone public
packages were not being built in the release path.
> - That means the release pipeline needed a packaging fix in core
release tooling, not another architectural change in the sandbox
provider itself.
> - This pull request adds a generic release step for public packages
that live outside the pnpm workspace, instead of hardcoding E2B-specific
behavior into the release script.
> - The benefit is that standalone publishable packages can be built and
packed correctly during release, including future sandbox-provider
plugins that follow the same pattern.

## What Changed

- Added `scripts/build-standalone-public-packages.mjs` to discover
public packages outside the pnpm workspace, run a clean package-local
install, and build them before publish.
- Updated `scripts/release.sh` to invoke that helper immediately after
the normal workspace build step.
- Kept the behavior generic by driving off the existing public package
map and pnpm workspace patterns rather than special-casing
`@paperclipai/plugin-e2b`.
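
The discovery step can be illustrated as a set difference: public package directories that no workspace pattern covers. This is a simplified sketch, not the contents of `scripts/build-standalone-public-packages.mjs`. The function names are hypothetical, and the matcher handles only `*` (one path segment) and leading `!` negation, which is narrower than real pnpm workspace globbing.

```typescript
// Simplified glob: `*` matches exactly one path segment. Real pnpm patterns
// support more syntax; this is an assumption made to keep the sketch short.
function workspaceGlobToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, "[^/]+");
  return new RegExp(`^${escaped}$`);
}

function isWorkspaceMember(dir: string, patterns: string[]): boolean {
  let included = false;
  for (const raw of patterns) {
    const negated = raw.startsWith("!");
    const pattern = negated ? raw.slice(1) : raw;
    // Later patterns win, so a negation can exclude an earlier match.
    if (workspaceGlobToRegExp(pattern).test(dir)) included = !negated;
  }
  return included;
}

function standalonePublicPackages(publicDirs: string[], patterns: string[]): string[] {
  return publicDirs.filter((dir) => !isWorkspaceMember(dir, patterns));
}

console.log(
  standalonePublicPackages(
    ["packages/shared", "packages/plugins/sandbox-providers/e2b"],
    ["packages/*", "packages/plugins/*"], // example patterns, not the repo's actual list
  ),
);
```

Each directory returned would then get its own clean install and build before packing, which is the part the release script delegates to the helper.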

## Verification

- `rm -rf packages/plugins/sandbox-providers/e2b/dist`
- `node ./scripts/build-standalone-public-packages.mjs`
- `cd packages/plugins/sandbox-providers/e2b && npm pack --dry-run`
- Confirm the tarball now includes the rebuilt `dist/` files instead of
only `README.md` / `package.json`

## Risks

- Low risk: this only changes the release build path for public packages
outside the pnpm workspace.
- The helper performs a clean package-local install for each standalone
public package, so release time may increase slightly as more such
packages are added.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 12:16:23 -07:00