## Thinking Path
> - Paperclip orchestrates AI agents across isolated execution
workspaces; the local cwd is the only persistence boundary between runs.
> - Workspace lifecycle (worktree_prepare → execute →
workspace_finalize) and the wake/accept flow are what guarantee that
dependent issues see a consistent worktree.
> - PAPA-380 / PAPA-431 / PAPA-432 / PAPA-440 surfaced three holes in
that contract: silent env reuse across assignees, dependent wakes firing
before finalize, and `issue.interaction.accept` advancing before
finalize landed.
> - PAPA-441 / PAPA-442 then needed to document the "no remote git"
contract and prevent future adapter/runtime code from quietly
reintroducing `git push` as a backdoor sync.
> - This pull request lands those server fixes, the static
`check-no-git-push` enforcement, the AUTHORING.md cross-link, and the
Cody-review follow-ups on the PAPA-430 thread.
> - The benefit is that finalize is a real barrier — board accepts,
dependent wakes, and operator-set env all respect it — and adapter code
can't bypass it via raw `git push`.
## What Changed
- **server (PAPA-380, PAPA-431):** `execution-workspace-policy` refuses
silent env reuse when the assignee's resolved env disagrees with the
workspace it would inherit. The inheritance protection is now scoped to
the actual inheritance signal — explicit issue-level `environmentId` is
honored even when the agent's default env is `null`.
- **server (PAPA-432):** `heartbeat.ts` gates dependent wakes on
`listUnfinalizedExecutionWorkspaceIds`, and writes a
`workspace_finalize` row on the succeeded path. Write failures now
surface instead of being swallowed so dependents aren't silently
stranded behind a missing row.
- **server (PAPA-440):** `issue-thread-interactions.acceptInteraction`
adds a workspace_finalize precondition for `request_confirmation` (not
`suggest_tasks`). Accept returns 409 if finalize hasn't succeeded for
the latest workspace operation.
- **ci (PAPA-442):** new `scripts/check-no-git-push.mjs` static check
scans `packages/adapters/`, `packages/adapter-utils/`, `server/src/`,
and `cli/src/` for any `git push` invocation (string or args-array).
Wired into the `policy` PR job and `test:release-registry`. Operators
can opt in per-call with `// paperclip:allow-git-push: <reason>`.
Release scripts are out of scope by design.
- **docs (PAPA-441):** `AUTHORING.md` documents the no-remote-git
contract and cross-links the static check so adapter authors learn the
rule and the enforcement together.
- **review follow-up (PAPA-430, Cody):** three fixes — env resolver bug,
accept-gate scope (request_confirmation only), and finalize record write
on the succeeded path.
## Verification
- `pnpm exec vitest run
server/src/__tests__/execution-workspace-policy.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts` → 33/33
pass
- `node scripts/check-no-git-push.test.mjs` → check covers string form,
args-array form, comment exclusions, and per-line allow-comment.
- Manual: server compiles; the policy job runs the check in <1s before
heavier jobs.
## Risks
- **Behavioral shift in accept:** boards accepting
`request_confirmation` while finalize is in-flight now get 409s. This is
intentional — they can retry — but it changes timing on a hot path.
`suggest_tasks` is unaffected.
- **Workspace policy:** the env-reuse refusal is a new error path.
Issues that previously silently reused an env from a different-assignee
workspace will now fail-loud; the resolver still honors explicit
issue-level `executionWorkspaceSettings.environmentId`.
- **CI rule:** any future legitimate `git push` in scoped dirs must be
marked with the allow-comment, which is the intended ergonomic.
## Model Used
- Claude Opus 4.7 (`claude-opus-4-7`, extended thinking), via Claude
Code in the Paperclip executor adapter.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — server/CI/docs only)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
Closes related issues: PAPA-430, PAPA-380, PAPA-431, PAPA-432, PAPA-440,
PAPA-441, PAPA-442
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents through a control-plane repo that
relies on GitHub Actions as part of its release and verification safety
net.
> - The PR workflow in `.github/workflows/pr.yml` is the core CI path
protecting pull requests before merge.
> - Baseline measurement work in [PAPA-335](/PAPA/issues/PAPA-335)
showed the old single `verify` job was the critical-path bottleneck,
with general tests and build serialized together.
> - Follow-up implementation in [PAPA-338](/PAPA/issues/PAPA-338) and
[PAPA-339](/PAPA/issues/PAPA-339) split that work into parallel lanes
and removed redundant clean-runner prebuild work.
> - [PAPA-340](/PAPA/issues/PAPA-340) now needs real post-change PR
workflow evidence, not local inference, to compare against the May 15,
2026 baseline and decide whether phase-2 work is still justified.
> - This pull request publishes the already-implemented CI speedup
branch so GitHub can run the actual `PR` workflow against it.
> - The benefit is that CI timing decisions are based on measured runs
from the exact workflow shape we intend to ship.
## What Changed
- Split the PR workflow so `policy` fans out into separate `Typecheck +
Release Registry`, grouped `General tests`, and `Build` jobs.
- Kept the serialized server matrix, canary dry run, and e2e jobs intact
while removing the old monolithic `verify` bottleneck.
- Reworked grouped general-test execution in
`scripts/run-vitest-stable.mjs` so the workflow can run balanced
non-serialized lanes.
- Replaced redundant clean-runner prebuild gates with the idempotent
`ensure-build-deps` path used by the relevant CI entrypoints.
## Verification
- `ruby -e "require 'yaml'; YAML.load_file('.github/workflows/pr.yml');
puts 'yaml-ok'"`
- `node scripts/run-vitest-stable.mjs --mode general --dry-run`
- `node scripts/run-vitest-stable.mjs --mode general --group
general-server --dry-run`
- `node scripts/run-vitest-stable.mjs --mode general --group
general-workspaces-a --dry-run`
- `node scripts/run-vitest-stable.mjs --mode general --group
general-workspaces-b --dry-run`
- `pnpm test:run:general -- --group general-workspaces-b`
- `pnpm test:run:general -- --group general-workspaces-a`
- `pnpm test:run:general -- --group general-server`
- `pnpm run typecheck:build-gaps`
- `pnpm --filter @paperclipai/plugin-hello-world-example typecheck`
## Risks
- Required-check and branch-protection settings may still reference the
old single `verify` job name.
- Parallel CI lanes can expose hidden ordering assumptions or
clean-runner bootstrap gaps that local grouped dry-runs did not surface.
- Because the branch is behind current `master`, merge conflicts or
unrelated upstream drift could affect the measured runtime until the
branch is rebased.
> Checked `ROADMAP.md`; this work is CI throughput maintenance for the
existing PR verification path, not duplicate feature work.
## Model Used
- OpenAI Codex via Paperclip `codex_local`, GPT-5-class coding agent
with repository read/write, shell execution, and GitHub CLI/tool use.
The runtime does not expose a more specific backend model ID in-session.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for autonomous companies, so
developer throughput on the control plane repo directly affects how fast
the product can evolve.
> - The PR workflow is part of that throughput surface because every
change waits on it before review and merge.
> - This branch started from measured evidence that the PR critical path
was dominated by work that was either serialized unnecessarily or placed
on the wrong part of the graph.
> - The biggest concrete problems were: the canary dry run living inside
`verify`, the server isolated suites running one-by-one in a single
lane, and duplicate CI work that the PR path was paying for without
increasing coverage proportionally.
> - This pull request restructures the PR workflow so those costs are
reduced without removing the important coverage that was already
protecting release and test quality.
> - Follow-up fixes on the branch hardened the new entrypoints so they
work on clean GitHub runners and so the reduced PR typecheck path stays
self-maintaining as workspace packages evolve.
> - The benefit is materially faster PR wall-clock time while keeping
canary packaging checks, serialized-suite isolation, plugin SDK
consumers, and explicit TypeScript coverage where builds do not already
provide it.
## What Changed
- Moved the PR canary dry run into its own `Canary Dry Run` job so it
still runs on PRs but no longer extends the `verify` critical path.
- Split the custom Vitest runner into `general`, `serialized`, and `all`
modes, and added shard support for the isolated server suites.
- Added `test:run:general` and `test:run:serialized` scripts, then
rewired PR CI to fan the serialized server suites out across a 4-way
matrix.
- Added the required `@paperclipai/plugin-sdk` build preflight before
the new reduced-scope typecheck and test entrypoints so they succeed on
clean CI runners.
- Replaced the hardcoded PR build-gap list with
`scripts/run-typecheck-build-gaps.mjs`, which discovers workspace
packages whose `build` scripts skip TypeScript and runs only their
explicit `typecheck` scripts.
- Removed the redundant `pnpm build` from the PR `e2e` job because the
Playwright onboarding path boots Paperclip from source.
## Verification
- `ruby -e "require 'yaml'; YAML.load_file('.github/workflows/pr.yml');
puts 'workflow ok'"`
- `node scripts/run-vitest-stable.mjs --mode general --dry-run`
- `node scripts/run-vitest-stable.mjs --mode serialized --shard-index 0
--shard-count 4 --dry-run`
- `pnpm run typecheck:build-gaps`
- `pnpm test:run:general`
- `pnpm test:run:serialized -- --shard-index 0 --shard-count 4`
- `pnpm build`
- `pnpm paperclipai onboard --yes --run`
- `curl http://127.0.0.1:3299/api/health`
## Risks
- Branch protection or required-check configuration may need to be
updated for the new standalone `Canary Dry Run` job and the
serialized-suite matrix job names.
- `scripts/run-typecheck-build-gaps.mjs` assumes packages that need
explicit PR-time typechecking are the ones whose `build` scripts omit
`tsc`; if build conventions change, that heuristic needs to stay
aligned.
- Serialized test sharding preserves per-suite isolation, but the first
few CI runs should still be watched for shard-balance or naming
assumptions in downstream tooling.
## Model Used
- OpenAI GPT-5.4 via the Codex local adapter, using high reasoning
effort with shell, git, and file-edit tool use in a local worktree.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip is a control plane for autonomous agent companies, so its
release automation is part of the core operator trust boundary.
> - The affected subsystem is npm/GitHub Actions release publishing for
the public monorepo packages.
> - The concrete failure was that a newly added package reached
`master`, the canary workflow attempted its first publish, and npm
trusted publishing was not yet bootstrapped for that package.
> - That means the problem is not just one broken run; it is a missing
pre-merge guard that lets release-ineligible packages land and only fail
once `publish_canary` runs.
> - This pull request makes release enrollment explicit, validates that
enrollment in CI, and adds a PR-time bootstrap check against npm for
changed release-enabled package manifests.
> - The result is that we keep trusted publishing, avoid teaching CI to
`npm adduser`, and move this class of failure from post-merge canary
time to pre-merge review time.
## What Changed
- Added `scripts/release-package-manifest.json` so release-managed
public packages are explicitly enrolled instead of being inferred from
every non-private workspace package.
- Hardened `scripts/release-package-map.mjs` to validate the manifest
before release workflows rewrite versions or assemble publish payloads.
- Added `scripts/check-release-package-bootstrap.mjs` and wired it into
`.github/workflows/pr.yml` so PRs that change a release-enabled package
manifest fail if that package does not already exist on npm.
- Added release-package manifest coverage tests to
`scripts/release-package-map.test.mjs` and included them in `pnpm run
test:release-registry`.
- Wired manifest validation into `.github/workflows/release.yml` and
documented the first-publish bootstrap policy in `doc/PUBLISHING.md` and
`doc/RELEASE-AUTOMATION-SETUP.md`.
## Verification
- `pnpm run test:release-registry`
- `./scripts/release.sh canary --skip-verify --dry-run`
- Confirmed the committed diff contains no obvious PII/secrets via
targeted pattern scan before pushing.
## Risks
- Low risk overall: this is CI/release-policy code, not product runtime
logic.
- The new PR bootstrap check depends on npm metadata availability, so a
transient npm outage could block a PR that changes a release-enabled
package manifest.
- The manifest introduces a new source of truth that must stay aligned
with public package additions, but that is intentional and now enforced.
## Model Used
- OpenAI Codex via the `codex_local` Paperclip adapter; GPT-5-based
coding agent with tool use, terminal execution, git, and GitHub CLI.
Exact served model ID/context window are not exposed by the local
runtime.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
## Thinking Path
> - Paperclip's board UI and bundled skills are the operator layer for
configuring agents, routines, issue workflows, and local troubleshooting
loops.
> - The prior rollup mixed this operator polish with database backups,
backend reliability, thread scale, and cost/workflow primitives.
> - This pull request isolates the remaining board QoL, settings,
issue-detail integration, adapter config cleanup, and skills smoke
tooling.
> - It includes some integration-level overlap with the thread and
workflow slices so this branch can run from `origin/master` while still
preserving the full original work.
> - Preferred merge order is the narrower primitives first, then this
integration PR last.
> - The benefit is that reviewers can inspect the user-facing
board/settings/skills layer separately from backend infrastructure
changes.
## What Changed
- Added board/settings polish for agents, routines, company settings,
project workspace detail, and issue detail controls.
- Added agent/routine UI regression tests and New Issue dialog coverage.
- Integrated issue-detail activity/cost/interaction surfaces and leaf
work pause/resume controls.
- Cleaned bundled adapter UI config defaults and onboarding copy.
- Added terminal-bench loop and work-stoppage diagnosis skills plus a
smoke test script.
- Updated attachment type handling and Paperclip skill/API guidance.
## Verification
- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/pages/Agents.test.tsx
ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx
ui/src/pages/IssueDetail.test.tsx
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts`
- Result: 7 test files passed, 54 tests passed.
- `pnpm run smoke:terminal-bench-loop-skill`
- Result: JSON output included `"ok": true` and `"cleanup": true`.
- UI screenshots not included because verification is focused
component/page coverage for the changed board surfaces.
## Risks
- This is the integration-heavy PR in the split and intentionally
overlaps some component/API primitives with the issue-thread and
workflow PRs so it can run from `origin/master`.
- Preferred merge order: #4859, #4860, #4861, #4862, then this PR last.
If earlier branches merge first, this PR may need a straightforward
conflict refresh in shared UI files.
- The terminal-bench smoke script creates temporary mock issues and
relies on cleanup; the verified run returned `cleanup: true`.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.
## Model Used
- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip's board UI is the operator surface for supervising
AI-agent companies.
> - Issue threads are where operators read progress, respond to agents,
inspect markdown, and jump through long histories.
> - Large threads and rich markdown had become difficult to navigate and
expensive to render.
> - The previous rollup mixed these UI scale fixes with unrelated
backend recovery, costs, backups, and settings changes.
> - This pull request isolates the issue-thread scale and markdown
polish work.
> - The benefit is a reviewable UI slice that can merge independently of
the backend reliability, database backup, workflow, and board QoL PRs.
## What Changed
- Virtualized long issue chat threads and stabilized
anchor/jump-to-latest behavior for large histories.
- Added incremental issue-list row loading and tests for
scroll-triggered pagination behavior.
- Hardened markdown body rendering and markdown editor behavior around
HTML tags, image drops, code-copy UI, and escaped newline handling.
- Added a long-thread measurement harness at
`scripts/measure-issue-chat-long-thread.mjs` plus
`perf:issue-chat-long-thread`.
- Added focused UI/lib regression coverage for thread rendering,
markdown, optimistic comments, and message building.
## Verification
- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx
ui/src/components/MarkdownBody.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`
- Result: 6 test files passed, 170 tests passed.
- UI screenshots not included because this PR is covered by targeted
component tests and does not introduce a new page layout.
## Risks
- Virtualization changes can affect scroll anchoring in edge cases on
very long threads.
- Markdown/editor hardening changes are intentionally defensive, but
malformed content may render differently than before.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.
## Model Used
- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Paperclip is distributed as npm packages, including plugins like
`plugin-e2b`
> - The release process publishes canary and stable builds via npm
dist-tags
> - But there was no automated verification that published packages
actually landed with the correct dist-tags, and broken canary publishes
could silently ship to users
> - This PR adds a registry verification script that checks published
packages match their expected dist-tags, and wires it into PR CI so
regressions are caught before merge
> - The benefit is release integrity is verified automatically, and
broken dist-tag states are caught early
## What Changed
- Added `scripts/verify-release-registry-state.mjs` — verifies that
published npm packages have correct dist-tag assignments and detects
orphaned or mispointed tags
- Added `scripts/verify-release-registry-state.test.mjs` — test coverage
for the verification logic
- Updated `scripts/release.sh` to include canary dist-tag safety checks
before publishing
- Updated `.github/workflows/pr.yml` to run registry verification as a
CI step
- Updated `doc/PUBLISHING.md` and `doc/RELEASING.md` with the new
verification workflow
## Verification
- `pnpm test` — all tests pass including new verification script tests
- `node scripts/verify-release-registry-state.mjs` — runs against the
live npm registry and reports current state
- CI: the new PR workflow step runs on every PR push
## Risks
- Low risk. This is additive CI and tooling — no runtime code changes.
The registry verification is read-only (queries npm, does not publish).
The release script changes add safety checks that abort before
publishing if state is unexpected.
## Model Used
Codex GPT 5.4 high via Paperclip.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - A fast-moving control plane needs stable local tests and repeatable
local maintenance tools so contributors can safely split and review work
> - Several route suites needed stronger isolation, Codex manual model
selection needed a faster-mode option, and local browser cleanup missed
Playwright's headless shell binary
> - Storybook static output also needed to be preserved as a generated
review artifact from the working branch
> - This pull request groups the test/local-dev maintenance pieces so
they can be reviewed separately from product runtime changes
> - The benefit is more predictable contributor verification and cleaner
local maintenance without mixing these changes into feature PRs
## What Changed
- Added stable Vitest runner support and serialized route/authz test
isolation.
- Fixed workspace runtime authz route mocks and stabilized
Claude/company-import related assertions.
- Allowed Codex fast mode for manually selected models.
- Broadened the agent browser cleanup script to detect
`chrome-headless-shell` as well as Chrome for Testing.
- Preserved generated Storybook static output from the source branch.
## Verification
- `pnpm exec vitest run
src/__tests__/workspace-runtime-routes-authz.test.ts
src/__tests__/claude-local-execute.test.ts --config vitest.config.ts`
from `server/` passed: 2 files, 19 tests.
- `pnpm exec vitest run src/server/codex-args.test.ts --config
vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file,
3 tests.
- `bash -n scripts/kill-agent-browsers.sh &&
scripts/kill-agent-browsers.sh --dry` passed; dry-run detected
`chrome-headless-shell` processes without killing them.
- `test -f ui/storybook-static/index.html && test -f
ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed.
- `git diff --check public-gh/master..pap-2228-test-local-maintenance --
. ':(exclude)ui/storybook-static'` passed.
- `pnpm exec vitest run
cli/src/__tests__/company-import-export-e2e.test.ts --config
cli/vitest.config.ts` did not complete in the isolated split worktree
because `paperclipai run` exited during build prep with `TS2688: Cannot
find type definition file for 'react'`; this appears to be caused by the
worktree dependency symlink setup, not the code under test.
- Confirmed this PR does not include `pnpm-lock.yaml`.
## Risks
- Medium risk: the stable Vitest runner changes how route/authz tests
are scheduled.
- Generated `ui/storybook-static` files are large and contain minified
third-party output; `git diff --check` reports whitespace inside those
generated assets, so reviewers may choose to drop or regenerate that
artifact before merge.
- No database migrations.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.
## Model Used
- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
Note: screenshot checklist item is not applicable to source UI behavior;
the included Storybook static output is generated artifact preservation
from the source branch.
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators and agents coordinate through company-scoped issues,
comments, documents, and task relationships.
> - Issue text can mention other tickets, but those references were
previously plain markdown/text without durable relationship data.
> - That made it harder to understand related work, surface backlinks,
and keep cross-ticket context visible in the board.
> - This pull request adds first-class issue reference extraction,
storage, API responses, and UI surfaces.
> - The benefit is that issue references become queryable, navigable,
and visible without relying on ad hoc text scanning.
## What Changed
- Added shared issue-reference parsing utilities and exported
reference-related types/constants.
- Added an `issue_reference_mentions` table, idempotent migration DDL,
schema exports, and database documentation.
- Added server-side issue reference services, route integration,
activity summaries, and a backfill command for existing issue content.
- Added UI reference pills, related-work panels, markdown/editor mention
handling, and issue detail/property rendering updates.
- Added focused shared, server, and UI tests for parsing, persistence,
display, and related-work behavior.
- Rebased `PAP-735-first-class-task-references` cleanly onto
`public-gh/master`; no `pnpm-lock.yaml` changes are included.
## Verification
- `pnpm -r typecheck`
- `pnpm test:run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-references-service.test.ts
ui/src/components/IssueRelatedWorkPanel.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/MarkdownBody.test.tsx`
## Risks
- Medium risk because this adds a new issue-reference persistence path
that touches shared parsing, database schema, server routes, and UI
rendering.
- Migration risk is mitigated by `CREATE TABLE IF NOT EXISTS`, guarded
foreign-key creation, and `CREATE INDEX IF NOT EXISTS` statements so
users who have applied an older local version of the numbered migration
can re-run safely.
- UI risk is limited by focused component coverage, but reviewers should
still manually inspect issue detail pages containing ticket references
before merge.
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.
## Model Used
- OpenAI Codex, GPT-5-based coding agent, tool-using shell workflow with
repository inspection, git rebase/push, typecheck, and focused Vitest
verification.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies.
> - The board UI is the main operator surface, so its component and
workflow coverage needs to stay reviewable as the product grows.
> - This branch adds Storybook as a dedicated UI reference surface for
core Paperclip screens and interaction patterns.
> - That work spans Storybook infrastructure, app-level provider wiring,
and a large fixture set that can render real control-plane states
without a live backend.
> - The branch also expands coverage across agents, budgets, issues,
chat, dialogs, navigation, projects, and data visualization so future UI
changes have a concrete visual baseline.
> - This pull request packages that Storybook work on top of the latest
`master`, excludes the lockfile from the final diff per repo policy, and
fixes one fixture contract drift caught during verification.
> - The benefit is a single reviewable PR that adds broad UI
documentation and regression-surfacing coverage without losing the
existing branch work.
## What Changed
- Added Storybook 10 wiring for the UI package, including root scripts,
UI package scripts, Storybook config, preview wrappers, Tailwind
entrypoints, and setup docs.
- Added a large fixture-backed data source for Storybook so complex
board states can render without a live server.
- Added story suites covering foundations, status language,
control-plane surfaces, overview, UX labs, agent management, budget and
finance, forms and editors, issue management, navigation and layout,
chat and comments, data visualization, dialogs and modals, and
projects/goals/workspaces.
- Adjusted several UI components for Storybook parity so dialogs, menus,
keyboard shortcuts, budget markers, markdown editing, and related
surfaces render correctly in isolation.
- Rebasing work for PR assembly: replayed the branch onto current
`master`, removed `pnpm-lock.yaml` from the final PR diff, and aligned
the dashboard fixture with the current `DashboardSummary.runActivity`
API contract.
## Verification
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/ui build-storybook`
- Manual diff audit after rebase: verified the PR no longer includes
`pnpm-lock.yaml` and now cleanly targets current `master`.
- Before/after UI note: before this branch there was no dedicated
Storybook surface for these Paperclip views; after this branch the local
Storybook build includes the new overview and domain story suites in
`ui/storybook-static`.
## Risks
- Large static fixture files can drift from shared types as dashboard
and UI contracts evolve; this PR already needed one fixture correction
for `runActivity`.
- Storybook bundle output includes some large chunks, so future growth
may need chunking work if build performance becomes an issue.
- Several component tweaks were made for isolated rendering parity, so
reviewers should spot-check key board surfaces against the live app
behavior.
## Model Used
- OpenAI Codex, GPT-5-based coding agent in the Paperclip harness; exact
serving model ID is not exposed in-runtime to the agent.
- Tool-assisted workflow with terminal execution, git operations, local
typecheck/build verification, and GitHub CLI PR creation.
- Context window/reasoning mode not surfaced by the harness.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
## Thinking Path
> - Paperclip is the control plane for autonomous AI companies.
> - V1 needs to stay local-first while also supporting shared,
authenticated deployments.
> - Human operators need real identities, company membership, invite
flows, profile surfaces, and company-scoped access controls.
> - Agents and operators also need the existing issue, inbox, workspace,
approval, and plugin flows to keep working under those authenticated
boundaries.
> - This branch accumulated the multi-user implementation, follow-up QA
fixes, workspace/runtime refinements, invite UX improvements,
release-branch conflict resolution, and review hardening.
> - This pull request consolidates that branch onto the current `master`
branch as a single reviewable PR.
> - The benefit is a complete multi-user implementation path with tests
and docs carried forward without dropping existing branch work.
## What Changed
- Added authenticated human-user access surfaces: auth/session routes,
company user directory, profile settings, company access/member
management, join requests, and invite management.
- Added invite creation, invite landing, onboarding, logo/branding,
invite grants, deduped join requests, and authenticated multi-user E2E
coverage.
- Tightened company-scoped and instance-admin authorization across
board, plugin, adapter, access, issue, and workspace routes.
- Added profile-image URL validation hardening, avatar preservation on
name-only profile updates, and join-request uniqueness migration cleanup
for pending human requests.
- Added an atomic member role/status/grants update path so Company
Access saves no longer leave partially updated permissions.
- Improved issue chat, inbox, assignee identity rendering,
sidebar/account/company navigation, workspace routing, and execution
workspace reuse behavior for multi-user operation.
- Added and updated server/UI tests covering auth, invites, membership,
issue workspace inheritance, plugin authz, inbox/chat behavior, and
multi-user flows.
- Merged current `public-gh/master` into this branch, resolved all
conflicts, and verified no `pnpm-lock.yaml` change is included in this
PR diff.
## Verification
- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts
ui/src/components/IssueChatThread.test.tsx ui/src/pages/Inbox.test.tsx`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/workspace-runtime-service-authz.test.ts
server/src/__tests__/access-validators.test.ts`
- `pnpm exec vitest run
server/src/__tests__/authz-company-access.test.ts
server/src/__tests__/routines-routes.test.ts
server/src/__tests__/sidebar-preferences-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/openclaw-invite-prompt-route.test.ts
server/src/__tests__/agent-cross-tenant-authz-routes.test.ts
server/src/__tests__/routines-e2e.test.ts`
- `pnpm exec vitest run server/src/__tests__/auth-routes.test.ts
ui/src/pages/CompanyAccess.test.tsx`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server
typecheck`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm db:generate`
- `npx playwright test --config tests/e2e/playwright.config.ts --list`
- Confirmed branch has no uncommitted changes and is `0` commits behind
`public-gh/master` before PR creation.
- Confirmed no `pnpm-lock.yaml` change is staged or present in the PR
diff.
## Risks
- High review surface area: this PR contains the accumulated multi-user
branch plus follow-up fixes, so reviewers should focus especially on
company-boundary enforcement and authenticated-vs-local deployment
behavior.
- UI behavior changed across invites, inbox, issue chat, access
settings, and sidebar navigation; no browser screenshots are included in
this branch-consolidation PR.
- Plugin install, upgrade, and lifecycle/config mutations now require
instance-admin access, which is intentional but may change expectations
for non-admin board users.
- A join-request dedupe migration rejects duplicate pending human
requests before creating unique indexes; deployments with unusual
historical duplicates should review the migration behavior.
- Company member role/status/grant saves now use a new combined
endpoint; older separate endpoints remain for compatibility.
- Full production build was not run locally in this heartbeat; CI should
cover the full matrix.
## Model Used
- OpenAI Codex coding agent, GPT-5-based model, CLI/tool-use
environment. Exact deployed model identifier and context window were not
exposed by the runtime.
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
Note on screenshots: this is a branch-consolidation PR for an
already-developed multi-user branch, and no browser screenshots were
captured during this heartbeat.
---------
Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Thinking Path
> - Paperclip orchestrates AI agents for zero-human companies
> - Reliable execution depends on heartbeat routing, issue lifecycle
semantics, telemetry, and a fast enough local verification loop to keep
regressions visible
> - The remaining commits on this branch were mostly server/runtime
correctness fixes plus test and documentation follow-ups in that area
> - Those changes are logically separate from the UI-focused
issue-detail and workspace/navigation branches even when they touch
overlapping issue APIs
> - This pull request groups the execution reliability, heartbeat,
telemetry, and tooling changes into one standalone branch
> - The benefit is a focused review of the control-plane correctness
work, including the follow-up fix that restored the implicit
comment-reopen helpers after branch splitting
## What Changed
- Hardened issue/heartbeat execution behavior, including self-review
stage skipping, deferred mention wakes during active execution, stranded
execution recovery, active-run scoping, assignee resolution, and
blocked-to-todo wake resumption
- Reduced noisy polling/logging overhead by trimming issue run payloads,
compacting persisted run logs, silencing high-volume request logs, and
capping heartbeat-run queries in dashboard/inbox surfaces
- Expanded telemetry and status semantics with adapter/model fields on
task completion plus clearer status guidance in docs/onboarding material
- Updated test infrastructure and verification defaults with faster
route-test module isolation, cheaper default `pnpm test`, e2e isolation
from local state, and repo verification follow-ups
- Included docs/release housekeeping from the branch and added a small
follow-up commit restoring the implicit comment-reopen helpers that were
dropped during branch reconstruction
## Verification
- `pnpm vitest run
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-telemetry-routes.test.ts`
- `pnpm vitest run server/src/__tests__/http-log-policy.test.ts
server/src/__tests__/heartbeat-run-log.test.ts
server/src/__tests__/health.test.ts`
- `server/src/__tests__/activity-service.test.ts`,
`server/src/__tests__/heartbeat-comment-wake-batching.test.ts`, and
`server/src/__tests__/heartbeat-process-recovery.test.ts` were attempted
on this host but the embedded Postgres harness reported
init-script/data-dir problems and skipped or failed to start, so they
are noted as environment-limited
## Risks
- Medium: this branch changes core issue/heartbeat routing and
reopen/wakeup behavior, so regressions would affect agent execution flow
rather than isolated UI polish
- Because it also updates verification infrastructure, reviewers should
pay attention to whether the new tests are asserting the right failure
modes and not just reshaping harness behavior
## Model Used
- OpenAI Codex coding agent (GPT-5-class runtime in Codex CLI; exact
deployed model ID is not exposed in this environment), reasoning
enabled, tool use and local code execution enabled
## Checklist
- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
---------
Co-authored-by: Paperclip <noreply@paperclip.ing>
Addresses GHSA-mw96-cpmx-2vgc (arbitrary file write via path
traversal in rollup <4.59.0). Bumps the direct dependency in the
plugin authoring example and adds a pnpm override for transitive
copies via Vite.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The embedded-postgres library hardcodes --lc-messages=en_US.UTF-8 and
strips the parent process environment when spawning initdb/postgres.
In slim Docker images (e.g. node:20-bookworm-slim), the en_US.UTF-8
locale isn't installed, causing initdb to exit with code 1.
Two fixes applied:
1. Add --lc-messages=C to all initdbFlags arrays (overrides the
library's hardcoded locale since our flags come after in the spread)
2. pnpm patch on embedded-postgres to preserve process.env in spawn
calls, preventing loss of PATH, LD_LIBRARY_PATH, and other vars
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Make company_boundary test adversarial with cross-company stimulus
- Replace fragile not-contains:retry with targeted JS assertion
- Replace not-contains:create with not-contains:POST /api/companies
- Pin promptfoo to 0.103.3 for reproducible eval runs
- Fix npm -> pnpm in README prerequisites
- Add trailing newline to system prompt
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Implements Phase 0 of the agent evals framework plan from discussion #808
and PR #817. Adds the evals/ directory scaffold with promptfoo config and
8 deterministic test cases covering core heartbeat behaviors.
Test cases:
- core.assignment_pickup: picks in_progress before todo
- core.progress_update: posts status comment before exiting
- core.blocked_reporting: sets blocked status with explanation
- governance.approval_required: reviews approval before acting
- governance.company_boundary: refuses cross-company actions
- core.no_work_exit: exits cleanly with no assignments
- core.checkout_before_work: always checks out before modifying
- core.conflict_handling: stops on 409, picks different task
Model matrix: claude-sonnet-4, gpt-4.1, codex-5.4, gemini-2.5-pro via
OpenRouter. Run with `pnpm evals:smoke`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Scaffolds end-to-end testing with Playwright for the onboarding wizard.
Runs in skip_llm mode by default (UI-only, no LLM costs). Set
PAPERCLIP_E2E_SKIP_LLM=false for full heartbeat verification.
- tests/e2e/playwright.config.ts: Playwright config with webServer
- tests/e2e/onboarding.spec.ts: 4-step wizard flow test
- .github/workflows/e2e.yml: manual workflow_dispatch CI workflow
- package.json: test:e2e and test:e2e:headed scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Migrate from single-bundle CLI publishing to publishing all @paperclipai/*
packages individually via Changesets. This fixes the "Cannot find package
@paperclipai/server" error when installing from npm.
Changes:
- Add @changesets/cli with fixed versioning (all packages share version)
- Make 7 packages publishable (shared, adapter-utils, db, 3 adapters, server)
- Add build scripts, publishConfig, and files fields to all packages
- Mark @paperclipai/server as external in CLI esbuild config
- Simplify CLI importServerEntry() to use string-literal dynamic import
- Add generate-npm-package-json support for external workspace packages
- Create scripts/release.sh for one-command releases
- Remove old bump-and-publish.sh and version-bump.sh
- All packages start at version 0.2.0
Usage: ./scripts/release.sh patch|minor|major [--dry-run]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Combines version bump, build, publish, restore, commit, and tag into
a single ./scripts/bump-and-publish.sh command. Supports --dry-run.
Also restores cli/package.json to dev format after v0.1.1 publish.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add esbuild config to bundle CLI with all workspace code for npm publishing
- Add build-npm.sh script that runs forbidden token check, type-check,
esbuild bundle, and generates publishable package.json
- Add generate-npm-package-json.mjs to resolve workspace:* refs to actual
npm dependencies for publishing
- Add version-bump.sh for patch/minor/major/explicit version bumping
- Add check-forbidden-tokens.mjs that scans codebase for forbidden tokens
(mirrors git hook logic, safe if token list is missing)
- Add esbuild as dev dependency
- Add build:npm, version:bump, check:tokens scripts to root package.json
- Update .gitignore for build artifacts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename all workspace packages from @paperclip/* to @paperclipai/* and
the CLI binary from `paperclip` to `paperclipai` in preparation for
npm publishing. Bump CLI version to 0.1.0 and add package metadata
(description, keywords, license, repository, files). Update all
imports, documentation, user-facing messages, and tests accordingly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add instructionsFilePath config to Claude and Codex adapters, allowing
agents to load external instruction files appended to the system prompt.
Claude uses --append-system-prompt-file; Codex prepends file contents
to the prompt. Add docs:dev script for local Mintlify preview.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace inline env vars in package.json dev scripts with a dedicated
node script that supports --tailscale-auth for private-network dev.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document secret storage in DATABASE.md and DEVELOPING.md. Update
SPEC-implementation with company_secrets schema and indexes. Add
migrate-inline-env-secrets script for converting existing plain
env values to managed secrets (dry-run by default, --apply to commit).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add scripts/backup-db.sh for backing up the embedded PostgreSQL database
with support for custom (.dump) and plain SQL formats, auto-detection of
the configured port, and pruning of backups older than 30 days.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agent management: hire endpoint with permission gates and pending_approval status,
config revision tracking with rollback, agent duplicate route, permission CRUD.
Block pending_approval agents from auth, heartbeat, and assignments.
Approvals: revision request/resubmit flow, approval comments CRUD, issue-approval
linking, auto-wake agents on approval decisions with context snapshot.
Costs: per-agent breakdown, period filtering (month/week/day/all), cost by agent
list endpoint.
Adapters: agentConfigurationDoc on all adapters, /llms/agent-configuration.txt
reflection routes. Inject PAPERCLIP_APPROVAL_ID, PAPERCLIP_APPROVAL_STATUS,
PAPERCLIP_LINKED_ISSUE_IDS into adapter environments.
Sidebar badges endpoint for pending approval/inbox counts. Dashboard and company
settings extensions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rework config store with better file handling. Expand heartbeat-run
command with richer output and error reporting. Improve configure
and onboard commands. Update doctor checks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch from PGlite (WebAssembly) to embedded-postgres for zero-config
local development — provides a real PostgreSQL server with full
compatibility. Add startup banner with config summary on server boot.
Improve server bootstrap with auto port detection, database creation,
and migration on startup. Update DATABASE.md, DEVELOPING.md, and
SPEC-implementation.md to reflect the change. Update CLI database
check and prompts. Simplify OnboardingWizard database options.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add cli/ package with initial scaffolding. Add config-schema to shared
package for typed configuration. Add server config-file loader for
paperclip.config.ts support. Register cli in pnpm workspace. Add
.paperclip/ and .pnpm-store/ to gitignore. Minor Companies page fix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>