Commit Graph

421 Commits

Author SHA1 Message Date
Devin Foley 90631b09b3 Let adapters declare runtime command spec for remote provisioning (#5141)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, running
adapter
> commands like `claude`, `codex`, `pi` either locally or on remote
runtimes
>   (SSH hosts, sandboxes, etc.)
> - On a fresh remote runtime — particularly an ephemeral sandbox — the
> adapter's CLI may not be installed yet. Today operators handle this
via
> external configuration (e.g. a project-level `provisionCommand` shell
> script) that has to know about every adapter the operator might want
to use
> - This means every adapter has its own well-known npm package, but
operators
>   end up writing duplicate provision shell scripts that paste together
> `npm install -g @anthropic-ai/claude-code`, `npm install -g
@openai/codex`,
>   etc. — knowledge the adapter itself already has
> - This PR moves that knowledge into the adapter modules: each adapter
declares
> how its runtime command should be detected and (if applicable)
installed
> via `getRuntimeCommandSpec(config)`. The execution path runs the
adapter's
> own install command on remote sandbox targets before launching, so a
fresh
> sandbox bootstraps itself instead of requiring a hand-written
provision script
> - The benefit is fewer footguns for operators provisioning remote
runtimes,
>   and a clean place for new adapters to plug in their install recipe

## What Changed

- New types in `packages/adapter-utils/src/types.ts`:
    - `AdapterRuntimeCommandSpec` describing `command`, optional
      `detectCommand`, and optional `installCommand`
    - Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule`
- Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters
      receive the resolved spec at execute time
- New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)`
in
`packages/adapter-utils/src/execution-target.ts` that runs the install
command
on remote targets when `transport === "sandbox"`. SSH and local targets
are
  no-ops. Throws on timeout or non-zero exit so failures surface early.
- Each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`,
  `opencode-local`, `pi-local`'s `execute.ts` now reads
`ctx.runtimeCommandSpec?.installCommand` and calls the helper before
launching
  the adapter command.
- `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for
each
  adapter:
- claude/codex/gemini/opencode/pi-local: `npm install -g <package>`
recipe via
a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard
that
only auto-installs when the configured `command` matches the well-known
      fallback (custom binaries are left alone).
- cursor-local: declares `command` only; no auto-install (no public npm
      package), preserving the existing manual setup.
- `server/src/services/heartbeat.ts` resolves the spec via
`adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through
to
  `AdapterExecutionContext`.
- Tests added in `execution-target.test.ts` (~75 lines), e2b
`plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts`
  (~76 lines).

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-run-orchestrator`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: run an adapter (claude/codex/etc.) against a fresh
sandbox-backed
environment that does NOT have the adapter CLI pre-installed. Confirm
the
install runs once at the start of the agent run and the adapter then
launches
successfully. Re-run on the same sandbox; confirm the install command is
  idempotent and the second run starts faster.
- Confirm SSH and local execution paths are unaffected (gated by
  `transport === "sandbox"`).

## Risks

- Behavioural shift on sandbox runs: a new install step now runs at the
start
  of every sandbox agent run for adapters with `installCommand` set. The
install commands are idempotent (`if ! command -v X >/dev/null 2>&1;
then
npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold
  sandbox, the first run takes longer.
- Operators who used the legacy project-level `provisionCommand` to
install
adapter CLIs can drop that part of their script; the adapter handles it
now.
  Existing scripts continue to work — installs are idempotent.
- The cursor-local adapter has no auto-install (no public npm package).
  Behaviour for cursor-local on sandboxes is unchanged.
- New optional surface on `ServerAdapterModule`. Plugins that don't
implement
  `getRuntimeCommandSpec` retain previous behaviour (no auto-install).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:35:36 -07:00
Devin Foley 0e51fa2b0d Honor reuse-existing preference and assignee default environment in issue runs (#5139)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside execution workspaces (a per-issue cwd + env), and
an issue
> can prefer to reuse an existing workspace or get a fresh one each time
> - The heartbeat service was reading the existing workspace's config to
derive
> environment selection regardless of whether the issue actually wanted
to reuse
> it. So fresh-run issues were inheriting stale config from a workspace
that was
>   about to be discarded
> - Separately, when an issue is assigned to an agent, the issue's
execution
> workspace settings weren't picking up the agent's
`defaultEnvironmentId`,
>   even though the agent's choice is the natural default for that issue
> - This PR makes both selection paths honor the obvious source of
truth:
> workspace config flows only when the issue actually wants
`reuse_existing`,
> and the assignee agent's default environment is applied at assignment
time if
>   nothing else is set on the issue
> - The benefit is that re-running a flaky issue picks up the right
environment
> instead of inheriting the previous run's config, and assigning an
agent to an
>   issue does the obvious thing without operator intervention

## What Changed

- `server/src/services/heartbeat.ts`: introduce
`reusableExecutionWorkspaceConfig`
  that is non-null only when `shouldReuseExisting` is true. Both
  `resolveExecutionWorkspaceEnvironmentId(...)` and
`applyPersistedExecutionWorkspaceConfig(...)` now read from it instead
of
unconditionally consulting `existingExecutionWorkspace?.config`.
Fresh-run
issues no longer inherit stale environment config from an in-flight
workspace
  about to be discarded.
- `server/src/services/issues.ts`: when an issue update sets a new
  `assigneeAgentId` and isolated workspaces are enabled, populate
  `executionWorkspaceSettings.environmentId` from the assignee agent's
  `defaultEnvironmentId` if the issue doesn't have an explicit
  `environmentId` set yet.
- Tests added in `heartbeat-plugin-environment.test.ts` (~216 lines) and
  `issues-service.test.ts` (~85 lines) covering both paths.

## Verification

- `pnpm --filter @paperclipai/server test --
heartbeat-plugin-environment issues-service`
- Manual QA: assign an issue to an agent that has a non-default
`defaultEnvironmentId`, confirm the issue's workspace settings now
include that
environment id without operator intervention. Trigger a rerun on an
issue
whose existing workspace points at a stale environment, confirm the
rerun uses
  the freshly-resolved environment.

## Risks

- Behavioural shift on assignment: previously assigning an agent didn't
propagate the agent's default environment to the issue. Now it does.
Callers
that explicitly want the issue to keep its existing/null environment
must set
`executionWorkspaceSettings.environmentId` themselves; the new logic
only
  fires when no explicit value is set.
- Behavioural shift on rerun: stale workspace config is no longer
applied to
  fresh runs. Operators who relied on this implicit inheritance may see
different environment selection on the first rerun after deploy.
Mitigation:
the explicit isssue settings and project policy are still honored as
before.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:33:55 -07:00
Devin Foley bb7d040894 Switch OpenCode to explicit static/local-aware model selection (#5117)
> **Stacked PR (part 4 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When creating an OpenCode-local agent, Paperclip currently validates
> `adapterConfig.model` against the *Paperclip host's* `opencode models`
output
> - SSH testing surfaced that this blocks creating an OpenCode agent for
an SSH
> environment: the model that exists on the SSH target isn't visible to
the
> host, so creation fails with "OpenCode requires `adapterConfig.model`
in
> provider/model format" even when the operator picked a real remote
model
> - The initial direction was environment-aware model discovery; the
final
> decision was to keep OpenCode on the same explicit-model pattern as
other
> adapters (default + curated list + manual override) and stop blocking
>   creation on host-side discovery
> - This PR does both: the adapter-models endpoint now accepts
`environmentId` and
> probes against the target environment, and the create-time hard gate
is
> replaced by `requireOpenCodeModelId` which validates `provider/model`
*format*
> without requiring host-local discovery. Test/run-time still surfaces
real
>   auth/availability problems
> - The benefit is that operators can create OpenCode agents for remote
> environments without out-of-band setup, and the model picker in the UI
>   reflects the actually-targeted environment

## What Changed

- Added `requireOpenCodeModelId(input)` in
`opencode-local/src/server/models.ts`,
  exported it from the adapter index
- `ensureOpenCodeModelConfiguredAndAvailable` now delegates the format
check to
  `requireOpenCodeModelId`
- `agentsApi.adapterModels(companyId, adapterType, { environmentId })`
now accepts
  an environment ID and passes it as a query parameter
- `queryKeys.agents.adapterModels` now keys on `(companyId, adapterType,
environmentId)`
- `server/src/routes/agents.ts` reads and validates the new query
parameter,
  forwarding it to the adapter's model probe
- `AgentConfigForm.tsx` and `OnboardingWizard.tsx` build the model query
key from
the currently selected default environment ID and disable autodetect for
  `opencode_local` (model selection is explicit)
- `NewAgent.tsx` simplified — no longer special-cases OpenCode
autodetect
- `company-portability.ts` no longer needs OpenCode-specific autodetect
handling
- Tests added/updated:
  `adapter-model-refresh-routes.test.ts`, `adapter-models.test.ts`,
`agent-permissions-routes.test.ts`,
`opencode-local/src/server/models.test.ts`

## Verification

- `pnpm --filter @paperclipai/server test -- adapter-models
adapter-model-refresh agent-permissions`
- `pnpm --filter @paperclipai/adapter-opencode-local test`
- `pnpm --filter @paperclipai/ui test -- AgentConfigForm
OnboardingWizard NewAgent`
- Manual QA in browser:
1. Boot Paperclip on Tailscale-bound port (so it's reachable from
another
machine), create an OpenCode-local agent, switch the default environment
between two installed sandboxes, and confirm the model list refreshes
     per-environment
  2. Submit with a malformed `provider/model` string and verify the new
     `requireOpenCodeModelId` error surfaces
- Before/after screenshots attached for `AgentConfigForm` model picker

## Risks

- Behavioural shift: switching default environment now triggers a model
refetch.
Should be cheap but introduces a new UI loading state for OpenCode
users.
- Removing dynamic autodetect for OpenCode: if any user configured an
agent
without specifying `model` and relied on autodetect populating it, that
agent
will now fail at submit time. Mitigation: validation error is explicit
and
  actionable.
- New query string parameter on `/api/companies/:id/adapter-models` —
older
clients that omit it still work (parameter is optional and defaults to
null).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:01:34 -07:00
Devin Foley 076067865f Migrate SSH environment callback to bridge (#5116)
> **Stacked PR (part 3 of 7).** Depends on:
  - PR #5114
  - PR #5115
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on a remote SSH-backed environment need a way to
call back into
>   the Paperclip control plane (run events, log streaming, signals)
> - When the SSH host can't reach the Paperclip host (NAT, firewalls, or
simply not
> on the same network), the run silently fails or hangs — a recurring
class of
>   failure during SSH testing
> - In sandboxed environments we already solved this with a callback
bridge that
> tunnels back through the existing connection; SSH was the odd one out
> - This PR migrates SSH execution to use the same callback bridge, so
every
> adapter's remote run uses one consistent reverse-channel. Per-adapter
SSH glue
> is deleted in favour of a shared `CommandManagedRuntimeRunner` built
from the
>   SSH spec
> - The benefit is fewer SSH-specific failure modes, a smaller code
surface, and
>   one place to evolve the callback contract going forward

## What Changed

- Added `createSshCommandManagedRuntimeRunner` in
`packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a
generic
  command-managed-runtime runner (with cwd, env, and timeout handling)
- Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge
URL now flows
  through the shared runner
- Reworked `execution-target.ts` to use the SSH runner alongside sandbox
runners
  via a unified `CommandManagedRuntimeRunner` interface
- Simplified `remote-managed-runtime.ts` and
`sandbox-managed-runtime.ts` to consume
  the shared runner abstraction
- Deleted per-adapter SSH callback wiring from claude-local,
codex-local,
  cursor-local, gemini-local, opencode-local, pi-local execute.ts files
- Removed `environment-runtime-driver-contract.test.ts` (the contract is
now
  enforced by `environment-execution-target.test.ts`)
- Added/updated `execute.remote.test.ts` cases for each adapter to cover
the SSH
  runner path

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm test -- execute.remote` (covers all six local adapters' SSH
paths)
- Manual QA: ran a claude-local agent against an SSH-backed environment,
confirmed
the agent successfully called back to `/api/agent-callback/*` endpoints
during
  the run

## Risks

- Refactor touches all six local adapters. If any adapter had subtle
SSH-specific
behaviour that wasn't captured in tests, it could regress. Mitigation:
each
  adapter's `execute.remote.test.ts` was extended.
- `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking
type change
for any internal consumer. Verified no external plugins consume this
type.
- The new `CommandManagedRuntimeRunner` shape is a public surface in
`@paperclipai/adapter-utils`; downstream plugins implementing custom
runners may
  need updates, but no such plugins exist in this repo.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:43:52 -07:00
Devin Foley a7b45938b7 Let sandbox providers declare shell defaults (#5114)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
>   providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
>   provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
>   provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
>   helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
>   preference

## What Changed

- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
  else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
  and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
  and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
  `shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
  startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
  (validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
  external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
  `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
  `environment-execution-target.test.ts`

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
  shell semantics flow through the callback bridge end-to-end)

## Risks

- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
  providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
  Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
Dotta 15eac43b43 [codex] Retry max-turn exhausted heartbeats (#5096)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies, and
heartbeat execution is the control-plane loop that keeps assigned work
moving.
> - Max-turn exhaustion is a recoverable local-adapter stop condition
for Claude and Gemini agents when a run needs another heartbeat to
continue safely.
> - The previous behavior could leave max-turn continuation details hard
to inspect, and duplicate/stale continuation wakes could keep running
after issue state changed.
> - The adapter layer also needed to avoid trusting arbitrary
stdout/stderr text as scheduler control metadata.
> - This pull request adds bounded max-turn continuation scheduling,
visible retry state, structured stop metadata handling, and
stale/duplicate continuation guards.
> - The benefit is safer automatic continuation after max-turn stops,
clearer operator visibility, and fewer duplicate or stale agent runs.

## What Changed

- Replaces closed PR #4952, whose head repository was deleted.
- Rebases the recovered max-turn continuation branch onto current
`paperclipai/paperclip:master`.
- Adds max-turn continuation scheduling and retry-state plumbing for
heartbeat runs.
- Adds stale/duplicate continuation suppression when issue status,
ownership, or execution locks change.
- Normalizes Claude/Gemini max-turn detection around structured stop
metadata instead of unstructured stdout/stderr text.
- Surfaces max-turn continuation settings and retry visibility in the
board UI.
- Adds focused server, adapter, and UI tests for max-turn stop metadata,
retry scheduling, stale queued-run invalidation, adapter
parsing/execution, run ledger display, and agent config patching.

## Verification

- `pnpm install --no-frozen-lockfile` to refresh local dependencies
after rebasing onto current `master`.
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/claude-local-adapter.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter.test.ts
server/src/__tests__/gemini-local-execute.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts ui/src/lib/runRetryState.test.ts
--testTimeout=20000`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck && pnpm
--filter @paperclipai/adapter-gemini-local typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`
- UI screenshot note: the UI changes are limited to config/ledger state
rendering rather than layout changes; component/unit coverage above
verifies the rendered behavior.

## Risks

- Medium behavior risk: heartbeat retry gating now suppresses max-turn
continuations when issue state or execution locks drift, so any callers
that relied on stale continuations running will now see cancellation
instead.
- Low adapter risk: Claude/Gemini unstructured text no longer triggers
max-turn scheduler metadata, so only structured stop signals and Gemini
exit code 53 are trusted.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5-class model, tool-enabled local
repository editing and command execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: state/default rendering only; covered by
component/unit tests)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no user-facing command or docs contract changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 11:30:48 -05:00
Dotta 57229d0f24 [codex] Add issue monitor liveness controls (#4988)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies where work
must stay observable, governable, and recoverable.
> - The task/heartbeat subsystem owns agent execution continuity, issue
state transitions, and visible recovery behavior.
> - Waiting on an external service is not the same as being blocked when
the assignee still owns a future check.
> - The gap was that agents had no first-class one-shot monitor state
for external-service waits, so recovery could look stalled or require ad
hoc comments.
> - This pull request adds bounded issue monitors that can wake the
owner, clear exhausted waits, and produce explicit recovery behavior.
> - It also surfaces monitor status in the board UI and documents when
to use monitors versus `blocked`.
> - The benefit is clearer liveness semantics for asynchronous waits
without weakening single-assignee task ownership.

## What Changed

- Added issue monitor fields, shared types, validators, constants, and
an idempotent `0075` migration for scheduled monitor state.
- Added server-side monitor scheduling, dispatch, recovery bounds,
activity logging, and external-ref redaction.
- Added board/agent route coverage for monitor permissions and child
monitor scheduling.
- Added issue detail/property UI for monitor state, a monitor activity
card, and Storybook stories for review surfaces.
- Documented monitor semantics and recovery policy behavior in
`doc/execution-semantics.md`.
- Addressed Greptile review feedback by preserving monitor state in
skipped-stage builders and making board monitor saves send `scheduledBy:
"board"`.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issue-execution-policy-routes.test.ts
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/lib/activity-format.test.ts`
- First run passed 5 files and failed to collect 2 server suites because
the worktree was missing the optional `acpx/runtime` dependency.
- After `pnpm install --frozen-lockfile`, reran the 2 failed suites
successfully.
- `pnpm exec vitest run
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck
&& pnpm --filter @paperclipai/ui typecheck`
- `pnpm exec vitest run
server/src/__tests__/issue-execution-policy.test.ts
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/server typecheck && pnpm --filter
@paperclipai/ui typecheck`
- `pnpm exec vitest run
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- Storybook screenshot captured from
`http://127.0.0.1:6006/iframe.html?viewMode=story&id=product-issue-monitor-surfaces--monitor-surfaces`
with Playwright.

## Screenshots

![Issue monitor Storybook
surfaces](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2945-when-a-task-is-waiting-for-an-_external-service_-what-state-should-it-be-in-and-what-recovery-method-could-it-h/docs/pr-screenshots/pap-2945/monitor-surfaces.png)

## Risks

- Medium: this changes heartbeat recovery behavior for scheduled
external-service waits, so regressions could affect wake timing or
recovery issue creation.
- Migration risk is reduced by using `IF NOT EXISTS` for the new issue
monitor columns and index.
- External monitor references are treated as secret-adjacent and are
intentionally omitted from visible activity/wake payloads.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and terminal
execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or Storybook review surfaces
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 08:58:53 -05:00
Dotta 2d72292ad6 [codex] Add workspace routine run tab (#4958)
## Thinking Path

> - Paperclip orchestrates AI agents through reusable execution
workspaces and routines
> - Operators need a fast way to run workspace-aware routines against a
specific execution workspace
> - The existing workspace detail surface showed configuration, runtime
logs, and linked issues, but not routines that depend on workspace
variables
> - Routine runs also needed to prefill the selected execution workspace
so branch variables resolve correctly
> - This pull request adds a workspace routines tab and prefilled
routine-run dialog support
> - The benefit is a tighter workflow for rerunning reviews, smoke
checks, and other workspace-specific routines

## What Changed

- Added an execution workspace `Routines` tab and company-prefixed
routes.
- Listed routines that declare or reference workspace-specific
variables.
- Added `Run now` support that preselects the current execution
workspace in `RoutineRunVariablesDialog`.
- Centralized reusable execution workspace ordering/deduplication for
issue creation and workspace cards.
- Added focused UI helper and dialog regression tests.

## Verification

- `pnpm exec vitest run ui/src/lib/reusable-execution-workspaces.test.ts
ui/src/lib/workspace-routines.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx
ui/src/lib/company-routes.test.ts`
- Screenshots were not captured in this PR split; the visible flow is
covered by focused component/helper tests and should get browser QA in
the follow-up issue.

## Risks

- Medium risk: this adds a new workspace detail tab and routine-run
path. It is isolated to workspace-scoped routines and uses existing
routine run APIs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:58:15 -05:00
Dotta 570a4206da [codex] Recover productive terminal continuations (#4956)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-scoped heartbeat runs
> - Recovery logic decides whether in-progress work still has a live
path after a terminal run
> - A productive terminal continuation can still leave an issue stranded
when no active run or wake remains
> - Treating that state as healthy leaves work stuck despite evidence
that more action is needed
> - This pull request re-enqueues recovery for productive terminal
continuations that left no live path
> - The benefit is fewer silently stranded in-progress issues after
agents make partial progress

## What Changed

- Reclassified successful-but-productive terminal continuations as
recoverable when no live path remains.
- Enqueue a follow-up recovery wake with the original run id and
continuation metadata.
- Added regression tests covering productive terminal continuation
recovery and advanced liveness handoff.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/run-continuations.test.ts`

## Risks

- Medium risk: recovery may schedule one more follow-up where Paperclip
previously considered the work observed. The existing uniqueness,
budget, and escalation checks still constrain retry loops.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:57:23 -05:00
Dotta 3cd26a78fc [codex] Surface live run comment context (#4957)
## Thinking Path

> - Paperclip orchestrates AI agents through issue comments and
heartbeat runs
> - The board UI needs to distinguish a comment that triggered a live
run from comments queued after that run started
> - The run payload already stores comment context, but active-run API
responses did not expose the ids the UI needs
> - Without those ids, the triggering comment can flash as queued while
the agent is already responding to it
> - This pull request exposes live-run comment context and teaches the
optimistic comment helper to ignore the trigger comment
> - The benefit is clearer issue-chat state during comment-triggered
agent interruptions

## What Changed

- Added `contextCommentId` and `contextWakeCommentId` to active/live run
payloads.
- Threaded those ids through server routes, heartbeat summaries, UI API
types, and issue detail rendering.
- Updated optimistic comment classification to avoid marking the
triggering comment as queued.
- Added server and UI regression coverage.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-live-run-routes.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low-to-medium risk: adds optional fields to existing run payloads.
Existing consumers should ignore unknown fields, and UI handling is
null-safe.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:44:11 -05:00
Dotta e8275318ba [codex] Raise agent heartbeat concurrency default (#4954)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agent heartbeat settings control how much parallel work one employee
can run
> - The previous default of 5 concurrent runs was too restrictive for
active local agent teams
> - The shared default, heartbeat clamp, docs, and route/import/UI
expectations need to agree
> - This pull request raises the default heartbeat concurrency to 20
while keeping explicit headroom up to 50 for power users
> - The benefit is higher throughput for agent teams without each new
agent needing manual runtime config edits

## What Changed

- Raised `AGENT_DEFAULT_MAX_CONCURRENT_RUNS` from 5 to 20.
- Raised the heartbeat service max clamp from 10 to 50, keeping the new
default below the ceiling.
- Updated V1 implementation docs and tests that assert default
imported/exported runtime config.
- Updated the new-agent UI runtime config test to assert the shared
default constant instead of duplicating the numeric value.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/company-portability.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`

## Risks

- Medium risk: new agents can consume more local execution capacity by
default. The heartbeat scheduler still respects configured max
concurrency and budget/pause controls, and operators can lower or raise
the per-agent cap within the `1..50` clamp.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:42:56 -05:00
Dotta 42a299fb9d [codex] Bound productivity review recovery loops (#4948)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat/productivity review subsystem detects when assigned
work is likely stuck or churning.
> - Productivity reviews are useful, but repeated reconciliation can
create noisy refresh comments or repeated review issues around the same
source issue.
> - That makes manager follow-up harder because the signal can get
buried under duplicate review activity.
> - This pull request bounds productivity review refreshes and creation
loops while preserving the existing escalation path.
> - The benefit is a quieter recovery loop that still surfaces stuck or
high-churn work for manager attention.

## What Changed

- Added refresh throttling for open productivity review issues,
including a one-hour default interval and a maximum of three refresh
comments per open review.
- Added a rolling 24-hour creation cap so completed/closed reviews
cannot immediately recreate review issues indefinitely for the same
source issue.
- Excluded cancelled productivity reviews from the creation cap so
manager cancellations do not silently suppress future legitimate
reviews.
- Preserved productivity review timestamps in deterministic test paths
and added targeted coverage for immediate refresh suppression, refresh
caps, creation caps, and cancelled-review exclusion.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- `pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- Greptile Review: 5/5 on commit
`bcf25832d0ffae25890b2ee7eed112d1c2d114fe` with review threads resolved.
- GitHub PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`Greptile Review`, and `security/snyk (cryppadotta)`.
- Verified the branch is rebased onto `public-gh/master` with no
conflicts.
- Verified the diff does not include `pnpm-lock.yaml`, database schema
changes, or migrations.

## Risks

- Low-to-medium risk: this changes automation cadence for productivity
reviews. A truly stuck issue may receive fewer repeated refresh
comments, but the original review issue remains open and assigned for
manager action.
- No migration risk: this is server logic and tests only.

> Checked [`ROADMAP.md`](ROADMAP.md) for overlapping planned core work;
this is a targeted recovery-loop fix and does not add a new roadmap
feature.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family, tool-using software
engineering mode. Exact context window is not exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable; server-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable; no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 08:32:04 -05:00
Dotta 4272c1604d Add ACPX local adapter runtime (#4893)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a control plane
that can start, supervise, and recover agent runs.
> - Local adapters are the bridge between Paperclip issues and concrete
agent runtimes such as Claude, Codex, and other ACP-compatible tools.
> - The roadmap calls out broader “bring your own agent” and claw-style
agent support, and ACPX gives Paperclip one path to normalize multiple
ACP agents behind a single adapter.
> - The branch needed to become one reviewable PR against current
`paperclipai/paperclip:master`, without carrying stale base conflicts or
generated lockfile churn.
> - This pull request adds an experimental built-in `acpx_local`
adapter, integrates it through the server/CLI/UI adapter surfaces, and
adds regression coverage for runtime execution, skill sync, stream
parsing, diagnostics, and log redaction.
> - The benefit is that Paperclip can run Claude/Codex/custom ACP agents
through ACPX while keeping operator configuration, skills, logging, and
transcript rendering inside the existing adapter model.

## What Changed

- Added `@paperclipai/adapter-acpx-local` with server execution, config
schema, ACPX session handling, CLI formatting, UI config helpers, and
stdout parsing.
- Registered `acpx_local` across CLI, server, shared constants, UI
adapter metadata, adapter capabilities, and agent creation/editing
surfaces.
- Added ACPX runtime execution support with persistent sessions,
local-agent JWT environment handling, skill snapshots, runtime skill
materialization, and isolation/security regressions.
- Added ACPX adapter diagnostics and marked the adapter experimental in
the UI.
- Added command/env secret redaction for resolved command metadata in
adapter-utils, server event storage, and the Agent Detail invocation UI.
- Added Storybook coverage for ACPX config, transcript rendering, and
skill states, plus PR screenshots under `docs/pr-screenshots/pap-2944/`.
- Rebased the branch onto current `public-gh/master`; `pnpm-lock.yaml`
is intentionally not included and there are no migration/schema changes.

## Verification

- `pnpm exec vitest run
packages/adapters/acpx-local/src/server/execute.test.ts
packages/adapters/acpx-local/src/server/test.test.ts
packages/adapters/acpx-local/src/cli/format-event.test.ts
packages/adapters/acpx-local/src/ui/parse-stdout.test.ts
packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/redaction.test.ts
server/src/__tests__/acpx-local-execute.test.ts
server/src/__tests__/acpx-local-skill-sync.test.ts
server/src/__tests__/acpx-local-adapter-environment.test.ts
server/src/__tests__/adapter-routes.test.ts
server/src/__tests__/agent-skills-routes.test.ts
ui/src/adapters/metadata.test.ts` — 12 files, 87 tests passed.
- `pnpm --filter @paperclipai/adapter-acpx-local typecheck` — passed.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Confirmed PR diff does not include `pnpm-lock.yaml`, database schema
files, or migrations.

Screenshots:

![ACPX Claude skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-light.png?raw=true)
![ACPX Claude skills
dark](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-dark.png?raw=true)
![ACPX custom skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-custom-light.png?raw=true)

## Risks

- Medium risk: this introduces a new built-in adapter package and
touches runtime execution, adapter registration, agent config, skills,
and transcript rendering.
- ACPX and ACP agent behavior can vary by installed tool versions; the
adapter is marked experimental to set operator expectations.
- `pnpm-lock.yaml` is excluded per repository PR policy, so dependency
lock refresh must be handled by the repo’s automation or maintainers.
- No database migration risk: no schema or migration files changed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with repository tool use,
shell execution, git operations, and local verification. Exact hosted
context window was not exposed in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 19:57:05 -05:00
Dotta ad5432fece [codex] Harden issue recovery reliability (#4875)
## Thinking Path

> - Paperclip is the control plane for autonomous agent companies, so
non-terminal issue state must always have a clear live, waiting, or
recovery owner.
> - This change stays inside the server reliability and liveness
subsystem for assigned issue recovery, blocker attention, and live-run
polling.
> - Closed PR #4860 mixed this reliability work with separate
mutation-boundary policy changes, which made review and merge risk too
broad.
> - [PAP-2981](/PAP/issues/PAP-2981) asked for a replacement PR
containing only the remaining reliability slice and explicitly excluding
user-assignment and execution-policy restrictions.
> - Follow-up review also split `advanced` run-liveness continuation
behavior out of this PR so it can be reviewed separately.
> - The implementation hardens repeated recovery escalation, expands
blocker-attention coverage for explicit waiting and recovery paths, and
caps company live-run polling defaults.
> - The benefit is a smaller reliability PR that improves liveness
behavior without changing agent/user mutation authorization boundaries
or `advanced` continuation semantics.

## What Changed

- Avoid repeated liveness escalation updates when the source issue is
already blocked by the same open escalation.
- Treat open liveness escalation recovery issues, their source issues,
and their leaf blockers as covered waiting paths in blocker attention.
- Cap default company live-run polling at 50 rows for both `minCount`
and `limit`, including explicit zero values, to avoid unbounded
responses.
- Preserve the existing behavior where succeeded `advanced` runs are
considered productive/healthy for stranded-work recovery and are not
actionable bounded run-liveness continuations.
- Added focused server coverage for recovery dedupe, blocker attention,
liveness escalation, run continuations, and live-run polling.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/agent-live-run-routes.test.ts`
- Result: 5 files passed, 63 tests passed.
- `pnpm --filter @paperclipai/server typecheck`
- Result: passed.
- No UI changes; screenshots are not applicable.

## Risks

- Recovery and blocker-attention classification changes can affect which
blocked chains are shown as covered versus needing attention.
- Live-run polling now treats omitted, invalid, or non-positive `limit`
/ `minCount` values as the capped default of 50.
- `advanced` run-liveness continuation behavior is intentionally
excluded from this PR and split for separate review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 16:44:28 -05:00
Dotta a3de1d764d Add cheap model profiles for local adapters (#4881)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where
adapters are the boundary between the board, agents, and execution
runtimes.
> - Local adapters currently expose a primary runtime configuration, but
operators often need a cheaper model lane for routine or low-risk work.
> - That cheap lane has to stay adapter-owned: runtime profile settings
should not mutate the primary adapter config or bypass existing
auth/secret mediation.
> - Issue creation also needs an ergonomic way to request primary,
cheap, or custom model behavior for a selected assignee.
> - This pull request adds a first-class `cheap` model profile contract
across adapter capabilities, heartbeat config resolution, agent
configuration, and issue creation.
> - The benefit is cheaper task execution can be configured and
requested explicitly while preserving adapter boundaries, secret
handling, and audit visibility.

## What Changed

- Added adapter model-profile capability metadata and a `cheap` profile
contract for supported local adapters.
- Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during
heartbeat config resolution, including requested/applied/fallback run
metadata.
- Added agent configuration UI for cheap model profile settings without
writing those settings into primary `adapterConfig`.
- Added New Issue assignee model lane controls for Primary / Cheap /
Custom and request payload handling.
- Added run ledger profile badges and Storybook stories for the new
cheap-lane UI states.
- Added tests for validators, heartbeat model profile application,
permission/secret mediation, UI payload helpers, and run ledger
rendering.
- Added committed UI verification screenshots under
`docs/pr-screenshots/pap-2837/`.
- Addressed Greptile review feedback around cheap-profile defaults,
shared profile types, and fallback test data.

## Verification

Local:

- `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/heartbeat-model-profile.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts
ui/src/lib/issue-assignee-overrides.test.ts
ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103
tests.
- `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts
ui/src/components/IssueRunLedger.test.tsx` — passed after
Greptile/rebase follow-up, 2 files / 17 tests.
- `pnpm --filter @paperclipai/ui typecheck` — passed after
Greptile/rebase follow-up.
- `pnpm -r typecheck` — passed.
- `pnpm build` — passed.
- `pnpm test:run` — did not complete successfully in this local
worktree: it stopped in pre-existing `@paperclipai/adapter-utils`
sandbox/SSH fixture suites outside this PR diff. Failures were 5s local
timeouts plus `git init -b` unsupported by this machine's Git 2.21.0.
The branch-specific targeted suites above passed.
- Branch was fetched/rebased onto `public-gh/master`; `git rev-list
--left-right --count public-gh/master...HEAD` reports `0 9`.

Remote PR checks on latest head
`e30bf399146451c86cee98ed528d51d33fa5af5a`:

- `policy` — passed.
- `verify` — passed.
- `e2e` — passed.
- `Greptile Review` — passed, confidence score 5/5; Greptile review
threads resolved.
- `security/snyk (cryppadotta)` — passed.

Screenshots:

- [New issue cheap lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png)
- [New issue custom lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png)
- [New issue unsupported adapter
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png)
- [Run ledger model profile badges
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png)
- Mobile variants are also in `docs/pr-screenshots/pap-2837/`.

## Risks

- Medium: heartbeat config mediation now merges runtime model profiles
into adapter configs, so adapter secret normalization and host-command
restrictions must keep covering nested config paths.
- Medium: the UI adds another issue creation choice; unsupported
adapters must keep hiding the cheap lane and preserve primary behavior.
- Low migration risk: no database migration is included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use
and command execution. Exact served model/context window was not exposed
by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 15:32:04 -05:00
Dotta c4269bab59 Add workflow interaction cancellation and issue cost summaries (#4862)
## Thinking Path

> - Paperclip coordinates work through issue-thread interactions, run
history, and cost telemetry.
> - Operators need workflow prompts to be cancellable and costs to be
visible at the issue level.
> - The earlier rollup mixed this workflow/cost work with database
backups, reliability recovery, thread scaling, and settings polish.
> - This pull request isolates the interaction and cost surfaces into a
reviewable slice.
> - The backend now supports cancelling pending question interactions
and summarizing issue-tree costs.
> - The UI component layer can render cancelled questions and interleave
activity with run ledger rows.

## What Changed

- Added `cancelled` as an issue-thread interaction status and result
shape for question interactions.
- Added the board-only `POST
/issues/:id/interactions/:interactionId/cancel` route and service
implementation.
- Added issue-tree cost summary support in the cost service and
`/issues/:id/cost-summary` API route.
- Extended shared cost exports and UI API/query keys for issue cost
summaries.
- Updated `IssueThreadInteractionCard` and `IssueRunLedger` components
for cancelled questions, issue cost surfaces, and activity/run
interleaving.
- Added focused server and component regression coverage.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- Result: 4 test files passed, 45 tests passed.
- UI screenshots not included because this PR updates reusable
components and API surfaces without wiring a new page-level layout.

## Risks

- Adds a new interaction terminal status; clients that switch
exhaustively on interaction status may need to handle `cancelled`.
- Issue-tree cost summaries use recursive issue traversal and should be
watched on unusually large issue trees.
- Page-level issue detail wiring is intentionally left to the board
QoL/issue-detail branch to keep this PR narrow.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:57:25 -05:00
Devin Foley c0ce35d1fb Improve E2B plugin configuration UX and fix execution timeouts (#4802)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - E2B is a sandbox provider plugin that runs agent code in isolated
cloud environments
> - Operators configure E2B through the plugin settings page
> - But the E2B API key configuration was unclear — the settings field
description didn't explain that pasted keys are auto-saved as company
secrets, and the fallback to the host `E2B_API_KEY` variable wasn't
documented
> - Additionally, long-running E2B sandbox commands were timing out
because the plugin environment RPC driver used a fixed timeout, and
environment commands competed for the single foreground command slot
> - This PR clarifies the E2B configuration UX, fixes RPC timeouts for
plugin environment execution, and runs E2B environment commands in
background mode to avoid blocking the foreground slot
> - The benefit is clearer E2B setup for operators and more reliable
sandbox command execution

## What Changed

- Updated E2B plugin manifest and settings UI to clarify API key
configuration — field description now explains that pasted keys are
saved as company secrets and documents the `E2B_API_KEY` host fallback
- Added test coverage for the plugin settings page rendering
- Fixed `plugin-environment-driver.ts` to pass the configured timeout
through to RPC calls instead of using a hardcoded default
- Updated `environment-runtime.ts` to propagate timeout from the
environment lease to the plugin driver
- Changed E2B sandbox command execution to use background handles so
long-running agent commands don't block the foreground slot needed by
the callback bridge

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to plugin settings, verify E2B API key field shows
the updated description text
- Manual: run an E2B-backed agent task with a long-running command,
verify it completes without RPC timeout

## Risks

- Low risk. Configuration UX change is cosmetic. The timeout fix passes
an existing value through instead of dropping it. Background command
execution is a behavioral change but only affects E2B sandbox commands —
the foreground slot is still available for bridge health checks.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 17:12:30 -07:00
Devin Foley a4ac6ff133 Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end

## What Changed

- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export

## Verification

- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge

## Risks

- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
Devin Foley 4cf612a92d Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments that are leased, and the server
manages runtime state, workspace configuration, and plugin lifecycle
> - Several edge cases caused failures during concurrent operations: a
race condition in runtime state insertion could produce duplicate-key
errors, reused workspaces didn't sync their configuration when the
parent issue was updated, sandbox provider plugins could be queried
before registration completed, and orphaned environment leases from
failed runs were never released
> - This PR fixes these four runtime/environment issues
> - The benefit is more reliable concurrent agent execution and proper
resource cleanup

## What Changed

- `services/heartbeat.ts`: Fixed a race condition where concurrent
runtime state inserts could fail with a duplicate-key error by using an
upsert pattern
- `services/issues.ts`: Sync reused workspace configuration when an
issue is updated, so the workspace reflects the latest issue state
- `services/environment-runtime.ts`: Fixed a startup race where sandbox
provider plugins could be queried before registration completed, by
awaiting plugin readiness before resolving environment drivers
- `services/heartbeat.ts`: Release environment leases for orphaned runs
that lost their process without cleanup

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
runtime state upsert and process recovery lease cleanup
- `pnpm typecheck` — clean
- Manual: trigger concurrent agent runs to verify no duplicate-key
failures; verify orphaned leases are released after process loss

## Risks

- Low risk. The runtime state upsert changes insert-to-upsert behavior,
which could mask a legitimate duplicate if two different runs produce
the same key — but this is prevented by the run ID being part of the
key. The plugin startup await is bounded by the existing registration
timeout.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:10 -07:00
Dotta 6b7f6ce4b8 [codex] Split PR #4692 UI/QoL updates (#4701)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane.
> - The affected surface is the board UI for issue threads, issue lists,
routines, dialogs, navigation, and issue review indicators.
> - Closed PR #4692 bundled backend, schema, docs, workflow, and UI/QoL
work into one oversized change set.
> - Greptile could not keep reviewing that broad PR because it exceeded
the 100-file review limit and mixed unrelated concerns.
> - This pull request extracts the UI/QoL slice into a fresh branch
under the review limit while leaving workflow and lockfile churn out.
> - The benefit is a focused review path for the board UI performance
and workflow improvements without reopening the oversized PR.

## What Changed

- Added long issue-thread virtualization, scroll-container binding,
anchor preservation, latest-comment jump targeting, and related
regression/perf fixtures.
- Improved issue list scalability with scroll-based loading, server
offset parameters, and pagination-focused UI tests.
- Reduced new issue dialog typing churn and split dialog action
subscriptions so broad layout/nav surfaces avoid unnecessary renders.
- Added routine variables help and routine description mention options
for users, agents, and projects.
- Added productivity review badge/link UI and fixed the badge to use
Paperclip's company-prefixed router link.
- Kept the split PR below Greptile's review limit and excluded
`.github/workflows/pr.yml` and `pnpm-lock.yaml`.

## Verification

- `pnpm install --no-frozen-lockfile` in the clean worktree to install
`@tanstack/react-virtual` locally without committing lockfile churn.
- `pnpm --filter @paperclipai/ui exec vitest run --config
vitest.config.ts src/components/IssueChatThread.test.tsx
src/components/IssuesList.test.tsx
src/components/NewIssueDialog.test.tsx src/pages/Routines.test.tsx
src/pages/Issues.test.tsx` passed: 5 files, 83 tests.
- `pnpm --filter @paperclipai/ui typecheck` passed.
- `git diff --check origin/master..HEAD` passed.
- Split-scope checks: 53 changed files; no `.github/workflows/pr.yml`;
no `pnpm-lock.yaml`.
- Screenshots were not captured in this heartbeat; the changes are
primarily virtualization, routing, pagination, and editor behavior
covered by focused regression tests.

## Risks

- Moderate UI risk because issue-thread virtualization changes scroll
behavior on long conversations; regression tests cover anchor jumps,
latest-comment targeting, row metadata, and short-thread fallback.
- Moderate integration risk because the issue-list offset parameter and
productivity review field depend on matching API behavior.
- Dependency risk: the UI package adds `@tanstack/react-virtual` while
repository policy keeps `pnpm-lock.yaml` out of PRs, so CI must resolve
dependency changes through the repo's normal lockfile policy.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local repository and
GitHub workflow. Exact runtime context window was not exposed by the
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:18:58 -05:00
Dotta 1991ec9d6f [codex] Split backend control-plane QoL slice (#4700)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
backend task ownership, recovery, review visibility, and company-scoped
limits need to stay enforceable without UI-only coupling.
> - Closed PR #4692 bundled those backend changes with UI workflow,
docs, skills, workflow, and lockfile churn.
> - PAP-2694 asks for a clean backend/control-plane slice from that
closed branch.
> - This branch starts from current `master` and mines only the `cli`,
`packages/db`, `packages/shared`, and `server` contracts/tests needed
for the backend behavior.
> - It explicitly excludes UI workflow/performance work,
`.github/workflows/pr.yml`, `pnpm-lock.yaml`, docs, skills,
package-script, adapter UI build-config, and perf fixture script
changes; the only UI files are fixture/test updates required by the
tightened shared `Company` contract.
> - The benefit is a smaller reviewable PR that preserves the
control-plane fixes while staying under Greptile s 100-file review
limit.

## What Changed

- Added company-scoped attachment-size limits through DB
schema/migrations, shared company portability contracts, CLI
import/export coverage, and server attachment upload enforcement.
- Added productivity review service/API behavior for no-comment streak,
long-active, and high-churn review issues, including request-depth
clamping and issue summary exposure.
- Hardened issue ownership and recovery/control-plane paths: peer-agent
mutation denial, issue tree pause/resume behavior, stranded recovery
origins, and related activity/test coverage.
- Preserved related backend contract updates for routine timestamp
variables and managed agent instruction bundles because they live in
shared/server contracts from the source branch.
- Addressed Greptile feedback by making `Company.attachmentMaxBytes`
non-optional, simplifying review request-depth clamping, fixing the
migration final newline, and enforcing the process-level attachment cap
as the final ceiling for uploads.
- Added minimal company fixtures needed for repo-wide typecheck/build
and kept the PR to 66 changed files with forbidden/non-slice paths
excluded.

## Verification

- `pnpm install --frozen-lockfile`
- `git diff --check origin/master..HEAD`
- `git diff --name-only origin/master..HEAD | wc -l` -> 66 files
- `git diff --name-only origin/master..HEAD -- .github/workflows/pr.yml
pnpm-lock.yaml package.json doc skills .agents scripts
packages/adapters` -> no output
- `pnpm exec vitest run --config vitest.config.ts
packages/shared/src/validators/issue.test.ts
packages/shared/src/routine-variables.test.ts
packages/shared/src/adapter-types.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
cli/src/__tests__/company.test.ts
server/src/__tests__/productivity-review-service.test.ts
server/src/__tests__/issue-tree-control-service.test.ts
server/src/__tests__/issue-tree-control-routes.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/issue-attachment-routes.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/issues-service.test.ts` -> 12 files, 147 tests
passed
- `pnpm exec vitest run --config vitest.config.ts
cli/src/__tests__/company-delete.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
server/src/__tests__/productivity-review-service.test.ts` -> 3 files, 18
tests passed
- `pnpm exec vitest run --config vitest.config.ts
server/src/__tests__/issue-attachment-routes.test.ts` -> 1 file, 6 tests
passed
- `pnpm --filter @paperclipai/db typecheck && pnpm --filter
@paperclipai/shared typecheck && pnpm --filter @paperclipai/server
typecheck && pnpm --filter paperclipai typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck && pnpm --filter
@paperclipai/ui build`

## Risks

- Includes migrations `0073_shiny_salo.sql` and
`0074_striped_genesis.sql`; merge ordering matters if another PR adds
migrations first.
- This is intentionally backend-only apart from fixture/test updates
forced by shared type correctness; UI affordances from PR #4692 are not
present here and should land in separate UI slices.
- The worktree install emitted plugin SDK bin-link warnings for unbuilt
plugin packages, but the targeted tests and package typechecks completed
successfully.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected; check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:46:45 -05:00
Dotta f88f538e6d Keep manual routine runs visible in the runner inbox (#4615)
## Thinking Path

> - Paperclip coordinates recurring agent work through scheduled and
manual routines.
> - Manual routine runs are board-initiated work and should stay visible
to the human who kicked them off.
> - Routine execution issues are agent-assigned, so they can be filtered
away from a board user's inbox unless the user is recorded as touching
the work.
> - Coalesced or skipped active routine runs have the same visibility
problem because they reuse an existing live issue.
> - This pull request carries the manual runner actor into routine
dispatch and touches the linked issue for that user's inbox.
> - The benefit is that manually triggered routine work stays
discoverable by the operator who started it.

## What Changed

- Passed the board or agent actor from the routine run route into the
routine service.
- Recorded manual board runners as `createdByUserId` on fresh routine
execution issues.
- Touched coalesced or skipped active routine issues for the manual
runner by updating read state and clearing that user's inbox archive.
- Added route and service regressions for manual routine run actor
propagation and inbox visibility.

## Verification

- `pnpm exec vitest run server/src/__tests__/routines-routes.test.ts
server/src/__tests__/routines-service.test.ts`

## Risks

- Low risk: the change is scoped to manual routine runs and only updates
issue attribution/read-state metadata for the initiating actor.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:24 -05:00
Dotta 68c37660f0 Dispatch assigned todo work during recovery sweeps (#4614)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies.
> - Agent assignments must reliably turn into heartbeat work without
board operators manually nudging stuck tasks.
> - The stranded-assignment recovery sweep already handles failed or
lost runs.
> - But assigned `todo` issues with no prior run could sit idle because
there was nothing to retry or recover.
> - This pull request dispatches those never-started assigned todos as
normal assignment wakes.
> - The benefit is that recovery fixes missed initial dispatches without
creating unnecessary recovery issues.

## What Changed

- Added an initial assigned-todo dispatch path to the recovery service
when an assigned `todo` issue has no heartbeat run yet.
- Reused invocation budget hard-stop checks before dispatching or
requeueing recovery work.
- Counted `assignmentDispatched` in startup/scheduled recovery logs.
- Added heartbeat recovery regressions for first dispatch, duplicate
queued wake prevention, budget-blocked skips, and paused-agent skips.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- Low to medium risk: this changes liveness recovery behavior for
assigned `todo` issues, but it stays on the existing assignment wake
path and skips paused or budget-blocked agents.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 20:02:44 -05:00
Dotta 7a9b3a6037 [codex] Harden recovery issue handling (#4600)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The control plane must recover stranded agent work without creating
new operational loops
> - Stranded recovery issues can themselves fail, and exposing raw retry
errors in comments can leak sensitive adapter details
> - New local companies also should not force a hire-approval gate
unless operators enable that policy
> - This pull request hardens recovery issue handling, redacts retry
failure details in issue copy, preserves `maxConcurrentRuns: 1`, and
flips new-hire approval to an opt-in default
> - The benefit is safer automatic recovery and smoother default company
setup without hidden migration conflicts

## What Changed

- Added migration `0071_default_hire_approval_off` and updated company
schema/import/export/docs so hire approvals default off and serialize
only when enabled.
- Added migration `0072_large_sandman` with a partial unique index
preventing duplicate active stranded recovery issues for the same source
issue.
- Blocked failed `stranded_issue_recovery` issues in place instead of
creating nested recovery issues.
- Redacted latest retry failure details from recovery issue comments
while still linking reviewers to run evidence.
- Allowed `maxConcurrentRuns: 1` to be honored by heartbeat concurrency
normalization.
- Added focused regression coverage for recovery recursion, redaction,
migration ordering, and concurrency behavior.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-portability.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-process-recovery.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
--pool=forks --poolOptions.forks.isolate=true` exits 0, but this host
skipped the embedded Postgres tests with the existing init guard.

## Risks

- Migration risk is low but this PR intentionally owns both new
migrations to avoid separate PR migration-journal conflicts.
- Recovery comments now require operators to inspect linked run evidence
for details instead of reading raw errors inline.
- The hire approval default changes behavior for newly created/imported
companies only; existing persisted company settings are not changed
except by the SQL default for future rows.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 15:02:47 -05:00
Dotta 6ccf80bcf2 [codex] Reject stale company skill refreshes (#4601)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the reusable agent capability layer
> - Skill inventory refresh work can outlive the company it was
requested for
> - Without an explicit company existence check, stale refreshes can
continue into bundled/local skill cleanup for deleted or missing
companies
> - This pull request makes company-skill listing fail fast when the
company no longer exists
> - The benefit is clearer API behavior and less stale background work
against missing company scope

## What Changed

- Added a company existence check before `companySkillService.list()`
refreshes bundled and local-path skill state.
- Added regression coverage asserting missing companies return `404
Company not found`.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-service.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.

## Risks

- Low risk. Existing callers for valid companies are unchanged.
- Missing-company callers now receive an explicit 404 instead of
continuing refresh work.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 13:19:38 -05:00
Dotta fda296ee4f [codex] Add configurable liveness auto-recovery controls (#4587)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat liveness recovery decides when stalled issue trees need
manager-visible follow-up.
> - Automatic recovery issue creation is useful, but operators need
instance-level controls for how aggressive it is.
> - Without controls, recovery behavior is harder to tune for local
development, production operations, and noisy edge cases.
> - This pull request adds configurable liveness auto-recovery settings
across shared contracts, API routes, services, and the instance
experimental settings UI.
> - The benefit is that operators can keep liveness findings advisory or
enable bounded recovery automation with explicit intervals and lookback
windows.

## What Changed

- Added shared types and validators for liveness auto-recovery settings.
- Extended instance settings routes and services to persist and validate
the new controls.
- Wired heartbeat/recovery services to honor enablement, minimum
interval, and lookback settings.
- Added UI controls for liveness recovery under instance experimental
settings.
- Covered the new server behavior with instance settings and liveness
escalation tests.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/instance-settings-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Moderate behavioral risk because recovery automation timing changes
when enabled; defaults keep existing advisory behavior unless the
setting is turned on.
- No database migration in this PR; settings are stored through the
existing instance settings path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:46:44 -05:00
Dotta 82e257c7ba Cancel stale queued heartbeats when issue graph changes (PAP-2314) (#4534)
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:17:38 -05:00
Devin Foley 54ab0d24cd Fix disappearing issue comments (#4557)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue detail
pages are a primary surface for understanding agent work and human
feedback.
> - The relevant subsystem here is the issue comments/chat experience
across the React issue detail page and the server comment pagination
API.
> - Long issue threads were only surfacing the newest page of comments
at first render, which hid earlier human and agent messages behind extra
pagination.
> - The first UI fix exposed that the descending cursor path on the
server could also fail for older-page fetches, leaving the chat tab
stuck on an infinite "Loading earlier comments..." state.
> - This needed to be addressed in both layers so the chat tab can
surface earlier conversation history without manual recovery and without
server errors.
> - This pull request auto-loads earlier comment pages in the issue
detail chat view and fixes the descending cursor predicate used by issue
comment pagination.
> - The benefit is that long-running issues like `PAPA-103` now show the
missing conversation history near the top of the chat surface instead of
hiding it or failing to load it.

## What Changed

- Auto-load earlier issue comment pages in the issue detail chat tab
until the thread reaches a 150-comment cap or there are no older
comments left.
- Add UI-side guard logic and regression coverage for optimistic issue
comment pagination so the autoload behavior stops cleanly.
- Replace the raw SQL descending cursor predicate in
`issueService.listComments` with typed Drizzle comparisons for the
`(createdAt, id)` anchor tuple.
- Add a server regression test that paginates earlier comments in
descending order from an anchor comment.
- Smoke-test the exact previously failing seeded `PAPA-103` cursor path
on the isolated dev instance used for review.

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- Manual smoke against seeded `PAPA-103` data on the isolated dev
server:
- `GET /api/issues/PAPA-103/comments?order=desc&limit=50` returns `200`
- `GET
/api/issues/PAPA-103/comments?after=765d3609-edc6-4d11-a8fe-d466affbe85d&order=desc&limit=50`
now returns `200` with 50 comments instead of `500`

## Risks

- Moderate UI/perf risk on very large threads because the chat tab now
prefetches multiple earlier pages on mount; the cap is intentionally
limited to 150 comments to bound that work.
- Low API risk because the server fix only changes the cursor predicate
construction for anchor-based comment pagination, but any mistake there
would affect older-comment paging order.

> I checked `ROADMAP.md` before opening this PR and this bug fix does
not duplicate planned core work.

## Model Used

- OpenAI Codex coding agent in the Paperclip local adapter environment.
The exact backend model ID and context window were not exposed
in-session. Tool-assisted workflow included shell execution, git/GitHub
CLI, local test execution, and targeted code edits.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 16:23:53 -07:00
Dotta df425fde96 Present ordered sub-issues as a workflow checklist (#4523)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators use issue detail pages and child issue lists to understand
multi-step execution plans.
> - Ordered sub-issues currently read like a flat table, so dependency
chains and current next steps are harder to scan.
> - The branch work adds a workflow-oriented presentation for child
issues without changing the single-assignee task model.
> - This pull request makes ordered sub-issues read more like a progress
checklist while preserving normal issue list controls.
> - The benefit is that operators can see completed steps, active work,
blocked follow-ups, and dependency order at a glance.

## What Changed

- Added workflow sorting utilities and tests for dependency-aware child
issue ordering.
- Added sub-issue progress summary, checklist numbering, current-step
affordances, blocker context, and done-state de-emphasis in the issue
list UI.
- Wired issue detail sub-issue panels to use the workflow sort/progress
checklist presentation.
- Updated issue service behavior/tests for child issue ordering inputs
used by the UI.
- Added a Storybook visual review fixture and screenshot helper for the
sub-issue workflow checklist surface.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issues-service.test.ts
ui/src/components/IssueRow.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/pages/IssueDetail.test.tsx
ui/src/lib/issue-detail-subissues.test.ts
ui/src/lib/workflow-sort.test.ts`
- Result: 6 test files passed, 55 tests passed, 34 embedded Postgres
issue-service tests skipped because `@embedded-postgres/darwin-x64` is
unavailable on this host.
- Visual review: generated Storybook screenshots from the existing local
Storybook server on port 6006 with `node
scripts/screenshot-subissues.mjs /tmp/pap-2189-subissues-screens
http://localhost:6006`.
- Screenshot artifacts:
- Desktop dark: ![Desktop
dark](doc/assets/pap-2189/desktop-1440x900-dark.png)
- Desktop light: ![Desktop
light](doc/assets/pap-2189/desktop-1440x900-light.png)
- Mobile dark: ![Mobile
dark](doc/assets/pap-2189/mobile-390x844-dark.png)
- Mobile light: ![Mobile
light](doc/assets/pap-2189/mobile-390x844-light.png)
- Local Storybook note: starting a second Storybook process selected
port 6008 because 6006 was occupied, then Vite failed with an esbuild
host/binary version mismatch (`0.25.12` host vs `0.27.3` binary). The
already-running Storybook server on 6006 served the fixture successfully
for screenshots.

## Risks

- Medium UI risk: the issue list now has additional sub-issue-specific
visual states, so dense lists should be checked for spacing and
scanability.
- Low ordering risk: workflow sorting is covered by focused unit tests,
but unusual dependency topologies may still need reviewer attention.
- No migration risk: this PR does not add database migrations or touch
`pnpm-lock.yaml`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window is runtime-provided and not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 07:36:49 -05:00
Devin Foley 5bd0f578fd Generalize sandbox provider core for plugin-only providers (#4449)
## Thinking Path

> - Paperclip is a control plane, so optional execution providers should
sit at the plugin edge instead of hardcoding provider-specific behavior
into core shared/server/ui layers.
> - Sandbox environments are already first-class, and the fake provider
proves the built-in path; the remaining gap was that real providers
still leaked provider-specific config and runtime assumptions into core.
> - That coupling showed up in config normalization, secret persistence,
capabilities reporting, lease reconstruction, and the board UI form
fields.
> - As long as core knew about those provider-shaped details, shipping a
provider as a pure third-party plugin meant every new provider would
still require host changes.
> - This pull request generalizes the sandbox provider seam around
schema-driven plugin metadata and generic secret-ref handling.
> - The runtime and UI now consume provider metadata generically, so
core only special-cases the built-in fake provider while third-party
providers can live entirely in plugins.

## What Changed

- Added generic sandbox-provider capability metadata so plugin-backed
providers can expose `configSchema` through shared environment support
and the environments capabilities API.
- Reworked sandbox config normalization/persistence/runtime resolution
to handle schema-declared secret-ref fields generically, storing them as
Paperclip secrets and resolving them for probe/execute/release flows.
- Generalized plugin sandbox runtime handling so provider validation,
reusable-lease matching, lease reconstruction, and plugin worker calls
all operate on provider-agnostic config instead of provider-shaped
branches.
- Replaced hardcoded sandbox provider form fields in Company Settings
with schema-driven rendering and blocked agent environment selection
from the built-in fake provider.
- Added regression coverage for the generic seam across shared support
helpers plus environment config, probe, routes, runtime, and
sandbox-provider runtime tests.

## Verification

- `pnpm vitest --run packages/shared/src/environment-support.test.ts
server/src/__tests__/environment-config.test.ts
server/src/__tests__/environment-probe.test.ts
server/src/__tests__/environment-routes.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/sandbox-provider-runtime.test.ts`
- `pnpm -r typecheck`

## Risks

- Plugin sandbox providers now depend more heavily on accurate
`configSchema` declarations; incorrect schemas can misclassify
secret-bearing fields or omit required config.
- Reusable lease matching is now metadata-driven for plugin-backed
providers, so providers that fail to persist stable metadata may
reprovision instead of resuming an existing lease.
- The UI form is now fully schema-driven for plugin-backed sandbox
providers; provider manifests without good defaults or descriptions may
produce a rougher operator experience.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, and local code/test
inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 18:03:41 -07:00
Dotta 73fbdf36db Gate stale-run watchdog decisions by board access (#4446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The run ledger surfaces stale-run watchdog evaluation issues and
recovery actions
> - Viewer-level board users should be able to inspect status without
getting controls that the server will reject
> - The UI also needs enough board-access context to know when to hide
those decision actions
> - This pull request exposes board memberships in the current board
access snapshot and gates watchdog action controls for known viewer
contexts
> - The benefit is clearer least-privilege UI behavior around recovery
controls

## What Changed

- Included memberships in `/api/cli-auth/me` so the board UI can
distinguish active viewer memberships from operator/admin access.
- Added the stale-run evaluation issue assignee to output silence
summaries.
- Hid stale-run watchdog decision buttons for known non-owner viewer
contexts.
- Surfaced watchdog decision failures through toast and inline error
text.
- Threaded `companyId` through the issue activity run ledger so access
checks are company-scoped.
- Added IssueRunLedger coverage for non-owner viewers.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Medium-low risk. This is a UI gating change backed by existing server
authorization.
- Local implicit and instance-admin board contexts continue to show
watchdog decision controls.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:25:23 -05:00
Dotta 6916e30f8e Cancel stale retries when issue ownership changes (#4445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue execution is guarded by run locks and bounded retry scheduling
> - A failed run can schedule a retry, but the issue may be reassigned
before that retry becomes due
> - The old assignee's scheduled retry should not continue to hold or
reclaim execution for the issue
> - This pull request cancels stale scheduled retries when ownership
changes and cancels live work when an issue is explicitly cancelled
> - The benefit is cleaner issue handoff semantics and fewer stranded or
incorrect execution locks

## What Changed

- Cancel scheduled retry runs when their issue has been reassigned
before the retry is promoted.
- Clear stale issue execution locks and cancel the associated wakeup
request when a stale retry is cancelled.
- Avoid deferring a new assignee behind a previous assignee's scheduled
retry.
- Cancel an active run when an issue status is explicitly changed to
`cancelled`, while leaving `done` transitions alone.
- Added route and heartbeat regressions for reassignment and
cancellation behavior.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
  - `issue-comment-reopen-routes.test.ts`: 28 passed.
- `heartbeat-retry-scheduling.test.ts`: skipped by the existing embedded
Postgres host guard (`Postgres init script exited with code null`).
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Medium risk because this changes heartbeat retry lifecycle behavior.
- The cancellation path is scoped to scheduled retries whose issue
assignee no longer matches the retrying agent, and logs a lifecycle
event for auditability.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:24:13 -05:00
Dotta 5a0c1979cf [codex] Add runtime lifecycle recovery and live issue visibility (#4419) 2026-04-24 15:50:32 -05:00
Devin Foley 70679a3321 Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.

## What Changed

- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.

## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly

## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
Dotta 8f1cd0474f [codex] Improve transient recovery and Codex model refresh (#4383)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapter execution and retry classification decide whether agent work
pauses, retries, or recovers automatically
> - Transient provider failures need to be classified precisely so
Paperclip does not convert retryable upstream conditions into false hard
failures
> - At the same time, operators need an up-to-date model list for
Codex-backed agents and prompts should nudge agents toward targeted
verification instead of repo-wide sweeps
> - This pull request tightens transient recovery classification for
Claude and Codex, updates the agent prompt guidance, and adds Codex
model refresh support end-to-end
> - The benefit is better automatic retry behavior plus fresher
operator-facing model configuration

## What Changed

- added Codex usage-limit retry-window parsing and Claude extra-usage
transient classification
- normalized the heartbeat transient-recovery contract across adapter
executions and heartbeat scheduling
- documented that deferred comment wakes only reopen completed issues
for human/comment-reopen interactions, while system follow-ups leave
closed work closed
- updated adapter-utils prompt guidance to prefer targeted verification
- added Codex model refresh support in the server route, registry,
shared types, and agent config form
- added adapter/server tests covering the new parsing, retry scheduling,
and model-refresh behavior

## Verification

- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-claude-local
packages/adapters/claude-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-codex-local
packages/adapters/codex-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/adapter-model-refresh-routes.test.ts
server/src/__tests__/adapter-models.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts`

## Risks

- Moderate behavior risk: retry classification affects whether runs
auto-recover or block, so mistakes here could either suppress needed
retries or over-retry real failures
- Low workflow risk: deferred comment wake reopening is intentionally
scoped to human/comment-reopen interactions so system follow-ups do not
revive completed issues unexpectedly

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:40:40 -05:00
Dotta 7ad225a198 [codex] Improve issue thread review flow (#4381)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue detail is where operators coordinate review, approvals, and
follow-up work with active runs
> - That thread UI needs to surface blockers, descendants, review
handoffs, and reply ergonomics clearly enough for humans to guide agent
work
> - Several small gaps in the issue-thread flow were making review and
navigation clunkier than necessary
> - This pull request improves the reply composer, descendant/blocker
presentation, interaction folding, and review-request handoff plumbing
together as one cohesive issue-thread workflow slice
> - The benefit is a cleaner operator review loop without changing the
broader task model

## What Changed

- restored and refined the floating reply composer behavior in the issue
thread
- folded expired confirmation interactions and improved post-submit
thread scrolling behavior
- surfaced descendant issue context and inline blocker/paused-assignee
notices on the issue detail view
- tightened large-board first paint behavior in `IssuesList`
- added loose review-request handoffs through the issue
execution-policy/update path and covered them with tests

## Verification

- `pnpm vitest run ui/src/pages/IssueDetail.test.tsx`
- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-execution-policy.test.ts`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/lib/issue-tree.test.ts
ui/src/api/issues.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-comment-reopen-routes.test.ts -t "coerces
executor handoff patches into workflow-controlled review wakes|wakes the
return assignee with execution_changes_requested"`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issues-service.test.ts`

## Visual Evidence

- UI layout changes are covered by the focused issue-thread component
and issue-detail tests listed above. Browser screenshots were not
attachable from this automated greploop environment, so reviewers should
use the running preview for final visual confirmation.

## Risks

- Moderate UI-flow risk: these changes touch the issue detail experience
in multiple spots, so regressions would most likely show up as
thread-layout quirks or incorrect review-handoff behavior

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or documented the visual verification path
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 08:02:45 -05:00
Dotta 35a9dc37b0 [codex] Speed up company skill detail loading (#4380)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the control plane for distributing
reusable capabilities
> - Board flows that inspect company skill detail should stay responsive
because they are operator-facing control-plane reads
> - The existing detail path was doing broader work than needed for the
specific detail screen
> - This pull request narrows that company-skill detail loading path and
adds a regression test around it
> - The benefit is faster company skill detail reads without changing
the external API contract

## What Changed

- tightened the company-skill detail loading path in
`server/src/services/company-skills.ts`
- added `server/src/__tests__/company-skills-detail.test.ts` to verify
the detail route only pulls the required data

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-detail.test.ts`

## Risks

- Low risk: this only changes the company-skill detail query path, but
any missed assumption in the detail consumer would surface when loading
that screen

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 07:37:13 -05:00
Devin Foley e4995bbb1c Add SSH environment support (#4358)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation

## What Changed

- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.

## Verification

- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
  - enabled the experimental flag
  - created an SSH environment
  - created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back

## Risks

- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.

## Model Used

- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
Dotta f98c348e2b [codex] Add issue subtree pause, cancel, and restore controls (#4332)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - This branch extends the issue control-plane so board operators can
pause, cancel, and later restore whole issue subtrees while keeping
descendant execution and wake behavior coherent.
> - That required new hold state in the database, shared contracts,
server routes/services, and issue detail UI controls so subtree actions
are durable and auditable instead of ad hoc.
> - While this branch was in flight, `master` advanced with new
environment lifecycle work, including a new `0065_environments`
migration.
> - Before opening the PR, this branch had to be rebased onto
`paperclipai/paperclip:master` without losing the existing
subtree-control work or leaving conflicting migration numbering behind.
> - This pull request rebases the subtree pause/cancel/restore feature
cleanly onto current `master`, renumbers the hold migration to
`0066_issue_tree_holds`, and preserves the full branch diff in a single
PR.
> - The benefit is that reviewers get one clean, mergeable PR for the
subtree-control feature instead of stale branch history with migration
conflicts.

## What Changed

- Added durable issue subtree hold data structures, shared
API/types/validators, server routes/services, and UI flows for subtree
pause, cancel, and restore operations.
- Added server and UI coverage for subtree previewing, hold
creation/release, dependency-aware scheduling under holds, and issue
detail subtree controls.
- Rebased the branch onto current `paperclipai/paperclip:master` and
renumbered the branch migration from `0065_issue_tree_holds` to
`0066_issue_tree_holds` so it no longer conflicts with upstream
`0065_environments`.
- Added a small follow-up commit that makes restore requests return `200
OK` explicitly while keeping pause/cancel hold creation at `201
Created`, and updated the route test to match that contract.

## Verification

- `pnpm --filter @paperclipai/db typecheck`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `cd server && pnpm exec vitest run
src/__tests__/issue-tree-control-routes.test.ts
src/__tests__/issue-tree-control-service.test.ts
src/__tests__/issue-tree-control-service-unit.test.ts
src/__tests__/heartbeat-dependency-scheduling.test.ts`
- `cd ui && pnpm exec vitest run src/components/IssueChatThread.test.tsx
src/pages/IssueDetail.test.tsx`

## Risks

- This is a broad cross-layer change touching DB/schema, shared
contracts, server orchestration, and UI; regressions are most likely
around subtree status restoration or wake suppression/resume edge cases.
- The migration was renumbered during PR prep to avoid the new upstream
`0065_environments` conflict. Reviewers should confirm the final
`0066_issue_tree_holds` ordering is the only hold-related migration that
lands.
- The issue-tree restore endpoint now responds with `200` instead of
relying on implicit behavior, which is semantically better for a restore
operation but still changes an API detail that clients or tests could
have assumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent in the Paperclip Codex runtime (GPT-5-class
tool-using coding model; exact deployment ID/context window is not
exposed inside this session).

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-23 14:51:46 -05:00
Devin Foley 13551b2bac Add local environment lifecycle (#4297)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Every heartbeat run needs a concrete place where the agent's adapter
process executes.
> - Today that execution location is implicitly the local machine, which
makes it hard to track, audit, and manage as a first-class runtime
concern.
> - The first step is to represent the current local execution path
explicitly without changing how users experience agent runs.
> - This pull request adds core Environment and Environment Lease
records, then routes existing local heartbeat execution through a
default `Local` environment.
> - The benefit is that local runs remain behavior-preserving while the
system now has durable environment identity, lease lifecycle tracking,
and activity records for execution placement.

## What Changed

- Added `environments` and `environment_leases` database tables, schema
exports, and migration `0065_environments.sql`.
- Added shared environment constants, TypeScript types, and validators
for environment drivers, statuses, lease policies, lease statuses, and
cleanup states.
- Added `environmentService` for listing, reading, creating, updating,
and ensuring company-scoped environments.
- Added environment lease lifecycle operations for acquire, metadata
update, single-lease release, and run-wide release.
- Updated heartbeat execution to lazily ensure a company-scoped default
`Local` environment before adapter execution.
- Updated heartbeat execution to acquire an ephemeral local environment
lease, write `paperclipEnvironment` into the run context snapshot, and
release active leases during run finalization.
- Added activity log events for environment lease acquisition and
release.
- Added tests for environment service behavior and the local heartbeat
environment lifecycle.
- Added a CI-follow-up heartbeat guard so deferred issue comment wakes
are promoted before automatic missing-comment retries, with focused
batching test coverage.

## Verification

Local verification run for this branch:

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm exec vitest run server/src/__tests__/environment-service.test.ts
server/src/__tests__/heartbeat-local-environment.test.ts --pool=forks`

Additional reviewer/CI verification:

- Confirm `pnpm-lock.yaml` is not modified.
- Confirm `pnpm test:run` passes in CI.
- Confirm `PAPERCLIP_E2E_SKIP_LLM=true pnpm run test:e2e` passes in CI.
- Confirm a local heartbeat run creates one active `Local` environment
when needed, records one lease for the run, releases the lease when the
run finishes, and includes `paperclipEnvironment` in the run context
snapshot.

Screenshots: not applicable; this PR has no UI changes.

## Risks

- Migration risk: introduces two new tables and a new migration journal
entry. Review should verify company scoping, indexes, foreign keys, and
enum defaults are correct.
- Lifecycle risk: heartbeat finalization now releases environment leases
in addition to existing runtime cleanup. A finalization bug could leave
stale active leases or mark a failed run's lease incorrectly.
- Behavior-preservation risk: local adapter execution should remain
unchanged apart from environment bookkeeping. Review should pay
attention to the heartbeat path around context snapshot updates and
final cleanup ordering.
- Activity volume risk: each heartbeat run now logs lease acquisition
and release events, increasing activity log volume by two records per
run.

## Model Used

OpenAI GPT-5.4 via Codex CLI. Capabilities used: repository inspection,
TypeScript implementation review, local test/build execution, and
PR-description drafting.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 20:07:41 -07:00
Dotta b69b563aa8 [codex] Fix stale issue execution run locks (#4258)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue
checkout and execution ownership are core safety contracts.
> - The affected subsystem is the issue service and route layer that
gates agent writes by `checkoutRunId` and `executionRunId`.
> - PAP-1982 exposed a stale-lock failure mode where a terminal
heartbeat run could leave `executionRunId` pinned after checkout
ownership had moved or been cleared.
> - That stale execution lock could reject legitimate
PATCH/comment/release requests from the rightful assignee after a
harness restart.
> - This pull request centralizes terminal-run cleanup, applies it
before ownership-gated writes, and adds a board-only recovery endpoint
for operator intervention.
> - The benefit is that crashed or terminal runs no longer strand issues
behind stale execution locks, while live execution locks still block
conflicting writes.

## What Changed

- Added `issueService.clearExecutionRunIfTerminal()` to atomically lock
the issue/run rows and clear terminal or missing execution-run locks.
- Reused stale execution-lock cleanup from checkout,
`assertCheckoutOwner()`, and `release()`.
- Allowed the same assigned agent/current run to adopt an unowned
`in_progress` checkout after stale execution-lock cleanup.
- Updated release to clear `executionRunId`, `executionAgentNameKey`,
and `executionLockedAt`.
- Added board-only `POST /api/issues/:id/admin/force-release` with
company access checks, optional `clearAssignee=true`, and
`issue.admin_force_release` audit logging.
- Added embedded Postgres service tests and route integration tests for
stale-lock recovery, release behavior, and admin force-release
authorization/audit behavior.
- Documented the new force-release API in `doc/SPEC-implementation.md`.

## Verification

- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-stale-execution-lock-routes.test.ts` passed.
- `pnpm vitest run
server/src/__tests__/issue-stale-execution-lock-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-telemetry-routes.test.ts` passed.
- `pnpm -r typecheck` passed.
- `pnpm build` passed.
- `git diff --check` passed.
- `pnpm lint` could not run because this repo has no `lint` command.
- Full `pnpm test:run` completed with 4 failures in existing route
suites: `approval-routes-idempotency.test.ts` (2),
`issue-comment-reopen-routes.test.ts` (1), and
`issue-telemetry-routes.test.ts` (1). Those same files pass when run
isolated and when run together with the new stale-lock route test, so
this appears to be a whole-suite ordering/mock-isolation issue outside
this patch path.

## Risks

- Medium: this changes ownership-gated write behavior. The new adoption
path is limited to the current run, the current assignee, `in_progress`
issues, and rows with no checkout owner after terminal-lock cleanup.
- Low: the admin force-release endpoint is board-only and
company-scoped, but misuse can intentionally clear a live lock. It
writes an audit event with prior lock IDs.
- No schema or migration changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent (`gpt-5`), agentic coding with
terminal/tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 10:43:38 -05:00
Dotta a957394420 [codex] Add structured issue-thread interactions (#4244)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators supervise that work through issues, comments, approvals,
and the board UI.
> - Some agent proposals need structured board/user decisions, not
hidden markdown conventions or heavyweight governed approvals.
> - Issue-thread interactions already provide a natural thread-native
surface for proposed tasks and questions.
> - This pull request extends that surface with request confirmations,
richer interaction cards, and agent/plugin/MCP helpers.
> - The benefit is that plan approvals and yes/no decisions become
explicit, auditable, and resumable without losing the single-issue
workflow.

## What Changed

- Added persisted issue-thread interactions for suggested tasks,
structured questions, and request confirmations.
- Added board UI cards for interaction review, selection, question
answers, and accept/reject confirmation flows.
- Added MCP and plugin SDK helpers for creating interaction cards from
agents/plugins.
- Updated agent wake instructions, onboarding assets, Paperclip skill
docs, and public docs to prefer structured confirmations for
issue-scoped decisions.
- Rebased the branch onto `public-gh/master` and renumbered branch
migrations to `0063` and `0064`; the idempotency migration uses `ADD
COLUMN IF NOT EXISTS` for old branch users.

## Verification

- `git diff --check public-gh/master..HEAD`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/mcp-server/src/tools.test.ts
packages/shared/src/issue-thread-interactions.test.ts
ui/src/lib/issue-thread-interactions.test.ts
ui/src/lib/issue-chat-messages.test.ts
ui/src/components/IssueThreadInteractionCard.test.tsx
ui/src/components/IssueChatThread.test.tsx
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/services/issue-thread-interactions.test.ts` -> 9 files / 79
tests passed
- `pnpm -r typecheck` -> passed, including `packages/db` migration
numbering check

## Risks

- Medium: this adds a new issue-thread interaction model across
db/shared/server/ui/plugin surfaces.
- Migration risk is reduced by placing this branch after current master
migrations (`0063`, `0064`) and making the idempotency column add
idempotent for users who applied the old branch numbering.
- UI interaction behavior is covered by component tests, but this PR
does not include browser screenshots.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-class coding agent runtime. Exact model ID and
context window are not exposed in this Paperclip run; tool use and local
shell/code execution were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 20:15:11 -05:00
Dotta bcbbb41a4b [codex] Harden heartbeat runtime cleanup (#4233)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime is the control-plane path that turns issue
assignments into agent runs and recovers after process exits.
> - Several edge cases could leave high-volume reads unbounded, stale
runtime services visible, blocked dependency wakes too eager, or
terminal adapter processes still around after output finished.
> - These problems make operator views noisy and make long-running agent
work less predictable.
> - This pull request tightens the runtime/read paths and adds focused
regression coverage.
> - The benefit is safer heartbeat execution and cleaner runtime state
without changing the public task model.

## What Changed

- Bounded high-volume issue/log reads in runtime code paths.
- Hardened heartbeat handling for blocked dependency wakes and terminal
run cleanup.
- Added adapter process cleanup coverage for terminal output cases.
- Added workspace runtime control tests for stale command matching and
stopped services.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
ui/src/components/WorkspaceRuntimeControls.test.tsx`

## Risks

- Medium risk because heartbeat cleanup and runtime filtering affect
active agent execution paths.
- No migrations.

> Checked `ROADMAP.md`; this is runtime hardening and bug-fix work, not
a new roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 16:48:47 -05:00
Dotta 09d0678840 [codex] Harden heartbeat scheduling and runtime controls (#4223)
## Thinking Path

> - Paperclip orchestrates AI agents through issue checkout, heartbeat
runs, routines, and auditable control-plane state
> - The runtime path has to recover from lost local processes, transient
adapter failures, blocked dependencies, and routine coalescing without
stranding work
> - The existing branch carried several reliability fixes across
heartbeat scheduling, issue runtime controls, routine dispatch, and
operator-facing run state
> - These changes belong together because they share backend contracts,
migrations, and runtime status semantics
> - This pull request groups the control-plane/runtime slice so it can
merge independently from board UI polish and adapter sandbox work
> - The benefit is safer heartbeat recovery, clearer runtime controls,
and more predictable recurring execution behavior

## What Changed

- Adds bounded heartbeat retry scheduling, scheduled retry state, and
Codex transient failure recovery handling.
- Tightens heartbeat process recovery, blocker wake behavior, issue
comment wake handling, routine dispatch coalescing, and
activity/dashboard bounds.
- Adds runtime-control MCP tools and Paperclip skill docs for issue
workspace runtime management.
- Adds migrations `0061_lively_thor_girl.sql` and
`0062_routine_run_dispatch_fingerprint.sql`.
- Surfaces retry state in run ledger/agent UI and keeps related shared
types synchronized.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/routines-service.test.ts`
- `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server`

## Risks

- Medium risk: this touches heartbeat recovery and routine dispatch,
which are central execution paths.
- Migration order matters if split branches land out of order: merge
this PR before branches that assume the new runtime/routine fields.
- Runtime retry behavior should be watched in CI and in local operator
smoke tests because it changes how transient failures are resumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:24:11 -05:00
Dotta ab9051b595 Add first-class issue references (#4214)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators and agents coordinate through company-scoped issues,
comments, documents, and task relationships.
> - Issue text can mention other tickets, but those references were
previously plain markdown/text without durable relationship data.
> - That made it harder to understand related work, surface backlinks,
and keep cross-ticket context visible in the board.
> - This pull request adds first-class issue reference extraction,
storage, API responses, and UI surfaces.
> - The benefit is that issue references become queryable, navigable,
and visible without relying on ad hoc text scanning.

## What Changed

- Added shared issue-reference parsing utilities and exported
reference-related types/constants.
- Added an `issue_reference_mentions` table, idempotent migration DDL,
schema exports, and database documentation.
- Added server-side issue reference services, route integration,
activity summaries, and a backfill command for existing issue content.
- Added UI reference pills, related-work panels, markdown/editor mention
handling, and issue detail/property rendering updates.
- Added focused shared, server, and UI tests for parsing, persistence,
display, and related-work behavior.
- Rebased `PAP-735-first-class-task-references` cleanly onto
`public-gh/master`; no `pnpm-lock.yaml` changes are included.

## Verification

- `pnpm -r typecheck`
- `pnpm test:run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-references-service.test.ts
ui/src/components/IssueRelatedWorkPanel.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/MarkdownBody.test.tsx`

## Risks

- Medium risk because this adds a new issue-reference persistence path
that touches shared parsing, database schema, server routes, and UI
rendering.
- Migration risk is mitigated by `CREATE TABLE IF NOT EXISTS`, guarded
foreign-key creation, and `CREATE INDEX IF NOT EXISTS` statements so
users who have applied an older local version of the numbered migration
can re-run safely.
- UI risk is limited by focused component coverage, but reviewers should
still manually inspect issue detail pages containing ticket references
before merge.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-using shell workflow with
repository inspection, git rebase/push, typecheck, and focused Vitest
verification.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 10:02:52 -05:00
Dotta 1954eb3048 [codex] Detect issue graph liveness deadlocks (#4209)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat harness is responsible for waking agents, reconciling
issue state, and keeping execution moving.
> - Some dependency graphs can become live-locks when a blocked issue
depends on an unassigned, cancelled, or otherwise uninvokable issue.
> - Review and approval stages can also stall when the recorded
participant can no longer be resolved.
> - This pull request adds issue graph liveness classification plus
heartbeat reconciliation that creates durable escalation work for those
cases.
> - The benefit is that harness-level deadlocks become visible,
assigned, logged, and recoverable instead of silently leaving task
sequences blocked.

## What Changed

- Added an issue graph liveness classifier for blocked dependency and
invalid review participant states.
- Added heartbeat reconciliation that creates one stable escalation
issue per liveness incident, links it as a blocker, comments on the
affected issue, wakes the recommended owner, and logs activity.
- Wired startup and periodic server reconciliation for issue graph
liveness incidents.
- Added focused tests for classifier behavior, heartbeat escalation
creation/deduplication, and queued dependency wake promotion.
- Fixed queued issue wakes so a coalesced wake re-runs queue selection,
allowing dependency-unblocked work to start immediately.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issue-liveness.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts`
- Passed locally: `server/src/__tests__/issue-liveness.test.ts` (5
tests)
- Skipped locally: embedded Postgres suites because optional package
`@embedded-postgres/darwin-x64` is not installed on this host
- `pnpm --filter @paperclipai/server typecheck`
- `git diff --check`
- Greptile review loop: ran 3 times as requested; the final
Greptile-reviewed head `0a864eab` had 0 comments and all Greptile
threads were resolved. Later commits are CI/test-stability fixes after
the requested max Greptile pass count.
- GitHub PR checks on head `87493ed4`: `policy`, `verify`, `e2e`, and
`security/snyk (cryppadotta)` all passed.

## Risks

- Moderate operational risk: the reconciler creates escalation issues
automatically, so incorrect classification could create noise. Stable
incident keys and deduplication limit repeated escalation.
- Low schema risk: this uses existing issue, relation, comment, wake,
and activity log tables with no migration.
- No UI screenshots included because this change is server-side harness
behavior only.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent. Exact runtime model ID and
context window were not exposed in this session. Used tool execution for
git, tests, typecheck, Greptile review handling, and GitHub CLI
operations.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 09:11:12 -05:00
Dotta 1266954a4e [codex] Make heartbeat scheduling blocker-aware (#4157)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-driven heartbeats,
checkouts, and wake scheduling.
> - This change sits in the server heartbeat and issue services that
decide which queued runs are allowed to start.
> - Before this branch, queued heartbeats could be selected even when
their issue still had unresolved blocker relationships.
> - That let blocked descendant work compete with actually-ready work
and risked auto-checking out issues that were not dependency-ready.
> - This pull request teaches the scheduler and checkout path to consult
issue dependency readiness before claiming queued runs.
> - It also exposes dependency readiness in the agent inbox so agents
can see which assigned issues are still blocked.
> - The result is that heartbeat execution follows the DAG of blocked
dependencies instead of waking work out of order.

## What Changed

- Added `IssueDependencyReadiness` helpers to `issueService`, including
unresolved blocker lookup for single issues and bulk issue lists.
- Prevented issue checkout and `in_progress` transitions when unresolved
blockers still exist.
- Made heartbeat queued-run claiming and prioritization dependency-aware
so ready work starts before blocked descendants.
- Included dependency readiness fields in `/api/agents/me/inbox-lite`
for agent heartbeat selection.
- Added regression coverage for dependency-aware heartbeat promotion and
issue-service participation filtering.

## Verification

- `pnpm run preflight:workspace-links`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issues-service.test.ts`
- On this host, the Vitest command passed, but the embedded-Postgres
portions of those files were skipped because
`@embedded-postgres/darwin-x64` is not installed.

## Risks

- Scheduler ordering now prefers dependency-ready runs, so any hidden
assumptions about strict FIFO ordering could surface in edge cases.
- The new guardrails reject checkout or `in_progress` transitions for
blocked issues; callers depending on the old permissive behavior would
now get `422` errors.
- Local verification did not execute the embedded-Postgres integration
paths on this macOS host because the platform binary package was
missing.

> I checked `ROADMAP.md`; this is a targeted execution/scheduling fix
and does not duplicate planned roadmap feature work.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter in this
workspace. Exact backend model ID is not surfaced in the runtime here;
tool-enabled coding agent with terminal execution and repository editing
capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 16:03:57 -05:00
Dotta 549ef11c14 [codex] Respect manual workspace runtime controls (#4125)
## Thinking Path

> - Paperclip orchestrates AI agents inside execution and project
workspaces
> - Workspace runtime services can be controlled manually by operators
and reused by agent runs
> - Manual start/stop state was not preserved consistently across
workspace policies and routine launches
> - Routine launches also needed branch/workspace variables to default
from the selected workspace context
> - This pull request makes runtime policy state explicit, preserves
manual control, and auto-fills routine branch variables from workspace
data
> - The benefit is less surprising workspace service behavior and fewer
manual inputs when running workspace-scoped routines

## What Changed

- Added runtime-state handling for manual workspace control across
execution and project workspace validators, routes, and services.
- Updated heartbeat/runtime startup behavior so manually stopped
services are respected.
- Auto-filled routine workspace branch variables from available
workspace context.
- Added focused server and UI tests for workspace runtime and routine
variable behavior.
- Removed muted gray background styling from workspace pages and cards
for a cleaner workspace UI.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/routines-service.test.ts
server/src/__tests__/workspace-runtime.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx`
- Result: 55 tests passed, 21 skipped. The embedded Postgres routines
tests skipped on this host with the existing PGlite/Postgres init
warning; workspace-runtime and UI tests passed.

## Risks

- Medium risk: this touches runtime service start/stop policy and
heartbeat launch behavior.
- The focused tests cover manual runtime state, routine variables, and
workspace runtime reuse paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component/service verification
is sufficient here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:39:37 -05:00
Dotta 4357a3f352 [codex] Harden dashboard run activity charts (#4126)
## Thinking Path

> - Paperclip gives operators a live view of agent work across
dashboards, transcripts, and run activity charts
> - Those views consume live run updates and aggregate run activity from
backend dashboard data
> - Missing or partial run data could make charts brittle, and live
transcript updates were heavier than needed
> - Operators need dashboard data to stay stable even when recent run
payloads are incomplete
> - This pull request hardens dashboard run aggregation, guards chart
rendering, and lightens live run update handling
> - The benefit is a more reliable dashboard during active agent
execution

## What Changed

- Added dashboard run activity types and backend aggregation coverage.
- Guarded activity chart rendering when run data is missing or partial.
- Reduced live transcript update churn in active agent and run chat
surfaces.
- Fixed issue chat avatar alignment in the thread renderer.
- Added focused dashboard, activity chart, and live transcript tests.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/dashboard-service.test.ts
ui/src/components/ActivityCharts.test.tsx
ui/src/components/transcript/useLiveRunTranscripts.test.tsx`
- Result: 8 tests passed, 1 skipped. The embedded Postgres dashboard
service test skipped on this host with the existing PGlite/Postgres init
warning; UI chart and transcript tests passed.

## Risks

- Medium-low risk: aggregation semantics changed, but the UI remains
guarded around incomplete data.
- The dashboard service test is host-skipped here, so CI should confirm
the embedded database path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component tests are sufficient
here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:34:21 -05:00
Dotta 9c6f551595 [codex] Add plugin orchestration host APIs (#4114)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system is the extension path for optional capabilities
that should not require core product changes for every integration.
> - Plugins need scoped host APIs for issue orchestration, documents,
wakeups, summaries, activity attribution, and isolated database state.
> - Without those host APIs, richer plugins either cannot coordinate
Paperclip work safely or need privileged core-side special cases.
> - This pull request adds the plugin orchestration host surface, scoped
route dispatch, a database namespace layer, and a smoke plugin that
exercises the contract.
> - The benefit is a broader plugin API that remains company-scoped,
auditable, and covered by tests.

## What Changed

- Added plugin orchestration host APIs for issue creation, document
access, wakeups, summaries, plugin-origin activity, and scoped API route
dispatch.
- Added plugin database namespace tables, schema exports, migration
checks, and idempotent replay coverage under migration
`0059_plugin_database_namespaces`.
- Added shared plugin route/API types and validators used by server and
SDK boundaries.
- Expanded plugin SDK types, protocol helpers, worker RPC host behavior,
and testing utilities for orchestration flows.
- Added the `plugin-orchestration-smoke-example` package to exercise
scoped routes, restricted database namespaces, issue orchestration,
documents, wakeups, summaries, and UI status surfaces.
- Kept the new orchestration smoke fixture out of the root pnpm
workspace importer so this PR preserves the repository policy of not
committing `pnpm-lock.yaml`.
- Updated plugin docs and database docs for the new orchestration and
database namespace surfaces.
- Rebased the branch onto `public-gh/master`, resolved conflicts, and
removed `pnpm-lock.yaml` from the final PR diff.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm --filter @paperclipai/db typecheck`
- `pnpm exec vitest run packages/db/src/client.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-database.test.ts
server/src/__tests__/plugin-orchestration-apis.test.ts
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/plugin-scoped-api-routes.test.ts
server/src/__tests__/plugin-sdk-orchestration-contract.test.ts`
- From `packages/plugins/examples/plugin-orchestration-smoke-example`:
`pnpm exec vitest run --config ./vitest.config.ts`
- `pnpm --dir
packages/plugins/examples/plugin-orchestration-smoke-example run
typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- PR CI on latest head `293fc67c`: `policy`, `verify`, `e2e`, and
`security/snyk` all passed.

## Risks

- Medium risk: this expands plugin host authority, so route auth,
company scoping, and plugin-origin activity attribution need careful
review.
- Medium risk: database namespace migration behavior must remain
idempotent for environments that may have seen earlier branch versions.
- Medium risk: the orchestration smoke fixture is intentionally excluded
from the root workspace importer to avoid a `pnpm-lock.yaml` PR diff;
direct fixture verification remains listed above.
- Low operational risk from the PR setup itself: the branch is rebased
onto current `master`, the migration is ordered after upstream
`0057`/`0058`, and `pnpm-lock.yaml` is not in the final diff.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

Roadmap checked: this work aligns with the completed Plugin system
milestone and extends the plugin surface rather than duplicating an
unrelated planned core feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in a tool-enabled CLI
environment. Exact hosted model build and context-window size are not
exposed by the runtime; reasoning/tool use were enabled for repository
inspection, editing, testing, git operations, and PR creation.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no core UI screen change; example plugin UI contract
is covered by tests)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 08:52:51 -05:00