Commit Graph

4 Commits

Author SHA1 Message Date
Dotta a1835cfa5e [codex] Harden plugin runtime invocation scope (#6547)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a company-scoped
control plane.
> - Plugins extend that control plane, but plugin workers still call
back into host APIs.
> - Those worker-to-host calls need the same company boundary guarantees
as normal API routes.
> - Plugin action handlers also need authenticated actor context from
the host instead of trusting caller-supplied params.
> - This pull request hardens plugin bridge/action scope and keeps
plugin operation issues out of normal issue surfaces.
> - The benefit is safer plugin execution with clearer authorization
boundaries and better test coverage.

## What Changed

- Added host-owned invocation context plumbing for nested plugin worker
calls.
- Added actor context to plugin `performAction` calls and test harness
helpers.
- Enforced company invocation scope on worker-to-host calls and filtered
company lists to the active invocation scope.
- Extended plugin action route tests for board and agent actor context,
spoofed company params, and cross-company rejection.
- Extended plugin worker manager coverage for invocation-scope
propagation.
- Filtered typed and legacy plugin operation issue origins from default
issue/inbox lists.

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `NODE_ENV=test pnpm exec vitest run
packages/plugins/sdk/tests/host-client-factory.test.ts
packages/plugins/sdk/tests/testing-actions.test.ts
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/plugin-worker-manager.test.ts
server/src/__tests__/issues-service.test.ts`

Note: embedded Postgres issue-service tests reported host-level Postgres
init skip for 47 tests; the non-embedded targeted tests passed.

## Risks

- Medium: plugin host authorization paths are sensitive, and external
plugins may rely on previously loose company params.
- Mitigation: the change only tightens calls when the host attached a
company invocation scope and includes explicit tests for board, agent,
and nested worker calls.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI GPT-5 Codex via `codex_local`, tool-enabled coding session;
exact context window not exposed by this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-22 09:16:24 -05:00
Dotta 38c185fb8b [codex] Add agent permissions and controls plan (#6386)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies by keeping
task ownership, approvals, and operator control inside one control
plane.
> - Agent permissions and plugin-hosted company settings sit on the
boundary between autonomy and governance.
> - V1 needs scoped task assignment rules, plugin extension points, and
clearer company access surfaces without weakening company boundaries.
> - The branch builds the core authorization service, plugin SDK/host
APIs, and UI simplifications needed to support those controls.
> - Paperclip EE plugin surfaces were intentionally moved out of this
core PR per review direction, so this PR now carries only the public
core/plugin infrastructure work.
> - The latest updates preserve the PAP-9937 branch changes that belong
in this PR, remove the `design/` artifacts, and exclude the experimental
`plugin-briefs` package.
> - Greptile feedback was applied through the authorization/audit paths
and the final cleanup commit was re-reviewed at 5/5 with no unresolved
Greptile threads.
> - The benefit is safer assignment control with extension hooks for
richer permission products while preserving simple defaults for normal
operators.

## What Changed

- Added scoped task-assignment authorization decisions and routed
issue/agent assignment mutations through the authorization service.
- Added plugin SDK and host APIs for company settings slots,
authorization policy/grant management, assignment previews, and bridge
invocation scope propagation.
- Simplified core company access UI and moved advanced controls behind
plugin-provided settings surfaces.
- Added retry-now affordances for blocked issue next-step notices.
- Added protected-assignment enforcement for persisted
agent/project/issue policies, including explicit-grant fallback
behavior.
- Added incremental principal-access compatibility backfill for active
agent memberships and role-default human permission grants.
- Added the Markdown code block wrap action fix from the latest branch
changes.
- Removed `design/` artifacts from the PR and removed
`packages/plugins/plugin-briefs` from the final diff.
- Addressed Greptile feedback for plugin actor sanitization, legacy
membership handling, audit pagination, unknown grant-scope metadata, and
startup test mocks.

## Verification

- `pnpm exec vitest run server/src/__tests__/access-service.test.ts
server/src/__tests__/company-portability.test.ts` -> 2 files passed, 54
tests passed.
- `pnpm exec vitest run
server/src/__tests__/server-startup-feedback-export.test.ts
server/src/__tests__/access-service.test.ts
server/src/__tests__/company-portability.test.ts` -> 3 files passed, 62
tests passed.
- `pnpm exec vitest run
server/src/__tests__/authorization-service.test.ts
server/src/__tests__/plugin-access-authorization-host-services.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts` -> 3 files
passed, 28 tests passed.
- `pnpm --filter @paperclipai/server typecheck` -> passed.
- `git diff --check` -> passed.
- `node ./scripts/check-docker-deps-stage.mjs` -> passed.
- `CI=true pnpm install --frozen-lockfile --ignore-scripts` -> passed
with no lockfile update.
- `pnpm exec vitest run
ui/src/components/MarkdownBody.interaction.test.tsx` -> 1 test passed.
- `git ls-files design packages/plugins/plugin-briefs | wc -l` -> 0.
- GitHub CI on `40cd83b53` -> all checks passed, merge state `CLEAN`.
- Greptile on `40cd83b53` -> 5/5, 102 files reviewed, 0
comments/annotations added, 0 unresolved review threads.
- Confirmed the PR diff contains no `design/`,
`packages/plugins/plugin-briefs`, `pnpm-lock.yaml`, or
`.github/workflows` changes.

## Risks

- Medium: task assignment authorization paths are behaviorally stricter
for protected/private policy data, so existing plugin-authored policies
may block assignment until explicit grants or approval flows are
configured.
- Medium: plugin-host authorization APIs expand the surface area
available to trusted plugins and need careful review for company
scoping.
- Low: startup now performs a principal-access compatibility backfill,
but the migration and runtime backfill use conflict-tolerant inserts.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled workflow with shell,
git, and GitHub CLI access.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-22 08:12:52 -05:00
Devin Foley b24c6909e8 Harden remote sandbox runtime probes, timeouts, and installs (#5685)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each agent runs inside a sandbox environment so its CLI is isolated
from the host
> - Sandbox-backed adapter runs go through a small set of shared helpers
— `ensureAdapterExecutionTargetCommandResolvable`, the sandbox callback
bridge runner, and per-adapter `SANDBOX_INSTALL_COMMAND` strings
> - When standing up new sandbox provider plugins, the existing helpers
timed out, missed install fallbacks, or leaned on assumptions that only
held for E2B
> - Local adapters (`claude-local`, `codex-local`, `gemini-local`,
`opencode-local`) needed slightly hardened probes so they could install
themselves and validate inside *any* remote sandbox transport, not just
E2B
> - This pull request bundles those runtime fixes so future sandbox
provider plugins inherit a working baseline
> - The benefit is that adding a new sandbox provider plugin no longer
requires touching adapter-utils or each local-adapter probe — the
supporting infra is already correct

## What Changed

- `packages/adapter-utils/src/execution-target.ts`: introduce
`DEFAULT_REMOTE_SANDBOX_ADAPTER_TIMEOUT_SEC = 1800` and
`resolveAdapterExecutionTargetTimeoutSec(...)`. Local and SSH adapters
keep the historical "0 means no adapter timeout" behavior;
sandbox-backed runs without an explicit `timeoutSec` get an explicit
30-minute default so remote installs and warm-up don't time out at the
per-RPC default. Plumbed `timeoutSec` through
`ensureAdapterExecutionTargetCommandResolvable` so install probes inside
a sandbox honor adapter-level overrides instead of the bridge's 5-minute
default.
- `packages/adapters/opencode-local/src/index.ts`: switch
`SANDBOX_INSTALL_COMMAND` from `npm install -g opencode-ai` to `curl
-fsSL https://opencode.ai/install | bash`. The npm package reifies four
large prebuilt-binary subpackages in parallel even though only one
matches the host arch; on bandwidth-constrained sandboxes that blew
through the 240s install budget. The official installer fetches one
arch-specific binary and adds `$HOME/.opencode/bin` to PATH via
`~/.bashrc`, which the sandbox-callback-bridge login-shell script
already sources.
- `packages/adapters/{claude,codex,gemini,opencode}-local/`: harden
remote-target probes — pass `--skip-git-repo-check` for Codex when
probing outside a repo, normalize permission flags for Claude, and add
`*.remote.test.ts` coverage that exercises the remote-sandbox path
explicitly for each adapter.
- `packages/adapter-utils/src/sandbox-install-command.{ts,test.ts}`
(new): add `buildSandboxNpmInstallCommand` helper.
`server/src/adapters/registry.ts` + new
`server/src/__tests__/adapter-registry.test.ts`: wire adapter install
commands so they fall back to a writable `$HOME/.local` prefix when
global install isn't available.
- `server/src/__tests__/plugin-worker-manager.test.ts` + new
`server/src/__tests__/fixtures/plugin-worker-delayed.cjs`: pin per-call
timeout overrides so plugin worker exec calls honor the caller's timeout
instead of the worker's default.

## Verification

- `pnpm typecheck`
- `pnpm exec vitest run --no-coverage
packages/adapter-utils/src/execution-target-sandbox.test.ts
packages/adapter-utils/src/sandbox-install-command.test.ts`
- `pnpm exec vitest run --no-coverage
server/src/__tests__/plugin-worker-manager.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/claude-local-adapter-environment.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter-environment.test.ts`
- `pnpm exec vitest run --no-coverage
packages/adapters/codex-local/src/server/test.remote.test.ts
packages/adapters/opencode-local/src/server/test.remote.test.ts
packages/adapters/codex-local/src/server/codex-args.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts`

All passing locally.

## Risks

- Touches shared `adapter-utils` and several `*-local` adapters. The
30-minute default applies only when both (a) the target is
`remote+sandbox` and (b) no `timeoutSec` is configured — local + SSH
paths are unchanged. New test coverage was added alongside each behavior
change to pin the contracts.
- Switching OpenCode's install command to the official installer is a
behavior change for any operator running OpenCode inside a remote
sandbox. Local installs are unaffected (the `SANDBOX_INSTALL_COMMAND`
only runs when an adapter is being installed inside a sandbox).
- Low risk overall — no migrations, no API surface change.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.7 (1M context)
- Capabilities used: extended reasoning, tool use (Read/Edit/Bash/Grep),
no code execution beyond local repo commands

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A, no UI change
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-11 00:31:54 -07:00
Devin Foley fe3904f434 Stabilize runtime probes and Codex env tests (#5445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters expose a Test action that probes the configured runtime —
install, resolvability, hello — to give operators a fast yes/no on
whether an environment is healthy
> - The Codex test path was running its hello probe directly without
going through the managed-runtime preparation that production runs use,
so a healthy production setup could still report a probe failure
> - The plugin worker manager wasn't surfacing terminated workers
cleanly, leaving the runtime probe waiting on a dead worker until the
request timed out
> - This pull request routes the Codex test probe through
`prepareAdapterExecutionTargetRuntime` (so it sees the same managed
Codex home production sees), exposes `commandCwd` on
`createCommandManagedRuntimeClient` so callers can target a per-probe
directory without leaking the workspace `remoteCwd`, and propagates
plugin-worker termination as a usable error instead of a hang
> - The benefit is the Codex Test action mirrors production behavior
end-to-end, and probes against a terminated plugin worker fail fast
instead of timing out

## What Changed

- `packages/adapter-utils/src/command-managed-runtime.ts`: rename the
`remoteCwd` knob to `commandCwd` so callers can target a per-probe
directory without inheriting the workspace cwd; matching test coverage
in `command-managed-runtime.test.ts`
- `packages/adapter-utils/src/sandbox-callback-bridge.{ts,test.ts}`:
small fixes to keep callback bridge stop semantics deterministic
- `packages/adapters/codex-local/src/server/test.ts`: thread the Codex
hello probe through `prepareAdapterExecutionTargetRuntime` +
`prepareManagedCodexHome` so the probe sees the same managed home
production sees; new `test.remote.test.ts` covers the remote probe path
- `packages/adapters/cursor-local/src/server/execute.ts`: small
probe-side cleanup that aligns with the new commandCwd contract
- `server/src/services/plugin-worker-manager.ts`: surface plugin-worker
termination as a structured error so callers fail fast; new
`plugin-worker-terminated.cjs` fixture and
`plugin-worker-manager.test.ts` cases pin the behavior

## Verification

- `pnpm vitest run --no-coverage --project @paperclipai/adapter-utils
--project @paperclipai/adapter-codex-local --project
@paperclipai/adapter-cursor-local --project @paperclipai/server` —
1749/1750 passing (1 unrelated skip)
- `pnpm typecheck` clean

## Risks

Low–medium. The `remoteCwd → commandCwd` rename is a parameter renaming
on an internal helper used only by adapter test/execute paths in this
repo. The plugin-worker-terminated path was previously a hang; failing
fast may surface latent timeouts as explicit termination errors in
callers that already expected them.

## Model Used

Claude Opus 4.7 (1M context)

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable — new tests cover
commandCwd, plugin-worker termination, and Codex remote test path
- [x] If this change affects the UI, I have included before/after
screenshots — N/A (no UI)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---

> **Stacked PR.** Sits on top of #5444 which adds the per-run runtime
API surface this PR builds on. Cumulative diff against `master` includes
that PR's content; the files touched by *this* PR's commit are listed
under "What Changed" above. Will rebase onto `master` and force-push
once #5444 merges.
2026-05-07 14:52:31 -07:00