Commit Graph

21 Commits

Author SHA1 Message Date
Dotta 685ee84e4a [codex] Document terminal bench dispatch config (#4961)
## Thinking Path

> - Paperclip agents rely on skills for repeatable operating procedures
> - The Terminal-Bench loop skill needs to preserve enough dispatch
configuration to reproduce real heartbeat behavior
> - A bare benchmark command can create unassigned work with no
heartbeat-enabled agent, which is a harness setup failure rather than
product evidence
> - The Paperclip heartbeat skill also needs to keep escalation biased
toward agent-owned follow-through
> - This pull request documents dispatch runner config requirements and
strengthens the agent follow-through rule
> - The benefit is fewer misleading benchmark loops and clearer agent
operating guidance

## What Changed

- Documented `PAPERCLIP_HARBOR_RUNNER_CONFIG` / runner dispatch config
as required Terminal-Bench loop input.
- Updated the Terminal-Bench loop smoke check to require the dispatch
config mention.
- Added stronger Paperclip skill guidance to avoid asking humans for
work an agent can perform.

## Verification

- `pnpm smoke:terminal-bench-loop-skill`

## Risks

- Low risk: documentation and smoke expectation changes only. The
stricter smoke assertion is intentional so future edits do not drop the
dispatch config requirement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 12:00:47 -05:00
Dotta 1fe1067361 Polish board settings and skills workflow (#4863)
## Thinking Path

> - Paperclip's board UI and bundled skills are the operator layer for
configuring agents, routines, issue workflows, and local troubleshooting
loops.
> - The prior rollup mixed this operator polish with database backups,
backend reliability, thread scale, and cost/workflow primitives.
> - This pull request isolates the remaining board QoL, settings,
issue-detail integration, adapter config cleanup, and skills smoke
tooling.
> - It includes some integration-level overlap with the thread and
workflow slices so this branch can run from `origin/master` while still
preserving the full original work.
> - Preferred merge order is the narrower primitives first, then this
integration PR last.
> - The benefit is that reviewers can inspect the user-facing
board/settings/skills layer separately from backend infrastructure
changes.

## What Changed

- Added board/settings polish for agents, routines, company settings,
project workspace detail, and issue detail controls.
- Added agent/routine UI regression tests and New Issue dialog coverage.
- Integrated issue-detail activity/cost/interaction surfaces and leaf
work pause/resume controls.
- Cleaned bundled adapter UI config defaults and onboarding copy.
- Added terminal-bench loop and work-stoppage diagnosis skills plus a
smoke test script.
- Updated attachment type handling and Paperclip skill/API guidance.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/pages/Agents.test.tsx
ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx
ui/src/pages/IssueDetail.test.tsx
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts`
- Result: 7 test files passed, 54 tests passed.
- `pnpm run smoke:terminal-bench-loop-skill`
- Result: JSON output included `"ok": true` and `"cleanup": true`.
- UI screenshots not included because verification is focused
component/page coverage for the changed board surfaces.

## Risks

- This is the integration-heavy PR in the split and intentionally
overlaps some component/API primitives with the issue-thread and
workflow PRs so it can run from `origin/master`.
- Preferred merge order: #4859, #4860, #4861, #4862, then this PR last.
If earlier branches merge first, this PR may need a straightforward
conflict refresh in shared UI files.
- The terminal-bench smoke script creates temporary mock issues and
relies on cleanup; the verified run returned `cleanup: true`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 15:28:11 -05:00
Dotta 048e2b1bfe Remove legacy OpenClaw adapter and keep gateway-only flow 2026-03-07 18:50:25 -06:00
Dotta 0abb6a1205 openclaw gateway: persist device keys and smoke pairing flow 2026-03-07 17:05:36 -06:00
Dotta e27ec5de8c fix(smoke): disambiguate case C ack comment target 2026-03-07 16:16:53 -06:00
Dotta 83488b4ed0 fix(openclaw-gateway): enforce join token validation and add smoke preflight gates 2026-03-07 16:01:19 -06:00
Dotta 9ac2e71187 fix(smoke): pin OpenClaw docker harness to stable stock ref 2026-03-07 15:09:52 -06:00
Dotta 4bd6961020 fix(openclaw-gateway): add diagnostics capture and two-lane validation to e2e
Capture run events, logs, issue state, and container logs on failures
or timeouts for debugging. Write compatibility JSON keys for claimed
API key. Add two-lane validation requirement to test plan.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 09:22:40 -06:00
Dotta a498c268c5 feat: add openclaw_gateway adapter
New adapter type for invoking OpenClaw agents via the gateway protocol.
Registers in server, CLI, and UI adapter registries. Adds onboarding
wizard support with gateway URL field and e2e smoke test script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 08:59:29 -06:00
Dotta 0f895a8cf9 Enforce 10-minute TTL for generated company invites 2026-03-06 10:10:23 -06:00
Dotta cecb94213d smoke: default openclaw docker ui state to temp folder 2026-03-05 18:53:07 -06:00
Dotta 222e0624a8 smoke: fully reset openclaw docker ui state by default 2026-03-05 17:38:06 -06:00
Dotta 0cc75c6e10 Cut over OpenClaw adapter to strict SSE streaming 2026-03-05 15:54:55 -06:00
Dotta bee24e880f Add Paperclip host networking guidance to OpenClaw smoke script 2026-03-05 12:22:14 -06:00
Dotta 1b98c2b279 Isolate OpenClaw smoke state to avoid stale auth drift 2026-03-05 11:42:50 -06:00
Dotta 529d53acc0 Pin OpenClaw Docker UI smoke defaults to OpenAI models 2026-03-05 11:29:29 -06:00
Dotta 6f98c5f25c Clarify zero-flag OpenClaw Docker UI smoke defaults 2026-03-05 11:14:30 -06:00
Dotta 9dbd72cffd Improve OpenClaw Docker UI smoke pairing ergonomics 2026-03-05 11:04:14 -06:00
Dotta 34d9122b45 Add one-command OpenClaw Docker UI smoke script 2026-03-05 09:15:46 -06:00
Dotta d4a2fc6464 Improve OpenClaw smoke auth error handling 2026-03-04 16:37:55 -06:00
Dotta be50daba42 Add OpenClaw onboarding text endpoint and join smoke harness 2026-03-04 16:29:14 -06:00