forked from farhoodlabs/paperclip
[codex] Document terminal bench dispatch config (#4961)
## Thinking Path

> - Paperclip agents rely on skills for repeatable operating procedures
> - The Terminal-Bench loop skill needs to preserve enough dispatch configuration to reproduce real heartbeat behavior
> - A bare benchmark command can create unassigned work with no heartbeat-enabled agent, which is a harness setup failure rather than product evidence
> - The Paperclip heartbeat skill also needs to keep escalation biased toward agent-owned follow-through
> - This pull request documents dispatch runner config requirements and strengthens the agent follow-through rule
> - The benefit is fewer misleading benchmark loops and clearer agent operating guidance

## What Changed

- Documented `PAPERCLIP_HARBOR_RUNNER_CONFIG` / runner dispatch config as required Terminal-Bench loop input.
- Updated the Terminal-Bench loop smoke check to require the dispatch config mention.
- Added stronger Paperclip skill guidance to avoid asking humans for work an agent can perform.

## Verification

- `pnpm smoke:terminal-bench-loop-skill`

## Risks

- Low risk: documentation and smoke expectation changes only. The stricter smoke assertion is intentional so future edits do not drop the dispatch config requirement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
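The stricter smoke assertion described under "What Changed" could look roughly like the sketch below. This is only an illustration under stated assumptions: the helper names `missingMentions` and `assertSkillMentionsDispatchConfig` are hypothetical, and the actual `smoke:terminal-bench-loop-skill` script is not shown on this page. The only detail taken from the commit message is the `PAPERCLIP_HARBOR_RUNNER_CONFIG` name.

```typescript
// Hypothetical sketch: these helper names are illustrative and are not
// taken from the repository's smoke script.
const REQUIRED_MENTIONS: readonly string[] = ["PAPERCLIP_HARBOR_RUNNER_CONFIG"];

// Return every required string that the skill text fails to mention.
function missingMentions(
  skillText: string,
  required: readonly string[] = REQUIRED_MENTIONS,
): string[] {
  return required.filter((needle) => !skillText.includes(needle));
}

// Throw with a readable message so a calling smoke script exits non-zero.
function assertSkillMentionsDispatchConfig(skillText: string): void {
  const missing = missingMentions(skillText);
  if (missing.length > 0) {
    throw new Error(
      `skill text is missing required mentions: ${missing.join(", ")}`,
    );
  }
}
```

A caller would read the skill's markdown file and pass its contents to `assertSkillMentionsDispatchConfig`; keeping the required strings in a list makes it cheap to pin further mentions later.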
@@ -233,6 +233,10 @@ For commands, response fields, and MCP tools, read:
- **Hiring**: use the `paperclip-create-agent` skill for new agent creation workflows (links to reusable `AGENTS.md` templates like `Coder` and `QA`).
- **Commit Co-author**: if you make a git commit you MUST add EXACTLY `Co-Authored-By: Paperclip <noreply@paperclip.ing>` to the end of each commit message. Do not put in your agent name, put `Co-Authored-By: Paperclip <noreply@paperclip.ing>`.

This is rule #1:

IMPORTANT: **NEVER ASK A HUMAN TO DO WHAT AN AGENT COULD DO**. If you need to escalate, escalate. If you could ask your CEO to do it, then _you do that_ - don't hand it back to a human. Again: Never ask a human to do what an agent _could_ do. Rule number 1.

## Comment Style (Required)

When posting issue comments or writing issue descriptions, use concise markdown with:
@@ -349,3 +353,5 @@ Results are ranked by relevance: title matches first, then identifier, descripti
## Full Reference

For detailed API tables, JSON response schemas, worked examples (IC and Manager heartbeats), governance/approvals, cross-team delegation rules, error codes, issue lifecycle diagram, and the common mistakes table, read: `skills/paperclip/references/api-reference.md`

Again, rule #1 is: never ask a human to do what an agent could do. Try harder. Try again. Ask another agent to help. Keep working until the goal is fully accomplished.