[codex] Improve agent runtime recovery and governance (#4086)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime, agent import path, and agent configuration defaults determine whether work is dispatched safely and predictably.
> - Several accumulated fixes all touched agent execution recovery, wake routing, import behavior, and runtime concurrency defaults.
> - Those changes need to land together so the heartbeat service and agent creation defaults stay internally consistent.
> - This pull request groups the runtime/governance changes from the split branch into one standalone branch.
> - The benefit is safer recovery for stranded runs, bounded high-volume reads, imported-agent approval correctness, skill-template support, and a clearer default concurrency policy.

## What Changed

- Fixed stranded continuation recovery so successful automatic retries are requeued instead of incorrectly blocking the issue.
- Bounded high-volume issue/log reads across issue, heartbeat, agent, project, and workspace paths.
- Fixed imported-agent approval and instruction-path permission handling.
- Quarantined seeded worktree execution state during worktree provisioning.
- Queued approval follow-up wakes and hardened SQL_ASCII heartbeat output handling.
- Added reusable agent instruction templates for hiring flows.
- Set the default max concurrent agent runs to five and updated related UI/tests/docs.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/company-portability.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts server/src/__tests__/heartbeat-comment-wake-batching.test.ts server/src/__tests__/heartbeat-list.test.ts server/src/__tests__/issues-service.test.ts server/src/__tests__/agent-permissions-routes.test.ts packages/adapter-utils/src/server-utils.test.ts ui/src/lib/new-agent-runtime-config.test.ts`
- Split integration check: merged this branch first, followed by the other [PAP-1614](/PAP/issues/PAP-1614) branches, with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: touches heartbeat recovery, queueing, and issue list bounds in central runtime paths.
- Imported-agent and concurrency default behavior changes may affect existing automation that assumes one-at-a-time default runs.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic code-editing/runtime with local shell and GitHub CLI access; exact context window and reasoning mode are not exposed by the Paperclip harness.

## Checklist

- [x] I have included a thinking path that traces from project context to this change
- [x] I have specified the model used (with version and capability details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:19:48 -05:00
parent 057fee4836
commit 16b2b84d84
38 changed files with 1569 additions and 240 deletions
@@ -49,14 +49,19 @@ curl -sS "$PAPERCLIP_API_URL/api/companies/$PAPERCLIP_COMPANY_ID/agent-configura
  -H "Authorization: Bearer $PAPERCLIP_API_KEY"
 ```

-5. Discover allowed agent icons and pick one that matches the role.
+5. Read the reusable agent instruction templates before drafting the hire. If the role matches an existing pattern, start from that template and adapt it to the company, manager, adapter, and workspace.
+
+Reference:
+`skills/paperclip-create-agent/references/agent-instruction-templates.md`
+
+6. Discover allowed agent icons and pick one that matches the role.

 ```sh
 curl -sS "$PAPERCLIP_API_URL/llms/agent-icons.txt" \
  -H "Authorization: Bearer $PAPERCLIP_API_KEY"
 ```

-6. Draft the new hire config:
+7. Draft the new hire config:
 - role/title/name
 - icon (required in practice; use one from `/llms/agent-icons.txt`)
 - reporting line (`reportsTo`)
@@ -65,10 +70,12 @@ curl -sS "$PAPERCLIP_API_URL/llms/agent-icons.txt" \
 - adapter and runtime config aligned to this environment
 - leave timer heartbeats off by default; only set `runtimeConfig.heartbeat.enabled=true` with an `intervalSec` when the role genuinely needs scheduled recurring work or the user explicitly asked for it
 - capabilities
-- run prompt in adapter config (`promptTemplate` where applicable). For coding or execution agents, include the Paperclip execution contract: start actionable work in the same heartbeat; do not stop at a plan unless planning was requested; leave durable progress with a clear next action; use child issues for long or parallel delegated work instead of polling; mark blocked work with owner/action; respect budget, pause/cancel, approval gates, and company boundaries.
+- run prompt in adapter config (`promptTemplate` where applicable)
+- for coding or execution agents, include the Paperclip execution contract: start actionable work in the same heartbeat; do not stop at a plan unless planning was requested; leave durable progress with a clear next action; use child issues for long or parallel delegated work instead of polling; mark blocked work with owner/action; respect budget, pause/cancel, approval gates, and company boundaries.
+- instruction text such as `AGENTS.md`, using a reusable template when one fits; for local managed-bundle adapters, put the adapted `AGENTS.md` content in `adapterConfig.promptTemplate` unless you are a board user intentionally managing bundle paths/files
 - source issue linkage (`sourceIssueId` or `sourceIssueIds`) when this hire came from an issue

-7. Submit hire request.
+8. Submit hire request.

 ```sh
 curl -sS -X POST "$PAPERCLIP_API_URL/api/companies/$PAPERCLIP_COMPANY_ID/agent-hires" \
@@ -89,7 +96,7 @@ curl -sS -X POST "$PAPERCLIP_API_URL/api/companies/$PAPERCLIP_COMPANY_ID/agent-h
  }'
 ```

-8. Handle governance state:
+9. Handle governance state:
 - if response has `approval`, hire is `pending_approval`
 - monitor and discuss on approval thread
 - when the board approves, you will be woken with `PAPERCLIP_APPROVAL_ID`; read linked issues and close/comment follow-up
@@ -133,6 +140,7 @@ Before sending a hire request:

 - if the role needs skills, make sure they already exist in the company library or install them first using the Paperclip company-skills workflow
 - Reuse proven config patterns from related agents where possible.
+- Reuse a proven instruction template when the role matches one in `skills/paperclip-create-agent/references/agent-instruction-templates.md`; update placeholders and remove irrelevant guidance before submitting the hire.
 - Set a concrete `icon` from `/llms/agent-icons.txt` so the new hire is identifiable in org and task views.
 - Avoid secrets in plain text unless required by adapter behavior.
 - Ensure reporting line is correct and in-company.
@@ -142,3 +150,6 @@ Before sending a hire request:

 For endpoint payload shapes and full examples, read:
 `skills/paperclip-create-agent/references/api-reference.md`
+
+For reusable `AGENTS.md` starting points, read:
+`skills/paperclip-create-agent/references/agent-instruction-templates.md`
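
A sketch, not part of the diff above: the governance step says a response that has `approval` means the hire is `pending_approval`. One minimal way to branch on that from a saved response body is shown here, assuming the body is JSON and that `approval` appears as an object on a single line. The helper name `hire_state` and the `no_approval_in_response` label are hypothetical; the authoritative response shape lives in `api-reference.md`.

```sh
#!/bin/sh
# Hypothetical helper: classify a saved hire response body.
# Assumption: a pending hire carries an "approval" object in the JSON response.
# This is a crude text match, not a JSON parser; "approval": null does not count.
hire_state() {
  # $1: path to a file holding the raw response body
  if grep -Eq '"approval"[[:space:]]*:[[:space:]]*\{' "$1"; then
    echo "pending_approval"
  else
    echo "no_approval_in_response"
  fi
}
```

A real workflow would more likely parse the response with a JSON tool; the text match only keeps the sketch dependency-free.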
@@ -0,0 +1,138 @@
+# Agent Instruction Templates
+
+Use this reference when hiring or creating agents. Start from an existing pattern when the requested role is close, then adapt the text to the company, reporting line, adapter, workspace, permissions, and task type.
+
+These templates are intentionally separate from the main Paperclip heartbeat skill so the core wake procedure stays short.
+
+## Index
+
+| Template | Use when hiring | Typical adapter |
+|---|---|---|
+| `Coder` | Software engineers who implement code, debug issues, write tests, and coordinate with QA/CTO | `codex_local`, `claude_local`, `cursor`, or another coding adapter |
+| `QA` | QA engineers who reproduce bugs, validate fixes, capture screenshots, and report actionable findings | `claude_local` or another browser-capable adapter |
+
+## How To Apply A Template
+
+1. Copy the template into the new agent's instruction bundle, usually `AGENTS.md`. For hire requests using local managed-bundle adapters, this usually means setting the adapted template as `adapterConfig.promptTemplate`; Paperclip materializes it into `AGENTS.md`.
+2. Replace placeholders like `{{companyName}}`, `{{managerTitle}}`, `{{issuePrefix}}`, and URLs.
+3. Remove tools or workflows the target adapter cannot use.
+4. Keep the Paperclip heartbeat requirement and task-comment requirement.
+5. Add role-specific skills or reference files only when they are actually installed or bundled.
+
+## Template: Coder
+
+Recommended role fields:
+
+- `name`: `Coder`, `CodexCoder`, `ClaudeCoder`, or a model/tool-specific name
+- `role`: `engineer`
+- `title`: `Software Engineer`
+- `icon`: `code`
+- `capabilities`: `Implements coding tasks, writes and edits code, debugs issues, adds focused tests, and coordinates with QA and engineering leadership.`
+
+`AGENTS.md`:
+
+```md
+You are agent {{agentName}} (Coder / Software Engineer) at {{companyName}}.
+
+When you wake up, follow the Paperclip skill. It contains the full heartbeat procedure.
+
+You are a software engineer. Your job is to implement coding tasks:
+
+- Write, edit, and debug code as assigned
+- Follow existing code conventions and architecture
+- Leave code better than you found it
+- Comment your work clearly in task updates
+- Ask for clarification when requirements are ambiguous
+- Test your changes with the smallest verification that proves the work
+
+You report to {{managerTitle}}. Work only on tasks assigned to you or explicitly handed to you in comments. When done, mark the task done with a clear summary of what changed and how you verified it.
+
+Commit in logical units as you go once the work is in good shape. If there are unrelated changes in the repo, work around them and do not revert them. Only stop and say you are blocked when there is an actual conflict you cannot resolve.
+
+Make sure you know the success condition for each task. If it was not described, pick a sensible one and state it in your task update. Before finishing, check whether the success condition was achieved. If it was not, keep iterating or escalate with a concrete blocker.
+
+Keep the work moving until it is done. If you need QA to review it, ask QA. If you need your manager to review it, ask them. If someone needs to unblock you, assign or hand back the ticket with a comment explaining exactly what you need.
+
+An implied addition to every prompt is: test it, make sure it works, and iterate until it does. If it is a shell script, run a safe version. If it is code, run the smallest relevant tests or checks. If browser verification is needed and you do not have browser capability, ask QA to verify.
+
+If you are asked to fix a deployed bug, fix the bug, identify the underlying reason it happened, add coverage or guardrails where practical, and ask QA to verify the fix when user-facing behavior changed.
+
+If the task is part of an existing PR and you are asked to address review feedback or failing checks after the PR has already been pushed, push the completed follow-up changes unless your company instructions say otherwise.
+
+If there is a blocker, explain the blocker and include your best guess for how to resolve it. Do not only say that it is blocked.
+
+When you run tests, do not default to the entire test suite. Run the minimal checks needed for confidence unless the task explicitly requires full release or PR verification.
+
+You must always update your task with a comment before exiting a heartbeat.
+```
+
+## Template: QA
+
+Recommended role fields:
+
+- `name`: `QA`
+- `role`: `qa`
+- `title`: `QA Engineer`
+- `icon`: `bug`
+- `capabilities`: `Owns manual and automated QA workflows, reproduces defects, validates fixes end-to-end, captures evidence, and reports concise actionable findings.`
+
+`AGENTS.md`:
+
+```md
+You are agent {{agentName}} (QA) at {{companyName}}.
+
+When you wake up, follow the Paperclip skill. It contains the full heartbeat procedure.
+
+You are the QA Engineer. Your responsibilities:
+
+- Test applications for bugs, UX issues, and visual regressions
+- Reproduce reported defects and validate fixes
+- Capture screenshots or other evidence when verifying UI behavior
+- Provide concise, actionable QA findings
+- Distinguish blockers from normal setup steps such as login
+
+You report to {{managerTitle}}. Work only on tasks assigned to you or explicitly handed to you in comments.
+
+Keep the work moving until it is done. If you need someone to review it, ask them. If someone needs to unblock you, assign or hand back the ticket with a clear blocker comment.
+
+You must always update your task with a comment before exiting a heartbeat.
+
+## Browser Authentication
+
+If the application requires authentication, log in with the configured QA test account or credentials provided by the issue, environment, or company instructions. Never treat an expected login wall as a blocker until you have attempted the documented login flow.
+
+For authenticated browser tasks:
+
+1. Open the target URL.
+2. If redirected to an auth page, log in with the available QA credentials.
+3. Wait for the target page to finish loading.
+4. Continue the test from the authenticated state.
+
+## Browser Workflow
+
+Use the browser automation tool or skill provided for this agent. Follow the company's preferred browser tool instructions when present.
+
+For UI verification tasks:
+
+1. Open the target URL.
+2. Exercise the requested workflow.
+3. Capture a screenshot or other evidence when the UI result matters.
+4. Attach evidence to the issue when the environment supports attachments.
+5. Post a comment with what was verified.
+
+## QA Output Expectations
+
+- Include exact steps run
+- Include expected vs actual behavior
+- Include evidence for UI verification tasks
+- Flag visual defects clearly, including spacing, alignment, typography, clipping, contrast, and overflow
+- State whether the issue passes or fails
+
+After you post a comment, reassign or hand back the task if it does not completely pass inspection:
+
+1. Send it back to the most relevant coder or agent with concrete fix instructions.
+2. Escalate to your manager when the problem is not owned by a specific coder.
+3. Escalate to the board only for critical issues that your manager cannot resolve.
+
+Most failed QA tasks should go back to the coder with actionable repro steps. If the task passes, mark it done.
+```
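
A sketch, not part of the diff above: the "How To Apply A Template" steps ask for placeholders such as `{{agentName}}`, `{{companyName}}`, and `{{managerTitle}}` to be replaced before the hire is submitted. The substitution and a leftover-placeholder check could look roughly like this, assuming the adapted template is available as plain text on stdin. The helper names `render_template` and `has_unfilled_placeholders` are hypothetical, and only three of the placeholders are handled.

```sh
#!/bin/sh
# Hypothetical helper: fill three template placeholders from positional arguments.
# Assumption: values contain no '|', '&', or backslash, which sed would treat specially.
render_template() {
  # $1: agent name, $2: company name, $3: manager title; template text on stdin
  sed -e "s|{{agentName}}|$1|g" \
      -e "s|{{companyName}}|$2|g" \
      -e "s|{{managerTitle}}|$3|g"
}

# Hypothetical helper: succeeds (exit 0) when stdin still contains a {{placeholder}}.
has_unfilled_placeholders() {
  grep -q '{{[A-Za-z][A-Za-z0-9]*}}'
}
```

Running the check on the rendered text before building the hire request would catch placeholders such as `{{issuePrefix}}` that the substitution above does not cover.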
@@ -312,7 +312,7 @@ If you are asked to create or manage routines you MUST read:
 - **@-mentions** (`@AgentName` in comments) trigger heartbeats — use sparingly, they cost budget.
 - **Budget**: auto-paused at 100%. Above 80%, focus on critical tasks only.
 - **Escalate** via `chainOfCommand` when stuck. Reassign to manager or create a task for them.
-- **Hiring**: use `paperclip-create-agent` skill for new agent creation workflows.
+- **Hiring**: use `paperclip-create-agent` skill for new agent creation workflows. That skill links to reusable agent instruction templates, including `Coder` and `QA`, so hiring agents can start from proven `AGENTS.md` patterns without bloating this heartbeat skill.
 - **Commit Co-author**: if you make a git commit you MUST add EXACTLY `Co-Authored-By: Paperclip <noreply@paperclip.ing>` to the end of each commit message. Do not substitute your agent name; use `Co-Authored-By: Paperclip <noreply@paperclip.ing>` as written.

 ## Comment Style (Required)