Dotta 1fe1067361 Polish board settings and skills workflow (#4863)

## Thinking Path

> - Paperclip's board UI and bundled skills are the operator layer for
configuring agents, routines, issue workflows, and local troubleshooting
loops.
> - The prior rollup mixed this operator polish with database backups,
backend reliability, thread scale, and cost/workflow primitives.
> - This pull request isolates the remaining board QoL, settings,
issue-detail integration, adapter config cleanup, and skills smoke
tooling.
> - It includes some integration-level overlap with the thread and
workflow slices so this branch can run from `origin/master` while still
preserving the full original work.
> - Preferred merge order is the narrower primitives first, then this
integration PR last.
> - The benefit is that reviewers can inspect the user-facing
board/settings/skills layer separately from backend infrastructure
changes.

## What Changed

- Added board/settings polish for agents, routines, company settings,
project workspace detail, and issue detail controls.
- Added agent/routine UI regression tests and New Issue dialog coverage.
- Integrated issue-detail activity/cost/interaction surfaces and leaf
work pause/resume controls.
- Cleaned bundled adapter UI config defaults and onboarding copy.
- Added terminal-bench loop and work-stoppage diagnosis skills plus a
smoke test script.
- Updated attachment type handling and Paperclip skill/API guidance.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/pages/Agents.test.tsx
ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx
ui/src/pages/IssueDetail.test.tsx
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts`
- Result: 7 test files passed, 54 tests passed.
- `pnpm run smoke:terminal-bench-loop-skill`
- Result: JSON output included `"ok": true` and `"cleanup": true`.
- UI screenshots are not included because verification focused on
component/page test coverage for the changed board surfaces.

## Risks

- This is the integration-heavy PR in the split and intentionally
overlaps some component/API primitives with the issue-thread and
workflow PRs so it can run from `origin/master`.
- Preferred merge order: #4859, #4860, #4861, #4862, then this PR last.
If earlier branches merge first, this PR may need a straightforward
conflict refresh in shared UI files.
- The terminal-bench smoke script creates temporary mock issues and
relies on cleanup; the verified run returned `cleanup: true`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>

2026-04-30 15:28:11 -05:00

11 KiB

name	description
diagnose-why-work-stopped	How to handle "why did this work stop / why is this looping?" assignments. Forensics first on the named tree, surface the exact stop-point, frame the fix as a general product rule that respects three invariants (productive work continues, only real blockers stop work, no infinite loops), and deliver a plan — no code changes — gated by board/CTO approval before child issues are created. Use whenever the issue title or body asks for forensics on a stalled, looping, or "went too deep" tree.

Diagnose Why Work Stopped

A repeatable procedure for the recurring class of issues where the user (or a manager) points at a stalled / looping / over-recovered issue tree and asks "why did this stop / why is this looping / how do we make sure this doesn't happen again?"

This skill is diagnostic + product-design, not engineering. The output is a written root cause and an approved plan. No code changes leave this skill.

Canonical execution model: read doc/execution-semantics.md before diagnosing or proposing a new liveness/recovery rule. Use that document as the source of truth for status, action-path, post-run disposition, bounded continuation, productivity review, pause-hold, watchdog, and explicit recovery semantics. If the investigation finds a true product-rule gap, the plan should say whether doc/execution-semantics.md needs a matching update.

When to use

Trigger on an assignment whose title or body matches any of:

"why did this work stop", "why did this stall", "why did this just stop"
"infinite loop", "looping", "spinning", "going too deep", "recovery went too deep"
"liveness — what happened here", "this tree stopped working", "stuck"
"approach it from a product perspective", "general product principle / rule"
An attached link to a specific stalled / looping / over-recovered issue tree

Also use when the user asks for forensics, root cause, or a write-up before any product change.

When NOT to use

The assignment asks you to ship a code change directly. Use normal engineering flow.
The assignment is a normal bug report against a specific feature. Use normal investigation.
You are the original implementer being asked to fix your own bug. Use normal debugging.

Three invariants you must preserve

Every diagnosis and every proposed rule must hold these three invariants together. The user has restated them on at least four issues; treat them as load-bearing:

Productive work continues. Agents that have a clear next action must keep working without needing the user to wake them. (PAP-2674, PAP-2708)
Only real blockers stop work. Stops happen when something genuinely cannot proceed (missing approval, missing dependency, human owner). Pseudo-stops (in_review with no action path, cancelled leaves, malformed metadata) must be detected and routed, not left silent. (PAP-2335, PAP-2674)
No infinite loops. Stranded-work recovery and continuation loops must be bounded and distinguishable from genuinely productive continuation. (PAP-2602, PAP-2486)

If a proposed rule violates any of the three, drop it or rework it. State explicitly in the plan how each invariant is held.
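
Invariants 1 and 3 pull against each other, so it can help to state the bound in terms of progress rather than raw wake count. A minimal sketch, assuming a hypothetical `RunRecord` shape with a `madeProgress` flag and a caller-chosen `maxStreak`; none of these names come from Paperclip's schema or from doc/execution-semantics.md:

```typescript
// Hypothetical run record; field names are illustrative assumptions.
interface RunRecord {
  wakeReason: "continuation_recovery" | "assignment" | "comment";
  madeProgress: boolean; // e.g. a status transition, new comment, or new child issue
}

// Count the trailing recovery wakes that produced no observable progress.
function unproductiveStreak(runs: RunRecord[]): number {
  let streak = 0;
  for (const run of runs.slice().reverse()) {
    if (run.wakeReason !== "continuation_recovery" || run.madeProgress) break;
    streak++;
  }
  return streak;
}

// Productive continuation is never cut off; only an unproductive streak is bounded.
function mayContinue(runs: RunRecord[], maxStreak: number): boolean {
  return unproductiveStreak(runs) < maxStreak;
}
```

Under this framing a long chain of productive wakes never trips the bound, while repeated recovery wakes with nothing to show for them do.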

Procedure

0. Read the current execution contract

Before walking the tree, read doc/execution-semantics.md and keep its terms intact:

live path / waiting path / recovery path
post-run disposition: terminal, explicitly live, explicitly waiting, invalid
bounded run_liveness_continuation
productivity review vs liveness recovery
active subtree pause holds
silent active-run watchdog

Do not invent a new rule until you can state how it differs from the current execution semantics document.

1. Forensics on the named tree — before anything else

Do this in the same heartbeat. Do not propose a rule until you have a concrete stop point.

Open the linked issue (and its blocker chain, parents, recovery siblings, recent runs).
Walk the tree node-by-node and find the exact issue + state combination that stops the world. Common shapes seen in the company so far:
- in_review with no typed execution participant, no active run, no pending interaction, no recovery issue (PAP-2335, PAP-2674).
- in_progress after a successful run with no future action path queued (PAP-2674).
- Blocker chain whose leaf is cancelled / malformed / cross-company-inaccessible (PAP-2602).
- issue.continuation_recovery waking the same issue >N times after successful runs (PAP-2602).
- Stranded-work recovery treating its own recovery issues as more recoverable source work (PAP-2486).
Quote the evidence: run ids, comment timestamps, status transitions. "Inferred" is acceptable only when an API boundary blocks direct evidence — say so explicitly and mark the claim provisional (PAP-2631).
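
The blocker-chain shape above can be checked mechanically once the tree has been fetched. A rough sketch, assuming a hypothetical in-memory `ChainNode` record keyed by issue id; only `blockedByIssueIds` is a name used in this skill, and an id that is absent from the record stands in for a malformed or cross-company-inaccessible reference:

```typescript
// Hypothetical issue shape; only blockedByIssueIds is named in this skill.
interface ChainNode {
  id: string;
  status: string;
  blockedByIssueIds: string[];
}

type DeadLeaf = { id: string; reason: "cancelled" | "missing_or_inaccessible" };

// Walk the blocker chain depth-first and report leaves that can never unblock the root.
function findDeadLeaves(rootId: string, issues: Record<string, ChainNode | undefined>): DeadLeaf[] {
  const dead: DeadLeaf[] = [];
  const seen: Record<string, true> = {};
  const stack: string[] = [rootId];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (seen[id]) continue; // guard against cyclic blocker metadata
    seen[id] = true;
    const node = issues[id];
    if (!node) {
      // Not readable here: a malformed reference or an issue outside this token's scope.
      dead.push({ id, reason: "missing_or_inaccessible" });
      continue;
    }
    if (node.status === "cancelled" && node.blockedByIssueIds.length === 0) {
      dead.push({ id, reason: "cancelled" });
    }
    stack.push(...node.blockedByIssueIds);
  }
  return dead;
}
```

Each reported leaf is a candidate stop point, not a conclusion; it still needs the run ids and timestamps that show the chain was actually waiting on it.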

Respect the API boundary. If the linked issue is in another company and your agent token returns 403, do not bypass scoping. Either request a board-approved diagnostic path or proceed from inferred PAP-side evidence and label it.

2. Survey recent shipped work

Before proposing a new product rule, read what already shipped this week in the same area. The user has explicitly called this out (PAP-2602): "review our recent work on liveness that we shipped in the last couple of days." A new rule that contradicts code merged 48 hours ago is rework, not improvement.

Quick survey:

Recent merged PRs in the affected area.
Recent done issues whose title mentions liveness, recovery, productivity, continuation, or the affected subsystem.
Any active plan documents on parent issues. The fix may belong as a revision to an existing plan, not as a new top-level proposal.

State in the forensics: "I reviewed X, Y, Z. The new gap is …"

3. Classify each non-progressing issue in the tree

For every issue in the affected tree that is not done / cancelled / actively running, decide:

Truly needs human or board intervention — name the owner and the action.
Agent-actionable but not currently routed — name the rule that would have routed it, and the agent that should have been woken.
Already covered — point at the active run, queued wake, recovery issue, or pending interaction.

This is the table the user has asked for repeatedly (PAP-2335). Without it the plan is abstract.
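
One possible way to keep that table consistent is to model the three buckets as a tagged union and render rows from it. This is only a sketch: `StallClass`, `describeClass`, and `renderClassificationTable` are hypothetical names, and the column layout is an assumption rather than a required format:

```typescript
// Hypothetical classification shape mirroring the three buckets above.
type StallClass =
  | { kind: "human-needed"; owner: string; action: string }
  | { kind: "agent-actionable"; missingRule: string; agent: string }
  | { kind: "already-covered"; coveredBy: "active run" | "queued wake" | "recovery issue" | "pending interaction" };

// Each bucket must carry the detail the procedure asks for.
function describeClass(c: StallClass): string {
  switch (c.kind) {
    case "human-needed":
      return `${c.owner}: ${c.action}`;
    case "agent-actionable":
      return `${c.agent} should have been woken by: ${c.missingRule}`;
    case "already-covered":
      return `covered by ${c.coveredBy}`;
  }
}

// Render one Markdown table row per non-progressing issue.
function renderClassificationTable(rows: Array<{ issueId: string; cls: StallClass }>): string {
  const header = ["| Issue | Class | Detail |", "| --- | --- | --- |"];
  const body = rows.map((r) => `| ${r.issueId} | ${r.cls.kind} | ${describeClass(r.cls)} |`);
  return header.concat(body).join("\n");
}
```

Because every variant requires its own fields, a row cannot be written as human-needed without naming an owner and an action.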

4. Frame as a general product rule

The user does not want a one-off patch on the named tree. They want the rule. Three checks:

The rule is stated as a contract, not as an if/else patch. Example contract: "every agent-owned non-terminal issue must finish each heartbeat with a terminal state, an explicit waiting path, or an explicit live path" (PAP-2674).
The rule is reconciled against doc/execution-semantics.md. Prefer citing and applying the existing contract; propose a document change only when the current doc is incomplete or contradicted by accepted/implemented behavior.
The rule explicitly preserves the three invariants above. Show the work.

If the rule would have blocked a recent productive run from succeeding, drop or narrow it.
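
The example contract from PAP-2674 can be read as a total function over what an issue looks like when a heartbeat ends. A minimal sketch, assuming a hypothetical `HeartbeatEnd` snapshot and assuming `done` and `cancelled` are the terminal statuses; doc/execution-semantics.md remains the authority on the real definitions:

```typescript
type PostRunDisposition = "terminal" | "explicitly live" | "explicitly waiting" | "invalid";

// Hypothetical end-of-heartbeat snapshot; field names are assumptions.
interface HeartbeatEnd {
  status: string;
  queuedNextAction: string | null; // e.g. a queued wake or an active run
  waitingOn: string | null; // e.g. a pending approval, a blocker, or a human owner
}

// Assumed terminal statuses for this sketch.
const TERMINAL_STATUSES = ["done", "cancelled"];

function dispositionOf(end: HeartbeatEnd): PostRunDisposition {
  if (TERMINAL_STATUSES.indexOf(end.status) !== -1) return "terminal";
  if (end.queuedNextAction !== null) return "explicitly live";
  if (end.waitingOn !== null) return "explicitly waiting";
  return "invalid"; // the silent stop the contract forbids
}
```

Stating the rule this way makes the gap visible: anything that falls through to `invalid` is a case the contract must detect and route.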

5. Plan, do not code

Write the plan into the issue's plan document. Cover:

Forensics summary (root cause + evidence).
The general product rule, stated as a contract.
Whether the existing doc/execution-semantics.md contract already covers the case, or what exact documentation update is needed.
Phased subtasks: typically Phase 0 resolves the named live tree (carefully, not destructively), Phase 1 codifies the contract in docs, then implementation phases for detection, recovery, UI surfacing, security review, QA, and CTO review.
Explicit assignees per phase; favor team specialty (CodexCoder for server, ClaudeCoder for FE, UXDesigner for visible state, SecurityEngineer for ownership/permissions, QA for validation).
Blocking dependencies wired with blockedByIssueIds, parallel branches identified.

Do not create the child issues yet. Do not push code.
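
Before requesting approval, the dependency wiring in the plan can be sanity-checked on paper. The sketch below groups planned phases into waves so parallel branches and cycles are visible. `PlannedPhase`, `key`, and `planWaves` are hypothetical names; only `blockedByIssueIds` comes from this skill, and here it holds placeholder keys because the child issues do not exist yet:

```typescript
// Hypothetical plan entry used before any child issue is created.
interface PlannedPhase {
  key: string; // placeholder id for the planned phase
  assignee: string;
  blockedByIssueIds: string[]; // placeholder keys of the phases this one waits on
}

// Group phases into waves: every phase in a wave has all blockers in earlier waves,
// so phases that share a wave are the parallel branches.
function planWaves(phases: PlannedPhase[]): string[][] {
  const done: Record<string, true> = {};
  let remaining = phases.slice();
  const waves: string[][] = [];
  while (remaining.length > 0) {
    const ready = remaining.filter((p) => p.blockedByIssueIds.every((b) => done[b] === true));
    if (ready.length === 0) {
      const stuck = remaining.map((p) => p.key).join(", ");
      throw new Error("dependency cycle or unknown blocker among: " + stuck);
    }
    for (const p of ready) done[p.key] = true;
    remaining = remaining.filter((p) => done[p.key] !== true);
    waves.push(ready.map((p) => p.key));
  }
  return waves;
}
```

A plan whose wiring throws here would also deadlock once the real issues are created, which is cheaper to find at the plan stage.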

6. Request approval, then decompose

Open a request_confirmation interaction targeting the latest plan revision. Idempotency key confirmation:{issueId}:plan:{revisionId}.
Wait for board/CTO acceptance. If the user posts a new comment that supersedes the plan, the prior confirmation is invalidated — open a fresh confirmation tied to the new revision (PAP-2602 cycled three revisions; that is fine).
Only after acceptance: create the phased child issues with the right assignees and dependencies, then block this parent on the final QA / CTO review issue so the parent only wakes when the chain finishes.
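
Because the key embeds the revision id, a confirmation for an older revision can never match the latest plan. A small sketch of that property; the key format is the one given above, while `planConfirmationKey` and `confirmationIsCurrent` are hypothetical helper names, not part of the Paperclip API:

```typescript
// Build the key in the format given above: confirmation:{issueId}:plan:{revisionId}.
function planConfirmationKey(issueId: string, revisionId: string): string {
  return `confirmation:${issueId}:plan:${revisionId}`;
}

// Hypothetical check: an acceptance only counts if it targets the latest plan revision.
function confirmationIsCurrent(confirmedKey: string, issueId: string, latestRevisionId: string): boolean {
  return confirmedKey === planConfirmationKey(issueId, latestRevisionId);
}
```

Re-requesting confirmation for the same revision yields the same key, so a retry does not open a duplicate; a new revision yields a new key, which is what forces the fresh confirmation.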

7. Phase 0 hygiene on the named tree

Phase 0 cleans up the live tree without papering over evidence:

Move stalled in_review leaves with no participant to todo with a precise next action and named owner (PAP-2335).
Detach cancelled/dead blockers from chains they were holding hostage; do not silently mark issues done to clear backlog.
Leave a comment on the original named issue summarizing what changed and why; never hide the recovery chain history.

8. Final close-out

When the phase chain is complete, post a board-level summary comment on the parent issue: what changed, what the new contract is, what the rollout step is (e.g. "restart the control-plane to pick up the new response shape"), and the live state of the originally-named tree. Then close the parent.

Pitfalls

Coding before approval. The user has said "make a plan first" on every recent diagnostic issue. Producing code in the forensic phase wastes the round-trip.
Restoring one invariant at the cost of another. Bound continuation too tightly and productive work stalls; loosen recovery and infinite loops return. Always check all three.
Skipping the recent-work survey. Proposing a contract that contradicts what shipped 24 hours ago is the easiest way to get the plan rejected.
Letting "in_review" mean done. A leaf assigned to another agent with no participant or active run is not progress; treat it as a stop.
Bypassing company scoping. Cross-company forensics needs a board-approved diagnostic path, not a database read.
Recursive recovery. Stranded-work recovery that recovers its own recovery issues is the canonical infinite loop (PAP-2486). Detect it and refuse to deepen.
Hiding the chain. Don't silently delete or hide the symptomatic recovery issues — the operator needs the audit trail.
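
The recursive-recovery pitfall is easy to express as a guard. A sketch under stated assumptions: `RecoverableIssue` and its `recoveryOfIssueId` field are hypothetical, used here only to mark an issue that was itself created by recovery:

```typescript
// Hypothetical issue shape; recoveryOfIssueId is an illustrative field.
interface RecoverableIssue {
  id: string;
  recoveryOfIssueId: string | null; // set when this issue was created by recovery
}

// Stranded-work recovery may only target source work, never another recovery issue.
function mayOpenRecoveryFor(issue: RecoverableIssue): boolean {
  return issue.recoveryOfIssueId === null;
}

// Number of recovery hops above an issue; more than one hop indicates recursion.
function recoveryDepth(issue: RecoverableIssue, byId: Record<string, RecoverableIssue | undefined>): number {
  const seen: Record<string, true> = {};
  let depth = 0;
  let current: RecoverableIssue | undefined = issue;
  while (current !== undefined && !seen[current.id]) {
    const parentId: string | null = current.recoveryOfIssueId;
    if (parentId === null) break;
    seen[current.id] = true;
    depth++;
    current = byId[parentId];
  }
  return depth;
}
```

During forensics, a depth above one on any issue in the named tree is direct evidence of the loop and belongs in the write-up alongside the run ids.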

Verification checklist (before posting the plan)

The exact stop point in the named tree is identified with run ids / comment ids.
Recent shipped work in the same area was surveyed and is referenced.
Every non-progressing issue is classified human-needed / agent-actionable / already-covered.
The proposed rule is stated as a contract, not a patch.
All three invariants are explicitly preserved.
No code change has landed in this heartbeat.
A request_confirmation against the latest plan revision is open.
Phase 0 of the plan addresses the live named tree without destroying evidence.
Implementation phases name specialty-appropriate assignees and blockedByIssueIds dependencies.
