[codex] Split backend control-plane QoL slice (#4700)

## Thinking Path > - Paperclip is the control plane for autonomous AI companies, so backend task ownership, recovery, review visibility, and company-scoped limits need to stay enforceable without UI-only coupling. > - Closed PR #4692 bundled those backend changes with UI workflow, docs, skills, workflow, and lockfile churn. > - PAP-2694 asks for a clean backend/control-plane slice from that closed branch. > - This branch starts from current `master` and mines only the `cli`, `packages/db`, `packages/shared`, and `server` contracts/tests needed for the backend behavior. > - It explicitly excludes UI workflow/performance work, `.github/workflows/pr.yml`, `pnpm-lock.yaml`, docs, skills, package-script, adapter UI build-config, and perf fixture script changes; the only UI files are fixture/test updates required by the tightened shared `Company` contract. > - The benefit is a smaller reviewable PR that preserves the control-plane fixes while staying under Greptile s 100-file review limit. ## What Changed - Added company-scoped attachment-size limits through DB schema/migrations, shared company portability contracts, CLI import/export coverage, and server attachment upload enforcement. - Added productivity review service/API behavior for no-comment streak, long-active, and high-churn review issues, including request-depth clamping and issue summary exposure. - Hardened issue ownership and recovery/control-plane paths: peer-agent mutation denial, issue tree pause/resume behavior, stranded recovery origins, and related activity/test coverage. - Preserved related backend contract updates for routine timestamp variables and managed agent instruction bundles because they live in shared/server contracts from the source branch. - Addressed Greptile feedback by making `Company.attachmentMaxBytes` non-optional, simplifying review request-depth clamping, fixing the migration final newline, and enforcing the process-level attachment cap as the final ceiling for uploads. - Added minimal company fixtures needed for repo-wide typecheck/build and kept the PR to 66 changed files with forbidden/non-slice paths excluded. ## Verification - `pnpm install --frozen-lockfile` - `git diff --check origin/master..HEAD` - `git diff --name-only origin/master..HEAD | wc -l` -> 66 files - `git diff --name-only origin/master..HEAD -- .github/workflows/pr.yml pnpm-lock.yaml package.json doc skills .agents scripts packages/adapters` -> no output - `pnpm exec vitest run --config vitest.config.ts packages/shared/src/validators/issue.test.ts packages/shared/src/routine-variables.test.ts packages/shared/src/adapter-types.test.ts cli/src/__tests__/company-import-export-e2e.test.ts cli/src/__tests__/company.test.ts server/src/__tests__/productivity-review-service.test.ts server/src/__tests__/issue-tree-control-service.test.ts server/src/__tests__/issue-tree-control-routes.test.ts server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts server/src/__tests__/issue-attachment-routes.test.ts server/src/__tests__/heartbeat-process-recovery.test.ts server/src/__tests__/issues-service.test.ts` -> 12 files, 147 tests passed - `pnpm exec vitest run --config vitest.config.ts cli/src/__tests__/company-delete.test.ts cli/src/__tests__/company-import-export-e2e.test.ts server/src/__tests__/productivity-review-service.test.ts` -> 3 files, 18 tests passed - `pnpm exec vitest run --config vitest.config.ts server/src/__tests__/issue-attachment-routes.test.ts` -> 1 file, 6 tests passed - `pnpm --filter @paperclipai/db typecheck && pnpm --filter @paperclipai/shared typecheck && pnpm --filter @paperclipai/server typecheck && pnpm --filter paperclipai typecheck` - `pnpm --filter @paperclipai/server typecheck` - `pnpm --filter @paperclipai/ui typecheck && pnpm --filter @paperclipai/ui build` ## Risks - Includes migrations `0073_shiny_salo.sql` and `0074_striped_genesis.sql`; merge ordering matters if another PR adds migrations first. - This is intentionally backend-only apart from fixture/test updates forced by shared type correctness; UI affordances from PR #4692 are not present here and should land in separate UI slices. - The worktree install emitted plugin SDK bin-link warnings for unbuilt plugin packages, but the targeted tests and package typechecks completed successfully. > For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected; check the roadmap first. See `CONTRIBUTING.md`. ## Model Used - OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub workflow. Exact runtime context window was not exposed by the harness. ## Checklist - [x] I have included a thinking path that traces from project context to this change - [x] I have specified the model used (with version and capability details) - [x] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work - [x] I have run tests locally and they pass - [x] I have added or updated tests where applicable - [x] If this change affects the UI, I have included before/after screenshots - [x] I have updated relevant documentation to reflect my changes - [x] I have considered and documented any risks above - [x] I will address all Greptile and reviewer comments before requesting merge --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:46:45 -05:00
parent d9f540c331
commit 1991ec9d6f
66 changed files with 34186 additions and 148 deletions
@@ -472,6 +472,7 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
    retryReason?: "assignment_recovery" | "issue_continuation_needed" | null;
    assignToUser?: boolean;
    activePauseHold?: boolean;
+    livenessState?: "completed" | "advanced" | "plan_only" | "empty_response" | "blocked" | "failed" | "needs_followup" | null;
    runErrorCode?: string | null;
    runError?: string | null;
  }) {
@@ -545,6 +546,7 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
      error: input.runStatus === "succeeded"
        ? null
        : ("runError" in input ? input.runError : "run failed before issue advanced"),
+      livenessState: input.livenessState ?? null,
    });

    await db.insert(issues).values([
@@ -1417,6 +1419,59 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
    }
  });

+  it.each([
+    ["failed", "adapter_failed"],
+    ["failed", "process_lost"],
+    ["timed_out", "adapter_timed_out"],
+  ] as const)(
+    "re-enqueues stranded in-progress work after a %s/%s run before escalating",
+    async (runStatus, runErrorCode) => {
+      const { companyId, agentId, issueId, runId } = await seedStrandedIssueFixture({
+        status: "in_progress",
+        runStatus,
+        runErrorCode,
+      });
+      const heartbeat = heartbeatService(db);
+
+      const result = await heartbeat.reconcileStrandedAssignedIssues();
+      expect(result.dispatchRequeued).toBe(0);
+      expect(result.continuationRequeued).toBe(1);
+      expect(result.escalated).toBe(0);
+      expect(result.issueIds).toEqual([issueId]);
+
+      const runs = await db
+        .select()
+        .from(heartbeatRuns)
+        .where(eq(heartbeatRuns.agentId, agentId));
+      expect(runs).toHaveLength(2);
+
+      const retryRun = runs.find((row) => row.id !== runId);
+      expect(retryRun?.contextSnapshot as Record<string, unknown> | undefined).toMatchObject({
+        issueId,
+        taskId: issueId,
+        retryReason: "issue_continuation_needed",
+        retryOfRunId: runId,
+        source: "issue.continuation_recovery",
+      });
+
+      const recoveries = await db
+        .select()
+        .from(issues)
+        .where(
+          and(
+            eq(issues.companyId, companyId),
+            eq(issues.originKind, "stranded_issue_recovery"),
+            eq(issues.originId, issueId),
+          ),
+        );
+      expect(recoveries).toHaveLength(0);
+
+      if (retryRun?.id) {
+        await waitForRunToSettle(heartbeat, retryRun.id);
+      }
+    },
+  );
+
  it("still re-enqueues stranded assigned todo recovery when an old queued wake exists", async () => {
    const { companyId, agentId, issueId, runId } = await seedStrandedIssueFixture({
      status: "todo",
@@ -2055,18 +2110,21 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
    expect(wakeups).toHaveLength(1);
  });

-  it("re-enqueues continuation when the latest automatic continuation succeeded without closing the issue", async () => {
+  it("records productive continuation instead of recovery when the latest automatic continuation succeeded", async () => {
    const { agentId, issueId, runId } = await seedStrandedIssueFixture({
      status: "in_progress",
      runStatus: "succeeded",
      retryReason: "issue_continuation_needed",
+      livenessState: "advanced",
    });
    const heartbeat = heartbeatService(db);

    const result = await heartbeat.reconcileStrandedAssignedIssues();
-    expect(result.continuationRequeued).toBe(1);
+    expect(result.continuationRequeued).toBe(0);
+    expect(result.productiveContinuationObserved).toBe(1);
+    expect(result.successfulContinuationObserved).toBe(0);
    expect(result.escalated).toBe(0);
-    expect(result.issueIds).toEqual([issueId]);
+    expect(result.issueIds).toEqual([]);

    const issue = await db.select().from(issues).where(eq(issues.id, issueId)).then((rows) => rows[0] ?? null);
    expect(issue?.status).toBe("in_progress");
@@ -2078,14 +2136,10 @@ describeEmbeddedPostgres("heartbeat orphaned process recovery", () => {
      .select()
      .from(heartbeatRuns)
      .where(eq(heartbeatRuns.agentId, agentId));
-    expect(runs).toHaveLength(2);
+    expect(runs.map((row) => row.id)).toEqual([runId]);

-    const retryRun = runs.find((row) => row.id !== runId);
-    expect(retryRun?.id).toBeTruthy();
-    expect((retryRun?.contextSnapshot as Record<string, unknown>)?.retryReason).toBe("issue_continuation_needed");
-    if (retryRun) {
-      await waitForRunToSettle(heartbeat, retryRun.id);
-    }
+    const wakeups = await db.select().from(agentWakeupRequests).where(eq(agentWakeupRequests.agentId, agentId));
+    expect(wakeups).toHaveLength(1);
  });

  it("does not reconcile user-assigned work through the agent stranded-work recovery path", async () => {