The init-only and partial-run error paths now embed the K8s container
terminated state (reason, message, signal, OOM hint) directly in the
errorMessage. This eliminates the kubectl round-trip when diagnosing
adapter_failed runs: the surfaced error is self-explanatory.
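As an illustration of the kind of summary string this produces, here is a minimal sketch. The `TerminatedState` shape and the `describeTerminatedState` name are hypothetical, not the adapter's actual code; the fields mirror a subset of what Kubernetes reports under a container's terminated state, and the signal is inferred from the exit code using the usual 128 + signal convention when K8s does not report one.

```typescript
// Hypothetical shape: a subset of the fields Kubernetes reports under
// containerStatuses[].state.terminated.
interface TerminatedState {
  exitCode: number;
  reason?: string;
  message?: string;
  signal?: number;
}

// Only the two signals most relevant to pod termination are named here.
const SIGNAL_NAMES: Record<number, string> = { 9: "SIGKILL", 15: "SIGTERM" };

function describeTerminatedState(state: TerminatedState): string {
  const parts: string[] = [`exit code ${state.exitCode}`];
  // Prefer the reported signal; otherwise infer it from exit codes above 128.
  const signal =
    state.signal || (state.exitCode > 128 ? state.exitCode - 128 : undefined);
  if (signal !== undefined) {
    const name = SIGNAL_NAMES[signal] ?? `signal ${signal}`;
    // SIGKILL is what the kernel OOM killer sends, hence the hint.
    parts.push(signal === 9 ? `${name} (commonly OOMKilled)` : name);
  }
  if (state.reason) parts.push(`reason=${state.reason}`);
  if (state.message) parts.push(`message=${state.message}`);
  return parts.join(", ");
}
```

The OOM hint is only a hint: SIGKILL can also come from a manual kill or an eviction, which is why the sketch keeps the K8s-reported reason and message alongside it.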
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Init-only runs that exit with a non-zero code now surface a more actionable
message naming the exit code and the likely cause (unsupported model or
rejected session) instead of the generic "did not produce a result" text.
Helps operators diagnose model-id / billing-tier failures (e.g. opus 4.6).
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Capture the claude container's terminated state (exit code, reason, message,
signal) and surface it in the truncation error so operators see *why* the run
was cut short — e.g. "exit code 137, SIGKILL (commonly OOMKilled),
reason=OOMKilled, message=Memory cgroup out of memory" instead of just a
"truncated" label with no diagnostic context.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
When Claude produces assistant content (output_tokens > 0) but the stream ends
without a result event, classify the run as truncated mid-stream rather than
falling through to the generic "did not produce a result — check API
credentials" message. The misleading hint pointed operators at auth/model
config when the real cause was pod termination, OOMKill, or CLI crash.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Adds a diagnostic log line after skill resolution so operators can see exactly
which skills were bundled into each run, making it straightforward to diagnose
skill availability issues. Also surfaces the skill list in the onMeta
commandNotes for run metadata visibility.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two concurrent execute() calls for the same agent can both pass the
list-then-create guard before either job appears in the other's query.
The new module-level agentCreationMutex serializes the guard+create phase
within the process so only one call enters listNamespacedJob at a time.
The mutex is acquired after sanitizing the agent ID and released in a
finally block that wraps the entire guard+create section, so all early
return paths (guard blocks, create failures) cleanly release it. Variables
used in both the guard+create and log-streaming phases are hoisted to
before the try block. Cross-agent calls use separate mutex slots and are
unaffected.
Added two vitest cases verifying same-agent serialization and that
different-agent calls are not serialized.
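The per-agent serialization described above can be sketched with a keyed promise chain. `KeyedMutex` and `guardAndCreate` are hypothetical names for illustration; the real agentCreationMutex may be implemented differently.

```typescript
// Hypothetical helper: one promise chain ("slot") per key.
class KeyedMutex {
  private tails = new Map<string, Promise<void>>();

  // Resolves with a release function once every earlier holder of `key`
  // has released.
  async acquire(key: string): Promise<() => void> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    const tail = previous.then(() => current);
    this.tails.set(key, tail);
    await previous;
    return () => {
      release();
      // Drop the slot when nobody has queued behind this holder.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    };
  }
}

const agentCreationMutex = new KeyedMutex();

// Hypothetical wrapper: `work` stands in for the list-then-create guard
// plus the Job creation call.
async function guardAndCreate(
  agentId: string,
  work: () => Promise<void>,
): Promise<void> {
  const release = await agentCreationMutex.acquire(agentId);
  try {
    await work();
  } finally {
    release(); // runs on early returns and on thrown errors alike
  }
}
```

Because each key has its own chain, two calls for different agents never wait on each other, while two calls for the same agent run strictly one after the other. This only protects a single process; multiple adapter processes would still need a server-side guard.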
Co-Authored-By: Paperclip <noreply@paperclip.ing>
parseClaudeStreamJson now tracks assistant events with stop_reason:null and
output_tokens:0 (the MiniMax degraded-response pattern). When no result event
follows, execute() returns errorCode:"llm_api_error" with a descriptive message
instead of the generic adapter_failed.
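A minimal sketch of how the degraded-response pattern might be detected from raw stdout. The `StreamEvent` shape and both function names are assumptions made for illustration; parseClaudeStreamJson's real bookkeeping is not shown in this log.

```typescript
// Hypothetical, simplified view of one parsed stream-json line.
interface StreamEvent {
  type?: string;
  message?: {
    stop_reason?: string | null;
    usage?: { output_tokens?: number };
  };
}

function parseStreamLines(stdout: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const line of stdout.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("{")) continue;
    try {
      events.push(JSON.parse(trimmed) as StreamEvent);
    } catch {
      // Not valid JSON: treat it as an ordinary log line and skip it.
    }
  }
  return events;
}

// True when every assistant event was "empty" (stop_reason null and zero
// output tokens) and no result event followed.
function isDegradedResponse(stdout: string): boolean {
  const events = parseStreamLines(stdout);
  if (events.some((e) => e.type === "result")) return false;
  const assistants = events.filter((e) => e.type === "assistant");
  return (
    assistants.length > 0 &&
    assistants.every(
      (e) =>
        e.message?.stop_reason === null &&
        (e.message?.usage?.output_tokens ?? 0) === 0,
    )
  );
}
```

A caller could map a `true` result to errorCode "llm_api_error" and fall back to the generic path otherwise.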
Co-Authored-By: Paperclip <noreply@paperclip.ing>
When a K8s Job is deleted externally (kubectl delete job or TTL before
terminal condition observed) and stdout has no result event, the adapter
now returns errorCode "k8s_job_deleted_externally" with the message
"K8s Job was deleted externally before Claude could complete" instead of
the misleading "Claude exited with code -1".
Tracks a jobDeletedExternally flag in execute() on the jobGone path and
checks it in the !parsed branch before falling through to buildPartialRunError.
Only applies when exitCode is null (pod gone alongside the job).
Adds regression test: FAR-31 scenario where job 404s mid-run with partial
stdout and missing pod produces the new error code.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Verifies that a terminating K8s job (deletionTimestamp set, no
Complete/Failed condition) is skipped by the concurrency guard so
subsequent heartbeat runs are not incorrectly blocked.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Jobs being deleted via kubectl enter a Terminating state where
deletionTimestamp is set but no Complete/Failed condition is added.
The concurrency guard previously treated these as running, blocking
all subsequent heartbeat runs for the agent until the job fully
disappeared from the K8s API.
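The corrected guard condition can be sketched as follows. `JobLike` is a hypothetical minimal subset of a batch/v1 Job, and `isJobBlocking` is an illustrative name rather than the adapter's actual function.

```typescript
// Hypothetical minimal subset of a batch/v1 Job object.
interface JobLike {
  metadata?: { deletionTimestamp?: string | Date };
  status?: { conditions?: Array<{ type: string; status: string }> };
}

// A Job blocks a new run only if it is neither finished nor terminating.
function isJobBlocking(job: JobLike): boolean {
  // Terminating: deletionTimestamp is set, and no terminal condition
  // will ever be added, so the Job must not count as running.
  if (job.metadata?.deletionTimestamp) return false;
  const finished = (job.status?.conditions ?? []).some(
    (c) => (c.type === "Complete" || c.type === "Failed") && c.status === "True",
  );
  return !finished;
}
```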
Co-Authored-By: Paperclip <noreply@paperclip.ing>
The prior approach (commit b607657) converted Claude's stream-json into
flat plain text before calling onLog. This stripped the structure the
Paperclip UI needs — its adapter ui-parser (src/ui-parser.ts, exported
via the package's ./ui-parser entry) expects raw stream-json lines and
emits structured transcript entries (assistant / thinking / tool_call /
tool_result / init / result) that the UI renders as rich blocks, just
like claude_local.
claude_local passes stdout through unchanged to onLog for the same
reason — the server persists raw lines and the UI parser turns them
into rendered transcript entries. Mirror that here.
formatClaudeStreamLine stays as an internal helper for future CLI use,
but is no longer applied in the K8s streaming path.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Exposing formatClaudeStreamLine at the package root caused Paperclip reinstalls
to fail with "'./cli/index.js' does not provide an export named
'formatClaudeStreamLine'". The host process caches child ESM module records
across reinstalls; linking the new dist/index.js re-export against the cached
old dist/cli/index.js fails.
The symbol is only used internally by server/execute.ts (which imports from
./cli/format-event.js directly), so drop the public re-export.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
All output sent to Paperclip via onLog now passes through formatClaudeStreamLine,
converting raw stream-json blobs into human-readable text consistent with how
the CLI and claude_local adapter format events.
Changes:
- format-event.ts: add formatClaudeStreamLine(raw) -> string | null.
  Plain-text equivalent of printClaudeStreamEvent: no ANSI colours, returns
  null for lines to suppress (assistant with no content, unknown events).
  Handles: system/init, assistant (text/thinking/tool_use), user (tool_result),
  result (summary + tokens), rate_limit_event. Non-JSON lines pass through.
- execute.ts: wire formatClaudeStreamLine into streamPodLogsOnce write handler.
  Raw chunks are still stored in chunks[] for parseClaudeStreamJson; only the
  onLog path receives formatted text.
- 12 new tests for formatClaudeStreamLine covering all event types.
- 352/352 tests pass.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
rate_limit_event was previously falling through to the debug-only branch
and silently dropped in non-debug mode. Now it surfaces a concise,
human-readable line for CLI consumers:
    rate_limit: type=five_hour status=allowed resets=2026-04-22T06:00:00.000Z
Two tests cover the exact FAR-32 repro payload and graceful handling of
missing rate_limit_info fields.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Extends the previous fix (which only covered assistant/user) to skip every
JSON object with a non-empty "type" field — system, assistant, user,
rate_limit_event, result, and any future event types. This prevents all
structured protocol artefacts from being surfaced verbatim as error messages.
Root cause of the new repro: when Claude emits a rate_limit_event before
producing output and then exits without a result event, the rate_limit_event
JSON blob was becoming the "first content line" and appearing in the error:
Claude exited with code -1: {"type":"rate_limit_event","rate_limit_info":{...}}
With this fix, all typed events are filtered and the initOnlyOutput branch
fires, producing the clean diagnostic:
Claude started but did not produce a result (model: claude-opus-4-7)
— check API credentials, model support, and adapter config
Updated the "result event as content" test to match the new (correct) behaviour:
in production buildPartialRunError is only called when parseClaudeStreamJson
returns null (no result event), so the prior test was exercising a degenerate
state that cannot occur through execute().
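The filter rule (skip every JSON object with a non-empty "type" field) might look like the sketch below. Both function names are hypothetical, and buildPartialRunError's real structure is not shown in this log.

```typescript
// True for any line that parses as a JSON object with a non-empty string
// "type" field, i.e. a structured protocol event rather than free text.
function isTypedProtocolEvent(line: string): boolean {
  const trimmed = line.trim();
  if (!trimmed.startsWith("{")) return false;
  try {
    const parsed: unknown = JSON.parse(trimmed);
    if (typeof parsed !== "object" || parsed === null) return false;
    const type = (parsed as { type?: unknown }).type;
    return typeof type === "string" && type.length > 0;
  } catch {
    return false; // malformed JSON is treated as ordinary content
  }
}

// First non-empty stdout line that is not a protocol event, if any.
function firstContentLine(stdout: string): string | undefined {
  return stdout
    .split("\n")
    .map((line) => line.trim())
    .find((line) => line.length > 0 && !isTypedProtocolEvent(line));
}
```

When `firstContentLine` returns `undefined`, every line was a typed event, which corresponds to the case where the initOnlyOutput branch takes over.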
Co-Authored-By: Paperclip <noreply@paperclip.ing>
When a model produces assistant events with output_tokens=0 but no result
event (e.g. MiniMax-M2.7 thinking-only output), the partial-run error
previously surfaced the raw assistant JSON blob verbatim, producing an
unreadable message like "Claude exited with code -1: {\"type\":\"assistant\",...}".
Fix: extend the content-line filter in buildPartialRunError to also skip
assistant and user event types (intermediate streaming events), in addition
to system events. result events are still retained since they may carry
useful terminal error details. When all stdout lines are filtered, the
existing initOnlyOutput branch triggers and surfaces a clean diagnostic:
"Claude started but did not produce a result (model: MiniMax-M2.7) — check
API credentials, model support, and adapter config".
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Poll GET /api/heartbeat-runs/:runId on every keepalive tick (15s); when
  status != 'running', delete the K8s Job, set logStopSignal, and return
  errorCode='cancelled', so the Job is gone within ~15s of external
  cancellation.
- SIGTERM handler best-effort deletes all active Jobs/Secrets and re-emits
  the signal to let the process exit naturally.
- Export shouldAbortForCancellation() helper; add tests for helper, cancel
  poll path, and SIGTERM cleanup.
- Guard: a missing PAPERCLIP_API_URL logs a warning and skips cancel polling;
  HTTP 5xx from the poll is treated as transient; the reattach path skips the
  cancel poll.
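The poll decision above can be summarized in a small pure function. This is a sketch only: `decideAfterCancelPoll` is a hypothetical name (the signature of the exported shouldAbortForCancellation() is not shown here), and treating non-200, non-5xx responses as "keep going" is an assumption, not something this log states.

```typescript
type PollDecision = "continue" | "abort";

// httpStatus: status code of the poll response.
// runStatus: the run's status field from the response body, if readable.
function decideAfterCancelPoll(
  httpStatus: number,
  runStatus?: string,
): PollDecision {
  // 5xx is transient: try again on the next keepalive tick.
  if (httpStatus >= 500) return "continue";
  // Assumption: an unreadable or unexpected response never cancels a run.
  if (httpStatus !== 200 || runStatus === undefined) return "continue";
  // Any status other than 'running' means the run was ended externally.
  return runStatus === "running" ? "continue" : "abort";
}
```

On "abort" the caller would delete the K8s Job, set logStopSignal, and return errorCode='cancelled'.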
Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Add `hasOutOfProcessLiveness: true` to createServerAdapter() so the
  reaper skips local PID checks and uses the staleness window instead.
- Remove the initial onSpawn call and all periodic keepalive onSpawn
  refreshes that were compensating for the missing flag.
- Remove POST_TERMINAL_KEEPALIVE_MS constant and keepaliveTick counter
  that backed those workarounds.
- Cast required: adapter-utils ServerAdapterModule type predates this field.
- Bump to 0.1.38.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Previously the test suite relied on real fs.stat completing within the fake
timer advance window (~11200ms). Under CI with 11 parallel test files the I/O
could drain later than the advances allowed, causing a 1-in-4 timeout on the
"logs pod pending" test.
Fix: mock @paperclipai/adapter-utils/server-utils using vi.hoisted() + Object.assign
so readPaperclipRuntimeSkillEntries resolves immediately as a microtask. All other
exports are forwarded to the real module via importOriginal. Each beforeEach that
calls vi.resetAllMocks() or vi.clearAllMocks() now also calls
mockReadSkillEntries.mockResolvedValue([]) to restore the implementation.
Timer advances in affected tests are simplified to reflect the purely fake-timer
sequence (no I/O drain prefix). All 323 tests pass deterministically.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The log-stream-exit grace timer never fired because logExitTime was set
in the .then() of streamPodLogs, which only resolves once stopSignal is
set — but stopSignal is only set when completionWithGrace fires, which
requires logExitTime to be non-null. Classic deadlock.
Fix: add onFirstStreamExit callback to streamPodLogs, called after
attempt=0's streamPodLogsOnce returns (the first container exit signal).
execute() passes a closure that sets logExitTime immediately, breaking
the circular dependency and allowing the 30s grace timer to fire
correctly when K8s Job conditions lag container exit.
Tests: all 323 pass including the two FAR-23 grace-period regression tests.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
readPaperclipRuntimeSkillEntries does real fs.stat I/O under fake timers,
delaying execute()'s fake-timer registration by ~3200-4200ms of fake time
when tests run in isolation (cold OS page cache). The previous approach
tried vi.spyOn on an ESM module namespace export, which throws
"Cannot redefine property" — a fundamental ESM constraint.
Fix: remove the broken spy. Instead, each timer-heavy test now uses enough
advanceTimersByTimeAsync calls to (a) give the event loop sufficient turns
for the I/O to drain, and (b) cover the full fake-timer sequence even with
the maximum observed I/O delay. Patterns chosen:
  reconnects (needs t+6000):        6 advances, ~12200ms total
  deadline exceeded (needs t+3000): 5 advances, ~8400ms total
  pod-creation wait (needs t+5000): 5 advances, ~9400ms total
execute.ts line coverage: 82.57% (was ~24% before this task's test additions).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- #9: match Paperclip container by name in k8s-client instead of
  trusting spec.containers[0], which could be a service-mesh sidecar
- #11: key assistant-text dedup by (message.id, index) so legitimate
  duplicate content across turns isn't collapsed in the summary
- #16: trim trailing hyphens from sanitized K8s names so truncation
  doesn't produce names ending in "-"
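For #16, the important detail is the order of operations: trailing hyphens have to be trimmed after truncation, not before. A minimal sketch, assuming a hypothetical `sanitizeK8sName` helper and the 63-character DNS-1123 label limit as the default length (the adapter's actual limit and character rules may differ):

```typescript
const MAX_LABEL_LENGTH = 63; // DNS-1123 label limit

function sanitizeK8sName(
  raw: string,
  maxLength: number = MAX_LABEL_LENGTH,
): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-") // collapse runs of invalid characters
    .replace(/^-+/, "") // names must start with an alphanumeric
    .slice(0, maxLength)
    .replace(/-+$/, ""); // trim last: truncation can expose a hyphen
}
```

For example, `sanitizeK8sName("Agent_42/alpha", 9)` truncates to `agent-42-` and then trims to `agent-42`.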
Findings #5 (keepalive re-verify) and #6 (one-shot log dedup) were
already addressed in the current code (verified during this review).
#8 (orphan reattach behavior) requires a product decision on whether
"new session wins" is intentional, so it is deferred.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
When a large prompt creates a K8s Secret, it can orphan if the process
crashes before the finally block runs. Now the Secret gets an
ownerReference pointing to the Job after creation, so K8s GC cleans it
up automatically. Also cleans up the Secret on job creation failure.
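A sketch of the ownerReference that would be attached to the Secret. The helper names are hypothetical and the actual patch call against the K8s API is omitted; the field names follow the standard Kubernetes ownerReferences schema.

```typescript
interface OwnerReference {
  apiVersion: string;
  kind: string;
  name: string;
  uid: string;
  blockOwnerDeletion?: boolean;
}

// The uid must come from the created Job object; name alone is not enough
// for the garbage collector to match the owner.
function jobOwnerReference(job: { name: string; uid: string }): OwnerReference {
  return {
    apiVersion: "batch/v1",
    kind: "Job",
    name: job.name,
    uid: job.uid,
    blockOwnerDeletion: false,
  };
}

// Body of a merge patch that attaches the owner to an existing Secret.
function secretOwnerPatch(job: { name: string; uid: string }) {
  return { metadata: { ownerReferences: [jobOwnerReference(job)] } };
}
```

The Secret and the Job must live in the same namespace for the reference to be honoured, and because the uid is only known after the Job exists, the Secret has to be patched after Job creation rather than created with the reference.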
Co-Authored-By: Paperclip <noreply@paperclip.ing>
The trunc function in the RTK filter script now walks back from the
truncation point past continuation bytes and checks whether the full
codepoint fits, avoiding replacement characters from mid-codepoint slicing.
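The same idea expressed in TypeScript, as a sketch only: the RTK filter script itself is not shown here and may be written in another language, and `truncateUtf8` is a hypothetical name. This variant simply drops any codepoint that the byte limit would split.

```typescript
// Truncate to at most maxBytes of UTF-8 without splitting a codepoint.
function truncateUtf8(text: string, maxBytes: number): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = Math.max(0, maxBytes);
  // bytes[end] is the first excluded byte. Continuation bytes look like
  // 10xxxxxx; if the cut lands on one, step back to that codepoint's lead
  // byte so the whole codepoint is excluded.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;
  return new TextDecoder().decode(bytes.subarray(0, end));
}
```

With "héllo" (where "é" occupies two bytes), a limit of 2 bytes yields "h" rather than "h" followed by a replacement character.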
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Prevents process_lost false positives for 2-3 minute K8s jobs by
resetting the reaper clock when the keepalive loop detects the job
has completed (or been deleted), rather than waiting for the next
periodic refresh.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
K8s Job pods were starting without the Paperclip skill loaded, so agents
could not find their heartbeat procedure and reported "no issue content in
my workspace" on every wake. Root cause: claude_local materialises skills
into a PVC-backed prompt-bundle directory and passes --add-dir to Claude,
but claude_k8s did neither.
Changes:
- Add src/server/prompt-cache.ts with prepareClaudePromptBundle (ported
  from adapter-claude-local). Writes skill symlinks and the agent's
  instructions file into a content-addressed bundle directory under the
  shared PVC (/paperclip/instances/.../claude-prompt-cache/<hash>/).
- execute.ts: read desired skills and instructions file before building
  the Job manifest, then call prepareClaudePromptBundle and pass the
  resulting bundle to buildJobManifest.
- job-manifest.ts: accept optional promptBundle in JobBuildInput; when
  present, pass --add-dir <bundle.addDir> and use bundle.instructionsFilePath
  for --append-system-prompt-file. Also fix: skip --append-system-prompt-file
  on session resumes to avoid wasting tokens on re-injection.
- skills.ts: correct the detail string to reflect actual materialisation.
- job-manifest.test.ts: add 5 new tests covering --add-dir injection,
  bundle path preference, session-resume skipping, and fallback behaviour.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Uses vi.mock on k8s-client and vi.useFakeTimers to prove that when
logApi.log() never resolves (the FAR-10 hang shape) and stopSignal
fires, streamPodLogsOnce still returns within the bail window
(LOG_STREAM_BAIL_TIMEOUT_MS). Exports streamPodLogsOnce so the test
can call it directly. Also covers the no-stopSignal happy path.
269/269 passing (+2 new).
Co-Authored-By: Paperclip <noreply@paperclip.ing>