Compare commits

...

42 Commits

Author SHA1 Message Date
Chris Farhood 5179544fd6 docs: mark repo as abandoned in favor of paperclip-plugin-k8s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 08:46:55 -04:00
Chris Farhood 160d6b49e9 0.2.5 2026-04-30 09:06:19 -04:00
Chris Farhood 9007762390 chore(deps): bump @paperclipai/adapter-utils from canary.7 pin to ^2026.428.0
The peerDep floor and devDep were pinned to a pre-release canary from
April 15, 13 days behind the current stable. Move both to the latest
stable 2026.428.0. All 328 tests pass against the new types; the
imported surface (asString, parseObject, runChildProcess,
AdapterExecutionContext, etc.) is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:06:12 -04:00
Chris Farhood 506007984c 0.2.4 2026-04-30 08:46:57 -04:00
Chris Farhood 7a6d1a44f2 fix(ui-parser): restore esbuild CJS bundle step lost in PR #11 merge
Commit 0e43811 added an esbuild step to bundle src/ui-parser.ts as CJS
because the UI's sandboxed worker can't evaluate ESM `export` syntax.
PR #11 (filesystem-log-tail) was based on a commit predating that fix,
so the merge clobbered both the build:ui-parser script and the esbuild
devDependency. Every release since has shipped a tsc-emitted ESM
ui-parser.js that the worker silently fails to load — parseStdoutLine
never registers and the run transcript falls back to dumping raw
stream-json lines as plain text instead of rendering structured
assistant/thinking/tool_call/tool_result entries.

Restore the script and dep verbatim from 0e43811.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:46:46 -04:00
Chris Farhood 3960d746f4 ci: serialize publish jobs sharing the same SHA to fix race
When a tagged release lands on master, both the master-push and tag-push
events trigger the publish job. The skip-on-exists check (`npm view`)
runs concurrently on both, both see the version as not-yet-published,
and both proceed to `npm publish`. The first wins; the second gets
E403 ("cannot publish over previously published versions") and reds
out the run.

Fixes the race by adding a publish-${{ github.sha }} concurrency group
so the second run queues until the first finishes — by then npm view
sees the published version and the skip path takes over cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:28:01 -04:00
Chris Farhood cc942ca818 0.2.3 2026-04-30 08:03:08 -04:00
Chris Farhood 83a2d25062 fix(execute): assign captured stdout to outer binding so parse sees it
The filesystem-tail rewrite (8bd5042) declared `const stdout` inside the
try block, shadowing the outer `let stdout = ""`. parseClaudeStreamJson
then ran on the empty outer binding, so every run failed with "Failed to
parse Claude JSON output" and resultJson={stdout:""} despite live
log-streaming working fine. Drop the `const` so the assignment lands on
the outer let.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:03:05 -04:00
Chris Farhood c8429cfde1 fix: write logs to /paperclip/instances/default/data/run-logs/ to match server PVC layout
v0.2.1 introduced filesystem-tail log delivery with buildPodLogPath()
returning /paperclip/instances/default/run-logs/... but the paperclip
server creates and tails from /paperclip/instances/default/data/run-logs/
on the shared PVC. The missing /data/ segment meant:

  1. The init container's mkdir -p /paperclip/instances/... ran in a
     directory busybox UID 1000 can't write to — it's the init
     container's ephemeral rootfs, since the PVC is only mounted in
     the main container. Init exited 1, the && short-circuited, and
     the prompt copy never happened. Job failed with "Init container
     'write-prompt' failed with exit code 1".
  2. Even if the mkdir had worked, the main container's tee would
     have written to a path the server doesn't tail.

Fix: drop the misplaced mkdir from both init container variants and
correct buildPodLogPath() to include /data/. The directory already
exists on the PVC because the paperclip server creates it; both
containers run as UID 1000 with fsGroup 1000, so the main container's
tee writes to the pre-existing path with no setup needed.

Bump to 0.2.2.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 22:15:15 -04:00
Chris Farhood 1502039d70 Merge pull request #11 from farhoodlabs/feat/filesystem-log-tail
feat: replace k8s log API streaming with filesystem tailing
2026-04-27 22:26:02 -04:00
Chris Farhood c326d2571e fix(ci): run on tags, publish on both master push and tags 2026-04-27 22:25:48 -04:00
Chris Farhood e6df8fad98 chore: bump to 0.2.0 2026-04-27 22:20:00 -04:00
Chris Farhood 8bd5042b5d feat: replace k8s log API streaming with filesystem tailing
Replaces K8s log API streaming (which was dropping every ~3 seconds at
production scale) with filesystem tailing via tee to a pod log file on
the shared PVC.

Core changes:
- Add tee to claudeInvocation to write pod log file
- Add mkdir -p to init container to create log directory
- Add assertSafePathComponent and buildPodLogPath helper
- Add tailPodLogFile function with adaptive 250ms/1s polling
- Replace k8s log streaming with tailPodLogFile in Promise.allSettled
- Delete log-dedup.ts (RTK output truncation no longer needed)
- Update config-schema.ts and index.ts to remove RTK references
- Clean up log file in cleanupJob when retainJobs=false

Note: 14 tests in execute.test.ts test the obsolete k8s log streaming
approach and need to be rewritten or deleted (streamPodLogsOnce tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 22:13:39 -04:00
Chris Farhood 568f571d8c fix(models): inline static model list in index.ts to break circular dep with server/models
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 09:16:35 -04:00
Chris Farhood 8a9376b40e chore: bump to 0.1.56
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 08:05:29 -04:00
Chris Farhood 0c8aa4d1ea fix(models): move import to top of index.ts before export declarations
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 08:04:46 -04:00
Chris Farhood 1d894f104f fix(models): expose static models list so UI renders entries before listModels resolves
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 07:42:44 -04:00
Chris Farhood fc3866924a 0.1.54 2026-04-27 00:38:06 +00:00
Chris Farhood 368254d75d fix: per-chunk activity tracking + pod-phase gate on grace timer (FAR-107)
The 0.1.53 fix tracked stream liveness by updating lastActiveAt only
after streamPodLogsOnce returned.  That worked for the
disconnect-then-reconnect-then-disconnect case, but missed the
disconnect-then-long-running-reconnect case: a streaming attempt that
runs for minutes without disconnecting never refreshes lastActiveAt,
so the grace timer fires 30s after the prior disconnect even though
the new attempt is currently producing output.  Nancy reproduced
exactly this on 0.1.53 — claude_truncated with pod phase=Running.

Two changes:

1. streamPodLogsOnce now accepts the activity ref and updates
   lastActiveAt inside its writable's write handler — every chunk
   delivered from the container refreshes the timer in real time,
   not just on stream return.

2. Before the grace timer settles, gate on pod phase: if the pod is
   still Running or Pending, the container is alive (Claude's
   long tool-use silences exceed 30s for slow upstream APIs).
   Refresh lastActiveAt, leave the poller armed, and let
   waitForJobCompletion remain the authoritative termination
   signal.  Only proceed with the grace settlement when the pod
   has actually reached a terminal phase or is gone.

The original FAR-23 fast-path (container exits, Job condition lags)
still works: when the container terminates, pod phase moves to
Succeeded/Failed and the gate falls through to the existing
Job-presence check.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 00:38:06 +00:00
Chris Farhood 34756f8215 0.1.53 2026-04-27 00:28:45 +00:00
Chris Farhood 07ef106c66 fix: gate grace timer on stream-output silence, not first disconnect (FAR-107)
The 30s grace timer that bounds K8s Job condition propagation lag was
armed by streamPodLogs's onFirstStreamExit callback the moment
streamPodLogsOnce returned for the first time.  A transient K8s log-API
disconnect mid-run also returns from streamPodLogsOnce — so the grace
timer fired 30s later regardless of whether streamPodLogs had already
reconnected and the container was still producing output.

Nancy / Privileged Escalation reproduced this on long Opus-4-6 runs:
the prod paperclip pod was stable, the cancel-poll guard was already
narrowed in 0.1.51, but every long run truncated with claude_truncated
+ "container terminated state not yet observable (pod phase=Running)"
because the run was being abandoned mid-output.

Replace the boolean onFirstStreamExit signal with a streamActivity ref
carrying lastActiveAt + streamHasExited.  streamPodLogs refreshes
lastActiveAt every time a streamPodLogsOnce attempt returns non-empty
output, so reconnects that resume real output keep the grace clock
reset.  The grace timer fires only once the stream has exited at least
once AND no chunk has arrived for the full grace window — which
preserves the original FAR-23 behaviour (container truly exited but
Job condition lags) while ending the false-truncation of healthy
streams.  Adds a regression test that asserts a stream drop + reconnect
+ deferred Job completion does not surface as truncated.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 00:28:44 +00:00
Chris Farhood fd7dce7239 0.1.52 2026-04-27 00:00:57 +00:00
Chris Farhood b1878c684e fix: retry-aware pod state lookup + honest truncation cause messages (FAR-107)
The single-shot getPodTerminatedState query lost a real race against
kubelet's containerStatus update: when Claude exited cleanly but quickly,
listNamespacedPod often returned the pod with phase=Succeeded/Failed but
without a populated state.terminated, so describeTruncationCause fell into
the catch-all "pod state unavailable — likely deleted before exit could
be read" branch.  That message is doubly wrong: the pod was not deleted
and the exit cause was readable a few hundred ms later.  Operators
chasing claude_truncated runs (Nancy/Privileged Escalation) had no
visibility into the actual exit code, OOMKilled flag, or reason.

Two changes:

1. Introduce lookupPodState + getPodLookupWithRetry — the lookup result
   carries the pod phase and a podMissing flag, and retries up to 4×500ms
   when the pod is in a terminal phase but containerStatuses lag.  When
   the pod is in a non-terminal phase or genuinely gone we bail
   immediately without burning the retry budget.

2. describeTruncationCause now distinguishes three states:
   - "pod is gone" (eviction, preemption, external delete)
   - "container terminated state not yet observable (pod phase=…)"
   - the existing populated-state path with exit code / reason / signal

The truncation error path re-queries with the retry-aware lookup right
before producing the message, so subsequent claude_truncated errors
surface the actual exit cause (137=OOMKilled, 143=SIGTERM, kubelet
reason text) instead of a misleading deletion claim.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 00:00:56 +00:00
Chris Farhood 83e105393c 0.1.51 2026-04-26 21:24:15 +00:00
Chris Farhood 49288fa5c7 fix: scope cancel-polling to explicit cancellation states only (FAR-107)
shouldAbortForCancellation previously treated any non-`running` runStatus
as a cancellation signal — which made the keepalive's cancel-poll delete
the K8s Job whenever the heartbeat-runs API briefly returned a transient
or stale status (e.g. queued, pending, succeeded, failed, completed,
unknown) for an in-flight run.  The follow-up `waitForJobCompletion`
poll then observed the 404 and surfaced a spurious
`k8s_job_deleted_externally` error to the user, even though no human
or external system deleted the Job.

Privileged Escalation's "null-pointer-nancy" agent reproduced this on
runs that were never cancelled and were not adjacent to a paperclip
restart, ruling out the SIGTERM path that 0.1.50 already addressed.

Tighten the guard to fire only on `cancelled` / `cancelling`.  Other
terminal statuses are unreachable while the adapter is still executing
(the adapter's own return is what flips them) and even if observed
mid-run, they do not justify deleting a Job that may still be doing
real work — the natural completion path will tear it down.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:24:11 +00:00
Chris Farhood dae9e18659 0.1.50 2026-04-26 21:19:03 +00:00
Chris Farhood 6923597b31 fix: do not delete active Jobs on SIGTERM — leave for orphan reattach (FAR-107)
Root cause of Nancy's k8s_job_deleted_externally false positive: the
paperclip server itself receives SIGTERM during rolling deploys,
evictions, scale-down, etc.  The previous SIGTERM handler iterated
activeJobs and deleted every Job before exiting, which surfaced in the
in-flight heartbeat as "K8s Job was deleted externally" — even though
nothing external touched it.

With reattachOrphanedJobs=true (default), this is exactly the wrong
behaviour: leaving the Jobs alive lets the next paperclip process
discover them via the orphan-classification path and reattach their
log streams.  With reattachOrphanedJobs=false the operator opted into
manual cleanup, so we still must not auto-delete.

The Job's ownerReference (FAR-15) keeps the prompt Secret tied to the
Job, so both survive together and TTL handles cleanup on natural
completion.  Test rewritten to assert the new contract: SIGTERM must
not touch K8s Jobs.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:19:02 +00:00
Chris Farhood d184a1732b 0.1.49 2026-04-26 21:06:19 +00:00
Chris Farhood be84428226 fix: enrich k8s_job_deleted_externally error with forensics + verify Job presence on grace fire (FAR-107)
The error previously fired with no diagnostic context, making it impossible
to distinguish (a) self-delete by our SIGTERM/cancel path, (b) TTL after a
missed Complete condition, or (c) actual external deletion without cluster
shell access.  Two changes:

1. Grace-period verification: when the log stream exits and the 30s grace
   timer fires, do a one-shot readNamespacedJob before declaring the Job
   gone.  If it's still there, settle as gracePeriodFired (not jobGone) so
   we don't mis-classify K8s condition propagation lag as deletion.

2. Forensic capture: track which of the three detection paths
   (completion-poll-404, grace-period-verify-404, recheck-poll-404)
   first observed the 404, the last successful Job conditions read, the
   poll count, elapsed time since pod-running, and stdout size.  Append
   all of it to the errorMessage so the next occurrence is self-diagnosing.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:05:04 +00:00
Chris Farhood d9928030d6 0.1.48 2026-04-26 14:48:22 +00:00
Chris Farhood 76fc6fcdfc fix: surface pod terminated reason/message in adapter_failed errors (FAR-100)
The init-only and partial-run error paths now embed the K8s container
terminated state (reason, message, signal, OOM hint) directly in the
errorMessage. This eliminates the kubectl round-trip when diagnosing
adapter_failed runs — the surfaced error self-explains.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 14:48:12 +00:00
Chris Farhood 3169f49f23 0.1.47 2026-04-26 13:04:54 +00:00
Chris Farhood e0b35d230f fix: distinguish init-only non-zero exits in buildPartialRunError (FAR-100)
Init-only runs that exit with a non-zero code now surface a more actionable
message naming the exit code and the likely cause (unsupported model or
rejected session) instead of the generic "did not produce a result" text.
Helps operators diagnose model-id / billing-tier failures (e.g. opus 4.6).

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 13:04:43 +00:00
Chris Farhood 4e2c36319d 0.1.46 2026-04-26 01:57:43 +00:00
Chris Farhood 8474f78fe1 fix: include pod terminated reason/message in claude_truncated error (FAR-95)
Capture the claude container's terminated state (exit code, reason, message,
signal) and surface it in the truncation error so operators see *why* the run
was cut short — e.g. "exit code 137, SIGKILL (commonly OOMKilled),
reason=OOMKilled, message=Memory cgroup out of memory" instead of just a
"truncated" label with no diagnostic context.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 01:57:43 +00:00
Chris Farhood 88896eddcf 0.1.45 2026-04-26 01:54:48 +00:00
Chris Farhood a2874c0426 fix: detect mid-stream truncation and emit claude_truncated error code (FAR-95)
When Claude produces assistant content (output_tokens > 0) but the stream ends
without a result event, classify the run as truncated mid-stream rather than
falling through to the generic "did not produce a result — check API
credentials" message. The misleading hint pointed operators at auth/model
config when the real cause was pod termination, OOMKill, or CLI crash.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 01:54:35 +00:00
Chris Farhood 818aa0f1d6 feat: log bundled skill names and add skills to onMeta commandNotes (FAR-36)
Adds a diagnostic log line after skill resolution so operators can see exactly
which skills were bundled into each run, making it straightforward to diagnose
skill availability issues. Also surfaces the skill list in the onMeta
commandNotes for run metadata visibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 20:41:01 +00:00
Chris Farhood 55fd3021fb fix: add per-agent mutex to eliminate TOCTOU race in K8s concurrency guard (FAR-29)
Two concurrent execute() calls for the same agent can both pass the
list-then-create guard before either job appears in the other's query.
The new module-level agentCreationMutex serializes the guard+create phase
within the process so only one call enters listNamespacedJob at a time.

The mutex is acquired after sanitizing the agent ID and released in a
finally block that wraps the entire guard+create section, so all early
return paths (guard blocks, create failures) cleanly release it. Variables
used in both the guard+create and log-streaming phases are hoisted to
before the try block. Cross-agent calls use separate mutex slots and are
unaffected.

Added two vitest cases verifying same-agent serialization and that
different-agent calls are not serialized.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 20:10:01 +00:00
Chris Farhood 83b58f9207 fix: detect stop_reason:null + output_tokens:0 and emit llm_api_error (FAR-30)
parseClaudeStreamJson now tracks assistant events with stop_reason:null and
output_tokens:0 (the MiniMax degraded-response pattern). When no result event
follows, execute() returns errorCode:"llm_api_error" with a descriptive message
instead of the generic adapter_failed.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 20:00:42 +00:00
Chris Farhood 602afa9b84 fix: return k8s_job_deleted_externally error code when job deleted mid-run (FAR-31)
When a K8s Job is deleted externally (kubectl delete job or TTL before
terminal condition observed) and stdout has no result event, the adapter
now returns errorCode "k8s_job_deleted_externally" with the message
"K8s Job was deleted externally before Claude could complete" instead of
the misleading "Claude exited with code -1".

Tracks a jobDeletedExternally flag in execute() on the jobGone path and
checks it in the !parsed branch before falling through to buildPartialRunError.
Only applies when exitCode is null (pod gone alongside the job).

Adds regression test: FAR-31 scenario where job 404s mid-run with partial
stdout and missing pod produces the new error code.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 19:58:46 +00:00
Chris Farhood 986f2fc7fa test: add coverage for deletionTimestamp concurrency guard bypass (FAR-34)
Verifies that a terminating K8s job (deletionTimestamp set, no
Complete/Failed condition) is skipped by the concurrency guard so
subsequent heartbeat runs are not incorrectly blocked.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 19:57:10 +00:00
15 changed files with 1393 additions and 1694 deletions
+5 -1
View File
@@ -3,6 +3,7 @@ name: CI
on:
push:
branches: [master]
tags: ['v*']
pull_request:
branches: [master]
@@ -28,7 +29,10 @@ jobs:
publish:
needs: test
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/master' && github.event_name == 'push'
if: (github.ref == 'refs/heads/master' && github.event_name == 'push') || startsWith(github.ref, 'refs/tags/')
concurrency:
group: publish-${{ github.sha }}
cancel-in-progress: false
permissions:
id-token: write
steps:
+2
View File
@@ -1,3 +1,5 @@
> **Abandoned** — This adapter is no longer maintained. The Kubernetes execution capability has moved to the sandbox plugin at [`farhoodlabs/paperclip-plugin-k8s`](https://github.com/farhoodlabs/paperclip-plugin-k8s) (`@farhoodlabs/paperclip-plugin-k8s` on npm).
# Claude (Kubernetes) Paperclip Adapter Plugin
Paperclip adapter plugin that runs Claude Code agents as isolated Kubernetes Jobs instead of inside the main Paperclip process.
+474 -7
View File
@@ -1,26 +1,27 @@
{
"name": "paperclip-adapter-claude-k8s",
"version": "0.1.42",
"version": "0.2.5",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "paperclip-adapter-claude-k8s",
"version": "0.1.42",
"version": "0.2.5",
"license": "MIT",
"dependencies": {
"@kubernetes/client-node": "^1.0.0",
"picocolors": "^1.1.1"
},
"devDependencies": {
"@paperclipai/adapter-utils": "2026.415.0-canary.7",
"@paperclipai/adapter-utils": "^2026.428.0",
"@types/node": "^24.6.0",
"@vitest/coverage-v8": "^4.1.4",
"esbuild": "^0.24.0",
"typescript": "^5.7.3",
"vitest": "^4.1.4"
},
"peerDependencies": {
"@paperclipai/adapter-utils": ">=2026.415.0-canary.7"
"@paperclipai/adapter-utils": ">=2026.428.0"
}
},
"node_modules/@babel/helper-string-parser": {
@@ -117,6 +118,431 @@
"tslib": "^2.4.0"
}
},
"node_modules/@esbuild/aix-ppc64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.24.2.tgz",
"integrity": "sha512-thpVCb/rhxE/BnMLQ7GReQLLN8q9qbHmI55F4489/ByVg2aQaQ6kbcLb6FHkocZzQhxc4gx0sCk0tJkKBFzDhA==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"aix"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/android-arm": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.24.2.tgz",
"integrity": "sha512-tmwl4hJkCfNHwFB3nBa8z1Uy3ypZpxqxfTQOcHX+xRByyYgunVbZ9MzUUfb0RxaHIMnbHagwAxuTL+tnNM+1/Q==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/android-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.24.2.tgz",
"integrity": "sha512-cNLgeqCqV8WxfcTIOeL4OAtSmL8JjcN6m09XIgro1Wi7cF4t/THaWEa7eL5CMoMBdjoHOTh/vwTO/o2TRXIyzg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/android-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.24.2.tgz",
"integrity": "sha512-B6Q0YQDqMx9D7rvIcsXfmJfvUYLoP722bgfBlO5cGvNVb5V/+Y7nhBE3mHV9OpxBf4eAS2S68KZztiPaWq4XYw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/darwin-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.24.2.tgz",
"integrity": "sha512-kj3AnYWc+CekmZnS5IPu9D+HWtUI49hbnyqk0FLEJDbzCIQt7hg7ucF1SQAilhtYpIujfaHr6O0UHlzzSPdOeA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/darwin-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.24.2.tgz",
"integrity": "sha512-WeSrmwwHaPkNR5H3yYfowhZcbriGqooyu3zI/3GGpF8AyUdsrrP0X6KumITGA9WOyiJavnGZUwPGvxvwfWPHIA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/freebsd-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.24.2.tgz",
"integrity": "sha512-UN8HXjtJ0k/Mj6a9+5u6+2eZ2ERD7Edt1Q9IZiB5UZAIdPnVKDoG7mdTVGhHJIeEml60JteamR3qhsr1r8gXvg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/freebsd-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.24.2.tgz",
"integrity": "sha512-TvW7wE/89PYW+IevEJXZ5sF6gJRDY/14hyIGFXdIucxCsbRmLUcjseQu1SyTko+2idmCw94TgyaEZi9HUSOe3Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-arm": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.24.2.tgz",
"integrity": "sha512-n0WRM/gWIdU29J57hJyUdIsk0WarGd6To0s+Y+LwvlC55wt+GT/OgkwoXCXvIue1i1sSNWblHEig00GBWiJgfA==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.24.2.tgz",
"integrity": "sha512-7HnAD6074BW43YvvUmE/35Id9/NB7BeX5EoNkK9obndmZBUk8xmJJeU7DwmUeN7tkysslb2eSl6CTrYz6oEMQg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-ia32": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.24.2.tgz",
"integrity": "sha512-sfv0tGPQhcZOgTKO3oBE9xpHuUqguHvSo4jl+wjnKwFpapx+vUDcawbwPNuBIAYdRAvIDBfZVvXprIj3HA+Ugw==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-loong64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.24.2.tgz",
"integrity": "sha512-CN9AZr8kEndGooS35ntToZLTQLHEjtVB5n7dl8ZcTZMonJ7CCfStrYhrzF97eAecqVbVJ7APOEe18RPI4KLhwQ==",
"cpu": [
"loong64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-mips64el": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.24.2.tgz",
"integrity": "sha512-iMkk7qr/wl3exJATwkISxI7kTcmHKE+BlymIAbHO8xanq/TjHaaVThFF6ipWzPHryoFsesNQJPE/3wFJw4+huw==",
"cpu": [
"mips64el"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-ppc64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.24.2.tgz",
"integrity": "sha512-shsVrgCZ57Vr2L8mm39kO5PPIb+843FStGt7sGGoqiiWYconSxwTiuswC1VJZLCjNiMLAMh34jg4VSEQb+iEbw==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-riscv64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.24.2.tgz",
"integrity": "sha512-4eSFWnU9Hhd68fW16GD0TINewo1L6dRrB+oLNNbYyMUAeOD2yCK5KXGK1GH4qD/kT+bTEXjsyTCiJGHPZ3eM9Q==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-s390x": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.24.2.tgz",
"integrity": "sha512-S0Bh0A53b0YHL2XEXC20bHLuGMOhFDO6GN4b3YjRLK//Ep3ql3erpNcPlEFed93hsQAjAQDNsvcK+hV90FubSw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/linux-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.24.2.tgz",
"integrity": "sha512-8Qi4nQcCTbLnK9WoMjdC9NiTG6/E38RNICU6sUNqK0QFxCYgoARqVqxdFmWkdonVsvGqWhmm7MO0jyTqLqwj0Q==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.24.2.tgz",
"integrity": "sha512-wuLK/VztRRpMt9zyHSazyCVdCXlpHkKm34WUyinD2lzK07FAHTq0KQvZZlXikNWkDGoT6x3TD51jKQ7gMVpopw==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"netbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/netbsd-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.24.2.tgz",
"integrity": "sha512-VefFaQUc4FMmJuAxmIHgUmfNiLXY438XrL4GDNV1Y1H/RW3qow68xTwjZKfj/+Plp9NANmzbH5R40Meudu8mmw==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"netbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/openbsd-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.24.2.tgz",
"integrity": "sha512-YQbi46SBct6iKnszhSvdluqDmxCJA+Pu280Av9WICNwQmMxV7nLRHZfjQzwbPs3jeWnuAhE9Jy0NrnJ12Oz+0A==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/openbsd-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.24.2.tgz",
"integrity": "sha512-+iDS6zpNM6EnJyWv0bMGLWSWeXGN/HTaF/LXHXHwejGsVi+ooqDfMCCTerNFxEkM3wYVcExkeGXNqshc9iMaOA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/sunos-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.24.2.tgz",
"integrity": "sha512-hTdsW27jcktEvpwNHJU4ZwWFGkz2zRJUz8pvddmXPtXDzVKTTINmlmga3ZzwcuMpUvLw7JkLy9QLKyGpD2Yxig==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"sunos"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/win32-arm64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.24.2.tgz",
"integrity": "sha512-LihEQ2BBKVFLOC9ZItT9iFprsE9tqjDjnbulhHoFxYQtQfai7qfluVODIYxt1PgdoyQkz23+01rzwNwYfutxUQ==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/win32-ia32": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.24.2.tgz",
"integrity": "sha512-q+iGUwfs8tncmFC9pcnD5IvRHAzmbwQ3GPS5/ceCyHdjXubwQWI12MKWSNSMYLJMq23/IUCvJMS76PDqXe1fxA==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@esbuild/win32-x64": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.24.2.tgz",
"integrity": "sha512-7VTgWzgMGvup6aSqDPLiW5zHaxYJGTO4OokMjIlrCtf+VpEL+cXKtCvg723iguPYI5oaUNdS+/V7OU2gvXVWEg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">=18"
}
},
"node_modules/@jridgewell/resolve-uri": {
"version": "3.1.2",
"resolved": "https://registry.npmjs.org/@jridgewell/resolve-uri/-/resolve-uri-3.1.2.tgz",
@@ -223,9 +649,9 @@
}
},
"node_modules/@paperclipai/adapter-utils": {
"version": "2026.415.0-canary.7",
"resolved": "https://registry.npmjs.org/@paperclipai/adapter-utils/-/adapter-utils-2026.415.0-canary.7.tgz",
"integrity": "sha512-VNzIZmu1lrK6QM8Ad9WkOihZItfkj21NHKQf+artDcbwFT2hHbDAD9hdW2W9NMVxYdFvvnws3w76FI/BUbCMbQ==",
"version": "2026.428.0",
"resolved": "https://registry.npmjs.org/@paperclipai/adapter-utils/-/adapter-utils-2026.428.0.tgz",
"integrity": "sha512-kGHpE7rhePPCbnG3OwXbNuHZZuI+XyuFgNSiDnrEeiSbkI2c5XHM2WnWDCZ/NGHULfJW3lWhSxGMFoYqiy38vQ==",
"dev": true,
"license": "MIT"
},
@@ -1033,6 +1459,47 @@
"node": ">= 0.4"
}
},
"node_modules/esbuild": {
"version": "0.24.2",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.24.2.tgz",
"integrity": "sha512-+9egpBW8I3CD5XPe0n6BfT5fxLzxrlDzqydF3aviG+9ni1lDC/OvMHcxqEFV0+LANZG5R1bFMWfUrjVsdwxJvA==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"bin": {
"esbuild": "bin/esbuild"
},
"engines": {
"node": ">=18"
},
"optionalDependencies": {
"@esbuild/aix-ppc64": "0.24.2",
"@esbuild/android-arm": "0.24.2",
"@esbuild/android-arm64": "0.24.2",
"@esbuild/android-x64": "0.24.2",
"@esbuild/darwin-arm64": "0.24.2",
"@esbuild/darwin-x64": "0.24.2",
"@esbuild/freebsd-arm64": "0.24.2",
"@esbuild/freebsd-x64": "0.24.2",
"@esbuild/linux-arm": "0.24.2",
"@esbuild/linux-arm64": "0.24.2",
"@esbuild/linux-ia32": "0.24.2",
"@esbuild/linux-loong64": "0.24.2",
"@esbuild/linux-mips64el": "0.24.2",
"@esbuild/linux-ppc64": "0.24.2",
"@esbuild/linux-riscv64": "0.24.2",
"@esbuild/linux-s390x": "0.24.2",
"@esbuild/linux-x64": "0.24.2",
"@esbuild/netbsd-arm64": "0.24.2",
"@esbuild/netbsd-x64": "0.24.2",
"@esbuild/openbsd-arm64": "0.24.2",
"@esbuild/openbsd-x64": "0.24.2",
"@esbuild/sunos-x64": "0.24.2",
"@esbuild/win32-arm64": "0.24.2",
"@esbuild/win32-ia32": "0.24.2",
"@esbuild/win32-x64": "0.24.2"
}
},
"node_modules/estree-walker": {
"version": "3.0.3",
"resolved": "https://registry.npmjs.org/estree-walker/-/estree-walker-3.0.3.tgz",
+6 -4
View File
@@ -1,6 +1,6 @@
{
"name": "paperclip-adapter-claude-k8s",
"version": "0.1.43",
"version": "0.2.5",
"description": "Paperclip adapter plugin that runs Claude Code agents as Kubernetes Jobs",
"license": "MIT",
"repository": {
@@ -25,7 +25,8 @@
"dist"
],
"scripts": {
"build": "tsc",
"build": "tsc && npm run build:ui-parser",
"build:ui-parser": "esbuild src/ui-parser.ts --bundle --format=cjs --target=es2020 --outfile=dist/ui-parser.js --log-level=warning",
"clean": "rm -rf dist",
"typecheck": "tsc --noEmit",
"test": "vitest run",
@@ -37,12 +38,13 @@
"picocolors": "^1.1.1"
},
"peerDependencies": {
"@paperclipai/adapter-utils": ">=2026.415.0-canary.7"
"@paperclipai/adapter-utils": ">=2026.428.0"
},
"devDependencies": {
"@paperclipai/adapter-utils": "2026.415.0-canary.7",
"@paperclipai/adapter-utils": "^2026.428.0",
"@types/node": "^24.6.0",
"@vitest/coverage-v8": "^4.1.4",
"esbuild": "^0.24.0",
"typescript": "^5.7.3",
"vitest": "^4.1.4"
}
+29 -5
View File
@@ -1,7 +1,35 @@
import type { AdapterModel } from "@paperclipai/adapter-utils";
export const type = "claude_k8s";
export const label = "Claude (Kubernetes)";
export const models: undefined = undefined;
function isBedrockEnv(): boolean {
return (
process.env.CLAUDE_CODE_USE_BEDROCK === "1" ||
process.env.CLAUDE_CODE_USE_BEDROCK === "true" ||
(typeof process.env.ANTHROPIC_BEDROCK_BASE_URL === "string" &&
process.env.ANTHROPIC_BEDROCK_BASE_URL.trim().length > 0)
);
}
const DIRECT_MODELS: AdapterModel[] = [
{ id: "claude-opus-4-7", label: "Claude Opus 4.7" },
{ id: "claude-opus-4-6", label: "Claude Opus 4.6" },
{ id: "claude-sonnet-4-6", label: "Claude Sonnet 4.6" },
{ id: "claude-haiku-4-6", label: "Claude Haiku 4.6" },
{ id: "claude-sonnet-4-5-20250929", label: "Claude Sonnet 4.5" },
{ id: "claude-haiku-4-5-20251001", label: "Claude Haiku 4.5" },
];
const BEDROCK_MODELS: AdapterModel[] = [
{ id: "us.anthropic.claude-opus-4-7", label: "Bedrock Opus 4.7" },
{ id: "us.anthropic.claude-opus-4-6-v1", label: "Bedrock Opus 4.6" },
{ id: "us.anthropic.claude-sonnet-4-6", label: "Bedrock Sonnet 4.6" },
{ id: "us.anthropic.claude-sonnet-4-5-20250929-v1:0", label: "Bedrock Sonnet 4.5" },
{ id: "us.anthropic.claude-haiku-4-5-20251001-v1:0", label: "Bedrock Haiku 4.5" },
];
export const models = isBedrockEnv() ? BEDROCK_MODELS : DIRECT_MODELS;
export const agentConfigurationDoc = `# claude_k8s agent configuration
@@ -32,10 +60,6 @@ Kubernetes fields:
- retainJobs (boolean, optional): skip cleanup on completion for debugging
- reattachOrphanedJobs (boolean, optional): when true (default), attach to a running orphaned Job that matches the current agent/task/session instead of blocking; when false, any non-terminal orphan blocks the new run
Output filtering fields:
- enableRtk (boolean, optional): truncate oversized tool outputs before they reach the model via a PostToolUse hook; default false
- rtkMaxOutputBytes (number, optional): byte threshold for tool output truncation when enableRtk is true; default 50000
Operational fields:
- timeoutSec (number, optional): run timeout in seconds; 0 means no timeout
- graceSec (number, optional): additional grace before adapter gives up after Job deadline
+1 -16
View File
@@ -133,22 +133,7 @@ export function getConfigSchema(): AdapterConfigSchema {
label: "Labels",
hint: "Extra labels added to Job metadata. One key=value per line.",
},
// Output filtering (RTK-compatible)
{
type: "toggle",
key: "enableRtk",
label: "Enable Output Filtering",
hint: "Truncate oversized tool outputs before they reach the model, reducing token consumption. Implemented natively in Node.js — no external binary required. Installs a PostToolUse hook in ~/.claude/settings.json for each run.",
default: false,
},
{
type: "number",
key: "rtkMaxOutputBytes",
label: "Max Tool Output Bytes",
hint: "Maximum bytes of tool output to pass to the model when output filtering is enabled. Outputs exceeding this threshold are truncated with a summary. Default: 50000.",
default: 50000,
},
];
return { fields };
}
}
+174 -782
View File
File diff suppressed because it is too large Load Diff
+472 -377
View File
File diff suppressed because it is too large Load Diff
+52 -95
View File
@@ -1,6 +1,6 @@
import { describe, it, expect, beforeEach } from "vitest";
import type { AdapterExecutionContext } from "@paperclipai/adapter-utils";
import { buildJobManifest, buildRtkSetupCommands, sanitizeLabelValue } from "./job-manifest.js";
import { buildJobManifest, buildPodLogPath, sanitizeLabelValue } from "./job-manifest.js";
import type { SelfPodInfo } from "./k8s-client.js";
function makeCtx(overrides: Partial<AdapterExecutionContext> = {}): AdapterExecutionContext {
@@ -221,11 +221,9 @@ describe("buildJobManifest", () => {
it("omits paperclip.io/run-id when sanitized value is null (all-invalid runId)", () => {
// inject an all-special-chars runId via context override — buildJobManifest
// uses ctx.runId directly
// uses ctx.runId directly. Use characters that are path-valid but label-invalid.
const badCtx = makeCtx({ runId: "@@@" });
const { job, skippedLabels } = buildJobManifest({ ctx: badCtx, selfPod });
expect(job.metadata?.labels?.["paperclip.io/run-id"]).toBeUndefined();
expect(skippedLabels).toContain("paperclip.io/run-id");
expect(() => buildJobManifest({ ctx: badCtx, selfPod })).toThrow("Invalid runId");
});
it("selector matches sanitized agent-id label", () => {
@@ -301,7 +299,9 @@ describe("buildJobManifest", () => {
it("write-prompt writes PROMPT_CONTENT to /tmp/prompt/prompt.txt", () => {
const { job } = buildJobManifest({ ctx, selfPod });
const init = job.spec?.template?.spec?.initContainers?.[0];
expect(init?.command).toEqual(["sh", "-c", "printf '%s' \"$PROMPT_CONTENT\" > /tmp/prompt/prompt.txt"]);
expect(init?.command?.[0]).toBe("sh");
expect(init?.command?.[1]).toBe("-c");
expect(init?.command?.[2]).toBe("printf '%s' \"$PROMPT_CONTENT\" > /tmp/prompt/prompt.txt");
});
it("write-prompt mounts prompt volume", () => {
@@ -794,112 +794,69 @@ describe("buildJobManifest", () => {
});
});
describe("rtk output filtering", () => {
describe("pod log file tailing", () => {
it("does not modify main command when enableRtk is false (default)", () => {
const { job } = buildJobManifest({ ctx, selfPod });
const cmd = job.spec?.template?.spec?.containers[0]?.command;
// Command should be the plain `cat ... | claude ...` form with no rtk setup
expect(cmd?.[2]).toMatch(/^cat \/tmp\/prompt\/prompt\.txt \| claude /);
// Command should be the plain `cat ... | claude ... | tee ...` form with no rtk setup
expect(cmd?.[2]).toMatch(/^cat \/tmp\/prompt\/prompt\.txt \| claude .* \| tee /);
expect(cmd?.[2]).not.toContain("rtk-filter");
});
it("prepends RTK setup commands when enableRtk is true", () => {
ctx.config = { enableRtk: true };
const { job } = buildJobManifest({ ctx, selfPod });
const cmd = job.spec?.template?.spec?.containers[0]?.command;
expect(cmd?.[2]).toContain(".rtk-filter.js");
expect(cmd?.[2]).toContain("cat /tmp/prompt/prompt.txt | claude");
});
it("RTK setup runs before claude invocation", () => {
ctx.config = { enableRtk: true };
it("command includes tee to pod log path", () => {
const { job } = buildJobManifest({ ctx, selfPod });
const cmd = job.spec?.template?.spec?.containers[0]?.command?.[2] ?? "";
const rtkIdx = cmd.indexOf(".rtk-filter.js");
const claudeIdx = cmd.indexOf("cat /tmp/prompt/prompt.txt | claude");
expect(rtkIdx).toBeGreaterThanOrEqual(0);
expect(claudeIdx).toBeGreaterThan(rtkIdx);
expect(cmd).toContain("| tee");
expect(cmd).toContain("/paperclip/instances/default/data/run-logs/");
});
it("RTK setup uses node (no external binaries)", () => {
ctx.config = { enableRtk: true };
it("podLogPath is returned from buildJobManifest", () => {
const result = buildJobManifest({ ctx, selfPod });
expect(result.podLogPath).toBe(
"/paperclip/instances/default/data/run-logs/co1/agent-abc/run-abc12345.pod.ndjson",
);
});
it("buildPodLogPath returns correctly formatted path", () => {
expect(buildPodLogPath("co1", "agent-abc", "run-abc12345")).toBe(
"/paperclip/instances/default/data/run-logs/co1/agent-abc/run-abc12345.pod.ndjson",
);
});
it("init container does not create log directory (server pre-creates it on shared PVC)", () => {
const { job } = buildJobManifest({ ctx, selfPod });
const cmd = job.spec?.template?.spec?.containers[0]?.command?.[2] ?? "";
// Should only use `node` — no curl, wget, apt, pip, etc.
expect(cmd).not.toMatch(/\b(curl|wget|apt|yum|pip|gem|cargo|go\s+get)\b/);
expect(cmd).toContain("node ");
const initCmd = job.spec?.template?.spec?.initContainers?.[0]?.command;
expect(initCmd?.[2]).not.toContain("mkdir -p /paperclip");
});
it("uses default 50000 byte threshold when rtkMaxOutputBytes not set", () => {
ctx.config = { enableRtk: true };
const setup = buildRtkSetupCommands(50000);
// The filter script base64 should decode to contain the MAX constant
const b64Match = setup.match(/Buffer\.from\('([A-Za-z0-9+/=]+)','base64'\)/);
expect(b64Match).not.toBeNull();
const decoded = Buffer.from(b64Match![1], "base64").toString("utf-8");
expect(decoded).toContain("50000");
it("sanitizes companyId with / to valid path component for log path", () => {
const badCtx = {
...ctx,
agent: { ...ctx.agent, companyId: "co/1" },
};
const { podLogPath } = buildJobManifest({ ctx: badCtx as typeof ctx, selfPod });
// / is stripped by sanitizeForK8sPath
expect(podLogPath).toContain("co1/");
});
it("respects custom rtkMaxOutputBytes", () => {
ctx.config = { enableRtk: true, rtkMaxOutputBytes: 100000 };
const { job } = buildJobManifest({ ctx, selfPod });
const cmd = job.spec?.template?.spec?.containers[0]?.command?.[2] ?? "";
// The custom threshold should appear in the base64-encoded filter script
const b64Matches = [...cmd.matchAll(/Buffer\.from\('([A-Za-z0-9+/=]+)','base64'\)/g)];
const decoded = b64Matches.map((m) => Buffer.from(m[1], "base64").toString("utf-8")).join("\n");
expect(decoded).toContain("100000");
it("sanitizes agentId with @ to valid path component for log path", () => {
const badCtx = {
...ctx,
agent: { ...ctx.agent, id: "agent@123" },
};
const { podLogPath } = buildJobManifest({ ctx: badCtx as typeof ctx, selfPod });
// @ is stripped by sanitizeForK8sPath
expect(podLogPath).toContain("/agent123/");
});
it("RTK setup installs a PostToolUse hook in claude settings", () => {
const setup = buildRtkSetupCommands(50000);
// The settings script (second base64 block) should reference PostToolUse
const b64Matches = [...setup.matchAll(/Buffer\.from\('([A-Za-z0-9+/=]+)','base64'\)/g)];
expect(b64Matches.length).toBeGreaterThanOrEqual(2);
const settingsScript = Buffer.from(b64Matches[1]![1], "base64").toString("utf-8");
expect(settingsScript).toContain("PostToolUse");
expect(settingsScript).toContain("settings.json");
});
it("filter script handles string content truncation", () => {
// Decode the filter script and verify it truncates string content
const setup = buildRtkSetupCommands(1000);
const b64Matches = [...setup.matchAll(/Buffer\.from\('([A-Za-z0-9+/=]+)','base64'\)/g)];
const filterScript = Buffer.from(b64Matches[0]![1], "base64").toString("utf-8");
expect(filterScript).toContain("MAX=1000");
expect(filterScript).toContain("truncated by paperclip-rtk");
expect(filterScript).toContain("tool_response");
expect(filterScript).toContain("tool_result");
});
it("filter script truncates without corrupting multi-byte UTF-8", () => {
// "中" is U+4E2D, 3 bytes in UTF-8: E4 B8 AD
// With MAX=5, two "中" (6 bytes) should truncate to one (3 bytes), not
// produce a replacement character from slicing mid-codepoint.
const setup = buildRtkSetupCommands(5);
const b64Matches = [...setup.matchAll(/Buffer\.from\('([A-Za-z0-9+/=]+)','base64'\)/g)];
const filterScript = Buffer.from(b64Matches[0]![1], "base64").toString("utf-8");
// Extract the trunc function from the filter script and evaluate it
const fnMatch = filterScript.match(/(function trunc\(s\)\{.*\})(?=const tr=)/);
expect(fnMatch).toBeTruthy();
// eslint-disable-next-line no-eval
const trunc = eval(`(()=>{const MAX=5;${fnMatch![1]};return trunc;})()`);
const result = trunc("中中");
expect(result).not.toContain("");
expect(result).toContain("中");
expect(result).toContain("truncated by paperclip-rtk");
// Should report bytes from the actual truncation point, not MAX
expect(result).toContain("3 bytes truncated");
});
it("filter script handles array content (block format)", () => {
const setup = buildRtkSetupCommands(50000);
const b64Matches = [...setup.matchAll(/Buffer\.from\('([A-Za-z0-9+/=]+)','base64'\)/g)];
const filterScript = Buffer.from(b64Matches[0]![1], "base64").toString("utf-8");
// Should handle array content blocks (text field on each block)
expect(filterScript).toContain("Array.isArray");
expect(filterScript).toContain("b.text");
it("sanitizes runId with underscore to valid path component for log path", () => {
const badCtx = {
...ctx,
runId: "run_123",
};
const { podLogPath } = buildJobManifest({ ctx: badCtx as typeof ctx, selfPod });
// _ is stripped by sanitizeForK8sPath
expect(podLogPath).toContain("/run123.pod.ndjson");
});
});
});
+23 -88
View File
@@ -12,85 +12,18 @@ import {
import { createHash } from "node:crypto";
import type { ClaudePromptBundle } from "./prompt-cache.js";
/**
* Build the shell command prefix that installs a native Node.js PostToolUse
* hook into Claude Code's settings. The hook truncates oversized tool outputs
* before they reach the model — replacing the RTK binary init-container
* approach with a self-contained Node.js implementation.
*
* Both scripts are base64-encoded so they can be embedded in a sh -c command
* string without any quoting or escaping issues.
*
* @param maxOutputBytes Byte threshold above which tool output is truncated.
* @returns A shell command string (suitable for "&&"-chaining
* before the claude invocation).
*/
export function buildRtkSetupCommands(maxOutputBytes: number): string {
// --- Filter script ----------------------------------------------------------
// This script runs as the PostToolUse hook inside every K8s Job pod.
// Claude Code writes the hook event as JSON to the script's stdin; the script
// truncates the tool_response/tool_result content when it exceeds the
// threshold and writes the (possibly modified) JSON to stdout.
//
// Field-name coverage:
// • tool_response — documented hook event format for PostToolUse
// • tool_result — alternative name seen in some Claude Code versions
// Content may be a plain string or an array of typed blocks (text/image/…).
const filterScript = [
`const c=[];`,
`process.stdin.on('data',d=>c.push(d));`,
`process.stdin.on('end',()=>{`,
`const raw=Buffer.concat(c).toString('utf-8');`,
`let o;try{o=JSON.parse(raw);}catch{process.stdout.write(raw);return;}`,
`const MAX=${maxOutputBytes};`,
`function trunc(s){`,
`if(typeof s!=='string')return s;`,
`const b=Buffer.from(s,'utf-8');`,
`if(b.length<=MAX)return s;`,
`let e=MAX;if(e>0){let p=e-1;while(p>0&&(b[p]&0xC0)===0x80)p--;const l=b[p];let n=1;if((l&0xE0)===0xC0)n=2;else if((l&0xF0)===0xE0)n=3;else if((l&0xF8)===0xF0)n=4;if(p+n>e)e=p;}`,
`return b.slice(0,e).toString('utf-8')+'\\n[...'+(b.length-e)+' bytes truncated by paperclip-rtk]';`,
`}`,
`const tr=o&&(o.tool_response||o.tool_result);`,
`if(tr){`,
`if(typeof tr.content==='string'){tr.content=trunc(tr.content);}`,
`else if(Array.isArray(tr.content)){`,
`tr.content=tr.content.map(function(b){`,
`if(b&&typeof b==='object'&&typeof b.text==='string'){`,
`return Object.assign({},b,{text:trunc(b.text)});`,
`}return b;`,
`});`,
`}`,
`}`,
`process.stdout.write(JSON.stringify(o));`,
`});`,
].join("");
function assertSafePathComponent(field: string, value: string): void {
if (!/^[a-zA-Z0-9-]+$/.test(value)) {
throw new Error(`Invalid ${field} for log path: ${value}`);
}
}
// --- Settings script --------------------------------------------------------
// Reads the existing ~/.claude/settings.json (if any), merges in the RTK
// PostToolUse hook, and writes the file back. All other settings sections
// are preserved; only PostToolUse is replaced so we own the full hook list
// for this run.
const settingsScript = [
`const fs=require('fs'),pt=require('path');`,
`const p=pt.join(process.env.HOME,'.claude','settings.json');`,
`let s={};try{s=JSON.parse(fs.readFileSync(p,'utf-8'));}catch(e){}`,
`s.hooks=s.hooks||{};`,
`s.hooks.PostToolUse=[{matcher:'.*',hooks:[{type:'command',command:'node /tmp/.rtk-filter.js'}]}];`,
`fs.mkdirSync(pt.dirname(p),{recursive:true});`,
`fs.writeFileSync(p,JSON.stringify(s));`,
].join("");
function sanitizeForK8sPath(value: string): string {
return value.replace(/[^a-zA-Z0-9-]/g, "");
}
// Encode as base64 so the strings can be embedded directly in a shell command
// without any quoting concerns (base64 alphabet: A-Za-z0-9+/=).
const filterB64 = Buffer.from(filterScript, "utf-8").toString("base64");
const settingsB64 = Buffer.from(settingsScript, "utf-8").toString("base64");
return [
// Write the filter script
`node -e "require('fs').writeFileSync('/tmp/.rtk-filter.js',Buffer.from('${filterB64}','base64').toString('utf-8'))"`,
// Install the Claude Code PostToolUse hook (merge into existing settings)
`node -e "eval(Buffer.from('${settingsB64}','base64').toString('utf-8'))"`,
].join(" && ");
export function buildPodLogPath(companyId: string, agentId: string, runId: string): string {
return `/paperclip/instances/default/data/run-logs/${companyId}/${agentId}/${runId}.pod.ndjson`;
}
/** Prompts above this size (bytes) are staged via a Secret instead of an
@@ -202,6 +135,8 @@ export interface JobBuildResult {
promptSecret: PromptSecret | null;
/** User-supplied extra labels that were dropped because they used a reserved prefix. */
skippedLabels: string[];
/** Path to the pod log file on the shared PVC. */
podLogPath: string;
}
function sanitizeForK8sName(value: string, maxLen = 16): string {
@@ -353,8 +288,6 @@ export function buildJobManifest(input: JobBuildInput): JobBuildResult {
const nodeSelector = parseKeyValueConfig(config.nodeSelector);
const tolerations = Array.isArray(config.tolerations) ? config.tolerations : [];
const extraLabels = parseKeyValueConfig(config.labels);
const enableRtk = asBoolean(config.enableRtk, false);
const rtkMaxOutputBytes = asNumber(config.rtkMaxOutputBytes, 50000);
// Resolve working directory — use workspace cwd, fall back to /paperclip
const workspaceContext = parseObject(context.paperclipWorkspace);
@@ -535,13 +468,15 @@ export function buildJobManifest(input: JobBuildInput): JobBuildResult {
// Build the claude command string for the main container
const claudeArgsEscaped = claudeArgs.map((a) => `'${a.replace(/'/g, "'\\''")}'`).join(" ");
const claudeInvocation = `cat /tmp/prompt/prompt.txt | claude ${claudeArgsEscaped}`;
// When RTK output filtering is enabled, prepend the Node.js hook setup.
// This writes a filter script and a Claude Code settings file that installs
// it as a PostToolUse hook — no external binary or init container required.
const mainCommand = enableRtk
? `${buildRtkSetupCommands(rtkMaxOutputBytes)} && ${claudeInvocation}`
: claudeInvocation;
const logPathCompanyId = sanitizeForK8sPath(agent.companyId);
const logPathAgentId = sanitizeForK8sPath(agent.id);
const logPathRunId = sanitizeForK8sPath(runId);
assertSafePathComponent("companyId", logPathCompanyId);
assertSafePathComponent("agentId", logPathAgentId);
assertSafePathComponent("runId", logPathRunId);
const podLogPath = buildPodLogPath(logPathCompanyId, logPathAgentId, logPathRunId);
const claudeInvocation = `cat /tmp/prompt/prompt.txt | claude ${claudeArgsEscaped} | tee ${podLogPath}`;
const mainCommand = claudeInvocation;
// Decide prompt delivery strategy: env var (small) or Secret volume (large).
const promptBytes = Buffer.byteLength(prompt, "utf-8");
@@ -584,7 +519,7 @@ export function buildJobManifest(input: JobBuildInput): JobBuildResult {
name: "write-prompt",
image: "busybox:1.36",
imagePullPolicy: "IfNotPresent",
command: ["sh", "-c", "printf '%s' \"$PROMPT_CONTENT\" > /tmp/prompt/prompt.txt"],
command: ["sh", "-c", `printf '%s' "$PROMPT_CONTENT" > /tmp/prompt/prompt.txt`],
env: [{ name: "PROMPT_CONTENT", value: prompt }],
volumeMounts: [{ name: "prompt", mountPath: "/tmp/prompt" }],
securityContext,
@@ -641,5 +576,5 @@ export function buildJobManifest(input: JobBuildInput): JobBuildResult {
},
};
return { job, jobName, namespace, prompt, claudeArgs, promptMetrics, promptSecret, skippedLabels };
return { job, jobName, namespace, prompt, claudeArgs, promptMetrics, promptSecret, skippedLabels, podLogPath };
}
-173
View File
@@ -1,173 +0,0 @@
import { describe, it, expect } from "vitest";
import { LogLineDedupFilter, eventDedupKey } from "./log-dedup.js";
function assistantEvent(id: string, text: string): string {
return JSON.stringify({
type: "assistant",
session_id: "sess_1",
message: {
id,
content: [{ type: "text", text }],
},
});
}
function userToolResultEvent(toolUseId: string, content: string): string {
return JSON.stringify({
type: "user",
session_id: "sess_1",
message: {
content: [{ type: "tool_result", tool_use_id: toolUseId, content }],
},
});
}
function systemInitEvent(sessionId: string): string {
return JSON.stringify({
type: "system",
subtype: "init",
session_id: sessionId,
model: "claude-opus-4-7",
});
}
function resultEvent(sessionId: string): string {
return JSON.stringify({
type: "result",
subtype: "success",
session_id: sessionId,
result: "done",
total_cost_usd: 0.01,
usage: { input_tokens: 1, output_tokens: 1, cache_read_input_tokens: 0 },
});
}
describe("eventDedupKey", () => {
it("keys assistant events by message.id", () => {
const key = eventDedupKey(JSON.parse(assistantEvent("msg_abc", "hi")));
expect(key).toBe("assistant:msg_abc");
});
it("keys user tool_result events by tool_use_id", () => {
const key = eventDedupKey(JSON.parse(userToolResultEvent("toolu_1", "ok")));
expect(key).toBe("user:tool_result:toolu_1");
});
it("keys system init events by session_id", () => {
const key = eventDedupKey(JSON.parse(systemInitEvent("sess_xyz")));
expect(key).toBe("system:init:sess_xyz");
});
it("keys result events by session_id", () => {
const key = eventDedupKey(JSON.parse(resultEvent("sess_xyz")));
expect(key).toBe("result:sess_xyz");
});
it("returns null for assistant events missing message.id", () => {
const event = { type: "assistant", message: { content: [] } };
expect(eventDedupKey(event)).toBeNull();
});
it("returns null for unknown event types", () => {
expect(eventDedupKey({ type: "unknown" })).toBeNull();
expect(eventDedupKey({})).toBeNull();
});
});
describe("LogLineDedupFilter", () => {
it("passes unique lines through unchanged", () => {
const filter = new LogLineDedupFilter();
const a = assistantEvent("msg_1", "hello");
const b = assistantEvent("msg_2", "world");
expect(filter.filter(`${a}\n${b}\n`)).toBe(`${a}\n${b}\n`);
});
it("drops assistant events replayed with the same message.id", () => {
const filter = new LogLineDedupFilter();
const a = assistantEvent("msg_1", "Three nits to fix.");
filter.filter(`${a}\n`);
expect(filter.filter(`${a}\n`)).toBe("");
});
it("drops user tool_result events replayed with the same tool_use_id", () => {
const filter = new LogLineDedupFilter();
const a = userToolResultEvent("toolu_abc", "file contents");
filter.filter(`${a}\n`);
expect(filter.filter(`${a}\n`)).toBe("");
});
it("drops system init and result events on replay", () => {
const filter = new LogLineDedupFilter();
const init = systemInitEvent("sess_1");
const result = resultEvent("sess_1");
filter.filter(`${init}\n${result}\n`);
expect(filter.filter(`${init}\n${result}\n`)).toBe("");
});
it("buffers incomplete trailing lines across chunks", () => {
const filter = new LogLineDedupFilter();
const line = assistantEvent("msg_1", "hello");
const mid = Math.floor(line.length / 2);
const out1 = filter.filter(line.slice(0, mid));
const out2 = filter.filter(line.slice(mid) + "\n");
expect(out1).toBe("");
expect(out2).toBe(`${line}\n`);
});
it("flush() emits a final incomplete line that was not replayed", () => {
const filter = new LogLineDedupFilter();
const line = assistantEvent("msg_tail", "no newline");
filter.filter(line);
expect(filter.flush()).toBe(line);
});
it("flush() drops an incomplete line that was already seen with a newline", () => {
const filter = new LogLineDedupFilter();
const line = assistantEvent("msg_same", "x");
filter.filter(`${line}\n`);
filter.filter(line);
expect(filter.flush()).toBe("");
});
it("passes non-JSON lines through every time (does not dedup paperclip status)", () => {
const filter = new LogLineDedupFilter();
const status = "[paperclip] keepalive — job foo running\n";
expect(filter.filter(status)).toBe(status);
expect(filter.filter(status)).toBe(status);
});
it("dedups structurally identical JSON with identical content (raw fallback)", () => {
const filter = new LogLineDedupFilter();
// No recognized type → raw fallback key.
const line = JSON.stringify({ foo: "bar", baz: 1 });
filter.filter(`${line}\n`);
expect(filter.filter(`${line}\n`)).toBe("");
});
it("handles multiple complete lines in a single chunk with partial trailing", () => {
const filter = new LogLineDedupFilter();
const a = assistantEvent("msg_a", "a");
const b = assistantEvent("msg_b", "b");
const c = assistantEvent("msg_c", "c");
// a and b are complete, c is partial (no trailing newline).
const out = filter.filter(`${a}\n${b}\n${c}`);
expect(out).toBe(`${a}\n${b}\n`);
// Completing c later should emit exactly c.
expect(filter.filter("\n")).toBe(`${c}\n`);
});
it("drops the classic FAR-123 replay scenario across reconnects", () => {
const filter = new LogLineDedupFilter();
const assistantNits = assistantEvent("msg_nits", "Three nits to fix. Let me look at an existing test file...");
const assistantWrite = assistantEvent("msg_write", "Now I need to write unit tests");
// First stream attempt emits both events.
const out1 = filter.filter(`${assistantNits}\n${assistantWrite}\n`);
expect(out1).toBe(`${assistantNits}\n${assistantWrite}\n`);
// Reconnect replays both within the sinceSeconds overlap — filter should drop them.
const out2 = filter.filter(`${assistantNits}\n${assistantWrite}\n`);
expect(out2).toBe("");
// And a genuinely new event after the replay should still pass through.
const assistantFresh = assistantEvent("msg_fresh", "next turn");
expect(filter.filter(`${assistantFresh}\n`)).toBe(`${assistantFresh}\n`);
});
});
-146
View File
@@ -1,146 +0,0 @@
/**
* Line-level dedup filter for the K8s log stream.
*
* The K8s log follow stream can reconnect with an overlapping `sinceSeconds`
* window (integer-second granularity + a safety buffer), which replays a few
* seconds of recent output on every reconnect. Without dedup those replayed
* lines appear as duplicate events in the streaming UI — the same assistant
* text block shows up between every subsequent tool call (FAR-123).
*
* The filter operates at the chunk → line level: chunks are split on `\n`,
* incomplete trailing content is buffered until the next chunk, and each
* complete line is emitted at most once. JSON-shaped Claude stream-json
* events are keyed by their stable structural IDs; non-JSON lines pass
* through unchanged so genuinely-repeated status lines are not swallowed.
*/
type Parsed = Record<string, unknown>;
function asString(value: unknown): string {
return typeof value === "string" ? value : "";
}
function asRecord(value: unknown): Parsed | null {
if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
return value as Parsed;
}
/**
* Build a stable dedup key for a Claude stream-json event. Returns `null`
* when the event is not a recognized Claude event — those lines fall back to
* raw-content hashing so non-JSON output (paperclip status lines, shell
* output) is never deduped by identity.
*/
export function eventDedupKey(event: Parsed): string | null {
const type = asString(event.type);
if (type === "system") {
const subtype = asString(event.subtype);
const sessionId = asString(event.session_id);
if (subtype === "init" && sessionId) return `system:init:${sessionId}`;
return null;
}
if (type === "assistant") {
const message = asRecord(event.message);
const id = message ? asString(message.id) : "";
if (id) return `assistant:${id}`;
return null;
}
if (type === "user") {
const message = asRecord(event.message);
const content = message && Array.isArray(message.content) ? message.content : [];
const toolUseIds: string[] = [];
for (const entry of content) {
const block = asRecord(entry);
if (!block) continue;
const toolUseId = asString(block.tool_use_id);
if (toolUseId) toolUseIds.push(toolUseId);
}
if (toolUseIds.length > 0) return `user:tool_result:${toolUseIds.join(",")}`;
return null;
}
if (type === "result") {
const sessionId = asString(event.session_id);
return sessionId ? `result:${sessionId}` : "result:unknown";
}
return null;
}
/**
* Stateful line-level dedup filter. Emits `filter(chunk)` output through
* the caller — preserves original chunk formatting (including trailing
* newlines) for lines that pass the dedup check.
*/
export class LogLineDedupFilter {
private buffer = "";
private readonly seenKeys = new Set<string>();
/**
* Process a chunk and return the subset that should be forwarded.
* Incomplete trailing content (no terminating newline) is buffered and
* emitted on the next chunk that completes the line (or on flush()).
*/
filter(chunk: string): string {
if (!chunk) return "";
const combined = this.buffer + chunk;
const endsWithNewline = combined.endsWith("\n");
const parts = combined.split("\n");
if (endsWithNewline) {
// Discard the final empty element — last line was complete.
parts.pop();
this.buffer = "";
} else {
// Last element is an incomplete line — hold it for the next chunk.
this.buffer = parts.pop() ?? "";
}
const out: string[] = [];
for (const line of parts) {
if (this.shouldEmit(line)) out.push(line);
}
if (out.length === 0) return "";
return out.join("\n") + "\n";
}
/**
* Flush any incomplete trailing content. Called when the stream ends
* without a terminating newline so the final partial line isn't lost.
*/
flush(): string {
const pending = this.buffer;
this.buffer = "";
if (!pending) return "";
return this.shouldEmit(pending) ? pending : "";
}
private shouldEmit(line: string): boolean {
const trimmed = line.trim();
if (!trimmed) return true;
// Only attempt dedup on JSON-shaped lines; pass shell/text output through.
if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) return true;
let parsed: unknown;
try {
parsed = JSON.parse(trimmed);
} catch {
return true;
}
const event = asRecord(parsed);
if (!event) return true;
// Recognized Claude stream-json event → structural key.
const structuralKey = eventDedupKey(event);
const key = structuralKey ?? `raw:${trimmed}`;
if (this.seenKeys.has(key)) return false;
this.seenKeys.add(key);
return true;
}
}
+1
View File
@@ -50,3 +50,4 @@ describe("listK8sModels", () => {
expect(models.some((m) => m.id === "claude-opus-4-7")).toBe(true);
});
});
+125
View File
@@ -154,6 +154,131 @@ more raw output`;
// Should not be "Hello world\n\nHello world"
expect(result.summary.split("Hello world").length).toBe(2);
});
it("sets llmApiEmptyResponse=true when stop_reason:null and usage.output_tokens:0", () => {
const initLine = JSON.stringify({ type: "system", subtype: "init", model: "MiniMax-M2.7", session_id: "sess_1" });
const assistantEvent = JSON.stringify({
type: "assistant",
session_id: "sess_1",
message: {
id: "msg_abc",
stop_reason: null,
usage: { input_tokens: 100, output_tokens: 0, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 },
content: [],
},
});
const result = parseClaudeStreamJson([initLine, assistantEvent].join("\n"));
expect(result.llmApiEmptyResponse).toBe(true);
expect(result.resultJson).toBeNull();
});
it("sets llmApiEmptyResponse=true when stop_reason:null and message-level output_tokens:0", () => {
const assistantEvent = JSON.stringify({
type: "assistant",
message: { stop_reason: null, output_tokens: 0, content: [] },
});
const result = parseClaudeStreamJson(assistantEvent);
expect(result.llmApiEmptyResponse).toBe(true);
});
it("does not set llmApiEmptyResponse when stop_reason is non-null", () => {
const assistantEvent = JSON.stringify({
type: "assistant",
message: {
stop_reason: "end_turn",
usage: { output_tokens: 0 },
content: [],
},
});
const result = parseClaudeStreamJson(assistantEvent);
expect(result.llmApiEmptyResponse).toBe(false);
});
it("does not set llmApiEmptyResponse when output_tokens > 0", () => {
const assistantEvent = JSON.stringify({
type: "assistant",
message: {
stop_reason: null,
usage: { output_tokens: 5 },
content: [{ type: "text", text: "hello" }],
},
});
const result = parseClaudeStreamJson(assistantEvent);
expect(result.llmApiEmptyResponse).toBe(false);
});
it("clears llmApiEmptyResponse when a result event follows the empty assistant event", () => {
const assistantEvent = JSON.stringify({
type: "assistant",
message: { stop_reason: null, usage: { output_tokens: 0 }, content: [] },
});
const resultEvent = JSON.stringify({
type: "result",
result: "Done",
subtype: "stop",
total_cost_usd: 0.001,
usage: { input_tokens: 10, output_tokens: 5, cache_read_input_tokens: 0 },
});
const result = parseClaudeStreamJson([assistantEvent, resultEvent].join("\n"));
expect(result.llmApiEmptyResponse).toBe(false);
expect(result.resultJson).not.toBeNull();
});
it("sets truncatedMidStream=true when assistant event with output_tokens>0 has no result (FAR-95)", () => {
const initLine = JSON.stringify({ type: "system", subtype: "init", model: "claude-opus-4-7", session_id: "sess_1" });
const assistantEvent = JSON.stringify({
type: "assistant",
session_id: "sess_1",
message: {
id: "msg_abc",
stop_reason: null,
usage: { input_tokens: 1, output_tokens: 35, cache_creation_input_tokens: 523, cache_read_input_tokens: 46295 },
content: [{ type: "tool_use", id: "tool_1", name: "Bash", input: { command: "echo hi" } }],
},
});
const result = parseClaudeStreamJson([initLine, assistantEvent].join("\n"));
expect(result.truncatedMidStream).toBe(true);
expect(result.llmApiEmptyResponse).toBe(false);
expect(result.resultJson).toBeNull();
});
it("clears truncatedMidStream when a result event follows assistant content", () => {
const assistantEvent = JSON.stringify({
type: "assistant",
message: { stop_reason: null, usage: { output_tokens: 35 }, content: [] },
});
const resultEvent = JSON.stringify({
type: "result",
result: "Done",
subtype: "stop",
total_cost_usd: 0.001,
usage: { input_tokens: 10, output_tokens: 5, cache_read_input_tokens: 0 },
});
const result = parseClaudeStreamJson([assistantEvent, resultEvent].join("\n"));
expect(result.truncatedMidStream).toBe(false);
expect(result.resultJson).not.toBeNull();
});
it("does not set truncatedMidStream when assistant has output_tokens=0", () => {
const assistantEvent = JSON.stringify({
type: "assistant",
message: { stop_reason: null, usage: { output_tokens: 0 }, content: [] },
});
const result = parseClaudeStreamJson(assistantEvent);
expect(result.truncatedMidStream).toBe(false);
});
it("sets llmApiEmptyResponse=false for normal result", () => {
const resultEvent = JSON.stringify({
type: "result",
result: "Done",
subtype: "stop",
total_cost_usd: 0.005,
usage: { input_tokens: 100, output_tokens: 200, cache_read_input_tokens: 50 },
});
const result = parseClaudeStreamJson(resultEvent);
expect(result.llmApiEmptyResponse).toBe(false);
});
});
describe("extractClaudeLoginUrl", () => {
+29
View File
@@ -15,6 +15,14 @@ export function parseClaudeStreamJson(stdout: string) {
// at the line level; this guard only needs to protect against the same
// message block being parsed twice.
const seenBlocks = new Set<string>();
// Set when we see stop_reason:null + output_tokens:0 on an assistant event
// with no subsequent result event — indicates the upstream LLM API returned
// an empty/malformed response (e.g. MiniMax degraded performance).
let llmApiEmptyResponse = false;
// Set when an assistant event with output_tokens > 0 was seen but no result
// event arrived — indicates the run was truncated mid-stream (pod terminated,
// OOMKill, or claude CLI crash after producing content).
let assistantContentSeen = false;
for (const rawLine of stdout.split(/\r?\n/)) {
const line = rawLine.trim();
@@ -34,6 +42,21 @@ export function parseClaudeStreamJson(stdout: string) {
const message = parseObject(event.message);
const messageId = asString(message.id, "");
const content = Array.isArray(message.content) ? message.content : [];
// Detect empty LLM API response: stop_reason:null with zero output tokens.
// output_tokens may appear directly on message or nested under message.usage.
const stopReason = message.stop_reason;
const usageObj = parseObject(message.usage as Record<string, unknown>);
const outputTokens = typeof message.output_tokens === "number"
? message.output_tokens
: asNumber(usageObj.output_tokens, -1);
if (stopReason === null && outputTokens === 0) {
llmApiEmptyResponse = true;
}
if (outputTokens > 0) {
assistantContentSeen = true;
}
for (let i = 0; i < content.length; i++) {
const entry = content[i];
if (typeof entry !== "object" || entry === null || Array.isArray(entry)) continue;
@@ -55,6 +78,8 @@ export function parseClaudeStreamJson(stdout: string) {
if (type === "result") {
finalResult = event;
llmApiEmptyResponse = false; // result event means Claude completed normally
assistantContentSeen = false; // result event means stream was not truncated
sessionId = asString(event.session_id, sessionId ?? "") || sessionId;
}
}
@@ -67,6 +92,8 @@ export function parseClaudeStreamJson(stdout: string) {
usage: null as UsageSummary | null,
summary: assistantTexts.join("\n\n").trim(),
resultJson: null as Record<string, unknown> | null,
llmApiEmptyResponse,
truncatedMidStream: assistantContentSeen,
};
}
@@ -87,6 +114,8 @@ export function parseClaudeStreamJson(stdout: string) {
usage,
summary,
resultJson: finalResult,
llmApiEmptyResponse: false,
truncatedMidStream: false,
};
}