Paperclip Adapter Logging Methodology Change #10
You are implementing a change to a Paperclip adapter plugin. The repo is at
/Users/Repositories/paperclip-adapter-claude-k8s on branch master. Work on a
new branch off master — do NOT commit directly to master.
Before you start, read these files fully:
- src/server/execute.ts (large; this is the main file you'll edit)
- src/server/job-manifest.ts
- src/server/log-dedup.ts (you will delete this)
- src/server/parse.ts
- src/server/config-schema.ts
- src/index.ts
Run `npm install` and then `npm test`. Confirm it's green (should be 379 tests
passing). Do NOT run `npm run build`; CI handles that.
=============================================================================
WHY WE ARE DOING THIS
Today the adapter reads pod logs via the Kubernetes log API (follow mode). At
production scale this stream drops every few seconds, the adapter hits the
50-reconnect cap after ~2.5 minutes, and long-running agents fail. The fix is
to have the pod's claude command `tee` its stdout to a file on the shared
PVC, and have the adapter tail that file directly from the Paperclip server
process. The PVC is mounted at /paperclip in both pods so the file is
visible on both sides.
We are NOT going to:
- wrap the claude binary
- use Claude hooks
- add a sidecar
- change revitalize (the consumer app)
- keep the k8s log API as a fallback
We ARE going to:
- replace k8s log streaming with filesystem tailing entirely
- delete all reconnect logic and the log-dedup filter
- delete the RTK tool-output truncation feature entirely
- keep `kubectl logs -f` working (tee preserves stdout)
=============================================================================
SCOPE OF CHANGES
--- Job manifest (src/server/job-manifest.ts) ---
DELETE the `buildRtkSetupCommands` function entirely.
DELETE the `enableRtk` and `rtkMaxOutputBytes` config reads inside
`buildJobManifest`.
DELETE the mainCommand conditional that prepends RTK setup:
    const mainCommand = enableRtk
      ? `${buildRtkSetupCommands(rtkMaxOutputBytes)} && ${claudeInvocation}`
      : claudeInvocation;
Replace with just `claudeInvocation`.
MODIFY `claudeInvocation` to add tee:
Before (approximately):
    const claudeInvocation = `cat /tmp/prompt/prompt.txt | claude ${claudeArgsEscaped}`;
After:
    const podLogPath = `/paperclip/instances/default/run-logs/${companyId}/${agentId}/${runId}.pod.ndjson`;
    const claudeInvocation = `cat /tmp/prompt/prompt.txt | claude ${claudeArgsEscaped} | tee ${podLogPath}`;
`companyId`, `agentId`, and `runId` come from `ctx` (search surrounding
code; they're already in scope in buildJobManifest via destructuring).
MODIFY the init container command to create the parent directory before
the main container starts. The init container today writes the prompt
file. Amend its command to also `mkdir -p` the log directory:
    const initCommand = `mkdir -p /paperclip/instances/default/run-logs/${companyId}/${agentId} && printf '%s' "$PROMPT" > /tmp/prompt/prompt.txt`;
(Your existing init command may differ slightly; keep its behavior,
just prepend the mkdir.)
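One way to keep the `mkdir -p` target and the `tee` target from drifting apart is to derive the directory from the log path itself. The following is a minimal sketch, not the repo's code: `logDirOf` and `buildInitCommand` are hypothetical names, and the `printf` string stands in for whatever the existing init command actually does.

```typescript
// Hypothetical helper: parent directory of a POSIX-style absolute path.
function logDirOf(podLogPath: string): string {
  const idx = podLogPath.lastIndexOf("/");
  return idx <= 0 ? "/" : podLogPath.slice(0, idx);
}

// Hypothetical helper: prepend the mkdir to an existing init command,
// leaving that command's own behavior untouched.
function buildInitCommand(podLogPath: string, existingInitCommand: string): string {
  return `mkdir -p ${logDirOf(podLogPath)} && ${existingInitCommand}`;
}

// Example values only; real IDs come from ctx.
const examplePath =
  "/paperclip/instances/default/run-logs/acme/agent-1/run-42.pod.ndjson";
const exampleInit = buildInitCommand(
  examplePath,
  `printf '%s' "$PROMPT" > /tmp/prompt/prompt.txt`,
);
console.log(exampleInit);
```

Because the directory is interpolated into a shell command unquoted, this only stays safe if the ID components have already been validated.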
EXPORT the log path builder as a helper so execute.ts can compute the
same path without duplicating the template:
    export function buildPodLogPath(companyId: string, agentId: string, runId: string): string {
      return `/paperclip/instances/default/run-logs/${companyId}/${agentId}/${runId}.pod.ndjson`;
    }
Return this path from `buildJobManifest` alongside the other fields in
`JobBuildResult` (add `podLogPath: string` to the interface).
ID SANITIZATION (critical): before using companyId/agentId/runId in the
path, validate they match `^[a-zA-Z0-9-]+$`. If any of them doesn't,
throw an Error with message:
    Invalid ${field} for log path: ${value}
The existing code probably does not validate; add a helper at the top of
job-manifest.ts:
    function assertSafePathComponent(field: string, value: string): void {
      if (!/^[a-zA-Z0-9-]+$/.test(value)) {
        throw new Error(`Invalid ${field} for log path: ${value}`);
      }
    }
Call it for all three before buildPodLogPath.
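To illustrate why the validation matters, the sketch below combines the two helpers described above and exercises them with a path-traversal value and a value containing shell metacharacters. `safePodLogPath` and `tryBuild` are hypothetical names introduced only for this example, and the IDs are made up.

```typescript
function assertSafePathComponent(field: string, value: string): void {
  if (!/^[a-zA-Z0-9-]+$/.test(value)) {
    throw new Error(`Invalid ${field} for log path: ${value}`);
  }
}

function buildPodLogPath(companyId: string, agentId: string, runId: string): string {
  return `/paperclip/instances/default/run-logs/${companyId}/${agentId}/${runId}.pod.ndjson`;
}

// Hypothetical wrapper: validate all three IDs, then build the path.
function safePodLogPath(companyId: string, agentId: string, runId: string): string {
  assertSafePathComponent("companyId", companyId);
  assertSafePathComponent("agentId", agentId);
  assertSafePathComponent("runId", runId);
  return buildPodLogPath(companyId, agentId, runId);
}

// Hypothetical helper for the demo: the error message on failure, null on success.
function tryBuild(companyId: string, agentId: string, runId: string): string | null {
  try {
    safePodLogPath(companyId, agentId, runId);
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

console.log(safePodLogPath("acme", "agent-1", "run-42"));
console.log(tryBuild("acme", "../etc", "run-42"));       // traversal attempt is rejected
console.log(tryBuild("acme", "agent-1", "run 42; ls"));  // shell metacharacters are rejected
```

The pattern also rejects underscores, dots, and empty strings, so IDs in those shapes would fail job creation rather than produce a path.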
--- Adapter (src/server/execute.ts) ---
DELETE the `LogLineDedupFilter` import.
DELETE constants: `MAX_LOG_RECONNECT_ATTEMPTS`,
`LOG_STREAM_RECONNECT_DELAY_MS`, `LOG_STREAM_BAIL_TIMEOUT_MS`.
DELETE functions: `streamPodLogs`, `streamPodLogsOnce`, `readPodLogs`.
Also delete any helpers exclusively used by them.
DELETE the bail timer machinery (bailTimer, bailResolve, bailPromise,
stopPoller).
DELETE the one-shot fallback path inside the main `execute` function:
the block computing `hasResultEvent`, `needsOneShot`, and the
`readPodLogs` fallback call.
DELETE the `sinceSeconds` reconnect-window logic.
ADD a new function `tailPodLogFile` in execute.ts (or a new file
src/server/file-log-tailer.ts if you prefer, but keep it simple; inline
is fine). Signature:
Behavior:
1. Wait for the file to appear: poll `fs.promises.stat` every 250ms. If
   the file doesn't appear in 30s, throw an Error:
       Pod log file never appeared at ${filePath}
2. Open the file with `fs.promises.open(filePath, 'r')`.
3. Poll the file in a loop, starting at 250ms and slowing down once the
   file hasn't grown for 5 consecutive polls (reset to 250ms on any
   growth). For each poll:
   a. stat the file, compare size to offset
   b. if size > offset, read bytes from [offset, size) into a Buffer
   c. update offset = size
   d. concatenate any pending partial line from previous poll with
      the new buffer, split on '\n'
   e. the last element of the split is either the new pending
      partial line (if the buffer didn't end with '\n') or empty
   f. for every complete line, call `onLog("stdout", line + "\n")`
      and append to an in-memory accumulator (string)
4. Stop when `opts.stopSignal.stopped === true`. Before returning, do
   ONE final read-to-EOF to drain any tail bytes (same logic as
   above). Close the file handle. Return the accumulator as a string.
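The behavior above could be sketched roughly as follows. This is not the definitive implementation: the parameter shapes (`OnLog`, `TailOpts`) are inferred from the description, the slow-poll interval `SLOW_POLL_MS` is an assumed value because the text does not state one, and `splitLines` is a hypothetical helper for steps d and e. A `StringDecoder` is used so a multi-byte UTF-8 character split across two reads is not corrupted.

```typescript
import { promises as fs } from "node:fs";
import { StringDecoder } from "node:string_decoder";

type OnLog = (stream: "stdout", chunk: string) => void | Promise<void>; // assumed shape
interface TailOpts { stopSignal: { stopped: boolean } }                  // assumed shape

const FAST_POLL_MS = 250;
const SLOW_POLL_MS = 1000; // assumption: the idle interval is not given in the text
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Steps d-e: join the pending partial line with new text and split on '\n'.
function splitLines(pending: string, text: string): { lines: string[]; pending: string } {
  const parts = (pending + text).split("\n");
  const rest = parts.pop() ?? ""; // last element: partial line, or "" after a trailing '\n'
  return { lines: parts, pending: rest };
}

async function tailPodLogFile(filePath: string, onLog: OnLog, opts: TailOpts): Promise<string> {
  // Step 1: wait up to 30s for the file to appear.
  const deadline = Date.now() + 30_000;
  for (;;) {
    try {
      await fs.stat(filePath);
      break;
    } catch {
      // Any stat failure is treated as "not there yet" in this sketch.
      if (Date.now() >= deadline) throw new Error(`Pod log file never appeared at ${filePath}`);
      await sleep(FAST_POLL_MS);
    }
  }

  const handle = await fs.open(filePath, "r"); // step 2
  const decoder = new StringDecoder("utf8");
  let offset = 0;
  let pending = "";
  let output = "";

  // Steps a-f: read [offset, size), emit complete lines; true if the file grew.
  const drain = async (): Promise<boolean> => {
    const { size } = await handle.stat();
    if (size <= offset) return false;
    const buf = Buffer.alloc(size - offset);
    const { bytesRead } = await handle.read(buf, 0, buf.length, offset);
    offset += bytesRead; // advance by what was actually read
    const split = splitLines(pending, decoder.write(buf.subarray(0, bytesRead)));
    pending = split.pending;
    for (const line of split.lines) {
      await onLog("stdout", line + "\n");
      output += line + "\n";
    }
    return bytesRead > 0;
  };

  try {
    let idlePolls = 0;
    while (!opts.stopSignal.stopped) { // step 3
      idlePolls = (await drain()) ? 0 : idlePolls + 1;
      await sleep(idlePolls >= 5 ? SLOW_POLL_MS : FAST_POLL_MS);
    }
    await drain(); // step 4: one final read-to-EOF
    return output;
  } finally {
    await handle.close();
  }
}
```

In this sketch a final line that lacks a trailing newline stays in `pending` and is never emitted, which matches the steps as written; the text does not say whether that tail should be flushed.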
Use `fs.promises.open` / `FileHandle.read` / `FileHandle.close`. Do not
use `fs.watch` or `chokidar`.
REPLACE the existing log-streaming section of `execute` with a call to
`tailPodLogFile`. The file path comes from `buildJobManifest`'s return
value (add `podLogPath` there as noted above). The pattern is
approximately:
The existing `Promise.allSettled` pattern is in the code today; mirror
its shape. Keep `waitForJobCompletion` unchanged.
ADD log file cleanup to `cleanupJob`. After successful Job deletion,
best-effort delete the log file:
    try { await fs.promises.unlink(podLogPath); } catch { /* non-fatal */ }
Skip the unlink if `retainJobs === true`. `cleanupJob` will need
`podLogPath` passed in; thread it through from the caller.
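As a rough illustration of how the tailer and the job wait might run side by side under `Promise.allSettled`, here is a sketch with stand-in functions. It is a guess at the shape, not the code in execute.ts: `runTailAndWait`, `fakeJob`, and `fakeTail` are hypothetical, and the real `waitForJobCompletion` and `tailPodLogFile` take arguments not shown here.

```typescript
type StopSignal = { stopped: boolean };

// Hypothetical wiring: both parameters are stand-ins for the real functions.
async function runTailAndWait<T>(
  waitForJob: () => Promise<T>,
  tail: (stopSignal: StopSignal) => Promise<string>,
): Promise<[PromiseSettledResult<string>, PromiseSettledResult<T>]> {
  const stopSignal: StopSignal = { stopped: false };
  // Flip the flag whether the job wait resolves or rejects,
  // so the tailer always gets a chance to drain and exit.
  const jobPromise = waitForJob().finally(() => {
    stopSignal.stopped = true;
  });
  return Promise.allSettled([tail(stopSignal), jobPromise]);
}

// Fake job that finishes after 20ms, and a fake tailer that polls the flag.
const fakeJob = () =>
  new Promise<string>((resolve) => setTimeout(() => resolve("succeeded"), 20));
const fakeTail = async (stopSignal: StopSignal): Promise<string> => {
  while (!stopSignal.stopped) {
    await new Promise((resolve) => setTimeout(resolve, 5));
  }
  return "log output";
};

const demo = runTailAndWait(fakeJob, fakeTail);
demo.then(([logs, job]) => console.log(logs.status, job.status));
```

Using `allSettled` rather than `all` means a tailer failure does not hide the job's outcome, and vice versa; the caller inspects both results before deciding what to report.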
--- Config schema (src/server/config-schema.ts) ---
DELETE the `enableRtk` and `rtkMaxOutputBytes` field definitions.
--- Documentation (src/index.ts) ---
DELETE any lines in `agentConfigurationDoc` referring to enableRtk,
rtkMaxOutputBytes, or RTK generally. Search for "rtk" (case-insensitive)
and remove matching lines/sections.
--- Delete entire files ---
--- Tests ---
=============================================================================
TESTING
After all changes:
1. `npm run typecheck`: must pass
2. `npm test`: must pass. Record the new passing count.
Do NOT run the adapter end-to-end. Do NOT require a k8s cluster.
=============================================================================
BRANCH, COMMIT, PUSH, PR
Create a new branch off master:
git checkout master && git pull && git checkout -b feat/filesystem-log-tail
Make all the changes above. Commit as ONE commit (this is a coordinated
change — init, adapter, tests belong together). Commit message:
Push:
git push -u origin feat/filesystem-log-tail
Open a PR against master with `gh pr create`:
Title: feat: replace k8s log API streaming with filesystem tailing
Body (use a heredoc):
=============================================================================
WRAPPING UP
Report back with:
1. Branch name and commit hash
2. PR URL
3. Final test count (e.g. "368 tests passing" — number will drop vs 379
because you deleted tests)
4. Line count of execute.ts before and after (should drop significantly)
5. Any deviation from these instructions, with reason
If ANY of the following happens, STOP and report instead of improvising:
- A file path doesn't match what's described (e.g. the mainCommand
pattern has changed)
- A function you're supposed to delete has other callers you didn't
expect
- A test you're supposed to keep depends on something you deleted
- Typecheck fails and the fix is non-obvious
Do NOT push to master. Do NOT tag a version. Do NOT bump package.json
version — leave it as-is.