paperclip-adapter-claude-k8s

farhoodlabs/paperclip-adapter-claude-k8s

Author	SHA1	Message	Date
Test User	df856e6ca5	fix: clean up orphaned K8s Jobs and refresh updatedAt to prevent UI desync Two root causes behind the "plugin losing sync" issue: 1. After a server restart, the in-memory activeRunExecutions set is lost. The K8s Job keeps running but the reaper marks the server-side run as failed after 5 min (stale updatedAt). Next heartbeat fires a new run, the adapter's concurrency guard blocks it because the old Job is still alive, and this loops indefinitely. Fix: the concurrency guard now compares each running Job's paperclip.io/run-id label against the current runId. Jobs from a previous (dead) run are cleaned up automatically so the new run can proceed. 2. onLog (keepalive) does NOT update the run's updatedAt in the DB — it only writes to the log store and publishes SSE events. In multi-instance deployments, a reaper on instance B can mark a run being executed on instance A as stale after 5 min of no DB updates. Fix: the keepalive timer now calls onSpawn every ~4 min (16 ticks) to refresh updatedAt, staying within the 5-min reaper threshold. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-16 21:48:16 +00:00
Test User	d53559e58b	fix: correct Bedrock Opus 4.7 model ID to us.anthropic.claude-opus-4-7 Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-16 17:51:47 +00:00
Test User	335b7b50b5	chore: bump version to 0.1.18 Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-16 17:07:08 +00:00
Test User	0b67ccc081	feat: add Opus 4.7 models and enable manual model selection - Add claude-opus-4-7 and Bedrock Opus 4.7 to model lists - Set models export to undefined (like opencode_k8s) to allow free-text model entry - Move direct models list into server/models.ts - Bump version to 0.1.17 Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-16 16:57:09 +00:00
Chris Farhood	9a85842add	chore: bump version for CI publish Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:52:02 -04:00
Test User	4bf5cf64a4	fix: call onSpawn after pod enters Running state to prevent UI desync The k8s adapter never called ctx.onSpawn(), so the Paperclip server had no processStartedAt timestamp for the run. The stale-run reaper (reapOrphanedRuns) would then mark live k8s runs as failed/orphaned, causing the UI to show no active runs and triggering duplicate run attempts that hit the concurrency guard. Uses pid=-1 as a sentinel since there is no local process — the server's isProcessAlive check safely returns false for pid <= 0. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-16 15:46:34 +00:00
Chris Farhood	b8ba457790	fix: don't delete job when returning state-mismatch error to keep UI in sync When waitForJobCompletion threw and the job was still not terminal, we were returning an error but still deleting the job in the finally block. This left the UI holding an error while the job (still alive) would be cleaned up by Kubernetes, causing the next heartbeat to find nothing and think it was safe to retry — spawning a concurrent pod. Now we set skipCleanup=true when returning the mismatch error, so the job is retained and the heartbeat can still find and wait on it. Also removes a duplicate empty-stdout fallback block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:29:42 -04:00
Chris Farhood	fa5fcb94d9	fix: remove duplicate CI/CD section from CLAUDE.md Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 07:56:25 -04:00
Chris Farhood	169636de1d	docs: clarify CI/CD handles build, not local builds Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 07:27:14 -04:00
Chris Farhood	efbbfbc299	fix: re-check job state when completion waiter throws to prevent UI staleness When waitForJobCompletion threw a transient error (API disconnect, etc.), the code fell through with jobTimedOut=true and returned a result even though the job was still running. This caused the UI to think the run was complete while the job kept running, resulting in concurrency errors. Now when completion throws, we re-check the job's actual state. If still not terminal, we return a k8s_job_state_mismatch error so the UI knows the run is not done. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 07:26:10 -04:00
Chris Farhood	710cf37f5e	chore: rebuild dist files	2026-04-15 19:03:06 -04:00
Chris Farhood	6f85a068f4	chore: bump version to trigger CI publish Verify upstream canary has adapter-utils exports Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 17:59:18 -04:00
Chris Farhood	2412ee427f	Merge pull request #2 from farhoodliquor/feat/adapter-plugin-capabilities feat: declare adapter plugin capabilities on ServerAdapterModule	2026-04-15 17:47:48 -04:00
Test User	ddb1ea4311	chore: update lockfile for adapter-utils canary Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-15 21:45:46 +00:00
Test User	3db5229407	feat: declare adapter plugin capabilities on ServerAdapterModule Adds supportsInstructionsBundle, instructionsPathKey, and requiresMaterializedRuntimeSkills flags so the UI renders the bundle editor for claude_k8s agents. Bumps adapter-utils peer dep to the canary that includes the capability type fields. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-15 21:45:08 +00:00
Pawla Abdul	389bbb6f99	Add ServerAdapterModule capabilities from fork's adapter-utils 0.3.1 Bring the K8s adapter up to parity with the fork's ServerAdapterModule contract by adding sessionManagement, listSkills/syncSkills, listModels with Bedrock detection, and promptBundleKey support in the session codec. - Declare sessionManagement with nativeContextManagement: "confirmed" so Paperclip skips threshold-based session compaction (Claude manages its own context) - Add ephemeral skill management (listSkills/syncSkills) mirroring claude_local — reports skill state without runtime persistence since skills are injected via prompt bundle into ephemeral Job pods - Add listModels() with Bedrock environment detection, returning region-qualified model IDs when CLAUDE_CODE_USE_BEDROCK or ANTHROPIC_BEDROCK_BASE_URL are set - Extend session codec to round-trip promptBundleKey field - Remove the `as ServerAdapterModule` cast — the return type now satisfies the full interface Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-14 11:13:18 +00:00
Pawla Abdul	10a5004c02	Revert "Add RTK integration for token-optimized command output" This reverts commit `d074cb2a8c`.	2026-04-14 01:35:16 +00:00
Pawla Abdul	d074cb2a8c	Add RTK integration for token-optimized command output When enableRtk is set in adapter config, the adapter: - Adds an init container (curlimages/curl) to download the RTK binary - Mounts RTK binary in the main container via shared emptyDir volume - Runs `rtk install claude-code` before invoking Claude to set up hooks - Disables RTK telemetry (RTK_NO_TELEMETRY=1) for automated environments - Supports optional rtkVersion config for pinning specific versions RTK filters CLI command output before it reaches the LLM context, reducing token consumption by ~80%. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-14 01:27:28 +00:00
Pawla Abdul	77ba40d9bf	Reconnect K8s log stream on silent API disconnects The adapter opened a single follow-stream to the K8s API for pod logs. If that TCP connection silently dropped (API server hiccup, network timeout, load-balancer idle cut), streamPodLogs returned early and no more real Claude output reached the UI — only keepalive pings. The pod kept running and producing logs (visible via kubectl), but the adapter never reconnected. Splits streamPodLogs into streamPodLogsOnce (single follow attempt) and a reconnecting wrapper that retries with sinceSeconds until a shared stop signal fires when waitForJobCompletion resolves. On reconnect, requests logs from the original stream start time (+5s overlap) so no output is lost; the UI deduplicates chunks. Bumps version to 0.1.12. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-13 10:34:41 +00:00
Pawla Abdul	e760bf9386	Add keepalive pings during job execution to prevent UI timeout desync The adapter had no mechanism to signal liveness while a K8s Job was running. When Claude entered long thinking phases with no log output, the Paperclip UI could lose sync and consider the run stuck even though the pod was still actively working. Adds a 15-second interval keepalive that sends status messages via onLog during execution. The keepalive tracks time since last real log output and reports it, keeping the connection alive. The timer is cleaned up in the finally block to prevent leaks on any exit path. Bumps version to 0.1.11. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-12 18:44:09 +00:00
Pawla Abdul	ac2fe20294	Restore maxTurnsPerRun field, add config schema tests Keep adapter-specific maxTurnsPerRun (default 1000) in the config schema since the platform UI does not provide it for external adapters. Platform-provided fields (model, effort, instructionsFilePath, timeoutSec, graceSec) remain excluded to avoid duplication. Add config-schema.test.ts with assertions that platform-provided fields are absent and adapter-specific fields have correct defaults. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-12 18:32:49 +00:00
Pawla Abdul	c8d883d409	Remove duplicate/internal fields from UI config schema Fields like model, reasoning effort, instructions file path, max turns, timeout, and grace period are either surfaced elsewhere in the platform UI or are internal operational settings that shouldn't be user-facing in the adapter config panel. These values remain functional when set via the API/backend — only the UI exposure is removed. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-12 17:24:31 +00:00
Chris Farhood	df5cb84dca	Rename project to 'Claude (Kubernetes) Paperclip Adapter'	2026-04-12 11:22:43 -04:00
Pawla Abdul	c8c5e01371	Enhance README with RWX PVC requirements, RBAC examples, and full config docs Adds detailed prerequisites section covering ReadWriteMany PVC setup, complete RBAC Role/RoleBinding/ServiceAccount manifests, and API key secret configuration. Includes full configuration reference tables and a How It Works section explaining the adapter lifecycle. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-12 15:20:54 +00:00
Pawla Abdul	e75a62b329	Fix CI publish failures and add missing config schema fields - CI publish job failed because it tried to re-publish existing versions (npm returns 404 for scoped packages on duplicate version). Added a version-exists check before npm publish to skip gracefully. - Also fixed the auth env var from NPM_TOKEN to NODE_AUTH_TOKEN which is what actions/setup-node's registry-url option expects. - Added missing core and operational fields to getConfigSchema() so the Paperclip UI surfaces model, effort, maxTurnsPerRun, skipPermissions, instructionsFilePath, timeoutSec, and graceSec alongside existing K8s infrastructure fields. - Bumped version to 0.1.10. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-12 15:02:54 +00:00
Chris Farhood	545950daf2	Add CLI formatter, fix env forwarding, rename job prefix to agent-claude- - Add src/cli/ with format-event.ts (printClaudeStreamEvent) exported from CLIAdapterModule - Fix env var forwarding: read from pod spec container env dynamically instead of static allowlist; agent config env overrides pod values - Rename K8s Job prefix from agent- to agent-claude- - Add fsGroupChangePolicy: "OnRootMismatch" to skip PVC chown on subsequent runs - Add comprehensive test coverage (159 tests across 5 test files) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-12 10:47:27 -04:00
Chris Farhood	514fe15009	Regenerate package-lock.json to sync with package.json Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-12 10:44:52 -04:00
Chris Farhood	f5fa41fb3a	Fix getConfigSchema to use flat fields array with correct hint keys The Paperclip AdapterConfigSchema type expects a flat fields array, not nested sections. Also maps description -> hint per the schema type. Defines types locally since @paperclipai/adapter-utils@0.3.1 on npm does not yet export AdapterConfigSchema/ConfigFieldSchema (those exist in the monorepo but aren't released to npm yet). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-12 10:43:31 -04:00
Chris Farhood	448889fc94	Add GitHub Actions CI workflow Runs typecheck and tests on push/PR to master, then publishes to npm on successful master pushes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-12 10:39:23 -04:00
Chris Farhood	75ba66e504	Add getConfigSchema to surface K8s fields in Paperclip UI Adds AdapterConfigSchema with three sections (Kubernetes, Resource Limits, Scheduling) exposing: namespace, image, imagePullPolicy, kubeconfig, resources.{requests,limits}.{cpu,memory}, nodeSelector, tolerations, labels, ttlSecondsAfterFinished, retainJobs. Paperclip's server fetches GET /api/adapters/:type/config-schema and caches the result, automatically assigning ConfigFields to external adapters. The adapter now wires getConfigSchema into createServerAdapter(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-12 10:31:55 -04:00
Chris Farhood	98af28a272	Skip PVC chown on subsequent runs with fsGroupChangePolicy Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-12 00:04:35 -04:00
Chris Farhood	4b0baaf05c	Add .gitignore and bump version for npm publishing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 23:43:19 -04:00
Chris Farhood	4c310d020d	Clarify that Paperclip must be deployed on an RWX PVC	2026-04-11 23:18:16 -04:00
Chris Farhood	9dbb5f337e	Initial commit: Paperclip adapter for Claude Code on Kubernetes Adapter plugin that runs Claude Code agents as Kubernetes Jobs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 23:16:31 -04:00

34 Commits
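
The timing figures quoted in commits e760bf9386, df856e6ca5 and 77ba40d9bf can be checked with a small sketch. This is an illustration only, not the adapter's code: the constant names and the helpers `shouldRefreshUpdatedAt` and `reconnectSinceSeconds` are hypothetical, and only the numbers (15-second keepalive, 16 ticks, 5-minute reaper threshold, 5-second overlap) come from the commit messages. `sinceSeconds` is the Kubernetes pod-log query parameter the reconnect commit mentions.

```typescript
// Values quoted from the commit messages above; all names are hypothetical.
const KEEPALIVE_INTERVAL_MS = 15_000;    // e760bf9386: 15-second keepalive interval
const REFRESH_EVERY_TICKS = 16;          // df856e6ca5: call onSpawn every 16 ticks
const REAPER_THRESHOLD_MS = 5 * 60_000;  // df856e6ca5: reaper marks runs stale after 5 min
const RECONNECT_OVERLAP_SEC = 5;         // 77ba40d9bf: +5s overlap on reconnect

// Time between two updatedAt refreshes: 15 s * 16 ticks = 240 s (4 min).
const refreshIntervalMs = KEEPALIVE_INTERVAL_MS * REFRESH_EVERY_TICKS;

// Hypothetical helper: true on the keepalive ticks where updatedAt
// would be refreshed (every 16th tick, never on tick 0).
function shouldRefreshUpdatedAt(tick: number): boolean {
  return tick > 0 && tick % REFRESH_EVERY_TICKS === 0;
}

// Hypothetical helper: the sinceSeconds value for a reconnecting log
// request, covering everything since the original stream start plus the
// overlap, so no output is lost (the commit says the UI deduplicates chunks).
function reconnectSinceSeconds(streamStartMs: number, nowMs: number): number {
  const elapsedSec = Math.ceil((nowMs - streamStartMs) / 1000);
  return Math.max(1, elapsedSec + RECONNECT_OVERLAP_SEC);
}

// The refresh cadence leaves a one-minute margin under the reaper threshold.
const marginMs = REAPER_THRESHOLD_MS - refreshIntervalMs;
console.log(`refresh every ${refreshIntervalMs / 1000}s, margin ${marginMs / 1000}s`);
```

Under these assumptions a refresh lands every 240 seconds, 60 seconds inside the reaper threshold, and a reconnect 90 seconds after the stream started would request the last 95 seconds of logs.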