forked from farhoodlabs/paperclip
feat(plugin): add kubernetes sandbox provider
@@ -0,0 +1,165 @@
# @paperclipai/plugin-kubernetes (alpha)

First-party Paperclip sandbox-provider plugin for Kubernetes.

**Alpha:** the default backend (`sandbox-cr`) is built on `kubernetes-sigs/agent-sandbox` v1alpha1, so expect breaking changes as that CRD evolves toward Beta. A stable fallback backend (`job`, using a `batch/v1` Job) is available for clusters without agent-sandbox installed, but it does NOT support multi-command exec (paperclip-server's adapter-install pattern requires `sandbox-cr`).

## Prerequisites

### For `sandbox-cr` backend (default, recommended)

1. A Kubernetes cluster running version 1.27 or later
2. [`kubernetes-sigs/agent-sandbox`](https://github.com/kubernetes-sigs/agent-sandbox) controller installed in the cluster (alpha; installs the `sandboxes.agents.x-k8s.io/v1alpha1` CRD and controller)
3. paperclip-server running with access to the cluster (in-cluster via `inCluster: true` or external via `kubeconfig`)

### For `job` backend (stable fallback)

1. A Kubernetes cluster running version 1.27 or later
2. paperclip-server with cluster access; no additional controllers or CRDs are required

## Installation

```bash
paperclipai plugin install @paperclipai/plugin-kubernetes
```

Or, for local development:

```bash
paperclipai plugin install --local /path/to/paperclip/packages/plugins/sandbox-providers/kubernetes
```

## Backends

The plugin supports two backend modes, selected via the `backend` config field:

| Backend | Default | Stability | Multi-command exec | Requires |
|---|---|---|---|---|
| `sandbox-cr` | Yes | Alpha | Yes | `kubernetes-sigs/agent-sandbox` controller |
| `job` | No | Stable | No | Nothing beyond k8s 1.27+ |

**`sandbox-cr` (default):** Creates a `Sandbox` CR (`agents.x-k8s.io/v1alpha1`) whose controller provisions a long-lived pod running `sleep infinity`. paperclip-server execs individual commands into the running pod; this is the multi-command adapter-install pattern. When you `releaseLease`, the Sandbox CR is deleted and the controller tears down the pod.

**`job` (stable fallback):** Creates a `batch/v1` Job. The container entrypoint runs once and exits, so multi-command exec is not possible. Use this when you cannot install agent-sandbox, or when you need strictly stable Kubernetes APIs. Note: paperclip-server's adapter-install pattern will not work in job mode.

### Migrating from `job` to `sandbox-cr`

1. Install the agent-sandbox controller: `kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/latest/download/install.yaml`
2. Update your environment config to set `backend: "sandbox-cr"` (or remove `backend`, since `sandbox-cr` is the default)
3. New leases will use the Sandbox CR backend. Existing leases created with `job` mode continue to use job semantics until they are released.

## Configuration

Create a `sandbox` environment with `driver: kubernetes`. One of these auth fields is required:

- `inCluster: true`: use the in-pod ServiceAccount credentials (when paperclip-server runs inside the same cluster).
- `kubeconfig: <YAML>`: inline kubeconfig (stored as a company secret).
- `kubeconfigSecretRef: <secret-uuid>`: reference to an existing Paperclip secret.

Common optional fields:

| Field | Default | Purpose |
|---|---|---|
| `backend` | `"sandbox-cr"` | `sandbox-cr` (alpha, requires agent-sandbox controller) or `job` (stable, one-shot entrypoint). |
| `adapterType` | `"claude_local"` | One of the supported adapter types (claude_local, codex_local, gemini_local, cursor_local, opencode_local, acpx_local, pi_local). Determines runtime image + env keys + egress allow-list. |
| `namespacePrefix` | `"paperclip-"` | Prefix for the per-company tenant namespace. |
| `companySlug` | derived from companyId | Override the auto-derived company slug. |
| `imageRegistry` | (none) | Override the default registry for agent runtime images. |
| `imageAllowList` | `[]` | Glob patterns of allowed `target.imageOverride` values. Empty = no override permitted. |
| `imagePullSecrets` | `[]` | Names of pre-created Docker image pull secrets in the tenant namespace. |
| `egressAllowFqdns` | `[]` | Additional FQDNs (beyond adapter defaults like `api.anthropic.com`). |
| `egressAllowCidrs` | `[]` | Additional CIDRs to allow egress to. |
| `egressMode` | `"standard"` | `standard` (NetworkPolicy + CIDRs, plus public HTTPS fallback when adapter FQDNs are configured) or `cilium` (CiliumNetworkPolicy + exact FQDN allow-list). |
| `runtimeClassName` | (none) | e.g. `kata-fc` for Firecracker-backed microVMs. Cluster must have the RuntimeClass installed. |
| `serviceAccountAnnotations` | `{}` | Annotations applied to per-tenant ServiceAccount (e.g. IRSA `eks.amazonaws.com/role-arn`). |
| `jobTtlSecondsAfterFinished` | `900` | Seconds after a Job completes before garbage-collection. |
| `podActivityDeadlineSec` | `3600` | Hard ceiling on a single run's wall-clock time. |

Full JSON Schema in `src/manifest.ts`.
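
The "one auth field is required" rule can be sketched as a small check. This is an illustration only: the `AuthConfig` type and the `requireAuthMode` helper are hypothetical names, and the precedence (in-cluster first, then inline kubeconfig, then secret reference) is an assumption rather than documented plugin behavior.

```typescript
// Hypothetical subset of the environment config: only the three auth fields.
interface AuthConfig {
  inCluster?: boolean;
  kubeconfig?: string;
  kubeconfigSecretRef?: string;
}

type AuthMode = "inCluster" | "kubeconfig" | "kubeconfigSecretRef";

// Collect every auth field that is actually set (empty strings do not count).
function configuredAuthModes(cfg: AuthConfig): AuthMode[] {
  const modes: AuthMode[] = [];
  if (cfg.inCluster === true) modes.push("inCluster");
  if (typeof cfg.kubeconfig === "string" && cfg.kubeconfig.trim().length > 0) {
    modes.push("kubeconfig");
  }
  if (typeof cfg.kubeconfigSecretRef === "string" && cfg.kubeconfigSecretRef.trim().length > 0) {
    modes.push("kubeconfigSecretRef");
  }
  return modes;
}

// Return the auth mode to use, or throw when none is configured.
// Assumed precedence: the first configured mode in the order listed above.
function requireAuthMode(cfg: AuthConfig): AuthMode {
  const [first] = configuredAuthModes(cfg);
  if (first === undefined) {
    throw new Error("One of inCluster, kubeconfig, or kubeconfigSecretRef is required");
  }
  return first;
}
```

Rejecting the config before any cluster call keeps a missing-credentials mistake from surfacing later as an opaque connection error.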

## What gets created in your cluster

For each company that runs agents (created lazily on first dispatch):

```
Namespace          paperclip-{companySlug}   (PSS: restricted enforce + audit)
ServiceAccount     paperclip-tenant-sa
Role               paperclip-tenant-role     (only get pods/log)
RoleBinding        paperclip-tenant-rb
ResourceQuota      paperclip-quota           (pods, requests/limits cpu+memory)
LimitRange         paperclip-limits          (container max/min/default/defaultRequest)
NetworkPolicy      paperclip-deny-all        (deny ingress + egress baseline)
NetworkPolicy      paperclip-egress-allow    (DNS + paperclip-server callback + user CIDRs + public HTTPS fallback for adapter FQDNs)
  OR CiliumNetworkPolicy paperclip-egress-fqdn   if egressMode=cilium
```

Standard Kubernetes NetworkPolicy cannot match FQDNs. In `egressMode: "standard"`, adapter-default FQDNs such as `api.anthropic.com` trigger a public IPv4 HTTPS fallback that excludes private and link-local ranges, so default agent runs can reach model APIs without opening intra-cluster/private-network egress. Use `egressMode: "cilium"` when you need exact FQDN enforcement.
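
As a rough sketch, the standard-mode fallback could be expressed as a NetworkPolicy egress rule that allows TCP 443 to `0.0.0.0/0` with `except` blocks. The helper name `buildPublicHttpsFallbackRule` and the exact exclusion list (RFC 1918 ranges plus IPv4 link-local) are assumptions for illustration; the plugin's real list may differ.

```typescript
// Shape of one `spec.egress[]` entry in a networking.k8s.io/v1 NetworkPolicy.
interface EgressRule {
  to: { ipBlock: { cidr: string; except: string[] } }[];
  ports: { protocol: "TCP"; port: number }[];
}

// Assumed exclusion list: RFC 1918 private ranges plus IPv4 link-local.
const NON_PUBLIC_IPV4: readonly string[] = [
  "10.0.0.0/8",
  "172.16.0.0/12",
  "192.168.0.0/16",
  "169.254.0.0/16",
];

// Hypothetical helper: emit the fallback rule only when the adapter needs
// at least one FQDN, so adapters without FQDNs get no public egress at all.
function buildPublicHttpsFallbackRule(adapterFqdns: string[]): EgressRule | null {
  if (adapterFqdns.length === 0) return null;
  return {
    to: [{ ipBlock: { cidr: "0.0.0.0/0", except: [...NON_PUBLIC_IPV4] } }],
    ports: [{ protocol: "TCP", port: 443 }],
  };
}
```

The trade-off is visible in the rule itself: any public HTTPS endpoint is reachable, not only the listed FQDNs, which is why `cilium` mode exists for exact enforcement.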

For each agent run (sandbox-cr backend):

```
Sandbox CR   pc-{ulid}               (agents.x-k8s.io/v1alpha1; explicit delete on release)
Pod          pc-{ulid}-{podSuffix}   (managed by Sandbox controller; torn down on CR delete)
Secret       pc-{ulid}-env           (owned by Sandbox CR; cascade-deleted)
```

For each agent run (job backend):

```
Job          pc-{ulid}               (backoffLimit: 0, ttlSecondsAfterFinished from config)
Pod          pc-{ulid}-{podSuffix}   (owned by Job; cascade-deleted)
Secret       pc-{ulid}-env           (owned by Job; cascade-deleted)
```
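
To make the job-backend row concrete, here is a minimal sketch of the Job fields it names. The helper `buildJobSkeleton`, the container name `agent`, and the mapping of `podActivityDeadlineSec` onto the Job's `activeDeadlineSeconds` are assumptions; the real manifest builder also applies the security hardening described in the next section.

```typescript
// Hypothetical input for the sketch; field names mirror the config table.
interface JobSkeletonInput {
  name: string; // e.g. "pc-{ulid}"
  namespace: string; // e.g. "paperclip-{companySlug}"
  image: string;
  jobTtlSecondsAfterFinished: number;
  podActivityDeadlineSec: number;
}

function buildJobSkeleton(input: JobSkeletonInput) {
  return {
    apiVersion: "batch/v1",
    kind: "Job",
    metadata: { name: input.name, namespace: input.namespace },
    spec: {
      // One attempt only: a failed run is reported, never retried.
      backoffLimit: 0,
      // Finished Jobs are garbage-collected after this many seconds.
      ttlSecondsAfterFinished: input.jobTtlSecondsAfterFinished,
      // Assumed mapping: wall-clock ceiling for the whole run.
      activeDeadlineSeconds: input.podActivityDeadlineSec,
      template: {
        spec: {
          restartPolicy: "Never",
          containers: [{ name: "agent", image: input.image }],
        },
      },
    },
  };
}
```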

## Security baseline

Every agent pod:

- runs as non-root (`runAsUser: 1000`, `runAsGroup: 1000`, `runAsNonRoot: true`)
- drops ALL Linux capabilities and sets `allowPrivilegeEscalation: false`
- sets `readOnlyRootFilesystem: true` with explicit `emptyDir` mounts for `/workspace`, `/home/paperclip`, `/home/paperclip/.cache`, `/tmp`
- uses `seccompProfile: RuntimeDefault`
- runs Tini as PID 1 (reaps zombies, forwards signals)
- sets `fsGroupChangePolicy: OnRootMismatch` (fast PVC startup; openclaw-operator lesson)
- sets `automountServiceAccountToken: false`

In addition, each tenant namespace carries `pod-security.kubernetes.io/enforce: restricted` and a deny-all NetworkPolicy baseline with an explicit egress allow-list (DNS, paperclip-server, CIDRs, and either Cilium FQDN rules or the standard-mode public HTTPS fallback).
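
The pod-level settings listed above map onto standard Pod spec fields roughly as follows. This is a sketch: the helper name `buildHardenedPodSettings` and the grouping into `pod`/`container` objects are illustrative, not the plugin's actual manifest builder.

```typescript
// Hypothetical helper collecting the hardening fields from the list above.
function buildHardenedPodSettings() {
  return {
    // Pod spec level.
    automountServiceAccountToken: false,
    // Goes into `spec.securityContext`.
    pod: {
      runAsNonRoot: true,
      runAsUser: 1000,
      runAsGroup: 1000,
      fsGroupChangePolicy: "OnRootMismatch",
      seccompProfile: { type: "RuntimeDefault" },
    },
    // Goes into each container's `securityContext`.
    container: {
      allowPrivilegeEscalation: false,
      readOnlyRootFilesystem: true,
      capabilities: { drop: ["ALL"] },
    },
    // Paths that need an emptyDir because the root filesystem is read-only.
    writableMounts: ["/workspace", "/home/paperclip", "/home/paperclip/.cache", "/tmp"],
  };
}
```

These values also satisfy the `restricted` Pod Security Standard, which is why the namespace-level `enforce` label does not reject the agent pods.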

The per-run Secret carrying the bootstrap token and adapter API keys has `ownerReferences` pointing at the owning Sandbox CR or Job, so when the owner is deleted on lease release, Kubernetes garbage collection removes the Secret along with the Pod.
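
A minimal sketch of such a Secret manifest, assuming the owner's `uid` is already known (which implies the owner is created first). The helper name `buildPerRunSecret` and the `OwnerRef` type are hypothetical.

```typescript
// The four fields Kubernetes requires in an ownerReferences entry.
interface OwnerRef {
  apiVersion: string; // "batch/v1" or "agents.x-k8s.io/v1alpha1"
  kind: string; // "Job" or "Sandbox"
  name: string;
  uid: string;
}

// Hypothetical helper: build the per-run Secret owned by the Job or Sandbox CR.
function buildPerRunSecret(namespace: string, owner: OwnerRef, env: Record<string, string>) {
  return {
    apiVersion: "v1",
    kind: "Secret",
    metadata: {
      name: `${owner.name}-env`,
      namespace,
      // The owner must live in the same namespace as the Secret.
      ownerReferences: [
        { apiVersion: owner.apiVersion, kind: owner.kind, name: owner.name, uid: owner.uid },
      ],
    },
    type: "Opaque",
    // stringData lets the API server handle base64 encoding.
    stringData: { ...env },
  };
}
```

Because the reference is keyed by `uid`, a Job that is deleted and recreated under the same name does not adopt a stale Secret.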

## Optional Kata-FC microVM isolation

For stronger isolation, install [Kata Containers](https://github.com/kata-containers/kata-containers) with the Firecracker hypervisor, then set `runtimeClassName: kata-fc` in the plugin config. Each agent pod will run inside a Firecracker microVM. Requires nested-virt-capable nodes (bare-metal or specific cloud instance types).

## Roadmap

- **Phase A (done):** `sandbox-cr` backend, with multi-command exec via the agent-sandbox Sandbox CRD.
- **Phase B:** Warm pool support, using pre-provisioned Sandbox CRs for sub-second cold starts. The `SandboxOrchestrator` interface reserves optional `pause?`/`resume?` extension slots.
- **Phase C:** Kata-FC + snapshots, using `runtimeClassName: kata-fc` with VM snapshots for fast restore.
- **Phase D:** Contribute back to agent-sandbox upstream if their Beta model diverges from our needs. The `SandboxOrchestrator` interface (`src/sandbox-orchestrator.ts`) is the clean swap point: a new implementation can be added without touching `plugin.ts` business logic.

## Lessons learned (from openclaw-operator)

This plugin adopts patterns from `openclaw-rocks/openclaw-operator`:

- Tini PID 1 (issue #471: zombie helper processes)
- Read-only rootFS with explicit writable mounts (issue #456: `~/.config` not writable)
- Strategic merge on reconcile (issue #446: preserve third-party annotations)
- Multi-storage-class testing (issue #448: `local-path-provisioner` differences)
- Image version compat matrix (issue #462: runtime deps cannot resolve after upgrade)

## Development

```bash
cd packages/plugins/sandbox-providers/kubernetes
pnpm install --ignore-workspace
pnpm test        # unit tests only (fast)
pnpm typecheck
pnpm build
```

To run the kind-cluster integration test (requires `kubectl --context kind-paperclip` and a pre-loaded alpine image; see `test/integration/end-to-end-run.test.ts`):

```bash
RUN_K8S_INTEGRATION_TESTS=1 pnpm test test/integration/end-to-end-run.test.ts
```
@@ -0,0 +1,135 @@
# Manual smoke test: `@paperclipai/plugin-kubernetes`

Manual sanity check that the plugin works end-to-end against a real
paperclip-server instance and a real Kubernetes cluster (kind for local
dev). Future work may automate this in CI.

## Prerequisites

- A running kind cluster:
  ```bash
  kind create cluster --name paperclip
  ```
- `kubectl --context kind-paperclip get nodes` returns a node in `Ready` state.

## Steps

### 1. Build the plugin

```bash
cd packages/plugins/sandbox-providers/kubernetes
pnpm install --ignore-workspace
pnpm build
```

Expected: `dist/` populated with compiled `.js` and `.d.ts` files. No errors.

### 2. Start paperclip-server in dev mode

In a separate terminal:

```bash
cd /path/to/paperclip
export PAPERCLIP_HOME=/tmp/paperclip-smoke
export PAPERCLIP_INSTANCE_ID=smoke
export PAPERCLIP_DEPLOYMENT_MODE=local_trusted
pnpm --filter @paperclipai/server dev
```

Wait for `Server listening on 127.0.0.1:3100`.

### 3. Install the plugin via the CLI

```bash
pnpm paperclipai plugin install \
  --local /path/to/paperclip/packages/plugins/sandbox-providers/kubernetes \
  --api-base http://127.0.0.1:3100
```

Expected: `✓ Installed paperclip.kubernetes-sandbox-provider v0.1.0 (ready)`.

### 4. Create a company and a kubernetes sandbox environment

This smoke test sets `backend: "job"` because the kind cluster from the
prerequisites has no agent-sandbox controller installed, and the steps
below verify Job resources.

```bash
CO_ID=$(curl -s -X POST -H "Content-Type: application/json" \
  -d '{"name":"SmokeCo"}' \
  http://127.0.0.1:3100/api/companies | jq -r '.id')

KUBECONFIG_CONTENT=$(jq -Rs . < ~/.kube/config)

curl -s -X POST -H "Content-Type: application/json" \
  -d "{
    \"name\": \"k8s-sandbox\",
    \"driver\": \"sandbox\",
    \"config\": {
      \"provider\": \"kubernetes\",
      \"backend\": \"job\",
      \"kubeconfig\": $KUBECONFIG_CONTENT,
      \"companySlug\": \"smoke\",
      \"adapterType\": \"claude_local\",
      \"imageAllowList\": [\"ghcr.io/paperclipai/agent-runtime-claude:v1\"]
    }
  }" \
  http://127.0.0.1:3100/api/companies/$CO_ID/environments | jq
```

Expected: HTTP 201 with the new environment row.

### 5. Probe the environment

```bash
ENV_ID=$(curl -s http://127.0.0.1:3100/api/companies/$CO_ID/environments | jq -r '.[0].id')
curl -s -X POST -d '{}' -H "Content-Type: application/json" \
  http://127.0.0.1:3100/api/environments/$ENV_ID/probe | jq
```

Expected: `{"ok": true, ...}` with a summary mentioning the tenant namespace
(`paperclip-smoke`). On first probe the namespace may not yet exist;
the plugin treats a 404 on `listNamespacedPod` as a successful reachability
check.
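
The 404-as-success rule can be sketched as a small error classifier. Everything here is illustrative: `classifyProbeError` and `httpStatusOf` are hypothetical names, and the error fields inspected (`code`, `statusCode`, `response.statusCode`) are assumptions about how a Kubernetes client might surface the HTTP status, not a documented contract.

```typescript
interface ProbeResult {
  ok: boolean;
  summary: string;
}

// Best-effort extraction of an HTTP status from an unknown error value.
// The inspected field names are assumptions, not a documented error shape.
function httpStatusOf(err: unknown): number | undefined {
  if (typeof err !== "object" || err === null) return undefined;
  const e = err as {
    code?: unknown;
    statusCode?: unknown;
    response?: { statusCode?: unknown };
  };
  for (const candidate of [e.code, e.statusCode, e.response?.statusCode]) {
    if (typeof candidate === "number") return candidate;
  }
  return undefined;
}

// A 404 means the API server answered and the namespace is simply not
// provisioned yet, so the cluster counts as reachable.
function classifyProbeError(err: unknown, namespace: string): ProbeResult {
  if (httpStatusOf(err) === 404) {
    return { ok: true, summary: `cluster reachable; namespace ${namespace} not created yet` };
  }
  const message = err instanceof Error ? err.message : String(err);
  return { ok: false, summary: `probe failed for ${namespace}: ${message}` };
}
```

Any other failure (connection refused, 401, 403) still reports `ok: false`, so the probe keeps catching bad credentials and unreachable clusters.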

### 6. Trigger an agent run

Use the UI or the API to dispatch a run against the `k8s-sandbox` environment.
The steps below describe the `job` backend (`backend: "job"` in the
environment config). The plugin's `onEnvironmentAcquireLease` will:

1. `ensureTenant`: provision the `paperclip-smoke` namespace, SA, Role,
   RoleBinding, ResourceQuota, LimitRange, NetworkPolicies
2. `buildJobManifest`: render the security-hardened Job manifest
3. `createJob`: submit to `batch/v1`
4. `createPerRunSecret`: owned by the Job for cascade-delete

### 7. Verify the tenant resources

```bash
kubectl --context kind-paperclip get namespace paperclip-smoke
kubectl --context kind-paperclip get all,networkpolicy,resourcequota,limitrange,sa,role,rolebinding -n paperclip-smoke
```

Expected:

- Namespace `paperclip-smoke` exists with PSS labels
  (`pod-security.kubernetes.io/enforce=restricted`)
- ServiceAccount `paperclip-tenant-sa`
- Role `paperclip-tenant-role`, RoleBinding `paperclip-tenant-rb`
- ResourceQuota `paperclip-quota`, LimitRange `paperclip-limits`
- NetworkPolicies `paperclip-deny-all` + `paperclip-egress-allow`
- Job `pc-{ulid}` and its child Pod
- Secret `pc-{ulid}-env` with `ownerReferences` pointing at the Job

### 8. Tear down

```bash
kubectl --context kind-paperclip delete namespace paperclip-smoke
# Stop paperclip-server with Ctrl-C in the terminal from step 2.
```

### 9. Document the result

In the PR description (or appended to this file as a dated section),
record:

- Date + git SHA
- `kubectl version` server version
- Output of `kubectl get all -n paperclip-smoke` after step 6
- Probe response from step 5
- Time-to-acquire-lease (target: <30s on kind for a cold tenant)
@@ -0,0 +1,22 @@
# The `job` backend uses only stable Kubernetes APIs and needs no CRD installation; the default `sandbox-cr` backend also needs the agent-sandbox CRD (see README).
#
# Minimum cluster version: Kubernetes 1.27+
#   - batch/v1 Job (GA since k8s 1.2)
#   - core/v1 Pod, Secret, Namespace, ServiceAccount, ResourceQuota, LimitRange (GA since k8s 1.0)
#   - rbac.authorization.k8s.io/v1 Role, RoleBinding (GA since k8s 1.8)
#   - networking.k8s.io/v1 NetworkPolicy (GA since k8s 1.7)
#   - Pod Security Standards namespace labels (GA in k8s 1.25)
#   - fsGroupChangePolicy: OnRootMismatch (GA in k8s 1.23)
#   - seccompProfile.type: RuntimeDefault (GA in k8s 1.19)
#
# Optional CNI prerequisites for FQDN-based egress (egressMode: cilium):
#   - Cilium >= 1.11 with hubble + DNS proxy enabled
#   - cilium.io/v2 CiliumNetworkPolicy (provided by Cilium installation)
#
# Optional runtime class for microVM isolation (runtimeClassName: kata-fc):
#   - kata-containers with Firecracker hypervisor
#   - nested-virt-capable nodes
#
# Future work (not currently required):
#   - kubernetes-sigs/agent-sandbox v1beta1 (once released) for
#     warm pools / templates / pause-resume.
@@ -0,0 +1,60 @@
{
  "name": "@paperclipai/plugin-kubernetes",
  "version": "0.1.0",
  "description": "Kubernetes sandbox provider plugin for Paperclip environments",
  "license": "MIT",
  "homepage": "https://github.com/paperclipai/paperclip",
  "bugs": {
    "url": "https://github.com/paperclipai/paperclip/issues"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/paperclipai/paperclip",
    "directory": "packages/plugins/sandbox-providers/kubernetes"
  },
  "type": "module",
  "exports": {
    ".": "./src/index.ts"
  },
  "publishConfig": {
    "access": "public",
    "exports": {
      ".": {
        "types": "./dist/index.d.ts",
        "import": "./dist/index.js"
      }
    },
    "main": "./dist/index.js",
    "types": "./dist/index.d.ts"
  },
  "files": ["dist", "manifests", "README.md"],
  "paperclipPlugin": {
    "manifest": "./dist/manifest.js",
    "worker": "./dist/worker.js"
  },
  "keywords": [
    "paperclip",
    "plugin",
    "sandbox",
    "kubernetes"
  ],
  "scripts": {
    "postinstall": "node ../../../../scripts/link-plugin-dev-sdk.mjs",
    "prebuild": "pnpm -C ../../../.. --filter @paperclipai/plugin-sdk ensure-build-deps",
    "build": "rm -rf dist && tsc",
    "clean": "rm -rf dist",
    "typecheck": "pnpm -C ../../../.. --filter @paperclipai/plugin-sdk ensure-build-deps && tsc --noEmit",
    "test": "vitest run --config vitest.config.ts",
    "prepack": "rm -f package.dev.json && cp package.json package.dev.json && node ../../../../scripts/generate-plugin-package-json.mjs",
    "postpack": "if [ -f package.dev.json ]; then mv package.dev.json package.json; fi"
  },
  "dependencies": {
    "@kubernetes/client-node": "^1.0.0",
    "zod": "^3.24.2"
  },
  "devDependencies": {
    "@types/node": "^24.6.0",
    "typescript": "^5.7.3",
    "vitest": "^3.2.4"
  }
}
@@ -0,0 +1,61 @@
/** Per-adapter defaults: runtime image, env keys, egress FQDNs, and probe command. */
export interface AdapterDefaults {
  runtimeImage: string;
  envKeys: string[];
  allowFqdns: string[];
  probeCommand: string[];
}

const REGISTRY: Record<string, AdapterDefaults> = {
  claude_local: {
    runtimeImage: "ghcr.io/paperclipai/agent-runtime-claude:v1",
    envKeys: ["ANTHROPIC_API_KEY"],
    allowFqdns: ["api.anthropic.com"],
    probeCommand: ["claude", "--version"],
  },
  codex_local: {
    runtimeImage: "ghcr.io/paperclipai/agent-runtime-codex:v1",
    envKeys: ["OPENAI_API_KEY"],
    allowFqdns: ["api.openai.com"],
    probeCommand: ["codex", "--version"],
  },
  gemini_local: {
    runtimeImage: "ghcr.io/paperclipai/agent-runtime-gemini:v1",
    envKeys: ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
    allowFqdns: ["generativelanguage.googleapis.com"],
    probeCommand: ["gemini", "--version"],
  },
  cursor_local: {
    runtimeImage: "ghcr.io/paperclipai/agent-runtime-cursor:v1",
    envKeys: ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"],
    allowFqdns: ["api.anthropic.com", "api.openai.com"],
    probeCommand: ["cursor-agent", "--version"],
  },
  opencode_local: {
    runtimeImage: "ghcr.io/paperclipai/agent-runtime-opencode:v1",
    envKeys: ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENROUTER_API_KEY"],
    allowFqdns: ["api.anthropic.com", "api.openai.com", "openrouter.ai"],
    probeCommand: ["opencode", "--version"],
  },
  acpx_local: {
    runtimeImage: "ghcr.io/paperclipai/agent-runtime-acpx:v1",
    envKeys: ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"],
    allowFqdns: ["api.anthropic.com", "api.openai.com"],
    probeCommand: ["acpx", "--version"],
  },
  pi_local: {
    runtimeImage: "ghcr.io/paperclipai/agent-runtime-pi:v1",
    envKeys: ["ANTHROPIC_API_KEY"],
    allowFqdns: ["api.anthropic.com"],
    probeCommand: ["pi", "--version"],
  },
};

export const KNOWN_ADAPTER_TYPES: ReadonlySet<string> = new Set(Object.keys(REGISTRY));

export function getAdapterDefaults(adapterType: string): AdapterDefaults {
  // Check the known-keys set first so inherited object properties such as
  // "constructor" or "toString" are rejected instead of being returned.
  const defaults = KNOWN_ADAPTER_TYPES.has(adapterType) ? REGISTRY[adapterType] : undefined;
  if (!defaults) {
    throw new Error(`Unknown adapter type: ${adapterType}`);
  }
  return defaults;
}
@@ -0,0 +1,68 @@
export interface BuildCiliumNetworkPolicyInput {
  namespace: string;
  paperclipServerNamespace: string;
  egressAllowFqdns: string[];
  egressAllowCidrs: string[];
}

// Design note: no ingress rules are defined here. Paperclip-server does NOT
// push to agent pods; agents make outbound (egress) callbacks to
// paperclip-server on port 3100. If server→agent push is ever needed, add a
// targeted ingress rule scoped to the paperclip-server endpoint selector.
export function buildCiliumNetworkPolicyManifest(input: BuildCiliumNetworkPolicyInput): Record<string, unknown> {
  const egress: Record<string, unknown>[] = [];

  egress.push({
    toEndpoints: [
      { matchLabels: { "k8s:io.kubernetes.pod.namespace": "kube-system", "k8s-app": "kube-dns" } },
    ],
    toPorts: [
      {
        ports: [
          { port: "53", protocol: "UDP" },
          { port: "53", protocol: "TCP" },
        ],
        rules: { dns: [{ matchPattern: "*" }] },
      },
    ],
  });

  if (input.egressAllowFqdns.length > 0) {
    egress.push({
      toFQDNs: input.egressAllowFqdns.map((fqdn) => ({ matchName: fqdn })),
      toPorts: [{ ports: [{ port: "443", protocol: "TCP" }] }],
    });
  }

  egress.push({
    toEndpoints: [
      {
        matchLabels: {
          "k8s:io.kubernetes.pod.namespace": input.paperclipServerNamespace,
          app: "paperclip-server",
        },
      },
    ],
    toPorts: [{ ports: [{ port: "3100", protocol: "TCP" }] }],
  });

  if (input.egressAllowCidrs.length > 0) {
    egress.push({
      toCIDRSet: input.egressAllowCidrs.map((cidr) => ({ cidr })),
    });
  }

  return {
    apiVersion: "cilium.io/v2",
    kind: "CiliumNetworkPolicy",
    metadata: {
      name: "paperclip-egress-fqdn",
      namespace: input.namespace,
      labels: { "paperclip.io/managed-by": "paperclip-k8s-plugin" },
    },
    spec: {
      endpointSelector: { matchLabels: { "paperclip.io/role": "agent" } },
      egress,
    },
  };
}
@@ -0,0 +1,59 @@
/**
 * Glob matching for image references.
 * - `*` matches any sequence of characters EXCEPT `/` (so a wildcard doesn't span path segments)
 * - `?` matches exactly one character (excluding `/`)
 */
export function globMatch(pattern: string, value: string): boolean {
  const re = new RegExp(
    "^" +
      pattern
        .replace(/[.+^${}()|[\]\\]/g, "\\$&")
        .replace(/\*/g, "[^/]*")
        .replace(/\?/g, "[^/]") +
      "$",
  );
  return re.test(value);
}

export interface ResolveImageInput {
  imageOverride?: string | null;
}

export interface ResolveImageDefaults {
  runtimeImage: string;
}

export interface ResolveImageConfig {
  imageAllowList: string[];
  imageRegistry?: string;
}

export function resolveImage(
  target: ResolveImageInput,
  defaults: ResolveImageDefaults,
  config: ResolveImageConfig,
): string {
  if (target.imageOverride) {
    if (!config.imageAllowList.some((p) => globMatch(p, target.imageOverride!))) {
      throw new Error(`Image override "${target.imageOverride}" is not in allowlist`);
    }
    return target.imageOverride;
  }
  if (config.imageRegistry) {
    return rewriteRegistry(defaults.runtimeImage, config.imageRegistry);
  }
  return defaults.runtimeImage;
}

function rewriteRegistry(image: string, registry: string): string {
  // image is like "ghcr.io/paperclipai/agent-runtime-claude:v1"
  // we want to replace the first two path segments (host + org) with `registry`
  const cleanRegistry = registry.replace(/\/+$/, "");
  // A ":" only starts a tag when it follows the last "/"; an earlier colon is
  // a registry port (e.g. "localhost:5000/org/img"). A digest starts at "@".
  const digestIdx = image.indexOf("@");
  const colonIdx = image.lastIndexOf(":");
  const tagIdx = colonIdx > image.lastIndexOf("/") ? colonIdx : -1;
  const suffixIdx = digestIdx >= 0 ? digestIdx : tagIdx;
  const tag = suffixIdx >= 0 ? image.slice(suffixIdx) : "";
  const path = suffixIdx >= 0 ? image.slice(0, suffixIdx) : image;
  const segments = path.split("/");
  // Strip the host+org (first two segments), keep the image name
  const imageName = segments.slice(2).join("/") || segments[segments.length - 1];
  return `${cleanRegistry}/${imageName}${tag}`;
}
@@ -0,0 +1,2 @@
export { default as manifest } from "./manifest.js";
export { default as plugin } from "./plugin.js";
@@ -0,0 +1,129 @@
import type { KubeClients } from "./kube-client.js";
import type { SandboxOrchestrator, SandboxStatus } from "./sandbox-orchestrator.js";

export class JobTimeoutError extends Error {
  constructor(namespace: string, name: string, timeoutMs: number) {
    super(`Job ${namespace}/${name} did not complete within ${timeoutMs}ms`);
    this.name = "JobTimeoutError";
  }
}

export async function createJob(
  clients: KubeClients,
  namespace: string,
  manifest: Record<string, unknown>,
): Promise<{ uid: string }> {
  const result = await clients.batch.createNamespacedJob({ namespace, body: manifest as never });
  const uid = (result as { metadata?: { uid?: string } }).metadata?.uid;
  if (!uid) throw new Error("Job created without a UID");
  return { uid };
}

export type JobStatus = SandboxStatus;

export async function getJobStatus(
  clients: KubeClients,
  namespace: string,
  name: string,
): Promise<JobStatus> {
  const result = await clients.batch.readNamespacedJobStatus({ namespace, name });
  const body = (result as Record<string, unknown>) ?? {};
  const status = (body.status as Record<string, unknown>) ?? {};
  const active = (status.active as number) ?? 0;
  const succeeded = (status.succeeded as number) ?? 0;
  const failed = (status.failed as number) ?? 0;
  const conditions = (status.conditions as { type: string; status: string; reason?: string; message?: string }[]) ?? [];
  const completed = conditions.find((c) => c.type === "Complete" && c.status === "True");
  const failedCond = conditions.find((c) => c.type === "Failed" && c.status === "True");
  // Failure is checked first: with backoffLimit 0, any failed pod is terminal.
  if (failedCond || failed > 0) {
    return { phase: "Failed", complete: false, active, succeeded, failed, reason: failedCond?.reason, message: failedCond?.message };
  }
  if (completed || succeeded > 0) {
    return { phase: "Succeeded", complete: true, active, succeeded, failed };
  }
  if (active > 0) {
    return { phase: "Running", complete: false, active, succeeded, failed };
  }
  return { phase: "Pending", complete: false, active, succeeded, failed };
}

export async function findPodForJob(
  clients: KubeClients,
  namespace: string,
  jobName: string,
): Promise<string | null> {
  const result = await clients.core.listNamespacedPod({
    namespace,
    labelSelector: `job-name=${jobName}`,
  });
  const items = ((result as { items?: { metadata?: { name?: string }; status?: { phase?: string } }[] }).items) ?? [];
  // Prefer a Running pod; otherwise fall back to the first pod listed.
  const running = items.find((p) => p.status?.phase === "Running");
  return (running ?? items[0])?.metadata?.name ?? null;
}

export async function streamPodLogs(
  clients: KubeClients,
  namespace: string,
  podName: string,
  onChunk: (stream: "stdout" | "stderr", text: string) => Promise<void>,
): Promise<void> {
  // V1 limitation: the Pod log API returns the container's combined log stream.
  // Kubernetes does not preserve stdout/stderr channel separation after the
  // container runtime writes logs, so the Job backend reports combined logs on
  // stdout. The sandbox-cr backend uses exec and keeps streams separate.
  const result = await clients.core.readNamespacedPodLog({ namespace, name: podName });
  const text = readPodLogText(result);
  if (text.length > 0) await onChunk("stdout", text);
}

function readPodLogText(result: unknown): string {
  if (typeof result === "string") return result;
  const body = (result as { body?: unknown })?.body;
  return typeof body === "string" ? body : "";
}

export async function deleteJob(
  clients: KubeClients,
  namespace: string,
  name: string,
): Promise<void> {
  await clients.batch.deleteNamespacedJob({
    namespace,
    name,
    propagationPolicy: "Foreground",
  });
}

export async function waitForJobCompletion(
  clients: KubeClients,
  namespace: string,
  name: string,
  opts: { timeoutMs: number; pollMs?: number } = { timeoutMs: 120_000, pollMs: 2000 },
): Promise<JobStatus> {
  const deadline = Date.now() + opts.timeoutMs;
  const pollMs = opts.pollMs ?? 2000;
  while (Date.now() < deadline) {
    const status = await getJobStatus(clients, namespace, name);
    if (status.phase === "Succeeded" || status.phase === "Failed") return status;
    await sleep(pollMs);
  }
  throw new JobTimeoutError(namespace, name, opts.timeoutMs);
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

/**
 * Job-backed conformance to SandboxOrchestrator. plugin.ts imports THIS value
 * (the swap point). To use a different backend, swap this import for another
 * module that exports a SandboxOrchestrator-shaped value.
 */
export const jobOrchestrator: SandboxOrchestrator = {
  claim: createJob,
  getStatus: getJobStatus,
  findPod: findPodForJob,
  streamLogs: streamPodLogs,
  release: deleteJob,
  waitForCompletion: waitForJobCompletion,
};
@@ -0,0 +1,44 @@
import {
  KubeConfig,
  CoreV1Api,
  BatchV1Api,
  CustomObjectsApi,
  NetworkingV1Api,
  RbacAuthorizationV1Api,
} from "@kubernetes/client-node";

export interface CreateKubeConfigInput {
  inCluster?: boolean;
  kubeconfig?: string;
}

export function createKubeConfig(input: CreateKubeConfigInput): KubeConfig {
  const kc = new KubeConfig();
  if (input.inCluster) {
    kc.loadFromCluster();
    return kc;
  }
  if (input.kubeconfig && input.kubeconfig.trim().length > 0) {
    kc.loadFromString(input.kubeconfig);
    return kc;
  }
  throw new Error("createKubeConfig requires either inCluster=true or a kubeconfig string");
}

export interface KubeClients {
  core: CoreV1Api;
  batch: BatchV1Api;
  custom: CustomObjectsApi;
  networking: NetworkingV1Api;
  rbac: RbacAuthorizationV1Api;
}

export function makeKubeClients(kc: KubeConfig): KubeClients {
  return {
    core: kc.makeApiClient(CoreV1Api),
    batch: kc.makeApiClient(BatchV1Api),
    custom: kc.makeApiClient(CustomObjectsApi),
    networking: kc.makeApiClient(NetworkingV1Api),
    rbac: kc.makeApiClient(RbacAuthorizationV1Api),
  };
}
@@ -0,0 +1,122 @@
|
||||
import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
|
||||
|
||||
const PLUGIN_ID = "paperclip.kubernetes-sandbox-provider";
|
||||
const PLUGIN_VERSION = "0.1.0-alpha.1";
|
||||
|
||||
const manifest: PaperclipPluginManifestV1 = {
|
||||
id: PLUGIN_ID,
|
||||
apiVersion: 1,
|
||||
version: PLUGIN_VERSION,
|
||||
displayName: "Kubernetes Sandbox (alpha)",
|
||||
description:
|
||||
"Built on kubernetes-sigs/agent-sandbox (v1alpha1). ALPHA — expect breaking changes as the upstream CRD evolves. Falls back to stable batch/v1 Job mode for clusters without agent-sandbox installed. First-party Paperclip sandbox-provider plugin for Kubernetes.",
|
||||
author: "Paperclip",
|
||||
categories: ["automation"],
|
||||
capabilities: ["environment.drivers.register"],
|
||||
entrypoints: {
|
||||
worker: "./dist/worker.js",
|
||||
},
|
||||
environmentDrivers: [
|
||||
{
|
||||
driverKey: "kubernetes",
|
||||
kind: "sandbox_provider",
|
||||
displayName: "Kubernetes",
|
||||
description:
|
||||
"Dispatches agent runs in per-tenant Kubernetes namespaces. Default backend (sandbox-cr, alpha) uses kubernetes-sigs/agent-sandbox for multi-command exec; fallback backend (job) uses stable batch/v1 Job for clusters without agent-sandbox installed.",
|
||||
configSchema: {
|
||||
type: "object",
|
||||
properties: {
|
||||
inCluster: {
|
||||
type: "boolean",
|
||||
description:
|
||||
"When true, the plugin uses the in-pod ServiceAccount credentials. Requires paperclip-server to be running inside the target cluster.",
|
||||
},
|
||||
kubeconfig: {
|
||||
type: "string",
|
||||
format: "secret-ref",
|
||||
description:
|
||||
"Inline kubeconfig YAML. Paste a kubeconfig or an existing Paperclip secret reference; pasted values are stored as company secrets.",
|
||||
},
|
||||
namespacePrefix: {
|
||||
type: "string",
|
||||
description: "Prefix for the per-company tenant namespace (default: paperclip-).",
|
||||
},
|
||||
companySlug: {
|
||||
type: "string",
|
||||
description: "Override the auto-derived company slug used in the tenant namespace name.",
|
||||
},
|
||||
imageRegistry: {
|
||||
type: "string",
|
||||
description: "Override the default registry for agent runtime images (default: ghcr.io/paperclipai).",
|
||||
},
|
||||
imageAllowList: {
|
||||
type: "array",
|
||||
items: { type: "string" },
|
||||
description:
|
||||
"Glob patterns of allowed `target.imageOverride` values. Empty list = no override permitted.",
|
||||
},
|
||||
imagePullSecrets: {
|
||||
type: "array",
|
||||
items: { type: "string" },
|
||||
description: "Names of pre-created Docker image pull secrets in the tenant namespace.",
|
||||
},
|
||||
egressAllowFqdns: {
|
||||
type: "array",
|
||||
items: { type: "string" },
|
||||
description:
|
||||
"Additional FQDNs to allow egress to from agent pods. Adapter-default FQDNs (e.g. api.anthropic.com) are added automatically.",
|
||||
},
|
||||
egressAllowCidrs: {
|
||||
type: "array",
|
||||
items: { type: "string" },
|
||||
description: "Additional CIDRs to allow egress to from agent pods.",
|
||||
},
|
||||
egressMode: {
|
||||
type: "string",
|
||||
enum: ["standard", "cilium"],
|
||||
description:
|
||||
"Network policy mode. `standard` uses NetworkPolicy and allows public HTTPS when adapter FQDNs are configured; `cilium` enables exact FQDN egress filtering via CiliumNetworkPolicy.",
|
||||
},
|
||||
runtimeClassName: {
|
||||
type: "string",
|
||||
description:
|
||||
"Optional RuntimeClass for pod isolation (e.g. `kata-fc` for Firecracker-backed microVMs). Cluster must have the RuntimeClass installed.",
|
||||
},
|
||||
serviceAccountAnnotations: {
|
||||
type: "object",
|
||||
additionalProperties: { type: "string" },
|
||||
description:
|
||||
"Annotations applied to the per-tenant ServiceAccount (e.g. `eks.amazonaws.com/role-arn` for IRSA).",
|
||||
},
|
||||
jobTtlSecondsAfterFinished: {
|
||||
type: "integer",
|
||||
minimum: 0,
|
||||
description: "Seconds after a Job completes before it is garbage-collected (default: 900).",
|
||||
},
|
||||
podActivityDeadlineSec: {
|
||||
type: "integer",
|
||||
minimum: 1,
|
||||
description: "Hard ceiling on a single run's wall-clock time (default: 3600).",
|
||||
},
|
||||
adapterType: {
|
||||
type: "string",
|
||||
description:
|
||||
"The adapter type that Jobs in this environment will run (e.g. `claude_local`, `codex_local`). Defaults to `claude_local`. Each environment is bound to one adapter; create multiple environments for different adapters.",
|
||||
},
|
||||
backend: {
|
||||
type: "string",
|
||||
enum: ["sandbox-cr", "job"],
|
||||
description:
|
||||
"sandbox-cr (default, alpha — requires kubernetes-sigs/agent-sandbox installed) | job (stable fallback — batch/v1 Job, one-shot entrypoint, no multi-command exec)",
|
||||
},
|
||||
},
|
||||
anyOf: [
|
||||
{ required: ["inCluster"] },
|
||||
{ required: ["kubeconfig"] },
|
||||
],
|
||||
},
|
||||
},
|
||||
],
|
||||
};
|
||||
|
||||
export default manifest;
|
||||
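A minimal sketch of a config object that would satisfy the `configSchema` above, assuming the JSON Schema is the only gate (the Zod schema in `types.js` is not shown here and may be stricter). The values are illustrative, and `hasCredentialSource` is a hypothetical helper that mirrors the `anyOf` clause rather than anything in the plugin.

```typescript
// Illustrative values only; field names follow the configSchema above.
const exampleConfig = {
  inCluster: true,
  backend: "job",
  egressMode: "standard",
  adapterType: "claude_local",
  namespacePrefix: "paperclip-",
  egressAllowFqdns: ["registry.npmjs.org"],
  jobTtlSecondsAfterFinished: 900,
  podActivityDeadlineSec: 3600,
};

// Hypothetical helper: mirrors `anyOf: [{ required: ["inCluster"] }, { required: ["kubeconfig"] }]`.
// JSON Schema `required` only checks that the key is present, not its value.
function hasCredentialSource(cfg: Record<string, unknown>): boolean {
  return "inCluster" in cfg || "kubeconfig" in cfg;
}
```

Because `required` tests presence only, `{ inCluster: false }` passes this clause even though `createKubeConfig` would later throw for it when no kubeconfig string is supplied.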
@@ -0,0 +1,101 @@
|
||||
export interface BuildNetworkPolicyInput {
|
||||
namespace: string;
|
||||
paperclipServerNamespace: string;
|
||||
egressAllowFqdns: string[];
|
||||
egressAllowCidrs: string[];
|
||||
}
|
||||
|
||||
const PUBLIC_IPV4_EXCEPTIONS = [
|
||||
"10.0.0.0/8",
|
||||
"100.64.0.0/10",
|
||||
"127.0.0.0/8",
|
||||
"169.254.0.0/16",
|
||||
"172.16.0.0/12",
|
||||
"192.168.0.0/16",
|
||||
];
|
||||
|
||||
// Design note: the deny-all baseline blocks all ingress to agent pods.
|
||||
// Paperclip-server does NOT push to agent pods — the agent shim makes
|
||||
// outbound calls to paperclip-server via the egress allow-list (port 3100).
|
||||
// This pull/callback model means no ingress rule is needed. If a future
|
||||
// feature requires server→agent push (e.g. forced shutdown, live exec),
|
||||
// add a targeted ingress rule here scoped to the paperclip-server pod
|
||||
// selector.
|
||||
//
|
||||
// Standard Kubernetes NetworkPolicy cannot express FQDN allow-lists. When
|
||||
// adapter defaults require FQDN egress, keep runs functional by allowing public
|
||||
// IPv4 HTTPS while excluding private/link-local ranges. Operators who need
|
||||
// exact FQDN enforcement should use egressMode="cilium".
|
||||
export function buildNetworkPolicyManifests(input: BuildNetworkPolicyInput): Record<string, unknown>[] {
|
||||
const fqdnsRequirePublicHttpsFallback = input.egressAllowFqdns.length > 0;
|
||||
const denyAll = {
|
||||
apiVersion: "networking.k8s.io/v1",
|
||||
kind: "NetworkPolicy",
|
||||
metadata: {
|
||||
name: "paperclip-deny-all",
|
||||
namespace: input.namespace,
|
||||
labels: { "paperclip.io/managed-by": "paperclip-k8s-plugin" },
|
||||
},
|
||||
spec: {
|
||||
podSelector: {},
|
||||
policyTypes: ["Ingress", "Egress"],
|
||||
},
|
||||
};
|
||||
|
||||
const egressAllow: Record<string, unknown> = {
|
||||
apiVersion: "networking.k8s.io/v1",
|
||||
kind: "NetworkPolicy",
|
||||
metadata: {
|
||||
name: "paperclip-egress-allow",
|
||||
namespace: input.namespace,
|
||||
labels: { "paperclip.io/managed-by": "paperclip-k8s-plugin" },
|
||||
},
|
||||
spec: {
|
||||
podSelector: { matchLabels: { "paperclip.io/role": "agent" } },
|
||||
policyTypes: ["Egress"],
|
||||
egress: [
|
||||
{
|
||||
to: [
|
||||
{
|
||||
namespaceSelector: { matchLabels: { "kubernetes.io/metadata.name": "kube-system" } },
|
||||
podSelector: { matchLabels: { "k8s-app": "kube-dns" } },
|
||||
},
|
||||
],
|
||||
ports: [
|
||||
{ protocol: "UDP", port: 53 },
|
||||
{ protocol: "TCP", port: 53 },
|
||||
],
|
||||
},
|
||||
{
|
||||
to: [
|
||||
{
|
||||
namespaceSelector: { matchLabels: { "kubernetes.io/metadata.name": input.paperclipServerNamespace } },
|
||||
podSelector: { matchLabels: { app: "paperclip-server" } },
|
||||
},
|
||||
],
|
||||
ports: [{ protocol: "TCP", port: 3100 }],
|
||||
},
|
||||
...(fqdnsRequirePublicHttpsFallback
|
||||
? [
|
||||
{
|
||||
to: [
|
||||
{
|
||||
ipBlock: {
|
||||
cidr: "0.0.0.0/0",
|
||||
except: PUBLIC_IPV4_EXCEPTIONS,
|
||||
},
|
||||
},
|
||||
],
|
||||
ports: [{ protocol: "TCP", port: 443 }],
|
||||
},
|
||||
]
|
||||
: []),
|
||||
...input.egressAllowCidrs.map((cidr) => ({
|
||||
to: [{ ipBlock: { cidr } }],
|
||||
})),
|
||||
],
|
||||
},
|
||||
};
|
||||
|
||||
return [denyAll, egressAllow];
|
||||
}
|
||||
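To make the public-HTTPS fallback concrete, the sketch below evaluates which destination IPv4 addresses an `ipBlock` of `0.0.0.0/0` with the `except` list above would admit. Only the CIDR list comes from the source; `ipv4ToInt`, `inCidr`, and `allowedByPublicHttpsFallback` are hypothetical helpers, and real enforcement is done by the cluster's CNI, not by this code.

```typescript
// CIDR list copied from PUBLIC_IPV4_EXCEPTIONS; everything else is illustrative.
const EXCLUDED_RANGES = [
  "10.0.0.0/8",
  "100.64.0.0/10",
  "127.0.0.0/8",
  "169.254.0.0/16",
  "172.16.0.0/12",
  "192.168.0.0/16",
];

// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  const [a, b, c, d] = parts.map(Number);
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// True when `ip` falls inside `cidr` (for example "172.16.0.0/12").
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Mirrors `ipBlock: { cidr: "0.0.0.0/0", except: [...] }` for a single address.
function allowedByPublicHttpsFallback(ip: string): boolean {
  return !EXCLUDED_RANGES.some((range) => inCidr(ip, range));
}
```

Under these assumptions a public address such as `8.8.8.8` is admitted on TCP 443, while the cloud metadata address `169.254.169.254` and any RFC 1918 address are not.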
@@ -0,0 +1,554 @@
|
||||
import { randomBytes } from "node:crypto";
|
||||
import { definePlugin } from "@paperclipai/plugin-sdk";
|
||||
import type {
|
||||
PluginEnvironmentAcquireLeaseParams,
|
||||
PluginEnvironmentExecuteParams,
|
||||
PluginEnvironmentExecuteResult,
|
||||
PluginEnvironmentLease,
|
||||
PluginEnvironmentProbeParams,
|
||||
PluginEnvironmentProbeResult,
|
||||
PluginEnvironmentRealizeWorkspaceParams,
|
||||
PluginEnvironmentRealizeWorkspaceResult,
|
||||
PluginEnvironmentReleaseLeaseParams,
|
||||
PluginEnvironmentValidateConfigParams,
|
||||
PluginEnvironmentValidationResult,
|
||||
} from "@paperclipai/plugin-sdk";
|
||||
import {
|
||||
kubernetesProviderConfigSchema,
|
||||
type KubernetesProviderConfig,
|
||||
type KubernetesLeaseMetadata,
|
||||
} from "./types.js";
|
||||
import { createKubeConfig, makeKubeClients } from "./kube-client.js";
|
||||
import { getAdapterDefaults } from "./adapter-defaults.js";
|
||||
import { resolveImage } from "./image-allowlist.js";
|
||||
import { buildJobManifest } from "./pod-spec-builder.js";
|
||||
import { buildSandboxCrManifest } from "./sandbox-cr-builder.js";
|
||||
import { ensureTenant } from "./tenant-orchestrator.js";
|
||||
import { createPerRunSecret } from "./secret-manager.js";
|
||||
import { jobOrchestrator, JobTimeoutError } from "./job-orchestrator.js";
|
||||
import {
|
||||
sandboxCrOrchestrator,
|
||||
SandboxCrTimeoutError,
|
||||
} from "./sandbox-cr-orchestrator.js";
|
||||
import { execInPod } from "./pod-exec.js";
|
||||
import {
|
||||
deriveCompanySlug,
|
||||
deriveNamespaceName,
|
||||
newRunUlidDns,
|
||||
paperclipLabels,
|
||||
} from "./utils.js";
|
||||
|
||||
// The namespace paperclip-server itself runs in. Used when building
|
||||
// NetworkPolicy manifests so agent pods in the tenant namespace are allowed
|
||||
// egress to the server pod (port 3100).
|
||||
const PAPERCLIP_SERVER_NAMESPACE = "paperclip";
|
||||
|
||||
// Name of the ServiceAccount created inside each tenant namespace by ensureTenant.
|
||||
const TENANT_SERVICE_ACCOUNT = "paperclip-tenant-sa";
|
||||
|
||||
// Resource quota defaults applied to every tenant namespace (M4b; tunable via
|
||||
// config in a future milestone).
|
||||
const DEFAULT_RESOURCE_QUOTA = {
|
||||
pods: "20",
|
||||
requestsCpu: "10",
|
||||
requestsMemory: "20Gi",
|
||||
limitsCpu: "20",
|
||||
limitsMemory: "40Gi",
|
||||
};
|
||||
|
||||
function deriveTenantNamespace(config: KubernetesProviderConfig, companyId: string): string {
|
||||
// TODO: future versions could thread companyName through AcquireLeaseParams
|
||||
// to get a friendlier slug (e.g. "acme-corp") instead of the UUID-derived one.
|
||||
const slug = config.companySlug ?? deriveCompanySlug(companyId);
|
||||
return deriveNamespaceName(config.namespacePrefix, slug);
|
||||
}
|
||||
|
||||
/**
|
||||
* Reads adapter env keys (e.g. ANTHROPIC_API_KEY) from the current process
|
||||
* environment. The plugin worker runs inside paperclip-server's pod, which has
|
||||
* these vars injected at deploy time.
|
||||
*
|
||||
* M4b approach: env vars sourced from process.env at acquire time.
|
||||
* TODO: future milestones may thread per-run secrets differently (e.g. via
|
||||
* a secret store reference on the environment config).
|
||||
*/
|
||||
function extractAdapterEnvFromProcess(envKeys: string[]): Record<string, string> {
|
||||
const out: Record<string, string> = {};
|
||||
for (const k of envKeys) {
|
||||
const v = process.env[k];
|
||||
if (v) out[k] = v;
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
function generateBootstrapToken(): string {
|
||||
// TODO: paperclip-server's actual callback auth scheme is separate and is
|
||||
// out of M4b scope. This per-run random token is stored in the per-run
|
||||
// Secret and consumed by paperclip-agent-shim for initial registration.
|
||||
return randomBytes(32).toString("hex");
|
||||
}
|
||||
|
||||
const plugin = definePlugin({
|
||||
async setup(ctx) {
|
||||
ctx.logger.info("Kubernetes sandbox provider plugin ready");
|
||||
},
|
||||
|
||||
async onHealth() {
|
||||
return { status: "ok", message: "Kubernetes sandbox provider plugin healthy" };
|
||||
},
|
||||
|
||||
async onEnvironmentValidateConfig(
|
||||
params: PluginEnvironmentValidateConfigParams,
|
||||
): Promise<PluginEnvironmentValidationResult> {
|
||||
const parsed = kubernetesProviderConfigSchema.safeParse(params.config);
|
||||
if (!parsed.success) {
|
||||
return {
|
||||
ok: false,
|
||||
errors: parsed.error.issues.map((i) => i.message),
|
||||
};
|
||||
}
|
||||
const warnings: string[] = [];
|
||||
const cfg = parsed.data;
|
||||
const adapterDefaults = getAdapterDefaults(cfg.adapterType);
|
||||
const totalFqdns = [...adapterDefaults.allowFqdns, ...cfg.egressAllowFqdns];
|
||||
if (cfg.egressMode === "standard" && totalFqdns.length > 0) {
|
||||
warnings.push(
|
||||
`egressMode=standard cannot enforce FQDN-based egress rules for ${totalFqdns.join(", ")}. Agent pods will get public IPv4 HTTPS egress with private/link-local ranges excluded. Switch egressMode to "cilium" for exact FQDN enforcement.`,
|
||||
);
|
||||
}
|
||||
return { ok: true, normalizedConfig: cfg as Record<string, unknown>, warnings: warnings.length > 0 ? warnings : undefined };
|
||||
},
|
||||
|
||||
async onEnvironmentProbe(
|
||||
params: PluginEnvironmentProbeParams,
|
||||
): Promise<PluginEnvironmentProbeResult> {
|
||||
const parsed = kubernetesProviderConfigSchema.safeParse(params.config);
|
||||
if (!parsed.success) {
|
||||
return {
|
||||
ok: false,
|
||||
summary: "Invalid Kubernetes provider configuration.",
|
||||
metadata: {
|
||||
errors: parsed.error.issues.map((i) => i.message),
|
||||
},
|
||||
};
|
||||
}
|
||||
const config = parsed.data;
|
||||
const namespace = deriveTenantNamespace(config, params.companyId);
|
||||
|
||||
try {
|
||||
const kc = createKubeConfig({
|
||||
inCluster: config.inCluster,
|
||||
kubeconfig: config.kubeconfig,
|
||||
});
|
||||
const clients = makeKubeClients(kc);
|
||||
// Reachability check: list pods in the tenant namespace. If the namespace
|
||||
// doesn't exist yet this will throw a 404 which we treat as "reachable
|
||||
// but namespace not provisioned" — still a successful probe.
|
||||
try {
|
||||
await clients.core.listNamespacedPod({ namespace });
|
||||
} catch (err) {
|
||||
const code = (err as { code?: number; statusCode?: number }).code
|
||||
?? (err as { code?: number; statusCode?: number }).statusCode;
|
||||
if (code !== 404) throw err;
|
||||
// 404 means namespace doesn't exist yet — cluster is reachable.
|
||||
}
|
||||
return {
|
||||
ok: true,
|
||||
summary: `Kubernetes cluster reachable. Tenant namespace: ${namespace}.`,
|
||||
metadata: { namespace, provider: "kubernetes" },
|
||||
};
|
||||
} catch (err) {
|
||||
return {
|
||||
ok: false,
|
||||
summary: "Kubernetes cluster probe failed.",
|
||||
metadata: {
|
||||
namespace,
|
||||
provider: "kubernetes",
|
||||
error: err instanceof Error ? err.message : String(err),
|
||||
},
|
||||
};
|
||||
}
|
||||
},
|
||||
|
||||
async onEnvironmentAcquireLease(
|
||||
params: PluginEnvironmentAcquireLeaseParams,
|
||||
): Promise<PluginEnvironmentLease> {
|
||||
const config = kubernetesProviderConfigSchema.parse(params.config);
|
||||
const namespace = deriveTenantNamespace(config, params.companyId);
|
||||
|
||||
// Emit a runtime warning if FQDNs are configured but egressMode=standard
|
||||
// cannot enforce them. Mirrors the validateConfig warning so operators see
|
||||
// it in paperclip-server logs even if they missed the validation step.
|
||||
const adapterDefaultsForWarn = getAdapterDefaults(config.adapterType);
|
||||
const totalFqdnsForWarn = [...adapterDefaultsForWarn.allowFqdns, ...config.egressAllowFqdns];
|
||||
if (config.egressMode === "standard" && totalFqdnsForWarn.length > 0) {
|
||||
console.warn(
|
||||
`[plugin-kubernetes] egressMode=standard cannot enforce FQDN-based egress rules for ${totalFqdnsForWarn.join(", ")}. Agent pods will get public IPv4 HTTPS egress with private/link-local ranges excluded. Switch egressMode to "cilium" for exact FQDN enforcement.`,
|
||||
);
|
||||
}
|
||||
|
||||
const kc = createKubeConfig({
|
||||
inCluster: config.inCluster,
|
||||
kubeconfig: config.kubeconfig,
|
||||
});
|
||||
const clients = makeKubeClients(kc);
|
||||
|
||||
// Ensure the tenant namespace and all its RBAC / network policy resources
|
||||
// exist before we try to create the Job or Sandbox.
|
||||
const adapterDefaults = getAdapterDefaults(config.adapterType);
|
||||
|
||||
await ensureTenant(clients, {
|
||||
namespace,
|
||||
companyId: params.companyId,
|
||||
paperclipServerNamespace: PAPERCLIP_SERVER_NAMESPACE,
|
||||
serviceAccountAnnotations: config.serviceAccountAnnotations,
|
||||
egressMode: config.egressMode,
|
||||
egressAllowFqdns: [...adapterDefaults.allowFqdns, ...config.egressAllowFqdns],
|
||||
egressAllowCidrs: config.egressAllowCidrs,
|
||||
resourceQuota: DEFAULT_RESOURCE_QUOTA,
|
||||
});
|
||||
|
||||
const jobName = `pc-${newRunUlidDns()}`;
|
||||
const secretName = `${jobName}-env`;
|
||||
|
||||
// TODO: use params.runId as stand-in for agentId in labels; future
|
||||
// versions will have a dedicated agentId on AcquireLeaseParams.
|
||||
const labels = paperclipLabels({
|
||||
runId: params.runId,
|
||||
agentId: params.runId,
|
||||
companyId: params.companyId,
|
||||
adapterType: config.adapterType,
|
||||
});
|
||||
|
||||
const image = resolveImage(
|
||||
{ imageOverride: null },
|
||||
adapterDefaults,
|
||||
{ imageAllowList: config.imageAllowList, imageRegistry: config.imageRegistry },
|
||||
);
|
||||
|
||||
// Pick the orchestrator and build the appropriate manifest based on backend.
|
||||
const isSandboxCrBackend = config.backend === "sandbox-cr";
|
||||
const orchestrator = isSandboxCrBackend ? sandboxCrOrchestrator : jobOrchestrator;
|
||||
|
||||
const manifest = isSandboxCrBackend
|
||||
? buildSandboxCrManifest({
|
||||
namespace,
|
||||
sandboxName: jobName,
|
||||
adapterType: config.adapterType,
|
||||
image,
|
||||
envSecretName: secretName,
|
||||
serviceAccountName: TENANT_SERVICE_ACCOUNT,
|
||||
labels,
|
||||
resources: config.defaultResources ?? {},
|
||||
runtimeClassName: config.runtimeClassName,
|
||||
imagePullSecrets: config.imagePullSecrets,
|
||||
})
|
||||
: buildJobManifest({
|
||||
namespace,
|
||||
jobName,
|
||||
adapterType: config.adapterType,
|
||||
image,
|
||||
envSecretName: secretName,
|
||||
serviceAccountName: TENANT_SERVICE_ACCOUNT,
|
||||
labels,
|
||||
resources: config.defaultResources ?? {},
|
||||
runtimeClassName: config.runtimeClassName,
|
||||
activeDeadlineSec: config.podActivityDeadlineSec,
|
||||
ttlSecondsAfterFinished: config.jobTtlSecondsAfterFinished,
|
||||
imagePullSecrets: config.imagePullSecrets,
|
||||
});
|
||||
|
||||
const { uid: ownerUid } = await orchestrator.claim(clients, namespace, manifest);
|
||||
|
||||
// M4b: adapter env vars are sourced from the plugin worker's own process
|
||||
// environment (paperclip-server pod has them injected at deploy time).
|
||||
const adapterEnv = extractAdapterEnvFromProcess(adapterDefaults.envKeys);
|
||||
const bootstrapToken = generateBootstrapToken();
|
||||
|
||||
// Secret ownerRef: for job backend, the Job owns the Secret (cascade delete).
|
||||
// For sandbox-cr backend, the Sandbox CR owns the Secret.
|
||||
// NOTE: For sandbox-cr, if the Secret outlives the Sandbox due to a cluster
|
||||
// quirk, the release() call will still clean it up via namespace GC or
|
||||
// explicit delete in a future milestone.
|
||||
await createPerRunSecret(clients, {
|
||||
namespace,
|
||||
secretName,
|
||||
runId: params.runId,
|
||||
ownerKind: isSandboxCrBackend ? "Sandbox" : "Job",
|
||||
ownerApiVersion: isSandboxCrBackend ? "agents.x-k8s.io/v1alpha1" : "batch/v1",
|
||||
ownerName: jobName,
|
||||
ownerUid,
|
||||
bootstrapToken,
|
||||
adapterEnv,
|
||||
});
|
||||
|
||||
const podName = await orchestrator.findPod(clients, namespace, jobName);
|
||||
|
||||
const leaseMetadata: KubernetesLeaseMetadata = {
|
||||
namespace,
|
||||
jobName,
|
||||
podName,
|
||||
secretName,
|
||||
phase: "Pending",
|
||||
backend: config.backend,
|
||||
};
|
||||
|
||||
return {
|
||||
providerLeaseId: jobName,
|
||||
metadata: leaseMetadata as unknown as Record<string, unknown>,
|
||||
};
|
||||
},
|
||||
|
||||
async onEnvironmentRealizeWorkspace(
|
||||
params: PluginEnvironmentRealizeWorkspaceParams,
|
||||
): Promise<PluginEnvironmentRealizeWorkspaceResult> {
|
||||
// The agent pod already has /workspace mounted as an emptyDir at pod
|
||||
// scheduling time (see pod-spec-builder). Nothing to provision here —
|
||||
// we just hand back the cwd. Honor a caller-supplied remotePath if set.
|
||||
const cwd =
|
||||
params.workspace.remotePath && params.workspace.remotePath.trim().length > 0
|
||||
? params.workspace.remotePath.trim()
|
||||
: "/workspace";
|
||||
return {
|
||||
cwd,
|
||||
metadata: {
|
||||
provider: "kubernetes",
|
||||
remoteCwd: cwd,
|
||||
},
|
||||
};
|
||||
},
|
||||
|
||||
async onEnvironmentReleaseLease(
|
||||
params: PluginEnvironmentReleaseLeaseParams,
|
||||
): Promise<void> {
|
||||
if (!params.providerLeaseId) return;
|
||||
const config = kubernetesProviderConfigSchema.parse(params.config);
|
||||
const namespace =
|
||||
typeof params.leaseMetadata?.namespace === "string"
|
||||
? params.leaseMetadata.namespace
|
||||
: deriveTenantNamespace(config, params.companyId);
|
||||
|
||||
const kc = createKubeConfig({
|
||||
inCluster: config.inCluster,
|
||||
kubeconfig: config.kubeconfig,
|
||||
});
|
||||
const clients = makeKubeClients(kc);
|
||||
|
||||
const leaseBackend =
|
||||
typeof params.leaseMetadata?.backend === "string"
|
||||
? (params.leaseMetadata.backend as "sandbox-cr" | "job")
|
||||
: config.backend;
|
||||
const releaseOrchestrator =
|
||||
leaseBackend === "sandbox-cr" ? sandboxCrOrchestrator : jobOrchestrator;
|
||||
|
||||
try {
|
||||
await releaseOrchestrator.release(clients, namespace, params.providerLeaseId);
|
||||
} catch (err) {
|
||||
// If the resource is already gone (404), that's fine.
|
||||
const code = (err as { code?: number; statusCode?: number }).code
|
||||
?? (err as { code?: number; statusCode?: number }).statusCode;
|
||||
if (code !== 404) throw err;
|
||||
}
|
||||
},
|
||||
|
||||
async onEnvironmentExecute(
|
||||
params: PluginEnvironmentExecuteParams,
|
||||
): Promise<PluginEnvironmentExecuteResult> {
|
||||
const { lease, timeoutMs } = params;
|
||||
|
||||
if (!lease.providerLeaseId) {
|
||||
return {
|
||||
exitCode: 1,
|
||||
timedOut: false,
|
||||
stdout: "",
|
||||
stderr: "No provider lease ID available for execution.",
|
||||
};
|
||||
}
|
||||
|
||||
const config = kubernetesProviderConfigSchema.parse(params.config);
|
||||
const namespace =
|
||||
typeof lease.metadata?.namespace === "string"
|
||||
? lease.metadata.namespace
|
||||
: deriveTenantNamespace(config, params.companyId);
|
||||
|
||||
// Determine which backend this lease was created with.
|
||||
const leaseBackend =
|
||||
typeof lease.metadata?.backend === "string"
|
||||
? (lease.metadata.backend as "sandbox-cr" | "job")
|
||||
: config.backend;
|
||||
|
||||
const kc = createKubeConfig({
|
||||
inCluster: config.inCluster,
|
||||
kubeconfig: config.kubeconfig,
|
||||
});
|
||||
const clients = makeKubeClients(kc);
|
||||
|
||||
const effectiveTimeoutMs =
|
||||
typeof timeoutMs === "number" && timeoutMs > 0
|
||||
? timeoutMs
|
||||
: config.podActivityDeadlineSec * 1000;
|
||||
|
||||
if (leaseBackend === "sandbox-cr") {
|
||||
// ── Sandbox-CR backend ──────────────────────────────────────────────────
|
||||
// 1. Ensure the Sandbox pod is Ready (wait if needed).
|
||||
// 2. Exec the command into the running pod.
|
||||
// 3. Return exec result directly (no log scraping needed).
|
||||
|
||||
let podName =
|
||||
typeof lease.metadata?.podName === "string" && lease.metadata.podName
|
||||
? lease.metadata.podName
|
||||
: null;
|
||||
|
||||
// Always wait for the Sandbox pod to become Ready before exec; this doubles as a health check.
|
||||
try {
|
||||
await sandboxCrOrchestrator.waitForCompletion(
|
||||
clients,
|
||||
namespace,
|
||||
lease.providerLeaseId,
|
||||
{ timeoutMs: effectiveTimeoutMs, pollMs: 2000 },
|
||||
);
|
||||
} catch (err) {
|
||||
if (err instanceof SandboxCrTimeoutError) {
|
||||
return {
|
||||
exitCode: null,
|
||||
timedOut: true,
|
||||
stdout: "",
|
||||
stderr: `Sandbox pod did not become Ready within ${effectiveTimeoutMs}ms`,
|
||||
metadata: {
|
||||
provider: "kubernetes",
|
||||
backend: "sandbox-cr",
|
||||
namespace,
|
||||
sandboxName: lease.providerLeaseId,
|
||||
},
|
||||
};
|
||||
}
|
||||
throw err;
|
||||
}
|
||||
|
||||
// Resolve pod name (may now be populated in Sandbox status).
|
||||
if (!podName) {
|
||||
podName = await sandboxCrOrchestrator.findPod(
|
||||
clients,
|
||||
namespace,
|
||||
lease.providerLeaseId,
|
||||
);
|
||||
}
|
||||
|
||||
if (!podName) {
|
||||
return {
|
||||
exitCode: 1,
|
||||
timedOut: false,
|
||||
stdout: "",
|
||||
stderr: "Sandbox pod is Ready but podName could not be resolved.",
|
||||
metadata: {
|
||||
provider: "kubernetes",
|
||||
backend: "sandbox-cr",
|
||||
namespace,
|
||||
sandboxName: lease.providerLeaseId,
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
// Build the command to exec. Prefer params.command; otherwise join
|
||||
// params.args with spaces. Either way, run it via a login shell so profile scripts run.
|
||||
const rawCommand =
|
||||
typeof params.command === "string" && params.command.trim().length > 0
|
||||
? params.command
|
||||
: params.args?.join(" ") ?? "";
|
||||
|
||||
const execCommand = rawCommand.length > 0
|
||||
? ["/bin/sh", "-lc", rawCommand]
|
||||
: ["/bin/sh", "-l"];
|
||||
|
||||
const execResult = await execInPod(
|
||||
kc,
|
||||
namespace,
|
||||
podName,
|
||||
"agent",
|
||||
execCommand,
|
||||
typeof params.stdin === "string" ? params.stdin : undefined,
|
||||
);
|
||||
|
||||
return {
|
||||
exitCode: execResult.exitCode,
|
||||
timedOut: false,
|
||||
stdout: execResult.stdout,
|
||||
stderr: execResult.stderr,
|
||||
metadata: {
|
||||
provider: "kubernetes",
|
||||
backend: "sandbox-cr",
|
||||
namespace,
|
||||
sandboxName: lease.providerLeaseId,
|
||||
podName,
|
||||
},
|
||||
};
|
||||
} else {
|
||||
// ── Job backend (legacy / stable fallback) ──────────────────────────────
|
||||
// The container entrypoint is baked into the Job spec (Tini + paperclip-agent-shim).
|
||||
// We do NOT re-exec command/args — instead we wait for the Job to finish
|
||||
// and collect its logs.
|
||||
//
|
||||
// params.command / params.args / params.stdin are intentionally ignored.
|
||||
|
||||
let status;
|
||||
let timedOut = false;
|
||||
try {
|
||||
status = await jobOrchestrator.waitForCompletion(
|
||||
clients,
|
||||
namespace,
|
||||
lease.providerLeaseId,
|
||||
{ timeoutMs: effectiveTimeoutMs, pollMs: 2000 },
|
||||
);
|
||||
} catch (err) {
|
||||
if (err instanceof JobTimeoutError) {
|
||||
timedOut = true;
|
||||
status = null;
|
||||
} else {
|
||||
throw err;
|
||||
}
|
||||
}
|
||||
|
||||
// Collect logs from the pod.
|
||||
const podName =
|
||||
typeof lease.metadata?.podName === "string"
|
||||
? lease.metadata.podName
|
||||
: await jobOrchestrator.findPod(
|
||||
clients,
|
||||
namespace,
|
||||
lease.providerLeaseId,
|
||||
);
|
||||
|
||||
const stdoutChunks: string[] = [];
|
||||
const stderrChunks: string[] = [];
|
||||
|
||||
if (podName) {
|
||||
await jobOrchestrator.streamLogs(
|
||||
clients,
|
||||
namespace,
|
||||
podName,
|
||||
async (stream, text) => {
|
||||
if (stream === "stdout") stdoutChunks.push(text);
|
||||
else stderrChunks.push(text);
|
||||
},
|
||||
);
|
||||
}
|
||||
|
||||
return {
|
||||
exitCode: timedOut ? null : status?.phase === "Succeeded" ? 0 : 1,
|
||||
timedOut,
|
||||
stdout: stdoutChunks.join(""),
|
||||
stderr: stderrChunks.join(""),
|
||||
metadata: {
|
||||
provider: "kubernetes",
|
||||
backend: "job",
|
||||
namespace,
|
||||
jobName: lease.providerLeaseId,
|
||||
podName: podName ?? null,
|
||||
phase: status?.phase ?? null,
|
||||
},
|
||||
};
|
||||
}
|
||||
},
|
||||
});
|
||||
|
||||
export default plugin;
|
||||
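The command-building step in the sandbox-cr branch of `onEnvironmentExecute` can be read in isolation as a small pure function. This is a sketch, not the plugin's code: `buildExecCommand` is a hypothetical name, and the behaviour is inferred from the branch above.

```typescript
// Hypothetical helper that mirrors the rawCommand / execCommand logic above.
function buildExecCommand(command?: string | null, args?: string[] | null): string[] {
  // Prefer a non-blank command string; otherwise fall back to space-joined args.
  const raw =
    typeof command === "string" && command.trim().length > 0
      ? command
      : (args ?? []).join(" ");
  // Always go through a login shell so profile scripts run.
  return raw.length > 0 ? ["/bin/sh", "-lc", raw] : ["/bin/sh", "-l"];
}
```

Note that joining `args` with spaces does no quoting, so an argument containing whitespace or shell metacharacters is re-split or interpreted by `/bin/sh`. Callers that need exact argv semantics would have to quote each argument first.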
@@ -0,0 +1,79 @@
|
||||
/**
|
||||
* Exec a command inside a running pod container using the Kubernetes exec API.
|
||||
*
|
||||
* Uses @kubernetes/client-node's Exec class, which opens a WebSocket to the
|
||||
* kube-apiserver and streams stdout/stderr. The statusCallback receives a V1Status
|
||||
* with status="Success" or status="Failure" + details.causes[{reason:"ExitCode"}].
|
||||
*
|
||||
* NOTE: tty=false so stdout and stderr arrive on separate channels. If tty=true
|
||||
* were used, they would be merged onto stdout and the exit code would not be
|
||||
* reliable from the status callback on older cluster versions.
|
||||
*/
|
||||
|
||||
import { Exec } from "@kubernetes/client-node";
|
||||
import { PassThrough } from "node:stream";
|
||||
import type { KubeConfig } from "@kubernetes/client-node";
|
||||
|
||||
export async function execInPod(
|
||||
kc: KubeConfig,
|
||||
namespace: string,
|
||||
podName: string,
|
||||
containerName: string,
|
||||
command: string[],
|
||||
stdin?: string,
|
||||
): Promise<{ exitCode: number; stdout: string; stderr: string }> {
|
||||
const exec = new Exec(kc);
|
||||
const stdoutStream = new PassThrough();
|
||||
const stderrStream = new PassThrough();
|
||||
|
||||
// If stdin is provided build a readable stream from it; the Exec API accepts
|
||||
// a Readable | null for stdin.
|
||||
const stdinStream: import("node:stream").Readable | null = stdin
|
||||
? PassThrough.from(stdin)
|
||||
: null;
|
||||
|
||||
let stdoutData = "";
|
||||
let stderrData = "";
|
||||
|
||||
stdoutStream.on("data", (chunk: Buffer) => {
|
||||
stdoutData += chunk.toString("utf-8");
|
||||
});
|
||||
stderrStream.on("data", (chunk: Buffer) => {
|
||||
stderrData += chunk.toString("utf-8");
|
||||
});
|
||||
|
||||
return await new Promise<{ exitCode: number; stdout: string; stderr: string }>(
|
||||
(resolve, reject) => {
|
||||
exec
|
||||
.exec(
|
||||
namespace,
|
||||
podName,
|
||||
containerName,
|
||||
command,
|
||||
stdoutStream,
|
||||
stderrStream,
|
||||
stdinStream,
|
||||
false, // tty=false: keep stdout/stderr on separate channels
|
||||
(status) => {
|
||||
// status.status is "Success" | "Failure"
|
||||
if (status.status === "Success") {
|
||||
resolve({ exitCode: 0, stdout: stdoutData, stderr: stderrData });
|
||||
return;
|
||||
}
|
||||
// On failure, the exit code surfaces via
|
||||
// status.details?.causes[].{reason:"ExitCode", message:"<N>"}
|
||||
const causes = status.details?.causes ?? [];
|
||||
const exitCodeCause = causes.find(
|
||||
(c: { reason?: string; message?: string }) =>
|
||||
c.reason === "ExitCode",
|
||||
);
|
||||
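          // Fall back to 1 when the cause is missing or its message is not numeric,
          // so a failed command never reports NaN.
          const parsedExitCode = Number.parseInt(exitCodeCause?.message ?? "", 10);
          const exitCode = Number.isNaN(parsedExitCode) ? 1 : parsedExitCode;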
|
||||
resolve({ exitCode, stdout: stdoutData, stderr: stderrData });
|
||||
},
|
||||
)
|
||||
.catch(reject);
|
||||
},
|
||||
);
|
||||
}
|
||||
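The exit-code handling in the status callback above can be pulled out into a pure function, which makes the fallback cases easy to exercise without a cluster. This is a minimal sketch, not part of the plugin: `ExecStatusLike` and `exitCodeFromStatus` are hypothetical names, and the interface only models the fields the callback reads.

```typescript
// Hypothetical shape: only the fields the status callback reads.
interface ExecStatusLike {
  status?: string;
  details?: { causes?: { reason?: string; message?: string }[] };
}

// Hypothetical helper: maps an exec status object to a process exit code.
function exitCodeFromStatus(status: ExecStatusLike): number {
  if (status.status === "Success") return 0;
  // On failure, look for the cause with reason "ExitCode".
  const cause = (status.details?.causes ?? []).find((c) => c.reason === "ExitCode");
  const parsed = Number.parseInt(cause?.message ?? "", 10);
  // Missing or non-numeric message: report a generic failure code of 1.
  return Number.isNaN(parsed) ? 1 : parsed;
}

const failed: ExecStatusLike = {
  status: "Failure",
  details: { causes: [{ reason: "ExitCode", message: "127" }] },
};
console.log(exitCodeFromStatus(failed));
```

Keeping the parsing separate from the stream plumbing means the "no cause" and "garbled message" paths can be unit-tested directly.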
@@ -0,0 +1,94 @@
|
||||
export interface BuildJobManifestInput {
|
||||
namespace: string;
|
||||
jobName: string;
|
||||
adapterType: string;
|
||||
image: string;
|
||||
envSecretName: string;
|
||||
serviceAccountName: string;
|
||||
labels: Record<string, string>;
|
||||
resources: {
|
||||
requests?: { cpu?: string; memory?: string };
|
||||
limits?: { cpu?: string; memory?: string };
|
||||
};
|
||||
runtimeClassName?: string;
|
||||
activeDeadlineSec: number;
|
||||
ttlSecondsAfterFinished: number;
|
||||
imagePullSecrets?: string[];
|
||||
}
|
||||
|
||||
export function buildJobManifest(input: BuildJobManifestInput): Record<string, unknown> {
|
||||
const podLabels = {
|
||||
...input.labels,
|
||||
"paperclip.io/role": "agent",
|
||||
};
|
||||
return {
|
||||
apiVersion: "batch/v1",
|
||||
kind: "Job",
|
||||
metadata: {
|
||||
name: input.jobName,
|
||||
namespace: input.namespace,
|
||||
labels: { ...input.labels },
|
||||
},
|
||||
spec: {
|
||||
backoffLimit: 0,
|
||||
ttlSecondsAfterFinished: input.ttlSecondsAfterFinished,
|
||||
activeDeadlineSeconds: input.activeDeadlineSec,
|
||||
template: {
|
||||
metadata: { labels: podLabels },
|
||||
spec: {
|
||||
serviceAccountName: input.serviceAccountName,
|
||||
// Agent containers call back to paperclip-server via HTTPS egress;
|
||||
// they never call the Kubernetes API, so mounting an SA token is
|
||||
// unnecessary attack surface.
|
||||
automountServiceAccountToken: false,
|
||||
restartPolicy: "Never",
|
||||
...(input.runtimeClassName ? { runtimeClassName: input.runtimeClassName } : {}),
|
||||
...(input.imagePullSecrets && input.imagePullSecrets.length > 0
|
||||
? { imagePullSecrets: input.imagePullSecrets.map((name) => ({ name })) }
|
||||
: {}),
|
||||
securityContext: {
|
||||
runAsNonRoot: true,
|
||||
runAsUser: 1000,
|
||||
runAsGroup: 1000,
|
||||
fsGroup: 1000,
|
||||
fsGroupChangePolicy: "OnRootMismatch",
|
||||
seccompProfile: { type: "RuntimeDefault" },
|
||||
},
|
||||
containers: [
|
||||
{
|
||||
name: "agent",
|
||||
image: input.image,
|
||||
imagePullPolicy: "IfNotPresent",
|
||||
command: ["/usr/bin/tini", "--", "/usr/local/bin/paperclip-agent-shim"],
|
||||
envFrom: [{ secretRef: { name: input.envSecretName } }],
|
||||
securityContext: {
|
||||
runAsNonRoot: true,
|
||||
runAsUser: 1000,
|
||||
runAsGroup: 1000,
|
||||
readOnlyRootFilesystem: true,
|
||||
allowPrivilegeEscalation: false,
|
||||
capabilities: { drop: ["ALL"] },
|
||||
},
|
||||
resources: {
|
||||
requests: input.resources.requests ?? { cpu: "250m", memory: "512Mi" },
|
||||
limits: input.resources.limits ?? { cpu: "2", memory: "4Gi" },
|
||||
},
|
||||
volumeMounts: [
|
||||
{ name: "workspace", mountPath: "/workspace" },
|
||||
{ name: "home", mountPath: "/home/paperclip" },
|
||||
{ name: "cache", mountPath: "/home/paperclip/.cache" },
|
||||
{ name: "tmp", mountPath: "/tmp" },
|
||||
],
|
||||
},
|
||||
],
|
||||
volumes: [
|
||||
{ name: "workspace", emptyDir: { sizeLimit: "8Gi" } },
|
||||
{ name: "home", emptyDir: { sizeLimit: "1Gi" } },
|
||||
{ name: "cache", emptyDir: { sizeLimit: "1Gi" } },
|
||||
{ name: "tmp", emptyDir: { sizeLimit: "2Gi" } },
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
};
|
||||
}
|
||||
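The hardening settings in the manifest above (no service account token, non-root, read-only root filesystem, no privilege escalation, all capabilities dropped, `RuntimeDefault` seccomp) can be checked mechanically. The sketch below is an illustration under stated assumptions: `PodSpecLike`, `ContainerLike`, and `findHardeningGaps` are hypothetical names, and the checks cover only the settings this builder sets, not the full Pod Security Standards.

```typescript
// Hypothetical, trimmed-down shapes: only the fields checked below.
interface ContainerLike {
  name: string;
  securityContext?: {
    runAsNonRoot?: boolean;
    readOnlyRootFilesystem?: boolean;
    allowPrivilegeEscalation?: boolean;
    capabilities?: { drop?: string[] };
  };
}

interface PodSpecLike {
  automountServiceAccountToken?: boolean;
  securityContext?: { runAsNonRoot?: boolean; seccompProfile?: { type?: string } };
  containers: ContainerLike[];
}

// Hypothetical helper: returns one message per missing hardening setting.
function findHardeningGaps(spec: PodSpecLike): string[] {
  const gaps: string[] = [];
  if (spec.automountServiceAccountToken !== false) {
    gaps.push("pod: service account token would be mounted");
  }
  if (spec.securityContext?.seccompProfile?.type !== "RuntimeDefault") {
    gaps.push("pod: seccompProfile is not RuntimeDefault");
  }
  for (const container of spec.containers) {
    const sc = container.securityContext ?? {};
    // The container-level value wins; otherwise the pod-level value applies.
    const nonRoot = sc.runAsNonRoot ?? spec.securityContext?.runAsNonRoot;
    if (nonRoot !== true) gaps.push(`${container.name}: runAsNonRoot is not true`);
    if (sc.readOnlyRootFilesystem !== true) gaps.push(`${container.name}: root filesystem is writable`);
    if (sc.allowPrivilegeEscalation !== false) gaps.push(`${container.name}: privilege escalation allowed`);
    if (!(sc.capabilities?.drop ?? []).includes("ALL")) gaps.push(`${container.name}: capabilities not dropped`);
  }
  return gaps;
}

// Same settings as the "agent" container in the manifest above.
const hardened: PodSpecLike = {
  automountServiceAccountToken: false,
  securityContext: { runAsNonRoot: true, seccompProfile: { type: "RuntimeDefault" } },
  containers: [
    {
      name: "agent",
      securityContext: {
        runAsNonRoot: true,
        readOnlyRootFilesystem: true,
        allowPrivilegeEscalation: false,
        capabilities: { drop: ["ALL"] },
      },
    },
  ],
};
console.log(findHardeningGaps(hardened));
```

A check like this could run in a unit test against the builder's output, so a later edit that drops one of the settings fails fast.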
@@ -0,0 +1,136 @@
|
||||
/**
|
||||
* Builds a kubernetes-sigs/agent-sandbox Sandbox CR manifest.
|
||||
*
|
||||
* The Sandbox CR creates a long-lived pod (sleep infinity entrypoint) into
|
||||
* which paperclip-server can exec arbitrary commands. This solves the
|
||||
* architectural mismatch with the batch/v1 Job backend, which only supports
|
||||
* a single one-shot entrypoint — not the multi-command adapter-install pattern
|
||||
* used by paperclip-server.
|
||||
*
|
||||
* Security baseline is identical to buildJobManifest (pod-spec-builder.ts):
|
||||
* non-root, drop ALL caps, read-only rootFS, Tini PID 1, seccomp
|
||||
* RuntimeDefault, fsGroupChangePolicy OnRootMismatch, automountSAToken=false.
|
||||
*
|
||||
 * NOTE: no ownerReferences are set on the Sandbox CR. An owner must be a
 * Kubernetes object in the same namespace (or cluster-scoped), and
 * paperclip-server is not one; it may run outside the cluster entirely. The
 * release path is an explicit delete via sandboxCrOrchestrator.release().
|
||||
*/
|
||||
|
||||
export interface BuildSandboxCrManifestInput {
|
||||
namespace: string;
|
||||
sandboxName: string;
|
||||
adapterType: string;
|
||||
image: string;
|
||||
envSecretName: string;
|
||||
serviceAccountName: string;
|
||||
labels: Record<string, string>;
|
||||
resources: {
|
||||
requests?: { cpu?: string; memory?: string };
|
||||
limits?: { cpu?: string; memory?: string };
|
||||
};
|
||||
runtimeClassName?: string;
|
||||
imagePullSecrets?: string[];
|
||||
}
|
||||
|
||||
export function buildSandboxCrManifest(
|
||||
input: BuildSandboxCrManifestInput,
|
||||
): Record<string, unknown> {
|
||||
const podLabels: Record<string, string> = {
|
||||
...input.labels,
|
||||
"paperclip.io/role": "agent",
|
||||
};
|
||||
return {
|
||||
apiVersion: "agents.x-k8s.io/v1alpha1",
|
||||
kind: "Sandbox",
|
||||
metadata: {
|
||||
name: input.sandboxName,
|
||||
namespace: input.namespace,
|
||||
labels: { ...input.labels },
|
||||
// No ownerReferences: paperclip-server is out-of-cluster. Release is
|
||||
// explicit delete.
|
||||
},
|
||||
spec: {
|
||||
podTemplate: {
|
||||
metadata: {
|
||||
labels: podLabels,
|
||||
},
|
||||
spec: {
|
||||
serviceAccountName: input.serviceAccountName,
|
||||
// Agent containers call back to paperclip-server via HTTPS egress;
|
||||
// they never call the Kubernetes API, so mounting an SA token is
|
||||
// unnecessary attack surface.
|
||||
automountServiceAccountToken: false,
|
||||
// Sandbox controller requires restartPolicy: Always so the pod
|
||||
// stays running between exec calls.
|
||||
restartPolicy: "Always",
|
||||
...(input.runtimeClassName
|
||||
? { runtimeClassName: input.runtimeClassName }
|
||||
: {}),
|
||||
...(input.imagePullSecrets && input.imagePullSecrets.length > 0
|
||||
? {
|
||||
imagePullSecrets: input.imagePullSecrets.map((name) => ({
|
||||
name,
|
||||
})),
|
||||
}
|
||||
: {}),
|
||||
securityContext: {
|
||||
runAsNonRoot: true,
|
||||
runAsUser: 1000,
|
||||
runAsGroup: 1000,
|
||||
fsGroup: 1000,
|
||||
fsGroupChangePolicy: "OnRootMismatch",
|
||||
seccompProfile: { type: "RuntimeDefault" },
|
||||
},
|
||||
containers: [
|
||||
{
|
||||
name: "agent",
|
||||
image: input.image,
|
||||
imagePullPolicy: "IfNotPresent",
|
||||
// sleep infinity keeps the pod running; paperclip-server execs
|
||||
// commands into it via Kubernetes exec API. Tini as PID 1 for
|
||||
// proper signal forwarding and zombie reaping.
|
||||
command: [
|
||||
"/usr/bin/tini",
|
||||
"--",
|
||||
"/bin/sh",
|
||||
"-c",
|
||||
"sleep infinity",
|
||||
],
|
||||
envFrom: [{ secretRef: { name: input.envSecretName } }],
|
||||
securityContext: {
|
||||
runAsNonRoot: true,
|
||||
runAsUser: 1000,
|
||||
runAsGroup: 1000,
|
||||
readOnlyRootFilesystem: true,
|
||||
allowPrivilegeEscalation: false,
|
||||
capabilities: { drop: ["ALL"] },
|
||||
},
|
||||
resources: {
|
||||
requests: input.resources.requests ?? {
|
||||
cpu: "250m",
|
||||
memory: "512Mi",
|
||||
},
|
||||
limits: input.resources.limits ?? {
|
||||
cpu: "2",
|
||||
memory: "4Gi",
|
||||
},
|
||||
},
|
||||
volumeMounts: [
|
||||
{ name: "workspace", mountPath: "/workspace" },
|
||||
{ name: "home", mountPath: "/home/paperclip" },
|
||||
{ name: "cache", mountPath: "/home/paperclip/.cache" },
|
||||
{ name: "tmp", mountPath: "/tmp" },
|
||||
],
|
||||
},
|
||||
],
|
||||
volumes: [
|
||||
{ name: "workspace", emptyDir: { sizeLimit: "8Gi" } },
|
||||
{ name: "home", emptyDir: { sizeLimit: "1Gi" } },
|
||||
{ name: "cache", emptyDir: { sizeLimit: "1Gi" } },
|
||||
{ name: "tmp", emptyDir: { sizeLimit: "2Gi" } },
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
};
|
||||
}
|
||||
@@ -0,0 +1,288 @@
|
||||
/**
|
||||
* SandboxOrchestrator implementation backed by the kubernetes-sigs/agent-sandbox
|
||||
* Sandbox CRD (agents.x-k8s.io/v1alpha1).
|
||||
*
|
||||
* The Sandbox CR creates a long-lived pod that paperclip-server can exec into
|
||||
* for multi-command adapter-install workflows — the key architectural win over
|
||||
* the batch/v1 Job backend.
|
||||
*
|
||||
* Key semantic differences from jobOrchestrator:
|
||||
* - claim() creates a Sandbox CR via CustomObjectsApi instead of a batch Job
|
||||
* - getStatus() maps Sandbox phase (Pending|Ready|Terminating|Failed) to SandboxStatus
|
||||
* - findPod() reads status.podName from the Sandbox CR (falls back to label query)
|
||||
* - waitForCompletion() means "wait until pod is Ready to exec" NOT "wait until
|
||||
* workload finishes". The Sandbox pod runs sleep infinity; execution completion
|
||||
* is tracked by the individual execInPod() calls.
|
||||
* - release() deletes the Sandbox CR with Foreground propagation (controller
|
||||
* tears down the underlying pod).
|
||||
*
|
||||
* NOTE: streamLogs() is provided for interface conformance but is limited —
|
||||
* the sleep-infinity pod has no meaningful stdout. Callers in execute mode
|
||||
* should use execInPod() and capture its stdout/stderr directly.
|
||||
*/
|
||||
|
||||
import type { KubeClients } from "./kube-client.js";
|
||||
import type { SandboxOrchestrator, SandboxStatus } from "./sandbox-orchestrator.js";
|
||||
|
||||
const SANDBOX_GROUP = "agents.x-k8s.io";
|
||||
const SANDBOX_VERSION = "v1alpha1";
|
||||
const SANDBOX_PLURAL = "sandboxes";
|
||||
|
||||
export class SandboxCrTimeoutError extends Error {
|
||||
constructor(namespace: string, name: string, timeoutMs: number) {
|
||||
super(
|
||||
`Sandbox ${namespace}/${name} did not reach Ready phase within ${timeoutMs}ms`,
|
||||
);
|
||||
this.name = "SandboxCrTimeoutError";
|
||||
}
|
||||
}
|
||||
|
||||
function sleep(ms: number): Promise<void> {
|
||||
return new Promise((resolve) => setTimeout(resolve, ms));
|
||||
}
|
||||
|
||||
/**
|
||||
* Map a Sandbox CR status.phase value to our SandboxStatus shape.
|
||||
* Sandbox phases: Pending | Ready | Terminating | Failed
|
||||
*/
|
||||
function mapSandboxPhase(
|
||||
cr: Record<string, unknown>,
|
||||
): SandboxStatus {
|
||||
const status = (cr.status as Record<string, unknown>) ?? {};
|
||||
const phase = (status.phase as string) ?? "Pending";
|
||||
|
||||
switch (phase) {
|
||||
case "Ready":
|
||||
return {
|
||||
phase: "Running", // SandboxStatus.phase uses Job semantics; "Running" = active pod
|
||||
complete: false,
|
||||
active: 1,
|
||||
succeeded: 0,
|
||||
failed: 0,
|
||||
};
|
||||
case "Terminating":
|
||||
return {
|
||||
phase: "Running",
|
||||
complete: false,
|
||||
active: 0,
|
||||
succeeded: 0,
|
||||
failed: 0,
|
||||
reason: "Terminating",
|
||||
};
|
||||
case "Failed": {
|
||||
const conditions = (status.conditions as { type?: string; reason?: string; message?: string }[]) ?? [];
|
||||
const failedCond = conditions.find((c) => c.type === "Failed");
|
||||
return {
|
||||
phase: "Failed",
|
||||
complete: false,
|
||||
active: 0,
|
||||
succeeded: 0,
|
||||
failed: 1,
|
||||
reason: failedCond?.reason,
|
||||
message: failedCond?.message,
|
||||
};
|
||||
}
|
||||
default:
|
||||
// "Pending" or unknown
|
||||
return {
|
||||
phase: "Pending",
|
||||
complete: false,
|
||||
active: 0,
|
||||
succeeded: 0,
|
||||
failed: 0,
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
export async function createSandboxCr(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
manifest: Record<string, unknown>,
|
||||
): Promise<{ uid: string }> {
|
||||
const result = await clients.custom.createNamespacedCustomObject({
|
||||
group: SANDBOX_GROUP,
|
||||
version: SANDBOX_VERSION,
|
||||
namespace,
|
||||
plural: SANDBOX_PLURAL,
|
||||
body: manifest,
|
||||
});
|
||||
const uid = (result as { metadata?: { uid?: string } }).metadata?.uid;
|
||||
if (!uid) throw new Error("Sandbox CR created without a UID");
|
||||
return { uid };
|
||||
}
|
||||
|
||||
export async function getSandboxCrStatus(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
name: string,
|
||||
): Promise<SandboxStatus> {
|
||||
const result = await clients.custom.getNamespacedCustomObject({
|
||||
group: SANDBOX_GROUP,
|
||||
version: SANDBOX_VERSION,
|
||||
namespace,
|
||||
plural: SANDBOX_PLURAL,
|
||||
name,
|
||||
});
|
||||
return mapSandboxPhase(result as Record<string, unknown>);
|
||||
}
|
||||
|
||||
/**
 * Returns the name of the pod backing a Sandbox CR, or null if none is found.
 * Primary: read status.podName from the CR (set by the controller once ready).
 * Fallback: list pods carrying the paperclip.io/managed-by label, then keep
 * those whose name starts with the sandbox name or whose
 * agents.x-k8s.io/sandbox-name label equals it, preferring a Running pod.
 */
|
||||
export async function findPodForSandbox(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
name: string,
|
||||
): Promise<string | null> {
|
||||
// Primary: read status.podName from the Sandbox CR
|
||||
const cr = await clients.custom.getNamespacedCustomObject({
|
||||
group: SANDBOX_GROUP,
|
||||
version: SANDBOX_VERSION,
|
||||
namespace,
|
||||
plural: SANDBOX_PLURAL,
|
||||
name,
|
||||
}) as Record<string, unknown>;
|
||||
|
||||
const status = (cr.status as Record<string, unknown>) ?? {};
|
||||
const podName = status.podName as string | undefined;
|
||||
if (podName && podName.trim().length > 0) {
|
||||
return podName;
|
||||
}
|
||||
|
||||
  // Fallback: list the pods carrying this plugin's managed-by label; they are
  // narrowed to this sandbox below by name prefix or sandbox-name label.
|
||||
const result = await clients.core.listNamespacedPod({
|
||||
namespace,
|
||||
labelSelector: `paperclip.io/managed-by=paperclip-k8s-plugin`,
|
||||
});
|
||||
const items =
|
||||
(
|
||||
(
|
||||
result as {
|
||||
items?: {
|
||||
metadata?: { name?: string; labels?: Record<string, string> };
|
||||
status?: { phase?: string };
|
||||
}[];
|
||||
}
|
||||
).items
|
||||
) ?? [];
|
||||
|
||||
// Filter to pods that belong to this sandbox by name prefix or label
|
||||
const matching = items.filter((p) => {
|
||||
const podMeta = p.metadata ?? {};
|
||||
const labels = podMeta.labels ?? {};
|
||||
// The sandbox controller may label pods differently; try matching by name prefix
|
||||
return (
|
||||
podMeta.name?.startsWith(name) ||
|
||||
labels["agents.x-k8s.io/sandbox-name"] === name
|
||||
);
|
||||
});
|
||||
|
||||
const running = matching.find((p) => p.status?.phase === "Running");
|
||||
return (running ?? matching[0])?.metadata?.name ?? null;
|
||||
}
|
||||
|
||||
export async function streamSandboxLogs(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
podName: string,
|
||||
onChunk: (stream: "stdout" | "stderr", text: string) => Promise<void>,
|
||||
): Promise<void> {
|
||||
// V1 limitation: the Pod log API returns the container's combined log stream. The
|
||||
// sleep-infinity pod will have minimal output; this is provided for interface
|
||||
// conformance. For actual command output, use execInPod() directly.
|
||||
const result = await clients.core.readNamespacedPodLog({
|
||||
namespace,
|
||||
name: podName,
|
||||
});
|
||||
const text =
|
||||
typeof result === "string"
|
||||
? result
|
||||
: typeof (result as { body?: unknown })?.body === "string"
|
||||
? (result as { body: string }).body
|
||||
: "";
|
||||
if (text.length > 0) await onChunk("stdout", text);
|
||||
}
|
||||
|
||||
export async function deleteSandboxCr(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
name: string,
|
||||
): Promise<void> {
|
||||
await clients.custom.deleteNamespacedCustomObject({
|
||||
group: SANDBOX_GROUP,
|
||||
version: SANDBOX_VERSION,
|
||||
namespace,
|
||||
plural: SANDBOX_PLURAL,
|
||||
name,
|
||||
propagationPolicy: "Foreground",
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Wait until the Sandbox CR's pod reaches Ready phase (i.e., the pod is up and
|
||||
* exec-able). This is NOT waiting for a workload to finish — the Sandbox pod
|
||||
* runs sleep infinity indefinitely. Execution completion is tracked by the
|
||||
* individual execInPod() calls.
|
||||
*
|
||||
* Throws SandboxCrTimeoutError if Ready is not reached within timeoutMs.
|
||||
* Throws if the Sandbox transitions to Failed.
|
||||
*/
|
||||
export async function waitForSandboxReady(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
name: string,
|
||||
opts: { timeoutMs: number; pollMs?: number } = {
|
||||
timeoutMs: 120_000,
|
||||
pollMs: 2000,
|
||||
},
|
||||
): Promise<SandboxStatus> {
|
||||
const deadline = Date.now() + opts.timeoutMs;
|
||||
const pollMs = opts.pollMs ?? 2000;
|
||||
|
||||
while (Date.now() < deadline) {
|
||||
const cr = await clients.custom.getNamespacedCustomObject({
|
||||
group: SANDBOX_GROUP,
|
||||
version: SANDBOX_VERSION,
|
||||
namespace,
|
||||
plural: SANDBOX_PLURAL,
|
||||
name,
|
||||
}) as Record<string, unknown>;
|
||||
|
||||
const status = (cr.status as Record<string, unknown>) ?? {};
|
||||
const phase = (status.phase as string) ?? "Pending";
|
||||
|
||||
if (phase === "Ready") {
|
||||
return mapSandboxPhase(cr);
|
||||
}
|
||||
if (phase === "Failed") {
|
||||
const mapped = mapSandboxPhase(cr);
|
||||
throw new Error(
|
||||
`Sandbox ${namespace}/${name} failed: ${mapped.reason ?? "unknown reason"} — ${mapped.message ?? ""}`,
|
||||
);
|
||||
}
|
||||
// Pending or Terminating — keep polling
|
||||
await sleep(pollMs);
|
||||
}
|
||||
|
||||
throw new SandboxCrTimeoutError(namespace, name, opts.timeoutMs);
|
||||
}
|
||||
|
||||
/**
|
||||
* Sandbox CR-backed conformance to SandboxOrchestrator.
|
||||
*
|
||||
* waitForCompletion semantics change: for this backend, "completion" means
|
||||
* "pod is up and Ready to exec into" — NOT "workload finished". The actual
|
||||
* command execution and its completion is handled by execInPod().
|
||||
*/
|
||||
export const sandboxCrOrchestrator: SandboxOrchestrator = {
|
||||
claim: createSandboxCr,
|
||||
getStatus: getSandboxCrStatus,
|
||||
findPod: findPodForSandbox,
|
||||
streamLogs: streamSandboxLogs,
|
||||
release: deleteSandboxCr,
|
||||
waitForCompletion: waitForSandboxReady,
|
||||
};
|
||||
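One caveat of the name-prefix fallback in `findPodForSandbox` is that a bare `startsWith` also matches pods of a sandbox whose name merely extends this one. The sketch below reproduces the predicate and compares it with a stricter variant. It is illustrative only: `PodLike`, `matchesByPrefix`, `matchesStrictly`, and the pod names are hypothetical, and the stricter rule assumes pod names are either the sandbox name itself or the sandbox name followed by `-`, which this source does not confirm.

```typescript
interface PodLike {
  name: string;
  labels: Record<string, string>;
}

const SANDBOX_NAME_LABEL = "agents.x-k8s.io/sandbox-name";

// Same predicate as the fallback filter: name prefix OR sandbox-name label.
function matchesByPrefix(pod: PodLike, sandboxName: string): boolean {
  return pod.name.startsWith(sandboxName) || pod.labels[SANDBOX_NAME_LABEL] === sandboxName;
}

// Hypothetical stricter variant: exact label, exact name, or "<name>-" prefix.
function matchesStrictly(pod: PodLike, sandboxName: string): boolean {
  return (
    pod.labels[SANDBOX_NAME_LABEL] === sandboxName ||
    pod.name === sandboxName ||
    pod.name.startsWith(`${sandboxName}-`)
  );
}

// Hypothetical pods belonging to two different sandboxes.
const pods: PodLike[] = [
  { name: "run-1", labels: { [SANDBOX_NAME_LABEL]: "run-1" } },
  { name: "run-10", labels: { [SANDBOX_NAME_LABEL]: "run-10" } },
];

// The loose filter returns both pods for sandbox "run-1"; the strict one returns only "run-1".
const loose = pods.filter((p) => matchesByPrefix(p, "run-1")).map((p) => p.name);
const strict = pods.filter((p) => matchesStrictly(p, "run-1")).map((p) => p.name);
console.log(loose, strict);
```

The stricter rule can still collide when one sandbox name is a dash-separated prefix of another, so the label match remains the more reliable signal where the controller sets it.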
@@ -0,0 +1,68 @@
|
||||
import type { KubeClients } from "./kube-client.js";
|
||||
|
||||
export interface SandboxStatus {
|
||||
phase: "Pending" | "Running" | "Succeeded" | "Failed";
|
||||
complete: boolean;
|
||||
active: number;
|
||||
succeeded: number;
|
||||
failed: number;
|
||||
reason?: string;
|
||||
message?: string;
|
||||
}
|
||||
|
||||
/**
 * Abstract interface over a sandbox runtime backend. Two implementations
 * exist: the Job-backed backend (job-orchestrator.ts) and the Sandbox CR
 * backend (sandboxCrOrchestrator) built on kubernetes-sigs/agent-sandbox.
 * Further backends slot in by exporting an object conforming to this shape,
 * e.g. a Kata-FC warm-pool backend that additionally implements the optional
 * pause/resume slots.
 */
|
||||
export interface SandboxOrchestrator {
|
||||
/** Provision the sandbox. Returns the runtime's stable UID. */
|
||||
claim(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
manifest: Record<string, unknown>,
|
||||
): Promise<{ uid: string }>;
|
||||
|
||||
/** Read current lifecycle phase. */
|
||||
getStatus(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
name: string,
|
||||
): Promise<SandboxStatus>;
|
||||
|
||||
/** Locate the pod backing this sandbox (or null if none exists yet). */
|
||||
findPod(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
name: string,
|
||||
): Promise<string | null>;
|
||||
|
||||
/** Read logs from the sandbox's pod. V1: post-completion read. */
|
||||
streamLogs(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
podName: string,
|
||||
onChunk: (stream: "stdout" | "stderr", text: string) => Promise<void>,
|
||||
): Promise<void>;
|
||||
|
||||
/** Tear down the sandbox. Implementations MUST cascade-delete child resources. */
|
||||
release(clients: KubeClients, namespace: string, name: string): Promise<void>;
|
||||
|
||||
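  /**
   * Block until the backend's completion condition holds, or throw on timeout.
   * Job backend: phase is Succeeded or Failed. Sandbox CR backend: the pod is
   * Ready to exec into (the workload itself is tracked by execInPod()).
   */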
|
||||
waitForCompletion(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
name: string,
|
||||
opts: { timeoutMs: number; pollMs?: number },
|
||||
): Promise<SandboxStatus>;
|
||||
|
||||
// Optional warm-pool / Kata-FC extension slots. Job-backed implementation
|
||||
// does not provide these; runtimes that do (e.g. Kata-FC microVM pause)
|
||||
// implement them and acquire the warm-pool capability.
|
||||
// TODO: requires custom in-cluster controller for k8s — kubelet does not
|
||||
// expose pause/resume at the pod level. Add when warm-pool design lands.
|
||||
pause?(clients: KubeClients, namespace: string, name: string): Promise<void>;
|
||||
resume?(clients: KubeClients, namespace: string, name: string): Promise<void>;
|
||||
}
|
||||
@@ -0,0 +1,52 @@
|
||||
import type { KubeClients } from "./kube-client.js";
|
||||
|
||||
export interface CreatePerRunSecretInput {
|
||||
namespace: string;
|
||||
secretName: string;
|
||||
runId: string;
|
||||
ownerKind: string;
|
||||
ownerApiVersion: string;
|
||||
ownerName: string;
|
||||
ownerUid: string;
|
||||
bootstrapToken: string;
|
||||
adapterEnv: Record<string, string>;
|
||||
}
|
||||
|
||||
export async function createPerRunSecret(clients: KubeClients, input: CreatePerRunSecretInput): Promise<void> {
|
||||
if (!input.ownerUid) {
|
||||
throw new Error("createPerRunSecret requires a non-empty ownerUid");
|
||||
}
|
||||
if ("BOOTSTRAP_TOKEN" in input.adapterEnv) {
|
||||
throw new Error("adapterEnv must not contain BOOTSTRAP_TOKEN (reserved key)");
|
||||
}
|
||||
await clients.core.createNamespacedSecret({
|
||||
namespace: input.namespace,
|
||||
body: {
|
||||
apiVersion: "v1",
|
||||
kind: "Secret",
|
||||
type: "Opaque",
|
||||
metadata: {
|
||||
name: input.secretName,
|
||||
namespace: input.namespace,
|
||||
labels: {
|
||||
"paperclip.io/run-id": input.runId,
|
||||
"paperclip.io/managed-by": "paperclip-k8s-plugin",
|
||||
},
|
||||
ownerReferences: [
|
||||
{
|
||||
apiVersion: input.ownerApiVersion,
|
||||
kind: input.ownerKind,
|
||||
name: input.ownerName,
|
||||
uid: input.ownerUid,
|
||||
controller: true,
|
||||
blockOwnerDeletion: true,
|
||||
},
|
||||
],
|
||||
},
|
||||
stringData: {
|
||||
BOOTSTRAP_TOKEN: input.bootstrapToken,
|
||||
...input.adapterEnv,
|
||||
},
|
||||
},
|
||||
});
|
||||
}
|
||||
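The reserved-key guard in `createPerRunSecret` matters because of spread order: `adapterEnv` is spread after `BOOTSTRAP_TOKEN`, so without the check a caller-supplied `BOOTSTRAP_TOKEN` would silently overwrite the real token. A minimal sketch of that logic in isolation follows; `buildSecretStringData` is a hypothetical helper name, not part of the plugin, and it checks own properties only.

```typescript
// Hypothetical helper isolating the stringData construction.
function buildSecretStringData(
  bootstrapToken: string,
  adapterEnv: Record<string, string>,
): Record<string, string> {
  // Reject the reserved key up front; the spread below would otherwise let
  // adapterEnv override the bootstrap token.
  if (Object.prototype.hasOwnProperty.call(adapterEnv, "BOOTSTRAP_TOKEN")) {
    throw new Error("adapterEnv must not contain BOOTSTRAP_TOKEN (reserved key)");
  }
  return { BOOTSTRAP_TOKEN: bootstrapToken, ...adapterEnv };
}

// Placeholder values for illustration only.
const data = buildSecretStringData("example-token", { ADAPTER_MODE: "example" });
console.log(Object.keys(data));
```

Placing the token after the spread would also prevent the override, but failing loudly surfaces the misconfiguration instead of hiding it.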
@@ -0,0 +1,322 @@
|
||||
import type { KubeClients } from "./kube-client.js";
|
||||
import { buildNetworkPolicyManifests } from "./network-policy.js";
|
||||
import { buildCiliumNetworkPolicyManifest } from "./cilium-network-policy.js";
|
||||
|
||||
export interface EnsureTenantInput {
|
||||
namespace: string;
|
||||
companyId: string;
|
||||
paperclipServerNamespace: string;
|
||||
serviceAccountAnnotations: Record<string, string>;
|
||||
egressMode: "standard" | "cilium";
|
||||
egressAllowFqdns: string[];
|
||||
egressAllowCidrs: string[];
|
||||
resourceQuota: {
|
||||
pods: string;
|
||||
requestsCpu: string;
|
||||
requestsMemory: string;
|
||||
limitsCpu: string;
|
||||
limitsMemory: string;
|
||||
};
|
||||
}
|
||||
|
||||
const SERVICE_ACCOUNT_NAME = "paperclip-tenant-sa";
|
||||
const ROLE_NAME = "paperclip-tenant-role";
|
||||
const ROLE_BINDING_NAME = "paperclip-tenant-rb";
|
||||
const RESOURCE_QUOTA_NAME = "paperclip-quota";
|
||||
const LIMIT_RANGE_NAME = "paperclip-limits";
|
||||
|
||||
/**
|
||||
* Tenant provisioning reconciles the resources this plugin owns. Existing
|
||||
* resources are replaced with the desired manifest so quota, RBAC, service
|
||||
* account annotations, and egress policy changes take effect on the next run.
|
||||
*/
|
||||
export async function ensureTenant(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
await ensureNamespace(clients, input);
|
||||
await ensureServiceAccount(clients, input);
|
||||
await ensureRole(clients, input);
|
||||
await ensureRoleBinding(clients, input);
|
||||
await ensureResourceQuota(clients, input);
|
||||
await ensureLimitRange(clients, input);
|
||||
await ensureNetworkPolicies(clients, input);
|
||||
}
|
||||
|
||||
async function ensureNamespace(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
try {
|
||||
await clients.core.readNamespace({ name: input.namespace });
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.core.createNamespace({
|
||||
body: {
|
||||
apiVersion: "v1",
|
||||
kind: "Namespace",
|
||||
metadata: {
|
||||
name: input.namespace,
|
||||
labels: {
|
||||
"paperclip.io/company-id": input.companyId,
|
||||
"paperclip.io/managed-by": "paperclip-k8s-plugin",
|
||||
"pod-security.kubernetes.io/enforce": "restricted",
|
||||
"pod-security.kubernetes.io/audit": "restricted",
|
||||
"pod-security.kubernetes.io/warn": "restricted",
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
async function ensureServiceAccount(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
const manifest = {
|
||||
apiVersion: "v1",
|
||||
kind: "ServiceAccount",
|
||||
metadata: {
|
||||
name: SERVICE_ACCOUNT_NAME,
|
||||
namespace: input.namespace,
|
||||
annotations: input.serviceAccountAnnotations,
|
||||
labels: { "paperclip.io/managed-by": "paperclip-k8s-plugin" },
|
||||
},
|
||||
};
|
||||
try {
|
||||
const existing = await clients.core.readNamespacedServiceAccount({ name: SERVICE_ACCOUNT_NAME, namespace: input.namespace });
|
||||
await clients.core.replaceNamespacedServiceAccount({
|
||||
name: SERVICE_ACCOUNT_NAME,
|
||||
namespace: input.namespace,
|
||||
body: withResourceVersion(manifest, existing) as never,
|
||||
});
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.core.createNamespacedServiceAccount({ namespace: input.namespace, body: manifest });
|
||||
}
|
||||
|
||||
async function ensureRole(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
const manifest = {
|
||||
apiVersion: "rbac.authorization.k8s.io/v1",
|
||||
kind: "Role",
|
||||
metadata: { name: ROLE_NAME, namespace: input.namespace },
|
||||
rules: [
|
||||
{ apiGroups: [""], resources: ["pods/log"], verbs: ["get"] },
|
||||
],
|
||||
};
|
||||
try {
|
||||
const existing = await clients.rbac.readNamespacedRole({ name: ROLE_NAME, namespace: input.namespace });
|
||||
await clients.rbac.replaceNamespacedRole({
|
||||
name: ROLE_NAME,
|
||||
namespace: input.namespace,
|
||||
body: withResourceVersion(manifest, existing) as never,
|
||||
});
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.rbac.createNamespacedRole({ namespace: input.namespace, body: manifest });
|
||||
}
|
||||
|
||||
async function ensureRoleBinding(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
const manifest = {
|
||||
apiVersion: "rbac.authorization.k8s.io/v1",
|
||||
kind: "RoleBinding",
|
||||
metadata: { name: ROLE_BINDING_NAME, namespace: input.namespace },
|
||||
roleRef: { apiGroup: "rbac.authorization.k8s.io", kind: "Role", name: ROLE_NAME },
|
||||
subjects: [{ kind: "ServiceAccount", name: SERVICE_ACCOUNT_NAME, namespace: input.namespace }],
|
||||
};
|
||||
try {
|
||||
const existing = await clients.rbac.readNamespacedRoleBinding({ name: ROLE_BINDING_NAME, namespace: input.namespace });
|
||||
await clients.rbac.replaceNamespacedRoleBinding({
|
||||
name: ROLE_BINDING_NAME,
|
||||
namespace: input.namespace,
|
||||
body: withResourceVersion(manifest, existing) as never,
|
||||
});
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.rbac.createNamespacedRoleBinding({ namespace: input.namespace, body: manifest });
|
||||
}
|
||||
|
||||
async function ensureResourceQuota(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
const manifest = {
|
||||
apiVersion: "v1",
|
||||
kind: "ResourceQuota",
|
||||
metadata: { name: RESOURCE_QUOTA_NAME, namespace: input.namespace },
|
||||
spec: {
|
||||
hard: {
|
||||
pods: input.resourceQuota.pods,
|
||||
"requests.cpu": input.resourceQuota.requestsCpu,
|
||||
"requests.memory": input.resourceQuota.requestsMemory,
|
||||
"limits.cpu": input.resourceQuota.limitsCpu,
|
||||
"limits.memory": input.resourceQuota.limitsMemory,
|
||||
},
|
||||
},
|
||||
};
|
||||
try {
|
||||
const existing = await clients.core.readNamespacedResourceQuota({ name: RESOURCE_QUOTA_NAME, namespace: input.namespace });
|
||||
await clients.core.replaceNamespacedResourceQuota({
|
||||
name: RESOURCE_QUOTA_NAME,
|
||||
namespace: input.namespace,
|
||||
body: withResourceVersion(manifest, existing) as never,
|
||||
});
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.core.createNamespacedResourceQuota({ namespace: input.namespace, body: manifest });
|
||||
}
|
||||
|
||||
async function ensureLimitRange(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
const manifest = {
|
||||
apiVersion: "v1",
|
||||
kind: "LimitRange",
|
||||
metadata: { name: LIMIT_RANGE_NAME, namespace: input.namespace },
|
||||
spec: {
|
||||
limits: [
|
||||
{
|
||||
type: "Container",
|
||||
max: { cpu: "4", memory: "8Gi" },
|
||||
min: { cpu: "100m", memory: "128Mi" },
|
||||
// The k8s client-node type names this `_default` but the actual
|
||||
// Kubernetes API field is `default`. We produce a JSON-shape
|
||||
// manifest so the cast is safe.
|
||||
default: { cpu: "1", memory: "2Gi" },
|
||||
defaultRequest: { cpu: "250m", memory: "512Mi" },
|
||||
},
|
||||
],
|
||||
},
|
||||
};
|
||||
try {
|
||||
const existing = await clients.core.readNamespacedLimitRange({ name: LIMIT_RANGE_NAME, namespace: input.namespace });
|
||||
await clients.core.replaceNamespacedLimitRange({
|
||||
name: LIMIT_RANGE_NAME,
|
||||
namespace: input.namespace,
|
||||
body: withResourceVersion(manifest, existing) as never,
|
||||
});
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.core.createNamespacedLimitRange({
|
||||
namespace: input.namespace,
|
||||
body: manifest as never,
|
||||
});
|
||||
}
|
||||
|
||||
async function ensureNetworkPolicies(clients: KubeClients, input: EnsureTenantInput): Promise<void> {
|
||||
const [denyAll, egressStd] = buildNetworkPolicyManifests({
|
||||
namespace: input.namespace,
|
||||
paperclipServerNamespace: input.paperclipServerNamespace,
|
||||
egressAllowFqdns: input.egressAllowFqdns,
|
||||
egressAllowCidrs: input.egressAllowCidrs,
|
||||
});
|
||||
|
||||
await ensureNetworkPolicy(clients, input.namespace, denyAll);
|
||||
|
||||
if (input.egressMode === "cilium") {
|
||||
const cnp = buildCiliumNetworkPolicyManifest({
|
||||
namespace: input.namespace,
|
||||
paperclipServerNamespace: input.paperclipServerNamespace,
|
||||
egressAllowFqdns: input.egressAllowFqdns,
|
||||
egressAllowCidrs: input.egressAllowCidrs,
|
||||
});
|
||||
await ensureCiliumNetworkPolicy(clients, input.namespace, cnp);
|
||||
await deleteNetworkPolicyIfExists(clients, input.namespace, "paperclip-egress-allow");
|
||||
} else {
|
||||
await ensureNetworkPolicy(clients, input.namespace, egressStd);
|
||||
await deleteCiliumNetworkPolicyIfExists(clients, input.namespace, "paperclip-egress-fqdn");
|
||||
}
|
||||
}
|
||||
|
||||
async function ensureNetworkPolicy(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
manifest: Record<string, unknown>,
|
||||
): Promise<void> {
|
||||
const name = (manifest.metadata as { name: string }).name;
|
||||
try {
|
||||
const existing = await clients.networking.readNamespacedNetworkPolicy({ name, namespace });
|
||||
await clients.networking.replaceNamespacedNetworkPolicy({
|
||||
name,
|
||||
namespace,
|
||||
body: withResourceVersion(manifest, existing) as never,
|
||||
});
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.networking.createNamespacedNetworkPolicy({ namespace, body: manifest as never });
|
||||
}
|
||||
|
||||
async function ensureCiliumNetworkPolicy(
|
||||
clients: KubeClients,
|
||||
namespace: string,
|
||||
manifest: Record<string, unknown>,
|
||||
): Promise<void> {
|
||||
const name = (manifest.metadata as { name: string }).name;
|
||||
try {
|
||||
const existing = await clients.custom.getNamespacedCustomObject({
|
||||
group: "cilium.io",
|
||||
version: "v2",
|
||||
namespace,
|
||||
plural: "ciliumnetworkpolicies",
|
||||
name,
|
||||
});
|
||||
await clients.custom.replaceNamespacedCustomObject({
|
||||
group: "cilium.io",
|
||||
version: "v2",
|
||||
namespace,
|
||||
plural: "ciliumnetworkpolicies",
|
||||
name,
|
||||
body: withResourceVersion(manifest, existing),
|
||||
});
|
||||
return;
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
await clients.custom.createNamespacedCustomObject({
|
||||
group: "cilium.io",
|
||||
version: "v2",
|
||||
namespace,
|
||||
plural: "ciliumnetworkpolicies",
|
||||
body: manifest,
|
||||
});
|
||||
}
|
||||
|
||||
async function deleteNetworkPolicyIfExists(clients: KubeClients, namespace: string, name: string): Promise<void> {
|
||||
try {
|
||||
await clients.networking.deleteNamespacedNetworkPolicy({ name, namespace });
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
}
|
||||
|
||||
async function deleteCiliumNetworkPolicyIfExists(clients: KubeClients, namespace: string, name: string): Promise<void> {
|
||||
try {
|
||||
await clients.custom.deleteNamespacedCustomObject({
|
||||
group: "cilium.io",
|
||||
version: "v2",
|
||||
namespace,
|
||||
plural: "ciliumnetworkpolicies",
|
||||
name,
|
||||
});
|
||||
} catch (err) {
|
||||
if (!isNotFound(err)) throw err;
|
||||
}
|
||||
}
|
||||
|
||||
function withResourceVersion<T extends Record<string, unknown>>(manifest: T, existing: unknown): T {
|
||||
const resourceVersion = (existing as { metadata?: { resourceVersion?: string } })?.metadata?.resourceVersion;
|
||||
if (!resourceVersion) return manifest;
|
||||
return {
|
||||
...manifest,
|
||||
metadata: {
|
||||
...(manifest.metadata as Record<string, unknown>),
|
||||
resourceVersion,
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
function isNotFound(err: unknown): boolean {
|
||||
if (typeof err !== "object" || err === null) return false;
|
||||
const e = err as { code?: number; statusCode?: number };
|
||||
return e.code === 404 || e.statusCode === 404;
|
||||
}
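Review note: the `ensure*` helpers in this file all share one shape: read the object, replace it while carrying over the live `resourceVersion`, and fall back to create on a 404. The sketch below reproduces that shape against an in-memory stand-in so it can be run in isolation. `FakeStore`, `NotFoundError`, `Manifest`, and `upsert` are hypothetical names for illustration only; none of them belong to the plugin or to `@kubernetes/client-node`, and the conflict check is a simplification of the API server's optimistic-concurrency behaviour.

```typescript
// Hypothetical in-memory stand-in for a namespaced Kubernetes resource API.
interface Manifest {
  metadata: { name: string; resourceVersion?: string };
  spec?: unknown;
}

class NotFoundError extends Error {
  readonly code = 404;
}

class FakeStore {
  private readonly items = new Map<string, Manifest>();
  private version = 0;

  async read(name: string): Promise<Manifest> {
    const found = this.items.get(name);
    if (!found) throw new NotFoundError(`${name} not found`);
    return found;
  }

  async create(body: Manifest): Promise<Manifest> {
    return this.store(body);
  }

  // Reject a replace whose resourceVersion does not match the stored one
  // (a simplified version of the API server's optimistic-concurrency check).
  async replace(name: string, body: Manifest): Promise<Manifest> {
    const current = await this.read(name);
    if (body.metadata.resourceVersion !== current.metadata.resourceVersion) {
      throw Object.assign(new Error("conflict"), { code: 409 });
    }
    return this.store(body);
  }

  // Every successful write bumps the version counter.
  private store(body: Manifest): Manifest {
    this.version += 1;
    const saved: Manifest = {
      ...body,
      metadata: { ...body.metadata, resourceVersion: String(this.version) },
    };
    this.items.set(saved.metadata.name, saved);
    return saved;
  }
}

function isNotFound(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { code?: number }).code === 404;
}

// Read first; replace (carrying the live resourceVersion) if present, create on 404.
async function upsert(store: FakeStore, manifest: Manifest): Promise<Manifest> {
  const name = manifest.metadata.name;
  try {
    const existing = await store.read(name);
    const body: Manifest = {
      ...manifest,
      metadata: { ...manifest.metadata, resourceVersion: existing.metadata.resourceVersion },
    };
    return await store.replace(name, body);
  } catch (err) {
    if (!isNotFound(err)) throw err;
  }
  return store.create(manifest);
}
```

Calling `upsert` twice with the same name creates the object on the first call and replaces it on the second. A 409 from a concurrent writer is deliberately not retried here, which matches the helpers above: only 404 is swallowed.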
|
||||
@@ -0,0 +1,85 @@
|
||||
import { z } from "zod";
|
||||
import { KNOWN_ADAPTER_TYPES } from "./adapter-defaults.js";
|
||||
|
||||
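// IPv4 CIDR: each octet 0-255, prefix length 0-32.
const cidrRegex = /^((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\/(3[0-2]|[12]?\d)$/;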
|
||||
|
||||
export const kubernetesProviderConfigSchema = z
|
||||
.object({
|
||||
inCluster: z.boolean().default(false),
|
||||
kubeconfig: z.string().optional(),
|
||||
|
||||
namespacePrefix: z.string().regex(/^[a-z0-9-]{1,32}$/).default("paperclip-"),
|
||||
companySlug: z.string().regex(/^[a-z0-9-]{1,32}$/).optional(),
|
||||
|
||||
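// Registry host plus optional path prefix (e.g. "registry.example.com/paperclip"); image references carry no URL scheme, so `.url()` would reject valid values.
imageRegistry: z.string().min(1).optional(),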
|
||||
imageAllowList: z.array(z.string()).default([]),
|
||||
imagePullSecrets: z.array(z.string()).default([]),
|
||||
|
||||
egressAllowFqdns: z.array(z.string()).default([]),
|
||||
egressAllowCidrs: z.array(z.string().regex(cidrRegex, "Invalid CIDR")).default([]),
|
||||
egressMode: z.enum(["cilium", "standard"]).default("standard"),
|
||||
|
||||
defaultResources: z
|
||||
.object({
|
||||
requests: z.object({ cpu: z.string(), memory: z.string() }).partial().optional(),
|
||||
limits: z.object({ cpu: z.string(), memory: z.string() }).partial().optional(),
|
||||
})
|
||||
.optional(),
|
||||
|
||||
runtimeClassName: z.string().optional(),
|
||||
serviceAccountAnnotations: z.record(z.string()).default({}),
|
||||
|
||||
jobTtlSecondsAfterFinished: z.number().int().nonnegative().default(900),
|
||||
podActivityDeadlineSec: z.number().int().positive().default(3600),
|
||||
|
||||
/**
|
||||
* The adapter type that workloads (Jobs or Sandboxes) in this environment will run.
|
||||
* Each Kubernetes environment is bound to one adapter; create multiple
|
||||
* environments for different adapters.
|
||||
* Defaults to `"claude_local"`.
|
||||
*/
|
||||
adapterType: z
|
||||
.string()
|
||||
.default("claude_local")
|
||||
.refine((v) => KNOWN_ADAPTER_TYPES.has(v), {
|
||||
message: "adapterType must be one of the known adapter types",
|
||||
}),
|
||||
|
||||
/**
|
||||
* The sandbox backend to use.
|
||||
*
|
||||
* - `"sandbox-cr"` (default, alpha) — uses the kubernetes-sigs/agent-sandbox
|
||||
* Sandbox CRD (agents.x-k8s.io/v1alpha1). Creates a long-lived pod that
|
||||
* paperclip-server can exec into for multi-command adapter-install workflows.
|
||||
* Requires the agent-sandbox controller to be installed in the cluster.
|
||||
*
|
||||
* - `"job"` — uses batch/v1 Job (stable fallback). One-shot entrypoint; does
|
||||
* NOT support multi-command exec. Use this for clusters without agent-sandbox
|
||||
* installed, or when you need stable (non-alpha) k8s APIs.
|
||||
*/
|
||||
backend: z.enum(["sandbox-cr", "job"]).default("sandbox-cr"),
|
||||
})
|
||||
.refine(
|
||||
(cfg) => cfg.inCluster || cfg.kubeconfig,
|
||||
{
|
||||
message:
|
||||
"kubernetes provider requires one of `inCluster` or `kubeconfig`",
|
||||
},
|
||||
);
|
||||
|
||||
export type KubernetesProviderConfig = z.infer<typeof kubernetesProviderConfigSchema>;
|
||||
|
||||
export function parseKubernetesProviderConfig(input: unknown): KubernetesProviderConfig {
|
||||
return kubernetesProviderConfigSchema.parse(input);
|
||||
}
|
||||
|
||||
export interface KubernetesLeaseMetadata {
|
||||
namespace: string;
|
||||
/** Name of the workload resource (Job name for job backend, Sandbox CR name for sandbox-cr backend). */
|
||||
jobName: string;
|
||||
podName: string | null;
|
||||
secretName: string;
|
||||
phase: "Pending" | "Running" | "Succeeded" | "Failed";
|
||||
/** Which backend provisioned this lease. */
|
||||
backend: "sandbox-cr" | "job";
|
||||
}
|
||||
@@ -0,0 +1,46 @@
|
||||
const ULID_ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz";
|
||||
|
||||
export function deriveCompanySlug(input: string): string {
|
||||
const slug = input
|
||||
.toLowerCase()
|
||||
.replace(/[^a-z0-9-]+/g, "-")
|
||||
.replace(/^-+|-+$/g, "")
|
||||
.slice(0, 32)
|
||||
.replace(/-+$/, "");
|
||||
return slug.length > 0 ? slug : "company";
|
||||
}
|
||||
|
||||
export function deriveNamespaceName(prefix: string, slug: string): string {
|
||||
return `${prefix}${slug}`;
|
||||
}
|
||||
|
||||
export function newRunUlidDns(now: () => number = Date.now): string {
|
||||
const timestamp = now();
|
||||
let out = "";
|
||||
let t = timestamp;
|
||||
for (let i = 0; i < 10; i++) {
|
||||
out = ULID_ALPHABET[t % 32] + out; // same low 5 bits as `t & 0x1f`, without relying on 32-bit truncation of the millisecond timestamp
|
||||
t = Math.floor(t / 32);
|
||||
}
|
||||
for (let i = 0; i < 16; i++) {
|
||||
out += ULID_ALPHABET[Math.floor(Math.random() * 32)];
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
export interface LabelsInput {
|
||||
runId: string;
|
||||
agentId: string;
|
||||
companyId: string;
|
||||
adapterType: string;
|
||||
}
|
||||
|
||||
export function paperclipLabels(input: LabelsInput): Record<string, string> {
|
||||
return {
|
||||
"paperclip.io/run-id": input.runId,
|
||||
"paperclip.io/agent-id": input.agentId,
|
||||
"paperclip.io/company-id": input.companyId,
|
||||
"paperclip.io/adapter": input.adapterType,
|
||||
"paperclip.io/managed-by": "paperclip-k8s-plugin",
|
||||
};
|
||||
}
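Review note: a short worked example of how the slug and namespace helpers in this file combine. The two `derive*` functions are copied from the file so the block runs on its own. `isValidNamespaceName` and `DNS_LABEL` are hypothetical additions, not part of the plugin. They are included because Kubernetes namespace names must be RFC 1123 DNS labels of at most 63 characters, while the config schema allows a prefix and a slug of up to 32 characters each.

```typescript
// Copied from the file above so the example is self-contained.
function deriveCompanySlug(input: string): string {
  const slug = input
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, 32)
    .replace(/-+$/, "");
  return slug.length > 0 ? slug : "company";
}

function deriveNamespaceName(prefix: string, slug: string): string {
  return `${prefix}${slug}`;
}

// Hypothetical guard (not part of the plugin): RFC 1123 DNS label check.
const DNS_LABEL = /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/;

function isValidNamespaceName(candidate: string): boolean {
  return candidate.length <= 63 && DNS_LABEL.test(candidate);
}

const exampleSlug = deriveCompanySlug("Acme Corp!");
const exampleNamespace = deriveNamespaceName("paperclip-", exampleSlug);
console.log(exampleSlug); // acme-corp
console.log(exampleNamespace); // paperclip-acme-corp
console.log(isValidNamespaceName(exampleNamespace)); // true

// A maximal prefix plus a maximal slug is one character too long.
const tooLong = deriveNamespaceName("p".repeat(31) + "-", "s".repeat(32));
console.log(tooLong.length); // 64
console.log(isValidNamespaceName(tooLong)); // false
```

If a guard like this were adopted, the natural place would be wherever the namespace name is first derived, so an overlong combination fails fast instead of being rejected by the API server.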
|
||||
@@ -0,0 +1,5 @@
|
||||
import { runWorker } from "@paperclipai/plugin-sdk";
|
||||
import plugin from "./plugin.js";
|
||||
|
||||
export default plugin;
|
||||
runWorker(plugin, import.meta.url);
|
||||
@@ -0,0 +1,22 @@
|
||||
import { execSync } from "node:child_process";
|
||||
import { readFileSync } from "node:fs";
|
||||
import { homedir } from "node:os";
|
||||
import { join } from "node:path";
|
||||
|
||||
export const KIND_CONTEXT = "kind-paperclip";
|
||||
|
||||
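// Returns the default kubeconfig verbatim. Unlike kubectl() below, nothing here pins the
// context, so the file's current-context is assumed to be KIND_CONTEXT.
export function readKindKubeconfig(): string {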
|
||||
return readFileSync(join(homedir(), ".kube", "config"), "utf-8");
|
||||
}
|
||||
|
||||
export function kubectl(args: string): string {
|
||||
return execSync(`kubectl --context ${KIND_CONTEXT} ${args}`, { encoding: "utf-8" });
|
||||
}
|
||||
|
||||
export function deleteNamespaceIfExists(namespace: string): void {
|
||||
try {
|
||||
kubectl(`delete namespace ${namespace} --wait=true --timeout=60s --ignore-not-found`);
|
||||
} catch {
|
||||
// ignore
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,205 @@
|
||||
/**
|
||||
* End-to-end integration test against a local kind cluster.
|
||||
*
|
||||
* PREREQUISITES (operator must perform before running this test):
|
||||
* 1. Create the kind cluster:
|
||||
* kind create cluster --name paperclip
|
||||
* 2. Pre-load the alpine image so the Job can start without network access:
|
||||
* docker pull alpine:3.20
|
||||
* docker tag alpine:3.20 localhost/paperclip-agent:latest
|
||||
* kind load docker-image localhost/paperclip-agent:latest --name paperclip
|
||||
* 3. For the sandbox-cr backend test, the agent-sandbox controller must be installed:
|
||||
* kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/latest/download/install.yaml
|
||||
* Also pre-load an image that ships tini (e.g. the same localhost/paperclip-agent:latest,
* if it includes /usr/bin/tini and /bin/sh).
|
||||
* 4. Set the env var and run:
|
||||
* RUN_K8S_INTEGRATION_TESTS=1 pnpm test
|
||||
*
|
||||
* The namespace is namespacePrefix ("paperclip-", the default) followed by
* companySlug ("spike-e2e"), resolving to "paperclip-spike-e2e".
|
||||
*/
|
||||
|
||||
import { describe, it, expect, beforeAll, afterAll } from "vitest";
|
||||
import plugin from "../../src/plugin.js";
|
||||
import { createKubeConfig } from "../../src/kube-client.js";
|
||||
import { execInPod } from "../../src/pod-exec.js";
|
||||
import { sandboxCrOrchestrator } from "../../src/sandbox-cr-orchestrator.js";
|
||||
import { deleteNamespaceIfExists, kubectl, readKindKubeconfig } from "./_kind-harness.js";
|
||||
|
||||
const NAMESPACE = "paperclip-spike-e2e";
|
||||
|
||||
describe("plugin-kubernetes end-to-end", () => {
|
||||
beforeAll(() => {
|
||||
if (process.env.RUN_K8S_INTEGRATION_TESTS !== "1") return;
|
||||
deleteNamespaceIfExists(NAMESPACE);
|
||||
});
|
||||
|
||||
afterAll(() => {
|
||||
if (process.env.RUN_K8S_INTEGRATION_TESTS !== "1") return;
|
||||
deleteNamespaceIfExists(NAMESPACE);
|
||||
});
|
||||
|
||||
// ── Job backend (stable fallback) ─────────────────────────────────────────
|
||||
|
||||
it.runIf(process.env.RUN_K8S_INTEGRATION_TESTS === "1")(
|
||||
"[job backend] acquireLease creates tenant + Job + supporting resources; releaseLease cascade-deletes them",
|
||||
async () => {
|
||||
const kubeconfig = readKindKubeconfig();
|
||||
const config = {
|
||||
inCluster: false,
|
||||
kubeconfig,
|
||||
companySlug: "spike-e2e",
|
||||
adapterType: "claude_local",
|
||||
backend: "job",
|
||||
imageAllowList: [] as string[],
|
||||
podActivityDeadlineSec: 60,
|
||||
jobTtlSecondsAfterFinished: 60,
|
||||
};
|
||||
|
||||
const lease = await plugin.definition.onEnvironmentAcquireLease!({
|
||||
driverKey: "kubernetes",
|
||||
config,
|
||||
runId: "r-test-e2e-job",
|
||||
companyId: "11111111-1111-1111-1111-111111111111",
|
||||
environmentId: "env-test",
|
||||
});
|
||||
|
||||
expect(lease.providerLeaseId).toMatch(/^pc-/);
|
||||
|
||||
// Verify the Job exists in the tenant namespace
|
||||
const jobs = kubectl(`get jobs -n ${NAMESPACE} -o name`);
|
||||
expect(jobs).toContain(`job.batch/${lease.providerLeaseId}`);
|
||||
|
||||
// Verify the tenant namespace has the expected supporting resources
|
||||
const all = kubectl(
|
||||
`get sa,role,rolebinding,resourcequota,limitrange,networkpolicy -n ${NAMESPACE} -o name`,
|
||||
);
|
||||
expect(all).toContain("serviceaccount/paperclip-tenant-sa");
|
||||
expect(all).toContain("role.rbac.authorization.k8s.io/paperclip-tenant-role");
|
||||
expect(all).toContain("rolebinding.rbac.authorization.k8s.io/paperclip-tenant-rb");
|
||||
expect(all).toContain("resourcequota/paperclip-quota");
|
||||
expect(all).toContain("limitrange/paperclip-limits");
|
||||
expect(all).toContain("networkpolicy.networking.k8s.io/paperclip-deny-all");
|
||||
expect(all).toContain("networkpolicy.networking.k8s.io/paperclip-egress-allow");
|
||||
|
||||
// Verify the namespace has PSS-restricted labels
|
||||
const ns = kubectl(`get namespace ${NAMESPACE} -o jsonpath='{.metadata.labels}'`);
|
||||
expect(ns).toContain("pod-security.kubernetes.io/enforce");
|
||||
expect(ns).toContain("restricted");
|
||||
|
||||
// Verify the per-run Secret exists (owned by the Job for cascade deletion)
|
||||
const secrets = kubectl(`get secrets -n ${NAMESPACE} -o name`);
|
||||
expect(secrets).toContain(`secret/${lease.providerLeaseId}-env`);
|
||||
|
||||
// Release — deletes the Job with Foreground propagation, which cascade-deletes
|
||||
// the owned Secret via owner references set at acquireLease time.
|
||||
await plugin.definition.onEnvironmentReleaseLease!({
|
||||
driverKey: "kubernetes",
|
||||
config,
|
||||
providerLeaseId: lease.providerLeaseId,
|
||||
leaseMetadata: lease.metadata,
|
||||
companyId: "11111111-1111-1111-1111-111111111111",
|
||||
environmentId: "env-test",
|
||||
});
|
||||
|
||||
// Allow a brief grace window for Foreground propagation to finish.
|
||||
await new Promise((resolve) => setTimeout(resolve, 2000));
|
||||
|
||||
const jobsAfter = kubectl(`get jobs -n ${NAMESPACE} -o name 2>&1 || true`);
|
||||
expect(jobsAfter).not.toContain(`job.batch/${lease.providerLeaseId}`);
|
||||
},
|
||||
180_000,
|
||||
);
|
||||
|
||||
// ── Sandbox-CR backend (alpha, requires agent-sandbox controller) ──────────
|
||||
|
||||
it.runIf(process.env.RUN_K8S_INTEGRATION_TESTS === "1")(
|
||||
"[sandbox-cr backend] acquireLease creates Sandbox CR + supporting resources; pod becomes Ready; execInPod runs echo hello; releaseLease deletes CR",
|
||||
async () => {
|
||||
const kubeconfig = readKindKubeconfig();
|
||||
const config = {
|
||||
inCluster: false,
|
||||
kubeconfig,
|
||||
companySlug: "spike-e2e",
|
||||
adapterType: "claude_local",
|
||||
backend: "sandbox-cr",
|
||||
imageAllowList: [] as string[],
|
||||
podActivityDeadlineSec: 120,
|
||||
jobTtlSecondsAfterFinished: 60,
|
||||
};
|
||||
|
||||
const lease = await plugin.definition.onEnvironmentAcquireLease!({
|
||||
driverKey: "kubernetes",
|
||||
config,
|
||||
runId: "r-test-e2e-sandbox-cr",
|
||||
companyId: "22222222-2222-2222-2222-222222222222",
|
||||
environmentId: "env-test-cr",
|
||||
});
|
||||
|
||||
expect(lease.providerLeaseId).toMatch(/^pc-/);
|
||||
|
||||
// Verify the Sandbox CR exists in the tenant namespace
|
||||
const sandboxes = kubectl(
|
||||
`get sandboxes.agents.x-k8s.io -n ${NAMESPACE} -o name 2>&1`,
|
||||
);
|
||||
expect(sandboxes).toContain(`sandbox.agents.x-k8s.io/${lease.providerLeaseId}`);
|
||||
|
||||
// Verify the per-run Secret exists (owned by the Sandbox CR)
|
||||
const secrets = kubectl(`get secrets -n ${NAMESPACE} -o name`);
|
||||
expect(secrets).toContain(`secret/${lease.providerLeaseId}-env`);
|
||||
|
||||
// Wait for the Sandbox pod to become Ready
|
||||
const kc = createKubeConfig({ inCluster: false, kubeconfig });
|
||||
const { makeKubeClients } = await import("../../src/kube-client.js");
|
||||
const clients = makeKubeClients(kc);
|
||||
|
||||
await sandboxCrOrchestrator.waitForCompletion(
|
||||
clients,
|
||||
NAMESPACE,
|
||||
lease.providerLeaseId,
|
||||
{ timeoutMs: 90_000, pollMs: 3000 },
|
||||
);
|
||||
|
||||
// Resolve the pod name
|
||||
const podName = await sandboxCrOrchestrator.findPod(
|
||||
clients,
|
||||
NAMESPACE,
|
||||
lease.providerLeaseId,
|
||||
);
|
||||
expect(podName).toBeTruthy();
|
||||
|
||||
// Exec a simple echo command into the running pod
|
||||
const execResult = await execInPod(
|
||||
kc,
|
||||
NAMESPACE,
|
||||
podName!,
|
||||
"agent",
|
||||
["echo", "hello"],
|
||||
);
|
||||
|
||||
expect(execResult.exitCode).toBe(0);
|
||||
expect(execResult.stdout.trim()).toBe("hello");
|
||||
|
||||
// Release — deletes the Sandbox CR with Foreground propagation.
|
||||
await plugin.definition.onEnvironmentReleaseLease!({
|
||||
driverKey: "kubernetes",
|
||||
config,
|
||||
providerLeaseId: lease.providerLeaseId,
|
||||
leaseMetadata: lease.metadata,
|
||||
companyId: "22222222-2222-2222-2222-222222222222",
|
||||
environmentId: "env-test-cr",
|
||||
});
|
||||
|
||||
// Allow a brief grace window for Foreground propagation.
|
||||
await new Promise((resolve) => setTimeout(resolve, 3000));
|
||||
|
||||
const sandboxesAfter = kubectl(
|
||||
`get sandboxes.agents.x-k8s.io -n ${NAMESPACE} -o name 2>&1 || true`,
|
||||
);
|
||||
expect(sandboxesAfter).not.toContain(
|
||||
`sandbox.agents.x-k8s.io/${lease.providerLeaseId}`,
|
||||
);
|
||||
},
|
||||
300_000,
|
||||
);
|
||||
});
|
||||
@@ -0,0 +1,37 @@
|
||||
import { describe, it, expect } from "vitest";
|
||||
import { getAdapterDefaults, KNOWN_ADAPTER_TYPES } from "../../src/adapter-defaults.js";
|
||||
|
||||
describe("adapter-defaults", () => {
|
||||
it("returns defaults for claude_local", () => {
|
||||
const d = getAdapterDefaults("claude_local");
|
||||
expect(d.runtimeImage).toBe("ghcr.io/paperclipai/agent-runtime-claude:v1");
|
||||
expect(d.envKeys).toContain("ANTHROPIC_API_KEY");
|
||||
expect(d.allowFqdns).toContain("api.anthropic.com");
|
||||
expect(d.probeCommand).toEqual(["claude", "--version"]);
|
||||
});
|
||||
|
||||
it("returns defaults for codex_local", () => {
|
||||
const d = getAdapterDefaults("codex_local");
|
||||
expect(d.runtimeImage).toBe("ghcr.io/paperclipai/agent-runtime-codex:v1");
|
||||
expect(d.envKeys).toContain("OPENAI_API_KEY");
|
||||
expect(d.probeCommand).toEqual(["codex", "--version"]);
|
||||
});
|
||||
|
||||
it("throws on unknown adapter type", () => {
|
||||
expect(() => getAdapterDefaults("nonexistent_local")).toThrow(/unknown adapter type/i);
|
||||
});
|
||||
|
||||
it("KNOWN_ADAPTER_TYPES contains all 7 supported adapters", () => {
|
||||
expect(KNOWN_ADAPTER_TYPES).toEqual(
|
||||
new Set([
|
||||
"claude_local",
|
||||
"codex_local",
|
||||
"gemini_local",
|
||||
"cursor_local",
|
||||
"opencode_local",
|
||||
"acpx_local",
|
||||
"pi_local",
|
||||
]),
|
||||
);
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,60 @@
|
||||
import { describe, it, expect } from "vitest";
|
||||
import { buildCiliumNetworkPolicyManifest } from "../../src/cilium-network-policy.js";
|
||||
|
||||
describe("buildCiliumNetworkPolicyManifest", () => {
|
||||
const baseInput = {
|
||||
namespace: "paperclip-acme",
|
||||
paperclipServerNamespace: "paperclip",
|
||||
egressAllowFqdns: ["api.anthropic.com"],
|
||||
egressAllowCidrs: [] as string[],
|
||||
};
|
||||
|
||||
it("returns a CiliumNetworkPolicy with the correct apiVersion and kind", () => {
|
||||
const cnp = buildCiliumNetworkPolicyManifest(baseInput);
|
||||
expect(cnp.apiVersion).toBe("cilium.io/v2");
|
||||
expect(cnp.kind).toBe("CiliumNetworkPolicy");
|
||||
});
|
||||
|
||||
it("targets agent pods by role label", () => {
|
||||
const cnp = buildCiliumNetworkPolicyManifest(baseInput);
|
||||
expect(cnp.spec.endpointSelector.matchLabels["paperclip.io/role"]).toBe("agent");
|
||||
});
|
||||
|
||||
it("includes an FQDN allow rule for each adapter FQDN", () => {
|
||||
const cnp = buildCiliumNetworkPolicyManifest({
|
||||
...baseInput,
|
||||
egressAllowFqdns: ["api.anthropic.com", "api.openai.com"],
|
||||
});
|
||||
const fqdnRule = cnp.spec.egress.find((e: { toFQDNs?: { matchName: string }[] }) => e.toFQDNs);
|
||||
expect(fqdnRule).toBeDefined();
|
||||
expect(fqdnRule.toFQDNs.map((f: { matchName: string }) => f.matchName).sort()).toEqual([
|
||||
"api.anthropic.com",
|
||||
"api.openai.com",
|
||||
]);
|
||||
});
|
||||
|
||||
it("permits DNS to kube-dns explicitly so FQDN resolution can happen", () => {
|
||||
const cnp = buildCiliumNetworkPolicyManifest(baseInput);
|
||||
const dnsRule = cnp.spec.egress.find((e: { toPorts?: { ports: { port: string }[] }[] }) =>
|
||||
e.toPorts?.some((tp) => tp.ports.some((p) => p.port === "53")),
|
||||
);
|
||||
expect(dnsRule).toBeDefined();
|
||||
});
|
||||
|
||||
it("includes a rule for paperclip-server callback", () => {
|
||||
const cnp = buildCiliumNetworkPolicyManifest(baseInput);
|
||||
const cb = cnp.spec.egress.find((e: { toEndpoints?: { matchLabels: Record<string, string> }[] }) =>
|
||||
e.toEndpoints?.some((ep) => ep.matchLabels.app === "paperclip-server"),
|
||||
);
|
||||
expect(cb).toBeDefined();
|
||||
});
|
||||
|
||||
it("includes user-supplied CIDRs in toCIDRSet rule", () => {
|
||||
const cnp = buildCiliumNetworkPolicyManifest({
|
||||
...baseInput,
|
||||
egressAllowCidrs: ["10.0.0.0/8"],
|
||||
});
|
||||
const cidrRule = cnp.spec.egress.find((e: { toCIDRSet?: { cidr: string }[] }) => e.toCIDRSet);
|
||||
expect(cidrRule.toCIDRSet[0].cidr).toBe("10.0.0.0/8");
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,62 @@
|
||||
import { describe, it, expect } from "vitest";
|
||||
import { globMatch, resolveImage } from "../../src/image-allowlist.js";
|
||||
|
||||
describe("globMatch", () => {
|
||||
it("matches exact image", () => {
|
||||
expect(globMatch("ghcr.io/paperclipai/agent-runtime-claude:v1", "ghcr.io/paperclipai/agent-runtime-claude:v1")).toBe(true);
|
||||
});
|
||||
|
||||
it("matches single-character wildcard", () => {
|
||||
expect(globMatch("ghcr.io/x:v?", "ghcr.io/x:v1")).toBe(true);
|
||||
expect(globMatch("ghcr.io/x:v?", "ghcr.io/x:v12")).toBe(false);
|
||||
});
|
||||
|
||||
it("matches multi-character wildcard", () => {
|
||||
expect(globMatch("ghcr.io/paperclipai/*:v1", "ghcr.io/paperclipai/agent-runtime-claude:v1")).toBe(true);
|
||||
expect(globMatch("ghcr.io/paperclipai/*:v1", "docker.io/other/img:v1")).toBe(false);
|
||||
});
|
||||
|
||||
it("does not allow wildcard to span slashes by default", () => {
|
||||
expect(globMatch("ghcr.io/*:v1", "ghcr.io/paperclipai/agent-runtime-claude:v1")).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
describe("resolveImage", () => {
|
||||
const defaults = { runtimeImage: "ghcr.io/paperclipai/agent-runtime-claude:v1" };
|
||||
|
||||
it("uses adapter default when no override", () => {
|
||||
expect(resolveImage({ imageOverride: null }, defaults, { imageAllowList: [], imageRegistry: undefined })).toBe(
|
||||
"ghcr.io/paperclipai/agent-runtime-claude:v1",
|
||||
);
|
||||
});
|
||||
|
||||
it("rewrites registry when imageRegistry is set", () => {
|
||||
expect(
|
||||
resolveImage(
|
||||
{ imageOverride: null },
|
||||
defaults,
|
||||
{ imageAllowList: [], imageRegistry: "registry.example.com/paperclip" },
|
||||
),
|
||||
).toBe("registry.example.com/paperclip/agent-runtime-claude:v1");
|
||||
});
|
||||
|
||||
it("accepts imageOverride when in allowlist", () => {
|
||||
expect(
|
||||
resolveImage(
|
||||
{ imageOverride: "registry.example.com/mine:v2" },
|
||||
defaults,
|
||||
{ imageAllowList: ["registry.example.com/*:v2"], imageRegistry: undefined },
|
||||
),
|
||||
).toBe("registry.example.com/mine:v2");
|
||||
});
|
||||
|
||||
it("rejects imageOverride not in allowlist", () => {
|
||||
expect(() =>
|
||||
resolveImage(
|
||||
{ imageOverride: "evil.io/img:latest" },
|
||||
defaults,
|
||||
{ imageAllowList: ["registry.example.com/*"], imageRegistry: undefined },
|
||||
),
|
||||
).toThrow(/not in allowlist/);
|
||||
});
|
||||
});
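Review note: the `globMatch` tests above pin down the matching rules: `?` matches exactly one character, and `*` matches any run of characters but does not cross a `/`. The real implementation lives in `src/image-allowlist.ts`, which is not part of this excerpt. The sketch below is only one possible way to satisfy those cases; `globToRegExp` and `globMatchSketch` are hypothetical names, and treating `?` as unable to match `/` is an assumption the tests do not cover.

```typescript
// Hypothetical sketch; the plugin's real globMatch may be implemented differently.
function globToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("")
    .map((ch) => {
      if (ch === "*") return "[^/]*"; // any run of characters within one path segment
      if (ch === "?") return "[^/]"; // exactly one character within a path segment (assumption)
      return ch.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    })
    .join("");
  // Anchor both ends so the whole image reference must match.
  return new RegExp(`^${source}$`);
}

function globMatchSketch(pattern: string, image: string): boolean {
  return globToRegExp(pattern).test(image);
}

console.log(globMatchSketch("ghcr.io/paperclipai/*:v1", "ghcr.io/paperclipai/agent-runtime-claude:v1")); // true
console.log(globMatchSketch("ghcr.io/*:v1", "ghcr.io/paperclipai/agent-runtime-claude:v1")); // false
```

Keeping `*` inside a single path segment means an allowlist entry such as `ghcr.io/*:v1` cannot silently admit images from arbitrary nested organisations, which is the property the last `globMatch` test guards.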
|
||||
@@ -0,0 +1,101 @@
|
||||
import { describe, it, expect, vi } from "vitest";
|
||||
import { createJob, deleteJob, getJobStatus, findPodForJob, JobTimeoutError, streamPodLogs, waitForJobCompletion } from "../../src/job-orchestrator.js";
|
||||
|
||||
describe("createJob", () => {
|
||||
it("calls batch.createNamespacedJob with the manifest", async () => {
|
||||
const create = vi.fn().mockResolvedValue({ metadata: { uid: "abc-uid" } });
|
||||
const clients = { batch: { createNamespacedJob: create } };
|
||||
const jobManifest = { apiVersion: "batch/v1", kind: "Job", metadata: { name: "r-1", namespace: "ns" }, spec: { template: {} } };
|
||||
const result = await createJob(clients as never, "ns", jobManifest);
|
||||
expect(create).toHaveBeenCalledWith({ namespace: "ns", body: jobManifest });
|
||||
expect(result.uid).toBe("abc-uid");
|
||||
});
|
||||
});
|
||||
|
||||
describe("getJobStatus", () => {
|
||||
it("returns phase=Succeeded when succeeded count is 1", async () => {
|
||||
const get = vi.fn().mockResolvedValue({ status: { succeeded: 1, conditions: [{ type: "Complete", status: "True" }] } });
|
||||
const clients = { batch: { readNamespacedJobStatus: get } };
|
||||
const status = await getJobStatus(clients as never, "ns", "r-1");
|
||||
expect(status.phase).toBe("Succeeded");
|
||||
expect(status.complete).toBe(true);
|
||||
});
|
||||
|
||||
it("returns phase=Failed when failed count is >0", async () => {
|
||||
const get = vi.fn().mockResolvedValue({ status: { failed: 1, conditions: [{ type: "Failed", status: "True", reason: "DeadlineExceeded" }] } });
|
||||
const clients = { batch: { readNamespacedJobStatus: get } };
|
||||
const status = await getJobStatus(clients as never, "ns", "r-1");
|
||||
expect(status.phase).toBe("Failed");
|
||||
expect(status.reason).toBe("DeadlineExceeded");
|
||||
});
|
||||
|
||||
it("returns phase=Running when active count is >0", async () => {
|
||||
const get = vi.fn().mockResolvedValue({ status: { active: 1 } });
|
||||
const clients = { batch: { readNamespacedJobStatus: get } };
|
||||
const status = await getJobStatus(clients as never, "ns", "r-1");
|
||||
expect(status.phase).toBe("Running");
|
||||
});
|
||||
|
||||
it("returns phase=Pending when no active/succeeded/failed counters set", async () => {
|
||||
const get = vi.fn().mockResolvedValue({ status: {} });
|
||||
const clients = { batch: { readNamespacedJobStatus: get } };
|
||||
const status = await getJobStatus(clients as never, "ns", "r-1");
|
||||
expect(status.phase).toBe("Pending");
|
||||
});
|
||||
});
|
||||
|
||||
describe("findPodForJob", () => {
|
||||
it("lists pods by job-name label and returns the first running pod", async () => {
|
||||
const list = vi.fn().mockResolvedValue({ items: [{ metadata: { name: "r-1-xyz" }, status: { phase: "Running" } }] });
|
||||
const clients = { core: { listNamespacedPod: list } };
|
||||
const podName = await findPodForJob(clients as never, "ns", "r-1");
|
||||
expect(list).toHaveBeenCalledWith(expect.objectContaining({ namespace: "ns", labelSelector: "job-name=r-1" }));
|
||||
expect(podName).toBe("r-1-xyz");
|
||||
});
|
||||
|
||||
it("returns null when no pod is found", async () => {
|
||||
const list = vi.fn().mockResolvedValue({ items: [] });
|
||||
const clients = { core: { listNamespacedPod: list } };
|
||||
const podName = await findPodForJob(clients as never, "ns", "r-1");
|
||||
expect(podName).toBeNull();
|
||||
});
|
||||
});
|
||||
|
||||
describe("deleteJob", () => {
|
||||
it("calls batch.deleteNamespacedJob with foreground propagation", async () => {
|
||||
const del = vi.fn().mockResolvedValue({});
|
||||
const clients = { batch: { deleteNamespacedJob: del } };
|
||||
await deleteJob(clients as never, "ns", "r-1");
|
||||
expect(del).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
namespace: "ns",
|
||||
name: "r-1",
|
||||
propagationPolicy: "Foreground",
|
||||
}),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
describe("streamPodLogs", () => {
|
||||
it("emits pod log response bodies as stdout because Kubernetes pod logs are combined", async () => {
|
||||
const readNamespacedPodLog = vi.fn().mockResolvedValue({ body: "hello\n" });
|
||||
const clients = { core: { readNamespacedPodLog } };
|
||||
const chunks: { stream: "stdout" | "stderr"; text: string }[] = [];
|
||||
await streamPodLogs(clients as never, "ns", "pod-1", async (stream, text) => {
|
||||
chunks.push({ stream, text });
|
||||
});
|
||||
|
||||
expect(readNamespacedPodLog).toHaveBeenCalledWith({ namespace: "ns", name: "pod-1" });
|
||||
expect(chunks).toEqual([{ stream: "stdout", text: "hello\n" }]);
|
||||
});
|
||||
});
|
||||
|
||||
describe("waitForJobCompletion", () => {
|
||||
it("throws JobTimeoutError when the deadline is exceeded", async () => {
|
||||
const get = vi.fn().mockResolvedValue({ status: { active: 1 } });
|
||||
const clients = { batch: { readNamespacedJobStatus: get } };
|
||||
await expect(
|
||||
waitForJobCompletion(clients as never, "ns", "r-1", { timeoutMs: 50, pollMs: 10 }),
|
||||
).rejects.toBeInstanceOf(JobTimeoutError);
|
||||
});
|
||||
});
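Review note: the `getJobStatus` tests above imply a mapping from the `batch/v1` Job status counters to a phase. The sketch below shows one mapping that is consistent with those four cases. `deriveJobStatus`, `JobStatusLike`, and `DerivedStatus` are hypothetical names, and the precedence (succeeded, then failed, then active) is an assumption; the real logic in `src/job-orchestrator.ts` is not shown in this excerpt and may differ, for example for Jobs that are retrying after a failed pod.

```typescript
type Phase = "Pending" | "Running" | "Succeeded" | "Failed";

// Minimal subset of the batch/v1 JobStatus fields used by the tests above.
interface JobStatusLike {
  active?: number;
  succeeded?: number;
  failed?: number;
  conditions?: { type: string; status: string; reason?: string }[];
}

interface DerivedStatus {
  phase: Phase;
  complete: boolean; // true when a Complete condition with status "True" is present
  reason?: string; // reason from the Failed condition, when there is one
}

// Hypothetical sketch: counters decide the phase, conditions supply the details.
function deriveJobStatus(status: JobStatusLike): DerivedStatus {
  const conditions = status.conditions ?? [];
  const trueCondition = (type: string) =>
    conditions.find((c) => c.type === type && c.status === "True");
  const complete = trueCondition("Complete") !== undefined;

  if ((status.succeeded ?? 0) > 0) return { phase: "Succeeded", complete };
  if ((status.failed ?? 0) > 0) {
    return { phase: "Failed", complete, reason: trueCondition("Failed")?.reason };
  }
  if ((status.active ?? 0) > 0) return { phase: "Running", complete };
  return { phase: "Pending", complete };
}

console.log(deriveJobStatus({ active: 1 }).phase); // Running
console.log(deriveJobStatus({}).phase); // Pending
```

The four phases line up with the `phase` union on `KubernetesLeaseMetadata`, so a mapping like this could feed lease metadata directly.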
|
||||
@@ -0,0 +1,47 @@
|
||||
import { describe, it, expect, vi } from "vitest";
|
||||
import { KubeConfig } from "@kubernetes/client-node";
|
||||
import { createKubeConfig } from "../../src/kube-client.js";
|
||||
|
||||
describe("createKubeConfig", () => {
|
||||
it("loads from inline kubeconfig string", () => {
|
||||
const yaml = `apiVersion: v1
|
||||
kind: Config
|
||||
clusters:
|
||||
- name: test
|
||||
cluster:
|
||||
server: https://fake.example.com
|
||||
contexts:
|
||||
- name: test
|
||||
context:
|
||||
cluster: test
|
||||
user: test
|
||||
current-context: test
|
||||
users:
|
||||
- name: test
|
||||
user:
|
||||
token: fake-token
|
||||
`;
|
||||
const kc = createKubeConfig({ inCluster: false, kubeconfig: yaml });
|
||||
expect(kc.getCurrentContext()).toBe("test");
|
||||
expect(kc.getCurrentCluster()?.server).toBe("https://fake.example.com");
|
||||
});
|
||||
|
||||
it("loads in-cluster config when inCluster=true", () => {
|
||||
const spy = vi.spyOn(KubeConfig.prototype, "loadFromCluster").mockImplementation(function (this: KubeConfig) {
|
||||
this.loadFromString(`apiVersion: v1
|
||||
kind: Config
|
||||
clusters: [{name: in-cluster, cluster: {server: 'https://kubernetes.default.svc'}}]
|
||||
contexts: [{name: in-cluster, context: {cluster: in-cluster, user: in-cluster}}]
|
||||
current-context: in-cluster
|
||||
users: [{name: in-cluster, user: {token: tok}}]`);
|
||||
});
|
||||
const kc = createKubeConfig({ inCluster: true });
|
||||
expect(spy).toHaveBeenCalledOnce();
|
||||
expect(kc.getCurrentContext()).toBe("in-cluster");
|
||||
spy.mockRestore();
|
||||
});
|
||||
|
||||
it("throws when neither inCluster nor kubeconfig string is provided", () => {
|
||||
expect(() => createKubeConfig({ inCluster: false })).toThrow(/requires/i);
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,65 @@
import { describe, it, expect } from "vitest";
import { buildNetworkPolicyManifests } from "../../src/network-policy.js";

describe("buildNetworkPolicyManifests", () => {
  const baseInput = {
    namespace: "paperclip-acme",
    paperclipServerNamespace: "paperclip",
    egressAllowFqdns: [] as string[],
    egressAllowCidrs: [] as string[],
  };

  it("produces a deny-all + egress allow pair", () => {
    const manifests = buildNetworkPolicyManifests(baseInput);
    expect(manifests).toHaveLength(2);
    expect(manifests[0].metadata.name).toBe("paperclip-deny-all");
    expect(manifests[1].metadata.name).toBe("paperclip-egress-allow");
  });

  it("deny-all has no ingress/egress rules and applies to all pods", () => {
    const [denyAll] = buildNetworkPolicyManifests(baseInput);
    expect(denyAll.spec.podSelector).toEqual({});
    expect(denyAll.spec.policyTypes).toEqual(["Ingress", "Egress"]);
    expect(denyAll.spec.ingress).toBeUndefined();
    expect(denyAll.spec.egress).toBeUndefined();
  });

  it("egress allow includes kube-dns and paperclip-server callback", () => {
    const [, egress] = buildNetworkPolicyManifests(baseInput);
    const rules = egress.spec.egress;
    const dnsRule = rules.find((r: { ports?: { protocol: string; port: number }[] }) =>
      r.ports?.some((p) => p.port === 53),
    );
    expect(dnsRule).toBeDefined();
    const paperclipRule = rules.find((r: { to: { namespaceSelector?: { matchLabels?: Record<string, string> } }[] }) =>
      r.to.some((t) => t.namespaceSelector?.matchLabels?.["kubernetes.io/metadata.name"] === "paperclip"),
    );
    expect(paperclipRule).toBeDefined();
  });

  it("includes user-supplied CIDRs in egress allow", () => {
    const [, egress] = buildNetworkPolicyManifests({ ...baseInput, egressAllowCidrs: ["10.0.0.0/8"] });
    const cidrRule = egress.spec.egress.find((r: { to: { ipBlock?: { cidr: string } }[] }) =>
      r.to.some((t) => t.ipBlock?.cidr === "10.0.0.0/8"),
    );
    expect(cidrRule).toBeDefined();
  });

  it("adds a public HTTPS fallback when standard mode receives FQDN allow-list entries", () => {
    const [, egress] = buildNetworkPolicyManifests({ ...baseInput, egressAllowFqdns: ["api.anthropic.com"] });
    const publicHttpsRule = egress.spec.egress.find((r: { to: { ipBlock?: { cidr: string; except?: string[] } }[]; ports?: { port: number }[] }) =>
      r.to.some((t) => t.ipBlock?.cidr === "0.0.0.0/0") && r.ports?.some((p) => p.port === 443),
    );
    expect(publicHttpsRule).toBeDefined();
    expect(publicHttpsRule.to[0].ipBlock.except).toContain("10.0.0.0/8");
  });

  it("uses paperclip-server pod label selector for callback ingress to paperclip ns", () => {
    const [, egress] = buildNetworkPolicyManifests(baseInput);
    const callbackRule = egress.spec.egress.find((r: { to: { podSelector?: { matchLabels?: Record<string, string> } }[] }) =>
      r.to.some((t) => t.podSelector?.matchLabels?.app === "paperclip-server"),
    );
    expect(callbackRule).toBeDefined();
    expect(callbackRule.ports[0].port).toBe(3100);
  });
});
@@ -0,0 +1,94 @@
import { describe, it, expect } from "vitest";
import plugin from "../../src/plugin.js";

describe("plugin", () => {
  it("exports the kubernetes driver", () => {
    expect(plugin.definition.onEnvironmentAcquireLease).toBeTypeOf("function");
    expect(plugin.definition.onEnvironmentValidateConfig).toBeTypeOf("function");
  });

  it("validateConfig accepts inCluster=true config", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: { inCluster: true },
    });
    expect(result.ok).toBe(true);
  });

  it("validateConfig rejects missing auth", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: {},
    });
    expect(result.ok).toBe(false);
    expect(result.errors?.[0]).toMatch(/requires one of `inCluster`/);
  });

  it("validateConfig normalizes defaults", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: { inCluster: true },
    });
    expect(result.ok).toBe(true);
    expect(result.normalizedConfig).toEqual(
      expect.objectContaining({
        namespacePrefix: "paperclip-",
        egressMode: "standard",
        jobTtlSecondsAfterFinished: 900,
        podActivityDeadlineSec: 3600,
        adapterType: "claude_local",
        backend: "sandbox-cr", // new default
      }),
    );
  });

  it("validateConfig accepts backend=sandbox-cr explicitly", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: { inCluster: true, backend: "sandbox-cr" },
    });
    expect(result.ok).toBe(true);
    expect(result.normalizedConfig?.backend).toBe("sandbox-cr");
  });

  it("validateConfig accepts backend=job (stable fallback)", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: { inCluster: true, backend: "job" },
    });
    expect(result.ok).toBe(true);
    expect(result.normalizedConfig?.backend).toBe("job");
  });

  it("validateConfig rejects unknown backend value", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: { inCluster: true, backend: "kata-fc" },
    });
    expect(result.ok).toBe(false);
  });

  it("onHealth returns ok", async () => {
    const result = await plugin.definition.onHealth!();
    expect(result.status).toBe("ok");
  });

  it("validateConfig warns about FQDN limitation in standard mode", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: { inCluster: true, adapterType: "claude_local" },
    });
    expect(result.ok).toBe(true);
    expect(result.warnings).toBeDefined();
    expect(result.warnings?.some((w) => w.includes("api.anthropic.com"))).toBe(true);
  });

  it("validateConfig does NOT warn when egressMode is cilium", async () => {
    const result = await plugin.definition.onEnvironmentValidateConfig!({
      driverKey: "kubernetes",
      config: { inCluster: true, adapterType: "claude_local", egressMode: "cilium" },
    });
    expect(result.ok).toBe(true);
    expect(result.warnings).toBeUndefined();
  });
});
@@ -0,0 +1,95 @@
import { describe, it, expect } from "vitest";
import { buildJobManifest } from "../../src/pod-spec-builder.js";

const baseInput = {
  namespace: "paperclip-acme",
  jobName: "r-01h00000000000000000000000",
  adapterType: "claude_local",
  image: "ghcr.io/paperclipai/agent-runtime-claude:v1",
  envSecretName: "r-01h00000000000000000000000-env",
  serviceAccountName: "paperclip-tenant-sa",
  labels: { "paperclip.io/run-id": "r1" },
  resources: { requests: { cpu: "250m", memory: "512Mi" }, limits: { cpu: "2", memory: "4Gi" } },
  runtimeClassName: undefined,
  activeDeadlineSec: 3600,
  ttlSecondsAfterFinished: 900,
};

describe("buildJobManifest", () => {
  it("returns a Job manifest with the correct apiVersion and kind", () => {
    const job = buildJobManifest(baseInput);
    expect(job.apiVersion).toBe("batch/v1");
    expect(job.kind).toBe("Job");
  });

  it("sets Job-level lifecycle controls: backoffLimit=0, ttlSecondsAfterFinished, activeDeadlineSeconds", () => {
    const job = buildJobManifest({ ...baseInput, activeDeadlineSec: 1800, ttlSecondsAfterFinished: 600 });
    expect(job.spec.backoffLimit).toBe(0);
    expect(job.spec.ttlSecondsAfterFinished).toBe(600);
    expect(job.spec.activeDeadlineSeconds).toBe(1800);
  });

  it("sets the security context to non-root, drop ALL caps, read-only rootFS, seccomp RuntimeDefault", () => {
    const job = buildJobManifest(baseInput);
    const podSec = job.spec.template.spec.securityContext;
    expect(podSec.runAsNonRoot).toBe(true);
    expect(podSec.runAsUser).toBe(1000);
    expect(podSec.fsGroupChangePolicy).toBe("OnRootMismatch");
    expect(podSec.seccompProfile.type).toBe("RuntimeDefault");

    const container = job.spec.template.spec.containers[0];
    expect(container.securityContext.runAsNonRoot).toBe(true);
    expect(container.securityContext.readOnlyRootFilesystem).toBe(true);
    expect(container.securityContext.allowPrivilegeEscalation).toBe(false);
    expect(container.securityContext.capabilities.drop).toEqual(["ALL"]);
  });

  it("wraps the entrypoint in tini for PID 1", () => {
    const job = buildJobManifest(baseInput);
    const container = job.spec.template.spec.containers[0];
    expect(container.command).toEqual(["/usr/bin/tini", "--", "/usr/local/bin/paperclip-agent-shim"]);
  });

  it("declares explicit writable emptyDir mounts for the standard agent paths", () => {
    const job = buildJobManifest(baseInput);
    const mounts = job.spec.template.spec.containers[0].volumeMounts;
    const mountPaths = mounts.map((m: { mountPath: string }) => m.mountPath).sort();
    expect(mountPaths).toEqual(["/home/paperclip", "/home/paperclip/.cache", "/tmp", "/workspace"]);

    const volumes = job.spec.template.spec.volumes;
    expect(volumes.every((v: { emptyDir?: unknown }) => v.emptyDir !== undefined)).toBe(true);
  });

  it("envFrom references the per-run secret", () => {
    const job = buildJobManifest(baseInput);
    const envFrom = job.spec.template.spec.containers[0].envFrom;
    expect(envFrom[0].secretRef.name).toBe(baseInput.envSecretName);
  });

  it("applies runtimeClassName when set", () => {
    const job = buildJobManifest({ ...baseInput, runtimeClassName: "kata-fc" });
    expect(job.spec.template.spec.runtimeClassName).toBe("kata-fc");
  });

  it("does not set runtimeClassName when unset", () => {
    const job = buildJobManifest(baseInput);
    expect(job.spec.template.spec.runtimeClassName).toBeUndefined();
  });

  it("sets pod restartPolicy=Never (required for Job)", () => {
    const job = buildJobManifest(baseInput);
    expect(job.spec.template.spec.restartPolicy).toBe("Never");
  });

  it("disables automountServiceAccountToken to avoid exposing an unnecessary SA token", () => {
    const job = buildJobManifest(baseInput);
    expect(job.spec.template.spec.automountServiceAccountToken).toBe(false);
  });

  it("applies the provided labels to both Job metadata and pod template", () => {
    const job = buildJobManifest(baseInput);
    expect(job.metadata.labels["paperclip.io/run-id"]).toBe("r1");
    expect(job.spec.template.metadata.labels["paperclip.io/run-id"]).toBe("r1");
    expect(job.spec.template.metadata.labels["paperclip.io/role"]).toBe("agent");
  });
});
@@ -0,0 +1,137 @@
import { describe, it, expect } from "vitest";
import { buildSandboxCrManifest } from "../../src/sandbox-cr-builder.js";

const baseInput = {
  namespace: "paperclip-acme",
  sandboxName: "pc-01h00000000000000000000000",
  adapterType: "claude_local",
  image: "ghcr.io/paperclipai/agent-runtime-claude:v1",
  envSecretName: "pc-01h00000000000000000000000-env",
  serviceAccountName: "paperclip-tenant-sa",
  labels: { "paperclip.io/run-id": "r1" },
  resources: {
    requests: { cpu: "250m", memory: "512Mi" },
    limits: { cpu: "2", memory: "4Gi" },
  },
  runtimeClassName: undefined,
};

describe("buildSandboxCrManifest", () => {
  it("returns a Sandbox CR with the correct apiVersion and kind", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.apiVersion).toBe("agents.x-k8s.io/v1alpha1");
    expect(cr.kind).toBe("Sandbox");
  });

  it("sets metadata name and namespace correctly", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.metadata.name).toBe(baseInput.sandboxName);
    expect(cr.metadata.namespace).toBe(baseInput.namespace);
  });

  it("does NOT set ownerReferences (out-of-cluster server, explicit release path)", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.metadata.ownerReferences).toBeUndefined();
  });

  it("sets restartPolicy=Always on the pod template (required for long-lived Sandbox pod)", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.spec.podTemplate.spec.restartPolicy).toBe("Always");
  });

  it("uses sleep-infinity entrypoint via Tini for multi-command exec", () => {
    const cr = buildSandboxCrManifest(baseInput);
    const container = cr.spec.podTemplate.spec.containers[0];
    expect(container.command).toEqual([
      "/usr/bin/tini",
      "--",
      "/bin/sh",
      "-c",
      "sleep infinity",
    ]);
  });

  it("applies the same security baseline as Job backend (non-root, drop ALL, RO rootFS, seccomp)", () => {
    const cr = buildSandboxCrManifest(baseInput);
    const podSec = cr.spec.podTemplate.spec.securityContext;
    expect(podSec.runAsNonRoot).toBe(true);
    expect(podSec.runAsUser).toBe(1000);
    expect(podSec.fsGroupChangePolicy).toBe("OnRootMismatch");
    expect(podSec.seccompProfile.type).toBe("RuntimeDefault");

    const container = cr.spec.podTemplate.spec.containers[0];
    expect(container.securityContext.runAsNonRoot).toBe(true);
    expect(container.securityContext.readOnlyRootFilesystem).toBe(true);
    expect(container.securityContext.allowPrivilegeEscalation).toBe(false);
    expect(container.securityContext.capabilities.drop).toEqual(["ALL"]);
  });

  it("disables automountServiceAccountToken", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.spec.podTemplate.spec.automountServiceAccountToken).toBe(false);
  });

  it("declares emptyDir volume mounts for standard agent paths", () => {
    const cr = buildSandboxCrManifest(baseInput);
    const mounts = cr.spec.podTemplate.spec.containers[0].volumeMounts;
    const mountPaths = mounts
      .map((m: { mountPath: string }) => m.mountPath)
      .sort();
    expect(mountPaths).toEqual([
      "/home/paperclip",
      "/home/paperclip/.cache",
      "/tmp",
      "/workspace",
    ]);

    const volumes = cr.spec.podTemplate.spec.volumes;
    expect(
      volumes.every((v: { emptyDir?: unknown }) => v.emptyDir !== undefined),
    ).toBe(true);
  });

  it("envFrom references the per-run secret", () => {
    const cr = buildSandboxCrManifest(baseInput);
    const envFrom = cr.spec.podTemplate.spec.containers[0].envFrom;
    expect(envFrom[0].secretRef.name).toBe(baseInput.envSecretName);
  });

  it("applies runtimeClassName when set", () => {
    const cr = buildSandboxCrManifest({
      ...baseInput,
      runtimeClassName: "kata-fc",
    });
    expect(cr.spec.podTemplate.spec.runtimeClassName).toBe("kata-fc");
  });

  it("does not set runtimeClassName when unset", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.spec.podTemplate.spec.runtimeClassName).toBeUndefined();
  });

  it("applies provided labels to CR metadata and pod template labels (with role=agent added)", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.metadata.labels["paperclip.io/run-id"]).toBe("r1");
    expect(
      cr.spec.podTemplate.metadata.labels["paperclip.io/run-id"],
    ).toBe("r1");
    expect(cr.spec.podTemplate.metadata.labels["paperclip.io/role"]).toBe(
      "agent",
    );
  });

  it("applies imagePullSecrets when provided", () => {
    const cr = buildSandboxCrManifest({
      ...baseInput,
      imagePullSecrets: ["my-pull-secret"],
    });
    expect(cr.spec.podTemplate.spec.imagePullSecrets).toEqual([
      { name: "my-pull-secret" },
    ]);
  });

  it("does not set imagePullSecrets when not provided", () => {
    const cr = buildSandboxCrManifest(baseInput);
    expect(cr.spec.podTemplate.spec.imagePullSecrets).toBeUndefined();
  });
});
@@ -0,0 +1,216 @@
import { describe, it, expect, vi } from "vitest";
import {
  createSandboxCr,
  deleteSandboxCr,
  getSandboxCrStatus,
  findPodForSandbox,
  SandboxCrTimeoutError,
  waitForSandboxReady,
} from "../../src/sandbox-cr-orchestrator.js";

const SANDBOX_GROUP = "agents.x-k8s.io";
const SANDBOX_VERSION = "v1alpha1";
const SANDBOX_PLURAL = "sandboxes";

// Helpers to build mock CR objects with given phase
function makeCr(phase: string, podName?: string): Record<string, unknown> {
  return {
    metadata: { uid: "sandbox-uid-123" },
    status: {
      phase,
      ...(podName ? { podName } : {}),
    },
  };
}

describe("createSandboxCr", () => {
  it("calls custom.createNamespacedCustomObject with the correct params", async () => {
    const create = vi.fn().mockResolvedValue({ metadata: { uid: "test-uid" } });
    const clients = { custom: { createNamespacedCustomObject: create } };
    const manifest = {
      apiVersion: "agents.x-k8s.io/v1alpha1",
      kind: "Sandbox",
      metadata: { name: "pc-abc", namespace: "paperclip-acme" },
    };
    const result = await createSandboxCr(clients as never, "paperclip-acme", manifest);
    expect(create).toHaveBeenCalledWith({
      group: SANDBOX_GROUP,
      version: SANDBOX_VERSION,
      namespace: "paperclip-acme",
      plural: SANDBOX_PLURAL,
      body: manifest,
    });
    expect(result.uid).toBe("test-uid");
  });

  it("throws if the API response has no UID", async () => {
    const create = vi.fn().mockResolvedValue({ metadata: {} });
    const clients = { custom: { createNamespacedCustomObject: create } };
    await expect(
      createSandboxCr(clients as never, "ns", {}),
    ).rejects.toThrow("Sandbox CR created without a UID");
  });
});

describe("getSandboxCrStatus", () => {
  it("maps phase=Ready to SandboxStatus.phase=Running with active=1", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Ready"));
    const clients = { custom: { getNamespacedCustomObject: get } };
    const status = await getSandboxCrStatus(clients as never, "ns", "pc-abc");
    expect(status.phase).toBe("Running");
    expect(status.active).toBe(1);
    expect(status.complete).toBe(false);
  });

  it("maps phase=Pending to SandboxStatus.phase=Pending", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Pending"));
    const clients = { custom: { getNamespacedCustomObject: get } };
    const status = await getSandboxCrStatus(clients as never, "ns", "pc-abc");
    expect(status.phase).toBe("Pending");
    expect(status.active).toBe(0);
  });

  it("maps phase=Failed to SandboxStatus.phase=Failed with failed=1", async () => {
    const get = vi.fn().mockResolvedValue({
      metadata: { uid: "uid-1" },
      status: {
        phase: "Failed",
        conditions: [
          { type: "Failed", reason: "ImagePullFailed", message: "no image" },
        ],
      },
    });
    const clients = { custom: { getNamespacedCustomObject: get } };
    const status = await getSandboxCrStatus(clients as never, "ns", "pc-abc");
    expect(status.phase).toBe("Failed");
    expect(status.failed).toBe(1);
    expect(status.reason).toBe("ImagePullFailed");
  });

  it("maps phase=Terminating to SandboxStatus.phase=Running with reason=Terminating", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Terminating"));
    const clients = { custom: { getNamespacedCustomObject: get } };
    const status = await getSandboxCrStatus(clients as never, "ns", "pc-abc");
    expect(status.phase).toBe("Running");
    expect(status.reason).toBe("Terminating");
  });
});

describe("findPodForSandbox", () => {
  it("returns status.podName from the Sandbox CR when set", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Ready", "pc-abc-pod-xyz"));
    const clients = {
      custom: { getNamespacedCustomObject: get },
      core: { listNamespacedPod: vi.fn() },
    };
    const podName = await findPodForSandbox(clients as never, "ns", "pc-abc");
    expect(podName).toBe("pc-abc-pod-xyz");
    // Should NOT have called listNamespacedPod (primary path succeeded)
    expect(clients.core.listNamespacedPod).not.toHaveBeenCalled();
  });

  it("falls back to pod listing when status.podName is absent", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Pending")); // no podName
    const list = vi.fn().mockResolvedValue({
      items: [
        {
          metadata: { name: "pc-abc-001", labels: { "paperclip.io/managed-by": "paperclip-k8s-plugin" } },
          status: { phase: "Running" },
        },
      ],
    });
    const clients = {
      custom: { getNamespacedCustomObject: get },
      core: { listNamespacedPod: list },
    };
    const podName = await findPodForSandbox(clients as never, "ns", "pc-abc");
    // name starts with "pc-abc" → matched by prefix heuristic
    expect(podName).toBe("pc-abc-001");
  });

  it("returns null when no pod is found in fallback", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Pending"));
    const list = vi.fn().mockResolvedValue({ items: [] });
    const clients = {
      custom: { getNamespacedCustomObject: get },
      core: { listNamespacedPod: list },
    };
    const podName = await findPodForSandbox(clients as never, "ns", "pc-abc");
    expect(podName).toBeNull();
  });
});

describe("deleteSandboxCr", () => {
  it("calls custom.deleteNamespacedCustomObject with Foreground propagation", async () => {
    const del = vi.fn().mockResolvedValue({});
    const clients = { custom: { deleteNamespacedCustomObject: del } };
    await deleteSandboxCr(clients as never, "ns", "pc-abc");
    expect(del).toHaveBeenCalledWith(
      expect.objectContaining({
        group: SANDBOX_GROUP,
        version: SANDBOX_VERSION,
        namespace: "ns",
        plural: SANDBOX_PLURAL,
        name: "pc-abc",
        propagationPolicy: "Foreground",
      }),
    );
  });
});

describe("waitForSandboxReady", () => {
  it("resolves immediately when Sandbox is already Ready", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Ready"));
    const clients = { custom: { getNamespacedCustomObject: get } };
    const status = await waitForSandboxReady(
      clients as never,
      "ns",
      "pc-abc",
      { timeoutMs: 5000, pollMs: 10 },
    );
    expect(status.phase).toBe("Running"); // Ready maps to Running
    expect(get).toHaveBeenCalledTimes(1);
  });

  it("polls until Ready", async () => {
    const get = vi
      .fn()
      .mockResolvedValueOnce(makeCr("Pending"))
      .mockResolvedValueOnce(makeCr("Pending"))
      .mockResolvedValueOnce(makeCr("Ready"));
    const clients = { custom: { getNamespacedCustomObject: get } };
    const status = await waitForSandboxReady(
      clients as never,
      "ns",
      "pc-abc",
      { timeoutMs: 5000, pollMs: 10 },
    );
    expect(status.phase).toBe("Running");
    expect(get).toHaveBeenCalledTimes(3);
  });

  it("throws SandboxCrTimeoutError when deadline is exceeded", async () => {
    const get = vi.fn().mockResolvedValue(makeCr("Pending"));
    const clients = { custom: { getNamespacedCustomObject: get } };
    await expect(
      waitForSandboxReady(clients as never, "ns", "pc-abc", {
        timeoutMs: 50,
        pollMs: 10,
      }),
    ).rejects.toBeInstanceOf(SandboxCrTimeoutError);
  });

  it("throws an error describing the failure when Sandbox fails", async () => {
    const get = vi.fn().mockResolvedValue({
      metadata: { uid: "u1" },
      status: { phase: "Failed", conditions: [{ type: "Failed", reason: "OOMKilled" }] },
    });
    const clients = { custom: { getNamespacedCustomObject: get } };
    await expect(
      waitForSandboxReady(clients as never, "ns", "pc-abc", {
        timeoutMs: 5000,
        pollMs: 10,
      }),
    ).rejects.toThrow(/failed.*OOMKilled/i);
  });
});
@@ -0,0 +1,68 @@
import { describe, it, expect, vi } from "vitest";
import { createPerRunSecret } from "../../src/secret-manager.js";

describe("createPerRunSecret", () => {
  const baseInput = {
    namespace: "paperclip-acme",
    secretName: "r-abcd-env",
    runId: "r-abcd",
    ownerKind: "Job",
    ownerApiVersion: "batch/v1",
    ownerName: "r-abcd",
    ownerUid: "11111111-1111-1111-1111-111111111111",
    bootstrapToken: "tok-xyz",
    adapterEnv: { ANTHROPIC_API_KEY: "sk-test" },
  };

  it("creates a Secret with the correct name and namespace", async () => {
    const created: { body: Record<string, unknown> }[] = [];
    const clients = {
      core: { createNamespacedSecret: vi.fn(async (args: { body: Record<string, unknown> }) => { created.push(args); }) },
    };
    await createPerRunSecret(clients as never, baseInput);
    expect(clients.core.createNamespacedSecret).toHaveBeenCalledOnce();
    const body = created[0].body as { metadata: { name: string; namespace: string } };
    expect(body.metadata.name).toBe("r-abcd-env");
    expect(body.metadata.namespace).toBe("paperclip-acme");
  });

  it("includes BOOTSTRAP_TOKEN and adapter env keys in stringData", async () => {
    const created: { body: Record<string, unknown> }[] = [];
    const clients = {
      core: { createNamespacedSecret: vi.fn(async (args: { body: Record<string, unknown> }) => { created.push(args); }) },
    };
    await createPerRunSecret(clients as never, baseInput);
    const body = created[0].body as { stringData: Record<string, string> };
    expect(body.stringData.BOOTSTRAP_TOKEN).toBe("tok-xyz");
    expect(body.stringData.ANTHROPIC_API_KEY).toBe("sk-test");
  });

  it("sets ownerReferences to the owner resource for cascade delete", async () => {
    const created: { body: Record<string, unknown> }[] = [];
    const clients = {
      core: { createNamespacedSecret: vi.fn(async (args: { body: Record<string, unknown> }) => { created.push(args); }) },
    };
    await createPerRunSecret(clients as never, baseInput);
    const body = created[0].body as { metadata: { ownerReferences: { uid: string; controller: boolean }[] } };
    expect(body.metadata.ownerReferences).toHaveLength(1);
    expect(body.metadata.ownerReferences[0].uid).toBe("11111111-1111-1111-1111-111111111111");
    expect(body.metadata.ownerReferences[0].controller).toBe(true);
  });

  it("throws if adapterEnv contains BOOTSTRAP_TOKEN", async () => {
    const clients = { core: { createNamespacedSecret: vi.fn() } };
    await expect(
      createPerRunSecret(clients as never, {
        ...baseInput,
        adapterEnv: { BOOTSTRAP_TOKEN: "evil" },
      }),
    ).rejects.toThrow(/BOOTSTRAP_TOKEN/);
  });

  it("throws if ownerUid is empty", async () => {
    const clients = { core: { createNamespacedSecret: vi.fn() } };
    await expect(
      createPerRunSecret(clients as never, { ...baseInput, ownerUid: "" }),
    ).rejects.toThrow(/ownerUid/);
  });
});
@@ -0,0 +1,153 @@
import { describe, it, expect, vi } from "vitest";
import { ensureTenant } from "../../src/tenant-orchestrator.js";

function makeMockClients() {
  const calls: { kind: string; name: string; namespace?: string; body?: unknown }[] = [];
  function track(kind: string) {
    return vi.fn(async (...args: unknown[]) => {
      const arg = (args[0] ?? {}) as { name?: string; namespace?: string; body?: unknown };
      calls.push({ kind, name: arg.name ?? "", namespace: arg.namespace, body: arg.body });
      return { body: arg.body };
    });
  }
  return {
    calls,
    core: {
      createNamespace: track("Namespace"),
      readNamespacedServiceAccount: vi.fn().mockRejectedValue({ code: 404 }),
      createNamespacedServiceAccount: track("ServiceAccount"),
      replaceNamespacedServiceAccount: track("ServiceAccountReplace"),
      readNamespacedResourceQuota: vi.fn().mockRejectedValue({ code: 404 }),
      createNamespacedResourceQuota: track("ResourceQuota"),
      replaceNamespacedResourceQuota: track("ResourceQuotaReplace"),
      readNamespacedLimitRange: vi.fn().mockRejectedValue({ code: 404 }),
      createNamespacedLimitRange: track("LimitRange"),
      replaceNamespacedLimitRange: track("LimitRangeReplace"),
      readNamespace: vi.fn().mockRejectedValue({ code: 404 }),
    },
    rbac: {
      readNamespacedRole: vi.fn().mockRejectedValue({ code: 404 }),
      createNamespacedRole: track("Role"),
      replaceNamespacedRole: track("RoleReplace"),
      readNamespacedRoleBinding: vi.fn().mockRejectedValue({ code: 404 }),
      createNamespacedRoleBinding: track("RoleBinding"),
      replaceNamespacedRoleBinding: track("RoleBindingReplace"),
    },
    networking: {
      readNamespacedNetworkPolicy: vi.fn().mockRejectedValue({ code: 404 }),
      createNamespacedNetworkPolicy: track("NetworkPolicy"),
      replaceNamespacedNetworkPolicy: track("NetworkPolicyReplace"),
      deleteNamespacedNetworkPolicy: vi.fn().mockRejectedValue({ code: 404 }),
    },
    custom: {
      getNamespacedCustomObject: vi.fn().mockRejectedValue({ code: 404 }),
      createNamespacedCustomObject: track("CiliumNetworkPolicy"),
      replaceNamespacedCustomObject: track("CiliumNetworkPolicyReplace"),
      deleteNamespacedCustomObject: vi.fn().mockRejectedValue({ code: 404 }),
    },
  };
}

describe("ensureTenant", () => {
  const baseInput = {
    namespace: "paperclip-acme",
    companyId: "11111111-1111-1111-1111-111111111111",
    paperclipServerNamespace: "paperclip",
    serviceAccountAnnotations: {},
    egressMode: "standard" as const,
    egressAllowFqdns: ["api.anthropic.com"],
    egressAllowCidrs: [] as string[],
    resourceQuota: { pods: "20", requestsCpu: "5", requestsMemory: "20Gi", limitsCpu: "20", limitsMemory: "80Gi" },
  };

  it("creates all required resources in the correct order on a fresh tenant", async () => {
    const clients = makeMockClients();
    await ensureTenant(clients as never, baseInput);
    const order = clients.calls.map((c) => c.kind);
    expect(order).toEqual([
      "Namespace",
      "ServiceAccount",
      "Role",
      "RoleBinding",
      "ResourceQuota",
      "LimitRange",
      "NetworkPolicy",
      "NetworkPolicy",
    ]);
  });

  it("creates a CiliumNetworkPolicy instead of standard egress when egressMode=cilium", async () => {
    const clients = makeMockClients();
    await ensureTenant(clients as never, { ...baseInput, egressMode: "cilium" });
    const cnpCall = clients.calls.find((c) => c.kind === "CiliumNetworkPolicy");
    expect(cnpCall).toBeDefined();
    const npCalls = clients.calls.filter((c) => c.kind === "NetworkPolicy");
    expect(npCalls).toHaveLength(1);
    expect((npCalls[0].body as { metadata: { name: string } }).metadata.name).toBe("paperclip-deny-all");
  });

  it("applies serviceAccountAnnotations to the ServiceAccount", async () => {
    const clients = makeMockClients();
    await ensureTenant(clients as never, {
      ...baseInput,
      serviceAccountAnnotations: { "eks.amazonaws.com/role-arn": "arn:aws:iam::123:role/paperclip" },
    });
    const saCall = clients.calls.find((c) => c.kind === "ServiceAccount");
    const sa = saCall!.body as { metadata: { annotations: Record<string, string> } };
    expect(sa.metadata.annotations["eks.amazonaws.com/role-arn"]).toBe("arn:aws:iam::123:role/paperclip");
  });

  it("does not recreate a namespace that already exists", async () => {
    const clients = makeMockClients();
    clients.core.readNamespace.mockResolvedValue({ body: { metadata: { name: baseInput.namespace } } });
    await ensureTenant(clients as never, baseInput);
    expect(clients.core.createNamespace).not.toHaveBeenCalled();
  });

  it("reconciles existing managed resources with the latest desired manifests", async () => {
    const clients = makeMockClients();
    const existing = { metadata: { resourceVersion: "rv-1" } };
    clients.core.readNamespace.mockResolvedValue({ metadata: { name: baseInput.namespace } });
    clients.core.readNamespacedServiceAccount.mockResolvedValue(existing);
    clients.rbac.readNamespacedRole.mockResolvedValue(existing);
    clients.rbac.readNamespacedRoleBinding.mockResolvedValue(existing);
    clients.core.readNamespacedResourceQuota.mockResolvedValue(existing);
    clients.core.readNamespacedLimitRange.mockResolvedValue(existing);
    clients.networking.readNamespacedNetworkPolicy.mockResolvedValue(existing);

    await ensureTenant(clients as never, {
      ...baseInput,
      serviceAccountAnnotations: { "eks.amazonaws.com/role-arn": "arn:aws:iam::123:role/paperclip" },
      resourceQuota: { ...baseInput.resourceQuota, pods: "25" },
    });

    expect(clients.core.replaceNamespacedServiceAccount).toHaveBeenCalledWith(
      expect.objectContaining({
|
||||
body: expect.objectContaining({
|
||||
metadata: expect.objectContaining({
|
||||
annotations: { "eks.amazonaws.com/role-arn": "arn:aws:iam::123:role/paperclip" },
|
||||
resourceVersion: "rv-1",
|
||||
}),
|
||||
}),
|
||||
}),
|
||||
);
|
||||
expect(clients.core.replaceNamespacedResourceQuota).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
body: expect.objectContaining({
|
||||
metadata: expect.objectContaining({ resourceVersion: "rv-1" }),
|
||||
spec: expect.objectContaining({ hard: expect.objectContaining({ pods: "25" }) }),
|
||||
}),
|
||||
}),
|
||||
);
|
||||
expect(clients.networking.replaceNamespacedNetworkPolicy).toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("removes stale standard egress NetworkPolicy when cilium mode is selected", async () => {
|
||||
const clients = makeMockClients();
|
||||
await ensureTenant(clients as never, { ...baseInput, egressMode: "cilium" });
|
||||
expect(clients.networking.deleteNamespacedNetworkPolicy).toHaveBeenCalledWith({
|
||||
namespace: baseInput.namespace,
|
||||
name: "paperclip-egress-allow",
|
||||
});
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,39 @@
import { describe, it, expect } from "vitest";
import { kubernetesProviderConfigSchema, parseKubernetesProviderConfig } from "../../src/types.js";

describe("kubernetesProviderConfigSchema", () => {
  it("accepts inCluster=true with no kubeconfig", () => {
    const parsed = parseKubernetesProviderConfig({ inCluster: true });
    expect(parsed.inCluster).toBe(true);
    expect(parsed.namespacePrefix).toBe("paperclip-");
    expect(parsed.imageAllowList).toEqual([]);
    expect(parsed.egressMode).toBe("standard");
    expect(parsed.jobTtlSecondsAfterFinished).toBe(900);
  });

  it("accepts inline kubeconfig", () => {
    const parsed = parseKubernetesProviderConfig({
      inCluster: false,
      kubeconfig: "apiVersion: v1\nkind: Config\n",
    });
    expect(parsed.kubeconfig).toContain("apiVersion");
  });

  it("rejects when neither inCluster nor any kubeconfig source is set", () => {
    expect(() => parseKubernetesProviderConfig({ inCluster: false })).toThrow(
      /requires one of `inCluster` or `kubeconfig`/,
    );
  });

  it("rejects invalid companySlug", () => {
    expect(() =>
      parseKubernetesProviderConfig({ inCluster: true, companySlug: "INVALID UPPER" }),
    ).toThrow();
  });

  it("rejects egressAllowCidrs entries that are not valid CIDR", () => {
    expect(() =>
      parseKubernetesProviderConfig({ inCluster: true, egressAllowCidrs: ["not-a-cidr"] }),
    ).toThrow(/CIDR/i);
  });
});
@@ -0,0 +1,42 @@
import { describe, it, expect } from "vitest";
import { deriveCompanySlug, deriveNamespaceName, newRunUlidDns, paperclipLabels } from "../../src/utils.js";

describe("deriveCompanySlug", () => {
  it("lowercases and replaces non-alphanumerics", () => {
    expect(deriveCompanySlug("Acme Co!")).toBe("acme-co");
  });

  it("truncates to 32 chars and strips trailing dashes", () => {
    expect(deriveCompanySlug("A".repeat(50))).toBe("a".repeat(32));
    expect(deriveCompanySlug("ab---")).toBe("ab");
  });

  it("falls back to 'company' on empty/zero-letter input", () => {
    expect(deriveCompanySlug("!!!")).toBe("company");
    expect(deriveCompanySlug("")).toBe("company");
  });
});

describe("deriveNamespaceName", () => {
  it("concatenates prefix and slug", () => {
    expect(deriveNamespaceName("paperclip-", "acme-co")).toBe("paperclip-acme-co");
  });
});

describe("newRunUlidDns", () => {
  it("produces a DNS-safe 26-char lowercase id", () => {
    const id = newRunUlidDns();
    expect(id).toMatch(/^[a-z0-9]{26}$/);
  });
});

describe("paperclipLabels", () => {
  it("returns canonical label map", () => {
    const labels = paperclipLabels({ runId: "r1", agentId: "a1", companyId: "c1", adapterType: "claude_local" });
    expect(labels["paperclip.io/run-id"]).toBe("r1");
    expect(labels["paperclip.io/agent-id"]).toBe("a1");
    expect(labels["paperclip.io/company-id"]).toBe("c1");
    expect(labels["paperclip.io/adapter"]).toBe("claude_local");
    expect(labels["paperclip.io/managed-by"]).toBe("paperclip-k8s-plugin");
  });
});
@@ -0,0 +1,11 @@
{
  "extends": "../../../../tsconfig.json",
  "compilerOptions": {
    "outDir": "dist",
    "rootDir": "src",
    "lib": ["ES2023"],
    "types": ["node"]
  },
  "include": ["src"],
  "exclude": ["src/**/*.test.ts"]
}
@@ -0,0 +1,12 @@
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    include: [
      "test/unit/**/*.test.ts",
      ...(process.env.RUN_K8S_INTEGRATION_TESTS === "1" ? ["test/integration/**/*.test.ts"] : []),
    ],
    testTimeout: process.env.RUN_K8S_INTEGRATION_TESTS === "1" ? 120_000 : 5_000,
    environment: "node",
  },
});
@@ -94,6 +94,11 @@
      "name": "@paperclipai/plugin-daytona",
      "publishFromCi": true
    },
    {
      "dir": "packages/plugins/sandbox-providers/kubernetes",
      "name": "@paperclipai/plugin-kubernetes",
      "publishFromCi": false
    },
    {
      "dir": "packages/plugins/sandbox-providers/exe-dev",
      "name": "@paperclipai/plugin-exe-dev",