[codex] Sandbox dynamic adapter UI parsers (#4225)

## Thinking Path

> - Paperclip is a control plane for AI-agent companies.
> - External adapters can provide UI parser code that the board loads
dynamically for run transcript rendering.
> - Running adapter-provided parser code directly in the board page
gives that parser access to same-origin browser state.
> - This PR narrows that surface by evaluating dynamically loaded
external adapter UI parser code in a dedicated browser Web Worker with a
constrained postMessage protocol.
> - The worker here is a frontend isolation boundary for adapter UI
parser JavaScript; it is not Paperclip's server plugin-worker system and
it is not a server-side job runner.
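
As a rough illustration of the request/response correlation a protocol like this relies on, the sketch below tracks pending parse requests by numeric id and settles any stragglers with an empty result when an error message arrives. `PendingRequests`, `Entry`, and the sample messages are hypothetical names chosen for this example; they are not the PR's actual code.

```typescript
// Hypothetical, simplified entry shape for illustration only.
type Entry = { kind: string; text: string };

// Mirrors the shape of the worker-to-main messages described in this PR.
type SandboxResponse =
  | { type: "ready" }
  | { type: "error"; message: string }
  | { type: "result"; id: number; entries: Entry[] };

class PendingRequests {
  private nextId = 1;
  private resolvers = new Map<number, (entries: Entry[]) => void>();

  /** Store a resolver and return the id to send along with the request. */
  register(resolve: (entries: Entry[]) => void): number {
    const id = this.nextId++;
    this.resolvers.set(id, resolve);
    return id;
  }

  /** Route a response to its resolver; an error settles everything pending. */
  handle(resp: SandboxResponse): void {
    if (resp.type === "result") {
      const resolve = this.resolvers.get(resp.id);
      if (resolve) {
        this.resolvers.delete(resp.id);
        resolve(resp.entries);
      }
    } else if (resp.type === "error") {
      this.drain();
    }
  }

  /** Settle every outstanding request with an empty result. */
  drain(): void {
    for (const resolve of this.resolvers.values()) resolve([]);
    this.resolvers.clear();
  }

  get size(): number {
    return this.resolvers.size;
  }
}

const pending = new PendingRequests();
const settled: Entry[][] = [];
const idA = pending.register((entries) => settled.push(entries));
const idB = pending.register((entries) => settled.push(entries));

pending.handle({ type: "result", id: idA, entries: [{ kind: "stdout", text: "ok" }] });
pending.handle({ type: "error", message: "parser crashed" });
```

Draining with `[]` rather than rejecting keeps callers on a single code path: a lost request looks like a line that produced no entries.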

## What Changed

- Runs dynamically loaded external adapter UI parsers inside a dedicated
Web Worker instead of importing/evaluating them directly in the board
page.
- Adds a narrow postMessage protocol for parser initialization and line
parsing.
- Caches completed async parse results and notifies the adapter registry
so transcript recomputation can synchronously drain the final parsed
line.
- Disables common worker network, persistence, child worker, Blob/object
URL, and WebRTC escape APIs inside the parser worker bootstrap.
- Handles worker error messages after initialization and drains pending
callbacks on worker termination or mid-session worker error.
- Adds focused regression coverage for the parser worker lockdown and
unused protocol removal.
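
The cache-and-notify behaviour in the list above can be sketched without a real Web Worker. This is a minimal illustration under stated assumptions: `FakeWorkerChannel` is a hypothetical stand-in that queues requests until `flush()` is called, and `makeSyncParser`, `Entry`, and the trimming "parser" are invented for the example.

```typescript
// Hypothetical, simplified entry shape for illustration only.
type Entry = { kind: string; ts: string; text: string };
type Reply = (entries: Entry[]) => void;

/** Stand-in for the worker round trip: replies are delivered on flush(). */
class FakeWorkerChannel {
  private queue: Array<{ line: string; ts: string; reply: Reply }> = [];
  requestCount = 0;

  request(line: string, ts: string, reply: Reply): void {
    this.requestCount += 1;
    this.queue.push({ line, ts, reply });
  }

  flush(): void {
    const batch = this.queue;
    this.queue = [];
    for (const { line, ts, reply } of batch) {
      reply([{ kind: "stdout", ts, text: line.trim() }]);
    }
  }
}

/** Synchronous wrapper: returns [] on a miss and the cached entries afterwards. */
function makeSyncParser(channel: FakeWorkerChannel, onResult: () => void) {
  const cache = new Map<string, Entry[]>();
  const pending = new Set<string>();
  return (line: string, ts: string): Entry[] => {
    const key = `${ts}\u0000${line}`;
    const hit = cache.get(key);
    if (hit) return hit.slice();
    if (!pending.has(key)) {
      pending.add(key); // avoid duplicate requests for the same line
      channel.request(line, ts, (entries) => {
        pending.delete(key);
        cache.set(key, entries);
        onResult(); // ask the caller to recompute the transcript
      });
    }
    return [];
  };
}

const channel = new FakeWorkerChannel();
let notifications = 0;
const parseStdoutLine = makeSyncParser(channel, () => { notifications += 1; });

const firstPass = parseStdoutLine("  hello  ", "t0");  // cache miss
const repeatPass = parseStdoutLine("  hello  ", "t0"); // still pending
channel.flush();                                        // simulated worker reply
const secondPass = parseStdoutLine("  hello  ", "t0"); // served from cache
```

The notifier is what turns the cached result into a visible update: without it, nothing would prompt the second, synchronous call.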

## Verification

- `pnpm exec vitest run --config ui/vitest.config.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm exec tsc --noEmit --target es2021 --moduleResolution bundler
--module esnext --jsx react-jsx --lib dom,es2021 --skipLibCheck
ui/src/adapters/dynamic-loader.ts
ui/src/adapters/sandboxed-parser-worker.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm --filter @paperclipai/ui typecheck` was attempted; it reached
existing unrelated failures in HeartbeatRun test/storybook fixtures and
missing Storybook type resolution, with no adapter-module errors
surfaced.
- PR #4225 checks on current head `34c9da00`: `policy`, `e2e`, `verify`,
`security/snyk`, and `Greptile Review` are all `SUCCESS`.
- Greptile Review on current head `34c9da00` reached 5/5.

## Risks

- Medium risk: parser execution is now asynchronous through a worker
while the existing parser interface is synchronous, so transcript
updates should be watched with external adapters.
- Some adapter parser bundles may rely on direct ESM `export` syntax or
browser APIs that are no longer available inside the worker lockdown.
- The worker lockdown is a hardening layer around external parser code,
not a complete browser security sandbox for arbitrary untrusted
applications.
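
To make the `export` risk concrete, the sketch below evaluates two parser sources through a `Function` constructor with CJS-style shims, in the same spirit as the worker bootstrap. `tryEvaluate` and both sample sources are hypothetical and only illustrate the failure mode; they are not taken from any real adapter bundle.

```typescript
type ParserExports = {
  parseStdoutLine?: (line: string, ts: string) => unknown[];
};

type EvalResult = { ok: boolean; error?: string; exports: ParserExports };

/** Evaluate parser source as a strict-mode function body with CJS shims. */
function tryEvaluate(source: string): EvalResult {
  const exports: ParserExports = {};
  const module = { exports };
  try {
    // Raw ESM syntax fails here, at construction time, before any code runs.
    const factory = new Function(
      "exports",
      "module",
      '"use strict";\n{\n' + source + "\n}",
    );
    factory(exports, module);
    return { ok: true, exports: module.exports };
  } catch (err) {
    return {
      ok: false,
      error: err instanceof Error ? err.name : String(err),
      exports: {},
    };
  }
}

// CJS-style bundle: assigns to the injected `exports` object.
const cjsStyle = tryEvaluate(
  'exports.parseStdoutLine = (line, ts) => [{ kind: "stdout", ts, text: line }];',
);

// Raw ESM bundle: `export` is only valid at module top level.
const esmStyle = tryEvaluate(
  "export function parseStdoutLine(line, ts) { return []; }",
);
```

Under these assumptions, an adapter that ships raw ESM would need to compile its parser to a CJS or IIFE form that assigns to `exports` or `module.exports`.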

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
This commit is contained in:
Dotta
2026-04-21 13:42:44 -05:00
committed by GitHub
parent a26e1288b6
commit 73ef40e7be
4 changed files with 445 additions and 73 deletions
+235 -72
@@ -1,122 +1,285 @@
/**
* Dynamic UI parser loading for external adapters.
* Dynamic UI parser loading for external adapters — sandboxed execution.
*
* When the Paperclip UI encounters an adapter type that doesn't have a
* built-in parser (e.g., an external adapter loaded via the plugin system),
* it fetches the parser JS from `/api/adapters/:type/ui-parser.js` and
* evaluates it to create a `parseStdoutLine` function.
* executes it **inside a dedicated Web Worker** so it cannot access the
* board UI's same-origin state (cookies, localStorage, DOM, authenticated
* fetch, etc.).
*
* The parser module must export:
* - `parseStdoutLine(line: string, ts: string): TranscriptEntry[]`
* - optionally `createStdoutParser(): { parseLine, reset }` for stateful parsers
* The worker communicates via a narrow postMessage protocol:
* Main → Worker: { type: "init", source }
* Worker → Main: { type: "ready" } | { type: "error", message }
* Main → Worker: { type: "parse", id, line, ts }
* Worker → Main: { type: "result", id, entries }
*
* This is the bridge between the server-side plugin system and the client-side
* UI rendering. Adapter developers ship a `dist/ui-parser.js` with zero
* runtime dependencies, and Paperclip's UI loads it on demand.
* Because the parse call is async (cross-thread postMessage), but the
* existing `parseStdoutLine` contract is synchronous, we cache completed
* worker results and ask the adapter registry to recompute transcripts when
* a new result arrives.
*
* **Synchronous fast-path**: After init, parse requests are sent to the
* worker, which responds asynchronously. The `parseStdoutLine` wrapper
* returns cached results synchronously on the next transcript recomputation.
* In practice this is expected to add roughly one frame of latency.
*
* Security: see `sandboxed-parser-worker.ts` for the full lockdown.
*/
import type { TranscriptEntry } from "@paperclipai/adapter-utils";
import type { StatefulStdoutParser, StdoutLineParser, StdoutParserFactory } from "./types";
import type { StdoutLineParser, StdoutParserFactory } from "./types";
import { createSandboxedWorker } from "./sandboxed-parser-worker";
import type { SandboxRequest, SandboxResponse } from "./sandboxed-parser-worker";
// ── Types ───────────────────────────────────────────────────────────────────
interface DynamicParserModule {
parseStdoutLine: StdoutLineParser;
createStdoutParser?: StdoutParserFactory;
}
// Cache of dynamically loaded parsers by adapter type.
// Once loaded, the parser is reused for all runs of that adapter type.
interface SandboxedParser {
worker: Worker;
ready: boolean;
nextId: number;
pendingResolves: Map<number, (entries: TranscriptEntry[]) => void>;
}
// ── State ───────────────────────────────────────────────────────────────────
/** Cache of fully initialised sandboxed parsers by adapter type. */
const sandboxedParsers = new Map<string, SandboxedParser>();
/** Cache of the public DynamicParserModule wrappers. */
const dynamicParserCache = new Map<string, DynamicParserModule>();
// Track which types we've already attempted to load (to avoid repeat 404s).
/** Track which types we've already attempted to load (to avoid repeat 404s). */
const failedLoads = new Set<string>();
/** In-flight init promises so concurrent callers share the same load. */
const loadPromises = new Map<string, Promise<DynamicParserModule | null>>();
let resultNotifier: (() => void) | null = null;
export function setDynamicParserResultNotifier(fn: (() => void) | null): void {
resultNotifier = fn;
}
// ── Internal helpers ────────────────────────────────────────────────────────
function sendToWorker(sandbox: SandboxedParser, msg: SandboxRequest): void {
sandbox.worker.postMessage(msg);
}
function nextRequestId(sandbox: SandboxedParser): number {
return sandbox.nextId++;
}
function lineCacheKey(line: string, ts: string): string {
return `${ts}\u0000${line}`;
}
function notifyResultReady(): void {
resultNotifier?.();
}
/**
* Dynamically load a UI parser for an adapter type from the server API.
* Parse a single line asynchronously by delegating to the worker.
* Returns a Promise that resolves with the TranscriptEntry[] from the worker.
*/
function parseLineAsync(sandbox: SandboxedParser, line: string, ts: string): Promise<TranscriptEntry[]> {
return new Promise((resolve) => {
const id = nextRequestId(sandbox);
sandbox.pendingResolves.set(id, resolve);
sendToWorker(sandbox, { type: "parse", id, line, ts });
});
}
function drainPendingRequests(sandbox: SandboxedParser): void {
for (const resolver of sandbox.pendingResolves.values()) {
resolver([]);
}
sandbox.pendingResolves.clear();
}
/**
* Create a sandboxed worker, send the parser source, and wait for init.
*/
function initSandboxedWorker(source: string): Promise<SandboxedParser> {
return new Promise((resolve, reject) => {
const worker = createSandboxedWorker();
const sandbox: SandboxedParser = {
worker,
ready: false,
nextId: 1,
pendingResolves: new Map(),
};
// Timeout if the worker doesn't respond within 5s
const timeout = setTimeout(() => {
drainPendingRequests(sandbox);
worker.terminate();
reject(new Error("Parser worker init timed out"));
}, 5000);
worker.onmessage = (e: MessageEvent<SandboxResponse>) => {
const msg = e.data;
if (msg.type === "ready") {
clearTimeout(timeout);
sandbox.ready = true;
// Switch to the steady-state message handler.
worker.onmessage = (ev: MessageEvent<SandboxResponse>) => {
const resp = ev.data;
if (resp.type === "result") {
const resolver = sandbox.pendingResolves.get(resp.id);
if (resolver) {
sandbox.pendingResolves.delete(resp.id);
resolver(resp.entries as TranscriptEntry[]);
}
} else if (resp.type === "error") {
console.error("[adapter-ui-loader] Worker reported error:", resp.message);
drainPendingRequests(sandbox);
}
};
resolve(sandbox);
return;
}
if (msg.type === "error") {
clearTimeout(timeout);
drainPendingRequests(sandbox);
worker.terminate();
reject(new Error(msg.message));
return;
}
};
worker.onerror = (ev) => {
clearTimeout(timeout);
drainPendingRequests(sandbox);
worker.terminate();
reject(new Error(`Worker error: ${ev.message}`));
};
// Send the parser source to the worker for evaluation.
sendToWorker(sandbox, { type: "init", source });
});
}
/**
* Build a DynamicParserModule that delegates all calls to the sandboxed worker.
*
* Fetches `/api/adapters/:type/ui-parser.js`, evaluates the module source
* in a scoped context, and extracts the `parseStdoutLine` export.
* The parseStdoutLine wrapper is **synchronous** to match the existing contract.
* Cache misses send a parse request to the worker and return `[]`; when the
* worker responds, the registry notification path recomputes transcripts and
* this wrapper returns the cached result synchronously.
*
* @returns A StdoutLineParser function, or null if unavailable.
* In practice, because the existing codebase already handles the "bridge"
* pattern where parseStdoutLine returns [] until the dynamic parser loads,
* the same UX applies here: the first render may show raw lines, and a
* subsequent render shows the parsed entries.
*/
function buildParserModule(sandbox: SandboxedParser): DynamicParserModule {
const parseCache = new Map<string, TranscriptEntry[]>();
const pendingParseKeys = new Set<string>();
const parseStdoutLine: StdoutLineParser = (line: string, ts: string) => {
const key = lineCacheKey(line, ts);
const cached = parseCache.get(key);
if (cached) return cached.slice();
if (!pendingParseKeys.has(key)) {
pendingParseKeys.add(key);
parseLineAsync(sandbox, line, ts).then((entries) => {
pendingParseKeys.delete(key);
parseCache.set(key, entries);
notifyResultReady();
});
}
return [];
};
return { parseStdoutLine };
}
// ── Public API ──────────────────────────────────────────────────────────────
/**
* Dynamically load a UI parser for an adapter type from the server API,
* executing it inside a sandboxed Web Worker.
*
* @returns A DynamicParserModule, or null if unavailable.
*/
export async function loadDynamicParser(adapterType: string): Promise<DynamicParserModule | null> {
// Return cached parser if already loaded
// Return cached parser if already loaded.
const cached = dynamicParserCache.get(adapterType);
if (cached) return cached;
// Don't retry types that previously 404'd
// Don't retry types that previously failed.
if (failedLoads.has(adapterType)) return null;
try {
const response = await fetch(`/api/adapters/${encodeURIComponent(adapterType)}/ui-parser.js`);
if (!response.ok) {
failedLoads.add(adapterType);
return null;
}
const source = await response.text();
// Evaluate the module source using URL.createObjectURL + dynamic import().
// This properly supports ESM modules with `export` statements.
// (new Function("exports", source) would fail with SyntaxError on `export` keywords.)
const blob = new Blob([source], { type: "application/javascript" });
const blobUrl = URL.createObjectURL(blob);
let parserModule: DynamicParserModule;
// Coalesce concurrent loads.
const inflight = loadPromises.get(adapterType);
if (inflight) return inflight;
const loadPromise = (async (): Promise<DynamicParserModule | null> => {
try {
const mod = await import(/* @vite-ignore */ blobUrl);
// Prefer the factory function (stateful parser) if available,
// fall back to the static parseStdoutLine function.
if (typeof mod.createStdoutParser === "function") {
const createStdoutParser = mod.createStdoutParser as StdoutParserFactory;
parserModule = {
createStdoutParser,
// Fallback for callers that only know about parseStdoutLine.
parseStdoutLine:
typeof mod.parseStdoutLine === "function"
? (mod.parseStdoutLine as StdoutLineParser)
: ((line: string, ts: string) => {
const parser = createStdoutParser() as StatefulStdoutParser;
const entries = parser.parseLine(line, ts);
parser.reset();
return entries;
}),
};
} else if (typeof mod.parseStdoutLine === "function") {
parserModule = {
parseStdoutLine: mod.parseStdoutLine as StdoutLineParser,
};
} else {
console.warn(`[adapter-ui-loader] Module for "${adapterType}" exports neither parseStdoutLine nor createStdoutParser`);
const response = await fetch(`/api/adapters/${encodeURIComponent(adapterType)}/ui-parser.js`);
if (!response.ok) {
failedLoads.add(adapterType);
return null;
}
} finally {
URL.revokeObjectURL(blobUrl);
}
// Cache for reuse
dynamicParserCache.set(adapterType, parserModule);
console.info(`[adapter-ui-loader] Loaded dynamic UI parser for "${adapterType}"`);
return parserModule;
} catch (err) {
console.warn(`[adapter-ui-loader] Failed to load UI parser for "${adapterType}":`, err);
failedLoads.add(adapterType);
return null;
}
const source = await response.text();
// Initialise the sandboxed worker with the parser source.
const sandbox = await initSandboxedWorker(source);
sandboxedParsers.set(adapterType, sandbox);
const parserModule = buildParserModule(sandbox);
dynamicParserCache.set(adapterType, parserModule);
console.info(`[adapter-ui-loader] Loaded sandboxed UI parser for "${adapterType}"`);
return parserModule;
} catch (err) {
console.warn(`[adapter-ui-loader] Failed to load UI parser for "${adapterType}":`, err);
failedLoads.add(adapterType);
return null;
} finally {
loadPromises.delete(adapterType);
}
})();
loadPromises.set(adapterType, loadPromise);
return loadPromise;
}
/**
* Invalidate a cached dynamic parser, removing it from both the parser cache
* and the failed-loads set so that the next load attempt will try again.
* Also terminates the sandboxed worker if one exists.
*/
export function invalidateDynamicParser(adapterType: string): boolean {
const wasCached = dynamicParserCache.has(adapterType);
dynamicParserCache.delete(adapterType);
failedLoads.delete(adapterType);
loadPromises.delete(adapterType);
// Terminate the worker to free resources.
const sandbox = sandboxedParsers.get(adapterType);
if (sandbox) {
drainPendingRequests(sandbox);
sandbox.worker.terminate();
sandboxedParsers.delete(adapterType);
}
if (wasCached) {
console.info(`[adapter-ui-loader] Invalidated dynamic UI parser for "${adapterType}"`);
console.info(`[adapter-ui-loader] Invalidated sandboxed UI parser for "${adapterType}"`);
}
return wasCached;
}
+3 -1
@@ -9,7 +9,7 @@ import { openClawGatewayUIAdapter } from "./openclaw-gateway";
import { hermesLocalUIAdapter } from "./hermes-local";
import { processUIAdapter } from "./process";
import { httpUIAdapter } from "./http";
import { loadDynamicParser, invalidateDynamicParser } from "./dynamic-loader";
import { loadDynamicParser, invalidateDynamicParser, setDynamicParserResultNotifier } from "./dynamic-loader";
import { SchemaConfigFields, buildSchemaAdapterConfig } from "./schema-config-fields";
const uiAdapters: UIAdapterModule[] = [];
@@ -45,6 +45,8 @@ function notifyAdapterChange(): void {
for (const fn of adapterChangeListeners) fn();
}
setDynamicParserResultNotifier(notifyAdapterChange);
function registerBuiltInUIAdapters() {
for (const adapter of [
claudeLocalUIAdapter,
@@ -0,0 +1,25 @@
import { describe, expect, it } from "vitest";
import { getWorkerBootstrapSource } from "./sandboxed-parser-worker";
describe("sandboxed parser worker bootstrap", () => {
it("disables child worker and object URL escape hatches", () => {
const source = getWorkerBootstrapSource();
expect(source).toContain("self.Worker = _undefined");
expect(source).toContain("self.SharedWorker = _undefined");
expect(source).toContain("self.Blob = _undefined");
expect(source).toContain("self.RTCPeerConnection = _undefined");
expect(source).toContain("self.RTCDataChannel = _undefined");
expect(source).toContain('"createObjectURL"');
expect(source).toContain('"revokeObjectURL"');
});
it("evaluates parser source in strict mode", () => {
expect(getWorkerBootstrapSource()).toContain('\\"use strict\\";\\n{\\n" + msg.source');
});
it("does not include the unused parse_batch protocol branch", () => {
expect(getWorkerBootstrapSource()).not.toContain("parse_batch");
});
});
+182
@@ -0,0 +1,182 @@
/**
* Sandboxed Worker bootstrap for external adapter UI parsers.
*
* Security boundary: parser code runs inside a dedicated Web Worker (which
* has no DOM) with network, persistence, and child-worker APIs explicitly
* disabled. Communication uses a narrow
* postMessage protocol (see {@link SandboxRequest} / {@link SandboxResponse}).
*
* The worker is created from an inline Blob URL so no extra file needs to
* be served. On initialisation the main thread sends the parser source;
* the bootstrap evaluates it in a scope where dangerous globals are shadowed
* by `undefined`, then responds to parse requests.
*/
// ── Message protocol ────────────────────────────────────────────────────────
/** Messages sent from the main thread to the worker. */
export type SandboxRequest =
| { type: "init"; source: string }
| { type: "parse"; id: number; line: string; ts: string };
/** Messages sent from the worker back to the main thread. */
export type SandboxResponse =
| { type: "ready" }
| { type: "error"; message: string }
| { type: "result"; id: number; entries: unknown[] };
// ── Worker bootstrap source ─────────────────────────────────────────────────
/**
* Inline JS that runs inside the Worker. It:
* 1. Overwrites dangerous globals (`fetch`, `XMLHttpRequest`, `WebSocket`,
* `importScripts`, `EventSource`, `navigator.sendBeacon`, etc.) with
* `undefined`.
* 2. Waits for an `init` message carrying the adapter's parser source.
* 3. Evaluates the source via `new Function()` and extracts exports.
* 4. Responds to `parse` messages with `TranscriptEntry[]` results.
*/
const WORKER_BOOTSTRAP = `
"use strict";
// ── 1. Lock down dangerous globals ──────────────────────────────────────────
// Workers have no DOM, but they still have network and import APIs.
const _undefined = void 0;
// Network
self.fetch = _undefined;
self.XMLHttpRequest = _undefined;
self.WebSocket = _undefined;
self.EventSource = _undefined;
self.RTCPeerConnection = _undefined;
self.RTCDataChannel = _undefined;
self.Request = _undefined;
self.Response = _undefined;
self.Headers = _undefined;
self.Cache = _undefined;
self.CacheStorage = _undefined;
self.caches = _undefined;
// Import / eval escape hatches
self.importScripts = _undefined;
self.Worker = _undefined;
self.SharedWorker = _undefined;
self.Blob = _undefined;
if (self.URL) {
try { Object.defineProperty(self.URL, "createObjectURL", { value: _undefined, writable: false, configurable: false }); } catch {}
try { Object.defineProperty(self.URL, "revokeObjectURL", { value: _undefined, writable: false, configurable: false }); } catch {}
}
// Beacon / reporting
if (self.navigator) {
try { Object.defineProperty(self.navigator, "sendBeacon", { value: _undefined, writable: false, configurable: false }); } catch {}
}
// Broadcast channel (cross-context messaging)
self.BroadcastChannel = _undefined;
// IndexedDB (prevents persistent state exfiltration)
self.indexedDB = _undefined;
self.IDBFactory = _undefined;
// ── 2. Parser state ─────────────────────────────────────────────────────────
let parseStdoutLine = null;
let createStdoutParser = null;
let fallbackParser = null;
// ── 3. Message handler ──────────────────────────────────────────────────────
self.onmessage = function (e) {
const msg = e.data;
if (msg.type === "init") {
try {
// Evaluate the parser source in a constrained scope.
// We use a Function constructor to avoid giving the source access to
// our local variables. The only values we inject are module-like
// \`exports\` and \`module\` objects, so both CJS-style and ESM-compiled
// code work.
//
// ESM sources compiled to CJS or IIFE typically assign to an \`exports\`
// param. Raw \`export\` statements are a SyntaxError inside a Function
// body, and we can't use real ESM import() here (the source is a string,
// not a URL), so we wrap it.
const exports = {};
const module = { exports };
// Build a function that receives common CJS shims.
// \`self\` is shadowed to prevent the parser from un-deleting globals.
const factory = new Function(
"exports", "module", "self", "globalThis",
// Wrap in a block to prevent hoisted declarations from leaking.
"\\"use strict\\";\\n{\\n" + msg.source + "\\n}"
);
factory(exports, module, _undefined, _undefined);
// Resolve exports — try module.exports first (CJS), then named exports.
const resolved = module.exports && typeof module.exports === "object" && Object.keys(module.exports).length > 0
? module.exports
: exports;
if (typeof resolved.parseStdoutLine === "function") {
parseStdoutLine = resolved.parseStdoutLine;
}
if (typeof resolved.createStdoutParser === "function") {
createStdoutParser = resolved.createStdoutParser;
}
if (!parseStdoutLine && createStdoutParser) {
fallbackParser = createStdoutParser();
if (fallbackParser && typeof fallbackParser.parseLine === "function") {
parseStdoutLine = (line, ts) => fallbackParser.parseLine(line, ts);
}
}
if (!parseStdoutLine) {
self.postMessage({ type: "error", message: "Parser module exports no usable parseStdoutLine or createStdoutParser" });
return;
}
self.postMessage({ type: "ready" });
} catch (err) {
self.postMessage({ type: "error", message: "Parser init failed: " + (err && err.message || String(err)) });
}
return;
}
if (msg.type === "parse") {
try {
const entries = parseStdoutLine ? parseStdoutLine(msg.line, msg.ts) : [];
self.postMessage({ type: "result", id: msg.id, entries: entries || [] });
} catch (err) {
self.postMessage({ type: "result", id: msg.id, entries: [] });
}
return;
}
};
`;
// ── Public API ───────────────────────────────────────────────────────────────
/**
* Return the inline Worker bootstrap source.
* Exported for testing (so test code can verify the lockdown behaviour).
*/
export function getWorkerBootstrapSource(): string {
return WORKER_BOOTSTRAP;
}
/**
* Create a sandboxed Web Worker from the inline bootstrap.
* The caller must send an `init` message with the parser source before
* sending parse requests.
*/
export function createSandboxedWorker(): Worker {
const blob = new Blob([WORKER_BOOTSTRAP], { type: "application/javascript" });
const url = URL.createObjectURL(blob);
try {
return new Worker(url);
} finally {
// Revoke after construction; the Worker has already captured the Blob URL source.
URL.revokeObjectURL(url);
}
}