paperclip/ui/src/adapters/sandboxed-parser-worker.ts
Dotta 73ef40e7be [codex] Sandbox dynamic adapter UI parsers (#4225)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies.
> - External adapters can provide UI parser code that the board loads
dynamically for run transcript rendering.
> - Running adapter-provided parser code directly in the board page
gives that parser access to same-origin browser state.
> - This PR narrows that surface by evaluating dynamically loaded
external adapter UI parser code in a dedicated browser Web Worker with a
constrained postMessage protocol.
> - The worker here is a frontend isolation boundary for adapter UI
parser JavaScript; it is not Paperclip's server plugin-worker system and
it is not a server-side job runner.

## What Changed

- Runs dynamically loaded external adapter UI parsers inside a dedicated
Web Worker instead of importing/evaluating them directly in the board
page.
- Adds a narrow postMessage protocol for parser initialization and line
parsing.
- Caches completed async parse results and notifies the adapter registry
so transcript recomputation can synchronously drain the final parsed
line.
- Disables common escape APIs inside the parser worker bootstrap:
network, persistence, child workers, Blob/object URLs, and WebRTC.
- Handles worker error messages after initialization and drains pending
callbacks on worker termination or mid-session worker error.
- Adds focused regression coverage for the parser worker lockdown and
unused protocol removal.
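
The caching and draining behaviour listed above could look roughly like the following. This is a minimal sketch and not the code in this PR: `CachedParserFacade`, `ParserPort`, and the `onUpdate` callback are hypothetical names, and only the `parse` / `result` / `error` message shapes are taken from the protocol in `sandboxed-parser-worker.ts`.

```typescript
// Sketch only: class, interface, and callback names are hypothetical.
// The message shapes mirror SandboxRequest / SandboxResponse.
type ParseRequest = { type: "parse"; id: number; line: string; ts: string };
type WorkerReply =
  | { type: "result"; id: number; entries: unknown[] }
  | { type: "error"; message: string };

/** Minimal slice of the Worker API the facade relies on (assumption). */
interface ParserPort {
  postMessage(msg: ParseRequest): void;
  onmessage: ((ev: { data: WorkerReply }) => void) | null;
}

class CachedParserFacade {
  private port: ParserPort;
  private onUpdate: () => void;
  private nextId = 1;
  private pending = new Map<number, string>(); // request id -> cache key
  private inFlight = new Set<string>(); // cache keys awaiting a reply
  private cache = new Map<string, unknown[]>(); // cache key -> parsed entries

  constructor(port: ParserPort, onUpdate: () => void) {
    this.port = port;
    this.onUpdate = onUpdate;
    port.onmessage = (ev) => this.handle(ev.data);
  }

  /** Synchronous entry point: cached entries, or [] while a parse is in flight. */
  parseLine(line: string, ts: string): unknown[] {
    const key = ts + "\u0000" + line;
    const cached = this.cache.get(key);
    if (cached) return cached;
    if (!this.inFlight.has(key)) {
      const id = this.nextId++;
      this.pending.set(id, key);
      this.inFlight.add(key);
      this.port.postMessage({ type: "parse", id, line, ts });
    }
    return [];
  }

  /** Resolve every pending request with no entries (termination or worker error). */
  drain(): void {
    if (this.pending.size === 0) return;
    this.pending.forEach((key) => this.cache.set(key, []));
    this.pending.clear();
    this.inFlight.clear();
    this.onUpdate();
  }

  private handle(msg: WorkerReply): void {
    if (msg.type === "error") {
      this.drain();
      return;
    }
    const key = this.pending.get(msg.id);
    if (key === undefined) return; // reply for an unknown or already drained id
    this.pending.delete(msg.id);
    this.inFlight.delete(key);
    this.cache.set(key, msg.entries);
    this.onUpdate(); // lets the caller re-run its synchronous recomputation
  }
}
```

In this sketch the first `parseLine` call for a line returns an empty array, and the `onUpdate` notification is what prompts the caller to ask again once the result is cached.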

## Verification

- `pnpm exec vitest run --config ui/vitest.config.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm exec tsc --noEmit --target es2021 --moduleResolution bundler
--module esnext --jsx react-jsx --lib dom,es2021 --skipLibCheck
ui/src/adapters/dynamic-loader.ts
ui/src/adapters/sandboxed-parser-worker.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm --filter @paperclipai/ui typecheck` was attempted; it reached
existing unrelated failures in HeartbeatRun test/storybook fixtures and
missing Storybook type resolution, with no adapter-module errors
surfaced.
- PR #4225 checks on current head `34c9da00`: `policy`, `e2e`, `verify`,
`security/snyk`, and `Greptile Review` are all `SUCCESS`.
- Greptile Review on current head `34c9da00` reached 5/5.

## Risks

- Medium risk: parser execution is now asynchronous through a worker
while the existing parser interface is synchronous, so transcript
updates should be watched with external adapters.
- Some adapter parser bundles may rely on direct ESM `export` syntax or
browser APIs that are no longer available inside the worker lockdown.
- The worker lockdown is a hardening layer around external parser code,
not a complete browser security sandbox for arbitrary untrusted
applications.
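
The `export` risk can be reproduced outside the worker. The sketch below mimics the `new Function` wrapping used by the bootstrap; `loadParserSource` and `tryLoad` are hypothetical helpers written for illustration, not functions from this PR, and the export resolution is simplified.

```typescript
// Sketch only: loadParserSource and tryLoad are hypothetical helpers that
// mimic the wrapping done in the worker bootstrap.
type ParseFn = (line: string, ts: string) => unknown[];
type ParserExports = { parseStdoutLine?: ParseFn };

function loadParserSource(source: string): ParserExports {
  const exportsObj: ParserExports = {};
  const moduleObj = { exports: exportsObj };
  // `self` and `globalThis` are passed as undefined, as in the bootstrap.
  const factory = new Function(
    "exports", "module", "self", "globalThis",
    '"use strict";\n{\n' + source + "\n}"
  );
  factory(exportsObj, moduleObj, undefined, undefined);
  // Simplification: the bootstrap falls back to `exports` when
  // module.exports has no keys.
  return moduleObj.exports;
}

function tryLoad(source: string): string {
  try {
    const loaded = loadParserSource(source);
    return typeof loaded.parseStdoutLine === "function" ? "ok" : "no parser export";
  } catch (err) {
    return err instanceof Error ? err.name : "unknown error";
  }
}

// CJS-style assignment works inside a function body.
const cjsStyle =
  'exports.parseStdoutLine = function (line, ts) { return [{ ts: ts, text: line }]; };';
// A raw ESM export declaration is only valid at module top level.
const rawEsm = "export function parseStdoutLine(line, ts) { return []; }";
// Inside the wrapper, `self` is the shadowing parameter, not the global.
const usesSelf = "exports.parseStdoutLine = function () { return [typeof self]; };";

console.log(tryLoad(cjsStyle)); // logs "ok"
console.log(tryLoad(rawEsm)); // logs "SyntaxError"
```

A parser loaded from `usesSelf` sees `typeof self === "undefined"`, which is the kind of browser-API dependency the second risk refers to.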

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 13:42:44 -05:00

183 lines · 7.1 KiB · TypeScript

/**
 * Sandboxed Worker bootstrap for external adapter UI parsers.
 *
 * Security boundary: parser code runs inside a dedicated Web Worker with
 * network, persistence, and child-worker APIs explicitly disabled (Workers
 * have no DOM access to begin with). Communication uses a narrow
 * postMessage protocol (see {@link SandboxRequest} / {@link SandboxResponse}).
 *
 * The worker is created from an inline Blob URL so no extra file needs to
 * be served. On initialisation the main thread sends the parser source;
 * the bootstrap evaluates it in a scope where dangerous globals are
 * overwritten with `undefined` and `self` / `globalThis` are shadowed, then
 * responds to parse requests.
 */
// ── Message protocol ────────────────────────────────────────────────────────

/** Messages sent from the main thread to the worker. */
export type SandboxRequest =
  | { type: "init"; source: string }
  | { type: "parse"; id: number; line: string; ts: string };

/** Messages sent from the worker back to the main thread. */
export type SandboxResponse =
  | { type: "ready" }
  | { type: "error"; message: string }
  | { type: "result"; id: number; entries: unknown[] };
// ── Worker bootstrap source ─────────────────────────────────────────────────

/**
 * Inline JS that runs inside the Worker. It:
 * 1. Overwrites dangerous globals (`fetch`, `XMLHttpRequest`, `WebSocket`,
 *    `importScripts`, `EventSource`, `navigator.sendBeacon`, etc.) with
 *    `undefined`.
 * 2. Waits for an `init` message carrying the adapter's parser source.
 * 3. Evaluates the source via `new Function()` and extracts exports.
 * 4. Responds to `parse` messages with `TranscriptEntry[]` results.
 */
const WORKER_BOOTSTRAP = `
"use strict";

// ── 1. Lock down dangerous globals ──────────────────────────────────────────
// Workers have no DOM, but they still have network and import APIs.
const _undefined = void 0;

// Network
self.fetch = _undefined;
self.XMLHttpRequest = _undefined;
self.WebSocket = _undefined;
self.EventSource = _undefined;
self.RTCPeerConnection = _undefined;
self.RTCDataChannel = _undefined;
self.Request = _undefined;
self.Response = _undefined;
self.Headers = _undefined;
self.Cache = _undefined;
self.CacheStorage = _undefined;
self.caches = _undefined;

// Import / eval escape hatches
self.importScripts = _undefined;
self.Worker = _undefined;
self.SharedWorker = _undefined;
self.Blob = _undefined;
if (self.URL) {
  try { Object.defineProperty(self.URL, "createObjectURL", { value: _undefined, writable: false, configurable: false }); } catch {}
  try { Object.defineProperty(self.URL, "revokeObjectURL", { value: _undefined, writable: false, configurable: false }); } catch {}
}

// Beacon / reporting
if (self.navigator) {
  try { Object.defineProperty(self.navigator, "sendBeacon", { value: _undefined, writable: false, configurable: false }); } catch {}
}

// Broadcast channel (cross-context messaging)
self.BroadcastChannel = _undefined;

// IndexedDB (prevents persistent state exfiltration)
self.indexedDB = _undefined;
self.IDBFactory = _undefined;

// ── 2. Parser state ─────────────────────────────────────────────────────────
let parseStdoutLine = null;
let createStdoutParser = null;
let fallbackParser = null;

// ── 3. Message handler ──────────────────────────────────────────────────────
self.onmessage = function (e) {
  const msg = e.data;

  if (msg.type === "init") {
    try {
      // Evaluate the parser source in a constrained scope.
      // We use a Function constructor to avoid giving the source access to
      // our local variables. The only values we inject are module-like
      // \`exports\` and \`module\` objects, so CJS-style code and ESM code
      // compiled to assign to \`exports\` both work.
      //
      // Raw ESM \`export\` statements are a SyntaxError inside a function
      // body, and we can't use real ESM import() here (the source is a
      // string, not a URL), so the source must be wrapped like this.
      const exports = {};
      const module = { exports };

      // Build a function that receives common CJS shims.
      // \`self\` and \`globalThis\` are shadowed with undefined so the parser
      // cannot reach the global object through them.
      const factory = new Function(
        "exports", "module", "self", "globalThis",
        // Wrap in a block to prevent hoisted declarations from leaking.
        "\\"use strict\\";\\n{\\n" + msg.source + "\\n}"
      );
      factory(exports, module, _undefined, _undefined);

      // Resolve exports — try module.exports first (CJS), then named exports.
      const resolved = module.exports && typeof module.exports === "object" && Object.keys(module.exports).length > 0
        ? module.exports
        : exports;

      if (typeof resolved.parseStdoutLine === "function") {
        parseStdoutLine = resolved.parseStdoutLine;
      }
      if (typeof resolved.createStdoutParser === "function") {
        createStdoutParser = resolved.createStdoutParser;
      }

      // Fall back to a parser instance when only the factory is exported.
      if (!parseStdoutLine && createStdoutParser) {
        fallbackParser = createStdoutParser();
        if (fallbackParser && typeof fallbackParser.parseLine === "function") {
          parseStdoutLine = (line, ts) => fallbackParser.parseLine(line, ts);
        }
      }

      if (!parseStdoutLine) {
        self.postMessage({ type: "error", message: "Parser module exports no usable parseStdoutLine or createStdoutParser" });
        return;
      }

      self.postMessage({ type: "ready" });
    } catch (err) {
      self.postMessage({ type: "error", message: "Parser init failed: " + (err && err.message || String(err)) });
    }
    return;
  }

  if (msg.type === "parse") {
    try {
      const entries = parseStdoutLine ? parseStdoutLine(msg.line, msg.ts) : [];
      self.postMessage({ type: "result", id: msg.id, entries: entries || [] });
    } catch (err) {
      // A throwing parser yields an empty result so the caller is never left waiting.
      self.postMessage({ type: "result", id: msg.id, entries: [] });
    }
    return;
  }
};
`;

// ── Public API ───────────────────────────────────────────────────────────────

/**
 * Return the inline Worker bootstrap source.
 * Exported for testing (so test code can verify the lockdown behaviour).
 */
export function getWorkerBootstrapSource(): string {
  return WORKER_BOOTSTRAP;
}

/**
 * Create a sandboxed Web Worker from the inline bootstrap.
 * The caller must send an `init` message with the parser source before
 * sending parse requests.
 */
export function createSandboxedWorker(): Worker {
  const blob = new Blob([WORKER_BOOTSTRAP], { type: "application/javascript" });
  const url = URL.createObjectURL(blob);
  try {
    return new Worker(url);
  } finally {
    // Revoke after construction; the Worker has already captured the Blob URL source.
    URL.revokeObjectURL(url);
  }
}
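
The `init` handshake that `createSandboxedWorker()` expects from its caller could be driven along these lines. This is a hedged sketch: `initParser` and `InitPort` are hypothetical and do not appear in this file, and only the `init` / `ready` / `error` message shapes come from the protocol types above. A real `Worker` is assumed to satisfy `InitPort` at runtime, though its DOM typings may need a cast.

```typescript
// Sketch only: initParser and InitPort are hypothetical; the message shapes
// follow SandboxRequest / SandboxResponse.
type InitRequest = { type: "init"; source: string };
type InitReply = { type: "ready" } | { type: "error"; message: string };

/** Minimal slice of the Worker API used during the handshake (assumption). */
interface InitPort {
  postMessage(msg: InitRequest): void;
  onmessage: ((ev: { data: InitReply }) => void) | null;
}

/**
 * Send the parser source and report the outcome exactly once.
 * `onDone(null)` means the worker replied `ready`; otherwise the error
 * carries the worker's `message`.
 */
function initParser(
  port: InitPort,
  source: string,
  onDone: (error: Error | null) => void
): void {
  port.onmessage = (ev) => {
    port.onmessage = null; // one-shot: the caller installs its parse handler next
    const reply = ev.data;
    onDone(reply.type === "ready" ? null : new Error(reply.message));
  };
  port.postMessage({ type: "init", source });
}
```

Under these assumptions a caller would send `parse` requests only after `onDone` reports success, which matches the ordering requirement stated in the `createSandboxedWorker` doc comment.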