| name | description | key | recommendedForRoles | tags |
|---|---|---|---|---|
| agent-browser | Drive a real browser to inspect or interact with a web page or app (navigate, take screenshots, read console and network, fill simple forms) for verification tasks, not unattended automation. | paperclipai/optional/browser/agent-browser | | |
Agent Browser
Use a controlled browser to verify behavior, capture evidence, or extract information from web pages that a static fetch cannot reach (SPAs, login-gated pages, dynamic content). This skill is about supervised verification, not unattended scraping.
When to use
- You need a screenshot of a deployed page or a local dev server to confirm a UI change.
- You need to read JavaScript-rendered content that `curl`/`wget` will not see.
- A user reports a UI bug and you need to reproduce it interactively to capture console errors, network requests, or layout state.
- You need to walk through a short flow (load page, click, observe) to verify acceptance criteria.
When not to use
- The page is reachable as static HTML. Use `curl`/HTTP fetch; it is cheaper, faster, and more reliable.
- The task is unattended large-scale scraping. That belongs to a dedicated scraper with rate limits, robots.txt handling, and a real user agent policy, not this skill.
- The site is behind authentication you do not own credentials for, or whose terms of service prohibit automation.
- The site involves sensitive accounts (banking, healthcare, government) where automation risks lockout or compliance issues.
Before launching the browser
- Confirm the URL and what state should be true after navigation.
- Decide what evidence is needed: full-page screenshot, viewport screenshot, console log, network trace, HTML snapshot, extracted text.
- Decide the viewport size that matters for the task (mobile vs desktop). Default to a desktop size unless the task is mobile-specific.
- For local dev servers, confirm the server is running and the port is what you expect.
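
The pre-launch decisions above can be written down and the dev-server check scripted. This is a minimal sketch using only the Python standard library; `VerificationPlan` and `port_is_open` are hypothetical names for illustration, not part of any Paperclip API, and the default evidence list is an assumption.

```python
import socket
from dataclasses import dataclass, field


@dataclass
class VerificationPlan:
    """Hypothetical record of what was decided before launching the browser."""
    url: str
    expected_state: str
    evidence: list[str] = field(
        default_factory=lambda: ["viewport screenshot", "console log"]
    )
    # Desktop default unless the task is mobile-specific.
    viewport: tuple[int, int] = (1366, 768)


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

For example, call `port_is_open("127.0.0.1", 3000)` before navigating to a dev server you expect on port 3000 (the port number here is only an example). An open port shows that something is listening, not that it is the server you intended.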
Driving the browser
A typical verification session:
- Launch with a real-looking user agent when the target is the public internet; an unrealistic UA flags automation traffic.
- Set a sane viewport (e.g., 1366×768 desktop, 390×844 iPhone-ish).
- Navigate and wait for the right signal. Prefer waiting for a specific selector or network-idle over arbitrary sleeps.
- Capture evidence immediately after the wait condition succeeds, before any interaction perturbs the state.
- Interact deliberately. One click at a time, with a wait between actions; re-screenshot after each meaningful state change.
- Read the console and network panels for unexpected errors, 4xx/5xx responses, or slow requests.
- Close the browser cleanly when done. Long-running browser sessions leak memory and hold ports.
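
Waiting on a signal rather than an arbitrary sleep can be sketched without committing to a particular browser driver. Most drivers ship their own selector and network-idle waits, and those should be preferred when available; the helper below, `wait_until`, is a hypothetical stand-in that only illustrates the idea. Its `predicate` represents whatever readiness check the driver offers.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.25,
) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            # Give up instead of sleeping forever; the caller decides what next.
            return False
        time.sleep(interval)
```

A `False` result is the moment to take a screenshot of the actual state rather than to interact further.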
What evidence to record
For a verification task, deliver:
- A full-page or viewport screenshot of each meaningful state.
- The console log, filtered to warnings/errors.
- Any non-2xx network response with the URL, status, and a short response body excerpt.
- A short narration: "Navigated to X, observed Y, clicked Z, observed W."
For a UI bug repro, also record:
- The exact reproduction steps the user can follow.
- Viewport size and (where relevant) device pixel ratio.
- Whether the bug reproduces on first load vs after interaction.
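
The evidence rules above reduce to a few small filters. The sketch below assumes console and network data has already been collected into plain records; `ConsoleEntry`, `NetworkEntry`, and the level names `"warning"` and `"error"` are assumptions, since real drivers name these differently.

```python
from dataclasses import dataclass


@dataclass
class ConsoleEntry:
    level: str  # assumed values: "log", "info", "warning", "error"
    text: str


@dataclass
class NetworkEntry:
    url: str
    status: int
    body: str = ""


def console_problems(entries: list[ConsoleEntry]) -> list[ConsoleEntry]:
    """Keep only warnings and errors."""
    return [e for e in entries if e.level in {"warning", "error"}]


def failed_responses(entries: list[NetworkEntry], excerpt_len: int = 200) -> list[dict]:
    """Summarise every non-2xx response with URL, status, and a short body excerpt."""
    return [
        {"url": e.url, "status": e.status, "excerpt": e.body[:excerpt_len]}
        for e in entries
        if not 200 <= e.status < 300
    ]


def narrate(steps: list[tuple[str, str]]) -> str:
    """Turn (action, observation) pairs into a one-line narration."""
    return ", ".join(f"{action}, observed {seen}" for action, seen in steps) + "."
```

With this shape, `narrate([("Navigated to X", "Y"), ("clicked Z", "W")])` yields the narration format quoted above. Note that the filter treats 3xx redirects as non-2xx, which follows the wording of the rule literally.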
Login-gated pages
- Prefer programmatic auth (API token, magic link) over UI login.
- If UI login is the only path, the user must provide credentials explicitly for this run. Never reuse credentials outside the session.
- Do not store credentials in the session log, screenshot, or returned output.
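
One way to keep credentials out of text output is to mask them before anything is logged or returned. This is a minimal sketch; `redact` is a hypothetical helper, and it only covers text. It does nothing for screenshots, so avoid capturing a login form while the password field is filled.

```python
def redact(text: str, secrets: list[str], mask: str = "[REDACTED]") -> str:
    """Replace every occurrence of each secret in text with mask."""
    # Longest first, so a secret that contains another one is masked whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # an empty string would match everywhere
            text = text.replace(secret, mask)
    return text
```

Apply it to the session log and the narration just before they leave the session, passing the credentials the user supplied for this run.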
Performance and politeness
- Throttle to one navigation every few seconds when touching shared infra.
- Respect `robots.txt` for public sites you are inspecting at any volume.
- Cancel navigations if a page exceeds a reasonable timeout (e.g., 30s); the page is broken or rate-limiting you.
- Do not retry forever on failure. Retry once with a longer timeout, then escalate.
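
Two of these rules lend themselves to a short sketch: the `robots.txt` check and the retry-once policy. The code below uses the standard library's `urllib.robotparser` against content that has already been fetched. `allowed_by_robots`, `navigate_with_one_retry`, and `NavigationFailed` are hypothetical names, and the `navigate` callable is assumed to raise the built-in `TimeoutError`; a real driver may raise its own exception type.

```python
from typing import Callable, TypeVar
from urllib.robotparser import RobotFileParser

T = TypeVar("T")


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check url against already-fetched robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class NavigationFailed(RuntimeError):
    """Raised after the single retry also times out, so the caller can escalate."""


def navigate_with_one_retry(
    navigate: Callable[[float], T],
    timeout: float = 30.0,
    retry_factor: float = 2.0,
) -> T:
    """Call navigate(timeout); on TimeoutError retry once with a longer timeout."""
    try:
        return navigate(timeout)
    except TimeoutError:
        longer = timeout * retry_factor
        try:
            return navigate(longer)
        except TimeoutError as exc:
            # No further attempts: surface the failure instead of looping.
            raise NavigationFailed(f"gave up after retry at {longer:.0f}s") from exc
```

Raising a distinct exception after the second attempt keeps the escalation explicit: the caller reports the failure instead of silently trying again.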
Common failure modes
- Selector not found. Page changed, or you are waiting before render. Take a screenshot to see actual state; adjust the selector.
- Click does nothing. The element is offscreen, covered by a modal, or in a shadow DOM. Scroll into view or pierce the shadow root.
- Headless detection. Some sites detect headless Chrome and serve a different page. Use a non-headless mode or a fingerprint-realistic configuration only when authorized.
- Cross-origin iframe blocking. Iframes you do not own cannot be inspected; the page must offer the data outside the iframe or the task is infeasible.
Anti-patterns
- Long unsupervised browser sessions that drift from the original task.
- Scraping behind authentication you do not own.
- Captioning a screenshot with "looks good" without saying what state was loaded and what selectors confirmed it.
- Treating a passing screenshot as proof of correctness across viewports you did not actually test.