Gitea/Forgejo Skills Support #20

Merged
Chris Farhood merged 4 commits from dev into local 2026-06-10 16:06:15 +00:00

Thinking Path

  • Paperclip orchestrates AI agents for zero-human companies, and skills are how agents pick up new capabilities at runtime.
  • The skill-import subsystem (`server/src/services/company-skills.ts`) only knew how to fetch skill repos from GitHub.
  • The farhoodlabs deployment hosts its own skills on `git.farh.net` (Gitea/Forgejo). Without native Gitea support, those repos can't be imported as URL-sourced skills, so operators have to vendor them locally, which defeats the point of remote skill sources.
  • Adding a Gitea/Forgejo source adapter alongside the existing GitHub one closes that gap.
  • This pull request promotes the Gitea support from `dev` to `local` so it ships in the deployed image.
  • The benefit is that operators can import any public Gitea/Forgejo repo (including farhoodlabs's own skills repos) the same way they import from GitHub.

What Changed

Three commits on top of `local` (full diff = `local..dev`):

  • `e559218f` — Add Gitea/Forgejo source support for company skills
    New `server/src/services/gitea-fetch.ts` + `gitea-skills.ts` modules: URL parser, hostname probe (`/api/v1/version`) with FIFO-evicted process-lifetime cache, default-branch resolution, paginated tree traversal, canonical + legacy `/raw/...` URL fallback, and a `fetchGiteaBranch` helper for update checks. Dispatcher in `readUrlSkillImports` routes to the Gitea path when the host probes positive. Wired through `company-portability.ts`, `company-export-readme.ts`, `feedback.ts`, the CLI client (`cli/src/commands/client/skills.ts`), shared types/validators, and the `CompanySkills` UI page. Stored skills get `sourceType: "gitea"`, are read-only, and participate in the existing update-check + content-fetch paths.
  • `044d7305` — Resolve Gitea branch refs via the branches endpoint
    Gitea's `/repos/{o}/{r}/commits/{ref}` only resolves SHAs, not branch names — switched `resolveGiteaPinnedRef` to use the branches endpoint, which accepts both branches and tags.
  • `33ab4f8c` — Address PR #19 review findings
    • GitHub Enterprise regression fix: dispatcher now probes for Gitea only on non-github.com hosts and falls back to the GitHub path for unknown hosts, restoring GHE support.
    • Extracted `readGitHubUrlSkillImports` as a sibling helper so `readUrlSkillImports` reads as a flat dispatcher.
    • SSRF guard (`isPrivateOrLoopbackHost` + `assertPublicHost`) in `gitea-fetch`: `probeGiteaHost` short-circuits to `false` for loopback / RFC1918 / link-local literal IPs without making a request, and `parseGiteaSourceUrl` rejects them outright.
    • `fetchGiteaTreeBlobPaths` throws on cap-hit instead of silently returning a partial blob listing (would hide SKILL.md files).
    • `parseGiteaSourceUrl` validates non-empty owner/repo after `.git` strip.
    • Removed dead `resolveGiteaCommitSha` + `GiteaCommitResponse` (unused since the branches-endpoint fix).
    • Test file extended; `gitea-skills.test.ts` is at 32 passing tests.
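
The dispatch order described for `33ab4f8c` can be sketched roughly as follows. This is a minimal illustration, not the code in `company-skills.ts`: `chooseSourceRoute`, `SourceRoute`, and the `GiteaProbe` signature are hypothetical names, and the probe is injected so the routing logic can be read on its own.

```typescript
// Hypothetical route labels; the real dispatcher calls import helpers directly.
type SourceRoute = "github" | "gitea";

// Assumed probe shape: resolves true when the host answers like Gitea/Forgejo.
type GiteaProbe = (hostname: string) => Promise<boolean>;

async function chooseSourceRoute(
  sourceUrl: string,
  probeGiteaHost: GiteaProbe,
): Promise<SourceRoute> {
  const { hostname } = new URL(sourceUrl);

  // github.com is never probed; it always takes the GitHub path.
  if (hostname.toLowerCase() === "github.com") return "github";

  // Any other host is probed once for a Gitea/Forgejo API.
  if (await probeGiteaHost(hostname)) return "gitea";

  // Unknown hosts fall back to the GitHub path, which keeps GitHub Enterprise working.
  return "github";
}
```

Keeping the fallback as the last line is what makes the dispatcher read flat: each branch returns, and a negative probe never turns into an error on its own.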

Verification

  • `pnpm --filter @paperclipai/server typecheck` — clean
  • `npx vitest run server/src/__tests__/gitea-skills.test.ts` — 32/32 pass
  • `npx vitest run server/src/__tests__/company-skills-routes.test.ts server/src/__tests__/company-skills-service.test.ts` — 22/22 pass
  • Build: Dev workflow on the head commit (in this PR's CI run)

Risks

  • Fork policy. CLAUDE.md §7 currently lists Gitea-hosted skills under "Don't pull back from `git log` without explicit go-ahead." This PR is the explicit go-ahead. §7 should be updated in a follow-up to reflect Gitea support as an accepted fork delta.
  • No PAT/auth support. Private Gitea/Forgejo repos will return `401`/`403` with a confusing `Failed to fetch ... 401` message. Public-only for now — matches the original feature scope.
  • SSRF guard is literal-IP only. A hostname that DNS-resolves to a private IP is not blocked here. The intended threat model is an operator pasting `http://192.168.1.10/...` into a skill-source field, not full SSRF defence.
  • Probe cache has no TTL. A host that flips from non-Gitea to Gitea (e.g. after install) won't be re-detected without a process restart. Acceptable given operator-controlled deployments.
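
To make the literal-IP limitation concrete, here is a minimal sketch of what a literal-only check might look like. It is an assumption-laden illustration, not the body of `isPrivateOrLoopbackHost`: the function name `isPrivateOrLoopbackLiteral` and the exact ranges covered are chosen for the example. Note that a plain hostname returns `false` because nothing is resolved through DNS.

```typescript
// Hypothetical helper: true only for loopback / RFC1918 / link-local IP literals.
function isPrivateOrLoopbackLiteral(hostname: string): boolean {
  // URL hostnames wrap IPv6 literals in brackets; strip them first.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");

  // IPv6 loopback and link-local (fe80::/10).
  if (host === "::1") return true;
  if (/^fe[89ab][0-9a-f]:/.test(host)) return true;

  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  // Not an IPv4 literal: hostnames are NOT resolved, so they pass through.
  if (!match) return false;

  const octets = match.slice(1).map(Number);
  if (octets.some((octet) => octet > 255)) return false;
  const [a, b] = octets;

  return (
    a === 127 ||                          // loopback 127.0.0.0/8
    a === 10 ||                           // RFC1918 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) ||  // RFC1918 172.16.0.0/12
    (a === 192 && b === 168) ||           // RFC1918 192.168.0.0/16
    (a === 169 && b === 254)              // link-local 169.254.0.0/16
  );
}
```

Under this sketch, `192.168.1.10` is rejected while `internal.example.com` is accepted even if it resolves to the same address, which is the gap the risk item describes.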

Model Used

  • Claude Opus 4.7 (1M context) — `claude-opus-4-7[1m]`. Used for the review-findings commit (`33ab4f8c`) and this PR body. Capabilities: code review, refactor, test authoring.
  • Prior commits (`e559218f`, `044d7305`) were authored by the upstream feature work; see commit metadata.
Chris Farhood added 3 commits 2026-06-10 03:09:37 +00:00
fork: add Gitea/Forgejo source support for company skills
Build: Dev / build (push) Successful in 4m41s
Build: Dev / update-infra (push) Successful in 0s
e559218f98
Reintroduce Gitea/Forgejo as a skill import source on dev only, since
the fork deploys against git.farh.net. Pasting a Gitea/Forgejo repo
URL into the skills sidebar mirrors the existing GitHub experience:
pin to a commit SHA, check for updates, read repo files.

Server: new gitea-fetch.ts (URL builders, probe-cache helpers) and
gitea-skills.ts (parse, probe, pin, tree, text, branch). Dispatch in
readUrlSkillImports probes /api/v1/version and routes non-github.com
hosts into the new readGiteaUrlSkillImports branch. updateStatus and
readFile get a gitea arm alongside the github/skills_sh arm. Audit
falls through to "remote not supported" the same way github does.

UI: Server icon, Gitea source label, gitea in the "external" source
class, Pin/Update UI gate widened to sourceType === "gitea". CLI help
text updated. Existing github code is left byte-for-byte unchanged
(wrapped in isGitHubDotCom) so dev <-> master syncs stay clean.

PAT support, gitea portability descriptors, and gitea audit are
deliberate follow-ups. Detection requires /api/v1/version to return
Gitea-shaped JSON; the per-host result is cached for process lifetime
with FIFO eviction at 1024 entries. Non-Gitea hosts fall through to
the existing raw-markdown url branch.
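
A process-lifetime probe cache with FIFO eviction, as described above, could look roughly like this. It is a sketch under assumptions: the class name `FifoProbeCache` and its methods are hypothetical, and it relies only on the fact that a JavaScript `Map` iterates keys in insertion order.

```typescript
// Hypothetical per-host probe cache: no TTL, oldest entry evicted at capacity.
class FifoProbeCache {
  private readonly entries = new Map<string, boolean>();
  private readonly maxEntries: number;

  constructor(maxEntries = 1024) {
    this.maxEntries = maxEntries;
  }

  get(host: string): boolean | undefined {
    return this.entries.get(host);
  }

  set(host: string, isGitea: boolean): void {
    // Only a new key can grow the map, so only then is eviction needed.
    if (!this.entries.has(host) && this.entries.size >= this.maxEntries) {
      // Map preserves insertion order, so the first key is the oldest one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    // Re-setting an existing key keeps its original position (FIFO, not LRU).
    this.entries.set(host, isGitea);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Because reads never reorder entries, a frequently used host is evicted as readily as a rarely used one; that is the trade-off of FIFO over LRU in exchange for a simpler structure.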

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
fork: resolve Gitea branch refs via the branches endpoint
Build: Dev / build (push) Successful in 3m40s
Build: Dev / update-infra (push) Successful in 1s
044d730525
Gitea's /repos/{o}/{r}/commits/{ref} only resolves 40-hex SHAs —
a branch name like "main" returns 404 even when the branch exists.
GitHub's API is more lenient and resolves branch names server-side.
resolveGiteaPinnedRef was calling /commits/{ref} and 404ing on
branch refs, so the entire import path failed before it could
read the tree. updateStatus already used the branches endpoint
correctly; this aligns resolveGiteaPinnedRef with it.

resolveGiteaCommitSha is now a SHA-only helper that refuses to
make the API call for non-SHA refs (matches Gitea's contract).
Test mocks updated to return the branch response shape.
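
The SHA-versus-branch split can be sketched as below. This is illustrative only: `planRefLookup` and `RefLookup` are hypothetical names, the 40-hex test mirrors the contract stated above, and the `/api/v1/repos/{owner}/{repo}/branches/{branch}` path is assumed to be the branches endpoint this commit refers to.

```typescript
// Hypothetical result type: either the ref is already a SHA, or a branch lookup is needed.
type RefLookup =
  | { kind: "sha"; sha: string }
  | { kind: "branch"; path: string };

const FULL_SHA = /^[0-9a-f]{40}$/i;

function planRefLookup(owner: string, repo: string, ref: string): RefLookup {
  // A full 40-hex SHA needs no API call to be used as a pin.
  if (FULL_SHA.test(ref)) return { kind: "sha", sha: ref.toLowerCase() };

  // Encode each path segment separately so slashes in branch names survive
  // (assumption: the endpoint accepts multi-segment branch names).
  const encodedRef = ref.split("/").map(encodeURIComponent).join("/");
  const path =
    `/api/v1/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}` +
    `/branches/${encodedRef}`;
  return { kind: "branch", path };
}
```

For example, `planRefLookup("owner", "repo", "main")` yields a branch lookup against `/api/v1/repos/owner/repo/branches/main`, while a 40-character hex ref is returned as-is without planning a request.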

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
fork: address PR #19 review findings for Gitea skill support
Build: Dev / build (push) Successful in 3m34s
Build: Dev / update-infra (push) Successful in 0s
33ab4f8cdd
- Fix GitHub Enterprise regression: dispatcher now probes for Gitea only
  on non-github.com hosts and falls back to the GitHub path for unknown
  hosts, preserving GHE support that the earlier strict github.com match
  broke.
- Refactor readUrlSkillImports into a flat dispatcher with a sibling
  readGitHubUrlSkillImports helper, mirroring readGiteaUrlSkillImports.
- Add SSRF guard (isPrivateOrLoopbackHost + assertPublicHost) in
  gitea-fetch; short-circuit probeGiteaHost and reject parseGiteaSourceUrl
  for loopback / RFC1918 / link-local literal IPs.
- Throw on fetchGiteaTreeBlobPaths cap-hit instead of silently returning a
  partial blob listing (would hide SKILL.md files).
- Validate non-empty repo in parseGiteaSourceUrl after .git strip.
- Remove dead resolveGiteaCommitSha + GiteaCommitResponse (unused since
  the branches-endpoint follow-up).
- Tests updated and extended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chris Farhood added 1 commit 2026-06-10 11:05:10 +00:00
fork: route Gitea skills through owner/repo canonical-key path
Build: Dev / build (push) Successful in 5m20s
Build: Dev / update-infra (push) Successful in 0s
59dc05bdbc
deriveCanonicalSkillKey only emitted the owner/repo-based key for
github and skills_sh sources, so Gitea skills fell through to the
generic company/{companyId}/{slug} branch. Add gitea to the
sourceType / sourceKind clause so a Gitea skill at
git.example.com/owner/repo gets key owner/repo/{slug}, matching the
GitHub format. Existing imports keep their old keys until re-imported.
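
The key-derivation change can be illustrated with a small sketch. It is not the body of `deriveCanonicalSkillKey`: the `SkillKeyInput` shape, the `"local"` source type, and the function name `deriveKeySketch` are assumptions made for the example, and the real function also looks at `sourceKind`.

```typescript
// Hypothetical source types; only github, skills_sh, and gitea come from the commit message.
type SourceType = "github" | "skills_sh" | "gitea" | "local";

interface SkillKeyInput {
  sourceType: SourceType;
  companyId: string;
  slug: string;
  owner?: string;
  repo?: string;
}

function deriveKeySketch(input: SkillKeyInput): string {
  // Adding "gitea" to this clause is the whole change.
  const repoBacked =
    input.sourceType === "github" ||
    input.sourceType === "skills_sh" ||
    input.sourceType === "gitea";

  if (repoBacked && input.owner && input.repo) {
    return `${input.owner}/${input.repo}/${input.slug}`;
  }
  // Everything else keeps the generic company-scoped key.
  return `company/${input.companyId}/${input.slug}`;
}
```

Note that the host is not part of the key in this sketch, so two repos with the same `owner/repo` on different Gitea hosts would map to the same key prefix.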

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chris Farhood merged commit 67fb8249ac into local 2026-06-10 16:06:15 +00:00