Chris Farhood 4ee7a5bf29 Update PR workflow: CI → UAT (Patty) → QA (Regina) → CTO → merge

Reorder the review pipeline so cheap/fast stages gate expensive ones:
CI (free) runs first, then Patty validates E2E on MiniMax, then
Regina does deep code review on Sonnet, then Nancy reviews last.

- POLICIES.md: rewrite PR Workflow with 6-step ordered pipeline
- Patty SOUL.md: establish her as first reviewer, add CI-must-pass rule
- Patty HEARTBEAT.md: check CI status before E2E, report results for Regina
- Regina SOUL.md: flip from "review first" to "review after UAT"
- Regina HEARTBEAT.md: skip PRs without CI + E2E validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-24 20:52:05 -04:00

Privileged Escalation — Shared Policies

All agents in this org must follow these policies.

Environment Variables

PAPERCLIP_API_KEY, PAPERCLIP_API_URL, PAPERCLIP_RUN_ID, PAPERCLIP_AGENT_ID, PAPERCLIP_COMPANY_ID are pre-injected into your process environment. Do NOT base64-decode, JWT-parse, or manually verify tokens — just use them directly in commands. If PAPERCLIP_API_URL appears empty in a shell command, use http://localhost:3100 as the API base URL.
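The fallback rule above can be sketched as a small helper, assuming a POSIX shell. The function name paperclip_api_base is hypothetical and not something Paperclip provides.

```shell
# Hypothetical helper: print the API base URL, falling back to the local
# default from this policy when PAPERCLIP_API_URL is unset or empty.
paperclip_api_base() {
  printf '%s\n' "${PAPERCLIP_API_URL:-http://localhost:3100}"
}
```

In the curl commands later in this document, "$(paperclip_api_base)" could stand in for "$PAPERCLIP_API_URL" so that an empty variable does not produce a malformed URL.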

Infrastructure

Container images: Push to ghcr.io only. We do not use Docker Hub, do not mirror public images, and do not maintain any other registry.
Dependency updates: Managed by Mend Renovate. We do not use Dependabot — never enable it, never create .github/dependabot.yml, never reference it in workflows or docs.
Package mirrors: Do not set up, configure, or reference package mirrors or proxies of any kind (npm, pip, Maven, container, etc.). Always use upstream registries directly.
Plugin installation: ArtifactHub only via Headlamp's native plugin installer. No Helm-based plugin installation, no custom install scripts.

Versioning

All releases use SemVer (semantic versioning). ArtifactHub requires SemVer for Headlamp plugin packages. Do not use CalVer.
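A minimal sketch of a pre-release check for this rule, assuming grep with extended regular expressions is available. The function name is_semver is hypothetical; the pattern is an ERE adaptation of the regular expression suggested by the SemVer 2.0.0 specification. It is a syntactic check only: it rejects zero-padded CalVer such as 2026.03.24, but it cannot tell an unpadded date like 2026.3.24 from a valid SemVer version.

```shell
# ERE form of the SemVer 2.0.0 pattern: MAJOR.MINOR.PATCH with optional
# -prerelease and +build parts, and no leading zeros in numeric fields.
SEMVER_RE='^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-((0|[1-9][0-9]*|[0-9]*[a-zA-Z-][0-9a-zA-Z-]*)(\.(0|[1-9][0-9]*|[0-9]*[a-zA-Z-][0-9a-zA-Z-]*))*))?(\+([0-9a-zA-Z-]+(\.[0-9a-zA-Z-]+)*))?$'

# Hypothetical helper: succeeds (exit 0) when $1 is syntactically valid SemVer.
is_semver() {
  printf '%s\n' "$1" | grep -Eq "$SEMVER_RE"
}
```

For example, a release script could run is_semver "$VERSION" || exit 1 before tagging.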

Cluster Infrastructure

The following services are available in the cluster. Use them via their operators — do not install standalone instances.

Layer	Technology	Access	Policy
Block storage	TrueNAS CSI	storageClass: block-truenas	All PVCs backed by TrueNAS SCALE.
File storage	Rook-Ceph	storageClass: ceph-filesystem	CephFS for shared filesystems.
External Object storage	Rook-Ceph	CephObjectStore/objectstore-ceph-external	RGW for S3-compatible object storage.
Internal Object storage	Rook-Ceph	CephObjectStore/objectstore-ceph-internal	RGW for S3-compatible object storage.
Database Primary	CloudNativePG Operator	postgresql.cnpg.io/Cluster	All PostgreSQL via CloudNativePG (CNPG) CRDs. No manual Postgres installs. Production: 3 replicas and 30 days of backups. Dev/Test/QA: 1 replica and 5 days of backups.
Database Alternate	MariaDB Operator	k8s.mariadb.com/MaxScale	All MariaDB via MariaDB Operator CRDs. No manual MariaDB installs. No MySQL. Production: 3 replicas and 30 days of backups. Dev/Test/QA: 1 replica and 5 days of backups.
Cache / Pub-sub	DragonflyDB Operator	dragonflydb.io/Dragonfly	Redis-compatible via Dragonfly Operator CRDs. No manual DragonflyDB installs. No Redis. No persistent or durable data, no exceptions. Production: 3 replicas. Dev/Test/QA: 1 replica.
MQTT	EMQX Operator	apps.emqx.io/EMQX	MQTT broker via EMQX CRDs. For IoT and messaging workloads. Production: 3 replicas. Dev/Test/QA: 1 replica.
Authenticated External Services	Istio Gateway + Authentik	gateway-system/istio-external	OIDC/SSO for all web apps. No custom auth systems.
Authenticated Internal Services	Istio Gateway + Authentik	gateway-system/istio-internal	OIDC/SSO for all web apps. No custom auth systems.
Unauthenticated External Services	Cilium Gateway	gateway-system/external	High performance unauthenticated web apps.
Unauthenticated Internal Services	Cilium Gateway	gateway-system/internal	High performance unauthenticated web apps.
Monitoring	Prometheus Stack		Create ServiceMonitors and PrometheusRules for all services. AlertManager for alerting.
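As an illustration of using a service through its operator, the sketch below renders a CloudNativePG Cluster manifest sized per the table. Several details are assumptions: the function name render_cnpg_cluster, the 10Gi volume size, and the reading of "3 replicas" as instances: 3 are all hypothetical. Backup retention is left out because it depends on an object store configuration that this document does not specify.

```shell
# Hypothetical helper: print a CNPG Cluster manifest for NAME in ENV.
# ENV "production" gets 3 instances; any other value gets 1 (assumed mapping
# of the replica counts in the table above).
render_cnpg_cluster() {
  name="$1"
  env="$2"
  case "$env" in
    production) instances=3 ;;
    *)          instances=1 ;;
  esac
  cat <<EOF
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: $name
spec:
  instances: $instances
  storage:
    storageClass: block-truenas
    size: 10Gi  # illustrative size, not a policy value
EOF
}
```

The output is meant to be committed to the org infra repo, for example render_cnpg_cluster app-db production > app-db.yaml, rather than applied by hand.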

Infrastructure Deployment

Infrastructure deploys through a two-stage GitOps pipeline:

Org infra repo (<org>/infra) — contains the Kubernetes manifests for this org's applications (deployments, services, CNPG clusters, etc.)
Platform repo (cpfarhood/kubernetes) — contains Flux Kustomizations that reference each org's infra repo. Flux watches THIS repo, not the org infra repos directly.

When you need an infrastructure change:

Commit the manifest change to your org's infra repo (e.g., cartsnitch/infra, groombook/infra)
If the change requires a NEW resource that Flux doesn't already reference (new Kustomization, new namespace, new sealed secret), a corresponding change to cpfarhood/kubernetes is also needed — create a Paperclip issue for the board
If the change is to an EXISTING resource already tracked by Flux, committing to the org infra repo is sufficient — Flux will pick it up on the next reconciliation cycle

Do NOT assume that committing to the org infra repo is always sufficient. New resources, new namespaces, and new secrets require platform repo changes that only the board can make.

kubectl is available and agents have the following access:
- Cluster-wide: read-only (get, list, watch) across all namespaces
- privilegedescalation namespace: read-write (production — changes MUST go through Flux, not kubectl)
- privilegedescalation-dev namespace: read-write (development — agents may use kubectl freely for testing, debugging, and iteration)
Production (privilegedescalation): All changes go through the infra repo and Flux. Do not kubectl apply to production. Flux will revert manual changes.
Development (privilegedescalation-dev): Prefer Flux-managed manifests in the infra repo even for dev workloads. Agents have read-write kubectl access for rapid iteration and debugging, but changes should be committed to the infra repo once validated.
Headlamp: Production Headlamp runs in kube-system. Development/testing Headlamp instances go in privilegedescalation-dev. Never deploy test plugins to the production Headlamp in kube-system.
If you need a production infrastructure change, create a PR against the infra repo (or create a Paperclip issue for the agent who owns infra).
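One way to keep ad hoc kubectl use inside the rules above is a thin wrapper that refuses any namespace other than the development one. This is a sketch under that assumption; the function name dev_apply is hypothetical.

```shell
# Namespace where agents may use kubectl freely, per this policy.
DEV_NAMESPACE="privilegedescalation-dev"

# Hypothetical wrapper: apply MANIFEST only when NAMESPACE is the dev namespace.
dev_apply() {
  ns="$1"
  manifest="$2"
  if [ "$ns" != "$DEV_NAMESPACE" ]; then
    echo "refusing: '$ns' is not $DEV_NAMESPACE; commit to the infra repo and let Flux apply it" >&2
    return 1
  fi
  kubectl apply --namespace "$ns" --filename "$manifest"
}
```

Once a change applied this way has been validated, the manifest should still be committed to the infra repo.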

Kubernetes Secrets

All Kubernetes secrets MUST be managed as SealedSecrets (Bitnami Sealed Secrets). Never commit plaintext Kubernetes Secret manifests to any repo. Never use kubectl create secret in production.

Use kubeseal to encrypt secrets against the cluster's public certificate
Commit the resulting SealedSecret resource to the org infra repo (privilegedescalation/infra)
The Sealed Secrets controller decrypts them in-cluster at deploy time
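If kubeseal is not available, install it from a pinned release. Set KUBESEAL_VERSION to the release number without the leading v; release assets are published as versioned tarballs, so an unversioned latest/download binary URL may not resolve: OS=$(uname -s | tr '[:upper:]' '[:lower:]') ARCH=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/') && curl -fsSL "https://github.com/bitnami-labs/sealed-secrets/releases/download/v${KUBESEAL_VERSION:?}/kubeseal-${KUBESEAL_VERSION}-${OS}-${ARCH}.tar.gz" | tar -xz -C /usr/local/bin kubeseal && chmod +x /usr/local/bin/kubeseal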
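The sealing steps above can be sketched as follows. The function name seal_secret and its argument order are hypothetical. The kubectl call uses --dry-run=client, so it only renders a Secret manifest locally and creates nothing in the cluster; the plaintext is piped straight into kubeseal and never written to disk. The certificate file is assumed to have been fetched beforehand, for example with kubeseal --fetch-cert.

```shell
# Hypothetical helper: seal_secret NAME NAMESPACE KEY VALUE CERT_FILE
# Prints a SealedSecret manifest on stdout.
seal_secret() {
  name="$1"
  namespace="$2"
  key="$3"
  value="$4"
  cert="$5"
  # Render the Secret client-side only, then encrypt it against the
  # cluster's public certificate.
  kubectl create secret generic "$name" \
    --namespace "$namespace" \
    --from-literal="$key=$value" \
    --dry-run=client --output yaml |
    kubeseal --format yaml --cert "$cert"
}
```

A call such as seal_secret db-credentials privilegedescalation-dev password "$DB_PASSWORD" pub-cert.pem > db-credentials.sealed.yaml produces the file to commit. The secret name, key, and file names here are illustrative.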

RBAC and Permissions

Do not request additional RBAC, GitHub App permissions, or cluster permissions. The current access levels are final. This includes:

GitHub App permissions (administration, vulnerability_alerts, workflows, self_hosted_runners, etc.)
Kubernetes RBAC (Roles, RoleBindings, ClusterRoles)
Flux GitRepository/Kustomization additions to the platform repo
Any other form of privilege escalation

Agents must design their workflows to operate within existing permissions. If a task cannot be accomplished with current access, find an alternative approach — do not escalate to the board for more permissions.

Workaround guidance:

Branch protection: Enforce via agent policy (this document), not GitHub API
Security scanning: Use local tools (npm audit, pnpm audit) instead of the GitHub vulnerability alerts API
CI runner health: Verify by triggering workflows, not querying the runner API
E2E testing: Use the privilegedescalation-dev namespace where agents have read-write access
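For the security scanning workaround, a local audit could be run along these lines. The function name run_dependency_audit and the "high" severity threshold are assumptions, not policy values; the sketch picks the package manager from whichever lockfile is present in the current directory.

```shell
# Hypothetical helper: run the local audit tool that matches the lockfile.
# Returns the audit tool's exit status, or 2 when no known lockfile exists.
run_dependency_audit() {
  if [ -f pnpm-lock.yaml ]; then
    pnpm audit --audit-level high
  elif [ -f package-lock.json ]; then
    npm audit --audit-level=high
  else
    echo "no pnpm-lock.yaml or package-lock.json found" >&2
    return 2
  fi
}
```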

Git Workflow

All changes go through feature branches and PRs. Never push directly to main.
Branch protection: CEOs must enforce the PR workflow via GitHub branch protection rules wherever possible — require PR reviews, require status checks, restrict who can merge. Policy should be enforced by GitHub, not just by agent prompts.
Do not approve or merge PRs on the privilegedescalation/agents repo — only the board may approve changes to agent configurations and prompts.
When creating a new pull request, include cc @cpfarhood at the bottom of the PR body.

PR Workflow

All code changes follow this lifecycle:

1. Engineer opens a PR from a feature branch (never push directly to main)
2. CI passes: lint, types, unit tests must all be green before any reviewer spends tokens
3. UAT (Patty) validates E2E: browser testing against the deployed build in privilegedescalation-dev. Patty only picks up PRs with passing CI.
4. QA (Regina) reviews: code-level review covering test coverage, regressions, edge cases. Regina only picks up PRs that have passed both CI and E2E.
5. CTO (Nancy) reviews: architecture alignment, code quality, security. Nancy only reviews after both UAT and QA have approved.
6. CEO (Countess) merges: only after UAT + QA + CTO have approved and CI passes

Review order is mandatory: CI → UAT → QA → CTO → merge. Each stage gates the next. If an agent reviews out of order, the earlier reviewer should refuse to review until the process is corrected — comment on the PR noting the violation. No agent merges their own PRs. No agent merges without triple approval (UAT + QA + CTO).
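The gating order can be expressed as a small decision function. This is a sketch only: the function name next_stage and the "yes" status convention are hypothetical, and how each status is collected (CI checks, PR approvals) is not specified in this document.

```shell
# Hypothetical helper: next_stage CI UAT QA CTO
# Each argument is "yes" once that stage has passed or approved, and
# anything else otherwise. Prints the stage the PR is currently waiting on.
next_stage() {
  ci="$1"
  uat="$2"
  qa="$3"
  cto="$4"
  if   [ "$ci"  != "yes" ]; then echo "CI"
  elif [ "$uat" != "yes" ]; then echo "UAT (Patty)"
  elif [ "$qa"  != "yes" ]; then echo "QA (Regina)"
  elif [ "$cto" != "yes" ]; then echo "CTO (Nancy)"
  else echo "merge (Countess)"
  fi
}
```

Because stages are checked strictly in order, an approval given out of sequence does not advance the PR: next_stage yes no yes yes still reports that UAT is outstanding.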

Work Distribution

All engineering and devops work must be broken down and distributed by the CTO (Nancy) for engineers to execute. Engineers should not self-assign work — the CTO triages, scopes, and assigns all implementation tasks.

Issue Tracking

GitHub issues are the primary tracker. All bugs, features, and work items are tracked as GitHub issues in the relevant repo. Paperclip issues are secondary — use them to trigger and coordinate agents (assignments, status handoffs, heartbeat wakes), not as the primary record of work.
GitHub issues stay open until deployed and validated. A GitHub issue is not done when a PR is merged. It is done when the change is deployed to production and validated as working. Merging is a step in the process, not the finish line.

Task Assignment

To hand off work to another agent, create a Paperclip issue with assigneeAgentId set:

curl -sf -X POST "$PAPERCLIP_API_URL/api/companies/$PAPERCLIP_COMPANY_ID/issues" \
  -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID" \
  -d '{"title": "...", "description": "...", "status": "todo", "assigneeAgentId": "<target-agent-id>", "parentId": "<parent-issue-id-if-subtask>"}'

Always include:

A clear title and description so the assignee understands the work without asking questions
assigneeAgentId — the target agent's ID (find IDs in each agent's CONFIG.md)
parentId if this is a subtask of an existing issue
A comment on the parent issue noting the delegation

To reassign an existing issue:

curl -sf -X PATCH "$PAPERCLIP_API_URL/api/issues/{issueId}" \
  -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID" \
  -d '{"assigneeAgentId": "<target-agent-id>", "comment": "Reassigning because..."}'

Never leave work unassigned. If you cannot do it yourself, assign it to the right agent with context.

CI/CD Workflow Access

Only Hugh Hackman has write access to .github/workflows/ files. All other agents must delegate CI/CD workflow changes to him.
