# Privileged Escalation — Shared Policies

All agents in this org must follow these policies.

## Environment Variables

`PAPERCLIP_API_KEY`, `PAPERCLIP_API_URL`, `PAPERCLIP_RUN_ID`, `PAPERCLIP_AGENT_ID`, `PAPERCLIP_COMPANY_ID` are pre-injected into your process environment. **Do NOT base64-decode, JWT-parse, or manually verify tokens** — just use them directly in commands. If `PAPERCLIP_API_URL` appears empty in a shell command, use `http://localhost:3100` as the API base URL.

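
The fallback rule above can be sketched as a small shell helper. The function name `api_base` is hypothetical; it only applies the documented default and does not inspect or validate any token.

```bash
# api_base prints the Paperclip API base URL, falling back to the local
# default when PAPERCLIP_API_URL is unset or empty.
api_base() {
  printf '%s\n' "${PAPERCLIP_API_URL:-http://localhost:3100}"
}

# Example (not executed here): use the helper in place of the raw variable.
# curl -sf -X POST "$(api_base)/api/companies/$PAPERCLIP_COMPANY_ID/issues" \
#   -H "Authorization: Bearer $PAPERCLIP_API_KEY" ...
```

The `${VAR:-default}` expansion covers both the unset and the empty case, which is why no separate emptiness check is needed.
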
## Infrastructure

- **Container images**: Push to `ghcr.io` only. We do not use Docker Hub, do not mirror public images, and do not maintain any other registry.
- **Dependency updates**: Managed by **Mend Renovate**. We do not use Dependabot — never enable it, never create `.github/dependabot.yml`, never reference it in workflows or docs.
- **Package mirrors**: Do not set up, configure, or reference package mirrors or proxies of any kind (npm, pip, Maven, container, etc.). Always use upstream registries directly.
- **Plugin installation**: ArtifactHub only via Headlamp's native plugin installer. No Helm-based plugin installation, no custom install scripts.

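
Two of the rules above are easy to check mechanically before opening a PR. The following is a minimal sketch, not an official check; the function names `is_allowed_image` and `has_dependabot_config` are hypothetical.

```bash
# is_allowed_image IMAGE succeeds only for image references hosted on ghcr.io.
is_allowed_image() {
  case "$1" in
    ghcr.io/*) return 0 ;;
    *) return 1 ;;
  esac
}

# has_dependabot_config REPO_DIR succeeds if the checkout at REPO_DIR
# contains a Dependabot configuration file (which policy forbids).
has_dependabot_config() {
  [ -e "$1/.github/dependabot.yml" ] || [ -e "$1/.github/dependabot.yaml" ]
}
```

A pre-PR script could, for example, fail when `has_dependabot_config .` succeeds or when any image reference in a manifest does not pass `is_allowed_image`.
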
## Versioning

All releases use **SemVer** (semantic versioning). ArtifactHub requires SemVer for Headlamp plugin packages. Do not use CalVer.

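
A release script can reject malformed version strings before tagging. This is a simplified sketch: the helper name `is_semver` is hypothetical, and the pattern checks only the `MAJOR.MINOR.PATCH` core plus optional pre-release and build suffixes, not every rule in the SemVer specification.

```bash
# is_semver VERSION succeeds if VERSION looks like MAJOR.MINOR.PATCH with
# optional -prerelease and +build suffixes. Leading zeros and a "v" prefix
# are rejected.
is_semver() {
  printf '%s\n' "$1" |
    grep -Eq '^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$'
}
```

Zero-padded CalVer strings such as `2024.05.01` fail this check because of the leading zero, but a string like `2024.5.1` is syntactically valid SemVer and passes, so the check cannot replace the policy itself.
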
## Cluster Infrastructure

The following services are available in the cluster. Use them via their operators — do not install standalone instances.

| Layer | Technology | Access | Policy |
|-------|------------|--------|--------|
| **Block storage** | TrueNAS CSI | storageClass: block-truenas | All PVCs backed by TrueNAS SCALE. |
| **File storage** | Rook-Ceph | storageClass: ceph-filesystem | CephFS for shared filesystems. |
| **External Object storage** | Rook-Ceph | CephObjectStore/objectstore-ceph-external | RGW for S3-compatible object storage. |
| **Internal Object storage** | Rook-Ceph | CephObjectStore/objectstore-ceph-internal | RGW for S3-compatible object storage. |
| **Database Primary** | CloudNativePG Operator | postgresql.cnpg.io/Cluster | All PostgreSQL via CloudNativePG (CNPG) CRDs. No manual Postgres installs. Production: 3 replicas and 30 days of backups. Dev/Test/QA: 1 replica and 5 days of backups. |
| **Database Alternate** | MariaDB Operator | k8s.mariadb.com/MaxScale | All MariaDB via MariaDB Operator CRDs. No manual MariaDB installs. No MySQL. Production: 3 replicas and 30 days of backups. Dev/Test/QA: 1 replica and 5 days of backups. |
| **Cache / Pub-sub** | DragonflyDB Operator | dragonflydb.io/Dragonfly | Redis-compatible via Dragonfly Operator CRDs. No manual DragonflyDB installs. No Redis. No persistent or durable data, no exceptions. Production: 3 replicas. Dev/Test/QA: 1 replica. |
| **MQTT** | EMQX Operator | apps.emqx.io/EMQX | MQTT broker via `EMQX` CRDs. For IoT and messaging workloads. Production: 3 replicas. Dev/Test/QA: 1 replica. |
| **Authenticated External Services** | Istio Gateway + Authentik | gateway-system/istio-external | OIDC/SSO for all web apps. No custom auth systems. |
| **Authenticated Internal Services** | Istio Gateway + Authentik | gateway-system/istio-internal | OIDC/SSO for all web apps. No custom auth systems. |
| **Unauthenticated External Services** | Cilium Gateway | gateway-system/external | High-performance unauthenticated web apps. |
| **Unauthenticated Internal Services** | Cilium Gateway | gateway-system/internal | High-performance unauthenticated web apps. |
| **Monitoring** | Prometheus Stack | | Create ServiceMonitors and PrometheusRules for all services. AlertManager for alerting. |

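
As an illustration of the "Database Primary" row, the sketch below renders a CloudNativePG `Cluster` manifest with the replica count chosen per environment. Several things here are assumptions rather than org standards: the helper names, the reading of "3 replicas" as `instances: 3`, and the `10Gi` volume size. Backup retention is deliberately left out because it depends on object-store settings this document does not specify.

```bash
# instances_for_env ENV prints the instance count from the table above:
# 3 for production, 1 for any other environment (dev/test/qa).
instances_for_env() {
  case "$1" in
    production) echo 3 ;;
    *) echo 1 ;;
  esac
}

# render_cnpg_cluster NAME NAMESPACE ENV prints a Cluster manifest on stdout.
# The 10Gi size is a placeholder, not a recommended value.
render_cnpg_cluster() {
  cat <<EOF
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: $1
  namespace: $2
spec:
  instances: $(instances_for_env "$3")
  storage:
    storageClass: block-truenas
    size: 10Gi
EOF
}
```

The rendered output is meant to be committed to the org infra repo, not applied by hand; for example, `render_cnpg_cluster app-db privilegedescalation production` prints a manifest with `instances: 3`.
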
## Infrastructure Deployment

Infrastructure deploys through a two-stage GitOps pipeline:

1. **Org infra repo** (`<org>/infra`) — contains the Kubernetes manifests for this org's applications (deployments, services, CNPG clusters, etc.)
2. **Platform repo** (`cpfarhood/kubernetes`) — contains Flux Kustomizations that reference each org's infra repo. Flux watches THIS repo, not the org infra repos directly.

When you need an infrastructure change:

1. Commit the manifest change to your org's infra repo (e.g., `cartsnitch/infra`, `groombook/infra`)
2. If the change requires a NEW resource that Flux doesn't already reference (new Kustomization, new namespace, new sealed secret), a corresponding change to `cpfarhood/kubernetes` is also needed — create a Paperclip issue for the board
3. If the change is to an EXISTING resource already tracked by Flux, committing to the org infra repo is sufficient — Flux will pick it up on the next reconciliation cycle

**Do NOT assume that committing to the org infra repo is always sufficient.** New resources, new namespaces, and new secrets require platform repo changes that only the board can make.

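
The routing decision in the steps above can be written down as a tiny helper. This is only a sketch of the decision, with a hypothetical function name and hypothetical output labels; it does not talk to Flux or to Paperclip.

```bash
# change_route STATE prints where a manifest change must land.
# STATE is "existing" (resource already tracked by Flux) or "new"
# (new Kustomization, new namespace, or new sealed secret).
change_route() {
  case "$1" in
    existing) echo "org-infra-repo" ;;
    new) echo "org-infra-repo + paperclip-issue-for-board" ;;
    *) echo "unknown" ; return 1 ;;
  esac
}
```

Anything that is not clearly "existing" should be treated as "new", since a missed platform repo change means Flux never sees the resource.
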
- **`kubectl` is available** and agents have the following access:
  - **Cluster-wide**: read-only (`get`, `list`, `watch`) across all namespaces
  - **`privilegedescalation` namespace**: read-write (production — changes MUST go through Flux, not kubectl)
  - **`privilegedescalation-dev` namespace**: read-write (development — agents may use kubectl freely for testing, debugging, and iteration)
- **Production (`privilegedescalation`)**: All changes go through the infra repo and Flux. Do not `kubectl apply` to production. Flux will revert manual changes.
- **Development (`privilegedescalation-dev`)**: Prefer Flux-managed manifests in the infra repo even for dev workloads. Agents have read-write kubectl access for rapid iteration and debugging, but changes should be committed to the infra repo once validated.
- **Headlamp**: Production Headlamp runs in `kube-system`. Development/testing Headlamp instances go in `privilegedescalation-dev`. Never deploy test plugins to the production Headlamp in `kube-system`.
- If you need a production infrastructure change, create a PR against the infra repo (or create a Paperclip issue for the agent who owns infra).

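
One way to avoid an accidental `kubectl apply` against production is to route every apply through a guard. The wrapper below is a sketch with hypothetical function names; it encodes only the namespace rule stated above.

```bash
# kubectl_write_allowed NAMESPACE succeeds only for the development namespace.
kubectl_write_allowed() {
  [ "$1" = "privilegedescalation-dev" ]
}

# safe_apply NAMESPACE FILE applies FILE with kubectl in the dev namespace
# and refuses everywhere else, pointing back at the Flux workflow.
safe_apply() {
  if kubectl_write_allowed "$1"; then
    kubectl apply -n "$1" -f "$2"
  else
    echo "refusing kubectl apply to '$1': commit to the infra repo and let Flux reconcile" >&2
    return 1
  fi
}
```

For example, `safe_apply privilegedescalation-dev deployment.yaml` runs the apply, while the same call with `privilegedescalation` or `kube-system` returns a non-zero status without contacting the cluster.
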
## Kubernetes Secrets

All Kubernetes secrets MUST be managed as **SealedSecrets** (Bitnami Sealed Secrets). Never commit plaintext Kubernetes `Secret` manifests to any repo. Never use `kubectl create secret` in production.

- Use `kubeseal` to encrypt secrets against the cluster's public certificate
- Commit the resulting `SealedSecret` resource to the org infra repo (`privilegedescalation/infra`)
- The Sealed Secrets controller decrypts them in-cluster at deploy time
- If `kubeseal` is not available, install it: `curl -sL https://github.com/bitnami-labs/sealed-secrets/releases/latest/download/kubeseal-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m | sed 's/x86_64/amd64/') -o /usr/local/bin/kubeseal && chmod +x /usr/local/bin/kubeseal`

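
The sealing step can be sketched as a pipeline that never writes a plaintext `Secret` to disk. The function name `seal_literal` is hypothetical, and the sketch assumes `kubeseal` can reach the Sealed Secrets controller with its default lookup settings, which may not match this cluster. `--dry-run=client` only renders the manifest locally and sends nothing to the API server.

```bash
# seal_literal NAME NAMESPACE KEY VALUE prints a SealedSecret manifest on
# stdout. The plaintext Secret exists only inside the pipe.
seal_literal() {
  kubectl create secret generic "$1" \
    --namespace "$2" \
    --from-literal="$3=$4" \
    --dry-run=client -o yaml |
    kubeseal --format yaml
}

# Example (not executed here), redirecting into a file in the infra repo:
# seal_literal app-db-credentials privilegedescalation password "$DB_PASSWORD" \
#   > sealedsecret-app-db-credentials.yaml
```

Passing the value as an argument exposes it to the local process list, so for sensitive values a file-based source may be preferable. Only the resulting `SealedSecret` file is committed.
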
## RBAC and Permissions

**Do not request additional RBAC, GitHub App permissions, or cluster permissions.** The current access levels are final. This includes:

- GitHub App permissions (administration, vulnerability_alerts, workflows, self_hosted_runners, etc.)
- Kubernetes RBAC (Roles, RoleBindings, ClusterRoles)
- Flux GitRepository/Kustomization additions to the platform repo
- Any other form of privilege escalation

Agents must design their workflows to operate within existing permissions. If a task cannot be accomplished with current access, find an alternative approach — do not escalate to the board for more permissions.

**Workaround guidance:**

- **Branch protection**: Enforce via agent policy (this document), not GitHub API
- **Security scanning**: Use local tools (`npm audit`, `pnpm audit`) instead of the GitHub vulnerability alerts API
- **CI runner health**: Verify by triggering workflows, not querying the runner API
- **E2E testing**: Use the `privilegedescalation-dev` namespace where agents have read-write access

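
For the security scanning workaround, a helper can pick the audit tool from the lockfile that is present. This is a sketch with a hypothetical function name; it assumes a repo uses either pnpm or npm, and it checks for pnpm first.

```bash
# audit_deps REPO_DIR runs the local audit tool matching the lockfile found
# in REPO_DIR. Returns 2 when neither lockfile exists.
audit_deps() {
  if [ -f "$1/pnpm-lock.yaml" ]; then
    (cd "$1" && pnpm audit)
  elif [ -f "$1/package-lock.json" ]; then
    (cd "$1" && npm audit)
  else
    echo "no npm or pnpm lockfile found in $1" >&2
    return 2
  fi
}
```

The exit status of the audit tool is passed through, so a CI step or agent script can treat a non-zero result as findings that need triage.
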
## Git Workflow

- All changes go through feature branches and PRs. Never push directly to main.
- **Branch protection**: CEOs must enforce the PR workflow via GitHub branch protection rules wherever possible — require PR reviews, require status checks, restrict who can merge. Policy should be enforced by GitHub, not just by agent prompts.
- Do not approve or merge PRs on the `privilegedescalation/agents` repo — only the board may approve changes to agent configurations and prompts.
- When creating a new pull request, include `cc @cpfarhood` at the bottom of the PR body.

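
The `cc @cpfarhood` rule is easy to forget, so the PR body can be passed through a helper that appends the line when it is missing. The function name `pr_body_with_cc` is hypothetical, and the usage comment assumes the GitHub CLI (`gh`) is installed.

```bash
# pr_body_with_cc BODY prints BODY with "cc @cpfarhood" as its final line,
# adding the line only when the body does not already end with it.
pr_body_with_cc() {
  last_line=$(printf '%s\n' "$1" | tail -n 1)
  if [ "$last_line" = "cc @cpfarhood" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n\ncc @cpfarhood\n' "$1"
  fi
}

# Example (not executed here):
# gh pr create --title "Fix plugin loading" \
#   --body "$(pr_body_with_cc "Fixes the loader race described in the issue.")"
```
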
## PR Workflow

All code changes follow this lifecycle:

1. **Engineer opens a PR** from a feature branch (never push directly to main)
2. **QA (Regina) reviews first** — verifies tests, coverage, regressions, edge cases
3. **CTO (Nancy) reviews second** — verifies architecture alignment, code quality, security. **The CTO must NOT review or approve a PR before QA has approved it.**
4. **CEO (Countess) merges** — only after both QA and CTO have approved and CI passes

**Review order is mandatory.** QA reviews first, CTO reviews second. If the CTO reviews before QA has approved, QA should refuse to review the PR until the process is corrected. A PR is not ready to merge until it has both QA and CTO approval in the correct order. No agent merges their own PRs. No agent merges without dual approval.

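
The lifecycle above behaves like a small state machine. The sketch below, with a hypothetical function name and hypothetical state labels, returns the next required step from the current review and CI state. It cannot detect a wrong order once both approvals exist, so review timestamps would still need to be checked separately.

```bash
# next_step QA CTO CI prints the next required action for a PR.
# QA and CTO are "approved" or any other value; CI is "passing" or any
# other value.
next_step() {
  if [ "$1" != "approved" ] && [ "$2" = "approved" ]; then
    echo "out-of-order"   # CTO approved before QA: process must be corrected
  elif [ "$1" != "approved" ]; then
    echo "qa-review"
  elif [ "$2" != "approved" ]; then
    echo "cto-review"
  elif [ "$3" != "passing" ]; then
    echo "wait-for-ci"
  else
    echo "ceo-merge"
  fi
}
```
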
## Work Distribution

All engineering and devops work must be broken down and distributed by the CTO (Nancy) for engineers to execute. Engineers should not self-assign work — the CTO triages, scopes, and assigns all implementation tasks.

## Issue Tracking

- **GitHub issues are the primary tracker.** All bugs, features, and work items are tracked as GitHub issues in the relevant repo. Paperclip issues are secondary — use them to trigger and coordinate agents (assignments, status handoffs, heartbeat wakes), not as the primary record of work.
- **GitHub issues stay open until deployed and validated.** A GitHub issue is not done when a PR is merged. It is done when the change is deployed to production and validated as working. Merging is a step in the process, not the finish line.

## Task Assignment

To hand off work to another agent, create a Paperclip issue with `assigneeAgentId` set:

    curl -sf -X POST "$PAPERCLIP_API_URL/api/companies/$PAPERCLIP_COMPANY_ID/issues" \
      -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
      -H "Content-Type: application/json" \
      -H "X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID" \
      -d '{"title": "...", "description": "...", "status": "todo", "assigneeAgentId": "<target-agent-id>", "parentId": "<parent-issue-id-if-subtask>"}'

Always include:

- A clear title and description so the assignee understands the work without asking questions
- `assigneeAgentId` — the target agent's ID (find IDs in each agent's CONFIG.md)
- `parentId` if this is a subtask of an existing issue
- A comment on the parent issue noting the delegation

To reassign an existing issue:

    curl -sf -X PATCH "$PAPERCLIP_API_URL/api/issues/{issueId}" \
      -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
      -H "Content-Type: application/json" \
      -H "X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID" \
      -d '{"assigneeAgentId": "<target-agent-id>", "comment": "Reassigning because..."}'

**Never leave work unassigned.** If you cannot do it yourself, assign it to the right agent with context.

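
Hand-written JSON breaks as soon as a title contains a double quote. The sketch below builds the create-issue body from arguments and leaves out `parentId` when the issue is not a subtask. The helper names are hypothetical, and `json_escape` handles only backslashes and double quotes, so multi-line descriptions would need a real JSON tool instead.

```bash
# json_escape STRING escapes backslashes and double quotes only.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# issue_payload TITLE DESCRIPTION ASSIGNEE [PARENT] prints the JSON body for
# the create-issue request. TITLE and ASSIGNEE are required.
issue_payload() {
  if [ -z "$1" ] || [ -z "$3" ]; then
    echo "title and assigneeAgentId are required" >&2
    return 1
  fi
  payload=$(printf '{"title": "%s", "description": "%s", "status": "todo", "assigneeAgentId": "%s"' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")")
  if [ -n "${4:-}" ]; then
    payload="$payload, \"parentId\": \"$(json_escape "$4")\""
  fi
  printf '%s}\n' "$payload"
}
```

The output can be passed to the documented request with `-d "$(issue_payload "$title" "$description" "$agent_id" "$parent_id")"`, where the four shell variables are placeholders supplied by the caller.
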
## CI/CD Workflow Access

Only Hugh Hackman has write access to `.github/workflows/` files. All other agents must delegate CI/CD workflow changes to him.

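
Before pushing a branch, an agent can check whether its changes touch workflow files and need to be delegated. This is a minimal sketch; the function name is hypothetical, and the usage comment assumes the default branch is available as `origin/main`.

```bash
# touches_workflows reads changed file paths on stdin (one per line) and
# succeeds if any of them lives under .github/workflows/ at the repo root.
touches_workflows() {
  grep -q '^\.github/workflows/'
}

# Example (not executed here):
# if git diff --name-only origin/main...HEAD | touches_workflows; then
#   echo "workflow change detected: delegate to Hugh Hackman" >&2
# fi
```
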