forked from groombook/org
Compare commits
14 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 6e0e29f374 | |
| | f69319e270 | |
| | deb507714b | |
| | 4df9637518 | |
| | 11fd2a4c34 | |
| | d6b13fa58d | |
| | c756b16c7c | |
| | 743e913126 | |
| | a0ae03e959 | |
| | 643d9e8956 | |
| | 93927ea402 | |
| | a51b28a315 | |
| | 198ed88350 | |
| | 2ae9c4c2d4 | |
@@ -0,0 +1,76 @@
---
name: devops
description: >
  Infrastructure lifecycle for GroomBook. Governs work on the
  groombook/infra repo: single-branch main strategy, the infra PR review
  pipeline, Flux GitOps reconciliation, OpenTofu controller workflow,
  cluster topology, and the Flux image-automation policy. For application
  code, see the sdlc skill.
---

# DevOps Practices

This skill governs work on **`groombook/infra`**. For application code lifecycle, see the `sdlc` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules (no direct `tofu`, no `kubectl apply` to production, SealedSecrets), see `safety`.

## Gitea authentication

Use the `GITEA_TOKEN` environment variable for all Gitea operations — it is already set in the agent environment. Use the **`tea`** CLI for all Gitea/Git operations (e.g., `tea issue list`, `tea pr create`). Gitea is the primary source of truth.

## Branch strategy

`groombook/infra` uses a single long-lived branch: **`main`**. Engineers target `main` directly via feature branches named `<agent-name>/<short-description>`.
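
The naming convention above can be checked mechanically before pushing. The following is a minimal sketch, not an official hook: the function name `is_feature_branch` is hypothetical, and it assumes both the agent name and the description are written as lowercase letters, digits, and hyphens, which the convention itself does not spell out.

```bash
# Hypothetical helper: succeeds (exit 0) when the branch name has the
# <agent-name>/<short-description> shape, fails (exit 1) otherwise.
# Assumption: each part is lowercase alphanumeric words joined by hyphens.
is_feature_branch() {
  printf '%s\n' "$1" |
    grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*/[a-z0-9]+(-[a-z0-9]+)*$'
}
```

A caller could run `is_feature_branch "$(git rev-parse --abbrev-ref HEAD)"` and refuse to open a PR when it fails.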

## Pipeline

1. **Engineer** branches from `main`, writes code.
2. **Engineer** opens a PR against `main`.
3. **CI** fail → back to **Engineer**.
4. **CI** pass → **QA** performs code review.
5. **QA** rejected → back to **Engineer**.
6. **QA** approved → **CTO** performs code review.
7. **CTO** rejected → back to **Engineer**.
8. **CTO** approved → **Engineer** merges PR → **Flux** reconciles automatically.

```bash
tea pr create --base main --title "..." --body "... cc @cpfarhood"
```

Gitea branch protection requires CI checks to pass. See `coding-standards` for the no-self-merge contract and the `cc @cpfarhood` rule.
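
The hand-offs in the pipeline above amount to a small lookup from (stage, result) to the next owner. The sketch below restates them as a shell function for illustration only; the function name `next_step` and the stage and result keywords are assumptions, not part of any GroomBook tooling.

```bash
# next_step STAGE RESULT -> prints who acts next in the infra PR pipeline.
# STAGE is one of: ci, qa, cto. RESULT is pass/fail for ci and
# approved/rejected for qa and cto (hypothetical keywords).
next_step() {
  case "$1:$2" in
    ci:fail|qa:rejected|cto:rejected)
      echo "engineer: fix and update the PR" ;;
    ci:pass)
      echo "qa: code review" ;;
    qa:approved)
      echo "cto: code review" ;;
    cto:approved)
      echo "engineer: merge; Flux reconciles" ;;
    *)
      echo "unknown stage/result: $1 $2" >&2
      return 1 ;;
  esac
}
```

Every rejection routes straight back to the engineer, which matches steps 3, 5, and 7.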

## Infrastructure topology

* **Production:** namespace `groombook`, FQDN `demo.groombook.dev`
* **UAT:** namespace `groombook-uat`, FQDN `uat.groombook.dev`
* **Dev:** namespace `groombook-dev`, FQDN `dev.groombook.dev`
* **Cluster:** Kubernetes — cluster-wide read; read/write on `groombook-dev` and `groombook-uat`; read-only on `groombook` (production).
* **Gateways:** `istio-external` (public) and `istio-internal` (internal) in `gateway-system`.
* **Container registry:** `git.farh.net/groombook/<service>` only.
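
For scripts that need the topology above, the mapping can be kept in one place. This is a sketch under stated assumptions: the function names and the short environment keys `prod`, `uat`, and `dev` are illustrative (the keys mirror the overlay directory names), and the access check only restates the permissions listed above rather than querying the cluster.

```bash
# env_namespace ENV -> Kubernetes namespace for prod, uat, or dev.
env_namespace() {
  case "$1" in
    prod) echo "groombook" ;;
    uat)  echo "groombook-uat" ;;
    dev)  echo "groombook-dev" ;;
    *)    return 1 ;;
  esac
}

# env_fqdn ENV -> public hostname for prod, uat, or dev.
env_fqdn() {
  case "$1" in
    prod) echo "demo.groombook.dev" ;;
    uat)  echo "uat.groombook.dev" ;;
    dev)  echo "dev.groombook.dev" ;;
    *)    return 1 ;;
  esac
}

# namespace_access NAMESPACE -> "read-write" or "read-only".
# Only the dev and UAT namespaces are writable; everything else,
# including production, falls under cluster-wide read.
namespace_access() {
  case "$1" in
    groombook-dev|groombook-uat) echo "read-write" ;;
    *)                           echo "read-only" ;;
  esac
}
```

A guard such as `[ "$(namespace_access "$ns")" = "read-write" ]` before any mutating `kubectl` call keeps production changes on the Flux path.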

## GitOps (Flux)

Flux watches `groombook/infra` as the **target** GitRepository — it is **not** a Flux bootstrap/cluster repo and must never be treated as one.

Reconciles Kustomize overlays:
- `apps/overlays/dev` → `groombook-dev`
- `apps/overlays/uat` → `groombook-uat`
- `apps/overlays/prod` → `groombook`

Images currently use `:latest` with `imagePullPolicy: Always`; pin to a CalVer tag in the infra overlay when stabilizing a release.

**Policy — Flux Image Tag Automation is DENIED.** Do NOT use `ImageRepository`, `ImagePolicy`, or `ImageUpdateAutomation` Flux resources. Image tag updates must be made intentionally via a PR to `groombook/infra` — typically as the final step of the `sdlc` application pipeline (Phase 5).
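
One way to catch a violation of this policy before review is to scan the manifests for the three denied kinds. The function below is a rough sketch, not an existing CI check; the name `find_image_automation` is hypothetical, and it assumes each manifest declares its kind on a line of its own starting with `kind:`.

```bash
# find_image_automation DIR -> lists files under DIR that declare one of the
# denied Flux image-automation kinds. Exit status 0 means at least one
# offending file was found; 1 means the tree is clean.
find_image_automation() {
  grep -rlE \
    '^kind:[[:space:]]*(ImageRepository|ImagePolicy|ImageUpdateAutomation)[[:space:]]*$' \
    "$1"
}
```

Because `grep` succeeds on a match, a caller inverts the result: `if find_image_automation apps/; then echo "policy violation" >&2; exit 1; fi`.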

## Infrastructure as Code

Terraform (OpenTofu) is deployed via the **Flux OpenTofu Controller** in a GitOps fashion. Submit Terraform configurations via a PR to `groombook/infra` — the tofu controller reconciles them on merge. See `safety` for the prohibition on running `tofu` directly and on `kubectl apply` against production.

## Infra-only tools

These are the operators and controllers the infra repo installs and manages. Alternatives are policy violations:

* **GitOps:** Flux CD (managed externally; reconciles `groombook/infra`).
* **IaC:** Flux OpenTofu Controller.
* **Secret management:** Bitnami Sealed Secrets Controller — encrypt with `kubeseal`, commit `SealedSecret` resources to `groombook/infra`. No plain Kubernetes secrets.
* **Database operator:** CloudNativePG (Postgres).
* **Cache / pub-sub operator:** DragonflyDB.
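
For the Sealed Secrets entry above, the usual flow is to render a Secret locally without applying it and pipe it through `kubeseal`. The wrapper below is a sketch only: the function name `seal_secret` and its arguments are hypothetical, and it assumes `kubeseal` can reach the controller with its default settings (a non-default controller name or namespace would need extra flags).

```bash
# seal_secret NAME NAMESPACE KEY=VALUE -> prints a SealedSecret manifest.
# The plain Secret exists only in the pipe; nothing is applied to the cluster.
seal_secret() {
  kubectl create secret generic "$1" \
    --namespace "$2" \
    --from-literal="$3" \
    --dry-run=client -o yaml |
    kubeseal --format yaml
}
```

The output would be redirected into a file in the relevant overlay and committed through the normal PR pipeline. Passing the value on the command line is acceptable for a sketch, but it lands in shell history, so a real workflow may prefer `--from-file`.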

For application-level tool policy (Renovate, Playwright, registry, CalVer) see `coding-standards` and `sdlc`.
@@ -29,3 +29,23 @@ The following rules apply to every GroomBook agent without exception.
## If you are unsure

If you are unsure whether an action is safe, **stop**. Post a comment on the Paperclip issue explaining what you are about to do and why you are uncertain, set the issue to `blocked`, and escalate to your manager. Do not guess.

## Board approval scope

Board approval (`request_board_approval`) is reserved for one-way-door decisions:

* **Actions requiring a human operator** in a third-party portal (e.g. Gitea Owners team config, external vendor consoles).
* **Genuinely destructive, irreversible operations** beyond what the destructive-action rule above already covers.
* **Out-of-scope decisions** that exceed the agent's mandate.
* **New spend or resource authorizations.**
* **Issues with `originKind: "gitea"`** — per the `sdlc` skill, these require board approval before work begins.

Board approval is **never** used for routine SDLC pipeline steps:

* QA handoffs, UAT promotion, security review hand-off.
* Returning a failing PR to the engineer or CTO.
* Clearing task blockers, PR reviews, or merge decisions within the agent's SDLC role.
* Feature triage decisions (Accepted / Backlogged / Denied).
* Any standard dev → uat → prod progression.

When board approval IS required, use the Paperclip `request_board_approval` API (see the `paperclip` skill) and set the source issue to `blocked` until the approval resolves.

+55
-139
@@ -1,193 +1,109 @@
---
name: sdlc
description: >
  Software development lifecycle for GroomBook. Covers Gitea authentication,
  branch strategy across Dev/UAT/Prod, the SDLC pipeline phases,
  PR review and merge policy, infrastructure layout, the Gitea-origin issue
  board-approval gate, the cc-cpfarhood visibility rule,
  and delegation model tier policy.
  Software development lifecycle for GroomBook application repos. Covers
  Gitea authentication, the 3-branch dev/uat/main strategy, the SDLC
  pipeline phases 1-5, the Stage 1 CI image build, the authentication
  framework, and application-tool policy. For infrastructure
  (groombook/infra), see the devops skill.
---

# Software Development Lifecycle

This skill governs **application code repos**. For infrastructure (`groombook/infra`), see the `devops` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules, see `safety`.

## Gitea authentication

**Use the `GITEA_TOKEN`** environment variable for all Gitea operations. It is already set in the agent environment. Use the **`tea`** CLI for all Gitea/Git operations (e.g., `tea issue list`, `tea pr create`). The token expires when the environment variable is rotated — re-invoke any Gitea operation if you get a 401.

Gitea is the **primary source of truth**. Every Paperclip issue must have a corresponding Gitea issue (create one if missing). Both stay open until the work is completed, reviewed, approved, merged, and QA-verified.

## Gitea-origin issue policy — board approval required

If a task originated from Gitea (`originKind: "gitea"`), **do not begin work**. Immediately create a board approval:

```
POST /api/companies/{companyId}/approvals
{
  "type": "request_board_approval",
  "requestedByAgentId": "{your-agent-id}",
  "issueIds": ["{issueId}"],
  "payload": {
    "title": "Board approval required: Gitea issue",
    "summary": "Summarize what the Gitea issue requests.",
    "recommendedAction": "Approve to begin work.",
    "risks": ["Work begins without board review if approved."]
  }
}
```

Set the issue to `blocked` with a comment linking to the approval. Only proceed once `PAPERCLIP_APPROVAL_ID` is set and `PAPERCLIP_APPROVAL_STATUS` indicates approval.
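
The "only proceed once" condition can be expressed as a guard at the top of any work script. This is a minimal sketch: the function name `approval_granted` is hypothetical, and the literal status value `approved` is an assumption, since the text above only says the status must indicate approval.

```bash
# approval_granted -> exit 0 only when an approval ID is present and the
# status reads "approved" (assumed value; adjust to what Paperclip reports).
approval_granted() {
  [ -n "${PAPERCLIP_APPROVAL_ID:-}" ] &&
    [ "${PAPERCLIP_APPROVAL_STATUS:-}" = "approved" ]
}
```

A script would start with `approval_granted || { echo "blocked: awaiting board approval" >&2; exit 1; }` so that a missing or pending approval stops work before any change is made.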

## Branch strategy

Three long-lived branches map to the three deployment environments:

| Branch | Environment | Who merges |
|--------|-------------|-----------|
| `dev` | Dev | Engineer (self-merges after CI passes) |
| `uat` | UAT | QA (merges after code review) |
| `main` | Production | CEO (merges after UAT validation) |
| Branch | Environment | Who merges | Prerequisites for merge |
|--------|-------------|-----------|-----------|
| `dev` | Dev | Engineer | CI passes |
| `uat` | UAT | Engineer | QA code review approval |
| `main` | Production | Engineer | UAT validation & CTO code review |

**Engineers always target `dev`** — never `uat` or `main` directly. Feature branches: `<agent-name>/<short-description>`.
**Engineers always target `dev` first** — never `uat` or `main` directly.
- Feature branches: `<agent-name>/<short-description>`.

## Pull requests

All changes happen via pull request. Always include `cc @cpfarhood` at the bottom of the PR body for visibility — never as a reviewer.
All changes happen via pull request. Gitea branch protection requires CI checks to pass. See `coding-standards` for the no-self-merge contract and the `cc @cpfarhood` visibility rule.

```bash
tea pr create --base dev --title "..." --body "... cc @cpfarhood"
```

Gitea branch protection requires CI checks (lint, test, build-and-push). Governance is enforced through Paperclip.

## PR review & merge policy

### Dev branch (`dev`)
- **Engineer** self-merges after CI passes. Dev is for validation, not quality gates.
- **QA (Lint Roller `525c2c39-1196-4682-9cd1-0bcfcb0d0f31`)** reviews the PR. Fail → back to engineer with exact details.
- QA approves and hands off to CTO.
- **CTO (The Dogfather `c370d244-3c3b-4f21-a403-4cdc9dbdbf96`)** reviews the PR. Fail → back to engineer.
- **CTO** merges the dev PR.

### UAT branch (`uat`)
- **CTO** opens and merges a PR from `dev` to `uat`.
- **CI** builds and deploys automatically to UAT (`https://uat.groombook.dev`).
- **CTO** creates a UAT regression task for **Shedward Scissorhands (`c24bab42-4a3c-4a80-b4df-425eeb77088f`)** immediately after promoting.

### Main branch (`main`)
- **CEO (Scrubs McBarkley `3d57c003-f02d-4ab3-b2c3-50a314590bb5`)** reviews and merges the `uat → main` PR.
- **CI** deploys automatically to Production (`https://demo.groombook.dev`).

`@cpfarhood` is cc'd for visibility on all PRs — never as a reviewer.

## SDLC pipeline

### Phase 1 — Dev

1. **Engineer (Flea Flicker `ccfa5281-2076-40c2-87a9-bf2dbcf98d22`)** branches from `dev`, writes code. GitOps deploys to dev on demand.
2. **Engineer** opens a PR against `dev`. CI must pass.
3. **Engineer** self-merges after CI passes.
4. **CI** builds and deploys automatically to Dev (`https://dev.groombook.dev`).
5. **QA (Lint Roller)** reviews the PR. Fail → back to engineer.
6. QA approves and hands off to CTO.
7. **CTO (The Dogfather)** reviews the PR. Fail → back to engineer.
8. **CTO** merges the dev PR.
1. **Engineer** branches from `dev`, writes code.
2. **Engineer** opens a PR against `dev`.
3. **CI** fail → back to **Engineer**.
4. **CI** pass → **Engineer** merges PR.
5. **CI** builds and deploys automatically to Dev (`https://dev.groombook.dev`).

### Phase 2 — UAT promotion

9. **CTO** opens and merges a PR from `dev` to `uat`.
10. **CI** builds and deploys automatically to UAT (`https://uat.groombook.dev`).
1. **Engineer** opens a PR from `dev` to `uat`.
2. **CI** fail → back to **Engineer** (return to Phase 1).
3. **CI** pass → **QA** performs code review.
4. **QA** rejected → back to **Engineer** (return to Phase 1).
5. **QA** approved → **Engineer** merges PR.
6. **CI** builds and deploys automatically to UAT (`https://uat.groombook.dev`).

### Phase 3 — UAT testing & security
### Phase 3 — User Testing & Security Review

11. **UAT (Shedward Scissorhands)** runs full regression against UAT — every feature, old and new, no exceptions.
12. UAT fail → CTO redistributes to engineer (return to Phase 1).
13. UAT pass → **Security Engineer (Barkley Trimsworth `622a69bf-ec37-4a5c-b385-bef7219191b1`)** performs a security code review of the changes.
14. Security fail → CTO redistributes to engineer (return to Phase 1).
1. **UAT (Shedward Scissorhands)** runs full regression against UAT — every feature, old and new, no exceptions.
2. **UAT** fail → back to **Engineer** (return to Phase 1).
3. **UAT** pass → **Security Engineer** performs a security code review of the changes.
4. **Security** fail → back to **Engineer** (return to Phase 1).
5. **Security** pass → Begin Phase 4.

### Phase 4 — Production
### Phase 4 — Production Promotion

15. Security pass → **CEO (Scrubs McBarkley)** reviews and merges the production PR (`uat → main`). Fail → back to CTO.
16. **CI** deploys automatically to Production (`https://demo.groombook.dev`).
1. **Engineer** opens a PR from `uat` to `main`.
2. **CI** fail → back to **Engineer** (return to Phase 1).
3. **CI** pass → **CTO** performs code review.
4. **CTO** rejected → back to **Engineer** (return to Phase 1).
5. **CTO** approved → **Engineer** merges PR.
6. **CI** fail → back to **Engineer** (return to Phase 1).
7. **CI** pass → Begin Phase 5.

### Hierarchy rules
### Phase 5 — Production Deployment

* CTO rejections at Dev go directly to the engineer (not back through QA).
* UAT failures (Shedward Scissorhands) go to CTO — CTO cascades to engineer.
* Security failures (Barkley Trimsworth) go to CTO — CTO cascades to engineer.
* CEO rejections at Prod go to CTO.
The **Engineer** opens a PR against `groombook/infra` to update the relevant Kustomize overlay with the new image tag. From this point the work follows the **`devops` skill pipeline** end-to-end — review, merge, and Flux reconciliation are all owned there. On merge, Flux rolls out the updated pods to production (`https://demo.groombook.dev`).
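
As an illustration of the overlay update, the image reference can be composed from the service name and the CalVer tag and then handed to the Kustomize CLI. This is a sketch under assumptions: `image_ref` is a hypothetical helper, the service name `api` and the tag `2025.01.15` are made up, and the commented command presumes the overlay is edited with `kustomize edit set image` rather than by hand.

```bash
# image_ref SERVICE TAG -> full image reference in the GroomBook registry.
image_ref() {
  printf 'git.farh.net/groombook/%s:%s\n' "$1" "$2"
}

# Hypothetical usage from inside apps/overlays/prod (needs the kustomize CLI):
#   kustomize edit set image \
#     "git.farh.net/groombook/api=$(image_ref api 2025.01.15)"
```

The resulting change to `kustomization.yaml` is what the infra PR carries; nothing is applied to the cluster directly.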

> **Note on penetration testing:** Barkley Trimsworth performs scheduled penetration testing against Prod independently of the PR workflow. Board-authorized. Not triggered per-PR.
## Stage 1 CI — Image build

## Delegation model tier
Triggered automatically on every merge to `main` in an application repo:
- Builds and tags the Docker image: CalVer (`YYYY.MM.DD[.N]`), `latest`, and `sha-<hash>`
- Pushes tagged images to `git.farh.net/groombook/<service>` (see `coding-standards` for the registry and CalVer policy)
- Creates a CalVer git tag in the source repo
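
The `YYYY.MM.DD[.N]` scheme implies choosing the next free tag for a given day. The function below sketches one way to do that; it is not the actual CI implementation. The name `next_calver` is hypothetical, and it assumes the first build of a day gets the bare date while later builds get `.1`, `.2`, and so on, which the format string suggests but does not state.

```bash
# next_calver DATE [EXISTING_TAG...] -> next unused tag for DATE (YYYY.MM.DD).
# Assumption: bare DATE is the first build of the day, DATE.N are later ones.
# Note: plain POSIX sh, so the working variables are global.
next_calver() {
  base=$1
  shift
  max=-1
  for tag in "$@"; do
    n=""
    case "$tag" in
      "$base")   n=0 ;;                  # bare date counts as build 0
      "$base".*) n=${tag#"$base".} ;;    # numeric suffix after the date
    esac
    case "$n" in
      ''|*[!0-9]*) continue ;;           # not a tag for this date
    esac
    if [ "$n" -gt "$max" ]; then max=$n; fi
  done
  if [ "$max" -lt 0 ]; then
    echo "$base"
  else
    echo "$base.$((max + 1))"
  fi
}
```

In a repository this might be called as `next_calver "$(date -u +%Y.%m.%d)" $(git tag --list)`, leaving the tag list unquoted on purpose so each tag becomes its own argument.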

When creating subtasks for other agents, set `modelProfile: "cheap"` only for:
- Mechanical refactors or repetitive operations
- Basic information lookups
- Well-specified, bounded updates

Leave `modelProfile` unset for anything requiring judgment, reasoning, or QA review. When in doubt, leave it unset.

## Infrastructure

* **Production:** namespace `groombook`, FQDN `demo.groombook.dev`
* **UAT:** namespace `groombook-uat`, FQDN `uat.groombook.dev`
* **Dev:** namespace `groombook-dev`, FQDN `dev.groombook.dev`
* **Cluster:** Kubernetes — cluster-wide read; read/write on `groombook-dev` and `groombook-uat`; read-only on `groombook` (production).
* **Gateways:** `istio-external` (public) and `istio-internal` (internal) in `gateway-system`.
* **Container registry:** `git.farh.net/groombook/<service>` only.
Stage 2 (Flux GitOps deployment) is owned by `devops`.

## Authentication

* **Framework:** Better-Auth.
* **Social login:** Google and Apple OAuth.
* **OAuth Providers:** GroomBook (Authentik), Google, and Apple.
* **SSO:** Authentik OIDC at `https://auth.farh.net` (credentials in `authentik-credentials` secret).
* **Never build custom authentication.**

## Deployment — 2-stage Flux GitOps
## Application tools (canonical, not alternatives)

**Stage 1 — CI (runs in each application repo):**
- Triggered automatically on every merge to `main`
- Builds and tags the Docker image: CalVer (`YYYY.MM.DD[.N]`), `latest`, and `sha-<hash>`
- Pushes tagged images to `git.farh.net/groombook/<service>`
- Creates a CalVer git tag in the source repo
These are application-level dependency choices. Alternatives are policy violations:

**Stage 2 — GitOps (Flux, managed externally):**
- Flux watches `groombook/infra` as the **target** GitRepository — it is **not** a Flux bootstrap/cluster repo and must never be treated as one.
- Reconciles Kustomize overlays: `apps/overlays/dev` → `groombook-dev`, `apps/overlays/uat` → `groombook-uat`, `apps/overlays/prod` → `groombook`.
- Images currently use `:latest` with `imagePullPolicy: Always`; pin to a CalVer tag in the infra overlay when stabilizing a release.

**Policy — Flux Image Tag Automation is DENIED.** Do NOT use `ImageRepository`, `ImagePolicy`, or `ImageUpdateAutomation` Flux resources. Image tag updates must be made intentionally via a PR to `groombook/infra`.

**To deploy a change:**
1. Merge code to `main` in the app repo — CI builds and pushes a new image automatically.
2. Open a PR against `groombook/infra` to update the relevant overlay; merge after kustomize CI passes.
3. Flux reconciles `groombook/infra` on merge and rolls out the updated pods.

**To force a rollout without a manifest change:**
```bash
kubectl rollout restart deployment/<name> -n <namespace>
```

## Infrastructure as Code

Terraform (OpenTofu) is deployed via the **Flux OpenTofu Controller** in a GitOps fashion. Submit Terraform configurations via a PR to `groombook/infra` — the tofu controller reconciles them on merge.

**Never run `tofu` directly.** Never `kubectl apply` against production. Production changes go through Flux only. The `groombook-dev` and `groombook-uat` namespaces permit direct kubectl use for iteration.

## Tools (canonical, not alternatives)

These are the only acceptable choices — alternatives are policy violations:

* **Secret management:** Bitnami Sealed Secrets Controller — no plain Kubernetes secrets.
* **Database:** CloudNativePG Operator (Postgres) — no SQLite, MariaDB, or MySQL.
* **Cache / pub-sub:** DragonflyDB Operator — no Redis.
* **Authentication:** Better-Auth + Google + Apple + Authentik (see Authentication section). Never build custom auth.
* **Database:** CloudNativePG-managed Postgres — no SQLite, MariaDB, or MySQL.
* **Cache / pub-sub:** DragonflyDB — no Redis.
* **Authentication:** Better-Auth + Google + Apple + Authentik (see Authentication above).
* **Dependency updates:** Mend Renovate. **Dependabot is not used and will not be used.** Do not configure it.
* **Container registry:** `git.farh.net/groombook/<service>` — no Docker Hub for first-party images.
* **Browser automation:** the `playwright` MCP server (`http://playwright:8931/mcp`). Target dev only — never test production.

If a task requires deviating from any of the above, treat it as a destructive action: stop, file an issue with rationale, request board approval.

## External communication

When communicating in any context visible outside the GroomBook agent team (external users, human reviewers, non-agent entities), include `cc @cpfarhood` for visibility — never as a reviewer.
For the container registry, CalVer versioning, and general PR/test discipline, see `coding-standards`. For the operator install side (CNPG, Dragonfly, Sealed Secrets), see `devops`.