Compare commits
15 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 738f5d8f05 | | |
| | cffa73cd97 | | |
| | cd5d9c5614 | | |
| | 6e0e29f374 | | |
| | f69319e270 | | |
| | deb507714b | | |
| | 40f8153c86 | | |
| | 4df9637518 | | |
| | 11fd2a4c34 | | |
| | d6b13fa58d | | |
| | c756b16c7c | | |
| | 743e913126 | | |
| | a0ae03e959 | | |
| | 643d9e8956 | | |
| | 93927ea402 | | |
@@ -0,0 +1,76 @@
---
name: devops
description: >
  Infrastructure lifecycle for GroomBook. Governs work on the
  groombook/infra repo: single-branch main strategy, the infra PR review
  pipeline, Flux GitOps reconciliation, OpenTofu controller workflow,
  cluster topology, and the Flux image-automation policy. For application
  code, see the sdlc skill.
---

# DevOps Practices

This skill governs work on **`groombook/infra`**. For application code lifecycle, see the `sdlc` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules (no direct `tofu`, no `kubectl apply` to production, SealedSecrets), see `safety`.

## Gitea authentication

Use the `GITEA_TOKEN` environment variable for all Gitea operations — it is already set in the agent environment. Use the **`tea`** CLI for all Gitea/Git operations (e.g., `tea issue list`, `tea pr create`). Gitea is the primary source of truth.

## Branch strategy

`groombook/infra` uses a single long-lived branch: **`main`**. Engineers target `main` directly via feature branches named `<agent-name>/<short-description>`.
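
As a minimal sketch of the naming convention above, the helper below checks that a branch name has the `<agent-name>/<short-description>` shape. The function name and the allowed character set (lowercase letters, digits, and hyphens) are assumptions for illustration; the skill itself only specifies the two-part form.

```bash
# Hypothetical helper: succeed only if the name looks like
# <agent-name>/<short-description>.
# Assumption: both parts are lowercase words joined by hyphens.
is_valid_branch() {
  printf '%s\n' "$1" |
    grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*/[a-z0-9]+(-[a-z0-9]+)*$'
}
```

One possible use is a guard before opening a PR, for example `is_valid_branch "$(git rev-parse --abbrev-ref HEAD)" || echo "rename the branch first"`.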

## Pipeline

1. **Engineer** branches from `main`, writes code.
2. **Engineer** opens a PR against `main`.
3. **CI** fail → back to **Engineer**.
4. **CI** pass → **QA** performs code review.
5. **QA** rejected → back to **Engineer**.
6. **QA** approved → **CTO** performs code review.
7. **CTO** rejected → back to **Engineer**.
8. **CTO** approved → **Engineer** merges PR → **Flux** reconciles automatically.
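
The eight steps above can be read as a small transition table. The sketch below encodes them that way; the stage names, result names, and the `next_step` function are illustrative assumptions, not identifiers defined by this skill or by Gitea.

```bash
# Hypothetical encoding of the infra review pipeline.
# Input: the stage that just finished and its result.
# Output: who acts next, following steps 3-8 of the list above.
next_step() {
  case "$1:$2" in
    ci:fail|qa:rejected|cto:rejected) echo "engineer" ;;       # steps 3, 5, 7
    ci:pass)      echo "qa-review" ;;                           # step 4
    qa:approved)  echo "cto-review" ;;                          # step 6
    cto:approved) echo "engineer-merge" ;;                      # step 8, then Flux reconciles
    *) echo "unknown transition: $1:$2" >&2; return 1 ;;
  esac
}
```

Every failure or rejection routes back to the engineer, and only a CTO approval leads to a merge.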

```bash
tea pr create --base main --title "..." --body "... cc @cpfarhood"
```

Gitea branch protection requires CI checks to pass. See `coding-standards` for the no-self-merge contract and the `cc @cpfarhood` rule.

## Infrastructure topology

* **Production:** namespace `groombook`, FQDN `demo.groombook.dev`
* **UAT:** namespace `groombook-uat`, FQDN `uat.groombook.dev`
* **Dev:** namespace `groombook-dev`, FQDN `dev.groombook.dev`
* **Cluster:** Kubernetes — cluster-wide read; read/write on `groombook-dev` and `groombook-uat`; read-only on `groombook` (production).
* **Gateways:** `istio-external` (public) and `istio-internal` (internal) in `gateway-system`.
* **Container registry:** `git.farh.net/groombook/<service>` only.

## GitOps (Flux)

Flux watches `groombook/infra` as the **target** GitRepository — it is **not** a Flux bootstrap/cluster repo and must never be treated as one.

Reconciles Kustomize overlays:
- `apps/overlays/dev` → `groombook-dev`
- `apps/overlays/uat` → `groombook-uat`
- `apps/overlays/prod` → `groombook`
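
For scripting against this layout, the overlay-to-namespace mapping above can be captured in one lookup. This is a sketch only; the `overlay_for` name and the short environment keys (`dev`, `uat`, `prod`) are assumptions chosen for illustration.

```bash
# Hypothetical lookup: print "<overlay path> <namespace>" for an environment,
# using the three overlays listed above.
overlay_for() {
  case "$1" in
    dev)  echo "apps/overlays/dev groombook-dev" ;;
    uat)  echo "apps/overlays/uat groombook-uat" ;;
    prod) echo "apps/overlays/prod groombook" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

overlay_for uat   # prints: apps/overlays/uat groombook-uat
```

The overlay path could then be passed to a local, read-only render such as `kustomize build apps/overlays/uat` before a PR is opened.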

Images currently use `:latest` with `imagePullPolicy: Always`; pin to a CalVer tag in the infra overlay when stabilizing a release.

**Policy — Flux Image Tag Automation is DENIED.** Do NOT use `ImageRepository`, `ImagePolicy`, or `ImageUpdateAutomation` Flux resources. Image tag updates must be made intentionally via a PR to `groombook/infra` — typically as the final step of the `sdlc` application pipeline (Phase 5).

## Infrastructure as Code

Terraform (OpenTofu) is deployed via the **Flux OpenTofu Controller** in a GitOps fashion. Submit Terraform configurations via a PR to `groombook/infra` — the tofu controller reconciles them on merge. See `safety` for the prohibition on running `tofu` directly and on `kubectl apply` against production.

## Infra-only tools

These are the operators and controllers the infra repo installs and manages. Alternatives are policy violations:

* **GitOps:** Flux CD (managed externally; reconciles `groombook/infra`).
* **IaC:** Flux OpenTofu Controller.
* **Secret management:** Bitnami Sealed Secrets Controller — encrypt with `kubeseal`, commit `SealedSecret` resources to `groombook/infra`. No plain Kubernetes secrets.
* **Database operator:** CloudNativePG (Postgres).
* **Cache / pub-sub operator:** DragonflyDB.
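
As a rough sketch of the Sealed Secrets flow named in the list above, the wrapper below renders a Secret locally with a client-side dry run (nothing is applied to the cluster) and pipes it through `kubeseal`. The function name and its arguments are hypothetical. It also assumes `kubectl` and `kubeseal` are installed and that `kubeseal` can find the controller with its defaults; if the controller runs under a different name or namespace, `--controller-name` and `--controller-namespace` would likely be needed.

```bash
# Hypothetical wrapper: build a Secret manifest locally and seal it.
# Arguments: namespace, secret name, key, value.
# The plain Secret only exists in the pipe; only the SealedSecret is printed.
seal_secret() {
  kubectl create secret generic "$2" \
    --namespace "$1" \
    --from-literal="$3=$4" \
    --dry-run=client -o yaml |
    kubeseal --format yaml
}
```

The output would be redirected into a file under the relevant overlay and committed via a PR; the file name and location are left to the repo's existing layout.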

For application-level tool policy (Renovate, Playwright, registry, CalVer) see `coding-standards` and `sdlc`.

@@ -26,6 +26,43 @@ The following rules apply to every GroomBook agent without exception.

* **Never run `tofu` directly.** Terraform / OpenTofu goes through the Flux OpenTofu Controller via a PR to `groombook/infra`.

* **Always read-before-write when updating `adapterConfig.env`.** The Paperclip `PATCH /api/agents/{agentId}` endpoint with an `adapterConfig.env` body **replaces the entire env object** — sending a partial payload silently drops every key you did not include. Before writing any env variable, read the current config first, merge your changes on top, and send the full merged object:

```bash
# 1. Read existing config
existing=$(curl -s "$PAPERCLIP_API_URL/api/agents/<agentId>" \
  -H "Authorization: Bearer $PAPERCLIP_API_KEY")

# 2. Merge: spread existing env, then apply new keys on top
curl -s -X PATCH "$PAPERCLIP_API_URL/api/agents/<agentId>" \
  -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID" \
  -H "Content-Type: application/json" \
  -d "$(echo "$existing" | jq '.adapterConfig.env + {"NEW_KEY": {"type":"plain","value":"val"}} | {adapterConfig: {env: .}}')"
```

Skipping the read step is a destructive operation — it erases all existing env vars for that agent.

## If you are unsure

If you are unsure whether an action is safe, **stop**. Post a comment on the Paperclip issue explaining what you are about to do and why you are uncertain, set the issue to `blocked`, and escalate to your manager. Do not guess.

## Board approval scope

Board approval (`request_board_approval`) is reserved for one-way-door decisions:

* **Actions requiring a human operator** in a third-party portal (e.g. Gitea Owners team config, external vendor consoles).
* **Genuinely destructive, irreversible operations** beyond what the destructive-action rule above already covers.
* **Out-of-scope decisions** that exceed the agent's mandate.
* **New spend or resource authorizations.**
* **Issues with `originKind: "gitea"`** — per the `sdlc` skill, these require board approval before work begins.

Board approval is **never** used for routine SDLC pipeline steps:

* QA handoffs, UAT promotion, security review hand-off.
* Returning a failing PR to the engineer or CTO.
* Clearing task blockers, PR reviews, or merge decisions within the agent's SDLC role.
* Feature triage decisions (Accepted / Backlogged / Denied).
* Any standard dev → uat → prod progression.

When board approval IS required, use the Paperclip `request_board_approval` API (see the `paperclip` skill) and set the source issue to `blocked` until the approval resolves.

+35
-66
@@ -1,29 +1,30 @@
---
name: sdlc
description: >
  Software development lifecycle for GroomBook. Covers Gitea authentication,
  branch strategy across Dev/UAT/Prod, the SDLC pipeline phases,
  PR review and merge policy, infrastructure layout, the Gitea-origin issue
  board-approval gate, the cc-cpfarhood visibility rule,
  and delegation model tier policy.
  Software development lifecycle for GroomBook application repos. Covers
  Gitea authentication, the 3-branch dev/uat/main strategy, the SDLC
  pipeline phases 1-5, the Stage 1 CI image build, the authentication
  framework, and application-tool policy. For infrastructure
  (groombook/infra), see the devops skill.
---

# Software Development Lifecycle

This skill governs **application code repos**. For infrastructure (`groombook/infra`), see the `devops` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules, see `safety`.

## Gitea authentication

**Use the `GITEA_TOKEN`** environment variable for all Gitea operations. It is already set in the agent environment. Use the **`tea`** CLI for all Gitea/Git operations (e.g., `tea issue list`, `tea pr create`). The token expires when the environment variable is rotated — re-invoke any Gitea operation if you get a 401.

Gitea is the **primary source of truth**. Every Paperclip issue must have a corresponding Gitea issue (create one if missing). Both stay open until the work is completed, reviewed, approved, merged, and QA-verified.

## Branch strategy

Three long-lived branches map to the three deployment environments:

| Branch | Environment | Who merges | Prerequisites for merge |
|--------|-------------|-----------|-----------|
| `dev` | Dev | Engineer | None (self-merges after CI passes) |
| `dev` | Dev | Engineer | CI passes |
| `uat` | UAT | Engineer | QA code review approval |
| `main` | Production | Engineer | UAT validation & CTO code review |

@@ -32,14 +33,12 @@ Three long-lived branches map to the three deployment environments:

## Pull requests

All changes happen via pull request. Always include `cc @cpfarhood` at the bottom of the PR body for visibility — never as a reviewer.
All changes happen via pull request. Gitea branch protection requires CI checks to pass. See `coding-standards` for the no-self-merge contract and the `cc @cpfarhood` visibility rule.

```bash
tea pr create --base dev --title "..." --body "... cc @cpfarhood"
```

Gitea branch protection requires CI checks to pass.

## SDLC pipeline

### Phase 1 — Dev
@@ -65,76 +64,46 @@ Gitea branch protection requires CI checks to pass.
2. **UAT** fail → back to **Engineer** (return to Phase 1).
3. **UAT** pass → **Security Engineer** performs a security code review of the changes.
4. **Security** fail → back to **Engineer** (return to Phase 1).
5. **Security** pass → **Engineer** opens a PR from `uat` to `main`.
5. **Security** pass → Begin Phase 4.

### Phase 4 — Production Promotion

1. **Engineer** opens a PR from `uat` to `main`.
2. **CI** fail → back to **Engineer** (return to Phase 1).
3. **CI** pass → **CTO** performs code review.
4. **CTO** rejected → back to **Engineer** (return to Phase 1).
5. **CTO** approved → **Engineer** merges PR.
6. **CI** fail → back to **Engineer** (return to Phase 1).
7. **CI** pass → Begin Phase 4.
7. **CI** pass → Begin Phase 5.

### Phase 4 — Production
### Phase 5 — Production Deployment

1. **CTO** performs code review.
2. **CTO** approved → **Engineer** merges PR.
3. **CTO** rejected → back to **Engineer** (return to Phase 1).
4. **CI** deploys automatically to Production (`https://demo.groombook.dev`).
The **Engineer** opens a PR against `groombook/infra` to update the relevant Kustomize overlay with the new image tag. From this point the work follows the **`devops` skill pipeline** end-to-end — review, merge, and Flux reconciliation are all owned there. On merge, Flux rolls out the updated pods to production (`https://demo.groombook.dev`).

## Infrastructure
## Stage 1 CI — Image build

* **Production:** namespace `groombook`, FQDN `demo.groombook.dev`
* **UAT:** namespace `groombook-uat`, FQDN `uat.groombook.dev`
* **Dev:** namespace `groombook-dev`, FQDN `dev.groombook.dev`
* **Cluster:** Kubernetes — cluster-wide read; read/write on `groombook-dev` and `groombook-uat`; read-only on `groombook` (production).
* **Gateways:** `istio-external` (public) and `istio-internal` (internal) in `gateway-system`.
* **Container registry:** `git.farh.net/groombook/<service>` only.
Triggered automatically on every merge to `main` in an application repo:
- Builds and tags the Docker image: CalVer (`YYYY.MM.DD[.N]`), `latest`, and `sha-<hash>`
- Pushes tagged images to `git.farh.net/groombook/<service>` (see `coding-standards` for the registry and CalVer policy)
- Creates a CalVer git tag in the source repo
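
To make the `YYYY.MM.DD[.N]` scheme concrete, here is a minimal sketch of how a next tag could be chosen. The real CI job is not shown in this diff, so the `next_calver` function and its rule are assumptions: the first build of a day gets the bare date, and later builds that day get `.1`, `.2`, and so on (suffixes without leading zeros).

```bash
# Hypothetical helper: print the next CalVer tag for a given day.
# $1    = the day, formatted YYYY.MM.DD
# stdin = existing tags, one per line (tags for other days are ignored)
next_calver() {
  local day tag n max
  day="$1"
  max=-1
  while IFS= read -r tag; do
    case "$tag" in
      "$day")   n=0 ;;                  # bare date counts as build 0
      "$day".*) n=${tag#"$day".} ;;     # take the .N suffix
      *) continue ;;
    esac
    case "$n" in ''|*[!0-9]*) continue ;; esac   # skip non-numeric suffixes
    if [ "$n" -gt "$max" ]; then max="$n"; fi
  done
  if [ "$max" -lt 0 ]; then echo "$day"; else echo "$day.$((max + 1))"; fi
}
```

In a checkout this might be fed from the repo's own tags, for example `git tag --list | next_calver "$(date +%Y.%m.%d)"`.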

Stage 2 (Flux GitOps deployment) is owned by `devops`.

## Authentication

* **Framework:** Better-Auth.
* **OAuth Providers:** GroomBook (Authentik), Google and Apple.
* **OAuth Providers:** GroomBook (Authentik), Google, and Apple.
* **SSO:** Authentik OIDC at `https://auth.farh.net` (credentials in `authentik-credentials` secret).
* **Never build custom authentication.**

## Deployment — 2-stage Flux GitOps
## Application tools (canonical, not alternatives)

**Stage 1 — CI (runs in each application repo):**
- Triggered automatically on every merge to `main`
- Builds and tags the Docker image: CalVer (`YYYY.MM.DD[.N]`), `latest`, and `sha-<hash>`
- Pushes tagged images to `git.farh.net/groombook/<service>`
- Creates a CalVer git tag in the source repo
These are application-level dependency choices. Alternatives are policy violations:

**Stage 2 — GitOps (Flux, managed externally):**
- Flux watches `groombook/infra` as the **target** GitRepository — it is **not** a Flux bootstrap/cluster repo and must never be treated as one.
- Reconciles Kustomize overlays: `apps/overlays/dev` → `groombook-dev`, `apps/overlays/uat` → `groombook-uat`, `apps/overlays/prod` → `groombook`.
- Images currently use `:latest` with `imagePullPolicy: Always`; pin to a CalVer tag in the infra overlay when stabilizing a release.

**Policy — Flux Image Tag Automation is DENIED.** Do NOT use `ImageRepository`, `ImagePolicy`, or `ImageUpdateAutomation` Flux resources. Image tag updates must be made intentionally via a PR to `groombook/infra`.

**To deploy a change:**
1. Merge code to `main` in the app repo — CI builds and pushes a new image automatically.
2. Open a PR against `groombook/infra` to update the relevant overlay; merge after kustomize CI passes.
3. Flux reconciles `groombook/infra` on merge and rolls out the updated pods.

**To force a rollout without a manifest change:**
```bash
kubectl rollout restart deployment/<name> -n <namespace>
```

## Infrastructure as Code

Terraform (OpenTofu) is deployed via the **Flux OpenTofu Controller** in a GitOps fashion. Submit Terraform configurations via a PR to `groombook/infra` — the tofu controller reconciles them on merge.

**Never run `tofu` directly.** Never `kubectl apply` against production. Production changes go through Flux only. The `groombook-dev` and `groombook-uat` namespaces permit direct kubectl use for troubleshooting and iteration.

## Tools (canonical, not alternatives)

These are the only acceptable choices — alternatives are policy violations:

* **Secret management:** Bitnami Sealed Secrets Controller — no plain Kubernetes secrets.
* **Database:** CloudNativePG Operator (Postgres) — no SQLite, MariaDB, or MySQL.
* **Cache / pub-sub:** DragonflyDB Operator — no Redis.
* **Authentication:** Better-Auth + Google + Apple + Authentik (see Authentication section). Never build custom auth.
* **Database:** CloudNativePG-managed Postgres — no SQLite, MariaDB, or MySQL.
* **Cache / pub-sub:** DragonflyDB — no Redis.
* **Authentication:** Better-Auth + Google + Apple + Authentik (see Authentication above).
* **Dependency updates:** Mend Renovate. **Dependabot is not used and will not be used.** Do not configure it.
* **Container registry:** `git.farh.net/groombook/<service>` — no Docker Hub for first-party images.
* **Browser automation:** the `playwright` MCP server (`http://playwright:8931/mcp`). Target dev only — never test production.

## External communication

When communicating in any context visible outside the GroomBook agent team (external users, human reviewers, non-agent entities), include `cc @cpfarhood` for visibility — never as a reviewer.
For the container registry, CalVer versioning, and general PR/test discipline, see `coding-standards`. For the operator install side (CNPG, Dragonfly, Sealed Secrets), see `devops`.
