15 Commits

Author SHA1 Message Date
Chris Farhood 738f5d8f05 Merge pull request 'feat(safety): require read-before-write for adapterConfig.env updates' (#12) from fix/gro-2049-adapter-env-preservation into main
Reviewed-on: #12
2026-06-02 01:07:28 +00:00
Flea Flicker cffa73cd97 feat(safety): require read-before-write for adapterConfig.env updates
Sending a partial adapterConfig.env payload silently drops all keys not
included, which is what caused Shedward's env vars to be erased when UAT
passwords were added (GRO-2049). Adds an explicit non-negotiable rule with
the safe read-merge-write pattern to prevent recurrence.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-02 01:05:49 +00:00
Scrubs McBarkley cd5d9c5614 feat(safety): add board approval scope section
CTO content-verified + QA (Lint Roller) officially approved. Merged by CEO (Scrubs McBarkley) as non-self merge on markdown-only org repo. GRO-1838.
2026-05-28 18:45:26 +00:00
The Dogfather 6e0e29f374 feat(safety): add board approval scope section 2026-05-28 12:17:23 +00:00
The Dogfather f69319e270 Revert "feat(safety): add board approval scope section" (direct push - will redo via PR) 2026-05-28 12:14:29 +00:00
The Dogfather deb507714b feat(safety): add board approval scope section 2026-05-28 12:13:27 +00:00
Flea Flicker 40f8153c86 feat(safety): add board approval scope section
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 12:05:46 +00:00
Chris Farhood 4df9637518 Merge pull request 'Split devops and sdlc skills by scope; dedupe shared content' (#9) from claude/devops-sdlc-split into main
Reviewed-on: #9
2026-05-28 01:18:55 +00:00
Chris Farhood 11fd2a4c34 Collapse sdlc Phase 5 to a redirect into the devops pipeline
Phase 5 is an infra PR against groombook/infra, which means it is governed
by the devops pipeline. Spelling out a separate (QA-only) review flow here
both duplicates devops and contradicted its QA+CTO requirement. Replaced
the step list with a one-paragraph hand-off.

Resolves the policy ambiguity flagged in the PR description.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 21:16:10 -04:00
Chris Farhood d6b13fa58d Split devops and sdlc skills by scope; dedupe shared content
devops/SKILL.md is now the canonical home for infrastructure lifecycle
(groombook/infra, single-branch main, Flux + OpenTofu controller, cluster
topology). sdlc/SKILL.md is scoped to application code (3-branch dev/uat/main,
Phases 1-5, Stage 1 CI image build, app-tool policy). Each skill cross-refs
the other and defers to coding-standards/safety for cross-cutting rules
rather than restating them.

Fixes in devops/SKILL.md:
- Rewrote frontmatter description (was a copy of sdlc, referenced phases
  and dev/uat/prod that do not apply).
- Hoisted "applies to groombook/infra" to a top-level scope statement.
- Renumbered the pipeline (was 1,2,3,4,4,5,4,5,5) and fixed --base dev
  -> --base main in the tea example.
- Closed an unterminated bold marker.
- Removed Authentication framework, Stage 1 image build, and the
  "never tofu / never kubectl apply" lines (now cited from sdlc / safety).
- Trimmed the tools list to infra-only operators and controllers.

Trims in sdlc/SKILL.md:
- Removed Infrastructure topology, IaC, Stage 2 GitOps detail, the Flux
  Image Automation DENIED policy, the "never tofu / never kubectl apply"
  lines, and the External communication section (cited from devops /
  safety / coding-standards instead).
- Trimmed the tools list to application-level dependency choices.
- Added a pointer from Phase 5 into the devops pipeline.

cc @cpfarhood

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 15:31:54 -04:00
Chris Farhood c756b16c7c more updates 2026-05-26 23:27:03 -04:00
Chris Farhood 743e913126 Update skills/sdlc/SKILL.md 2026-05-27 03:05:30 +00:00
Chris Farhood a0ae03e959 Update skills/sdlc/SKILL.md 2026-05-27 03:04:25 +00:00
Chris Farhood 643d9e8956 Update skills/sdlc/SKILL.md 2026-05-27 03:04:09 +00:00
Chris Farhood 93927ea402 Merge pull request 'udpate sdlc' (#8) from sdlc-updates into main
Reviewed-on: #8
2026-05-27 03:03:37 +00:00
3 changed files with 148 additions and 66 deletions
+76
@@ -0,0 +1,76 @@
---
name: devops
description: >
Infrastructure lifecycle for GroomBook. Governs work on the
groombook/infra repo: single-branch main strategy, the infra PR review
pipeline, Flux GitOps reconciliation, OpenTofu controller workflow,
cluster topology, and the Flux image-automation policy. For application
code, see the sdlc skill.
---
# DevOps Practices
This skill governs work on **`groombook/infra`**. For application code lifecycle, see the `sdlc` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules (no direct `tofu`, no `kubectl apply` to production, SealedSecrets), see `safety`.
## Gitea authentication
Use the `GITEA_TOKEN` environment variable for all Gitea operations — it is already set in the agent environment. Use the **`tea`** CLI for all Gitea/Git operations (e.g., `tea issue list`, `tea pr create`). Gitea is the primary source of truth.
## Branch strategy
`groombook/infra` uses a single long-lived branch: **`main`**. Engineers target `main` directly via feature branches named `<agent-name>/<short-description>`.
## Pipeline
1. **Engineer** branches from `main`, writes code.
2. **Engineer** opens a PR against `main`.
3. **CI** fail → back to **Engineer**.
4. **CI** pass → **QA** performs code review.
5. **QA** rejected → back to **Engineer**.
6. **QA** approved → **CTO** performs code review.
7. **CTO** rejected → back to **Engineer**.
8. **CTO** approved → **Engineer** merges PR → **Flux** reconciles automatically.
```bash
tea pr create --base main --title "..." --body "... cc @cpfarhood"
```
Gitea branch protection requires CI checks to pass. See `coding-standards` for the no-self-merge contract and the `cc @cpfarhood` rule.
## Infrastructure topology
* **Production:** namespace `groombook`, FQDN `demo.groombook.dev`
* **UAT:** namespace `groombook-uat`, FQDN `uat.groombook.dev`
* **Dev:** namespace `groombook-dev`, FQDN `dev.groombook.dev`
* **Cluster:** Kubernetes — cluster-wide read; read/write on `groombook-dev` and `groombook-uat`; read-only on `groombook` (production).
* **Gateways:** `istio-external` (public) and `istio-internal` (internal) in `gateway-system`.
* **Container registry:** `git.farh.net/groombook/<service>` only.
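As a rough illustration of the access rule in the topology list, the sketch below encodes which namespaces allow direct writes. The function name `namespace_is_writable` is hypothetical and is not part of any GroomBook tooling. It only mirrors the stated policy (read/write on `groombook-dev` and `groombook-uat`, read-only everywhere else, including production).

```bash
# Hypothetical helper: mirrors the namespace access rule from the topology list.
# Returns 0 when direct write access is permitted, 1 otherwise.
namespace_is_writable() {
  case "$1" in
    groombook-dev|groombook-uat) return 0 ;;
    *) return 1 ;;  # everything else, including groombook (production), is read-only
  esac
}

# Print the access level for each GroomBook namespace.
for ns in groombook-dev groombook-uat groombook; do
  if namespace_is_writable "$ns"; then
    echo "$ns: read/write"
  else
    echo "$ns: read-only"
  fi
done
```

A guard like this could sit in front of any script that mutates cluster state, so that a production namespace is rejected before `kubectl` is ever invoked.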
## GitOps (Flux)
Flux watches `groombook/infra` as the **target** GitRepository — it is **not** a Flux bootstrap/cluster repo and must never be treated as one.
Reconciles Kustomize overlays:
- `apps/overlays/dev` → `groombook-dev`
- `apps/overlays/uat` → `groombook-uat`
- `apps/overlays/prod` → `groombook`
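The overlay list pairs each Kustomize overlay path with the namespace it reconciles into. A minimal lookup sketch follows, assuming short environment names (`dev`, `uat`, `prod`) as input. The function name `overlay_for_env` is hypothetical and used only for illustration.

```bash
# Hypothetical helper: prints "<overlay path> <namespace>" for an environment.
# The pairs are taken from the overlay list above; nothing else is assumed.
overlay_for_env() {
  case "$1" in
    dev)  printf 'apps/overlays/dev groombook-dev\n' ;;
    uat)  printf 'apps/overlays/uat groombook-uat\n' ;;
    prod) printf 'apps/overlays/prod groombook\n' ;;
    *)    printf 'unknown environment: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Split the result into its two fields.
read -r overlay namespace <<EOF
$(overlay_for_env uat)
EOF
echo "overlay=$overlay namespace=$namespace"
```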
Images currently use `:latest` with `imagePullPolicy: Always`; pin to a CalVer tag in the infra overlay when stabilizing a release.
**Policy — Flux Image Tag Automation is DENIED.** Do NOT use `ImageRepository`, `ImagePolicy`, or `ImageUpdateAutomation` Flux resources. Image tag updates must be made intentionally via a PR to `groombook/infra` — typically as the final step of the `sdlc` application pipeline (Phase 5).
## Infrastructure as Code
Terraform (OpenTofu) is deployed via the **Flux OpenTofu Controller** in a GitOps fashion. Submit Terraform configurations via a PR to `groombook/infra` — the tofu controller reconciles them on merge. See `safety` for the prohibition on running `tofu` directly and on `kubectl apply` against production.
## Infra-only tools
These are the operators and controllers the infra repo installs and manages. Alternatives are policy violations:
* **GitOps:** Flux CD (managed externally; reconciles `groombook/infra`).
* **IaC:** Flux OpenTofu Controller.
* **Secret management:** Bitnami Sealed Secrets Controller — encrypt with `kubeseal`, commit `SealedSecret` resources to `groombook/infra`. No plain Kubernetes secrets.
* **Database operator:** CloudNativePG (Postgres).
* **Cache / pub-sub operator:** DragonflyDB.
For application-level tool policy (Renovate, Playwright, registry, CalVer) see `coding-standards` and `sdlc`.
+37
@@ -26,6 +26,43 @@ The following rules apply to every GroomBook agent without exception.
* **Never run `tofu` directly.** Terraform / OpenTofu goes through the Flux OpenTofu Controller via a PR to `groombook/infra`.
* **Always read-before-write when updating `adapterConfig.env`.** The Paperclip `PATCH /api/agents/{agentId}` endpoint with an `adapterConfig.env` body **replaces the entire env object** — sending a partial payload silently drops every key you did not include. Before writing any env variable, read the current config first, merge your changes on top, and send the full merged object:
```bash
# 1. Read existing config
existing=$(curl -s "$PAPERCLIP_API_URL/api/agents/<agentId>" \
-H "Authorization: Bearer $PAPERCLIP_API_KEY")
# 2. Merge: spread existing env, then apply new keys on top
curl -s -X PATCH "$PAPERCLIP_API_URL/api/agents/<agentId>" \
-H "Authorization: Bearer $PAPERCLIP_API_KEY" \
-H "X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID" \
-H "Content-Type: application/json" \
-d "$(echo "$existing" | jq '.adapterConfig.env + {"NEW_KEY": {"type":"plain","value":"val"}} | {adapterConfig: {env: .}}')"
```
Skipping the read step is a destructive operation — it erases all existing env vars for that agent.
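To check the merge step without touching the API, the same `jq` expression can be run against a stand-in agent record. This is a sketch only: `merge_env` is a hypothetical helper name, the sample record is invented for the example, and the `{"type":"plain","value":...}` shape is copied from the snippet above rather than from any API reference.

```bash
# Hypothetical helper, not part of the Paperclip API.
# Usage: merge_env '<agent JSON>' KEY VALUE
# Prints the full PATCH body: every existing env key plus the new one.
merge_env() {
  local existing_json="$1" key="$2" value="$3"
  printf '%s' "$existing_json" | jq -c --arg k "$key" --arg v "$value" \
    '((.adapterConfig.env // {}) + {($k): {type: "plain", value: $v}})
     | {adapterConfig: {env: .}}'
}

# Stand-in agent record (invented for this example); no network call is made.
sample='{"adapterConfig":{"env":{"OLD_KEY":{"type":"plain","value":"keep-me"}}}}'
merge_env "$sample" NEW_KEY val
```

Inspecting the printed body before sending it is a cheap way to confirm that the existing keys survived the merge. The `// {}` fallback is a choice made for this sketch so that an agent with no env yet still yields a valid object.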
## If you are unsure
If you are unsure whether an action is safe, **stop**. Post a comment on the Paperclip issue explaining what you are about to do and why you are uncertain, set the issue to `blocked`, and escalate to your manager. Do not guess.
## Board approval scope
Board approval (`request_board_approval`) is reserved for one-way-door decisions:
* **Actions requiring a human operator** in a third-party portal (e.g. Gitea Owners team config, external vendor consoles).
* **Genuinely destructive, irreversible operations** beyond what the destructive-action rule above already covers.
* **Out-of-scope decisions** that exceed the agent's mandate.
* **New spend or resource authorizations.**
* **Issues with `originKind: "gitea"`** — per the `sdlc` skill, these require board approval before work begins.
Board approval is **never** used for routine SDLC pipeline steps:
* QA handoffs, UAT promotion, security review hand-off.
* Returning a failing PR to the engineer or CTO.
* Clearing task blockers, PR reviews, or merge decisions within the agent's SDLC role.
* Feature triage decisions (Accepted / Backlogged / Denied).
* Any standard dev → uat → prod progression.
When board approval IS required, use the Paperclip `request_board_approval` API (see the `paperclip` skill) and set the source issue to `blocked` until the approval resolves.
+35 -66
@@ -1,29 +1,30 @@
---
name: sdlc
description: >
Software development lifecycle for GroomBook. Covers Gitea authentication,
branch strategy across Dev/UAT/Prod, the SDLC pipeline phases,
PR review and merge policy, infrastructure layout, the Gitea-origin issue
board-approval gate, the cc-cpfarhood visibility rule,
and delegation model tier policy.
Software development lifecycle for GroomBook application repos. Covers
Gitea authentication, the 3-branch dev/uat/main strategy, the SDLC
pipeline phases 1-5, the Stage 1 CI image build, the authentication
framework, and application-tool policy. For infrastructure
(groombook/infra), see the devops skill.
---
# Software Development Lifecycle
This skill governs **application code repos**. For infrastructure (`groombook/infra`), see the `devops` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules, see `safety`.
## Gitea authentication
**Use the `GITEA_TOKEN`** environment variable for all Gitea operations. It is already set in the agent environment. Use the **`tea`** CLI for all Gitea/Git operations (e.g., `tea issue list`, `tea pr create`). The token expires when the environment variable is rotated — re-invoke any Gitea operation if you get a 401.
Gitea is the **primary source of truth**. Every Paperclip issue must have a corresponding Gitea issue (create one if missing). Both stay open until the work is completed, reviewed, approved, merged, and QA-verified.
## Branch strategy
Three long-lived branches map to the three deployment environments:
| Branch | Environment | Who merges | Prerequisites for merge |
|--------|-------------|-----------|-----------|
| `dev` | Dev | Engineer | None (self-merges after CI passes) |
| `dev` | Dev | Engineer | CI passes |
| `uat` | UAT | Engineer | QA code review approval |
| `main` | Production | Engineer | UAT validation & CTO code review |
@@ -32,14 +33,12 @@ Three long-lived branches map to the three deployment environments:
## Pull requests
All changes happen via pull request. Always include `cc @cpfarhood` at the bottom of the PR body for visibility — never as a reviewer.
All changes happen via pull request. Gitea branch protection requires CI checks to pass. See `coding-standards` for the no-self-merge contract and the `cc @cpfarhood` visibility rule.
```bash
tea pr create --base dev --title "..." --body "... cc @cpfarhood"
```
Gitea branch protection requires CI checks to pass.
## SDLC pipeline
### Phase 1 — Dev
@@ -65,76 +64,46 @@ Gitea branch protection requires CI checks to pass.
2. **UAT** fail → back to **Engineer** (return to Phase 1).
3. **UAT** pass → **Security Engineer** performs a security code review of the changes.
4. **Security** fail → back to **Engineer** (return to Phase 1).
5. **Security** pass → **Engineer** opens a PR from `uat` to `main`.
5. **Security** pass → Begin Phase 4.
### Phase 4 — Production Promotion
1. **Engineer** opens a PR from `uat` to `main`.
2. **CI** fail → back to **Engineer** (return to Phase 1).
3. **CI** pass → **CTO** performs code review.
4. **CTO** rejected → back to **Engineer** (return to Phase 1).
5. **CTO** approved → **Engineer** merges PR.
6. **CI** fail → back to **Engineer** (return to Phase 1).
7. **CI** pass → Begin Phase 4.
7. **CI** pass → Begin Phase 5.
### Phase 4 — Production
### Phase 5 — Production Deployment
1. **CTO** performs code review.
2. **CTO** approved → **Engineer** merges PR.
3. **CTO** rejected → back to **Engineer** (return to Phase 1).
4. **CI** deploys automatically to Production (`https://demo.groombook.dev`).
The **Engineer** opens a PR against `groombook/infra` to update the relevant Kustomize overlay with the new image tag. From this point the work follows the **`devops` skill pipeline** end-to-end — review, merge, and Flux reconciliation are all owned there. On merge, Flux rolls out the updated pods to production (`https://demo.groombook.dev`).
## Infrastructure
## Stage 1 CI — Image build
* **Production:** namespace `groombook`, FQDN `demo.groombook.dev`
* **UAT:** namespace `groombook-uat`, FQDN `uat.groombook.dev`
* **Dev:** namespace `groombook-dev`, FQDN `dev.groombook.dev`
* **Cluster:** Kubernetes — cluster-wide read; read/write on `groombook-dev` and `groombook-uat`; read-only on `groombook` (production).
* **Gateways:** `istio-external` (public) and `istio-internal` (internal) in `gateway-system`.
* **Container registry:** `git.farh.net/groombook/<service>` only.
Triggered automatically on every merge to `main` in an application repo:
- Builds and tags the Docker image: CalVer (`YYYY.MM.DD[.N]`), `latest`, and `sha-<hash>`
- Pushes tagged images to `git.farh.net/groombook/<service>` (see `coding-standards` for the registry and CalVer policy)
- Creates a CalVer git tag in the source repo
Stage 2 (Flux GitOps deployment) is owned by `devops`.
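The `YYYY.MM.DD[.N]` scheme can be sketched as a small tag calculator. The text does not define how `N` is assigned, so this example assumes the first build of a day gets the bare date and later builds get `.1`, `.2`, and so on. The function name `next_calver_tag` is hypothetical, and the real CI job may compute tags differently.

```bash
# Hypothetical helper: next_calver_tag DATE [EXISTING_TAG...]
# DATE is YYYY.MM.DD. Prints DATE if no existing tag uses it; otherwise prints
# DATE.N, where N is one more than the highest suffix already in use.
# Assumption: the bare date counts as suffix 0.
next_calver_tag() {
  local base="$1"; shift
  local tag suffix n max=-1
  for tag in "$@"; do
    case "$tag" in
      "$base")
        if [ "$max" -lt 0 ]; then max=0; fi ;;
      "$base".*)
        suffix="${tag#"$base".}"
        case "$suffix" in
          ''|*[!0-9]*) ;;  # ignore non-numeric suffixes
          *)
            n=$((10#$suffix))  # force base 10 so "08" is not read as octal
            if [ "$n" -gt "$max" ]; then max=$n; fi ;;
        esac ;;
    esac
  done
  if [ "$max" -lt 0 ]; then
    printf '%s\n' "$base"
  else
    printf '%s.%s\n' "$base" "$((max + 1))"
  fi
}

# Two builds already exist for the day, so the next tag gets suffix .2.
next_calver_tag 2026.05.27 2026.05.27 2026.05.27.1
```

In a repository, the date argument could come from `date -u +%Y.%m.%d` and the existing tags from `git tag --list`.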
## Authentication
* **Framework:** Better-Auth.
* **OAuth Providers:** GroomBook (Authentik), Google and Apple.
* **OAuth Providers:** GroomBook (Authentik), Google, and Apple.
* **SSO:** Authentik OIDC at `https://auth.farh.net` (credentials in `authentik-credentials` secret).
* **Never build custom authentication.**
## Deployment — 2-stage Flux GitOps
## Application tools (canonical, not alternatives)
**Stage 1 — CI (runs in each application repo):**
- Triggered automatically on every merge to `main`
- Builds and tags the Docker image: CalVer (`YYYY.MM.DD[.N]`), `latest`, and `sha-<hash>`
- Pushes tagged images to `git.farh.net/groombook/<service>`
- Creates a CalVer git tag in the source repo
These are application-level dependency choices. Alternatives are policy violations:
**Stage 2 — GitOps (Flux, managed externally):**
- Flux watches `groombook/infra` as the **target** GitRepository — it is **not** a Flux bootstrap/cluster repo and must never be treated as one.
- Reconciles Kustomize overlays: `apps/overlays/dev` → `groombook-dev`, `apps/overlays/uat` → `groombook-uat`, `apps/overlays/prod` → `groombook`.
- Images currently use `:latest` with `imagePullPolicy: Always`; pin to a CalVer tag in the infra overlay when stabilizing a release.
**Policy — Flux Image Tag Automation is DENIED.** Do NOT use `ImageRepository`, `ImagePolicy`, or `ImageUpdateAutomation` Flux resources. Image tag updates must be made intentionally via a PR to `groombook/infra`.
**To deploy a change:**
1. Merge code to `main` in the app repo — CI builds and pushes a new image automatically.
2. Open a PR against `groombook/infra` to update the relevant overlay; merge after kustomize CI passes.
3. Flux reconciles `groombook/infra` on merge and rolls out the updated pods.
**To force a rollout without a manifest change:**
```bash
kubectl rollout restart deployment/<name> -n <namespace>
```
## Infrastructure as Code
Terraform (OpenTofu) is deployed via the **Flux OpenTofu Controller** in a GitOps fashion. Submit Terraform configurations via a PR to `groombook/infra` — the tofu controller reconciles them on merge.
**Never run `tofu` directly.** Never `kubectl apply` against production. Production changes go through Flux only. The `groombook-dev` and `groombook-uat` namespaces permit direct kubectl use for troubleshooting and iteration.
## Tools (canonical, not alternatives)
These are the only acceptable choices — alternatives are policy violations:
* **Secret management:** Bitnami Sealed Secrets Controller — no plain Kubernetes secrets.
* **Database:** CloudNativePG Operator (Postgres) — no SQLite, MariaDB, or MySQL.
* **Cache / pub-sub:** DragonflyDB Operator — no Redis.
* **Authentication:** Better-Auth + Google + Apple + Authentik (see Authentication section). Never build custom auth.
* **Database:** CloudNativePG-managed Postgres — no SQLite, MariaDB, or MySQL.
* **Cache / pub-sub:** DragonflyDB — no Redis.
* **Authentication:** Better-Auth + Google + Apple + Authentik (see Authentication above).
* **Dependency updates:** Mend Renovate. **Dependabot is not used and will not be used.** Do not configure it.
* **Container registry:** `git.farh.net/groombook/<service>` — no Docker Hub for first-party images.
* **Browser automation:** the `playwright` MCP server (`http://playwright:8931/mcp`). Target dev only — never test production.
## External communication
When communicating in any context visible outside the GroomBook agent team (external users, human reviewers, non-agent entities), include `cc @cpfarhood` for visibility — never as a reviewer.
For the container registry, CalVer versioning, and general PR/test discipline, see `coding-standards`. For the operator install side (CNPG, Dragonfly, Sealed Secrets), see `devops`.