7 Commits

Author SHA1 Message Date
Scrubs McBarkley 07174fe233 docs(devops): fix-forward-in-git; ban escalating reconcilable changes as manual/board actions (GRO-2536)
The board flagged that agents repeatedly requested board approval and hand-run
kubectl on a Flux-managed cluster — unfillable, wrong, and the root cause of a
multi-day stall. The devops skill prohibited `kubectl apply` to prod but never
stated the corollary: the resolution of any reconcilable breakage is a PR to
groombook/infra, never a human-run command. This adds that contract explicitly.

cc @cpfarhood
2026-06-25 11:50:00 +00:00
Scrubs McBarkley 33eba83d58 Revert "docs(devops): require fix-forward-in-git…" — landed on main directly by mistake
This change was committed straight to main, bypassing the required PR review.
Reverting main to baseline; the change is preserved on branch
scrubs/gro-2536-gitops-fix-forward-rule and will land via PR review (GRO-2536).

cc @cpfarhood
2026-06-25 11:47:57 +00:00
Scrubs McBarkley 93f41417a2 docs(devops): require fix-forward-in-git; ban escalating reconcilable changes as manual/board actions (GRO-2536)
The board flagged that agents repeatedly requested board approval and hand-run
kubectl on a Flux-managed cluster — unfillable, wrong, and the root cause of a
multi-day stall. The devops skill prohibited `kubectl apply` to prod but never
stated the corollary: the resolution of any reconcilable breakage is a PR to
groombook/infra, never a human-run command. This adds that contract explicitly.

cc @cpfarhood
2026-06-25 11:46:01 +00:00
Flea Flicker 96ca9d993d Merge pull request 'feat(GRO-2516): create .gitignore with agent-runtime credential stanza' (#14) from feature/gro-2516-harden-gitignore into main
feat(GRO-2516): add agent-runtime credential stanza to .gitignore
2026-06-25 02:24:20 +00:00
Stockboy Steve 2da85037a6 feat(GRO-2516): create .gitignore with agent-runtime credential stanza
Adds root-level .gitignore to prevent accidental commit of agent
credential artifacts (.gh-token, .config/gh/, .claude/, .codex/,
AGENT_HOME) per the GRO-2516 guardrail.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-06-25 02:18:29 +00:00
Flea Flicker 5d39685451 docs(sdlc): move uat→main merge-gate policy here; CTO Approve only for novel auth, infra/prod, and risk-flagged (GRO-2377) (#13)
Co-authored-by: Flea Flicker <flea@groombook.dev>
Co-committed-by: Flea Flicker <flea@groombook.dev>
2026-06-12 16:27:24 +00:00
Chris Farhood 36310c48db refactor(skills): resolve self-merge contradiction with sdlc
- coding-standards: replace "no agent merges their own PR" with the
  reviews-required-then-engineer-may-merge rule consistent with sdlc
- safety: drop stale "No self-merging PRs" line from the merge-gate
  rule for the same reason

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-09 09:26:05 -04:00
5 changed files with 82 additions and 14 deletions
+10
@@ -0,0 +1,10 @@
# Agent runtime artifacts — never commit
.gh-token
*.gh-token
**/.gh-token
.config/gh/
**/.config/gh/
**/AGENT_HOME/**
$AGENT_HOME/**
.claude/
.codex/
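As a rough illustration of what the stanza above is meant to catch, the following Python sketch flags paths that look like agent credential artifacts. The function name `is_credential_artifact` is hypothetical, and the matching is a simplification of gitignore semantics rather than a reimplementation of them. It deliberately skips the `$AGENT_HOME/**` line: Git does not expand environment variables in `.gitignore`, so that pattern only matches a directory literally named `$AGENT_HOME`.

```python
from pathlib import PurePosixPath

# Directory names taken from the stanza; anything beneath them is treated as sensitive.
SENSITIVE_DIRS = {".claude", ".codex", "AGENT_HOME"}


def is_credential_artifact(path: str) -> bool:
    """Return True if a repo-relative path looks like an agent credential artifact.

    Hypothetical helper: approximates the .gitignore stanza, not Git's matcher.
    """
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    # Covers .gh-token, *.gh-token and **/.gh-token.
    if parts[-1].endswith(".gh-token"):
        return True
    dirs = parts[:-1]
    # Covers .claude/, .codex/ and **/AGENT_HOME/** at any depth.
    if any(d in SENSITIVE_DIRS for d in dirs):
        return True
    # Covers .config/gh/ and **/.config/gh/ (two consecutive directory components).
    return any(a == ".config" and b == "gh" for a, b in zip(dirs, dirs[1:]))
```

A check like this could run over the staged file list before a commit; how it would be wired into a hook is left open here.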
+8 -3
@@ -3,8 +3,9 @@ name: coding-standards
description: >
Engineering quality bar for GroomBook code: priority ordering of correctness
vs. clarity vs. maintainability vs. performance vs. elegance, PR and test
-requirements, no-hardcoded-values rules, branch discipline, and the no-self-
-merge contract.
+requirements, no-hardcoded-values rules, branch discipline, and the
+no-self-merge contract. The uat→main merge-gate policy lives in the `sdlc`
+skill, not here.
---
# Coding Standards
@@ -24,7 +25,7 @@ When making technical decisions, prioritize in this order:
## Pull request discipline
* All changes go through a PR. **Never push directly to `dev`, `uat`, or `main`.**
-* No agent merges their own PR.
+* Never merge a PR without the reviews required by the `sdlc` (or `devops`) skill for that branch. The engineer who opened the PR may click merge once those prerequisites are satisfied.
* Always include `cc @cpfarhood` at the bottom of the PR body for visibility (not as a reviewer).
## Test requirements
@@ -57,6 +58,10 @@ All releases use CalVer (`YYYY.MMDD.PATCH`, e.g. `2026.0504.0`). No SemVer, no c
Push to `git.farh.net` only. Never Docker Hub for first-party images.
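The CalVer scheme named in the hunk context above (`YYYY.MMDD.PATCH`, e.g. `2026.0504.0`) can be sketched in Python as follows. The helper names `calver_tag` and `is_calver` are hypothetical, and the sketch assumes `PATCH` is an unpadded non-negative integer, which the source only shows by example.

```python
import re
from datetime import date

# YYYY.MMDD.PATCH, e.g. 2026.0504.0 (PATCH assumed to be an unpadded integer).
_CALVER_RE = re.compile(r"(\d{4})\.(\d{2})(\d{2})\.(\d+)")


def calver_tag(day: date, patch: int = 0) -> str:
    """Build a CalVer tag for the given release date and patch number."""
    return f"{day.year:04d}.{day.month:02d}{day.day:02d}.{patch}"


def is_calver(tag: str) -> bool:
    """Return True if the tag has the CalVer shape and encodes a real calendar date."""
    match = _CALVER_RE.fullmatch(tag)
    if match is None:
        return False
    year, month, day_of_month = (int(g) for g in match.groups()[:3])
    try:
        date(year, month, day_of_month)
    except ValueError:
        return False
    return True
```

A check of this kind would reject SemVer-style tags such as `1.2.3` as well as impossible dates.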
## uat→main merge-gate policy
The uat→main merge-gate policy lives in the `sdlc` skill, not here. The one-line summary: the engineer self-merges a uat→main PR once the four pre-gates (QA, UAT deploy, UAT regression, security) are green and the CEO code review is APPROVED on the Paperclip issue. A CTO Gitea Approve click is reserved for three categories: novel auth / session paths, infra / prod-affecting merges, and risk-flagged merges. See the `sdlc` skill — "uat→main merge-gate policy" — for the full rule, the category list, and the "when uncertain" escalation path.
## When uncertain
If a code-quality call isn't covered above and you can't decide cleanly, escalate to the CTO via comment rather than guessing.
+18
@@ -59,6 +59,24 @@ Images currently use `:latest` with `imagePullPolicy: Always`; pin to a CalVer t
**Policy — Flux Image Tag Automation is DENIED.** Do NOT use `ImageRepository`, `ImagePolicy`, or `ImageUpdateAutomation` Flux resources. Image tag updates must be made intentionally via a PR to `groombook/infra` — typically as the final step of the `sdlc` application pipeline (Phase 5).
## When a cluster is broken: fix forward in git — never escalate a manual action
The cluster is reconciled by controllers (Flux, the OpenTofu Controller, the Sealed Secrets controller). **Any change one of these controllers can reconcile MUST be delivered as a PR to `groombook/infra`** — it is never a board approval and never a hand-run `kubectl` / `kubeseal` / `tofu` command.
This is the corollary of the read-only-prod and "no `kubectl apply` to production" rules in `safety`: agents are read-only on `groombook` **by design**, precisely because the write path is git. "I lack cluster-admin" therefore resolves to **"open a PR,"** not **"ask a human to run the command."**
Contract:
- **Do NOT** file an issue, board approval, or escalation that asks a human to run an imperative cluster command (`kubectl delete/apply/patch`, `kubeseal`, `flux reconcile`, `tofu apply`) that a controller would otherwise reconcile from git. That request is unfillable and wrong on a GitOps cluster — fix the desired state in the repo and let the controller converge.
- SealedSecret won't unseal / wrong scope → re-seal the `SealedSecret` and commit it.
- Missing or not-ready Flux `Receiver`, `Kustomization`, `Terraform`, RBAC, etc. → commit/correct the manifest in the overlay.
- Stale or wrong `sourceRef`, annotations, ownership → fix them declaratively in the overlay.
- **A reconcile blocked on a pre-existing in-cluster object** (e.g. a `SealedSecret` the controller won't adopt because an unmanaged or Reflector-mirrored `Secret` already exists) is still solved declaratively: correct ownership/annotations in git so the controller adopts it. Only if **no controller can adopt the object** is a one-time imperative step justified — and then it is a single, specifically-scoped, reviewed exception stating the exact reason, **not** a multi-day approval queue standing in for missing engineering.
- **Board approval is reserved** for genuinely irreversible or out-of-band actions no controller reconciles — destroying stateful data, rotating the cluster bootstrap, bootstrapping a brand-new cluster. Routine reconcilable breakage never qualifies. (See `safety` for destructive-action approval.)
- The Flux bootstrap/cluster repo is **not** `groombook/infra` (see GitOps above). A genuinely missing `GitRepository` or other bootstrap object is a PR to that externally-managed cluster-config repo — still a PR, still not a hand-run apply.
If you are about to write "escalated to board — a human must run …" for a reconcilable change, stop: that is the failure mode, not the fix. Open the PR.
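The routing rule in the contract above can be summarized as a small decision function. This is only a sketch: the `Breakage` fields and the `Route` names are hypothetical labels for the cases the contract describes, not identifiers from any GroomBook tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    INFRA_PR = "PR to groombook/infra"
    CLUSTER_CONFIG_PR = "PR to the externally managed cluster-config repo"
    SCOPED_EXCEPTION = "single reviewed, specifically scoped imperative step"
    BOARD_APPROVAL = "board approval"


@dataclass(frozen=True)
class Breakage:
    # Destroying stateful data, rotating the cluster bootstrap, bootstrapping a new cluster.
    irreversible_or_out_of_band: bool = False
    # A missing GitRepository or other Flux bootstrap object.
    bootstrap_object: bool = False
    # False only when no controller can adopt the blocking in-cluster object.
    controller_can_adopt: bool = True


def route(breakage: Breakage) -> Route:
    """Pick the delivery path for a fix, following the contract's order of precedence."""
    if breakage.irreversible_or_out_of_band:
        return Route.BOARD_APPROVAL
    if breakage.bootstrap_object:
        return Route.CLUSTER_CONFIG_PR
    if not breakage.controller_can_adopt:
        return Route.SCOPED_EXCEPTION
    # Default: anything a controller can reconcile is a PR, never a hand-run command.
    return Route.INFRA_PR
```

The default branch is the point of the rule: absent a specific reason, the answer is a PR to `groombook/infra`.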
## Infrastructure as Code
Terraform (OpenTofu) is deployed via the **Flux OpenTofu Controller** in a GitOps fashion. Submit Terraform configurations via a PR to `groombook/infra` — the tofu controller reconciles them on merge. See `safety` for the prohibition on running `tofu` directly and on `kubectl apply` against production.
+1 -1
@@ -22,7 +22,7 @@ The following rules apply to every GroomBook agent without exception.
* **Never `kubectl create secret` in production.** All secrets — at every environment — go through SealedSecrets, encrypted with `kubeseal`, committed as `SealedSecret` resources to `groombook/infra`.
-* **Never bypass the merge gate.** No self-merging PRs. No pushing directly to `dev`, `uat`, or `main`. Every change goes through a PR with the reviews required by the `sdlc` skill.
+* **Never bypass the merge gate.** No pushing directly to `dev`, `uat`, or `main`. Every change goes through a PR with the reviews required by the `sdlc` skill.
* **Never run `tofu` directly.** Terraform / OpenTofu goes through the Flux OpenTofu Controller via a PR to `groombook/infra`.
+45 -10
@@ -3,14 +3,17 @@ name: sdlc
description: >
Software development lifecycle for GroomBook application repos. Covers
Gitea authentication, the 3-branch dev/uat/main strategy, the SDLC
-pipeline phases 1-5, the Stage 1 CI image build, the authentication
-framework, and application-tool policy. For infrastructure
-(groombook/infra), see the devops skill.
+pipeline phases 1-5, the uat→main merge-gate policy (engineer
+self-merge when pre-gates are green; CTO Gitea Approve only for
+novel auth, infra/prod-affecting, and risk-flagged merges),
+the Stage 1 CI image build, the authentication framework, and
+application-tool policy. For infrastructure (groombook/infra), see
+the devops skill.
---
# Software Development Lifecycle
-This skill governs **application code repos**. For infrastructure (`groombook/infra`), see the `devops` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules, see `safety`.
+This skill governs **application code repos**. For infrastructure (`groombook/infra`), see the `devops` skill. For PR/test discipline and the `cc @cpfarhood` visibility rule, see `coding-standards`. For non-negotiable safety rules, see `safety`. The **uat→main merge-gate policy** (which uat→main PRs need a CTO Gitea Approve click vs. engineer self-merge) is defined in this skill — see "uat→main merge-gate policy" below.
## Gitea authentication
@@ -26,7 +29,7 @@ Three long-lived branches map to the three deployment environments:
|--------|-------------|-----------|-----------|
| `dev` | Dev | Engineer | CI passes |
| `uat` | UAT | Engineer | QA code review approval |
-| `main` | Production | Engineer | UAT validation & CTO code review |
+| `main` | Production | Engineer | UAT validation, security review, and the uat→main merge-gate policy (below) |
**Engineers always target `dev` first** — never `uat` or `main` directly.
- Feature branches: `<agent-name>/<short-description>`.
@@ -70,16 +73,48 @@ tea pr create --base dev --title "..." --body "... cc @cpfarhood"
1. **Engineer** opens a PR from `uat` to `main`.
2. **CI** fail → back to **Engineer** (return to Phase 1).
-3. **CI** pass → **CTO** performs code review.
-4. **CTO** rejected → back to **Engineer** (return to Phase 1).
-5. **CTO** approved → **Engineer** merges PR.
-6. **CI** fail → back to **Engineer** (return to Phase 1).
-7. **CI** pass → Begin Phase 5.
+3. **CI** pass → **Engineer** classifies the PR against the **uat→main merge-gate policy** (below):
+* **In a category that requires CTO Gitea Approve** (novel auth / session paths, infra / prod-affecting merges, `risk:cto-approve` label or explicit CTO/CEO sign-off request) → Engineer pings the CTO for a Gitea Approve click.
+* **CTO** rejected → back to **Engineer** (return to Phase 1).
+* **CTO** approved → continue to step 4.
+* **Outside all three categories** → no CTO click needed; jump to step 4 once the four pre-gates (QA, UAT deploy, UAT regression, security) are green.
+4. **Engineer** merges the PR.
+5. **CI** fail → back to **Engineer** (return to Phase 1).
+6. **CI** pass → Begin Phase 5.
### Phase 5 — Production Deployment
The **Engineer** opens a PR against `groombook/infra` to update the relevant Kustomize overlay with the new image tag. From this point the work follows the **`devops` skill pipeline** end-to-end — review, merge, and Flux reconciliation are all owned there. On merge, Flux rolls out the updated pods to production (`https://demo.groombook.dev`).
## uat→main merge-gate policy
This is the process policy that governs which `uat → main` PRs need a CTO Gitea Approve click vs. engineer self-merge. It exists because the Gitea `required_approvals` branch-protection gate is satisfied only by a Gitea Approve click — the Paperclip issue-thread QA/UAT-deploy/UAT-regression/security approvals do **not** clear it. The engineer **MUST** classify every `uat → main` PR against this policy before merging.
### The four pre-gates are unchanged
A `uat → main` PR is mergeable only when all of these are green on the linked Paperclip issue: QA code review, UAT deploy, UAT regression, and security review. The CTO Gitea Approve click is **in addition to** those four when the PR falls in one of the categories below; it does not replace any of them.
### Categories that require CTO Gitea Approve
A CTO Gitea Approve click is required only for PRs in one of the following three categories:
1. **Novel auth / session paths.** Login, OIDC, OOBE, session middleware, token issuance, password reset, MFA, or any new auth provider integration. Routine changes to auth-gated UI (button styling, error messages, form layout, copy edits) are **not** in this category.
2. **Infra / prod-affecting merges.** Deploys, infra manifests, secrets, GitOps overlays, CI/CD, `main` branch protection, prod-affecting routing/ingress, or any change that mutates prod state. **All Phase 5 infra overlay PRs in `groombook/infra` require CTO Gitea Approve without exception.**
3. **Risk-flagged merges.** The PR carries the `risk:cto-approve` label, **or** the PR or its linked Paperclip issue thread contains an explicit CTO/CEO sign-off request.
### Engineer workflow
The engineer who opened the PR classifies it against the three categories above (escalating to the CTO via comment if the call is unclear), then:
* **In a category** → request a CTO Gitea Approve click. Engineer merges once the CTO has approved.
* **Outside all three categories** → no CTO click needed. Engineer merges once the four pre-gates are green.
**Board/CEO approval is via SDLC code review (judgment) on the Paperclip issue, not via a Gitea Approve click.** The CEO's role at Phase 4 is to perform the code review on the Paperclip issue thread; that approval satisfies the "CEO approved" pre-condition and is recorded in Paperclip, not in Gitea.
### When uncertain
If a `uat → main` PR does not obviously belong to a category, comment on the PR with `@<the-dogfather>` (or escalate to the CEO if the CTO is unavailable) and pause the merge. Do not guess the category — a routine PR is a routine PR, but a borderline PR is cheaper to escalate than to misclassify.
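The merge-gate policy above reduces to two checks: whether a PR falls in one of the three categories, and whether its prerequisites are met. The sketch below models that in Python. The `UatToMainPR` type and its field names are hypothetical, and treating the CEO code review as a hard precondition is an assumption drawn from the "CEO approved" pre-condition mentioned in the policy text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UatToMainPR:
    # The four pre-gates, tracked on the linked Paperclip issue.
    qa_review_green: bool
    uat_deploy_green: bool
    uat_regression_green: bool
    security_review_green: bool
    # CEO code review recorded in Paperclip, not in Gitea (assumed mandatory here).
    ceo_review_approved: bool
    # The three categories that require a CTO Gitea Approve click.
    touches_novel_auth: bool = False
    infra_or_prod_affecting: bool = False
    risk_flagged: bool = False
    # Whether the CTO has clicked Approve in Gitea.
    cto_gitea_approved: bool = False


def needs_cto_approve(pr: UatToMainPR) -> bool:
    """True if the PR falls in any of the three CTO-approve categories."""
    return pr.touches_novel_auth or pr.infra_or_prod_affecting or pr.risk_flagged


def may_merge(pr: UatToMainPR) -> bool:
    """True if the engineer may merge: gates green, plus CTO approval when required."""
    pre_gates_green = all(
        (
            pr.qa_review_green,
            pr.uat_deploy_green,
            pr.uat_regression_green,
            pr.security_review_green,
        )
    )
    if not (pre_gates_green and pr.ceo_review_approved):
        return False
    # The CTO click is in addition to the pre-gates, never a replacement for them.
    return pr.cto_gitea_approved or not needs_cto_approve(pr)
```

The sketch does not cover the borderline case: when the category is unclear, the policy says to pause and escalate rather than let a function decide.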
## Stage 1 CI — Image build
Triggered automatically on every merge to `main` in an application repo: