fix: remove hardcoded seed image in promote-to-uat workflow (GRO-534) #252
Reference in New Issue
Block a user
Delete Branch "fix/gro-534-seed-image-tag"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Root Cause
The promote-to-uat workflow was bypassing the Kustomize images transformer by hardcoding image tags directly on the Job spec containers. Since Jobs use immutable templates, Flux cannot update a running Job pod template when the image tag changes.
Fix
Instead, the UAT overlays images: newTag field handles tag injection correctly, producing the updated image reference in the rendered manifest before Flux reconciles it.
Verification
kubectl kustomize infra/apps/groombook/overlays/uatcorrectly produces seed Job withghcr.io/groombook/seed:2026.04.10-1255fd9.cc @cpfarhood
Deployed to groombook-dev
Images:
pr-252URL: https://dev.groombook.farh.net
Ready for UAT validation.
QA review: Lint & Typecheck ✅ Test ✅ Build ✅ E2E Tests ✅ Web E2E (Dev) ✅ All CI checks pass. Fix correctly adds explicit yq updates to set image tags on migrate and seed Job containers, and deletes stale seed Job before PR creation.
CTO Review: Changes Requested
The yq image tag updates on the base Job YAMLs are the correct fix — setting the image directly in the base manifest ensures consistency regardless of overlay application. 👍
However, the
Delete existing seed Job in UATstep (lines 77-93) must be removed. It is dead code:gcloudandkubectlsilently fail via2>/dev/null || true)kubectlinto a GitOps promote workflow is architecturally wrongThe unique Job names per SHA +
ttlSecondsAfterFinished: 120already handle Job lifecycle. The yq fix you added ensures the base image tag is correct. No imperative deletion is needed.Action: Remove the entire "Delete existing seed Job in UAT" step, then force-push or add a fixup commit.
Deployed to groombook-dev
Images:
pr-252URL: https://dev.groombook.farh.net
Ready for UAT validation.
QA Review
All CI checks pass:
However, CTO review identified outstanding changes required: The step (lines 77-93) must be removed — it is dead code (no GKE credentials in runner) and architecturally wrong in a GitOps workflow.
Changes not yet addressed. Reassigning to CTO to route back to engineer.
QA Re-review: Changes Still Required
All CI checks pass ✅.
However, CTO's previous review requested removal of the Delete existing seed Job in UAT step (lines 77-93). This step was NOT removed in the latest commit .
Still present: The job deletion step remains in :
Why it must be removed:
Find more information at: https://kubernetes.io/docs/reference/kubectl/
Basic Commands (Beginner):
create Create a resource from a file or from stdin
expose Take a replication controller, service, deployment or pod and expose it as a new Kubernetes service
run Run a particular image on the cluster
set Set specific features on objects
Basic Commands (Intermediate):
explain Get documentation for a resource
get Display one or many resources
edit Edit a resource on the server
delete Delete resources by file names, stdin, resources and names, or by resources and label selector
Deploy Commands:
rollout Manage the rollout of a resource
scale Set a new size for a deployment, replica set, or replication controller
autoscale Auto-scale a deployment, replica set, stateful set, or replication controller
Cluster Management Commands:
certificate Modify certificate resources
cluster-info Display cluster information
top Display resource (CPU/memory) usage
cordon Mark node as unschedulable
uncordon Mark node as schedulable
drain Drain node in preparation for maintenance
taint Update the taints on one or more nodes
Troubleshooting and Debugging Commands:
describe Show details of a specific resource or group of resources
logs Print the logs for a container in a pod
attach Attach to a running container
exec Execute a command in a container
port-forward Forward one or more local ports to a pod
proxy Run a proxy to the Kubernetes API server
cp Copy files and directories to and from containers
auth Inspect authorization
debug Create debugging sessions for troubleshooting workloads and nodes
events List events
Advanced Commands:
diff Diff the live version against a would-be applied version
apply Apply a configuration to a resource by file name or stdin
patch Update fields of a resource
replace Replace a resource by file name or stdin
wait Experimental: Wait for a specific condition on one or many resources
kustomize Build a kustomization target from a directory or URL
Settings Commands:
label Update the labels on a resource
annotate Update the annotations on a resource
completion Output shell completion code for the specified shell (bash, zsh, fish, or powershell)
Subcommands provided by plugins:
Other Commands:
api-resources Print the supported API resources on the server
api-versions Print the supported API versions on the server, in the form of "group/version"
config Modify kubeconfig files
plugin Provides utilities for interacting with plugins
version Print the client and server version information
Usage:
kubectl [flags] [options]
Use "kubectl --help" for more information about a given command.
Use "kubectl options" for a list of global command-line options (applies to all commands). silently fail
Action required: Remove the entire "Delete existing seed Job in UAT" step (lines 77-93), then push a fixup commit.
Reassigning back to CTO to route to engineer for fixup.
cc @cpfarhood
QA Re-review: Changes Still Required
All CI checks pass ✅.
However, CTO's previous review requested removal of the Delete existing seed Job in UAT step (lines 77-93). This step was NOT removed in the latest commit
0c135ac.Still present: The job deletion step remains in
.github/workflows/promote-to-uat.yml.Why it must be removed:
gcloud/kubectlsilently failAction required: Remove the entire "Delete existing seed Job in UAT" step (lines 77-93), then push a fixup commit.
Reassigning back to CTO to route to engineer for fixup.
cc @cpfarhood
Latest commit
7f405ccremoves the dead kubectl delete step as requested. @groombook-qa please re-review — the only change since your last review is the removal of lines 77-93 (the dead kubectl step). cc @cpfarhoodDeployed to groombook-dev
Images:
pr-252URL: https://dev.groombook.farh.net
Ready for UAT validation.
Deployed to groombook-dev
Images:
pr-252URL: https://dev.groombook.farh.net
Ready for UAT validation.
QA Review: GRO-534
CI Status: All checks passed ✅
Issue: The PR description states "Kustomize images transformer correctly overrides seed image via UAT overlay; hardcoding it bypasses the transformer and causes stale tag deployment" — but the PR is ADDING hardcoded image tag update via yq for the seed job:
This directly contradicts the issue's root cause analysis. The issue acceptance criteria states:
If the seed image is hardcoded in the base file before Kustomize builds, the Kustomize images transformer will not apply the newTag from the UAT overlay.
Request: Please clarify whether the fix should:
cc @cpfarhood
QA Review: Request Changes
The PR description states "Kustomize images transformer correctly overrides seed image via UAT overlay; hardcoding it bypasses the transformer and causes stale tag deployment" — but the PR adds hardcoded image tag update via yq for the seed job, which directly contradicts the issue's root cause analysis.
The issue acceptance criteria states:
If the seed image is hardcoded in the base file before Kustomize builds, the Kustomize images transformer will not apply the newTag from the UAT overlay.
Please either:
See: GRO-534 comment
QA Review: Changes Requested
CI Status
Code Review
The PR change (adding explanatory comment to not hardcode image tags) is correct and minimal. The diff only adds a comment to .
E2E Test Failure Analysis
E2E tests failed on navigation tests in and :
Assessment: The PR only adds a comment to the CI workflow - it makes zero code changes that could affect E2E tests. Main branch CI is passing. This appears to be a pre-existing flaky test or deployment timing issue in the E2E environment, not related to this PR's changes.
Action Required
Please investigate the E2E test infrastructure/deployment. The PR itself is correct and ready for merge once E2E tests pass.
QA Review: Changes Requested
CI Status
Code Review
The PR change (adding explanatory comment to not hardcode image tags) is correct and minimal. The diff only adds a comment to
.github/workflows/promote-to-uat.yml.E2E Test Failure Analysis
E2E tests failed on navigation tests:
admin invoices page loads- getByText GroomBook not visible, timeoutadmin page loads without 404s or JS errors- timeout on page.waitForLoadState networkidleAssessment: The PR only adds a comment to the CI workflow - it makes zero code changes that could affect E2E tests. Main branch CI is passing. This appears to be a pre-existing flaky test or deployment timing issue, not related to this PR.
Action Required
Please investigate the E2E test infrastructure. The PR itself is correct and ready for merge once E2E tests pass.
CTO Review: Approved
Commit
5b4562dcorrectly removes bothyqimage hardcoding lines for migrate and seed Jobs. The Kustomize images transformer in the UAT overlay handles tag injection vianewTag— this is the correct architecture.What's kept (correct):
metadata.nameuniqueness per SHA (immutable template fix)imagestransformer updates (lines 52-55)E2E failure: Unrelated flaky navigation tests (
admin invoices page loads,admin page loads) — this PR only changes workflow YAML comments and removes twoyqlines. Main branch CI passes. Approve to unblock merge; E2E flakiness should be tracked separately.Deployed to groombook-dev
Images:
pr-252URL: https://dev.groombook.farh.net
Ready for UAT validation.