fix(e2e): keep ServiceAccount across deploy cycles to avoid token fetch race

The deploy script was deleting serviceaccount/headlamp-e2e before recreating it via kubectl apply. This causes a race: the new deployment pod tries to mount its service account token before the token is available, resulting in:

  Warning FailedMount: failed to fetch token: serviceaccounts "headlamp-e2e" not found

Fix by removing the kubectl delete serviceaccount line and replacing it with an idempotent create (--dry-run=client | kubectl apply). This ensures the ServiceAccount persists across deploy cycles and tokens are available when pods start.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
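
The idempotent create described in this commit message can be sketched as a small shell function. This is a minimal sketch rather than the script's actual code: the function name `ensure_serviceaccount` is hypothetical, while the `kubectl create ... --dry-run=client -o yaml | kubectl apply -f -` pipeline is the one the commit introduces.

```shell
# Hypothetical helper; the name is an assumption, not part of the deploy script.
# $1 = ServiceAccount name, $2 = namespace.
# `kubectl create --dry-run=client -o yaml` only renders the manifest locally;
# `kubectl apply -f -` then creates the object on the first run and leaves an
# existing ServiceAccount in place on later runs, so it is never deleted.
ensure_serviceaccount() {
  kubectl create serviceaccount "$1" \
    -n "$2" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

With the defaults used by the scripts in this change, the call would look like `ensure_serviceaccount headlamp-e2e headlamp-dev`. Because nothing deletes the ServiceAccount between runs, a new pod does not start while the account is missing.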
Migrate to reusable plugin-e2e.yaml workflow (PRI-634)
2026-05-05 15:35:48 +00:00 · 2026-05-05 10:56:47 +00:00 · 2026-05-05 10:18:47 +00:00 · 2026-05-05 05:10:33 +00:00 · 2026-05-04 17:20:38 +00:00 · 2026-05-03 17:44:15 +00:00
8 changed files with 49 additions and 165 deletions
@@ -10,94 +10,13 @@ on:
 permissions:
  contents: read

-# Only one E2E run at a time: the shared E2E_RELEASE (headlamp-e2e) in
-# privilegedescalation-dev cannot be shared across concurrent runs.
-# cancel-in-progress: false (queue, don't cancel) — cancelling in-flight
-# runs may skip the if: always() teardown, leaving dangling cluster resources.
 concurrency:
  group: e2e-${{ github.repository }}
  cancel-in-progress: false

-env:
-  E2E_NAMESPACE: privilegedescalation-dev
-  E2E_RELEASE: headlamp-e2e
-  # Pin to a known-good Headlamp version. Using :latest is risky because
-  # the tag can change between CI runs, causing flaky failures when a newer
-  # image is pulled on some nodes but not others (IfNotPresent pull policy).
-  # Update this when Headlamp is upgraded in production (kube-system).
-  HEADLAMP_VERSION: v0.40.1
-
 jobs:
  e2e:
-    runs-on: runners-privilegedescalation
-    timeout-minutes: 15
-
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v6
-
-      - name: Setup Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: '22'
-          cache: 'npm'
-
-      - name: Setup kubectl
-        uses: azure/setup-kubectl@v4
-
-      - name: Install dependencies
-        run: npm ci
-
-      - name: Build plugin
-        run: npx @kinvolk/headlamp-plugin build
-
-      - name: Deploy E2E Headlamp instance
-        run: scripts/deploy-e2e-headlamp.sh
-
-      - name: Load E2E environment
-        run: |
-          if [ -f .env.e2e ]; then
-            cat .env.e2e >> "$GITHUB_ENV"
-          else
-            echo "::error::deploy-e2e-headlamp.sh did not produce .env.e2e"
-            exit 1
-          fi
-
-      - name: Install Playwright browsers
-        run: npx playwright install --with-deps chromium
-
-      - name: Run E2E tests
-        run: npm run e2e
-        env:
-          HEADLAMP_URL: ${{ env.HEADLAMP_URL }}
-          HEADLAMP_TOKEN: ${{ env.HEADLAMP_TOKEN }}
-
-      - name: Collect deployment diagnostics on failure
-        if: failure()
-        run: |
-          echo "=== Pod state ==="
-          kubectl get pods -n "$E2E_NAMESPACE" -l "app.kubernetes.io/instance=$E2E_RELEASE" 2>&1 || true
-          echo "=== Pod describe ==="
-          kubectl describe pods -n "$E2E_NAMESPACE" -l "app.kubernetes.io/instance=$E2E_RELEASE" 2>&1 || true
-          echo "=== Recent namespace events ==="
-          kubectl get events -n "$E2E_NAMESPACE" --sort-by='.lastTimestamp' 2>&1 | tail -20 || true
-
-      - name: Teardown E2E instance
-        if: always()
-        run: scripts/teardown-e2e-headlamp.sh
-
-      - name: Upload Playwright report
-        uses: actions/upload-artifact@v7
-        if: failure()
-        with:
-          name: playwright-report
-          path: playwright-report/
-          retention-days: 7
-
-      - name: Upload test results
-        uses: actions/upload-artifact@v7
-        if: failure()
-        with:
-          name: test-results
-          path: test-results/
-          retention-days: 7
+    uses: privilegedescalation/.github/.github/workflows/plugin-e2e.yaml@main
+    with:
+      node-version: "22"
+      headlamp-version: v0.40.1
@@ -1,53 +0,0 @@
-{
-  "config": {
-    // Line length — not enforced for docs with code examples
-    "MD013": false,
-    // First line heading — files use YAML frontmatter, not headings
-    "MD041": false,
-    // Emphasis as heading — common pattern for Option 1/2/3 sections
-    "MD036": false,
-    // No duplicate heading — changelog files repeat section names intentionally
-    "MD024": false,
-    // Fenced code language — not always applicable for diagram blocks
-    "MD040": false,
-    // Table column style — table alignment is visual, not semantic
-    "MD060": false,
-    // Ordered list item prefix — number resets are intentional in documents
-    "MD029": false,
-    // No inline HTML — each elements are valid in valid Markdown
-    "MD033": false,
-    // List marker space — spacing after list markers varies by editor
-    "MD030": false,
-    // Blanks around headings — not always needed in compact docs
-    "MD022": false,
-    // Blanks around lists — not always needed in compact docs
-    "MD032": false,
-    // Blanks around fences — not always needed between adjacent blocks
-    "MD031": false,
-    // Multiple blanks — editor artifacts, not semantic
-    "MD012": false,
-    // Single title — files may have multiple H1 sections
-    "MD025": false,
-    // Trailing spaces — editor artifacts
-    "MD009": false,
-    // Bare URLs — URL shortening not always needed
-    "MD034": false,
-    // Single trailing newline — editor artifacts
-    "MD047": false,
-    // Trailing punctuation — heading punctuation is intentional
-    "MD026": false,
-    // Space in emphasis — double-asterisk bold spacing varies by renderer
-    "MD037": false,
-    // No hard tabs — some generated docs use tabs for indentation
-    "MD010": false,
-    // Code block style — generated docs may use inconsistent styles
-    "MD046": false,
-    // Comment style — generated docs have no comments
-    "MD048": false,
-    // Commands show output — shell examples intentionally show only commands
-    "MD014": false
-  },
-  "ignores": [
-    "docs/api-reference/generated/**"
-  ]
-}
@@ -1 +0,0 @@
-docs/api-reference/generated/**
@@ -19,16 +19,18 @@ test.describe('Intel GPU plugin smoke tests', () => {

    // Should navigate to the overview route
    await expect(page).toHaveURL(/\/intel-gpu$/);
-    await expect(page.getByRole('heading', { name: /Intel GPU — Overview/i })).toBeVisible();
+    await expect(
+      page.locator('main').getByRole('heading', { name: 'Intel GPU — Overview' })
+    ).toBeVisible();
  });

  test('overview page renders GPU device list or empty state', async ({ page }) => {
    await page.goto('/c/main/intel-gpu');

    // Overview heading should be present
-    await expect(page.getByRole('heading', { name: /Intel GPU — Overview/i })).toBeVisible({
-      timeout: 15_000,
-    });
+    await expect(
+      page.locator('main').getByRole('heading', { name: 'Intel GPU — Overview' })
+    ).toBeVisible({ timeout: 15_000 });

    // Either a populated table/list or an empty-state indicator must be visible
    const hasTable = await page.locator('table').first().isVisible().catch(() => false);
@@ -43,9 +45,9 @@ test.describe('Intel GPU plugin smoke tests', () => {
  test('device plugins page renders or shows empty state', async ({ page }) => {
    await page.goto('/c/main/intel-gpu/device-plugins');

-    await expect(page.getByRole('heading', { name: /Intel GPU — Device Plugins/i })).toBeVisible({
-      timeout: 15_000,
-    });
+    await expect(
+      page.locator('main').getByRole('heading', { name: 'Intel GPU — Device Plugins' })
+    ).toBeVisible({ timeout: 15_000 });

    const hasTable = await page.locator('table').first().isVisible().catch(() => false);
    const hasEmptyState = await page
@@ -61,18 +63,24 @@ test.describe('Intel GPU plugin smoke tests', () => {
    // not after clicking the parent entry from the overview. Test route
    // accessibility via direct navigation — each route must render its heading.
    await page.goto('/c/main/intel-gpu');
-    await expect(page.getByRole('heading', { name: /Intel GPU — Overview/i })).toBeVisible({
-      timeout: 15_000,
-    });
+    await expect(
+      page.locator('main').getByRole('heading', { name: 'Intel GPU — Overview' })
+    ).toBeVisible({ timeout: 15_000 });

    await page.goto('/c/main/intel-gpu/nodes');
-    await expect(page.getByRole('heading', { name: /Intel GPU — Nodes/i })).toBeVisible({ timeout: 15_000 });
+    await expect(
+      page.locator('main').getByRole('heading', { name: 'Intel GPU — Nodes' })
+    ).toBeVisible({ timeout: 15_000 });

    await page.goto('/c/main/intel-gpu/pods');
-    await expect(page.getByRole('heading', { name: /Intel GPU — Pods/i })).toBeVisible({ timeout: 15_000 });
+    await expect(
+      page.locator('main').getByRole('heading', { name: 'Intel GPU — Pods' })
+    ).toBeVisible({ timeout: 15_000 });

    await page.goto('/c/main/intel-gpu/metrics');
-    await expect(page.getByRole('heading', { name: /Intel GPU — Metrics/i })).toBeVisible({ timeout: 15_000 });
+    await expect(
+      page.locator('main').getByRole('heading', { name: 'Intel GPU — Metrics' })
+    ).toBeVisible({ timeout: 15_000 });
  });

  test('plugin settings page shows intel-gpu plugin entry', async ({ page }) => {
@@ -11600,9 +11600,9 @@
      }
    },
    "node_modules/lodash": {
-      "version": "4.17.23",
-      "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.23.tgz",
-      "integrity": "sha512-LgVTMpQtIopCi79SJeDiP0TfWi5CNEc/L/aRdTh3yIvmZXTnheWpKjSZhnvMl8iXbC1tFg9gdHHDMLoV7CnG+w==",
+      "version": "4.18.1",
+      "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.18.1.tgz",
+      "integrity": "sha512-dMInicTPVE8d1e5otfwmmjlxkZoUpiVLwyeTdUsi/Caj/gfzzblBcCE5sRHV/AsjuCmxWrte2TNGSYuCeCq+0Q==",
      "dev": true,
      "license": "MIT"
    },
@@ -44,6 +44,7 @@
  },
  "overrides": {
    "tar": "^7.5.11",
-    "undici": "^7.24.3"
+    "undici": "^7.24.3",
+    "lodash": ">=4.18.0"
  }
 }
@@ -5,7 +5,7 @@
 # a ConfigMap volume mount. No custom Docker images — the plugin is built
 # in CI and injected as a ConfigMap.
 #
-# E2E resources are deployed to the `privilegedescalation-dev` namespace. Nothing
+# E2E resources are deployed to the `headlamp-dev` namespace. Nothing
 # persists beyond the test run — teardown cleans up all created resources.
 #
 # Prerequisites:
@@ -14,7 +14,7 @@
 #   - RBAC applied: kubectl apply -f deployment/e2e-ci-runner-rbac.yaml
 #
 # Environment:
-#   E2E_NAMESPACE     — namespace for E2E Headlamp (default: privilegedescalation-dev)
+#   E2E_NAMESPACE     — namespace for E2E Headlamp (default: headlamp-dev)
 #   E2E_RELEASE       — release/resource name prefix (default: headlamp-e2e)
 #   HEADLAMP_VERSION  — Headlamp image tag (default: latest)
 set -euo pipefail
@@ -22,7 +22,7 @@ set -euo pipefail
 REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
 DIST_DIR="$REPO_ROOT/dist"

-E2E_NAMESPACE="${E2E_NAMESPACE:-privilegedescalation-dev}"
+E2E_NAMESPACE="${E2E_NAMESPACE:-headlamp-dev}"
 E2E_RELEASE="${E2E_RELEASE:-headlamp-e2e}"
 HEADLAMP_VERSION="${HEADLAMP_VERSION:-latest}"

@@ -40,7 +40,7 @@ if ! kubectl auth can-i delete configmaps -n "$E2E_NAMESPACE" --quiet 2>/dev/nul
 fi

 echo "=== E2E Headlamp Deployment ==="
-echo "  Image:     ghcr.io/headlamp-plugins/headlamp:${HEADLAMP_VERSION}"
+echo "  Image:     ghcr.io/headlamp-k8s/headlamp:${HEADLAMP_VERSION}"
 echo "  Namespace: $E2E_NAMESPACE"
 echo "  Release:   $E2E_RELEASE"

@@ -59,11 +59,21 @@ kubectl create configmap headlamp-intel-gpu-plugin \
  --from-file=package.json="$REPO_ROOT/package.json"

 # --- Tear down any existing E2E deployment for a clean start ---
+# Deleting the Deployment forces a fresh pod (new ReplicaSet) regardless of
+# whether the pod spec changed. We do NOT delete the ServiceAccount — keeping
+# it avoids a token-race condition where kubelet tries to mount a volume using a
+# token that has been deleted but the new one isn't ready yet.
+# The Service is NOT deleted — leaving it in place avoids an
+# Endpoints UID race (FailedToUpdateEndpoint) that causes DNS resolution
+# failures. kubectl apply below upserts the Service in-place, and the new
+# pod's IP is added to the existing Endpoints automatically.
 echo ""
 echo "Removing any existing E2E deployment (clean-start)..."
 kubectl delete deployment "${E2E_RELEASE}" -n "$E2E_NAMESPACE" --ignore-not-found --wait
-kubectl delete service "${E2E_RELEASE}" -n "$E2E_NAMESPACE" --ignore-not-found --wait
-kubectl delete serviceaccount "${E2E_RELEASE}" -n "$E2E_NAMESPACE" --ignore-not-found --wait
+# ServiceAccount is kept — create it idempotently so the first run works too
+kubectl create serviceaccount "${E2E_RELEASE}" \
+  -n "$E2E_NAMESPACE" \
+  --dry-run=client -o yaml | kubectl apply -f -

 # --- Deploy Headlamp via kubectl apply ---
 echo ""
@@ -101,7 +111,7 @@ spec:
      securityContext: {}
      containers:
        - name: headlamp
-          image: ghcr.io/headlamp-plugins/headlamp:${HEADLAMP_VERSION}
+          image: ghcr.io/headlamp-k8s/headlamp:${HEADLAMP_VERSION}
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsNonRoot: true
@@ -4,13 +4,13 @@
 # Tears down the dedicated E2E Headlamp instance deployed by deploy-e2e-headlamp.sh.
 #
 # Environment:
-#   E2E_NAMESPACE  — namespace to clean up (default: privilegedescalation-dev)
+#   E2E_NAMESPACE  — namespace to clean up (default: headlamp-dev)
 #   E2E_RELEASE    — release/resource name prefix (default: headlamp-e2e)
 set -euo pipefail

 REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"

-E2E_NAMESPACE="${E2E_NAMESPACE:-privilegedescalation-dev}"
+E2E_NAMESPACE="${E2E_NAMESPACE:-headlamp-dev}"
 E2E_RELEASE="${E2E_RELEASE:-headlamp-e2e}"

 echo "=== E2E Headlamp Teardown ==="
Author	SHA1	Message	Date
Chris Farhood	2d9c447467	fix(e2e): keep ServiceAccount across deploy cycles to avoid token fetch race The deploy script was deleting serviceaccount/headlamp-e2e before recreating it via kubectl apply. This causes a race: the new deployment pod tries to mount its service account token before the token is available, resulting in: Warning FailedMount: failed to fetch token: serviceaccounts "headlamp-e2e" not found Fix by removing the kubectl delete serviceaccount line and replacing it with an idempotent create (--dry-run=client \| kubectl apply). This ensures the ServiceAccount persists across deploy cycles and tokens are available when pods start. Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-05-05 15:35:48 +00:00
Chris Farhood	191d2edc55	Migrate to reusable plugin-e2e.yaml workflow (PRI-634)	2026-05-05 10:56:47 +00:00
privilegedescalation-engineer[bot]	c7920b5b8e	fix(e2e): use headlamp-dev namespace in E2E workflow (PRI-550) (#61) * fix(e2e): use headlamp-dev namespace in E2E workflow (PRI-550) The infra RBAC in privilegedescalation/infra already covers headlamp-dev with all needed E2E permissions. Changing the workflow to use headlamp-dev unblocks E2E since the Arc Runners SA is already authorized there. Depends on Gandalf's PR #58 for namespace corrections in scripts and RBAC manifest. Co-Authored-By: Paperclip <noreply@paperclip.ing> * chore: re-trigger E2E with headlamp-dev namespace (PRI-550) * chore: re-run CI/E2E checks (PRI-550) Co-Authored-By: Paperclip <noreply@paperclip.ing> --------- Co-authored-by: Chris Farhood <chris@farhood.org> Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-05 10:18:47 +00:00
privilegedescalation-engineer[bot]	c99e235caa	fix(e2e): remove Service delete to fix Endpoints UID race causing ERR_NAME_NOT_RESOLVED Merged via CEO gate after full pipeline approval: CI ✅ E2E ✅ UAT ✅ QA ✅ CTO ✅	2026-05-05 05:10:33 +00:00
privilegedescalation-engineer[bot]	85c839bc19	fix(e2e): scope heading locators to main content area (#50) Replace bare getByRole("heading", { name: /Intel GPU — .../i }) calls with page.locator('main').getByRole('heading', { name: '...' }) so that each locator matches exactly one element and Playwright strict mode is satisfied. The main element is the appropriate scoping container for plugin page content. Exact name matching (without regex) is used to be precise about which heading is being targeted. Co-authored-by: Test User <test@example.com> Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-04 17:20:38 +00:00
privilegedescalation-engineer[bot]	00c29e36dd	fix: override lodash >=4.18.0 to patch code injection vulnerability (#51) * fix: override lodash >=4.18.0 to patch code injection vulnerability GHSA-r5fr-rjxr-66jc is a code injection vulnerability in lodash below 4.18.0. The vulnerable transitive dependency comes through @kinvolk/headlamp-plugin. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: update package-lock.json to satisfy lodash override The package.json override requires lodash >=4.18.0, but the lockfile had 4.17.23. Regenerated lockfile with npm install --include=dev. Co-Authored-By: Paperclip <noreply@paperclip.ing> * fix(e2e): scope heading locators to main content area Cherry-picked from PR #50 to fix E2E test failures on lodash PR. Co-Authored-By: Paperclip <noreply@paperclip.ing> --------- Co-authored-by: Gandalf the Greybeard <gandalf@privilegedescalation.dev> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Paperclip <noreply@paperclip.ing>	2026-05-03 17:44:15 +00:00
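
The lodash override (`>=4.18.0`) from the last commit listed above only takes effect once the lockfile resolves to a matching version. One way to check that locally is a small version comparison. This is a sketch under stated assumptions: the helper name `version_ge` is hypothetical, and it relies on `sort -V` (version sort), which GNU coreutils provides but not every `sort` implementation does.

```shell
# Hypothetical helper; not part of this repository.
# version_ge A B: succeeds when version A is greater than or equal to B.
# Both versions are version-sorted ascending; if the smallest one equals B,
# then A is at least B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For the versions in this diff, `version_ge 4.18.1 4.18.0` succeeds and `version_ge 4.17.23 4.18.0` fails. The installed version itself can be listed with `npm ls lodash` after `npm ci`.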