fix/metrics-heading-renders-immediately
The heading 'Intel GPU — Metrics' was blocked behind the ctxLoading check, causing the E2E navigation test to timeout when navigating directly to /c/main/intel-gpu/metrics. The K8s.ResourceClasses.useList() hooks in IntelGpuDataContext can take time to resolve when navigating directly to the metrics route (as opposed to via sidebar), causing ctxLoading to remain true beyond the 15s test timeout. Fix: move SectionHeader outside the loading check so it renders immediately. The Loader now appears below the heading while waiting for context to load. Also disable the Refresh button during ctxLoading. Updated unit test to verify heading is visible even when ctxLoading=true. Fixes: headlamp-intel-gpu-plugin#42 Co-Authored-By: Paperclip <noreply@paperclip.ing>
headlamp-intel-gpu-plugin
A Headlamp plugin providing visibility into Intel GPU device plugin deployments on Kubernetes.
Features
- Overview Dashboard — Plugin health, GPU node summary, allocation bar, active GPU pods
- Device Plugins — GpuDevicePlugin CRD instances with spec/status and daemon pod health
- GPU Nodes — Per-node GPU type (discrete/integrated), device count, allocation, workload pods
- GPU Pods — All pods requesting Intel GPU resources with per-container detail
- Metrics — Real-time GPU power draw (W) and TDP via Prometheus node-exporter i915 hwmon
- Node Detail Integration — Intel GPU section injected into native Headlamp Node detail views
- Pod Detail Integration — GPU resource requests/limits injected into native Pod detail views
- Nodes Table Columns — GPU Type and GPU Devices columns added to native Nodes table
Installation
Search for headlamp-intel-gpu in the Headlamp Plugin Manager (Settings → Plugins → Catalog).
Requirements
- Headlamp >= v0.20.0
- Intel GPU device plugin deployed (optional — plugin gracefully degrades without it)
- Optional: Node Feature Discovery with Intel GPU labels
- Optional: kube-prometheus-stack with node-exporter for GPU power metrics
RBAC
This plugin is read-only and requires the following permissions:
| Resource | API Group | Verbs |
|---|---|---|
| nodes | v1 | list, get, watch |
| pods | v1 | list, get, watch |
| gpudeviceplugins | deviceplugin.intel.com/v1 | list, get |
For metrics, Prometheus must be accessible via the Headlamp API proxy in the monitoring namespace.
Architecture
src/
├── index.tsx # Plugin entry point
├── api/
│ ├── k8s.ts # Types and helper functions
│ ├── metrics.ts # Prometheus GPU metrics
│ └── IntelGpuDataContext.tsx # React context provider
└── components/
├── OverviewPage.tsx # Dashboard
├── DevicePluginsPage.tsx # Device plugin CRDs
├── NodesPage.tsx # GPU nodes
├── PodsPage.tsx # GPU pods
├── MetricsPage.tsx # Power metrics
├── NodeDetailSection.tsx # Injected into Node detail view
├── PodDetailSection.tsx # Injected into Pod detail view
└── integrations/
└── NodeColumns.tsx # Nodes table columns
Development
npm install
npm start # dev server
npm test # run tests
npm run tsc # type check
npm run lint # ESLint
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| No GPU nodes shown | No Intel GPU labels or resources on nodes | Install Intel Node Feature Discovery or Intel GPU device plugin |
| CRD not available warning | GpuDevicePlugin CRD not installed | Install Intel device plugins operator — plugin still works without it |
| No metrics data | Prometheus not found | Deploy kube-prometheus-stack in the monitoring namespace |
| Metrics show only discrete GPUs | Integrated GPUs lack hwmon | Expected — iGPU driver doesn't expose hwmon power data |
Contributing
See CONTRIBUTING.md for development guidelines.
License
Apache License 2.0. See LICENSE for details.
Description
Headlamp plugin for Intel GPU device visibility and resource monitoring in Kubernetes
Readme
1.6 MiB