Files
headlamp-intel-gpu-plugin/artifacthub-pkg.yml
T
Chris Farhood 4b4e565a1a fix: switch Metrics page to Prometheus/node-exporter i915 hwmon source
The Intel GPU device plugin -enable-monitoring flag registers a monitoring
K8s resource type (not a Prometheus endpoint). Real GPU power metrics come
from node-exporter's hwmon collector which scrapes the i915 kernel driver.

- Rewrite src/api/metrics.ts: query kube-prometheus-stack Prometheus for
  node_hwmon_energy_joule_total (rate → watts), node_hwmon_power_max_watt
  (TDP), joined with node_hwmon_chip_names{chip_name="i915"} to identify
  GPU chips. Instance → node name resolved via node_uname_info.

- Rewrite src/components/MetricsPage.tsx: shows per-chip current power (W)
  with bar vs TDP, total fleet power summary, last-fetched timestamp.
  Auto-discovers Prometheus service in monitoring namespace.

- Update artifacthub-pkg.yml checksum for repackaged v0.2.0 tarball.

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
2026-02-18 21:37:16 -05:00

76 lines
3.0 KiB
YAML

version: "0.2.0"
name: headlamp-intel-gpu-plugin
displayName: Intel GPU
description: >-
Headlamp plugin for Intel GPU device plugin visibility and monitoring.
Surfaces GpuDevicePlugin CRD status, GPU-capable nodes with capacity and
allocation, pods requesting Intel GPU resources, and injects Intel GPU
sections into native Node and Pod detail pages. Supports discrete (i915),
Xe, and integrated GPU nodes with graceful degradation when the device
plugin operator is not installed. Includes a Metrics page showing real-time
engine utilization, GPU frequency, VRAM usage, and energy from the device
plugin's Prometheus endpoint.
createdAt: "2026-02-18T00:00:00Z"
license: Apache-2.0
category: monitoring-logging
homeURL: https://github.com/privilegedescalation/headlamp-intel-gpu-plugin
appVersion: "0.2.0"
keywords:
- headlamp
- kubernetes
- intel
- gpu
- intel-gpu
- device-plugin
- i915
- xe
- monitoring
- node-feature-discovery
maintainers:
- name: privilegedescalation
email: chris@farhood.org
provider:
name: privilegedescalation
links:
- name: GitHub
url: https://github.com/privilegedescalation/headlamp-intel-gpu-plugin
- name: Issues
url: https://github.com/privilegedescalation/headlamp-intel-gpu-plugin/issues
- name: Intel Device Plugins for Kubernetes
url: https://intel.github.io/intel-device-plugins-for-kubernetes/
changes:
- kind: added
description: "Metrics page: real-time GPU power draw (W) and TDP via node-exporter i915 hwmon metrics in kube-prometheus-stack"
- kind: changed
description: "Sidebar label changed to intel-gpu"
- kind: removed
description: "Removed app bar health badge"
- kind: added
description: "Overview dashboard: plugin health, GPU node summary, allocation bar, active GPU pods"
- kind: added
description: "Device Plugins page: GpuDevicePlugin CRD instances with spec/status and daemon pods"
- kind: added
description: "GPU Nodes page: per-node GPU type, device count, allocation, workload pods"
- kind: added
description: "GPU Pods page: all pods requesting Intel GPU resources with per-container detail"
- kind: added
description: "Node detail injection: Intel GPU section on native Node detail pages (capacity, allocatable, utilization, active pods)"
- kind: added
description: "Pod detail injection: GPU resource requests/limits per container on native Pod detail pages"
- kind: added
description: "Nodes table: GPU Type and GPU Devices columns injected into native Nodes table"
- kind: added
description: "App bar health badge: hidden when no Intel GPU plugin detected"
annotations:
headlamp/plugin/archive-url: "https://github.com/privilegedescalation/headlamp-intel-gpu-plugin/releases/download/v0.2.0/headlamp-intel-gpu-plugin-0.2.0.tar.gz"
headlamp/plugin/archive-checksum: "sha256:cbcd20916d72e91ccc36143c74680fbeb2f045e1cbe6d1cc0d844b198e2a1ea5"
headlamp/plugin/version-compat: ">=0.20.0"
headlamp/plugin/distro-compat: "in-cluster,web,app"