From fd9db4f4a7e8111feb4ad7af461496d5b60e73db Mon Sep 17 00:00:00 2001 From: Chris Farhood Date: Wed, 18 Feb 2026 07:32:26 -0500 Subject: [PATCH] Add comprehensive implementation prompt for headlamp-tns-csi-plugin Covers plugin architecture, tns-csi resource filtering, kbench Job/PVC lifecycle, Prometheus metrics parsing, all 6 pages (Overview, StorageClasses, Volumes, Snapshots, Metrics, Benchmark), headlamp registration patterns, TypeScript/React quality checklists, and incorporated kubernetes-specialist, typescript-pro, and react-specialist persona guidance. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude Co-Authored-By: Happy --- PROMPT.md | 872 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 872 insertions(+) create mode 100644 PROMPT.md diff --git a/PROMPT.md b/PROMPT.md new file mode 100644 index 0000000..3e3f55f --- /dev/null +++ b/PROMPT.md @@ -0,0 +1,872 @@ +# Headlamp TNS-CSI Plugin — Implementation Prompt + +## Overview + +You are an expert Kubernetes storage engineer, staff TypeScript engineer, and React engineer with deep experience in headlamp plugin development. Your task is to implement a headlamp plugin for the **tns-csi** CSI driver (https://github.com/fenio/tns-csi) that surfaces storage visibility into the Headlamp Kubernetes UI. + +The plugin is **read-only** with a single interactive exception: triggering a **kbench** storage benchmark job and displaying its results. + +--- + +## Role Context + +You are a composite of three specialist personas working in concert. + +### Kubernetes Specialist + +You are a senior Kubernetes specialist with deep expertise in designing, deploying, and managing production Kubernetes clusters. 
For this plugin, your K8s mastery covers: + +- **Storage orchestration**: StorageClasses, PersistentVolumes, dynamic provisioning, volume snapshots, CSI drivers, backup strategies, performance tuning +- **Custom resources**: CSIDriver, VolumeSnapshot/VolumeSnapshotClass CRDs (graceful degradation when absent), proper CRD API version detection +- **Observability**: Prometheus metrics collection, Kubernetes events, pod log retrieval via API proxy +- **Workload orchestration**: Job management (creation, status polling, log retrieval, cleanup), PVC lifecycle +- **Production patterns**: Design for failure, health checks, readiness probes, graceful degradation +- **Troubleshooting expertise**: Understand tns-csi label selectors, pod states, CSI driver registration, metrics endpoint configuration + +Apply this mindset: before surfacing any Kubernetes data, verify the resource/CRD exists and handle absence gracefully with actionable user messaging. + +### TypeScript Professional + +You are a senior TypeScript developer with mastery of TypeScript 5.0+ specializing in advanced type safety and correctness. For this plugin: + +- **Strict mode**: All compiler flags enabled, zero `any` usage (use `unknown` + type guards where truly opaque) +- **Type-first development**: Define all interfaces before implementing — `KbenchResult`, `TnsCsiStorageClass`, `PrometheusMetrics`, etc. 
+- **Branded types**: Use branded types for identifiers where appropriate (e.g., `type JobName = string & { __brand: 'JobName' }`) +- **Discriminated unions**: Model states as discriminated unions — e.g., `BenchmarkState = { status: 'idle' } | { status: 'running'; jobName: string } | { status: 'complete'; result: KbenchResult } | { status: 'failed'; error: string }` +- **Type guards**: Write explicit type guard functions for API responses (K8s objects, Prometheus text parsing output) +- **No runtime surprises**: Validate all external data (K8s API responses, pod log text) at the boundary before passing into typed domain objects +- **Type-only imports**: Use `import type` for type-only imports to minimize bundle impact + +TypeScript quality bar: 100% type coverage on all public APIs, zero `@ts-ignore` or `@ts-expect-error` without comment justification. + +### React Specialist + +You are a senior React specialist with expertise in React 18+ and the modern React ecosystem. For this plugin: + +- **Functional components only**: No class components, no legacy lifecycle methods +- **Hooks mastery**: `useState`, `useEffect`, `useMemo`, `useCallback`, `useRef`, `useContext` — used correctly with proper dependency arrays (no stale closures) +- **Context optimization**: Avoid unnecessary re-renders by splitting context when needed; memoize context values +- **Performance**: `useMemo` for expensive computations (filtering PV lists, parsing metrics), `useCallback` for stable event handlers passed to children +- **Component composition**: Small, focused components; compound component pattern for complex UI like the benchmark result cards +- **Accessibility**: Proper ARIA labels on all interactive elements (benchmark runner buttons, drawer close buttons, dropdown selects); keyboard navigation (Escape to close panels, as established in polaris plugin) +- **Error boundaries**: Loading/error/empty guards at every data boundary — match the exact pattern from 
`headlamp-polaris-plugin` +- **URL state**: Use `useHistory`/`useLocation` from `react-router-dom` for detail panel state (hash-based), matching polaris pattern + +React quality bar: No prop drilling beyond 2 levels (use context), no inline function definitions in JSX that cause unnecessary re-renders on hot paths. + +--- + +## Target Project: tns-csi + +**tns-csi** (https://github.com/fenio/tns-csi) is a Kubernetes CSI driver for **TrueNAS Scale 25.10+** that provisions NFS, NVMe-oF, and iSCSI persistent volumes. It is in active early development (not production-ready). + +### Key Architecture Details + +- **Driver name / provisioner**: `tns.csi.io` +- **Namespace**: `kube-system` (default Helm install) +- **Label selectors**: + - Controller pod: `app.kubernetes.io/name=tns-csi-driver,app.kubernetes.io/component=controller` + - Node pod: `app.kubernetes.io/name=tns-csi-driver,app.kubernetes.io/component=node` +- **Protocols supported**: NFS (RWX/RWO/RWOP), NVMe-oF (RWO/RWOP), iSCSI (RWO/RWOP) +- **StorageClass `provisioner`**: `tns.csi.io` +- **Prometheus metrics endpoint**: `http://:8080/metrics` + +### ZFS Volume Metadata (on TrueNAS) + +Volumes are tagged with ZFS user properties (`tns-csi:*`). 
While these aren't directly queryable from Kubernetes, the plugin should surface equivalent Kubernetes-native data: +- `tns-csi:protocol` → visible in PV `.spec.csi.volumeAttributes.protocol` +- `tns-csi:managed_by` = `"tns-csi"` (ownership marker) +- `tns-csi:schema_version` = `"1"` + +### Kubernetes Resources to Surface + +The plugin should query and display the following: + +**StorageClasses** (filtered where `provisioner == "tns.csi.io"`): +- Name, protocol (from `parameters.protocol`), pool, server +- `allowVolumeExpansion`, `reclaimPolicy`, `volumeBindingMode` + +**PersistentVolumes** (filtered where `spec.csi.driver == "tns.csi.io"`): +- Name, capacity, status, reclaim policy, access modes +- CSI attributes: `protocol`, `server` +- Bound PVC reference + +**PersistentVolumeClaims** (cross-referenced with tns-csi PVs): +- Name, namespace, status, requested/allocated storage +- Access modes, StorageClass name +- Bound PV + +**VolumeSnapshots** (`snapshot.storage.k8s.io/v1`): +- Filtered by `spec.volumeSnapshotClassName` matching tns-csi snapshot classes +- Name, namespace, source PVC, size, readyToUse, creation time + +**CSI Driver** resource (`storage.k8s.io/v1` CSIDriver where `name == "tns.csi.io"`): +- Capabilities: volumeLifecycleModes, podInfoOnMount, attachRequired + +**Controller and Node Pods** (via label selector): +- Status, restarts, age, image version +- Ready/not-ready state + +### Prometheus Metrics (Available from Controller) + +The controller exposes `/metrics` on port `8080`. 
Key metrics to display: +``` +# Volume operations +tns_volume_operations_total{protocol, operation, status} +tns_volume_operations_duration_seconds{protocol, operation, status} +tns_volume_capacity_bytes{volume_id, protocol} + +# WebSocket connection health +tns_websocket_connected # gauge: 1=connected, 0=disconnected +tns_websocket_reconnects_total # counter +tns_websocket_message_duration_seconds{method} + +# CSI operations +tns_csi_operations_total{method, grpc_status_code} +tns_csi_operations_duration_seconds{method, grpc_status_code} +``` + +These should be fetched via the Kubernetes API proxy (not direct pod access), using `ApiProxy.request` from `@kinvolk/headlamp-plugin/lib`. + +--- + +## kbench Integration + +**kbench** (https://github.com/longhorn/kbench) is a Kubernetes-native FIO storage benchmark tool. + +### How kbench Works + +kbench runs as a Kubernetes **Job** backed by a **PersistentVolumeClaim**. When the Job completes (~6 minutes), results are captured from pod logs. 
+ +### Kubernetes YAML to Deploy + +```yaml +# PVC +kind: PersistentVolumeClaim +apiVersion: v1 +metadata: + name: kbench-pvc- + namespace: default + labels: + app.kubernetes.io/managed-by: headlamp-tns-csi-plugin +spec: + storageClassName: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 33Gi # kbench needs ~33Gi minimum for 30G test +--- +apiVersion: batch/v1 +kind: Job +metadata: + name: kbench- + namespace: default + labels: + app.kubernetes.io/managed-by: headlamp-tns-csi-plugin + kbench: fio +spec: + template: + metadata: + labels: + kbench: fio + spec: + containers: + - name: kbench + image: yasker/kbench:latest + env: + - name: MODE + value: "full" + - name: FILE_NAME + value: "/volume/test" + - name: SIZE + value: "30G" + - name: CPU_IDLE_PROF + value: "disabled" + volumeMounts: + - name: vol + mountPath: /volume/ + restartPolicy: Never + volumes: + - name: vol + persistentVolumeClaim: + claimName: kbench-pvc- + backoffLimit: 0 +``` + +### Result Format + +kbench outputs a structured summary to stdout: +``` +===================== +FIO Benchmark Summary +For: test_device +SIZE: 30G +QUICK MODE: DISABLED +===================== +IOPS (Read/Write) + Random: 98368 / 89200 + Sequential: 108513 / 107636 + CPU Idleness: 68% + +Bandwidth in KiB/sec (Read/Write) + Random: 542447 / 514487 + Sequential: 552052 / 521330 + CPU Idleness: 99% + +Latency in ns (Read/Write) + Random: 97222 / 44548 + Sequential: 40483 / 44690 + CPU Idleness: 72% +``` + +The plugin must: +1. Parse this text output from pod logs +2. Display it in a structured, readable table/card format +3. Distinguish IOPS, Bandwidth, and Latency sections +4. Show Read/Write separately +5. Indicate "higher is better" for IOPS/Bandwidth/CPU Idleness and "lower is better" for Latency + +### kbench UX Flow + +1. User navigates to the "Benchmark" section of the plugin +2. User selects a tns-csi StorageClass from a dropdown +3. 
User optionally configures: size (default 30G), namespace (default: `default`), mode (default: `full`) +4. User clicks "Run Benchmark" — shows confirmation dialog explaining duration (~6 min) and resource requirements +5. Plugin creates PVC + Job via `ApiProxy.request` (POST to Kubernetes API) +6. Plugin polls Job status every 10 seconds, showing progress (Pending → Running → Complete/Failed) +7. When Job completes, plugin fetches logs and parses the FIO summary +8. Results displayed in a structured card with sections for IOPS, Bandwidth, Latency +9. User can dismiss results or run another benchmark +10. Past benchmark results are listed (fetched from existing kbench Jobs with label `app.kubernetes.io/managed-by: headlamp-tns-csi-plugin`) +11. Cleanup: offer a button to delete the Job + PVC when done + +--- + +## Headlamp Plugin Development Guide + +### Project Bootstrap + +```bash +npx @kinvolk/headlamp-plugin create headlamp-tns-csi-plugin +cd headlamp-tns-csi-plugin +npm install +npm start # dev server with hot reload +``` + +### package.json + +```json +{ + "name": "headlamp-tns-csi-plugin", + "version": "0.1.0", + "description": "Headlamp plugin for TNS-CSI driver visibility and benchmarking", + "license": "Apache-2.0", + "scripts": { + "start": "headlamp-plugin start", + "build": "headlamp-plugin build", + "package": "headlamp-plugin package", + "tsc": "tsc --noEmit", + "lint": "eslint --ext .ts,.tsx src/", + "test": "vitest run", + "test:watch": "vitest" + }, + "devDependencies": { + "@kinvolk/headlamp-plugin": "^0.13.0" + } +} +``` + +### Key Registration APIs + +All imports from `@kinvolk/headlamp-plugin/lib`: + +```typescript +import { + registerRoute, + registerSidebarEntry, + registerDetailsViewSection, + registerAppBarAction, + registerPluginSettings, +} from '@kinvolk/headlamp-plugin/lib'; +``` + +**Sidebar Entry:** +```typescript +registerSidebarEntry({ + parent: null, + name: 'tns-csi', + label: 'TNS CSI', + url: '/tns-csi', + icon: 'mdi:database', 
// MDI icon name +}); + +registerSidebarEntry({ + parent: 'tns-csi', + name: 'tns-csi-overview', + label: 'Overview', + url: '/tns-csi', + icon: 'mdi:view-dashboard', +}); +``` + +**Route:** +```typescript +registerRoute({ + path: '/tns-csi', + sidebar: 'tns-csi-overview', + name: 'tns-csi-overview', + exact: true, + component: () => <OverviewPage />, +}); +``` + +**Details View Section** (to inject tns-csi info on PVC/PV detail pages): +```typescript +registerDetailsViewSection(({ resource }) => { + if (resource?.kind !== 'PersistentVolumeClaim') return null; + // Only for tns-csi PVCs (check storageClassName or bound PV driver) + return <PVCDetailSection resource={resource} />; +}); +``` + +### K8s Resource Hooks + +```typescript +import { K8s } from '@kinvolk/headlamp-plugin/lib'; + +// List StorageClasses +const [storageClasses, error] = K8s.ResourceClasses.StorageClass.useList(); + +// List PVCs +const [pvcs, error] = K8s.ResourceClasses.PersistentVolumeClaim.useList({ namespace: '' }); + +// List PVs (cluster-scoped) +const [pvs, error] = K8s.ResourceClasses.PersistentVolume.useList(); + +// List Jobs +const [jobs, error] = K8s.ResourceClasses.Job.useList({ namespace: 'default' }); + +// Custom Resources (VolumeSnapshots) +// Use K8s.makeCustomResourceClass or ApiProxy.request for CRDs +``` + +### ApiProxy for Custom Requests + +```typescript +import { ApiProxy } from '@kinvolk/headlamp-plugin/lib'; + +// Fetch pod logs +const logs = await ApiProxy.request( + `/api/v1/namespaces/${namespace}/pods/${podName}/log?container=kbench&timestamps=false` +); + +// Create a Job +await ApiProxy.request('/apis/batch/v1/namespaces/default/jobs', { + method: 'POST', + body: JSON.stringify(jobManifest), + headers: { 'Content-Type': 'application/json' }, +}); + +// Fetch metrics (via proxy to controller pod) +const metricsText = await ApiProxy.request( + `/api/v1/namespaces/kube-system/pods/${controllerPodName}:8080/proxy/metrics` +); +``` + +### Common UI Components + +All from `@kinvolk/headlamp-plugin/lib/CommonComponents`: +
+```typescript +import { + SectionBox, + SectionHeader, + SimpleTable, + NameValueTable, + StatusLabel, + Loader, + PercentageBar, + PercentageCircle, + Link, +} from '@kinvolk/headlamp-plugin/lib/CommonComponents'; +``` + +Usage patterns (from existing headlamp-polaris-plugin): +- `<SectionBox>` — card-style container with title +- `<SectionHeader>` — page header +- `<SimpleTable>` — sortable data table +- `<NameValueTable>` — two-column key-value display +- `<StatusLabel>text</StatusLabel>` — colored badge +- `<Loader>` — loading spinner +- `<PercentageCircle>` — donut chart +- `<PercentageBar>` — horizontal bar breakdown + +### Data Pattern: Context + Hook + +Follow the pattern from headlamp-polaris-plugin: + +```typescript +// src/api/TnsCsiDataContext.tsx +import React, { createContext, useContext, useState } from 'react'; + +interface TnsCsiContextType { + storageClasses: StorageClass[] | null; + pvs: PV[] | null; + pvcs: PVC[] | null; + loading: boolean; + error: string | null; + refresh: () => void; +} + +const TnsCsiContext = createContext<TnsCsiContextType | null>(null); + +export function TnsCsiDataProvider({ children }: { children: React.ReactNode }) { + // ...
fetch and provide data +} + +export function useTnsCsiContext() { + const ctx = useContext(TnsCsiContext); + if (!ctx) throw new Error('useTnsCsiContext must be used within TnsCsiDataProvider'); + return ctx; +} +``` + +### Testing + +Use **vitest** + **@testing-library/react** (as in headlamp-polaris-plugin): + +```typescript +// vitest.config.ts (auto-configured by headlamp-plugin) +// src/components/Overview.test.tsx + +import { render, screen } from '@testing-library/react'; +import { describe, it, expect, vi } from 'vitest'; + +vi.mock('@kinvolk/headlamp-plugin/lib', () => ({ + ApiProxy: { request: vi.fn() }, + K8s: { ResourceClasses: { StorageClass: { useList: vi.fn(() => [[], null]) } } }, +})); +``` + +--- + +## Plugin Architecture + +### File Structure + +``` +headlamp-tns-csi-plugin/ +├── package.json +├── tsconfig.json +├── src/ +│ ├── index.tsx # Plugin entry: register routes, sidebar, detail sections +│ ├── api/ +│ │ ├── k8s.ts # Helper functions: filter tns-csi resources, parse CSI attrs +│ │ ├── metrics.ts # Prometheus metrics parsing (text format) +│ │ ├── kbench.ts # kbench Job/PVC creation, log parsing, result types +│ │ └── TnsCsiDataContext.tsx # React context + provider for shared data +│ └── components/ +│ ├── OverviewPage.tsx # Main dashboard: driver health, stats summary +│ ├── StorageClassesPage.tsx # List of tns-csi StorageClasses +│ ├── VolumesPage.tsx # List of tns-csi PVs with PVC cross-reference +│ ├── SnapshotsPage.tsx # VolumeSnapshot list (tns-csi) +│ ├── MetricsPage.tsx # Prometheus metrics visualization +│ ├── BenchmarkPage.tsx # kbench trigger + results +│ ├── DriverStatusCard.tsx # Reusable: controller/node pod health +│ └── PVCDetailSection.tsx # Injected into PVC detail view +``` + +### Sidebar Navigation + +``` +TNS CSI (top-level, icon: mdi:database-cog) +├── Overview (/tns-csi) +├── Storage Classes (/tns-csi/storage-classes) +├── Volumes (/tns-csi/volumes) +├── Snapshots (/tns-csi/snapshots) +├── Metrics (/tns-csi/metrics) 
+└── Benchmark (/tns-csi/benchmark) +``` + +--- + +## Page Specifications + +### 1. Overview Page (`/tns-csi`) + +**Sections:** + +**Driver Status Card:** +- CSIDriver resource: name, attached, capabilities +- Controller pod(s): status, restarts, image version +- Node pod(s): status per node, restarts +- WebSocket connection health (from Prometheus `tns_websocket_connected`) + +**Storage Summary:** +- Total StorageClasses managed by tns-csi +- Breakdown by protocol (NFS / NVMe-oF / iSCSI) — using `PercentageBar` +- Total PVs, total capacity (sum of `spec.capacity.storage`) +- PVC status breakdown: Bound / Pending / Lost + +**Recent Activity:** +- Last N volume operations (inferred from recent PV creation timestamps) +- Any PVCs in non-Bound state (highlighted as warnings) + +### 2. Storage Classes Page (`/tns-csi/storage-classes`) + +**Filter**: `storageClass.provisioner === 'tns.csi.io'` + +**Table columns:** +| Column | Source | +|--------|--------| +| Name | `.metadata.name` | +| Protocol | `.parameters.protocol` (nfs/nvmeof/iscsi) | +| Pool | `.parameters.pool` | +| Server | `.parameters.server` | +| Reclaim Policy | `.reclaimPolicy` | +| Volume Binding | `.volumeBindingMode` | +| Allow Expansion | `.allowVolumeExpansion` | +| Delete Strategy | `.parameters.deleteStrategy` (retain/delete) | +| Encryption | `.parameters.encryption` (bool) | +| PV Count | (cross-ref from PV list) | + +Click row → detail panel showing all parameters + +### 3. 
Volumes Page (`/tns-csi/volumes`) + +**Filter**: `pv.spec.csi.driver === 'tns.csi.io'` + +**Table columns:** +| Column | Source | +|--------|--------| +| PVC Name | `.spec.claimRef.name` | +| Namespace | `.spec.claimRef.namespace` | +| Protocol | `.spec.csi.volumeAttributes.protocol` | +| Server | `.spec.csi.volumeAttributes.server` | +| Capacity | `.spec.capacity.storage` | +| Access Modes | `.spec.accessModes` | +| Reclaim Policy | `.spec.persistentVolumeReclaimPolicy` | +| Status | `.status.phase` (color-coded) | +| StorageClass | `.spec.storageClassName` | +| Age | `.metadata.creationTimestamp` | + +Click row → detail panel showing full CSI attributes and linked snapshot list + +### 4. Snapshots Page (`/tns-csi/snapshots`) + +**Resource**: `snapshot.storage.k8s.io/v1` VolumeSnapshot + +**Filter**: VolumeSnapshotClass's `driver === 'tns.csi.io'` +(fetch VolumeSnapshotClasses first, then filter VolumeSnapshots by snapshotClassName) + +Use `ApiProxy.request('/apis/snapshot.storage.k8s.io/v1/volumesnapshots')` since VolumeSnapshot is a CRD. + +**Table columns:** +| Column | Source | +|--------|--------| +| Name | `.metadata.name` | +| Namespace | `.metadata.namespace` | +| Source PVC | `.spec.source.persistentVolumeClaimName` | +| Snapshot Class | `.spec.volumeSnapshotClassName` | +| Ready | `.status.readyToUse` (boolean badge) | +| Size | `.status.restoreSize` | +| Age | `.metadata.creationTimestamp` | + +### 5. Metrics Page (`/tns-csi/metrics`) + +Fetch Prometheus metrics text via ApiProxy from the controller pod metrics endpoint. 
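
A minimal sketch of how the fetched metrics text might be parsed, assuming the controller serves the standard Prometheus text exposition format. The names `MetricSample`, `parsePrometheusText`, and `sumMetric` are hypothetical (they come from neither tns-csi nor headlamp), and this sketch drops non-finite values such as `NaN` and `+Inf` and does no histogram bucket handling.

```typescript
// Hypothetical types and helpers for src/api/metrics.ts; names are illustrative only.
interface MetricSample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Matches `name{labels} value` or `name value`; an optional trailing timestamp is ignored.
const SAMPLE_LINE = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$/;

// Extracts `key="value"` pairs from the text between the braces.
function parseLabels(labelText: string): Record<string, string> {
  const labels: Record<string, string> = {};
  const pairPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"/g;
  let pair: RegExpExecArray | null;
  while ((pair = pairPattern.exec(labelText)) !== null) {
    const key = pair[1];
    const value = pair[2];
    if (key !== undefined && value !== undefined) labels[key] = value;
  }
  return labels;
}

function parsePrometheusText(text: string): MetricSample[] {
  const samples: MetricSample[] = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blank lines and HELP/TYPE comments
    const match = SAMPLE_LINE.exec(line);
    if (!match) continue; // ignore anything that is not a sample line
    const name = match[1];
    const rawValue = match[3];
    if (name === undefined || rawValue === undefined) continue;
    const value = Number(rawValue);
    if (!Number.isFinite(value)) continue; // this sketch drops NaN and +Inf/-Inf
    samples.push({ name, labels: parseLabels(match[2] ?? ''), value });
  }
  return samples;
}

// Sums every sample of one metric whose labels include all pairs in `where`.
function sumMetric(
  samples: MetricSample[],
  name: string,
  where: Record<string, string> = {}
): number {
  return samples
    .filter(s => s.name === name && Object.keys(where).every(k => s.labels[k] === where[k]))
    .reduce((total, s) => total + s.value, 0);
}
```

With samples parsed this way, a card could read `sumMetric(samples, 'tns_websocket_connected')` for the connection indicator, or pass a label filter such as `{ protocol: 'nfs' }` to `sumMetric` for a per-protocol breakdown.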
+ +Display in cards: + +**WebSocket Health:** +- Connection status (green/red indicator from `tns_websocket_connected`) +- Total reconnects (`tns_websocket_reconnects_total`) +- Messages sent/received (`tns_websocket_messages_total`) + +**Volume Operations:** +- Operations by protocol (`tns_volume_operations_total`) +- Error rate per protocol/operation +- Total provisioned capacity (from `tns_volume_capacity_bytes`) + +**CSI Operations:** +- Operation counts by method (`tns_csi_operations_total`) +- Error rates + +Include a "Refresh" button and last-updated timestamp. + +Note: If the controller pod cannot be found or metrics are unavailable, display a helpful message explaining how metrics are configured. + +### 6. Benchmark Page (`/tns-csi/benchmark`) + +#### Run New Benchmark Section + +**Form:** +- **Storage Class** (required): dropdown of tns-csi StorageClasses +- **Namespace**: text input, default `default` +- **Test Size**: text input, default `30G` (with note: must be ~10% smaller than PVC) +- **Mode**: select — `full` (default), `quick`, or specific modes (random-read-iops, etc.) + +**Run Button** → opens confirmation dialog: +> "This will create a ~33Gi PVC and run FIO benchmark (~6 minutes). The Job and PVC will remain until manually deleted. Continue?" + +After confirmation: +1. Generate unique suffix (short UUID) +2. Create PVC via POST to `/api/v1/namespaces/{ns}/persistentvolumeclaims` +3. Create Job via POST to `/apis/batch/v1/namespaces/{ns}/jobs` +4. Show status: "Creating PVC... → Waiting for PVC to bind... → Job running... → Parsing results..."
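
The creation steps above could be backed by a pure helper that builds both manifests and their POST targets, which keeps the logic unit-testable without a cluster. This is a sketch only: `KbenchOptions` and `buildKbenchManifests` are hypothetical names, the suffix is assumed to be generated by the caller, and the manifest fields mirror the YAML in the kbench Integration section. PersistentVolumeClaims belong to the core API group, so their collection path starts with `/api/v1`, while Jobs live under `/apis/batch/v1`.

```typescript
// Hypothetical helper for src/api/kbench.ts; names and signature are illustrative only.
interface KbenchOptions {
  storageClass: string;
  namespace: string;
  suffix: string;   // short unique suffix supplied by the caller
  testSize: string; // passed to kbench as SIZE, e.g. "30G"
  pvcSize: string;  // PVC request, e.g. "33Gi" (the test size must be smaller than this)
  mode: string;     // e.g. "full"
}

const PLUGIN_LABELS = { 'app.kubernetes.io/managed-by': 'headlamp-tns-csi-plugin' };

function buildKbenchManifests(opts: KbenchOptions) {
  const pvcName = `kbench-pvc-${opts.suffix}`;
  const jobName = `kbench-${opts.suffix}`;

  const pvc = {
    apiVersion: 'v1',
    kind: 'PersistentVolumeClaim',
    metadata: { name: pvcName, namespace: opts.namespace, labels: { ...PLUGIN_LABELS } },
    spec: {
      storageClassName: opts.storageClass,
      accessModes: ['ReadWriteOnce'],
      resources: { requests: { storage: opts.pvcSize } },
    },
  };

  const job = {
    apiVersion: 'batch/v1',
    kind: 'Job',
    metadata: {
      name: jobName,
      namespace: opts.namespace,
      labels: { ...PLUGIN_LABELS, kbench: 'fio' },
    },
    spec: {
      backoffLimit: 0, // a failed benchmark is reported, not retried
      template: {
        metadata: { labels: { kbench: 'fio' } },
        spec: {
          restartPolicy: 'Never',
          containers: [
            {
              name: 'kbench',
              image: 'yasker/kbench:latest',
              env: [
                { name: 'MODE', value: opts.mode },
                { name: 'FILE_NAME', value: '/volume/test' },
                { name: 'SIZE', value: opts.testSize },
                { name: 'CPU_IDLE_PROF', value: 'disabled' },
              ],
              volumeMounts: [{ name: 'vol', mountPath: '/volume/' }],
            },
          ],
          volumes: [{ name: 'vol', persistentVolumeClaim: { claimName: pvcName } }],
        },
      },
    },
  };

  return {
    pvc,
    job,
    pvcUrl: `/api/v1/namespaces/${opts.namespace}/persistentvolumeclaims`,
    jobUrl: `/apis/batch/v1/namespaces/${opts.namespace}/jobs`,
  };
}
```

Each manifest could then be sent with `ApiProxy.request(url, { method: 'POST', body: JSON.stringify(manifest), headers: { 'Content-Type': 'application/json' } })`, PVC first and Job second.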
+ +**Progress Polling** (every 10 seconds): +- Fetch Job status +- Show phase: `Pending` / `Active` / `Succeeded` / `Failed` +- Show pod status if available + +#### Results Display + +When Job succeeds, fetch logs and parse the FIO summary text: + +```typescript +interface KbenchResult { + iops: { + randomRead: number; + randomWrite: number; + sequentialRead: number; + sequentialWrite: number; + cpuIdleness: number; + }; + bandwidth: { + randomRead: number; + randomWrite: number; + sequentialRead: number; + sequentialWrite: number; + cpuIdleness: number; + }; + latency: { + randomRead: number; + randomWrite: number; + sequentialRead: number; + sequentialWrite: number; + cpuIdleness: number; + }; + metadata: { + storageClass: string; + size: string; + startedAt: string; + completedAt: string; + jobName: string; + namespace: string; + }; +} +``` + +Display results in three cards (IOPS, Bandwidth, Latency), each with a table: +| Metric | Read | Write | Note | +|--------|------|-------|------| +| Random | ... | ... | Higher is better | +| Sequential | ... | ... | Higher is better | +| CPU Idleness | ... | - | Higher is better | + +For Latency: "Lower is better" note instead. + +Format values: +- IOPS: thousands separator (e.g., `98,368`) +- Bandwidth: human-readable (e.g., `529 MB/s`) +- Latency: microseconds or milliseconds (e.g., `97 µs`) + +#### Past Benchmarks List + +List existing Jobs with label `app.kubernetes.io/managed-by: headlamp-tns-csi-plugin` and `kbench: fio`: +| Column | Value | +|--------|-------| +| Job Name | link to Job detail | +| Namespace | namespace | +| Storage Class | (from Job annotations or labels) | +| Status | Active/Complete/Failed | +| Started | creation timestamp | +| Actions | "View Results" / "Delete" | + +**Delete** action removes both the Job and the PVC. 
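
A possible shape for the log parser and value formatters behind the results cards, assuming the summary layout shown under Result Format. `SectionResult`, `parseKbenchLog`, and the three formatters are hypothetical names. The parser returns `null` instead of a partial result when any of the three sections is incomplete, so callers can fall back to showing raw logs. Bandwidth is converted from KiB/s to MiB/s with one decimal, which is one reasonable reading of "human-readable".

```typescript
// Hypothetical parser for src/api/kbench.ts; names are illustrative only.
interface SectionResult {
  randomRead: number;
  randomWrite: number;
  sequentialRead: number;
  sequentialWrite: number;
  cpuIdleness: number;
}
type SectionName = 'iops' | 'bandwidth' | 'latency';
type KbenchSummary = Record<SectionName, SectionResult>;

const SECTION_HEADERS: ReadonlyArray<[string, SectionName]> = [
  ['IOPS', 'iops'],
  ['Bandwidth', 'bandwidth'],
  ['Latency', 'latency'],
];

// Tolerates thousands separators in case a kbench build emits them.
const toNumber = (digits: string): number => Number(digits.replace(/,/g, ''));

function isComplete(s: Partial<SectionResult> | undefined): s is SectionResult {
  return (
    s !== undefined &&
    s.randomRead !== undefined &&
    s.randomWrite !== undefined &&
    s.sequentialRead !== undefined &&
    s.sequentialWrite !== undefined &&
    s.cpuIdleness !== undefined
  );
}

function parseKbenchLog(log: string): KbenchSummary | null {
  const sections: Partial<Record<SectionName, Partial<SectionResult>>> = {};
  let current: Partial<SectionResult> | undefined;
  for (const rawLine of log.split('\n')) {
    const line = rawLine.trim();
    const header = SECTION_HEADERS.find(([prefix]) => line.startsWith(prefix));
    if (header) {
      current = {}; // a section header starts a fresh accumulator
      sections[header[1]] = current;
      continue;
    }
    if (!current) continue; // still in the preamble before the first section
    const pair = /^(Random|Sequential):\s*([\d,]+)\s*\/\s*([\d,]+)/.exec(line);
    const idle = /^CPU Idleness:\s*(\d+)%/.exec(line);
    if (pair && pair[2] !== undefined && pair[3] !== undefined) {
      if (pair[1] === 'Random') {
        current.randomRead = toNumber(pair[2]);
        current.randomWrite = toNumber(pair[3]);
      } else {
        current.sequentialRead = toNumber(pair[2]);
        current.sequentialWrite = toNumber(pair[3]);
      }
    } else if (idle && idle[1] !== undefined) {
      current.cpuIdleness = toNumber(idle[1]);
    }
  }
  const { iops, bandwidth, latency } = sections;
  return isComplete(iops) && isComplete(bandwidth) && isComplete(latency)
    ? { iops, bandwidth, latency }
    : null;
}

const formatIops = (value: number): string => value.toLocaleString('en-US');
const formatBandwidth = (kibPerSec: number): string => `${(kibPerSec / 1024).toFixed(1)} MiB/s`;
const formatLatency = (ns: number): string =>
  ns >= 1e6 ? `${(ns / 1e6).toFixed(1)} ms` : `${Math.round(ns / 1000)} µs`;
```

Keeping the parser free of React and API calls means the sample summary from the kbench section can serve directly as a unit-test fixture.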
+ +--- + +## PVC Detail Section Injection + +Use `registerDetailsViewSection` to inject a "TNS-CSI Storage Details" section on PVC detail pages when the bound PV uses `tns.csi.io` as the CSI driver. + +Display: +- Protocol (NFS/NVMe-oF/iSCSI) — with icon +- Server (TrueNAS IP) +- ZFS pool +- StorageClass parameters relevant to this volume +- Link to Volumes page filtered to this PVC + +--- + +## Implementation Requirements + +### Filtering + +**StorageClass filter**: `sc.provisioner === 'tns.csi.io'` (`provisioner` is a top-level StorageClass field, not under `spec`) + +**PV filter**: `pv.spec.csi?.driver === 'tns.csi.io'` + +**PVC cross-reference**: For each tns-csi PV, find the PVC via `pv.spec.claimRef.{name,namespace}` + +**VolumeSnapshot filter**: +1. Get all VolumeSnapshotClasses: `GET /apis/snapshot.storage.k8s.io/v1/volumesnapshotclasses` +2. Filter where `.driver === 'tns.csi.io'` +3. Get all VolumeSnapshots: `GET /apis/snapshot.storage.k8s.io/v1/volumesnapshots` +4. Filter where `.spec.volumeSnapshotClassName` is in the tns-csi snapshot class names + +### Error Handling + +- **Driver not installed**: If no CSIDriver `tns.csi.io` exists, show a clear banner: "TNS-CSI driver not detected on this cluster. Install via Helm..." +- **No snapshots CRD**: If VolumeSnapshot CRDs are not present, show: "Volume snapshot CRDs not installed. See tns-csi documentation." +- **Metrics unavailable**: If controller pod not found or metrics request fails, show: "Metrics unavailable. Ensure controller pod is running with metrics enabled (port 8080)." +- **kbench Job fails**: Show job logs, offer to re-run or clean up + +### Important Developer Notes from tns-csi + +Based on the upstream documentation: + +1. **Early development warning**: The driver is NOT production-ready. The plugin UI should prominently note this on the Overview page. + +2. **NVMe-oF requires static IP**: Display a note on the NVMe-oF StorageClass detail that DHCP is not supported. + +3.
**Protocol-specific prerequisites**: Display prerequisite notes per protocol: + - NFS: `nfs-common` / `nfs-utils` on nodes + - NVMe-oF: `nvme-cli`, kernel modules `nvme-tcp`/`nvme-fabrics` + - iSCSI: `open-iscsi` on nodes + +4. **WebSocket API dependency**: The driver uses TrueNAS WebSocket API (`wss://`). Connection health is critical — the Metrics page `tns_websocket_connected` gauge is the primary health indicator. + +5. **Volume adoption**: Volumes tagged with `tns-csi:adoptable=true` can be adopted cross-cluster. This is surfaced as metadata on the PV detail section. + +6. **Provisioner ID**: Always use `tns.csi.io` (not `tns-csi` or variations). + +7. **Controller logs command** (show in troubleshooting section): + ``` + kubectl logs -n kube-system -l app.kubernetes.io/name=tns-csi-driver,app.kubernetes.io/component=controller + ``` + +### kbench Important Notes + +From the kbench documentation: +- **Test SIZE must be at least 10% smaller than PVC size** (default: 30G test in 33Gi PVC) +- For accurate results, **SIZE should be at least 25× the read/write bandwidth** to avoid cache effects +- A full benchmark takes **~6 minutes**; do not cancel mid-run +- Always test local storage baseline first for comparison +- **CPU Idleness for Latency benchmark should be ≥40%** — if lower, the result may be CPU-starved +- Lower read latency than local storage is a red flag (likely caching) +- Better write performance than local storage is almost impossible for distributed storage without cache + +Display these notes as info tooltips or a "Benchmark Guide" info panel. + +--- + +## Code Quality Requirements + +### TypeScript Checklist + +- [ ] `strict: true` in `tsconfig.json` with all compiler flags (`noUncheckedIndexedAccess`, `exactOptionalPropertyTypes`, etc.) 
+- [ ] Zero `any` — use `unknown` + type guards for external data (API responses, log parsing) +- [ ] All public APIs have 100% type coverage +- [ ] `import type` used for type-only imports +- [ ] All K8s resource shapes typed — use `KubeObject` base type from headlamp where available +- [ ] Discriminated unions for all state machines (benchmark flow, snapshot CRD availability) +- [ ] Type guards at every external data boundary (API response parsing, Prometheus text parsing, pod log parsing) +- [ ] No `@ts-ignore` without inline explanation comment + +### React Checklist + +- [ ] Functional components with hooks only — no class components +- [ ] All `useEffect` dependency arrays correct — no stale closures, no missing deps +- [ ] `useMemo` on expensive filtering (tns-csi PV/PVC cross-reference computation) +- [ ] `useCallback` for stable event handlers passed as props (open/close panel, refresh) +- [ ] Context values memoized to prevent unnecessary re-renders +- [ ] ARIA labels on all interactive elements (buttons, selects, drawer controls) +- [ ] Keyboard navigation: Escape closes detail panels +- [ ] URL hash state for detail panel (matching polaris plugin pattern) +- [ ] Use headlamp's built-in component library exclusively — **do NOT add MUI, Ant Design, or other UI libraries** + +### Error Boundary Pattern + +Wrap each page with the exact loading/error pattern from `headlamp-polaris-plugin`: + +```typescript +if (loading) return ; +if (error) return ( + + {error} }]} /> + +); +if (!data) return ( + + + +); +``` + +### Kubernetes Checklist + +- [ ] Check CSIDriver `tns.csi.io` existence before rendering any pages — show install banner if absent +- [ ] VolumeSnapshot CRD availability checked before Snapshots page renders — show degraded state if absent +- [ ] Metrics endpoint access via API proxy (`/api/v1/namespaces/kube-system/pods/:8080/proxy/metrics`) — handle 404/timeout +- [ ] kbench Job/PVC labeled with `app.kubernetes.io/managed-by: 
headlamp-tns-csi-plugin` for tracking +- [ ] kbench PVC cleanup offered after benchmark completion — never auto-delete without user confirmation +- [ ] Use correct label selectors for tns-csi pods: + - Controller: `app.kubernetes.io/name=tns-csi-driver,app.kubernetes.io/component=controller` + - Node: `app.kubernetes.io/name=tns-csi-driver,app.kubernetes.io/component=node` + +### Plugin Settings + +Register plugin settings for configurable options: +- Default namespace for kbench jobs +- Metrics refresh interval (default: 60s) +- Automatically cleanup completed kbench jobs (bool, default: false) + +```typescript +registerPluginSettings('headlamp-tns-csi-plugin', SettingsComponent, true); +``` + +--- + +## Reference: Existing Plugin Patterns + +Study the `headlamp-polaris-plugin` at `../headlamp-polaris-plugin/` for patterns: + +**index.tsx**: `registerSidebarEntry`, `registerRoute`, `registerDetailsViewSection`, `registerAppBarAction`, `registerPluginSettings` + +**Data context pattern**: `PolarisDataProvider` → `usePolarisDataContext()` — replicate this for tns-csi data + +**Component patterns**: +- `DashboardView.tsx`: `SectionHeader` + multiple `SectionBox` + `PercentageCircle` + `PercentageBar` + `SimpleTable` +- `NamespacesListView.tsx`: `SimpleTable` with click handlers, slide-in detail panel, keyboard navigation (Escape to close), URL hash state + +**API pattern**: `ApiProxy.request(url)` for all Kubernetes API calls, including CRDs + +**Testing pattern**: `vitest` + `vi.mock('@kinvolk/headlamp-plugin/lib', ...)` for mocking K8s APIs + +--- + +## Deliverables + +Implement the complete plugin with: + +1. **`src/index.tsx`** — entry point with all registrations +2. **`src/api/k8s.ts`** — K8s helper functions and type definitions +3. **`src/api/metrics.ts`** — Prometheus text format parser +4. **`src/api/kbench.ts`** — kbench Job management and log parser +5. **`src/api/TnsCsiDataContext.tsx`** — React context provider +6. 
**`src/components/OverviewPage.tsx`** +7. **`src/components/StorageClassesPage.tsx`** +8. **`src/components/VolumesPage.tsx`** +9. **`src/components/SnapshotsPage.tsx`** +10. **`src/components/MetricsPage.tsx`** +11. **`src/components/BenchmarkPage.tsx`** +12. **`src/components/DriverStatusCard.tsx`** +13. **`src/components/PVCDetailSection.tsx`** +14. **Unit tests** for all API modules and key components +15. **`package.json`** with correct headlamp-plugin dependency + +The plugin must be buildable with `npm run build` and loadable by headlamp without errors.
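
As one possible starting point for `src/api/k8s.ts`, the filtering rules from the Implementation Requirements could be sketched over raw API JSON, treating every input as `unknown` until it is checked. All names here (`getPath`, `isTnsCsiStorageClass`, `isTnsCsiPersistentVolume`, `filterTnsCsiSnapshots`) are hypothetical. The sketch assumes plain JSON objects rather than headlamp's resource wrappers, and it reads `provisioner` from the top level of the StorageClass object, as the Storage Classes page filter does.

```typescript
// Hypothetical helpers for src/api/k8s.ts; names are illustrative only.
const TNS_CSI_DRIVER = 'tns.csi.io';

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Walks a key path through unknown data, returning undefined on any mismatch.
function getPath(value: unknown, path: string[]): unknown {
  let current: unknown = value;
  for (const key of path) {
    if (!isRecord(current)) return undefined;
    current = current[key];
  }
  return current;
}

function isTnsCsiStorageClass(obj: unknown): boolean {
  return getPath(obj, ['provisioner']) === TNS_CSI_DRIVER;
}

function isTnsCsiPersistentVolume(obj: unknown): boolean {
  return getPath(obj, ['spec', 'csi', 'driver']) === TNS_CSI_DRIVER;
}

// Keeps only snapshots whose class is backed by the tns-csi driver.
function filterTnsCsiSnapshots(snapshots: unknown[], snapshotClasses: unknown[]): unknown[] {
  const classNames = new Set<string>();
  for (const cls of snapshotClasses) {
    const name = getPath(cls, ['metadata', 'name']);
    if (getPath(cls, ['driver']) === TNS_CSI_DRIVER && typeof name === 'string') {
      classNames.add(name);
    }
  }
  return snapshots.filter(snapshot => {
    const className = getPath(snapshot, ['spec', 'volumeSnapshotClassName']);
    return typeof className === 'string' && classNames.has(className);
  });
}
```

Because malformed or unexpected objects simply fail the predicates, list pages can filter API responses without extra try/catch blocks.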