Feat/temporal (#46)

* refactor: modularize claude-executor and extract shared utilities
  - Extract message handling into src/ai/message-handlers.ts with pure functions
  - Extract output formatting into src/ai/output-formatters.ts
  - Extract progress management into src/ai/progress-manager.ts
  - Add audit-logger.ts with Null Object pattern for optional logging
  - Add shared utilities: formatting.ts, file-io.ts, functional.ts
  - Consolidate getPromptNameForAgent into src/types/agents.ts

* feat: add Claude Code custom commands for debug and review

* feat: add Temporal integration foundation (phase 1-2)
  - Add Temporal SDK dependencies (@temporalio/client, worker, workflow, activity)
  - Add shared types for pipeline state, metrics, and progress queries
  - Add classifyErrorForTemporal() for retry behavior classification
  - Add docker-compose for Temporal server with SQLite persistence

* feat: add Temporal activities for agent execution (phase 3)
  - Add activities.ts with heartbeat loop, git checkpoint/rollback, and error classification
  - Export runClaudePrompt, validateAgentOutput, ClaudePromptResult for Temporal use
  - Track attempt number via Temporal Context for accurate audit logging
  - Rollback git workspace before retry to ensure clean state

* feat: add Temporal workflow for 5-phase pipeline orchestration (phase 4)

* feat: add Temporal worker, client, and query tools (phase 5)
  - Add worker.ts with workflow bundling and graceful shutdown
  - Add client.ts CLI to start pipelines with progress polling
  - Add query.ts CLI to inspect running workflow state
  - Fix buffer overflow by truncating error messages and stack traces
  - Skip git operations gracefully on non-git repositories
  - Add kill.sh/start.sh dev scripts and Dockerfile.worker

* feat: fix Docker worker container setup
  - Install uv instead of deprecated uvx package
  - Add mcp-server and configs directories to container
  - Mount target repo dynamically via TARGET_REPO env variable

* fix: add report assembly step to Temporal workflow
  - Add assembleReportActivity to concatenate exploitation evidence files before report agent runs
  - Call assembleFinalReport in workflow Phase 5 before runReportAgent
  - Ensure deliverables directory exists before writing final report
  - Simplify pipeline-testing report prompt to just prepend header

* refactor: consolidate Docker setup to root docker-compose.yml

* feat: improve Temporal client UX and env handling
  - Change default to fire-and-forget (--wait flag to opt-in)
  - Add splash screen and improve console output formatting
  - Add .env to gitignore, remove from dockerignore for container access
  - Add Taskfile for common development commands

* refactor: simplify session ID handling and improve Taskfile options
  - Include hostname in workflow ID for better audit log organization
  - Extract sanitizeHostname utility to audit/utils.ts for reuse
  - Remove unused generateSessionLogPath and buildLogFilePath functions
  - Simplify Taskfile with CONFIG/OUTPUT/CLEAN named parameters

* chore: add .env.example and simplify .gitignore

* docs: update README and CLAUDE.md for Temporal workflow usage
  - Replace Docker CLI instructions with Task-based commands
  - Add monitoring/stopping sections and workflow examples
  - Document Temporal orchestration layer and troubleshooting
  - Simplify file structure to key files overview

* refactor: replace Taskfile with bash CLI script
  - Add shannon bash script with start/logs/query/stop/help commands
  - Remove Taskfile.yml dependency (no longer requires Task installation)
  - Update README.md and CLAUDE.md to use ./shannon commands
  - Update client.ts output to show ./shannon commands

* docs: fix deliverable filename in README

* refactor: remove direct CLI and .shannon-store.json in favor of Temporal
  - Delete src/shannon.ts direct CLI entry point (Temporal is now the only mode)
  - Remove .shannon-store.json session lock (Temporal handles workflow deduplication)
  - Remove broken scripts/export-metrics.js (imported non-existent function)
  - Update package.json to remove main, start script, and bin entry
  - Clean up CLAUDE.md and debug.md to remove obsolete references

* chore: remove licensing comments from prompt files to prevent leaking into actual prompts

* fix: resolve parallel workflow race conditions and retry logic bugs
  - Fix save_deliverable race condition using closure pattern instead of global variable
  - Fix error classification order so OutputValidationError matches before generic validation
  - Fix ApplicationFailure re-classification bug by checking instanceof before re-throwing
  - Add per-error-type retry limits (3 for output validation, 50 for billing)
  - Add fast retry intervals for pipeline testing mode (10s vs 5min)
  - Increase worker concurrent activities to 25 for parallel workflows

* refactor: pipeline vuln→exploit workflow for parallel execution
  - Replace sync barrier between vuln/exploit phases with independent pipelines
  - Each vuln type runs: vuln agent → queue check → conditional exploit
  - Add checkExploitationQueue activity to skip exploits when no vulns found
  - Use Promise.allSettled for graceful failure handling across pipelines
  - Add PipelineSummary type for aggregated cost/duration/turns metrics

* fix: re-throw retryable errors in checkExploitationQueue

* fix: detect and retry on Claude Code spending cap errors
  - Add spending cap pattern detection in detectApiError() with retryable error
  - Add matching patterns to classifyErrorForTemporal() for proper Temporal retry
  - Add defense-in-depth safeguard in runClaudePrompt() for $0 cost / low turn detection
  - Add final sanity check in activities before declaring success

* fix: increase heartbeat timeout to prevent false worker-dead detection

  Original 30s timeout was from POC spec assuming <5min activities. With hour-long activities and multiple concurrent workflows sharing one worker, resource contention causes event loop stalls exceeding 30s, triggering false heartbeat timeouts. Increased to 10min (prod) and 5min (testing).

* fix: temporal db init

* fix: persist home dir

* feat: add per-workflow unified logging with ./shannon logs ID=<workflow-id>
  - Add WorkflowLogger class for human-readable, per-workflow log files
  - Create workflow.log in audit-logs/{workflowId}/ with phase, agent, tool, and LLM events
  - Update ./shannon logs to require ID param and tail specific workflow log
  - Add phase transition logging at workflow boundaries
  - Include workflow completion summary with agent breakdown (duration, cost)
  - Mount audit-logs volume in docker-compose for host access

---------

Co-authored-by: ezl-keygraph <ezhil@keygraph.io>
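
Reviewer note: the retry fixes in this message (classification order, per-error-type retry limits) can be illustrated with a minimal sketch. It is not the code from this commit. `classifyErrorSketch`, the rule table, and the message patterns are hypothetical, and treating generic validation errors as non-retryable is an assumption. Only the limits of 3 (output validation) and 50 (billing) come from the message above.

```typescript
// Hypothetical sketch; Shannon's real classifyErrorForTemporal() may differ.
type Classification = {
  type: string;        // label attached to the failure
  retryable: boolean;  // whether the orchestrator should retry
  maxAttempts: number; // per-error-type retry limit
};

// Rules are checked top to bottom, so specific patterns must come
// before generic ones ("output validation" before plain "validation").
const RULES: Array<{ pattern: RegExp; result: Classification }> = [
  {
    pattern: /output validation/i,
    result: { type: "OutputValidationError", retryable: true, maxAttempts: 3 },
  },
  {
    pattern: /spending cap|billing/i,
    result: { type: "BillingError", retryable: true, maxAttempts: 50 },
  },
  {
    // Assumption: generic validation failures are not worth retrying.
    pattern: /validation/i,
    result: { type: "ValidationError", retryable: false, maxAttempts: 1 },
  },
];

function classifyErrorSketch(message: string): Classification {
  for (const rule of RULES) {
    if (rule.pattern.test(message)) return rule.result;
  }
  // Assumed default for unknown errors: retry a few times.
  return { type: "UnknownError", retryable: true, maxAttempts: 3 };
}
```

If the generic `validation` rule were listed first, an output validation failure would match it and never be retried, which is the kind of ordering bug the message describes.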
2026-01-15 10:36:11 -08:00
parent 45acb16711
commit 51e621d0d5
77 changed files with 6117 additions and 2417 deletions
@@ -79,10 +79,11 @@ Shannon is available in two editions:
 - [Product Line](#-product-line)
 - [Setup & Usage Instructions](#-setup--usage-instructions)
  - [Prerequisites](#prerequisites)
-  - [Authentication Setup](#authentication-setup)
-  - [Quick Start with Docker](#quick-start-with-docker)
+  - [Quick Start](#quick-start)
+  - [Monitoring Progress](#monitoring-progress)
+  - [Stopping Shannon](#stopping-shannon)
+  - [Usage Examples](#usage-examples)
  - [Configuration (Optional)](#configuration-optional)
-  - [Usage Patterns](#usage-patterns)
  - [Output and Results](#output-and-results)
 - [Sample Reports & Benchmarks](#-sample-reports--benchmarks)
 - [Architecture](#-architecture)
@@ -98,36 +99,71 @@ Shannon is available in two editions:

 ### Prerequisites

- **Claude Console account with credits** - Required for AI-powered analysis
- **Docker installed** - Primary deployment method
+- **Docker** - Container runtime ([Install Docker](https://docs.docker.com/get-docker/))
+- **Anthropic API key or Claude Code OAuth token** - Get from [Anthropic Console](https://console.anthropic.com)

-### Authentication Setup
-
-You need either a **Claude Code OAuth token** or an **Anthropic API key** to run Shannon. Get your token from the [Anthropic Console](https://console.anthropic.com) and pass it to Docker via the `-e` flag.
-
-### Environment Configuration (Recommended)
-
-To prevent Claude Code from hitting token limits during long report generation, set the max output tokens environment variable:
-
-**For local runs:**
-```bash
-export CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000
-```
-
-**For Docker runs:**
-```bash
-e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000
-```
-
-### Quick Start with Docker
-
-#### Build the Container
+### Quick Start

 ```bash
-docker build -t shannon:latest .
+# 1. Clone Shannon
+git clone https://github.com/KeygraphHQ/shannon.git
+cd shannon
+
+# 2. Configure credentials (choose one method)
+
+# Option A: Export environment variables
+export ANTHROPIC_API_KEY="your-api-key"              # or CLAUDE_CODE_OAUTH_TOKEN
+export CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000           # recommended
+
+# Option B: Create a .env file
+cat > .env << 'EOF'
+ANTHROPIC_API_KEY=your-api-key
+CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000
+EOF
+
+# 3. Run a pentest
+./shannon start URL=https://your-app.com REPO=/path/to/your/repo
 ```

-#### Prepare Your Repository
+Shannon will build the containers, start the workflow, and return a workflow ID. The pentest runs in the background.
+
+### Monitoring Progress
+
+```bash
+# Tail the log of a specific workflow
+./shannon logs ID=<workflow-id>
+
+# Query a specific workflow's progress
+./shannon query ID=shannon-1234567890
+
+# Open the Temporal Web UI for detailed monitoring
+open http://localhost:8233
+```
+
+### Stopping Shannon
+
+```bash
+# Stop all containers (preserves workflow data)
+./shannon stop
+
+# Full cleanup (removes all data)
+./shannon stop CLEAN=true
+```
+
+### Usage Examples
+
+```bash
+# Basic pentest
+./shannon start URL=https://example.com REPO=/path/to/repo
+
+# With a configuration file
+./shannon start URL=https://example.com REPO=/path/to/repo CONFIG=./configs/my-config.yaml
+
+# Custom output directory
+./shannon start URL=https://example.com REPO=/path/to/repo OUTPUT=./my-reports
+```
+
+### Prepare Your Repository

 Shannon is designed for **web application security testing** and expects all application code to be available in a single directory structure. This works well for:

@@ -137,105 +173,35 @@ Shannon is designed for **web application security testing** and expects all app
 **For monorepos:**

 ```bash
-git clone https://github.com/your-org/your-monorepo.git repos/your-app
+git clone https://github.com/your-org/your-monorepo.git /path/to/your-app
 ```

 **For multi-repository applications** (e.g., separate frontend/backend):

 ```bash
-mkdir repos/your-app
-cd repos/your-app
+mkdir /path/to/your-app
+cd /path/to/your-app
 git clone https://github.com/your-org/frontend.git
 git clone https://github.com/your-org/backend.git
 git clone https://github.com/your-org/api.git
 ```

-**For existing local repositories:**
-
-```bash
-cp -r /path/to/your-existing-repo repos/your-app
-```
-
-#### Run Your First Pentest
-
-**With Claude Console OAuth Token:**
-
-```bash
-docker run --rm -it \
-      --network host \
-      --cap-add=NET_RAW \
-      --cap-add=NET_ADMIN \
-      -e CLAUDE_CODE_OAUTH_TOKEN="$CLAUDE_CODE_OAUTH_TOKEN" \
-      -e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 \
-      -v "$(pwd)/repos:/app/repos" \
-      -v "$(pwd)/configs:/app/configs" \
-      # Comment below line if using custom output directory
-      -v "$(pwd)/audit-logs:/app/audit-logs" \
-      shannon:latest \
-      "https://your-app.com/" \
-      "/app/repos/your-app" \
-      --config /app/configs/example-config.yaml
-      # Optional: uncomment below for custom output directory
-      # -v "$(pwd)/reports:/app/reports" \
-      # --output /app/reports
-```
-
-**With Anthropic API Key:**
-
-```bash
-docker run --rm -it \
-      --network host \
-      --cap-add=NET_RAW \
-      --cap-add=NET_ADMIN \
-      -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
-      -e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 \
-      -v "$(pwd)/repos:/app/repos" \
-      -v "$(pwd)/configs:/app/configs" \
-      # Comment below line if using custom output directory
-      -v "$(pwd)/audit-logs:/app/audit-logs" \
-      shannon:latest \
-      "https://your-app.com/" \
-      "/app/repos/your-app" \
-      --config /app/configs/example-config.yaml
-      # Optional: uncomment below for custom output directory
-      # -v "$(pwd)/reports:/app/reports" \
-      # --output /app/reports
-```
-
-#### Platform-Specific Instructions
+### Platform-Specific Instructions

 **For Linux (Native Docker):**

-Add the `--user $(id -u):$(id -g)` flag to the Docker commands above to avoid permission issues with volume mounts. Docker Desktop on macOS and Windows handles this automatically, but native Linux Docker requires explicit user mapping.
+You may need to run commands with `sudo` depending on your Docker setup. If you encounter permission issues with output files, ensure your user has access to the Docker socket.

-**Network Capabilities:**
+**For macOS:**

- `--cap-add=NET_RAW` - Enables advanced port scanning with nmap
- `--cap-add=NET_ADMIN` - Allows network administration for security tools
- `--network host` - Provides access to target network interfaces
+Works out of the box with Docker Desktop installed.

 **Testing Local Applications:**

 Docker containers cannot reach `localhost` on your host machine. Use `host.docker.internal` in place of `localhost`:

 ```bash
-docker run --rm -it \
-      --add-host=host.docker.internal:host-gateway \
-      --cap-add=NET_RAW \
-      --cap-add=NET_ADMIN \
-      -e CLAUDE_CODE_OAUTH_TOKEN="$CLAUDE_CODE_OAUTH_TOKEN" \
-      -e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 \
-      -v "$(pwd)/repos:/app/repos" \
-      -v "$(pwd)/configs:/app/configs" \
-      # Comment below line if using custom output directory
-      -v "$(pwd)/audit-logs:/app/audit-logs" \
-      shannon:latest \
-      "http://host.docker.internal:3000" \
-      "/app/repos/your-app" \
-      --config /app/configs/example-config.yaml
-      # Optional: uncomment below for custom output directory
-      # -v "$(pwd)/reports:/app/reports" \
-      # --output /app/reports
+./shannon start URL=http://host.docker.internal:3000 REPO=/path/to/repo
 ```

 ### Configuration (Optional)
@@ -288,12 +254,17 @@ If your application uses two-factor authentication, simply add the TOTP secret t

 ### Output and Results

-All results are saved to `./audit-logs/` by default. Use `--output <path>` to specify a custom directory. If using `--output`, ensure that path is mounted to an accessible host directory (e.g., `-v "$(pwd)/custom-directory:/app/reports"`).
+All results are saved to `./audit-logs/{hostname}_{sessionId}/` by default. Pass `OUTPUT=<path>` to `./shannon start` to specify a custom directory.

- **Pre-reconnaissance reports** - External scan results
- **Vulnerability assessments** - Potential vulnerabilities from thorough code analysis and network mapping
- **Exploitation results** - Proof-of-concept attempts
- **Executive reports** - Business-focused security summaries
+Output structure:
+```
+audit-logs/{hostname}_{sessionId}/
+├── session.json          # Metrics and session data
+├── agents/               # Per-agent execution logs
+├── prompts/              # Prompt snapshots for reproducibility
+└── deliverables/
+    └── comprehensive_security_assessment_report.md   # Final comprehensive security report
+```

 ---
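
Reviewer note: the output layout documented in the last hunk can be sketched as a small path builder. This is an illustration, not Shannon's code. `sanitizeHostnameSketch` and `auditLogPaths` are hypothetical names, the sanitization rule is an assumption (the real `sanitizeHostname` in audit/utils.ts may behave differently), and the session ID in the usage line is a made-up value.

```typescript
// Hypothetical sketch of the audit-logs/{hostname}_{sessionId}/ layout.
function sanitizeHostnameSketch(targetUrl: string): string {
  // URL is a global in Node.js and browsers.
  const { hostname } = new URL(targetUrl);
  // Assumed rule: keep letters, digits, dots, and hyphens only.
  return hostname.replace(/[^a-zA-Z0-9.-]/g, "-");
}

function auditLogPaths(targetUrl: string, sessionId: string, root = "./audit-logs") {
  const base = `${root}/${sanitizeHostnameSketch(targetUrl)}_${sessionId}`;
  return {
    base,
    sessionFile: `${base}/session.json`, // metrics and session data
    agentsDir: `${base}/agents`,         // per-agent execution logs
    promptsDir: `${base}/prompts`,       // prompt snapshots
    report: `${base}/deliverables/comprehensive_security_assessment_report.md`,
  };
}

// Usage with a made-up session ID:
const paths = auditLogPaths("https://example.com/login", "abc123");
console.log(paths.report);
```

Deriving every path from one `base` string keeps the directory name and the deliverable location in sync if the naming scheme changes.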