docs: update CLAUDE.md and commands for services-layer architecture

This commit is contained in:
ajmallesh
2026-02-16 18:15:52 -08:00
parent 16de74e0be
commit 413c47af5c
3 changed files with 58 additions and 46 deletions
+22 -14
View File
@@ -8,13 +8,14 @@ You are debugging an issue. Follow this structured approach to avoid spinning in
- Read the full error message and stack trace - Read the full error message and stack trace
- Identify the layer where the error originated: - Identify the layer where the error originated:
- **CLI/Args** - Input validation, path resolution - **CLI/Args** - Input validation, path resolution
- **Config Parsing** - YAML parsing, JSON Schema validation - **Config Parsing** - YAML parsing, JSON Schema validation (`src/config-parser.ts`)
- **Session Management** - Mutex, session.json, lock files - **Session Management** - Agent definitions (`src/session-manager.ts`), mutex (`src/utils/concurrency.ts`)
- **Audit System** - Logging, metrics tracking, atomic writes - **DI Container** - Container initialization/lookup (`src/services/container.ts`)
- **Claude SDK** - Agent execution, MCP servers, turn handling - **Services** - AgentExecutionService, ConfigLoaderService, ExploitationCheckerService, error-handling (`src/services/`)
- **Git Operations** - Checkpoints, rollback, commit - **Audit System** - Logging, metrics tracking, atomic writes (`src/audit/`)
- **Tool Execution** - nmap, subfinder, whatweb - **Claude SDK** - Agent execution, MCP servers, turn handling (`src/ai/claude-executor.ts`)
- **Validation** - Deliverable checks, queue validation - **Git Operations** - Checkpoints, rollback, commit (`src/services/git-manager.ts`)
- **Validation** - Deliverable checks, queue validation (`src/services/queue-validation.ts`)
## Step 2: Check Relevant Logs ## Step 2: Check Relevant Logs
@@ -37,12 +38,14 @@ For Shannon, trace through these layers:
1. **Temporal Client**`src/temporal/client.ts` - Workflow initiation 1. **Temporal Client**`src/temporal/client.ts` - Workflow initiation
2. **Workflow**`src/temporal/workflows.ts` - Pipeline orchestration 2. **Workflow**`src/temporal/workflows.ts` - Pipeline orchestration
3. **Activities**`src/temporal/activities.ts` - Agent execution with heartbeats 3. **Activities**`src/temporal/activities.ts` - Thin wrappers: heartbeat, error classification
4. **Config**`src/config-parser.ts` - YAML loading, schema validation 4. **Container**`src/services/container.ts` - Per-workflow DI
5. **Session**`src/session-manager.ts` - Agent definitions, execution order 5. **Services**`src/services/agent-execution.ts` - Agent lifecycle
6. **Audit**`src/audit/audit-session.ts` - Logging facade, metrics tracking 6. **Config**`src/config-parser.ts` via `src/services/config-loader.ts`
7. **Executor**`src/ai/claude-executor.ts` - SDK calls, MCP setup, retry logic 7. **Prompts**`src/services/prompt-manager.ts`
8. **Validation**`src/queue-validation.ts` - Deliverable checks 8. **Audit**`src/audit/audit-session.ts` - Logging facade, metrics tracking
9. **Executor**`src/ai/claude-executor.ts` - SDK calls, MCP setup, retry logic
10. **Validation**`src/services/queue-validation.ts` - Deliverable checks
## Step 4: Identify Root Cause ## Step 4: Identify Root Cause
@@ -58,7 +61,10 @@ For Shannon, trace through these layers:
| Cost/timing not tracked | Metrics not reloaded before update | Add `metricsTracker.reload()` before updates | | Cost/timing not tracked | Metrics not reloaded before update | Add `metricsTracker.reload()` before updates |
| session.json corrupted | Partial write during crash | Delete and restart, or restore from backup | | session.json corrupted | Partial write during crash | Delete and restart, or restore from backup |
| YAML config rejected | Invalid schema or unsafe content | Run through AJV validator manually | | YAML config rejected | Invalid schema or unsafe content | Run through AJV validator manually |
| Prompt variable not replaced | Missing `{{VARIABLE}}` in context | Check `prompt-manager.ts` interpolation | | Prompt variable not replaced | Missing `{{VARIABLE}}` in context | Check `src/services/prompt-manager.ts` interpolation |
| Service returns Err result | Check `ErrorCode` in Result | Trace through `classifyErrorForTemporal()` in `src/services/error-handling.ts` |
| Container not found | `getOrCreateContainer()` not called | Check activity setup code in `src/temporal/activities.ts` |
| ActivityLogger undefined | `createActivityLogger()` not called | Must be called at top of each activity function |
**MCP Server Issues:** **MCP Server Issues:**
```bash ```bash
@@ -123,6 +129,8 @@ shannon <URL> <REPO> --pipeline-testing
## Quick Reference: Error Types ## Quick Reference: Error Types
`ErrorCode` enum in `src/types/errors.ts` provides finer-grained classification used by `classifyErrorForTemporal()` in `src/services/error-handling.ts`.
| PentestError Type | Meaning | Retryable? | | PentestError Type | Meaning | Retryable? |
|-------------------|---------|------------| |-------------------|---------|------------|
| `config` | Configuration file issues | No | | `config` | Configuration file issues | No |
+11
View File
@@ -19,6 +19,8 @@ git diff HEAD
- [ ] **Retryable flag matches behavior** - If error will be retried, set `retryable: true` - [ ] **Retryable flag matches behavior** - If error will be retried, set `retryable: true`
- [ ] **Context includes debugging info** - Add relevant paths, tool names, error codes to context object - [ ] **Context includes debugging info** - Add relevant paths, tool names, error codes to context object
- [ ] **Never swallow errors silently** - Always log or propagate errors - [ ] **Never swallow errors silently** - Always log or propagate errors
- [ ] **Use ErrorCode enum** - Prefer `ErrorCode.CONFIG_INVALID` over string matching for classification
- [ ] **Result<T,E> for service returns** - Services return `Result`, not throw
### Audit System & Concurrency (CRITICAL) ### Audit System & Concurrency (CRITICAL)
- [ ] **Mutex protection for parallel operations** - Use `sessionMutex.lock()` when updating `session.json` during parallel agent execution - [ ] **Mutex protection for parallel operations** - Use `sessionMutex.lock()` when updating `session.json` during parallel agent execution
@@ -41,6 +43,13 @@ git diff HEAD
- [ ] **Duplicate rule detection** - Same `type:url_path` cannot appear twice - [ ] **Duplicate rule detection** - Same `type:url_path` cannot appear twice
- [ ] **JSON Schema validation before use** - Config must pass AJV validation - [ ] **JSON Schema validation before use** - Config must pass AJV validation
### Services Layer & DI Container (CRITICAL)
- [ ] **Business logic in services, not activities** — Activities: heartbeat loop, error classification, container calls only. Domain logic → `src/services/`
- [ ] **Services accept ActivityLogger** — Never import `@temporalio/*` in services. Use `ActivityLogger` interface from `src/types/`
- [ ] **Result type for fallible operations** — Service methods return `Result<T, PentestError>`, unwrap with `isOk()`/`isErr()`. Activities call `executeOrThrow()` at the boundary
- [ ] **Container lifecycle**`getOrCreateContainer()` at activity start, `removeContainer()` only in workflow cleanup
- [ ] **AuditSession not in container** — Must be passed per-agent call (parallel safety)
### Session & Agent Management (CRITICAL) ### Session & Agent Management (CRITICAL)
- [ ] **Deliverable dependencies respected** - Exploitation agents only run if vulnerability queue exists AND has items - [ ] **Deliverable dependencies respected** - Exploitation agents only run if vulnerability queue exists AND has items
- [ ] **Queue validation before exploitation** - Use `safeValidateQueueAndDeliverable()` to check eligibility - [ ] **Queue validation before exploitation** - Use `safeValidateQueueAndDeliverable()` to check eligibility
@@ -91,6 +100,8 @@ git diff HEAD
- [ ] **Duplicate retry logic** - Don't implement retry at both caller and callee level - [ ] **Duplicate retry logic** - Don't implement retry at both caller and callee level
- [ ] **Hardcoded error message matching** - Prefer error codes over regex on error.message - [ ] **Hardcoded error message matching** - Prefer error codes over regex on error.message
- [ ] **Missing timeout on long operations** - Git operations and API calls should have timeouts - [ ] **Missing timeout on long operations** - Git operations and API calls should have timeouts
- [ ] **Console.log in services** — Use `ActivityLogger`. Only CLI display code (`client.ts`, `worker.ts`, `output-formatters.ts`) uses console.log
- [ ] **Temporal imports in services** — Services must stay Temporal-agnostic. If you need Temporal APIs, it belongs in activities
### Code Quality ### Code Quality
- [ ] **No dead code added** - Remove unused imports, functions, variables - [ ] **No dead code added** - Remove unused imports, functions, variables
+25 -32
View File
@@ -41,18 +41,20 @@ npm run build
## Architecture ## Architecture
### Core Modules ### Core Modules
- `src/session-manager.ts` — Agent definitions, execution order, parallel groups - `src/session-manager.ts` — Agent definitions (`AGENTS` record). Agent types in `src/types/agents.ts`
- `src/ai/claude-executor.ts` — Claude Agent SDK integration with retry logic and git checkpoints
- `src/config-parser.ts` — YAML config parsing with JSON Schema validation - `src/config-parser.ts` — YAML config parsing with JSON Schema validation
- `src/error-handling.ts` — Categorized error types (PentestError, ConfigError, NetworkError) with retry logic - `src/ai/claude-executor.ts` — Claude Agent SDK integration with retry logic
- `src/tool-checker.ts` — Validates external security tool availability before execution - `src/services/` — Business logic layer (Temporal-agnostic). Activities delegate here. Key: `agent-execution.ts`, `error-handling.ts`, `container.ts`
- `src/queue-validation.ts` — Deliverable validation and agent prerequisites - `src/types/` — Consolidated types: `Result<T,E>`, `ErrorCode`, `AgentName`, `ActivityLogger`, etc.
- `src/utils/` — Shared utilities (file I/O, formatting, concurrency)
### Temporal Orchestration ### Temporal Orchestration
Durable workflow orchestration with crash recovery, queryable progress, intelligent retry, and parallel execution (5 concurrent agents in vuln/exploit phases). Durable workflow orchestration with crash recovery, queryable progress, intelligent retry, and parallel execution (5 concurrent agents in vuln/exploit phases).
- `src/temporal/workflows.ts` — Main workflow (`pentestPipelineWorkflow`) - `src/temporal/workflows.ts` — Main workflow (`pentestPipelineWorkflow`)
- `src/temporal/activities.ts`Activity implementations with heartbeats - `src/temporal/activities.ts`Thin wrappers — heartbeat loop, error classification, container lifecycle. Business logic delegated to `src/services/`
- `src/temporal/activity-logger.ts``TemporalActivityLogger` implementation of `ActivityLogger` interface
- `src/temporal/summary-mapper.ts` — Maps `PipelineSummary` to `WorkflowSummary`
- `src/temporal/worker.ts` — Worker entry point - `src/temporal/worker.ts` — Worker entry point
- `src/temporal/client.ts` — CLI client for starting workflows - `src/temporal/client.ts` — CLI client for starting workflows
- `src/temporal/shared.ts` — Types, interfaces, query definitions - `src/temporal/shared.ts` — Types, interfaces, query definitions
@@ -66,30 +68,32 @@ Durable workflow orchestration with crash recovery, queryable progress, intellig
### Supporting Systems ### Supporting Systems
- **Configuration** — YAML configs in `configs/` with JSON Schema validation (`config-schema.json`). Supports auth settings, MFA/TOTP, and per-app testing parameters - **Configuration** — YAML configs in `configs/` with JSON Schema validation (`config-schema.json`). Supports auth settings, MFA/TOTP, and per-app testing parameters
- **Prompts** — Per-phase templates in `prompts/` with variable substitution (`{{TARGET_URL}}`, `{{CONFIG_CONTEXT}}`). Shared partials in `prompts/shared/` via `prompt-manager.ts` - **Prompts** — Per-phase templates in `prompts/` with variable substitution (`{{TARGET_URL}}`, `{{CONFIG_CONTEXT}}`). Shared partials in `prompts/shared/` via `src/services/prompt-manager.ts`
- **SDK Integration** — Uses `@anthropic-ai/claude-agent-sdk` with `maxTurns: 10_000` and `bypassPermissions` mode. Playwright MCP for browser automation, TOTP generation via MCP tool. Login flow template at `prompts/shared/login-instructions.txt` supports form, SSO, API, and basic auth - **SDK Integration** — Uses `@anthropic-ai/claude-agent-sdk` with `maxTurns: 10_000` and `bypassPermissions` mode. Playwright MCP for browser automation, TOTP generation via MCP tool. Login flow template at `prompts/shared/login-instructions.txt` supports form, SSO, API, and basic auth
- **Audit System** — Crash-safe append-only logging in `audit-logs/{hostname}_{sessionId}/`. Tracks session metrics, per-agent logs, prompts, and deliverables - **Audit System** — Crash-safe append-only logging in `audit-logs/{hostname}_{sessionId}/`. Tracks session metrics, per-agent logs, prompts, and deliverables. WorkflowLogger (`audit/workflow-logger.ts`) provides unified human-readable per-workflow logs, backed by LogStream (`audit/log-stream.ts`) shared stream primitive
- **Deliverables** — Saved to `deliverables/` in the target repo via the `save_deliverable` MCP tool - **Deliverables** — Saved to `deliverables/` in the target repo via the `save_deliverable` MCP tool
- **Workspaces & Resume** — Named workspaces via `WORKSPACE=<name>` or auto-named from URL+timestamp. Resume passes `--workspace` to the Temporal client (`src/temporal/client.ts`), which loads `session.json` to detect completed agents. `loadResumeState()` in `src/temporal/activities.ts` validates deliverable existence, restores git checkpoints, and cleans up incomplete deliverables. Workspace listing via `src/temporal/workspaces.ts` - **Workspaces & Resume** — Named workspaces via `WORKSPACE=<name>` or auto-named from URL+timestamp. Resume passes `--workspace` to the Temporal client (`src/temporal/client.ts`), which loads `session.json` to detect completed agents. `loadResumeState()` in `src/temporal/activities.ts` validates deliverable existence, restores git checkpoints, and cleans up incomplete deliverables. Workspace listing via `src/temporal/workspaces.ts`
## Development Notes ## Development Notes
### Adding a New Agent ### Adding a New Agent
1. Define agent in `src/session-manager.ts` (add to `AGENT_QUEUE` and parallel group) 1. Define agent in `src/session-manager.ts` (add to `AGENTS` record). `ALL_AGENTS`/`AgentName` types live in `src/types/agents.ts`
2. Create prompt template in `prompts/` (e.g., `vuln-newtype.txt`) 2. Create prompt template in `prompts/` (e.g., `vuln-newtype.txt`)
3. Add activity function in `src/temporal/activities.ts` 3. Two-layer pattern: add a thin activity wrapper in `src/temporal/activities.ts` (heartbeat + error classification). `AgentExecutionService` in `src/services/agent-execution.ts` handles the agent lifecycle automatically via the `AGENTS` registry
4. Register activity in `src/temporal/workflows.ts` within the appropriate phase 4. Register activity in `src/temporal/workflows.ts` within the appropriate phase
### Modifying Prompts ### Modifying Prompts
- Variable substitution: `{{TARGET_URL}}`, `{{CONFIG_CONTEXT}}`, `{{LOGIN_INSTRUCTIONS}}` - Variable substitution: `{{TARGET_URL}}`, `{{CONFIG_CONTEXT}}`, `{{LOGIN_INSTRUCTIONS}}`
- Shared partials in `prompts/shared/` included via `prompt-manager.ts` - Shared partials in `prompts/shared/` included via `src/services/prompt-manager.ts`
- Test with `PIPELINE_TESTING=true` for fast iteration - Test with `PIPELINE_TESTING=true` for fast iteration
### Key Design Patterns ### Key Design Patterns
- **Configuration-Driven** — YAML configs with JSON Schema validation - **Configuration-Driven** — YAML configs with JSON Schema validation
- **Progressive Analysis** — Each phase builds on previous results - **Progressive Analysis** — Each phase builds on previous results
- **SDK-First** — Claude Agent SDK handles autonomous analysis - **SDK-First** — Claude Agent SDK handles autonomous analysis
- **Modular Error Handling** — Categorized errors with automatic retry (3 attempts per agent) - **Modular Error Handling** — `ErrorCode` enum, `Result<T,E>` for explicit error propagation, automatic retry (3 attempts per agent)
- **Services Boundary** — Activities are thin Temporal wrappers; `src/services/` owns business logic, accepts `ActivityLogger`, returns `Result<T,E>`. No Temporal imports in services
- **DI Container** — Per-workflow in `src/services/container.ts`. `AuditSession` excluded (parallel safety)
### Security ### Security
Defensive security tool only. Use only on systems you own or have explicit permission to test. Defensive security tool only. Use only on systems you own or have explicit permission to test.
@@ -111,6 +115,7 @@ Defensive security tool only. Use only on systems you own or have explicit permi
- Use `function` keyword for top-level functions (not arrow functions) - Use `function` keyword for top-level functions (not arrow functions)
- Explicit return type annotations on exported/top-level functions - Explicit return type annotations on exported/top-level functions
- Prefer `readonly` for data that shouldn't be mutated - Prefer `readonly` for data that shouldn't be mutated
- `exactOptionalPropertyTypes` is enabled — use spread for optional props, not direct `undefined` assignment
### Avoid ### Avoid
- Combining multiple concerns into a single function to "save lines" - Combining multiple concerns into a single function to "save lines"
@@ -120,33 +125,21 @@ Defensive security tool only. Use only on systems you own or have explicit permi
- Backwards-compatibility shims, deprecated wrappers, or re-exports for removed code — delete the old code, don't preserve it - Backwards-compatibility shims, deprecated wrappers, or re-exports for removed code — delete the old code, don't preserve it
### Comments ### Comments
- Explain **WHY**, not WHAT — code shows what it does Comments must be **timeless** no references to this conversation, refactoring history, or the AI.
- Comments must be **timeless** — useful to a reader with no knowledge of this conversation
- Never reference: this chat, refactoring history ("moved from X"), the AI, or deleted files
- No comment is better than a bad comment
```typescript **Patterns used in this codebase:**
// Bad: references refactoring history - `/** JSDoc */` — file headers (after license) and exported functions/interfaces
// Moved from utils/helpers.ts - `// 1. Step` — numbered steps for sequential operations
- `// === Section ===` — dividers in long files/functions
- `// NOTE:` / `// WARNING:` / `// IMPORTANT:` — gotchas and constraints
// Bad: references conversation **Never:** obvious comments, conversation references ("as discussed"), history ("moved from X")
// Added per user request
// Bad: states the obvious
// Loop through the array
// Good: explains WHY
// Retry with backoff — Temporal server rejects rapid reconnects
// Good: documents a gotcha
// MUST use FAILSAFE_SCHEMA — default schema allows code execution
```
## Key Files ## Key Files
**Entry Points:** `src/temporal/workflows.ts`, `src/temporal/activities.ts`, `src/temporal/worker.ts`, `src/temporal/client.ts` **Entry Points:** `src/temporal/workflows.ts`, `src/temporal/activities.ts`, `src/temporal/worker.ts`, `src/temporal/client.ts`
**Core Logic:** `src/session-manager.ts`, `src/ai/claude-executor.ts`, `src/config-parser.ts`, `src/audit/` **Core Logic:** `src/session-manager.ts`, `src/ai/claude-executor.ts`, `src/config-parser.ts`, `src/services/`, `src/audit/`
**Config:** `shannon` (CLI), `docker-compose.yml`, `configs/`, `prompts/` **Config:** `shannon` (CLI), `docker-compose.yml`, `configs/`, `prompts/`