Feat/temporal (#46)
* refactor: modularize claude-executor and extract shared utilities
- Extract message handling into src/ai/message-handlers.ts with pure functions
- Extract output formatting into src/ai/output-formatters.ts
- Extract progress management into src/ai/progress-manager.ts
- Add audit-logger.ts with Null Object pattern for optional logging
- Add shared utilities: formatting.ts, file-io.ts, functional.ts
- Consolidate getPromptNameForAgent into src/types/agents.ts
* feat: add Claude Code custom commands for debug and review
* feat: add Temporal integration foundation (phase 1-2)
- Add Temporal SDK dependencies (@temporalio/client, worker, workflow, activity)
- Add shared types for pipeline state, metrics, and progress queries
- Add classifyErrorForTemporal() for retry behavior classification
- Add docker-compose for Temporal server with SQLite persistence
* feat: add Temporal activities for agent execution (phase 3)
- Add activities.ts with heartbeat loop, git checkpoint/rollback, and error classification
- Export runClaudePrompt, validateAgentOutput, ClaudePromptResult for Temporal use
- Track attempt number via Temporal Context for accurate audit logging
- Rollback git workspace before retry to ensure clean state
* feat: add Temporal workflow for 5-phase pipeline orchestration (phase 4)
* feat: add Temporal worker, client, and query tools (phase 5)
- Add worker.ts with workflow bundling and graceful shutdown
- Add client.ts CLI to start pipelines with progress polling
- Add query.ts CLI to inspect running workflow state
- Fix buffer overflow by truncating error messages and stack traces
- Skip git operations gracefully on non-git repositories
- Add kill.sh/start.sh dev scripts and Dockerfile.worker
* feat: fix Docker worker container setup
- Install uv instead of deprecated uvx package
- Add mcp-server and configs directories to container
- Mount target repo dynamically via TARGET_REPO env variable
* fix: add report assembly step to Temporal workflow
- Add assembleReportActivity to concatenate exploitation evidence files before report agent runs
- Call assembleFinalReport in workflow Phase 5 before runReportAgent
- Ensure deliverables directory exists before writing final report
- Simplify pipeline-testing report prompt to just prepend header
* refactor: consolidate Docker setup to root docker-compose.yml
* feat: improve Temporal client UX and env handling
- Change default to fire-and-forget (--wait flag to opt-in)
- Add splash screen and improve console output formatting
- Add .env to gitignore, remove from dockerignore for container access
- Add Taskfile for common development commands
* refactor: simplify session ID handling and improve Taskfile options
- Include hostname in workflow ID for better audit log organization
- Extract sanitizeHostname utility to audit/utils.ts for reuse
- Remove unused generateSessionLogPath and buildLogFilePath functions
- Simplify Taskfile with CONFIG/OUTPUT/CLEAN named parameters
* chore: add .env.example and simplify .gitignore
* docs: update README and CLAUDE.md for Temporal workflow usage
- Replace Docker CLI instructions with Task-based commands
- Add monitoring/stopping sections and workflow examples
- Document Temporal orchestration layer and troubleshooting
- Simplify file structure to key files overview
* refactor: replace Taskfile with bash CLI script
- Add shannon bash script with start/logs/query/stop/help commands
- Remove Taskfile.yml dependency (no longer requires Task installation)
- Update README.md and CLAUDE.md to use ./shannon commands
- Update client.ts output to show ./shannon commands
* docs: fix deliverable filename in README
* refactor: remove direct CLI and .shannon-store.json in favor of Temporal
- Delete src/shannon.ts direct CLI entry point (Temporal is now the only mode)
- Remove .shannon-store.json session lock (Temporal handles workflow deduplication)
- Remove broken scripts/export-metrics.js (imported non-existent function)
- Update package.json to remove main, start script, and bin entry
- Clean up CLAUDE.md and debug.md to remove obsolete references
* chore: remove licensing comments from prompt files to prevent leaking into actual prompts
* fix: resolve parallel workflow race conditions and retry logic bugs
- Fix save_deliverable race condition using closure pattern instead of global variable
- Fix error classification order so OutputValidationError matches before generic validation
- Fix ApplicationFailure re-classification bug by checking instanceof before re-throwing
- Add per-error-type retry limits (3 for output validation, 50 for billing)
- Add fast retry intervals for pipeline testing mode (10s vs 5min)
- Increase worker concurrent activities to 25 for parallel workflows
* refactor: pipeline vuln→exploit workflow for parallel execution
- Replace sync barrier between vuln/exploit phases with independent pipelines
- Each vuln type runs: vuln agent → queue check → conditional exploit
- Add checkExploitationQueue activity to skip exploits when no vulns found
- Use Promise.allSettled for graceful failure handling across pipelines
- Add PipelineSummary type for aggregated cost/duration/turns metrics
* fix: re-throw retryable errors in checkExploitationQueue
* fix: detect and retry on Claude Code spending cap errors
- Add spending cap pattern detection in detectApiError() with retryable error
- Add matching patterns to classifyErrorForTemporal() for proper Temporal retry
- Add defense-in-depth safeguard in runClaudePrompt() for $0 cost / low turn detection
- Add final sanity check in activities before declaring success
* fix: increase heartbeat timeout to prevent false worker-dead detection
Original 30s timeout was from POC spec assuming <5min activities. With
hour-long activities and multiple concurrent workflows sharing one worker,
resource contention causes event loop stalls exceeding 30s, triggering
false heartbeat timeouts. Increased to 10min (prod) and 5min (testing).
* fix: temporal db init
* fix: persist home dir
* feat: add per-workflow unified logging with ./shannon logs ID=<workflow-id>
- Add WorkflowLogger class for human-readable, per-workflow log files
- Create workflow.log in audit-logs/{workflowId}/ with phase, agent, tool, and LLM events
- Update ./shannon logs to require ID param and tail specific workflow log
- Add phase transition logging at workflow boundaries
- Include workflow completion summary with agent breakdown (duration, cost)
- Mount audit-logs volume in docker-compose for host access
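
The save_deliverable fix listed above swaps a module-level variable for a closure. A minimal sketch of that idea follows, assuming a hypothetical `createSaveDeliverable` factory and an in-memory `Map` standing in for the file system; neither name nor the storage approach is taken from this commit.

```typescript
// Hypothetical sketch: the factory name and the in-memory store are illustrative only.
type SaveDeliverable = (name: string, content: string) => string;

// Each call captures its own targetDir, so parallel workflows cannot
// overwrite each other's directory the way a shared global variable could.
function createSaveDeliverable(targetDir: string, store: Map<string, string>): SaveDeliverable {
  return (name, content) => {
    const fullPath = `${targetDir}/deliverables/${name}`;
    store.set(fullPath, content);
    return fullPath;
  };
}

const store = new Map<string, string>();
const saveA = createSaveDeliverable('/repos/a', store);
const saveB = createSaveDeliverable('/repos/b', store);

// Interleaved calls still land in their own directories.
const pathA = saveA('report.md', 'A');
const pathB = saveB('report.md', 'B');
```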
---------
Co-authored-by: ezl-keygraph <ezhil@keygraph.io>
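
The independent vuln→exploit pipelines described in the message above can be sketched roughly as follows. This is an illustration under assumptions, not the workflow code from this commit: `runPipeline`, `runAll`, the `VulnType` values, and the injected `findVulns`/`exploit` callbacks are all hypothetical stand-ins for the real Temporal activities.

```typescript
// Hypothetical sketch: agent callbacks are stand-ins, not the commit's activities.
type VulnType = 'injection' | 'xss' | 'auth';

interface PipelineOutcome {
  vulnType: VulnType;
  exploited: boolean;
}

// One independent pipeline: vuln agent -> queue check -> conditional exploit.
async function runPipeline(
  vulnType: VulnType,
  findVulns: (t: VulnType) => Promise<string[]>,
  exploit: (t: VulnType, queue: string[]) => Promise<void>,
): Promise<PipelineOutcome> {
  const queue = await findVulns(vulnType);
  if (queue.length === 0) {
    return { vulnType, exploited: false }; // nothing queued, skip the exploit step
  }
  await exploit(vulnType, queue);
  return { vulnType, exploited: true };
}

// Promise.allSettled waits for every pipeline and reports each outcome
// separately, so one rejected pipeline does not discard the others' results.
async function runAll(
  types: VulnType[],
  findVulns: (t: VulnType) => Promise<string[]>,
  exploit: (t: VulnType, queue: string[]) => Promise<void>,
): Promise<{ completed: PipelineOutcome[]; failed: string[] }> {
  const settled = await Promise.allSettled(types.map((t) => runPipeline(t, findVulns, exploit)));
  const completed: PipelineOutcome[] = [];
  const failed: string[] = [];
  settled.forEach((s, i) => {
    if (s.status === 'fulfilled') completed.push(s.value);
    else failed.push(`${types[i]}: ${String(s.reason)}`);
  });
  return { completed, failed };
}
```

Compared with `Promise.all`, which rejects as soon as any input rejects, this keeps the results of the pipelines that did finish available for a summary.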
Committed by: GitHub
Parent: 45acb16711
Commit: 51e621d0d5
+55 -68
@@ -7,7 +7,7 @@
 import { $, fs, path } from 'zx';
 import chalk from 'chalk';
 import { Timer } from '../utils/metrics.js';
-import { formatDuration } from '../audit/utils.js';
+import { formatDuration } from '../utils/formatting.js';
 import { handleToolError, PentestError } from '../error-handling.js';
 import { AGENTS } from '../session-manager.js';
 import { runClaudePromptWithRetry } from '../ai/claude-executor.js';
@@ -40,11 +40,17 @@ interface PromptVariables {
   repoPath: string;
 }
 
+// Discriminated union for Wave1 tool results - clearer than loose union types
+type Wave1ToolResult =
+  | { kind: 'scan'; result: TerminalScanResult }
+  | { kind: 'skipped'; message: string }
+  | { kind: 'agent'; result: AgentResult };
+
 interface Wave1Results {
-  nmap: TerminalScanResult | string | AgentResult;
-  subfinder: TerminalScanResult | string | AgentResult;
-  whatweb: TerminalScanResult | string | AgentResult;
-  naabu?: TerminalScanResult | string | AgentResult;
+  nmap: Wave1ToolResult;
+  subfinder: Wave1ToolResult;
+  whatweb: Wave1ToolResult;
+  naabu?: Wave1ToolResult;
   codeAnalysis: AgentResult;
 }
 
@@ -57,7 +63,7 @@ interface PreReconResult {
   report: string;
 }
 
-// Pure function: Run terminal scanning tools
+// Runs external security tools (nmap, whatweb, etc). Schemathesis requires schemas from code analysis.
 async function runTerminalScan(tool: ToolName, target: string, sourceDir: string | null = null): Promise<TerminalScanResult> {
   const timer = new Timer(`command-${tool}`);
   try {
@@ -89,7 +95,7 @@ async function runTerminalScan(tool: ToolName, target: string, sourceDir: string
         return { tool: 'whatweb', output: result.stdout, status: 'success', duration: whatwebDuration };
       }
       case 'schemathesis': {
-        // Only run if API schemas found
+        // Schemathesis depends on code analysis output - skip if no schemas found
         const schemasDir = path.join(sourceDir || '.', 'outputs', 'schemas');
         if (await fs.pathExists(schemasDir)) {
           const schemaFiles = await fs.readdir(schemasDir) as string[];
@@ -146,6 +152,8 @@ async function runPreReconWave1(
 
   const operations: Promise<TerminalScanResult | AgentResult>[] = [];
 
+  const skippedResult = (message: string): Wave1ToolResult => ({ kind: 'skipped', message });
+
   // Skip external commands in pipeline testing mode
   if (pipelineTestingMode) {
     console.log(chalk.gray(' ⏭️ Skipping external tools (pipeline testing mode)'));
@@ -163,9 +171,9 @@ async function runPreReconWave1(
     );
     const [codeAnalysis] = await Promise.all(operations);
     return {
-      nmap: 'Skipped (pipeline testing mode)',
-      subfinder: 'Skipped (pipeline testing mode)',
-      whatweb: 'Skipped (pipeline testing mode)',
+      nmap: skippedResult('Skipped (pipeline testing mode)'),
+      subfinder: skippedResult('Skipped (pipeline testing mode)'),
+      whatweb: skippedResult('Skipped (pipeline testing mode)'),
       codeAnalysis: codeAnalysis as AgentResult
     };
   } else {
@@ -192,9 +200,9 @@ async function runPreReconWave1(
     const [nmap, subfinder, whatweb, codeAnalysis] = await Promise.all(operations);
 
     return {
-      nmap: nmap as TerminalScanResult,
-      subfinder: subfinder as TerminalScanResult,
-      whatweb: whatweb as TerminalScanResult,
+      nmap: { kind: 'scan', result: nmap as TerminalScanResult },
+      subfinder: { kind: 'scan', result: subfinder as TerminalScanResult },
+      whatweb: { kind: 'scan', result: whatweb as TerminalScanResult },
       codeAnalysis: codeAnalysis as AgentResult
     };
   }
@@ -250,17 +258,21 @@ async function runPreReconWave2(
   return response;
 }
 
-// Helper type for stitching results
-interface StitchableResult {
-  status?: string;
-  output?: string;
-  tool?: string;
+// Extracts status and output from a Wave1 tool result
+function extractResult(r: Wave1ToolResult | undefined): { status: string; output: string } {
+  if (!r) return { status: 'Skipped', output: 'No output' };
+  switch (r.kind) {
+    case 'scan':
+      return { status: r.result.status || 'Skipped', output: r.result.output || 'No output' };
+    case 'skipped':
+      return { status: 'Skipped', output: r.message };
+    case 'agent':
+      return { status: r.result.success ? 'success' : 'error', output: 'See agent output' };
+  }
 }
 
-// Pure function: Stitch together pre-recon outputs and save to file
-async function stitchPreReconOutputs(outputs: (StitchableResult | string | undefined)[], sourceDir: string): Promise<string> {
-  const [nmap, subfinder, whatweb, naabu, codeAnalysis, ...additionalScans] = outputs;
-
+// Combines tool outputs into single deliverable. Falls back to reference if file missing.
+async function stitchPreReconOutputs(wave1: Wave1Results, additionalScans: TerminalScanResult[], sourceDir: string): Promise<string> {
   // Try to read the code analysis deliverable file
   let codeAnalysisContent = 'No analysis available';
   try {
@@ -269,62 +281,45 @@ async function stitchPreReconOutputs(outputs: (StitchableResult | string | undef
   } catch (error) {
     const err = error as Error;
     console.log(chalk.yellow(`⚠️ Could not read code analysis deliverable: ${err.message}`));
-    // Fallback message if file doesn't exist
     codeAnalysisContent = 'Analysis located in deliverables/code_analysis_deliverable.md';
   }
 
-
   // Build additional scans section
   let additionalSection = '';
-  if (additionalScans && additionalScans.length > 0) {
+  if (additionalScans.length > 0) {
     additionalSection = '\n## Authenticated Scans\n';
-    additionalScans.forEach(scan => {
-      const s = scan as StitchableResult;
-      if (s && s.tool) {
-        additionalSection += `
-### ${s.tool.toUpperCase()}
-Status: ${s.status}
-${s.output}
+    for (const scan of additionalScans) {
+      additionalSection += `
+### ${scan.tool.toUpperCase()}
+Status: ${scan.status}
+${scan.output}
 `;
-      }
-    });
+    }
   }
 
-  const nmapResult = nmap as StitchableResult | string | undefined;
-  const subfinderResult = subfinder as StitchableResult | string | undefined;
-  const whatwebResult = whatweb as StitchableResult | string | undefined;
-  const naabuResult = naabu as StitchableResult | string | undefined;
-
-  const getStatus = (r: StitchableResult | string | undefined): string => {
-    if (!r) return 'Skipped';
-    if (typeof r === 'string') return 'Skipped';
-    return r.status || 'Skipped';
-  };
-
-  const getOutput = (r: StitchableResult | string | undefined): string => {
-    if (!r) return 'No output';
-    if (typeof r === 'string') return r;
-    return r.output || 'No output';
-  };
+  const nmap = extractResult(wave1.nmap);
+  const subfinder = extractResult(wave1.subfinder);
+  const whatweb = extractResult(wave1.whatweb);
+  const naabu = extractResult(wave1.naabu);
 
   const report = `
 # Pre-Reconnaissance Report
 
 ## Port Discovery (naabu)
-Status: ${getStatus(naabuResult)}
-${getOutput(naabuResult)}
+Status: ${naabu.status}
+${naabu.output}
 
 ## Network Scanning (nmap)
-Status: ${getStatus(nmapResult)}
-${getOutput(nmapResult)}
+Status: ${nmap.status}
+${nmap.output}
 
 ## Subdomain Discovery (subfinder)
-Status: ${getStatus(subfinderResult)}
-${getOutput(subfinderResult)}
+Status: ${subfinder.status}
+${subfinder.output}
 
 ## Technology Detection (whatweb)
-Status: ${getStatus(whatwebResult)}
-${getOutput(whatwebResult)}
+Status: ${whatweb.status}
+${whatweb.output}
 ## Code Analysis
 ${codeAnalysisContent}
 ${additionalSection}
@@ -375,16 +370,8 @@ export async function executePreReconPhase(
   console.log(chalk.green(' ✅ Wave 2 operations completed'));
 
   console.log(chalk.blue('📝 Stitching pre-recon outputs...'));
-  // Combine wave 1 and wave 2 results for stitching
-  const allResults: (StitchableResult | string | undefined)[] = [
-    wave1Results.nmap as StitchableResult | string,
-    wave1Results.subfinder as StitchableResult | string,
-    wave1Results.whatweb as StitchableResult | string,
-    wave1Results.naabu as StitchableResult | string | undefined,
-    wave1Results.codeAnalysis as unknown as StitchableResult,
-    ...(wave2Results.schemathesis ? [wave2Results.schemathesis as StitchableResult] : [])
-  ];
-  const preReconReport = await stitchPreReconOutputs(allResults, sourceDir);
+  const additionalScans = wave2Results.schemathesis ? [wave2Results.schemathesis] : [];
+  const preReconReport = await stitchPreReconOutputs(wave1Results, additionalScans, sourceDir);
   const duration = timer.stop();
 
   console.log(chalk.green(`✅ Pre-reconnaissance complete in ${formatDuration(duration)}`));
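
The refactor in this diff relies on the `kind` tag letting TypeScript narrow each branch of `extractResult`. The sketch below restates that function in self-contained form. `TerminalScanResult` and `AgentResult` are reduced to the fields the diff actually reads (the real interfaces are not shown here and likely carry more), and the `assertNever` helper with its `default` branch is an addition of this sketch, not part of the commit.

```typescript
// Simplified stand-ins: only the fields that extractResult reads in the diff.
interface TerminalScanResult { tool: string; output: string; status: string }
interface AgentResult { success: boolean }

type Wave1ToolResult =
  | { kind: 'scan'; result: TerminalScanResult }
  | { kind: 'skipped'; message: string }
  | { kind: 'agent'; result: AgentResult };

// Assumption, not in the commit: compilation fails here if a new `kind` is left unhandled.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function extractResult(r: Wave1ToolResult | undefined): { status: string; output: string } {
  if (!r) return { status: 'Skipped', output: 'No output' };
  switch (r.kind) {
    case 'scan':
      return { status: r.result.status || 'Skipped', output: r.result.output || 'No output' };
    case 'skipped':
      return { status: 'Skipped', output: r.message };
    case 'agent':
      return { status: r.result.success ? 'success' : 'error', output: 'See agent output' };
    default:
      return assertNever(r);
  }
}

const skipped = extractResult({ kind: 'skipped', message: 'Skipped (pipeline testing mode)' });
const scanned = extractResult({ kind: 'scan', result: { tool: 'nmap', output: '', status: 'success' } });
const missing = extractResult(undefined);
```

An empty scan output falls back to `'No output'` because of the `||` default, which mirrors the behavior of the removed `getOutput` helper.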