Backport upstream Shannon PRs #325, #327, #328:

- Update large model default to claude-opus-4-7, add adaptive thinking configuration (auto-enabled on Opus 4.6/4.7, opt-out via CLAUDE_ADAPTIVE_THINKING=false), filter thinking blocks from message content, bump claude-agent-sdk to ^0.2.114
- Remove unused scan tools (nmap, subfinder, whatweb, schemathesis) from Dockerfile, prompts, and docs; remove dead 'tool' error type from PentestErrorType; redact URLs in preflight info logs
- Add --help flag to save-deliverable and generate-totp CLI scripts

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Trebuchet — AI Pentester
Trebuchet is a fork of Shannon by Keygraph, wrapped with a REST API and Kubernetes tooling for cluster-based deployments.
What is Trebuchet?
Trebuchet is an API-driven AI pentester built on top of Shannon's autonomous penetration testing engine. It performs white-box security testing of web applications and APIs by combining source code analysis with live exploitation.
Unlike the upstream Shannon CLI, Trebuchet is designed to run as a service on Kubernetes — scans are triggered via REST API, orchestrated by Temporal, and executed in ephemeral worker pods.
Important
White-box only. Trebuchet expects access to your application's source code and repository layout.
Features
- Fully Autonomous Operation: A single API call launches the full pentest. Handles 2FA/TOTP logins (including SSO), browser navigation, exploitation, and report generation without manual intervention.
- Reproducible Proof-of-Concept Exploits: The final report contains only proven, exploitable findings with copy-and-paste PoCs. Vulnerabilities that cannot be exploited are not reported.
- OWASP Vulnerability Coverage: Identifies and validates Injection, XSS, SSRF, and Broken Authentication/Authorization.
- Code-Aware Dynamic Testing: Analyzes source code to guide attack strategy, then validates findings with live browser and CLI-based exploits against the running application.
- Integrated Security Tooling: Leverages Nmap, Subfinder, WhatWeb, and Schemathesis during reconnaissance and discovery phases.
- Parallel Processing: Vulnerability analysis and exploitation phases run concurrently across all attack categories.
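The "proven findings only" rule from the feature list can be pictured as a filter over candidate findings. The sketch below is illustrative only: the `Finding` type, its fields, and the `reportable` helper are hypothetical names chosen for this example, not Trebuchet's or Shannon's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one candidate vulnerability."""
    category: str    # e.g. "Injection", "XSS", "SSRF"
    title: str
    exploited: bool  # True only if a live exploit succeeded
    poc: str = ""    # copy-and-paste proof of concept, if any


def reportable(findings: list[Finding]) -> list[Finding]:
    """Keep only findings that were exploited and carry a non-empty PoC."""
    return [f for f in findings if f.exploited and f.poc.strip()]


# Made-up sample data, purely for illustration.
candidates = [
    Finding("Injection", "SQLi in login form", True, "<PoC command>"),
    Finding("XSS", "Reflected XSS in search", False),
    Finding("SSRF", "Image fetcher follows redirects", True, ""),
]
report = reportable(candidates)
print([f.title for f in report])
```

Under this assumed model, a finding with no successful exploit or no PoC never reaches the report, which matches the behavior the feature list describes.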
Architecture
Trebuchet uses a multi-agent architecture that combines white-box source code analysis with dynamic exploitation across five phases:
            +----------------------+
            |  Pre-Reconnaissance  |
            |  (nmap, subfinder,   |
            | whatweb, code scan)  |
            +----------+-----------+
                       |
                       v
            +----------------------+
            |    Reconnaissance    |
            |   (attack surface    |
            |       mapping)       |
            +----------+-----------+
                       |
                       v
          +------------+-----------+
          |            |           |
          v            v           v
    +-----------+ +---------+ +---------+
    |   Vuln    | |  Vuln   | |   ...   |
    |(Injection)| |  (XSS)  | |         |
    +-----+-----+ +----+----+ +----+----+
          |            |           |
          v            v           v
    +-----------+ +---------+ +---------+
    |  Exploit  | | Exploit | |   ...   |
    |(Injection)| |  (XSS)  | |         |
    +-----+-----+ +----+----+ +----+----+
          |            |           |
          +------------+-----------+
                       |
                       v
            +----------------------+
            |      Reporting       |
            +----------------------+
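The control flow in the diagram (two sequential phases, a concurrent fan-out of per-category analysis and exploitation lanes, then a single reporting phase) can be sketched with `asyncio`. This is a minimal illustration under stated assumptions: `run_phase` is a stand-in for an agent run, the category list is taken from the feature list, and none of these names come from the Trebuchet or Shannon codebase, which orchestrates the real phases through Temporal.

```python
import asyncio

# Assumed category labels, based on the vulnerability classes this README lists.
CATEGORIES = ["Injection", "XSS", "SSRF", "Auth"]


async def run_phase(name: str, log: list[str]) -> str:
    """Stand-in for an agent run: records the phase name and returns it."""
    await asyncio.sleep(0)  # yield control, as a real agent call would
    log.append(name)
    return name


async def analyze_then_exploit(category: str, log: list[str]) -> str:
    """One vertical lane of the diagram: Vuln(category) then Exploit(category)."""
    await run_phase(f"vuln:{category}", log)
    return await run_phase(f"exploit:{category}", log)


async def run_pipeline() -> list[str]:
    log: list[str] = []
    await run_phase("pre-recon", log)
    await run_phase("recon", log)
    # Fan out: every category lane runs concurrently.
    await asyncio.gather(*(analyze_then_exploit(c, log) for c in CATEGORIES))
    # Fan in: reporting starts only after all lanes have finished.
    await run_phase("report", log)
    return log


log = asyncio.run(run_pipeline())
print(log[:2], log[-1])
```

Within each lane, exploitation waits for that category's analysis, but lanes do not wait for each other. That is the property the diagram's parallel columns express.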
Each scan runs as an ephemeral Kubernetes Job with a per-invocation Temporal task queue, enabling concurrent scans with different target repositories.
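To illustrate why a per-invocation task queue keeps concurrent scans isolated, the sketch below derives a distinct queue name for each scan. The naming scheme, the `task_queue_for` and `new_scan` helpers, and the example repository URLs are all assumptions made for this example; the actual names Trebuchet uses may differ.

```python
import uuid


def task_queue_for(scan_id: str) -> str:
    """Hypothetical naming scheme: one Temporal task queue per scan."""
    return f"trebuchet-scan-{scan_id}"


def new_scan(repo_url: str) -> dict[str, str]:
    """Describe one scan invocation; nothing is submitted to a cluster here."""
    scan_id = uuid.uuid4().hex[:12]
    return {
        "scan_id": scan_id,
        "repo_url": repo_url,
        "job_name": f"trebuchet-scan-{scan_id}",  # assumed Job name
        "task_queue": task_queue_for(scan_id),
    }


scan_a = new_scan("https://example.com/org/app-a.git")
scan_b = new_scan("https://example.com/org/app-b.git")
print(scan_a["task_queue"] != scan_b["task_queue"])
```

Because a Temporal worker only polls the task queue it was started with, a worker pod bound to one scan's queue would not pick up work belonging to another scan, even when both run at the same time.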
Deployment
Kubernetes manifests live in a separate repository: farhoodlabs/trebuchet-infra.
Sample Reports
Sample penetration test reports from industry-standard vulnerable applications:
- OWASP Juice Shop — 20+ vulnerabilities including auth bypass and database exfiltration. View Report
- c{api}tal API — ~15 critical/high vulnerabilities including command injection and auth bypass. View Report
- OWASP crAPI — 15+ critical/high vulnerabilities including JWT attacks and database compromise. View Report
Benchmark
Shannon Lite scored 96.15% (100/104 exploits) on a hint-free, source-aware variant of the XBOW security benchmark.
Full results with detailed agent logs and per-challenge pentest reports
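The reported percentage follows directly from the exploit counts, as a quick check shows:

```python
solved, total = 100, 104  # counts quoted in the benchmark line
score = round(100 * solved / total, 2)
print(score)  # 96.15
```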
Disclaimers
Warning
DO NOT run Trebuchet on production environments. It actively executes attacks to confirm vulnerabilities. Use only on sandboxed, staging, or local development environments.
Caution
You must have explicit, written authorization from the owner of the target system before running Trebuchet. Unauthorized scanning is illegal.
- Verification is Required: Human oversight is essential to validate all reported findings. LLMs can still generate hallucinated content.
- Targeted Vulnerabilities: Broken Authentication & Authorization, Injection, XSS, SSRF.
- Cost: A full test run typically takes 1 to 1.5 hours and may cost around $50 USD using Claude Sonnet.
License
Released under the GNU Affero General Public License v3.0 (AGPL-3.0).
Support
- Report bugs: GitHub Issues
- Discussions: GitHub Discussions