feat: backport Opus 4.7 + adaptive thinking, remove scan tools, add --help to scripts
CI / Build & push API image (pull_request) Has been skipped
CI / Type-check & lint (pull_request) Successful in 18s
CI / Build & push worker image (pull_request) Has been skipped

Backport upstream Shannon PRs #325, #327, #328:

- Update large model default to claude-opus-4-7, add adaptive thinking
  configuration (auto-enabled on Opus 4.6/4.7, opt-out via
  CLAUDE_ADAPTIVE_THINKING=false), filter thinking blocks from message
  content, bump claude-agent-sdk to ^0.2.114
- Remove unused scan tools (nmap, subfinder, whatweb, schemathesis) from
  Dockerfile, prompts, and docs; remove dead 'tool' error type from
  PentestErrorType; redact URLs in preflight info logs
- Add --help flag to save-deliverable and generate-totp CLI scripts

Co-Authored-By: Paperclip <noreply@paperclip.ing>
This commit is contained in:
2026-05-20 00:26:25 +00:00
committed by Hugh Commit [agent]
parent ccb3dc6f75
commit 085624b287
19 changed files with 218 additions and 275 deletions
+2 -3
View File
@@ -180,17 +180,16 @@ For each root vulnerability in your plan, you will follow this systematic, four-
## **Strategic Tool Usage**
Use the right tool for the job to ensure thoroughness.
- **Use `curl` (Manual Probing) for:** Initial confirmation, simple UNION/Error-based injections, and crafting specific WAF bypasses.
- **Use `sqlmap` (Automation) for:** Time-consuming blind injections, automating enumeration **after** manual confirmation, and as a final step to try a wide range of payloads when manual techniques are failing.
## **Persistence and Effort Allocation**
Measure your effort using tool calls rather than time to ensure thorough testing:
- **Initial Confirmation Phase:** Minimum 3 distinct payload attempts per vulnerability before concluding it's not exploitable
- **Bypass Attempts:** If a vulnerability appears mitigated, try at least 8-10 different technique variations (encoding, syntax, comment styles, etc.) before concluding it's properly defended
- **Escalation Trigger:** If manual testing exceeds 10-12 tool calls without progress on a single vulnerability, escalate to automated tools (`sqlmap`) or Task Agent scripting
- **Escalation Trigger:** If manual testing exceeds 10-12 tool calls without progress on a single vulnerability, escalate to Task Agent scripting
- **Termination Criteria:** After systematic attempts with multiple different techniques → classify as appropriate level
## **Using the Task Agent for Custom Scripting**
You must delegate every injection automation task to the Task Agent. Use manual `curl` or `sqlmap` runs for spot checks, then escalate to scripted payload loops handled by the Task Agent.
You must delegate every injection automation task to the Task Agent. Use manual `curl` runs for spot checks, then escalate to scripted payload loops handled by the Task Agent.
**TEMPLATE FOR SCRIPTING TASKS (REQUIRED):**
"