A new open-source toolkit called pentest-ai-agents changes how security professionals use AI in penetration testing workflows, turning Anthropic’s Claude Code into a specialized offensive security research assistant built on 28 domain-specific subagents.
Released on GitHub by security researcher 0xSteph, pentest-ai-agents is a collection of 28 Claude Code subagents, each with deep expertise across the penetration testing lifecycle.
Coverage spans reconnaissance, web application testing, Active Directory attacks, cloud security, mobile penetration testing, wireless attacks, social engineering, exploit chains, detection engineering, forensics, malware analysis, and report generation.
Rather than relying on a single general-purpose AI model, the framework automatically routes each query to the most appropriate specialized agent.
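To make the routing idea concrete, here is a deliberately simplified sketch of keyword-based dispatch. It is not the toolkit's mechanism: in practice Claude Code makes the delegation decision itself, guided by each subagent's description. The agent names, keyword sets, fallback, and the `route_query` function below are all hypothetical and exist only for illustration.

```python
# Toy keyword-overlap router. Agent names and keyword sets are hypothetical
# illustrations, not the toolkit's real agent definitions.
AGENT_KEYWORDS = {
    "recon-advisor": {"nmap", "whois", "subdomain", "port", "scan"},
    "web-hunter": {"sqlmap", "xss", "ffuf", "sqli", "endpoint"},
    "ad-attacker": {"kerberos", "bloodhound", "ntlm", "domain", "certipy"},
}
DEFAULT_AGENT = "general-advisor"  # hypothetical fallback when nothing matches


def route_query(query: str) -> str:
    """Return the agent whose keyword set overlaps the query the most."""
    words = set(query.lower().replace(",", " ").split())
    best_agent, best_score = DEFAULT_AGENT, 0
    for agent, keywords in AGENT_KEYWORDS.items():
        score = len(words & keywords)  # number of shared keywords
        if score > best_score:
            best_agent, best_score = agent, score
    return best_agent


if __name__ == "__main__":
    print(route_query("Interpret this nmap port scan output"))
    print(route_query("How do I abuse Kerberos delegation in this domain"))
```

A real router driven by a language model weighs the whole request rather than counting keywords, but the shape is the same: one query in, one specialist selected, with a general fallback when nothing matches.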
Installing pentest-ai-agents
Setup requires no servers, external dependencies, or complex configuration. A single command handles everything.
curl -fsSL https://raw.githubusercontent.com/0xSteph/pentest-ai-agents/main/install.sh | bash
The script clones the repository, copies all 28 agent files into ~/.claude/agents/, and exits cleanly. It is fully idempotent, so running it again safely updates existing agents.
Additional installation options support project-scoped deployments (--project) and a cost-optimized lite mode (--global --lite) that runs advisory agents on Claude Haiku to reduce token consumption.
The toolkit introduces a two-tier execution model that balances safety and flexibility. Tier 1 agents operate in advisory mode, where users paste tool output and receive prioritized analysis, methodology guidance, and recommended next commands.
Tier 2 agents go further, composing and executing commands directly against a declared, authorized scope. Claude Code displays each command for explicit approval before execution.
Tier 2 agents include Recon Advisor (nmap, whois, whatweb), Web Hunter (ffuf, sqlmap, dalfox), AD Attacker (BloodHound, Impacket, CrackMapExec, Certipy), Exploit Chainer, PoC Validator, and Business Logic Hunter. All offensive actions are mapped to MITRE ATT&CK identifiers and paired with defensive context.
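The two safeguards described for Tier 2, a declared scope and per-command approval, can be sketched as a small gate in front of command execution. This is a minimal illustration under stated assumptions, not the toolkit's implementation: the function names `in_scope` and `gated_run`, the CIDR-based scope format, and the callback design are all hypothetical, and the example executor only pretends to run the command.

```python
import ipaddress
from typing import Callable, Iterable, Sequence


def in_scope(target: str, scope_cidrs: Iterable[str]) -> bool:
    """True if the target IP falls inside any authorized CIDR range."""
    addr = ipaddress.ip_address(target)
    return any(addr in ipaddress.ip_network(cidr) for cidr in scope_cidrs)


def gated_run(
    command: Sequence[str],
    target: str,
    scope_cidrs: Iterable[str],
    approve: Callable[[Sequence[str]], bool],
    execute: Callable[[Sequence[str]], str],
) -> str:
    """Run a command only if the target is in scope and the operator approves."""
    if not in_scope(target, scope_cidrs):
        return "refused: target outside declared scope"
    if not approve(command):  # hypothetical stand-in for the approval prompt
        return "skipped: operator declined"
    return execute(command)


if __name__ == "__main__":
    scope = ["10.0.0.0/24"]  # example engagement scope
    dry_run = lambda cmd: "would run: " + " ".join(cmd)  # no real execution
    print(gated_run(["nmap", "-sV", "10.0.0.5"], "10.0.0.5", scope,
                    lambda cmd: True, dry_run))
    print(gated_run(["nmap", "-sV", "8.8.8.8"], "8.8.8.8", scope,
                    lambda cmd: True, dry_run))
```

Checking scope before asking for approval means an out-of-scope target is rejected even if the operator would have approved it by mistake.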
Persistent findings and MCP support
A built-in SQLite findings database (findings.sh) retains engagement data across Claude Code sessions, enabling multi-day operations with seamless handoff.
Tier 2 agents automatically write to this database when findings.sh is in the system PATH. The Report Generator agent creates professional penetration testing reports with summaries, CVSS scoring, and remediation roadmaps.
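As a rough picture of how a persistent findings store can feed a severity-ranked report, the sketch below uses Python's standard sqlite3 module. The table schema, function names, and sample findings are hypothetical and do not reflect the actual layout used by findings.sh; the severity bands follow the qualitative rating scale published for CVSS v3.x.

```python
import sqlite3


def severity(cvss: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if cvss == 0.0:
        return "None"
    if cvss < 4.0:
        return "Low"
    if cvss < 7.0:
        return "Medium"
    if cvss < 9.0:
        return "High"
    return "Critical"


def open_db(path: str) -> sqlite3.Connection:
    """Open (or create) the findings store; this schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS findings ("
        "id INTEGER PRIMARY KEY, host TEXT, title TEXT, cvss REAL)"
    )
    return conn


def add_finding(conn: sqlite3.Connection, host: str, title: str, cvss: float) -> None:
    """Persist one finding so later sessions can read it back."""
    conn.execute(
        "INSERT INTO findings (host, title, cvss) VALUES (?, ?, ?)",
        (host, title, cvss),
    )
    conn.commit()


def report_rows(conn: sqlite3.Connection) -> list:
    """Return findings ordered from most to least severe."""
    rows = conn.execute(
        "SELECT host, title, cvss FROM findings ORDER BY cvss DESC"
    ).fetchall()
    return [(host, title, cvss, severity(cvss)) for host, title, cvss in rows]


if __name__ == "__main__":
    conn = open_db(":memory:")  # use a file path to keep data between sessions
    add_finding(conn, "10.0.0.5", "Example reflected XSS", 5.3)
    add_finding(conn, "10.0.0.9", "Example unauthenticated RCE", 9.8)
    for row in report_rows(conn):
        print(row)
```

Because the data lives in a single database file rather than in the conversation, a later session only needs the file path to pick up where the previous one stopped.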
For air-gapped or privacy-sensitive environments, the agents can be set up with the opencode-setup.sh script.
A companion MCP server (pentest-ai) extends the ecosystem with 150+ tool wrappers, autonomous exploit chains, and CI/CD pipeline integrations for Claude Desktop, Cursor, and VS Code Copilot.
