Let AI agents pentest your product
before attackers do.
Open-source AI pentesting. Real exploits, no false positives.
npx pwnkit-cli
Scan four ecosystems with one loop.
Web apps, LLM APIs, npm packages, and source code, scanned in parallel on every commit.
Illustration. Real scans land in the dashboard.
Find real CVEs in code that runs the internet.
Seven public advisories. The agent did the work.
95.2% on XBOW.
99 of 104 challenges. Open source. Single config. Every solve has a stored exploit log.
The benchmark's own author scored 85%. They raised ~$237M to do it.
XBOW · 104 challenges · published headline scores
- 97.1% | 101 / 104 | Solo OSS · self-funded | Best-of-N · 10+ configs
- 96.2% | 100 / 104 | Seed VC · undisclosed | White-box · modified fork
- 95.2% | 99 / 104 | Open source · self-funded | Audited · single config
- 92.3% | 96 / 104 | Stealth · no public funding | Closed source
- 85.0% | 88 / 104 | Unicorn · ~$237M · Microsoft-partnered | Built for own benchmark
Funding and methodology posture sourced from competitor sheet.
Run the same loop a human pentester runs.
Recon to receipt. Every finding re-exploited before it ships.
Aim
URL, package, or repo.
Scan
Shell-first agent loop.
Triage
11 layers kill false positives.
Verify
A second agent re-exploits, blind.
Ship
SARIF, JSON, GitHub Security.
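The Triage, Verify, and Ship steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not pwnkit's actual code: the `Finding` record, the confidence threshold standing in for the triage layers, the `reexploit` callback standing in for the second agent, and the `example-scanner` tool name are all hypothetical. Only the SARIF 2.1.0 field names follow the published format.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable


# Hypothetical finding record; field names are illustrative, not pwnkit's schema.
@dataclass
class Finding:
    rule_id: str
    message: str
    uri: str
    confidence: float  # 0.0 to 1.0, assumed to be assigned during scanning


def triage(findings: Iterable[Finding], threshold: float = 0.5) -> list[Finding]:
    """Drop low-confidence candidates (a stand-in for the triage layers)."""
    return [f for f in findings if f.confidence >= threshold]


def verify(findings: Iterable[Finding],
           reexploit: Callable[[Finding], bool]) -> list[Finding]:
    """Keep only findings that an independent re-exploit attempt confirms."""
    return [f for f in findings if reexploit(f)]


def to_sarif(findings: Iterable[Finding],
             tool_name: str = "example-scanner") -> dict:
    """Build a minimal SARIF 2.1.0 log containing a single run."""
    return {
        "version": "2.1.0",
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": [
                {
                    "ruleId": f.rule_id,
                    "level": "error",
                    "message": {"text": f.message},
                    "locations": [{
                        "physicalLocation": {
                            "artifactLocation": {"uri": f.uri}
                        }
                    }],
                }
                for f in findings
            ],
        }],
    }


def run_pipeline(candidates: Iterable[Finding],
                 reexploit: Callable[[Finding], bool]) -> dict:
    """Triage, then verify, then ship as SARIF."""
    return to_sarif(verify(triage(candidates), reexploit))


# Usage: three made-up candidates; only one survives triage and re-exploitation.
candidates = [
    Finding("sqli", "SQL injection in /login", "src/login.py", 0.9),
    Finding("xss", "Reflected XSS in /search", "src/search.py", 0.8),
    Finding("noise", "Suspicious string", "README.md", 0.1),
]
confirmed = {"sqli"}  # pretend the second agent reproduced only this one
report = run_pipeline(candidates, lambda f: f.rule_id in confirmed)
print(json.dumps(report, indent=2))
```

The point of the ordering is that nothing reaches the report unless a separate check reproduced it, which is what keeps unconfirmed candidates out of the output.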
Just give it a target.
Audit an npm package
Review source code
Scan an LLM API
Pentest a web app
Local mission control
Triage across scans
Auto-detects target type. Drop the GitHub Action into CI for SARIF output.
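How target auto-detection might work can be sketched with a simple string heuristic. Everything here is an assumption for illustration: the function name `detect_target_type`, the returned labels, and the rules themselves are hypothetical and not taken from pwnkit. Telling a web app from an LLM API would likely need a live probe, which this sketch does not attempt.

```python
import re
from urllib.parse import urlparse

# Simplified npm package name pattern with an optional @scope/ prefix.
# Illustrative only; the real npm naming rules have more cases.
NPM_NAME = re.compile(r"^(@[a-z0-9][a-z0-9._-]*/)?[a-z0-9][a-z0-9._-]*$")


def detect_target_type(target: str) -> str:
    """Classify a target string as 'url', 'repo', 'package', or 'unknown'."""
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https"):
        # Treat GitHub links and *.git URLs as repositories (assumed rule).
        if parsed.netloc == "github.com" or parsed.path.endswith(".git"):
            return "repo"
        return "url"
    # Explicit filesystem paths are treated as local source trees.
    if target.startswith(("./", "../", "/")):
        return "repo"
    if NPM_NAME.match(target):
        return "package"
    return "unknown"


for example in ["https://example.com/app", "https://github.com/org/project",
                "./src", "@scope/pkg", "express"]:
    print(example, "->", detect_target_type(example))
```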
Start locally.
Scale when it matters.
Local today. Cloud when it needs to run on a schedule.
npx pwnkit-cli