Let AI agents pentest your product
before attackers do.

Open-source AI pentesting. Real exploits, no false positives.

npx pwnkit-cli

Scan four ecosystems with one loop.

Web apps, LLM APIs, npm packages, and source — in parallel, every commit.

Illustration. Real scans land in the dashboard.

Find real CVEs in code that runs the internet.

Seven public advisories. The agent did the work.

55M+
weekly downloads · all found by the agent
  • paperclip · 60k ★ · GHSA-47wq-cj9q-wpmp · critical
  • jsPDF · 13M weekly · CVE-2026-31938 · critical
  • node-forge · 34M weekly · CVE-2026-33896 · high
  • mysql2 · 9.5M weekly · high
  • LiquidJS · 1.6M weekly · CVE-2026-30952 · high
  • Uptime Kuma · 152M pulls · CVE-2026-33130 · medium

95.2% on XBOW.

99 of 104 challenges. Open source. Single config. Every solve has a stored exploit log.

The benchmark's own author scored 85%. They raised ~$237M to do it.

XBOW · 104 challenges · published headline scores

  • 97.1% · 101 / 104
    Solo OSS · self-funded · Best-of-N · 10+ configs
  • 96.2% · 100 / 104
    Seed VC · undisclosed · White-box · modified fork
  • 95.2% · 99 / 104 · pwnkit
    Open source · self-funded · Audited · single config
  • 92.3% · 96 / 104
    Stealth · no public funding · Closed source
  • 85.0% · 88 / 104
    Unicorn · ~$237M · Microsoft-partnered · Built for own benchmark
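
The percentages in the leaderboard can be checked from the solve counts. This is a small sketch that recomputes them from the counts listed above, assuming every row is scored against the same 104 challenges.

```python
# Solve counts copied from the leaderboard rows, top to bottom.
TOTAL = 104
solved = [101, 100, 99, 96, 88]


def solve_rate(n_solved: int, total: int = TOTAL) -> float:
    """Percentage of challenges solved, rounded to one decimal place."""
    return round(100 * n_solved / total, 1)


rates = [solve_rate(n) for n in solved]
for n, r in zip(solved, rates):
    print(f"{n} / {TOTAL} = {r}%")
```

The first four rows reproduce the listed figures (97.1, 96.2, 95.2, 92.3). The last row works out to 84.6%, slightly under the 85.0% headline shown; the source does not say which of the two figures was rounded.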

Funding + methodology posture sourced from competitor sheet.

pnpm bench --agentic · Writeup · Source

Run the same loop a human pentester runs.

Recon to receipt. Every finding re-exploited before it ships.

1.0  Aim: URL, package, or repo.

2.0  Scan: Shell-first agent loop.

3.0  Triage: 11 layers kill false positives.

4.0  Verify: A second agent re-exploits, blind.

5.0  Ship: SARIF, JSON, GitHub Security.
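
The five steps above can be sketched as a filter pipeline. This is an illustrative sketch only, not pwnkit's implementation: the `Finding` type, the stage signatures, and the demo stubs are all hypothetical. It also assumes that "blind" means the verifier receives the target and the claim but not the first agent's exploit log.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str
    exploit_log: str  # evidence recorded by the scanning agent


# Hypothetical stage signatures, chosen for illustration.
Scanner = Callable[[str], list[Finding]]
TriageLayer = Callable[[Finding], bool]  # True means "keep"
Verifier = Callable[[str, str], bool]    # (target, title) -> reproduced?


def run_pipeline(target: str, scan: Scanner,
                 layers: list[TriageLayer], verify: Verifier) -> list[Finding]:
    candidates = scan(target)                                   # Scan
    triaged = [f for f in candidates
               if all(layer(f) for layer in layers)]            # Triage
    # Verify: the verifier sees only the target and the claim (assumed
    # meaning of "blind"), never the original exploit log.
    return [f for f in triaged if verify(target, f.title)]      # Ship these


# Demo stubs standing in for real agents.
def demo_scan(target: str) -> list[Finding]:
    return [
        Finding("SQL injection in /login", "high", "payload and response"),
        Finding("Missing security header", "info", ""),
        Finding("Reflected XSS in /search", "medium", "payload and response"),
    ]


def has_evidence(f: Finding) -> bool:
    return bool(f.exploit_log)


def demo_verify(target: str, title: str) -> bool:
    return "SQL" in title  # pretend only this one reproduced


confirmed = run_pipeline("https://example.com", demo_scan,
                         [has_evidence], demo_verify)
print([f.title for f in confirmed])
```

Each stage can only remove candidates, so anything that reaches the output has passed every triage layer and an independent reproduction.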

Architecture

Just give it a target.

pwnkit-cli express

Audit an npm package

pwnkit-cli ./my-repo

Review source code

pwnkit-cli https://api.com/chat

Scan an LLM API

pwnkit-cli https://example.com --mode web

Pentest a web app

pwnkit-cli dashboard

Local mission control

pwnkit-cli findings list --severity critical

Triage across scans
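
The commands above pass a bare package name, a local path, and URLs as the target. One way such input could be classified is sketched below. The function name, the returned labels, and the simplified npm-name pattern are assumptions for illustration; how pwnkit itself detects the target type is not described here.

```python
import re
from urllib.parse import urlparse

# Simplified npm name pattern: optional @scope/, then a lowercase name.
NPM_NAME = re.compile(r"^(@[a-z0-9][a-z0-9._-]*/)?[a-z0-9][a-z0-9._-]*$")


def detect_target_type(target: str) -> str:
    """Classify a target string as 'url', 'source', 'npm-package' or 'unknown'."""
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if target.startswith(("./", "../", "/", "~")):
        return "source"
    if NPM_NAME.match(target):
        return "npm-package"
    return "unknown"


for t in ["express", "./my-repo", "https://api.com/chat", "https://example.com"]:
    print(t, "->", detect_target_type(t))
```

A URL alone does not say whether it is an LLM API or a web app; the `--mode web` example above suggests that case can be disambiguated explicitly.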

Auto-detects target type. Drop the GitHub Action into CI for SARIF output.
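
For readers unfamiliar with SARIF, here is a minimal sketch of what a SARIF 2.1.0 document looks like when built from a list of findings. The input field names (`id`, `severity`, `title`, `file`, `line`), the tool name, and the severity-to-level mapping are assumptions for illustration, not pwnkit's actual output.

```python
import json

# Assumed mapping from scanner severities to SARIF result levels.
LEVELS = {"critical": "error", "high": "error", "medium": "warning", "low": "note"}


def to_sarif(findings: list[dict], tool_name: str = "example-scanner") -> dict:
    """Wrap findings in a minimal SARIF 2.1.0 log with a single run."""
    results = []
    for f in findings:
        results.append({
            "ruleId": f["id"],
            "level": LEVELS.get(f["severity"], "warning"),
            "message": {"text": f["title"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["file"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


# Hypothetical finding, made up for the example.
sarif = to_sarif([{
    "id": "EX001",
    "severity": "critical",
    "title": "Hypothetical command injection",
    "file": "src/server.js",
    "line": 42,
}])
print(json.dumps(sarif, indent=2))
```

GitHub code scanning accepts files in this format, which is why a CI step that emits SARIF can surface findings in the repository's Security tab.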

Start locally.
Scale when it matters.

Local today. Cloud when it needs to run on a schedule.

npx pwnkit-cli