Let autonomous AI agents hack you
so the real ones can't.

Fully autonomous agentic pentesting framework. Attacks LLM endpoints, web apps, npm packages, and source code. Blind PoC verification to minimize false positives.

npx pwnkit-cli

[Screenshot: pwnkit scanning a target and finding vulnerabilities]

Fully autonomous agentic pentesting.

LLM Endpoints

ChatGPT, Claude, Llama APIs, custom chatbots

Web Apps

CORS, headers, exposed files, fingerprinting

npm Packages

Supply chain, malicious code, dependency risk

Source Code

Local repos, GitHub URLs, deep AI audit

100% detection rate.

10/10 on AI/LLM security challenges. Verification is flag-based: extract the flag or fail. Testing on XBOW traditional web vulnerability challenges is in progress. Every finding is independently re-exploited to kill false positives.

Result  Challenge                  Difficulty
PASS    Direct Prompt Injection    easy
PASS    System Prompt Extraction   easy
PASS    PII Data Leakage           easy
PASS    Base64 Encoding Bypass     medium
PASS    DAN Jailbreak              medium
PASS    SSRF via MCP Tool          medium
PASS    Multi-Turn Escalation      hard
PASS    CORS Misconfiguration      easy
PASS    Sensitive Path Exposure    easy
PASS    Indirect Prompt Injection  hard

Detection: 10/10
Flag Extraction: 10/10
False Positives: 0

Run it yourself: pnpm bench --agentic · View benchmark source
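
The flag-based scoring described above can be sketched in a few lines. This is an illustrative TypeScript sketch, not pwnkit's benchmark code: the `Challenge` and `RunResult` shapes, the function names, and the flag format are all assumptions. The idea is that each challenge hides a secret flag, and a run passes only if the agent's output contains that exact string.

```typescript
// Hypothetical shapes for illustration; not pwnkit's actual benchmark types.
interface Challenge {
  name: string;
  flag: string; // secret planted in the target, e.g. inside a system prompt
}

interface RunResult {
  name: string;
  passed: boolean;
}

// A run passes only if the agent's output contains the exact flag.
// Claiming a "likely" vulnerability without producing the flag is a failure.
function scoreChallenge(challenge: Challenge, agentOutput: string): RunResult {
  const passed =
    challenge.flag.length > 0 && agentOutput.includes(challenge.flag);
  return { name: challenge.name, passed };
}

// Summarize a batch as "passed/total", in the same style as the scores above.
function summarize(results: RunResult[]): string {
  const passed = results.filter((r) => r.passed).length;
  return `${passed}/${results.length}`;
}
```

Because the check is a plain string match against a secret the agent cannot know in advance, a pass is hard to fake with a plausible-sounding report.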

Just give it a target.

pwnkit-cli express

Audit an npm package

pwnkit-cli ./my-repo

Review source code

pwnkit-cli https://api.com/chat

Scan an LLM endpoint

pwnkit-cli https://example.com --mode web

Pentest a web app

pwnkit-cli dashboard

Local mission control

pwnkit-cli findings list --severity critical

Triage across scans

Auto-detects target type. No subcommands needed for most targets.
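
One way target auto-detection like this could work is to classify the argument by its shape. The sketch below is a guess at such a heuristic, not pwnkit's actual logic: `classifyTarget` and `TargetKind` are hypothetical names, and the only input taken from the examples above is the `--mode web` override.

```typescript
// Hypothetical target kinds, named after the four target types listed above.
type TargetKind = "llm-endpoint" | "web-app" | "npm-package" | "source-code";

// Guess the target kind from the argument's shape alone.
// `mode` mirrors the `--mode web` flag shown above; the rest is assumed.
function classifyTarget(target: string, mode?: "web"): TargetKind {
  const isUrl = /^https?:\/\//i.test(target);
  if (isUrl) {
    // GitHub repository URLs are treated as source code to review.
    if (/^https?:\/\/github\.com\/[^/]+\/[^/]+/i.test(target)) {
      return "source-code";
    }
    // A URL alone cannot distinguish a chatbot API from a website,
    // so an explicit mode selects the web-app path.
    return mode === "web" ? "web-app" : "llm-endpoint";
  }
  // Filesystem paths (".", "./x", "../x", "/x", "~/x") point at a local repo.
  if (target === "." || target === ".." || /^(\.{1,2}\/|\/|~\/)/.test(target)) {
    return "source-code";
  }
  // Anything else is assumed to be an npm package name (optionally scoped).
  return "npm-package";
}
```

With these rules, `express` is treated as an npm package, `./my-repo` as source code, and `https://example.com` as a web app only when the mode is set to `web`.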

Why pwnkit

Zero config

No YAML. No Python. Just npx pwnkit-cli and you're running.

Blind verification

Every finding is independently re-exploited. Can't reproduce it? Killed as a false positive.

Bring your own AI

Use your own API key, or use Claude Code CLI / Codex CLI with your subscription. Any model, any provider.
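
The blind verification idea above (re-exploit every finding, drop what does not reproduce) can be sketched as a simple filter. This is an assumption-laden illustration, not pwnkit's pipeline: the `Finding` shape, the `Verifier` type, and the reading of "blind" as "the verifier sees only the PoC" are all hypothetical. A real verifier would be asynchronous and would hit the target; a synchronous callback keeps the sketch short.

```typescript
// Hypothetical finding shape; pwnkit's real data model may differ.
interface Finding {
  id: string;
  title: string;
  poc: string; // proof-of-concept payload or request
}

// The verifier receives only the PoC, never the original agent's reasoning,
// so it cannot be swayed by the claim it is checking (assumed meaning of "blind").
type Verifier = (poc: string) => boolean;

// Split findings into those that reproduce and those that do not.
function verifyFindings(findings: Finding[], reproduce: Verifier) {
  const confirmed: Finding[] = [];
  const rejected: Finding[] = [];
  for (const finding of findings) {
    let reproduced = false;
    try {
      reproduced = reproduce(finding.poc);
    } catch {
      // A verifier crash is not evidence of a vulnerability.
      reproduced = false;
    }
    (reproduced ? confirmed : rejected).push(finding);
  }
  return { confirmed, rejected };
}
```

Only the `confirmed` list would be reported; everything in `rejected` is treated as a false positive.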

CLI runs. Dashboard triages.

pwnkit-cli

The execution surface. Run scans, audits, and reviews from your terminal or CI. Resume interrupted scans. Replay attack chains. Export SARIF, JSON, Markdown, or HTML.

$ pwnkit-cli scan --target https://api.com
$ pwnkit-cli audit express --depth deep
$ pwnkit-cli review ./my-repo --diff-base main

pwnkit-cli dashboard

The operator surface. Local web UI for finding triage, evidence review, scan provenance, and human sign-off. Launch scans, manage the verification queue, and track finding families across runs.

$ pwnkit-cli dashboard
→ http://127.0.0.1:48123

How it compares

Feature pwnkit promptfoo (acquired by OpenAI) garak nuclei Semgrep
Autonomous multi-agent Agentic pipeline
Verification (no false positives) Re-exploits
LLM endpoint scanning
npm package audit Rules
Source code review AI-powered Rules
AI attack coverage 30+ agentic Partial Partial
Zero config npx YAML Python Templates Config
Independent Acquired VC-backed
Open source Apache-2.0 OpenAI-owned OSS MIT LGPL

Findings in GitHub's Security tab.

.github/workflows/pwnkit.yml
name: Security Pentest
on: [push, pull_request]

jobs:
  pwnkit:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # needed by upload-sarif to publish results
    steps:
      - uses: actions/checkout@v4
      - name: Run pwnkit
        uses: peaktwilight/pwnkit/action@v1
        with:
          target: ${{ secrets.STAGING_API_URL }}
      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: pwnkit-report/report.sarif
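
For readers unfamiliar with SARIF, the file uploaded in the last step is a JSON document in SARIF 2.1.0 format. The sketch below shows roughly what a minimal one looks like when built from a list of findings. It is not pwnkit's exporter: the `ReportFinding` shape, the severity names other than `critical`, and the severity-to-level mapping are assumptions made for illustration.

```typescript
// Hypothetical finding shape for illustration only.
interface ReportFinding {
  ruleId: string;
  message: string;
  severity: "critical" | "high" | "medium" | "low";
  file: string;
  line: number;
}

// SARIF result levels are "error", "warning", "note", or "none".
// This mapping from severities is an assumption, not a standard.
function toSarifLevel(
  severity: ReportFinding["severity"]
): "error" | "warning" | "note" {
  if (severity === "critical" || severity === "high") return "error";
  return severity === "medium" ? "warning" : "note";
}

// Build a minimal SARIF 2.1.0 log with one run and one result per finding.
function toSarif(findings: ReportFinding[]) {
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: "pwnkit" } },
        results: findings.map((f) => ({
          ruleId: f.ruleId,
          level: toSarifLevel(f.severity),
          message: { text: f.message },
          locations: [
            {
              physicalLocation: {
                artifactLocation: { uri: f.file },
                region: { startLine: f.line },
              },
            },
          ],
        })),
      },
    ],
  };
}
```

Serializing the returned object with `JSON.stringify` gives a file in the shape that code scanning ingests, with each result tied to a file and line.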

Dogfooding

pwnkit reviews its own source code

On every push via GitHub Actions

pwnkit runs pwnkit-cli review . on its own repository: the same agentic pipeline that found 7 CVEs, pointed at itself. If it finds something, you'll see it here.

Set it up on your repo in 2 minutes:

1. Add to your GitHub Actions workflow:

- run: npx pwnkit-cli review . --format json > pwnkit-report.json

2. Add the badge to your README:

[![pwnkit](https://pwnkit.com/badge/ORG/REPO)](https://pwnkit.com)

Built from real security research

7 CVEs found in packages with 40M+ weekly downloads.

node-forge: 32M/week
mysql2: 5M/week
Uptime Kuma: 86K stars
LiquidJS: CVE
jsPDF: 2 CVEs
picomatch: CVE
Full CVE writeups

Stop guessing.
Start proving.

pwnkit-cli https://api.example.com/chat
pwnkit-cli express
pwnkit-cli ./my-repo
pwnkit-cli https://github.com/org/repo