katana
Fast web crawler from ProjectDiscovery for collecting URLs and endpoints.
Quickstart
# Crawl single URL
katana -u https://target.com
# Crawl with JS rendering
katana -u https://target.com -headless
# Crawl list of URLs
katana -list urls.txt
# Pipe to nuclei
katana -u https://target.com -silent | nuclei -silent
Core Concepts
| Concept | Description |
| --- | --- |
| Crawling | Follow links and discover endpoints |
| Headless | Use a browser for JS-heavy sites |
| Scope | Control what gets crawled |
| Passive | Extract URLs without making requests to the target |
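The Passive concept maps to katana's passive mode; a minimal sketch, assuming a build that ships the -passive flag (v1.1.0+):
# Collect URLs from passive sources (Wayback, Common Crawl) without touching the target
katana -u https://target.com -passive -silent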
Syntax
katana -u <url> [options]
katana -list <file> [options]
Options
Input
| Option | Description |
| --- | --- |
| -u <url> | Single URL |
| -list <file> | URL list |
| (stdin) | Read URLs from piped stdin |
| -resume <file> | Resume from file |
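Piped stdin is the usual way to chain ProjectDiscovery tools; a quick sketch (resume.cfg is a placeholder file name):
# Feed URLs via stdin
cat urls.txt | katana -silent
# Resume an interrupted crawl from its resume file
katana -list urls.txt -resume resume.cfg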
Crawling
| Option | Description |
| --- | --- |
| -d <n> | Max depth (default 3) |
| -jc | Parse and crawl JS files |
| -ct <duration> | Max crawl duration per target |
| -kf <type> | Crawl known files (all, robots, sitemapxml) |
| -ef <ext> | Exclude extensions from output |
| -em <ext> | Match only given extensions |
| -fs <field> | Field scope: dn, rdn, fqdn (default rdn) |
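A sketch combining the crawling options above (values are illustrative; older builds take -ct in plain seconds rather than a duration):
# Depth 5, parse JS, cap the crawl at 10 minutes, skip static assets
katana -u https://target.com -d 5 -jc -ct 10m -ef png,jpg,gif,css,woff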
Headless
| Option | Description |
| --- | --- |
| -headless | Enable headless browser crawling |
| -hl | Shorthand for -headless |
| -sc | Use system Chrome |
| -xhr | Extract XHR requests |
| -ws | Extract WebSocket URLs |
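Headless flags are usually stacked; a sketch assuming Chrome/Chromium is installed locally:
# Render JS with the system Chrome and capture XHR endpoints
katana -u https://target.com -headless -sc -xhr -silent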
Scope
| Option | Description |
| --- | --- |
| -cs <regex> | In-scope URL regex to follow |
| -do | Display out-of-scope URLs |
| -fs <field> | Field scope: dn, rdn, fqdn (default rdn) |
| -sf <field> | Field(s) to store in per-host output |
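Field scope and regex scope can be combined or used separately; a sketch (the regex is illustrative):
# Exact hostname only
katana -u https://target.com -fs fqdn
# Follow only URLs matching a regex, but still display out-of-scope links found
katana -u https://target.com -cs ".*\.target\.com.*" -do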
Output
| Option | Description |
| --- | --- |
| -o <file> | Output file |
| -json | JSONL output (newer releases use -jsonl / -j) |
| -silent | Silent mode (results only) |
| -nc | No color |
| -v | Verbose |
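A common output combination; a sketch, hedged on the JSON flag name (newer builds use -jsonl / -j, older ones -json):
# Structured output to a file, quiet terminal
katana -u https://target.com -jsonl -o crawl.jsonl -silent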
Performance
| Option | Description |
| --- | --- |
| -c <n> | Concurrency: fetchers per target (default 10) |
| -p <n> | Parallelism: targets processed at once (default 10) |
| -rl <n> | Rate limit (requests/sec) |
| -timeout <sec> | Request timeout (default 10) |
| -retry <n> | Retries per request |
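A tuning sketch for a large target list (numbers are illustrative starting points):
# More fetchers per target, fewer parallel targets, capped request rate
katana -list urls.txt -c 20 -p 5 -rl 100 -timeout 5 -retry 1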
Request
| Option | Description |
| --- | --- |
| -H "Header: val" | Custom header/cookie (repeatable) |
| -proxy <url> | HTTP/SOCKS5 proxy |
| -xhr | XHR extraction (headless) |
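-H is repeatable, so auth headers and cookies can be sent together; a sketch with placeholder values:
# Authenticated crawl through a local intercepting proxy
katana -u https://target.com \
  -H "Authorization: Bearer TOKEN" \
  -H "Cookie: session=abc123" \
  -proxy http://127.0.0.1:8080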
Recipes
Basic Crawling
# Simple crawl
katana -u https://target.com
# Deeper crawl
katana -u https://target.com -d 5
# Silent output
katana -u https://target.com -silent
# Multiple targets
katana -list urls.txt -silent
JS-Heavy Sites
# Headless crawling
katana -u https://target.com -headless
# With XHR extraction
katana -u https://target.com -headless -xhr
# System Chrome
katana -u https://target.com -headless -sc
Endpoint Discovery
# Crawl + JS parsing
katana -u https://target.com -jc
# Crawl known files (robots.txt, sitemap.xml)
katana -u https://target.com -kf all
# Extract forms (emitted in JSONL output)
katana -u https://target.com -fx -jsonl
Scope Control
# Exact hostname only
katana -u https://target.com -fs fqdn
# Root domain + subdomains (default)
katana -u https://target.com -fs rdn
# Exclude file types
katana -u https://target.com -ef png,jpg,gif,css,woff
Pipeline Integration
# katana → nuclei
katana -u https://target.com -silent | nuclei -silent
# subfinder → httpx → katana
subfinder -d target.com -silent | httpx -silent | katana -silent
# katana → gf (pattern extract)
katana -u https://target.com -silent | gf xss
# Crawl and find params
katana -u https://target.com -silent | grep "?" | sort -u
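A useful follow-on: probe crawled endpoints with httpx and keep only live ones (assumes httpx is installed; -mc matches status codes):
# katana → httpx (keep endpoints answering 200)
katana -u https://target.com -silent | httpx -silent -mc 200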
API Endpoint Discovery
# Find API endpoints
katana -u https://target.com -silent | grep -E "/api/|/v[0-9]/"
# JSON output for parsing
katana -u https://target.com -json -o crawl.json
# Extract unique paths
katana -u https://target.com -silent | \
sed 's/\?.*//' | sort -u
Through Proxy
# Burp/Caido proxy
katana -u https://target.com -proxy http://127.0.0.1:8080
# With headers
katana -u https://target.com -H "Authorization: Bearer token"
Output & Parsing
# JSON output
katana -u https://target.com -json -o results.json
# Parse JSON
cat results.json | jq -r '.request.endpoint'
# Extract unique endpoints
katana -u https://target.com -silent | sort -u > endpoints.txt
# Filter by pattern
katana -u https://target.com -silent | grep -E "\.(php|asp|jsp)"
Troubleshooting
| Issue | Solution |
| --- | --- |
| Missing JS endpoints | Use -headless and -jc |
| Too slow | Reduce -d, increase -c |
| Stuck on a site | Set -ct to cap crawl duration |
| Scope issues | Check -fs / -cs settings |
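The slow/stuck rows usually combine into one tuned command; values are illustrative, and older builds take -ct in plain seconds:
# Shallower, faster crawl with a hard cap on crawl time
katana -u https://target.com -d 2 -c 25 -rl 150 -ct 5m -timeout 5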
References