katana v1.0.5 releases: next-generation crawling and spidering framework

by do son · Published January 15, 2023 · Updated December 16, 2024

Katana

A next-generation crawling and spidering framework

Features

Fast and fully configurable web crawling
Standard and Headless mode support
JavaScript parsing / crawling
Customizable automatic form filling
Scope control – Preconfigured field / Regex
Customizable output – Preconfigured fields
INPUT – STDIN, URL and LIST
OUTPUT – STDOUT, FILE, and JSON

Crawling Mode

Standard Mode

The standard crawling mode uses Go's standard HTTP library under the hood to handle HTTP requests and responses. This mode is much faster because it avoids browser overhead. However, it analyzes the HTTP response body as is, without any JavaScript or DOM rendering, so it can miss endpoints that appear only after DOM rendering, or asynchronous calls that complex web applications make in response to, for example, browser-specific events.
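To make that trade-off concrete, here is a minimal Go sketch of what analyzing a response body "as is" can look like. It is not katana's implementation: the extractLinks helper and the regular expression are assumptions made for illustration, and a real crawler would use a proper HTML parser. The sketch collects href and src values from raw HTML, so it cannot see an endpoint that only appears when a script runs.

```go
package main

import (
	"fmt"
	"regexp"
)

// linkAttr matches href="..." or src="..." attributes in raw HTML.
// A regular expression is a simplification chosen for brevity.
var linkAttr = regexp.MustCompile(`(?i)\b(?:href|src)\s*=\s*["']([^"']+)["']`)

// extractLinks (hypothetical helper) returns the href/src values found in
// the raw body, in document order. No JavaScript is executed.
func extractLinks(body string) []string {
	var links []string
	for _, m := range linkAttr.FindAllStringSubmatch(body, -1) {
		links = append(links, m[1])
	}
	return links
}

func main() {
	// The third line calls an API from script code; a raw-body scan
	// for attributes does not pick it up.
	body := `<a href="/login">Login</a>
<script src="/static/app.js"></script>
<script>fetch("/api/v1/users")</script>`

	fmt.Println(extractLinks(body))
}
```

The /api/v1/users call in the inline script is absent from the result, which is the kind of endpoint the headless mode is meant to catch.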

Headless Mode

Headless mode hooks internal headless calls to handle HTTP requests/responses directly within the browser context. This offers two advantages:

The HTTP fingerprint (TLS and user agent) fully identifies the client as a legitimate browser.
Better coverage, since endpoints are discovered by analyzing both the standard raw response, as in the previous mode, and the browser-rendered one with JavaScript enabled.

Headless crawling is optional and can be enabled using the -headless option.

Scope Control

Crawling can be endless if not scoped, so katana provides several options for defining the crawl scope.
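The feature list mentions two kinds of scope control, a preconfigured field and a regex. The Go sketch below shows one plausible shape for each check. The function names inHostScope and inRegexScope are hypothetical, and the host-suffix rule is an assumption about what a domain-based field could mean, not a description of katana's internals.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// inHostScope (hypothetical) reports whether rawURL's host equals root
// or is a subdomain of it. Unparseable URLs are treated as out of scope.
func inHostScope(rawURL, root string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	root = strings.ToLower(root)
	return host == root || strings.HasSuffix(host, "."+root)
}

// inRegexScope (hypothetical) reports whether rawURL matches pattern.
// An invalid pattern is treated as matching nothing.
func inRegexScope(rawURL, pattern string) bool {
	ok, err := regexp.MatchString(pattern, rawURL)
	return err == nil && ok
}

func main() {
	candidates := []string{
		"https://app.example.com/login",
		"https://example.org/login",
		"https://example.com/about",
	}
	for _, c := range candidates {
		fmt.Printf("%s host=%v regex=%v\n",
			c, inHostScope(c, "example.com"), inRegexScope(c, `/login$`))
	}
}
```

A crawler would run a check like this on every discovered URL before queueing it, which is what keeps the crawl from wandering onto unrelated hosts.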

Changelog v1.0.5

🎉 New Features

Added option to disable redirects with -disable-redirects by @ErikOwen in #630

   -dr, -disable-redirects  disable following redirects (default false)
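For readers curious what disabling redirects means at the HTTP client level, the sketch below uses Go's net/http, the library the standard mode is described as building on. Whether katana wires its flag this way is an assumption; newClient, statusFor, and newTestServer are illustrative names, and the local test server stands in for a real target.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newClient returns an HTTP client. When disableRedirects is true, the
// client hands back the 3xx response instead of following Location.
func newClient(disableRedirects bool) *http.Client {
	c := &http.Client{}
	if disableRedirects {
		c.CheckRedirect = func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		}
	}
	return c
}

// statusFor fetches target and returns the status code of the response
// the client ends up with.
func statusFor(c *http.Client, target string) (int, error) {
	resp, err := c.Get(target)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// newTestServer starts a local server where /old redirects to /new.
func newTestServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/old", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/new", http.StatusFound)
	})
	mux.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newTestServer()
	defer srv.Close()

	for _, disable := range []bool{false, true} {
		code, err := statusFor(newClient(disable), srv.URL+"/old")
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Printf("disableRedirects=%v status=%d\n", disable, code)
	}
}
```

With redirects followed, the client reports the status of the final page; with them disabled, it reports the redirect response itself, which is useful when the redirect is the finding of interest.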

🐞 Bug Fixes

Fixed issue with header in output with XHR requests by @ErikOwen in #631
Fixed issue with -aff config option by @dogancanbakir in #676
Fixed issue with -kf option by @dogancanbakir in #655
Fixed issue with blank field with empty value by @dogancanbakir in #687
