katana v1.0.5 releases: next-generation crawling and spidering framework
Katana
A next-generation crawling and spidering framework
Feature
- Fast And fully configurable web crawling
- Standard and Headless mode support
- JavaScript parsing / crawling
- Customizable automatic form filling
- Scope control – Preconfigured field / Regex
- Customizable output – Preconfigured fields
- INPUT – STDIN, URL and LIST
- OUTPUT – STDOUT, FILE, and JSON
Crawling Mode
Standard Mode
Standard crawling modality uses the standard go http library under the hood to handle HTTP requests/responses. This modality is much faster as it doesn’t have the browser overhead. Still, it analyzes HTTP responses body as is, without any javascript or DOM rendering, potentially missing post-dom-rendered endpoints or asynchronous endpoint calls that might happen in complex web applications depending, for example, on browser-specific events.
Headless Mode
Headless mode hooks internal headless calls to handle HTTP requests/responses directly within the browser context. This offers two advantages:
- The HTTP fingerprint (TLS and user agent) fully identify the client as a legitimate browser
- Better coverage since the endpoints are discovered analyzing the standard raw response, as in the previous modality, and also the browser-rendered one with javascript enabled.
Headless crawling is optional and can be enabled using -headless option.
Scope Control
Crawling can be endless if not scoped, as such katana comes with multiple support to define the crawl scope.
Changelog v1.0.5
🎉 New Features
-dr, -disable-redirects disable following redirects (default false)
🐞 Bug Fixes
- Fixed issue with header in output with XHR requests by @ErikOwen in #631
- Fixed with
-aff
config option by @dogancanbakir in #676 - Fixed issue with
-kf
option by @dogancanbakir in #655 - Fixed issue with blank field with empty value by @dogancanbakir in #687
Install & Use
Copyright (c) 2022 ProjectDiscovery