kiterunner: Contextual Content Discovery Tool


Kiterunner

For the longest time, content discovery has focused on finding files and folders. While this approach is effective for legacy web servers that host static files or respond with 3xx’s upon a partial path, it is no longer effective for modern web applications, specifically APIs.

Over time, a lot of effort has been invested in making content discovery tooling faster so that larger wordlists can be used; however, the art of content discovery itself has not been innovated upon.

Kiterunner is a tool that is capable of not only performing traditional content discovery at lightning-fast speeds but also bruteforcing routes/endpoints in modern applications.

Modern application frameworks such as Flask, Rails, Express, Django, and others follow the paradigm of explicitly defining routes which expect certain HTTP methods, headers, parameters, and values.

When using traditional content discovery tooling, such routes are often missed and cannot easily be discovered.

By collating a dataset of Swagger specifications and condensing it into our own schema, Kiterunner can use this dataset to bruteforce API endpoints by sending the correct HTTP method, headers, path, parameters, and values for each request it sends.
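
As a purely illustrative sketch of what this means in practice (the target host, parameter names and values below are made up), a route defined in a Swagger specification as POST /api/v1/user/create with a JSON body would be requested along these lines, rather than as a bare GET with an empty body:

POST /api/v1/user/create HTTP/1.1
Host: target.example.com
Content-Type: application/json

{"username": "string", "password": "string"}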

Swagger files were collected from a number of data sources, including an internet-wide scan for the 40+ most common Swagger paths. Other data sources included GitHub via BigQuery and APIs.guru.


Technical Features

Depth Scanning

A key feature of kiterunner is depth-based scanning, which attempts to handle wildcard detection when applications use virtual path-based routing. The depth defines how many directories deep the baseline checks are performed, e.g. given the following wordlist:

~/kiterunner $ cat wordlist.txt
/api/v1/user/create
/api/v1/user/delete
/api/v2/user/
/api/v2/admin/
/secrets/v1/
/secrets/v2/

  • At depth 0, only / would have baseline checks performed for wildcard detection.
  • At depth 1, /api and /secrets would have baseline checks performed, and these checks would be applied to routes under /api and /secrets respectively.
  • At depth 2, /api/v1, /api/v2, /secrets/v1 and /secrets/v2 would all have baseline checks performed.

By default, kr scan has a depth of 1, since from internal usage we’ve found this to be the most common depth at which virtual routing occurs. kr brute has a default depth of 0, as you typically don’t want this check performed against a static wordlist.

Naturally, increasing the depth will increase the accuracy of your scans; however, it also increases the number of requests to the target (# of baseline checks * # of baseline directories at that depth). Hence, we recommend staying at depth 1, and only going to depth 2 in rare cases.
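
As a rough worked example using the wordlist above (the number of baseline checks per directory is assumed here purely for the arithmetic): depth 1 yields 2 baseline directories (/api, /secrets) and depth 2 yields 4 (/api/v1, /api/v2, /secrets/v1, /secrets/v2), so if kiterunner performed, say, 3 baseline checks per directory, that would be 3 x 2 = 6 extra requests per host at depth 1 versus 3 x 4 = 12 at depth 2.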

Using Assetnote Wordlists

We provide inbuilt downloading and caching of wordlists from assetnote.io. You can use these with the -A flag, which receives a comma-delimited list of aliases or full names.

You can get a full list of all the Assetnote wordlists with kr wordlist list.

When used, the wordlists are cached in ~/.cache/kiterunner/wordlists and compiled from .txt -> .kite.
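
A sketch of typical usage (targets.txt and the apiroutes-210228 alias are placeholders for illustration; run kr wordlist list to see the aliases actually available to you):

~/kiterunner $ kr wordlist list
~/kiterunner $ kr scan targets.txt -A=apiroutes-210228 -x 5 -j 100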

Concurrency Settings/Going Fast

Kiterunner is made to go fast across a lot of hosts, but just because you can run kiterunner with 20,000 goroutines doesn’t mean it’s a good idea. Bottlenecks and performance degradation will occur at high thread counts due to more time being spent scheduling goroutines that are waiting on network IO, and on kernel context switching.

There are two main concurrency settings for kiterunner:

  • -x, --max-connection-per-host – the maximum number of open connections we can have per host, governed by one goroutine each. To avoid DoS’ing a host, we recommend keeping this low, in the range of 5-10. Depending on latency to the target, this will yield on average 1-5 requests per second per connection (200ms-1000ms per request).
  • -j, --max-parallel-hosts – the maximum number of hosts to scan at any given time, governed by one supervisor goroutine each.

Depending on the hardware you are scanning from, the “maximum” number of goroutines you can run optimally will vary. On an AWS t3.medium, we saw performance degradation above 2500 goroutines, meaning 500 hosts x 5 connections per host (2500 goroutines) yielded peak performance.
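
For example, the same 2500-goroutine budget can be split differently depending on whether you favour breadth across hosts or speed against each host (bearing in mind the 5-10 connections-per-host guidance above):

-x 5 -j 500    # 5 connections x 500 hosts = 2500 goroutines
-x 10 -j 250   # 10 connections x 250 hosts = 2500 goroutines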

We recommend against running kiterunner from your MacBook. Due to poor kernel optimisations for high IO counts and event-polling syscalls on macOS, we noticed substantially poorer (0.3-0.5x) performance compared to running kiterunner on a similarly configured Linux instance.

To maximise performance when scanning an individual target or a large attack surface, we recommend the following:

  • Spin up an EC2 instance in a similar geographic region/datacenter to the target(s) you are scanning
  • Perform some initial benchmarks against your target set with varying -x and -j options. A typical starting point is around -x 5 -j 100, moving -j upwards as your CPU usage and network performance permit (see the sketch below).
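
A sketch of such a benchmark run, stepping -j upwards between runs (targets.txt and the wordlist alias are placeholders, as above):

~/kiterunner $ kr scan targets.txt -A=apiroutes-210228 -x 5 -j 100
~/kiterunner $ kr scan targets.txt -A=apiroutes-210228 -x 5 -j 200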

Converting between file formats

Kiterunner will also let you convert between the schema JSON, a kite file, and a standard txt wordlist.
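
For instance (the kb convert subcommand and the extension-based format detection shown here are our assumption of the syntax; check kr --help for the exact invocation):

~/kiterunner $ kr kb convert wordlist.txt routes.kite
~/kiterunner $ kr kb convert routes.kite routes.json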

Download & Use

Copyright (C) 2021 minight