DumpsterDiver: search secrets in various filetypes
DumpsterDiver
DumpsterDiver is a tool for analyzing large volumes of files of various types in search of hardcoded secrets (e.g. AWS Access Key, Azure Shared Key or SSH keys), which it detects by measuring entropy. It also lets you create simple search rules with basic conditions (e.g. report only CSV files that contain at least 10 email addresses). The main goal of the tool is to detect potential secret leaks.
Key features:
- it uses Shannon Entropy to find private keys.
- it supports multiprocessing for analyzing files.
- it unpacks compressed archives (e.g. zip, tar.gz).
- it supports advanced search using simple rules (details below)
Understanding config.yaml file
No single tool fits everyone’s needs, and DumpsterDiver is no exception. In the config.yaml file you can customize the program to search for exactly what you want. Each setting is described below.
- logfile – specifies a file where logs should be saved.
- excluded – specifies file extensions which you want to omit during a scan. There is no point in searching for hardcoded secrets in picture or video files, right?
- min_key_length and max_key_length – specify the minimum and maximum length of the secret you’re looking for. Depending on your needs, this setting can greatly reduce the number of false positives. For example, an AWS secret has a length of 40 bytes, so if you set both min_key_length and max_key_length to 40, DumpsterDiver will analyze only 40-byte strings. However, it will then ignore longer strings such as an Azure shared key or a private SSH key. The default values are min_key_length = 40 and max_key_length = 80, which is quite general and can generate false positives.
- high_entropy_edge – if the entropy of an analyzed string is equal to or higher than high_entropy_edge, the string is reported as a high-entropy string. The default value high_entropy_edge = 4.3 should work in most cases; however, if you’re getting too many false positives, it is worth trying to increase this value.
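To make the interplay of the length window and the entropy threshold concrete, here is a minimal Python sketch. It is not DumpsterDiver's actual implementation: the function names shannon_entropy and is_candidate_secret are hypothetical, and the entropy is computed per character over the whole string, which may differ from how the tool counts it. Only the three default values come from the description above.

```python
import math
from collections import Counter

# Defaults as described for config.yaml above.
MIN_KEY_LENGTH = 40
MAX_KEY_LENGTH = 80
HIGH_ENTROPY_EDGE = 4.3


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # H = -sum(p * log2(p)) over the relative frequency p of each character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_candidate_secret(token: str) -> bool:
    """Hypothetical helper: apply the length window, then the entropy threshold."""
    if not MIN_KEY_LENGTH <= len(token) <= MAX_KEY_LENGTH:
        return False
    return shannon_entropy(token) >= HIGH_ENTROPY_EDGE
```

Under these assumptions, a 40-character string made of one repeated short pattern falls inside the length window but is rejected for low entropy, while a string of 40 distinct characters passes both checks. Raising HIGH_ENTROPY_EDGE narrows the set of reported strings further.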
Advanced search:
DumpsterDiver also supports an advanced search. Beyond simple grepping with wildcards, the tool lets you create conditions. Let’s assume you’re searching for a leak of corporate emails and you’re interested only in big leaks that contain at least 100 email addresses. For this purpose, you should edit the ‘rules.yaml’ file in the following way:
Let’s assume a different scenario: you’re looking for the terms “pass”, “password”, “haslo”, “hasło” (if you’re analyzing a Polish company’s repository) in a .db or .sql file. You can achieve this by modifying the ‘rules.yaml’ file in the following way:
Note that the rule will be triggered only when the total weight (filetype_weight + grep_words_weight) is >=10.
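The weight condition can be illustrated with a small Python sketch. This is an assumption-laden model, not the tool's code or its rules.yaml format: the RULE dictionary, its filetype and grep_words keys, the weights of 5, and the helper names rule_weight and rule_triggers are all hypothetical. Only the names filetype_weight and grep_words_weight and the threshold of 10 come from the note above.

```python
import os

# Hypothetical rule mirroring the .db/.sql password scenario described above.
RULE = {
    "filetype": [".db", ".sql"],
    "filetype_weight": 5,
    "grep_words": ["pass", "password", "haslo", "hasło"],
    "grep_words_weight": 5,
}

# The rule fires only when the total weight reaches this value.
TRIGGER_THRESHOLD = 10


def rule_weight(path: str, content: str, rule: dict) -> int:
    """Sum the weights of the conditions that the file satisfies."""
    weight = 0
    extension = os.path.splitext(path)[1].lower()
    if extension in rule["filetype"]:
        weight += rule["filetype_weight"]
    lowered = content.lower()
    if any(word in lowered for word in rule["grep_words"]):
        weight += rule["grep_words_weight"]
    return weight


def rule_triggers(path: str, content: str, rule: dict) -> bool:
    """Return True when filetype_weight + grep_words_weight is >= 10."""
    return rule_weight(path, content, rule) >= TRIGGER_THRESHOLD
```

With these illustrative weights, a file must match both the extension and at least one search term to be reported. Setting one weight to 10 and the other to 0 would make that single condition sufficient on its own.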
Download
git clone https://github.com/securing/DumpsterDiver.git
cd DumpsterDiver
pip install -r requirements.txt
Use
Demo
DumpsterDiver from Pawel Rzepa on Vimeo.
Copyright (c) 2017 JP
Source: https://github.com/securing/