pagodo v2.6 releases: Automate Google Hacking Database scraping
pagodo (Passive Google Dork) – Automate Google Hacking Database scraping
The goal of this project was to develop a passive Google dork script to collect potentially vulnerable web pages and applications on the Internet. It has two parts: ghdb_scraper.py retrieves Google dorks, and pagodo.py uses the dorks gathered by ghdb_scraper.py to run the searches.
What are Google Dorks?
The awesome folks at Offensive Security maintain the Google Hacking Database (GHDB) found here: https://www.exploit-db.com/google-hacking-database. It is a collection of Google searches, called dorks, that can be used to find potentially vulnerable boxes or other juicy info that is picked up by Google’s search bots.
Changelog v2.6
- Bumped yagooglesearch to version 1.9.0
Installation
git clone https://github.com/opsdisk/pagodo.git
cd pagodo
pip install -r requirements.txt
Usage
ghdb_scraper.py
To start off, pagodo.py needs a list of all the current Google dorks. Unfortunately, the entire database cannot be easily downloaded. A couple of older projects did this, but their code was slightly stale and not multi-threaded, so collecting ~3800 Google dorks would take a long time. ghdb_scraper.py is the resulting Python script.
ghdb_scraper.py Execution Flow
The flow of execution is pretty simple:
- Fill a queue with Google dork numbers to retrieve based off a range
- Worker threads retrieve the dork number from the queue, retrieve the page using urllib2, then process the page to extract the Google dork using the BeautifulSoup HTML parsing library
- Print the results to the screen and optionally save them to a file (to be used by pagodo.py for example)
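The flow above can be sketched roughly as follows. This is a minimal illustration, not the actual ghdb_scraper.py code: fake_fetch stands in for the real HTTP request so the sketch runs offline, a regular expression stands in for BeautifulSoup, and the page markup, function names, and parameters are all hypothetical.

```python
import queue
import re
import threading


def extract_dork(html):
    """Pull the dork text out of a page (hypothetical markup, not the real GHDB layout)."""
    match = re.search(r'<h1 class="dork">(.*?)</h1>', html, re.S)
    return match.group(1).strip() if match else None


def scrape_range(min_id, max_id, fetch_page, num_threads=3):
    """Retrieve dorks numbered min_id..max_id (inclusive) using worker threads."""
    # Step 1: fill a queue with the dork numbers to retrieve.
    jobs = queue.Queue()
    for dork_id in range(min_id, max_id + 1):
        jobs.put(dork_id)

    results = {}
    lock = threading.Lock()

    def worker():
        # Step 2: each worker pulls a number, fetches the page, extracts the dork.
        while True:
            try:
                dork_id = jobs.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            try:
                dork = extract_dork(fetch_page(dork_id))
            except Exception:
                dork = None  # skip pages that fail to download or parse
            if dork:
                with lock:
                    results[dork_id] = dork

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Return the dorks ordered by their number.
    return [results[dork_id] for dork_id in sorted(results)]


def fake_fetch(dork_id):
    """Stand-in for a real HTTP request; number 7 simulates a missing entry."""
    if dork_id == 7:
        return "<html>missing</html>"
    return f'<h1 class="dork">intitle:"demo {dork_id}"</h1>'


# Step 3: print the results (they could also be written to a file for pagodo.py).
dorks = scrape_range(5, 9, fake_fetch, num_threads=3)
for dork in dorks:
    print(dork)
```

Passing the fetch function in as a parameter keeps the threading logic separate from the network code, so the same loop can be exercised without touching the real site.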
ghdb_scraper.py Switches
The script’s switches are self-explanatory.
To run it:
python ghdb_scraper.py -n 5 -x 3785 -s -t 3
pagodo.py
Now that a file with the most recent Google dorks exists, it can be fed into pagodo.py using the -g switch to start collecting potentially vulnerable public applications. pagodo.py leverages the google Python library to search Google for sites with the Google dork, such as:
intitle:"ListMail Login" admin -demo
The -d switch can be used to specify a domain and functions as the Google search operator:
site:example.com
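As a rough illustration of how a domain restriction combines with a dork, the sketch below prepends a site: operator to the query string. The build_query helper is hypothetical and only shows the idea; how pagodo.py actually assembles its queries is not described here.

```python
def build_query(dork, domain=None):
    """Prefix the dork with a site: operator when a domain is given (hypothetical helper)."""
    dork = dork.strip()
    if domain:
        return f"site:{domain} {dork}"
    return dork


# Restrict the example dork to a single domain.
query = build_query('intitle:"ListMail Login" admin -demo', "example.com")
print(query)
```

Without a domain, the dork is passed through unchanged and the search covers every indexed site.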
pagodo.py Switches
The script’s switches are self-explanatory.
To run it:
python pagodo.py -d example.com -g dorks.txt -l 50 -s -e 35.0 -j 1.1
Copyright (C) opsdisk
Source: https://github.com/opsdisk/