uddup: URLs Deduplication Tool
The tool takes a list of URLs and removes “duplicate” pages, meaning URLs whose patterns are probably repetitive and point to the same web template.
For example:
https://www.example.com/product/123
https://www.example.com/product/456
https://www.example.com/product/123?is_prod=false
https://www.example.com/product/222?is_debug=true
All of the above probably point to the same product “template”. Therefore it should be enough for our various scanners to scan only some of these URLs.
The result of the above after UDdup should be:
https://www.example.com/product/123?is_prod=false
https://www.example.com/product/222?is_debug=true
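The example above can be approximated with a short Python sketch. This is not UDdup's actual implementation; it is a minimal illustration under two assumptions: numeric path segments are treated as interchangeable IDs, and within one path pattern a URL is kept only if it contributes a distinct set of query-parameter names. The names `path_pattern` and `dedup_urls` are hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl


def path_pattern(path):
    """Replace purely numeric path segments with a placeholder (assumed heuristic)."""
    return "/".join("{id}" if seg.isdigit() else seg for seg in path.split("/"))


def dedup_urls(urls):
    """Keep one URL per (host, path pattern, set of query-parameter names).

    If a pattern has any URL with query parameters, URLs of that pattern
    without parameters are dropped, since the parameterized ones already
    cover the same template.
    """
    # (host, path pattern) -> {frozenset of parameter names: first URL seen}
    groups = {}
    for url in urls:
        parts = urlsplit(url)
        key = (parts.netloc, path_pattern(parts.path))
        params = frozenset(
            name for name, _ in parse_qsl(parts.query, keep_blank_values=True)
        )
        groups.setdefault(key, {}).setdefault(params, url)

    result = []
    for by_params in groups.values():
        with_query = [url for params, url in by_params.items() if params]
        # Fall back to the parameterless URL only when nothing else exists.
        result.extend(with_query or by_params.values())
    return result
```

Under these assumptions, the four example URLs collapse to the two URLs that carry query parameters, which matches the result listed above.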
Why do I need it?
Mostly for a better (automated) reconnaissance process, with less noise (for both the tester and the target).
Use
Filter Paths by Regex
Allows filtering out custom path patterns. For example, to filter out all paths that start with /product, run:
uddup -u demo.txt -fp "^product"
Input:
https://www.example.com/
https://www.example.com/privacy-policy
https://www.example.com/product/1
https://www.example2.com/product/2
https://www.example3.com/product/4
Output:
https://www.example.com/
https://www.example.com/privacy-policy
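The filtering step can be sketched in Python as follows. This is an illustrative approximation, not the tool's own code: the function name `filter_paths` is hypothetical, and it assumes the regex is applied to the URL path with the leading slash removed, which is consistent with a pattern such as `^product` matching `/product/1`.

```python
import re
from urllib.parse import urlsplit


def filter_paths(urls, pattern):
    """Drop every URL whose path (without the leading slash) matches `pattern`."""
    regex = re.compile(pattern)
    kept = []
    for url in urls:
        # Assumption: the pattern is tested against "product/1", not "/product/1".
        path = urlsplit(url).path.lstrip("/")
        if not regex.search(path):
            kept.append(url)
    return kept
```

Because only the path is inspected, the `/product` URLs on `example2.com` and `example3.com` are removed along with the one on `example.com`, as in the output above.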
Advanced Regex with multiple path filters
uddup -u demo.txt -fp "(^product)|(^category)"
Install
Copyright (c) 2020 Rotem Reiss