route leaks: BGP Route Leak Detection
Route Leak Detection
Overview
This project helps detect BGP full view route leaks. A route leak occurs when an AS advertises, usually by accident, prefixes it learned from others. Because the BGP routing protocol does not include any security mechanism, the leak propagates. It can have a major impact, since traffic is directed toward the AS that made the leak instead of the legitimate AS. This phenomenon has been observed several times over the past years, sometimes impacting thousands of ASes. To identify route leaks, we defined a new methodology and applied it to 2014, 2015 and 2016 BGP data. It successfully detected 6, 36 and 22 route leaks, respectively, which represents tens of thousands of leaked prefixes each year. In 2016, we detected more than 11,000 conflicts due to route leaks.
The motivation, methodology, and results are presented in the article Detecting Route Leak at Scale. This repository contains three different methods for route leak detection and the input data for 2014, 2015 and 2016.
It contains the following directories:
- data:
  Contains input data for 2014, 2015 and 2016.
- functional_testing:
  Some internal tools we use for testing.
- src:
  Contains the source code of the different implementations:
  - route_leaks_detection:
    Two proposed new methods to detect route leaks, one based on heuristics, the other on machine learning. A detailed description and usage are provided below.
  - deroleru:
    The Rust implementation of the heuristics algorithm, used to speed up the process. The Rust module is used directly in the Python code of route_leaks_detection. A dedicated README explains how it can be used on its own.
  - related_work_implem:
    A proposed implementation of an algorithm presented by researchers from the University of Arizona at NANOG49 (https://www.nanog.org/meetings/nanog49/presentations/Tuesday/LRL-NANOG49.pdf).
Installation
Prerequisites: the project has been tested on Debian and Ubuntu. The following packages are needed:
- python-dev python-pip virtualenv make
- To take advantage of the faster implementation, Rust should be installed:
curl https://sh.rustup.rs -sSf | sh
A Makefile is supplied to ease installation. It creates a virtual environment (in the directory env), installs the Python packages in it, and installs the Rust package.
Make sure you run the code using the virtualenv:
make
source env/bin/activate
export PYTHONPATH=$PWD
Note: If Rust is not installed, you should run make python instead of make.
Route Leak Detection
This is the main package of the project. It contains two methods to detect route leaks: heuristics and classification (machine learning).
Input data
The input data is available for 2014, 2015 and 2016 in the data directory.
For each year, two files are needed:
- conflicts: (one time series per AS) number of ASes in conflict per day
- prefixes: (one time series per AS) number of prefixes announced per day
They are created using TaBi and MaBo.
Lexicon
pfx: short for prefixes, refers to the BGP prefixes announced.
cfl: short for conflicts, refers to the BGP announcements in conflict with other announcements.
Testing
For both methods, unit tests and functional tests are available.
Heuristics unit tests:
python -m pytest src/route_leaks_detection/heuristics/
Machine Learning unit tests:
python -m pytest src/route_leaks_detection/classification/
Functional tests:
python -m pytest src/route_leaks_detection/tests
You can also run all of them using:
make test
Heuristics
Applies heuristics to data from the prepare_data module.
The algorithm looks for simultaneous peaks in the data. It is based on the following steps:
For each AS
- find peaks in prefixes data
- find peaks in conflicts data
- check if peaks found for prefixes and conflicts happen on the same day
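The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code shipped in this repository. The function names (`find_peaks`, `simultaneous_peaks`, `detect_leaks`), the `ratio` parameter, the peak criterion (a local maximum at least `ratio` times the series median) and the in-memory layout (a dict mapping an AS number to a list of daily counts) are all hypothetical.

```python
from typing import Dict, List, Sequence


def find_peaks(series: Sequence[float], ratio: float = 5.0) -> List[int]:
    """Return the indices of peaks in a daily time series.

    Hypothetical criterion: a day is a peak if its value is strictly greater
    than both neighbours and at least `ratio` times the series median.
    """
    if len(series) < 3:
        return []
    median = sorted(series)[len(series) // 2]
    baseline = max(median, 1.0)  # avoid a zero baseline for quiet ASes
    peaks = []
    for i in range(1, len(series) - 1):
        is_local_max = series[i] > series[i - 1] and series[i] > series[i + 1]
        if is_local_max and series[i] >= ratio * baseline:
            peaks.append(i)
    return peaks


def simultaneous_peaks(
    prefixes: Sequence[float], conflicts: Sequence[float], ratio: float = 5.0
) -> List[int]:
    """Return the days on which both series show a peak."""
    pfx_peaks = set(find_peaks(prefixes, ratio))
    cfl_peaks = set(find_peaks(conflicts, ratio))
    return sorted(pfx_peaks & cfl_peaks)


def detect_leaks(
    pfx_by_as: Dict[int, Sequence[float]],
    cfl_by_as: Dict[int, Sequence[float]],
    ratio: float = 5.0,
) -> Dict[int, List[int]]:
    """For each AS, keep the days where prefix and conflict peaks coincide."""
    leaks = {}
    for asn, pfx in pfx_by_as.items():
        cfl = cfl_by_as.get(asn)
        if cfl is None:
            continue  # no conflicts series for this AS
        days = simultaneous_peaks(pfx, cfl, ratio)
        if days:
            leaks[asn] = days
    return leaks


# Made-up data using AS numbers reserved for documentation.
pfx_by_as = {64500: [10] * 19 + [500] + [10] * 5, 64501: [20] * 25}
cfl_by_as = {64500: [1] * 19 + [40] + [1] * 5, 64501: [0] * 10 + [30] + [0] * 14}
print(detect_leaks(pfx_by_as, cfl_by_as))  # {64500: [19]}
```

In this made-up data, AS 64500 shows a peak in both series at index 19 and is reported. AS 64501 has a conflicts peak without a matching prefixes peak and is ignored.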
Example of simultaneous peaks, hence considered as a full view leak (index 19):
How to use:
There are two options: FindRouteLeaks (from the detect_route_leaks module) or FittedFindRouteLeaks (from the detect_route_leaks_optimized module).
The first uses default parameters, while the second calculates them from the data. FindRouteLeaks is faster, but its results may be less interesting because its parameters are not tuned to the data.
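To illustrate the trade-off between fixed and data-derived parameters, here is a minimal, generic sketch. It does not reproduce how FittedFindRouteLeaks actually computes its parameters. The constant `DEFAULT_THRESHOLD`, the functions `fitted_threshold` and `days_above`, and the "mean plus k standard deviations" rule are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence

DEFAULT_THRESHOLD = 100.0  # hypothetical fixed default, not taken from the project


def fitted_threshold(series: Sequence[float], k: float = 3.0) -> float:
    """Hypothetical data-driven threshold: mean plus k standard deviations."""
    return mean(series) + k * pstdev(series)


def days_above(series: Sequence[float], threshold: float) -> List[int]:
    """Return the indices of the days whose value exceeds the threshold."""
    return [day for day, value in enumerate(series) if value > threshold]


# Made-up series with a modest spike at index 19.
series = [10] * 19 + [60] + [10] * 5

print(days_above(series, DEFAULT_THRESHOLD))         # []
print(days_above(series, fitted_threshold(series)))  # [19]
```

The fixed threshold needs no extra pass over the data but misses the modest spike, while the fitted threshold adapts to the scale of the series at the cost of computing statistics first.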