binbloom
The purpose of this project is to analyze raw binary firmware and automatically determine some of its features. The tool is compatible with all architectures, since it only computes simple statistics on the binary.
In order to compute the loading address, you will need the help of an external reverse engineering tool to extract a list of potential functions, before using binbloom.
Main features:
- Loading address: binbloom can parse a raw binary firmware and determine its loading address.
- Endianness: binbloom can use heuristics to determine the endianness of a firmware.
- UDS Database: binbloom can parse a raw binary firmware and check if it contains an array containing UDS command IDs.
Install
git clone https://github.com/quarkslab/binbloom.git
cd binbloom
mkdir build
cd build
cmake ..
make
make install
Use
Determine the endianness
binbloom -f firmware.bin -e
This command should give an output like this:
In this output, the last line is the most important one as it gives the result of the analysis. The other lines are information about the number of unique pointers and the number of array elements binbloom has been able to find in the firmware, both in big-endian and in little-endian mode. These lines can provide useful information to corroborate the heuristic used to determine the endianness.
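To give an intuition for why counting pointers can reveal the endianness, here is a minimal Python sketch. It is not binbloom's actual heuristic: it assumes 32-bit aligned pointers and simply counts the unique words sharing the most common upper 16 bits in each mode, on the assumption that real pointers into one memory region share their high bytes. The function names are hypothetical.

```python
import struct
from collections import Counter


def pointer_like_count(firmware: bytes, little_endian: bool) -> int:
    """Count unique 32-bit words that share the most common upper 16 bits."""
    fmt = "<I" if little_endian else ">I"
    usable = len(firmware) - len(firmware) % 4
    words = {struct.unpack_from(fmt, firmware, off)[0]
             for off in range(0, usable, 4)}
    # Padding values carry no information about pointers.
    words -= {0x00000000, 0xFFFFFFFF}
    if not words:
        return 0
    # 'words' is a set, so each prefix count is a count of unique values.
    prefixes = Counter(w >> 16 for w in words)
    return prefixes.most_common(1)[0][1]


def guess_endianness(firmware: bytes) -> str:
    """Return 'little' or 'big', whichever mode yields more pointer-like words."""
    le = pointer_like_count(firmware, little_endian=True)
    be = pointer_like_count(firmware, little_endian=False)
    return "little" if le >= be else "big"
```

When the wrong endianness is used, the bytes that vary from pointer to pointer end up in the upper half of each word, so far fewer words share a common prefix.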
Determine the loading address
First, you have to provide a file containing a list of potential function addresses, in hexadecimal (one per line), like this:
This file should be named after the firmware itself, followed by the “.fun” extension.
This file can be generated with the tag_code() function of the provided tag_code.py Python script, using IDA Pro:
- Load the firmware in IDA Pro at address 0 (select the correct architecture/endianness)
- From the File menu, choose Script File and select tag_code.py
- In the console at the bottom of IDA Pro, run tag_code(). The functions file is generated automatically.
If you prefer to use another tool to generate the functions file, you can do it as long as you load the firmware at address 0 (i.e. the hex values in the functions file correspond to offsets in the firmware).
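If your tool can export function offsets as integers, a small Python helper along these lines could write the functions file. This is only a sketch: the helper names are hypothetical, and the exact line format (8 hex digits, no 0x prefix) is an assumption that you should compare against a file produced by tag_code.py.

```python
def format_fun_lines(offsets):
    """Render offsets as one hexadecimal value per line (assumed format)."""
    # Deduplicate and sort so the file is stable across runs.
    return "".join(f"{off:08x}\n" for off in sorted(set(offsets)))


def write_fun_file(offsets, fun_path):
    """Write the functions file; offsets are relative to a load address of 0."""
    with open(fun_path, "w") as fh:
        fh.write(format_fun_lines(offsets))
```

Choose fun_path according to the naming rule above, i.e. the firmware name followed by the ".fun" extension.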
You can then ask binbloom to compute a list of potential loading addresses. It does so by computing a correlation score between the potential functions and the arrays of function pointers found in the firmware:
binbloom -f firmware.bin -b
This command should give an output like this:
In this output, we can see that of the 14903 potential functions provided, 1545 were found in arrays of function pointers when the program assumes that the loading address is 0x80010000.
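The idea behind this score can be sketched in a few lines of Python. This is a simplified illustration, not binbloom's implementation: it assumes 32-bit aligned pointers and a base address aligned to 0x1000, and it considers every word in the firmware rather than only arrays of function pointers. All names are hypothetical.

```python
import struct
from collections import Counter


def read_words(firmware: bytes, little_endian: bool = True) -> set:
    """Return the set of aligned 32-bit words found in the firmware."""
    fmt = "<I" if little_endian else ">I"
    usable = len(firmware) - len(firmware) % 4
    return {struct.unpack_from(fmt, firmware, off)[0]
            for off in range(0, usable, 4)}


def guess_base(firmware: bytes, func_offsets, little_endian=True, align=0x1000):
    """Return (base, score) for the base address that matches most functions."""
    words = read_words(firmware, little_endian)
    votes = Counter()
    for word in words:
        for func in set(func_offsets):
            # If 'word' points to 'func', the firmware is loaded at word - func.
            base = word - func
            if base >= 0 and base % align == 0:
                votes[base] += 1
    return votes.most_common(1)[0] if votes else None
```

The score of a candidate base is the number of provided function offsets for which a matching pointer exists, which corresponds to the "1545 of 14903" figure reported above.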
If there are several sections in the binary firmware, binbloom lists the different sections with the corresponding guess for the loading address:
Here we have a section of code at address 0x00000000, and another one at 0x00040000.
binbloom generates two output files:
- firmware.fad : This file contains the addresses of identified functions
- firmware.fpt : This file contains the addresses of the pointers to the identified functions
You can now start IDA Pro again (or any reverse engineering software), load the firmware at the specified address and import the addresses of the 1545 identified functions:
- Load the firmware in IDA Pro at the specified address (in this example 0x80010000)
- From the File menu, choose Script File and select import_entry_points.py
- Select the .fad file
- Select the .fpt file
Note:
binbloom starts by determining the endianness, as this information is needed to look for arrays of function pointers. If the automatic endianness analysis is wrong, you can override its result with one of the following options:
-E b: force big-endian mode
-E l: force little-endian mode
Find the UDS database (for an ECU’s firmware)
binbloom can search for an array containing UDS/KWP2000 IDs, with the -u option:
binbloom -f firmware.bin -u
This command should give an output like this:
This output shows that a potential UDS database was found at address 0x1234 with a stride of 12 (meaning that the UDS IDs are stored in an array in which each element is 12 bytes long). In this example, the UDS IDs are in the first column (10, 11, 22, 27, 28, 2e, 31, 34, 36, 37, 3e and 85).
The list of supported UDS IDs is hard-coded in binbloom.c. You can change it if needed.
This analysis is based on heuristics, so it can produce false positives. You have to review the list of potential UDS databases found by binbloom and determine which one, if any, is correct.
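The notion of a stride can be illustrated with a rough Python sketch. It is not binbloom's algorithm: it simply looks for runs of distinct known IDs spaced a fixed number of bytes apart. The ID set below is taken from the example above, not from binbloom.c, and the function name and thresholds are hypothetical.

```python
# IDs taken from the example above; binbloom's own list may differ.
UDS_IDS = {0x10, 0x11, 0x22, 0x27, 0x28, 0x2E, 0x31, 0x34, 0x36, 0x37, 0x3E, 0x85}


def find_uds_tables(data: bytes, strides=range(2, 33), min_entries=6):
    """Return (offset, stride, entries) for each run of distinct UDS IDs."""
    hits = []
    for stride in strides:
        for start in range(len(data)):
            seen = set()
            pos = start
            # Walk the candidate array one element at a time.
            while pos < len(data) and data[pos] in UDS_IDS and data[pos] not in seen:
                seen.add(data[pos])
                pos += stride
            if len(seen) >= min_entries:
                hits.append((start, stride, len(seen)))
    return hits
```

A scan like this also reports partial overlaps of the same table (for example a later start offset, or a multiple of the real stride), which is one reason the results need manual review.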
In this example, we can see that each line contains a little-endian pointer (26 27 00 80 for the first line, which corresponds to address 0x80002726). There is probably a function at this address that handles UDS command 10. To confirm this, disassemble the code and search for cross-references to this UDS database.
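As a quick check of this conversion, the four bytes can be decoded in Python. The helper name is hypothetical and only illustrates the byte order:

```python
def decode_pointer(raw: bytes, little_endian: bool = True) -> int:
    """Interpret raw bytes as an unsigned pointer value."""
    return int.from_bytes(raw, "little" if little_endian else "big")


address = decode_pointer(bytes.fromhex("26 27 00 80"))
print(hex(address))  # 0x80002726
```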
Copyright 2020 G. Heilles
Copyright 2020 Quarkslab
Source: https://github.com/quarkslab/