pigaios: A tool for diffing source code directly against binaries
Pigaios
Pigaios (‘πηγαίος’, Greek for ‘source’ as in ‘source code’) is a tool for diffing/matching source code directly against binaries. The idea is to point the tool at a code base, whether or not it compiles (for example, partial source code or source code for platforms you do not have at hand), extract information from that code base and then import function names (symbols), structures and enumerations into an IDA database. It uses the Python Clang bindings (which are very limited, but still better than using pycparser).
Basically, the tool does the following:
- Parse C source code and get artifacts from the Abstract Syntax Tree (AST) of each function.
- Export from IDA databases the same kind of data that is extracted from the C source code.
- Find matches between the artifacts found in the C source code and those found in the IDA databases.
- After an initial set of matches with no false positives is found, find more matches from the callgraph.
- Rate the matches using both an “expert system” and a “machine learning” based system.
- Also, import into the IDA database all the required structures and enumerations of a given code base (something not trivial in IDA).
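The matching and callgraph steps above can be illustrated with a minimal sketch. It is not Pigaios's actual code or data model: the dictionary layout, the function names (`unique_artifact_matches`, `propagate_through_callgraph`) and the sample data are all hypothetical, and the real tool uses more heuristics than this. The sketch assumes each function has been reduced to a set of artifacts (string and numeric constants) plus a list of callees. It first pairs functions whose artifact sets are identical and unique on both sides, then extends those matches through the callgraph.

```python
from collections import defaultdict


def unique_artifact_matches(src_funcs, bin_funcs):
    """Pair functions whose artifact set is identical and unique on both sides."""
    def index(funcs):
        by_artifacts = defaultdict(list)
        for name, info in funcs.items():
            key = frozenset(info["artifacts"])
            if key:  # functions without artifacts cannot be matched this way
                by_artifacts[key].append(name)
        return by_artifacts

    src_index, bin_index = index(src_funcs), index(bin_funcs)
    matches = {}
    for key, src_names in src_index.items():
        bin_names = bin_index.get(key, [])
        # Requiring uniqueness on both sides keeps the seed set conservative.
        if len(src_names) == 1 and len(bin_names) == 1:
            matches[src_names[0]] = bin_names[0]
    return matches


def propagate_through_callgraph(src_funcs, bin_funcs, matches):
    """If a matched pair has exactly one unmatched callee on each side, pair them."""
    matches = dict(matches)
    changed = True
    while changed:
        changed = False
        matched_bin = set(matches.values())
        for src_name, bin_name in list(matches.items()):
            src_calls = src_funcs.get(src_name, {}).get("calls", [])
            bin_calls = bin_funcs.get(bin_name, {}).get("calls", [])
            src_left = sorted({c for c in src_calls if c not in matches})
            bin_left = sorted({c for c in bin_calls if c not in matched_bin})
            if len(src_left) == 1 and len(bin_left) == 1:
                matches[src_left[0]] = bin_left[0]
                matched_bin.add(bin_left[0])
                changed = True
    return matches


# Hypothetical sample data: source functions on one side, IDA-style names on the other.
src_funcs = {
    "parse_header": {"artifacts": {"bad magic", 0x7F454C46}, "calls": ["read_u32"]},
    "read_u32": {"artifacts": set(), "calls": []},
    "log_error": {"artifacts": {"error: %s"}, "calls": []},
}
bin_funcs = {
    "sub_401000": {"artifacts": {"bad magic", 0x7F454C46}, "calls": ["sub_401200"]},
    "sub_401200": {"artifacts": set(), "calls": []},
    "sub_401400": {"artifacts": {"error: %s"}, "calls": []},
}

initial = unique_artifact_matches(src_funcs, bin_funcs)
final = propagate_through_callgraph(src_funcs, bin_funcs, initial)
```

In this example `read_u32` has no artifacts of its own, so the first pass cannot match it. It is only recovered in the second pass, because it is the single unmatched callee of an already matched pair.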
Copyright (C) 2018 joxeankoret