Reko v0.8.1.0 released: a binary decompiler

reko

Reko (Swedish: “decent, obliging”) is a C# project containing a decompiler for machine code binaries. This project is freely available under the GNU General Public License.

The project consists of front ends, a core decompiler engine, and back ends. Command-line, Windows GUI, and ASP.NET front ends exist at the time of writing. The decompiler engine receives its input from the front ends as either individual executable files or decompiler project files. Reko project files contain additional information about a binary file that helps the decompilation process or controls the formatting of the output. The decompiler engine then proceeds to analyze the input binary.

Reko has the ambition of supporting decompilation of various processor architectures and executable file formats with minimal user intervention.

Reko is built around a central .NET assembly, Reko.Decompiler.dll, which contains the core logic. Leaving aside the user interface for a moment, Reko can at a glance be considered a pipeline. The first stage of the pipeline loads the executable we wish to decompile. Later stages perform different kinds of analyses, extracting information from the machine language where they can and aggregating it into structured information (such as Procedures and data types). The final stage is the output stage, where the source code is emitted into files.
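The pipeline idea can be illustrated with a minimal sketch. This is not Reko's actual C# code: the `Program` record, the `scan` and `emit` stages, and the "every 0x55 byte starts a procedure" rule are all hypothetical stand-ins chosen only to show how stages hand structured information to one another.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Program:
    """Hypothetical record of what the stages have learned so far."""
    image: bytes
    procedures: List[str] = field(default_factory=list)
    source_lines: List[str] = field(default_factory=list)


# A stage takes the program state and returns it, possibly enriched.
Stage = Callable[[Program], Program]


def scan(program: Program) -> Program:
    # Stand-in analysis (an assumption for illustration): treat every
    # 0x55 byte as the start of a procedure.
    for offset, byte in enumerate(program.image):
        if byte == 0x55:
            program.procedures.append(f"fn_{offset:04X}")
    return program


def emit(program: Program) -> Program:
    # Output stage: turn the structured information into source text.
    program.source_lines = [f"void {name}() {{ }}" for name in program.procedures]
    return program


def run_pipeline(image: bytes, stages: List[Stage]) -> Program:
    # The "load" stage is reduced here to wrapping the raw bytes.
    program = Program(image=image)
    for stage in stages:
        program = stage(program)
    return program


result = run_pipeline(bytes([0x55, 0x90, 0xC3, 0x55, 0xC3]), [scan, emit])
print(result.source_lines)
```

Because each stage has the same shape, a new analysis can be slotted in between `scan` and `emit` without touching the other stages.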

A central tenet is that Reko is extensible: wherever possible, we strive to avoid hard-coding knowledge about specific platforms, processors, or file formats in the core decompiler. Instead, such special knowledge is farmed out to separate assemblies. Examples:

  • Reko.Arch.X86.dll – provides support for disassembling Intel X86 binaries.
  • Reko.ImageLoaders.MzExe.dll – understands how to load MS-DOS executable files and all related formats.
  • Reko.ImageLoaders.Elf.dll – understands the ELF executable file format.
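One way to picture this separation is a registry that the core consults without knowing any format details itself. The sketch below is an assumption-laden illustration, not Reko's plug-in mechanism: `register_loader`, `find_loader`, and the loader names are hypothetical. Only the magic numbers are real facts (ELF files begin with the bytes 0x7F "ELF", and MS-DOS executables begin with "MZ").

```python
from typing import Callable, List, Optional, Tuple

# Each entry pairs a format-detection predicate with a loader name.
# The core only iterates over this list; it contains no format knowledge.
_loaders: List[Tuple[Callable[[bytes], bool], str]] = []


def register_loader(name: str, matches: Callable[[bytes], bool]) -> None:
    """Called by a plug-in to announce a file format it understands."""
    _loaders.append((matches, name))


def find_loader(image: bytes) -> Optional[str]:
    """Return the name of the first registered loader that accepts the image."""
    for matches, name in _loaders:
        if matches(image):
            return name
    return None


# Format-specific knowledge lives with the plug-ins, not the core.
register_loader("elf", lambda b: b.startswith(b"\x7fELF"))
register_loader("mz-exe", lambda b: b.startswith(b"MZ"))

print(find_loader(b"\x7fELF\x01\x01"))
print(find_loader(b"MZ\x90\x00"))
print(find_loader(b"\x00\x00"))
```

Supporting a new file format then means registering one more entry, with no change to `find_loader`.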

The ImageMap data structure keeps track of the address space of a binary that Reko is analyzing. A binary, after being loaded into memory, consists of one or more ImageSegments, each of which has a starting address and a size (or extent). Each image segment also has a reference to the MemoryArea that contains its bytes. You can think of the ImageSegments as functioning like “windows” onto a MemoryArea. Note that there is a one-to-many relation between MemoryAreas and ImageSegments: one memory area can back several segments. A typical PE or ELF executable will have logically distinct segments and memory areas:

+---------------+                     +---------------+--------------+
| .text section |                     | .data section | .bss section |
| 0x00401000    |                     | 0x00402000    | 0x00402800   |
+---------------+                     +---------------+--------------+
       ↓                                      ↓              ↓
+---------------+                     +------------------------------+
| Memory area 1 |                     | Memory area 2                |
| 0x00401000    |                     | 0x00402000                   |
+---------------+                     +------------------------------+

while an MS-DOS executable will typically be loaded into a single memory area, and all of its segments will refer to that same memory area:

+----------------+     +----------------+       +----------------+
| segment 0x0810 |     | segment 0x0CFA |       | segment 0x15AD |
+----------------+     +----------------+       +----------------+
        ↓                      ↓                        ↓
+----------------------------------------------------------------+
| Memory image                                                   |
| 0x0810:0000                                                    |
+----------------------------------------------------------------+
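The "window" relationship drawn above can be sketched in a few lines. The class and field names below mirror the terms used in the text, but the exact fields, the `contains` and `read_byte` helpers, and the 0x800-byte section sizes are assumptions made for illustration; they are not Reko's actual C# definitions.

```python
from dataclasses import dataclass


@dataclass
class MemoryArea:
    """A contiguous run of bytes loaded at a base address."""
    base_address: int
    data: bytes


@dataclass
class ImageSegment:
    """A named window onto part (or all) of a MemoryArea."""
    name: str
    address: int
    size: int
    mem_area: MemoryArea

    def contains(self, addr: int) -> bool:
        return self.address <= addr < self.address + self.size

    def read_byte(self, addr: int) -> int:
        if not self.contains(addr):
            raise ValueError(f"{addr:#010x} is outside segment {self.name}")
        # Translate the absolute address into an offset within the area.
        return self.mem_area.data[addr - self.mem_area.base_address]


# "Memory area 2" from the first diagram: .data and .bss share one area.
# The sizes and fill bytes are made up for the example.
area2 = MemoryArea(0x00402000, b"\x11" * 0x800 + bytes(0x800))
data_seg = ImageSegment(".data", 0x00402000, 0x800, area2)
bss_seg = ImageSegment(".bss", 0x00402800, 0x800, area2)

print(hex(data_seg.read_byte(0x00402000)))
print(bss_seg.read_byte(0x00402800))
print(data_seg.mem_area is bss_seg.mem_area)
```

The MS-DOS case is the same structure with every segment pointing at one large memory area.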

In addition to maintaining a map of all the segments, the ImageMap also maintains in its Items dictionary the locations of any identified items as Reko performs its analysis. Some executable image file formats contain information like the entry points or symbols identifying machine language procedures or data locations. Reko can use this information to populate the image map before starting its scanning phase. The scanning phase will discover more items and add them to the ImageMap. The user interface also allows the user to add items that Reko can’t discover itself.
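The way items accumulate from different sources can be sketched as follows. This is a hypothetical model, not Reko's implementation: the `ImageMapItem` fields, the `source` labels, and the `add_item` / `find_item` methods are assumptions; only the idea of an address-keyed `items` dictionary comes from the description above.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ImageMapItem:
    address: int
    size: int
    kind: str    # e.g. "procedure" or "data" (illustrative labels)
    source: str  # who added it: "symbol", "scanner", or "user"


class ImageMap:
    def __init__(self) -> None:
        self.items: Dict[int, ImageMapItem] = {}
        self._sorted_addrs: List[int] = []  # kept sorted for range lookups

    def add_item(self, item: ImageMapItem) -> None:
        if item.address not in self.items:
            bisect.insort(self._sorted_addrs, item.address)
        self.items[item.address] = item

    def find_item(self, addr: int) -> Optional[ImageMapItem]:
        """Return the item whose address range covers addr, if any."""
        i = bisect.bisect_right(self._sorted_addrs, addr) - 1
        if i < 0:
            return None
        item = self.items[self._sorted_addrs[i]]
        return item if addr < item.address + item.size else None


image_map = ImageMap()
# 1. Seeded from symbols found in the executable file.
image_map.add_item(ImageMapItem(0x00401000, 0x40, "procedure", "symbol"))
# 2. Discovered during the scanning phase.
image_map.add_item(ImageMapItem(0x00401040, 0x20, "procedure", "scanner"))
# 3. Added by hand through the user interface.
image_map.add_item(ImageMapItem(0x00402000, 4, "data", "user"))

print(image_map.find_item(0x00401050).source)
print(image_map.find_item(0x00401060))
```

Keeping a sorted list of start addresses next to the dictionary lets a lookup by arbitrary address use binary search rather than a scan over every item.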

Changelog v0.8.1.0

It’s been a long time since the last release, but a lot has been happening in Reko-land. This release contains:

  • Support for PIC16 and PIC18 architectures (with thanks to @chostelet)
  • Much better support for Intel HEX files (with thanks to @chostelet)
  • Moved to C# 7.2, .NET Framework 4.7, and NUnit 3.0 (with thanks to @chostelet)
  • Improvements in indirect call handling (with thanks to @ptomin)
  • Support for adding persistent user annotations / comments (with thanks to @ptomin)
  • Backward propagation of stack pointer from procedure end recovers more frame-based variables (with thanks to @ptomin)
  • C parser improvements (with thanks to @ptomin)
  • Improvements in MacOS classic binaries (with thanks to @gbody)
  • Handled compressed A5World segments in MacOS classic binaries (with thanks to @gbody)
  • Refactored Reko.Gui.dll to remove Windows Forms dependencies in preparation for GUI front end independence (with thanks to @smx-smx)
  • Xbox 360 XEX Loader (with thanks to @smx-smx)
  • PowerPC VMX instructions (with thanks to @smx-smx)
  • Improvements in X86 disassembler robustness (with thanks to @smx-smx)
  • Generation of for-loops from while- and do-loops (with thanks to @rfalke)
  • Initial support for ARM AArch64, IBM zSeries, Intel 8051, Motorola 6800, MSP430, RISC-V, SuperH, TMS 7000, and Xtensa architectures.
  • Support for the PowerPC 750 instruction set.
  • Improved disassembler performance by about 20% across all architectures.
  • Improvements in the shingled scanner heuristic, reducing false positives.
  • Much better support for indirect jump / call target resolution.
  • Improved handling of ELF PLT entries.
  • Fusion of MIPS-style unaligned loads and stores.
  • Support for 16-, 80-, and 128-bit IEEE 754 floating point numbers.
  • Moved from Rhino.Mocks to Moq mock framework.
  • Support for more than one architecture per decompiled binary.
  • Support for platform-specific memory maps.
  • Memory visualizer control shows heat maps of binaries.
  • Command line switch --scan-only.
  • Added NamingPolicy classes to abstract automatic naming of variables, procedures, and types.
  • Lots and lots of small bug fixes.

Download && Tutorial

Copyright (C) 2017 uxmal
