TScopy: Python script used to parse the NTFS $MFT file
Introducing TScopy
It is a requirement during an Incident Response (IR) engagement to have the ability to analyze files on the filesystem. Sometimes these files are locked by the operating system (OS) because they are in use, which is particularly frustrating with event logs and registry hives. TScopy allows the user, who is running with administrator privileges, to access locked files by parsing out their raw location in the filesystem and copying them without asking the OS.
There are other tools that perform similar functions, such as RawCopy, which we have used and is the basis for this tool. However, there are some disadvantages to RawCopy that led us to develop TScopy, including performance, size, and the ability to incorporate it in other tools.
This blog is intended to introduce TScopy but also to ask for assistance. As in all software development, the more a tool is used, the more edge cases can be found. We are asking that people try out the tool and report any bugs.
What is TScopy?
TScopy is a Python script used to parse the NTFS $MFT file to locate and copy specific files. By parsing the Master File Table (MFT), the script bypasses operating system locks on files. The script was originally based on the work of RawCopy. RawCopy is written in AutoIT and is difficult to modify for our purposes. The decision to port RawCopy to Python was done because of the need to incorporate this functionality natively into our toolset.
TScopy is designed to be run as a standalone program or included as a python module. The python implementation makes use of the python-ntfs tools found at https://github.com/williballenthin/python-ntfs. TScopy built upon the base functionality of python-ntfs to isolate the location of each file from the raw disk.
What makes TScopy different?
TScopy is written in Python and organized into classes to make it more maintainable and readable than AutoIT. AutoIT can be flagged as malicious by anti-virus or detection software because some malware has utilized its potential.
The major difference between TScopy and RawCopy is the ability to copy multiple files per execution and to cache the file structure. As shown in the image below, TScopy has options to download a single file, multiple comma-delimited files, the contents of a directory, wildcarded paths (individual files or directories), and recursive directories.
It caches the location of each directory and file as it iterates the target file’s full path. It then uses this cache to optimize the search for any other files, ensuring future file copies are performed much faster. This is a significant advantage over RawCopy, which iterates over the entire path for each file.