statiStrings: YARA Rule Strings Statistics Calculator

statiStrings is a strings statistics calculator for YARA rules.

The goal is to aid malware research by:

  • Finding common and unique strings within malware samples
  • Finding common strings within clean files
  • Saving time by finding the common characteristics of malware samples automatically

This tool helps you write better, more precise YARA rules for malware detection and malware hunting, based on custom databases of malicious and clean files.

Given a YARA rule and a directory of files, the tool reports how prevalent each of the rule's strings is among the files in that directory that match the rule.
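The counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the code of statiStrings.py: it assumes the yara-python bindings (imported as yara), assumes each string is counted at most once per file, and does not descend into subdirectories. The function names string_ids, count_prevalence and scan_directory are made up for this sketch.

```python
from collections import Counter
from pathlib import Path


def string_ids(match):
    """Return the set of string identifiers (e.g. '$s_tskill') hit in one match."""
    ids = set()
    for s in match.strings:
        # Newer yara-python versions yield objects with an .identifier attribute;
        # older versions yield (offset, identifier, data) tuples.
        ids.add(s.identifier if hasattr(s, "identifier") else s[1])
    return ids


def count_prevalence(per_file_ids):
    """Count in how many files each identifier appears (once per file at most)."""
    counts = Counter()
    for ids in per_file_ids:
        counts.update(set(ids))
    return dict(counts)


def scan_directory(rule_path, directory):
    """Compile the rule, scan every file in the directory, return the counts."""
    import yara  # imported lazily so the helpers above work without it

    rules = yara.compile(filepath=str(rule_path))
    per_file = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        ids = set()
        for match in rules.match(str(path)):
            ids |= string_ids(match)
        if ids:  # keep only files that matched the rule
            per_file.append(ids)
    return count_prevalence(per_file)
```

Calling scan_directory with a rule file and a samples directory would return a dictionary mapping each string identifier to the number of matched files that contain it.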

Install

git clone https://github.com/Sh3llyR/statiStrings.git
pip install yara

Use

Usage example

Researching common strings in malicious batch scripts: First, I wrote a YARA rule containing many commands found in malicious scripts, with the very generic condition “any of them”. Then I ran this tool with that rule against a directory of malicious scripts (shown in the following example). Finally, I ran it against a directory of clean scripts. After going through the results for both the clean and the malicious scripts, I was able to:

  1. Group the strings of the YARA rule into suspicious ($s_…), for example, tskill, and noisy ($n_…), for example, echo.
  2. Create a condition for my rule that catches the malicious samples but not the clean samples, minimizing false positives.
  • python statiStrings.py -y .\batch_commands.yar -d .\batch_samples -t s
  • Results:
     {'$s_ren': 1, '$n_set': 8, '$s_mem': 1, '$s_reg_add': 8, '$s_taskkill': 4, '$n_exit': 9, '$s_maybe_block_sites_hosts_file': 1, '$s_move': 2, '$s_attrib': 6, '$n_copy': 6, '$n_start': 10, '$n_type': 7, '$n_echo': 26, '$n_reg': 11, '$s_aes': 1, '$s_cscript': 1, '$s_change_mouse_settings': 1, '$n_net': 3, '$n_find': 6, '$s_infinite_loop': 2, '$s_shutdown': 9, '$n_del': 6, '$n_goto': 12, '$s_generic_bat_maybe_copy_itself': 5, '$n_ipconfig': 2, '$n_maybe_time_change': 5, '$n_system': 2, '$s_tskill': 3, '$s_cpu_damage': 1, '$s_erase': 3, '$s_make_random_folders': 1, '$s_sleep': 4, '$n_bat_maybe_copy_itself': 9}
    
    Number of files scanned: 157
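The comparison between the malicious and the clean run was done by hand, but the same reasoning can be sketched in Python. The following is only an illustration under stated assumptions: split_strings and the max_clean_rate threshold are hypothetical and not part of statiStrings, and the clean-directory numbers in the usage lines are invented placeholders. Only the malicious counts (3 for $s_tskill, 26 for $n_echo, 157 files) come from the results above.

```python
def split_strings(malicious, clean, n_malicious, n_clean, max_clean_rate=0.05):
    """Split string identifiers into (suspicious, noisy) lists.

    A string is treated as suspicious if it appears in at least one malicious
    file and in no more than max_clean_rate of the clean files; everything
    else is treated as noisy. The threshold is an arbitrary example value.
    """
    if n_malicious <= 0 or n_clean <= 0:
        raise ValueError("file counts must be positive")
    suspicious, noisy = [], []
    for ident in sorted(set(malicious) | set(clean)):
        malicious_rate = malicious.get(ident, 0) / n_malicious
        clean_rate = clean.get(ident, 0) / n_clean
        if malicious_rate > 0 and clean_rate <= max_clean_rate:
            suspicious.append(ident)
        else:
            noisy.append(ident)
    return suspicious, noisy


# Malicious counts taken from the results above (157 files scanned).
malicious_counts = {"$s_tskill": 3, "$n_echo": 26}
# Hypothetical clean-directory counts, invented for illustration only.
clean_counts = {"$n_echo": 40}

suspicious, noisy = split_strings(malicious_counts, clean_counts, 157, 100)
print(suspicious)  # ['$s_tskill']
print(noisy)       # ['$n_echo']
```

A fixed threshold is a crude stand-in for manual review. In practice the cut-off depends on how many false positives are acceptable and on how representative the clean set is.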

     

  • python statiStrings.py -y .\batch_commands.yar -d .\batch_samples -t p
  • Results:

Author: Shelly Raban

Source: https://github.com/Sh3llyR/