MADLIRA: Malware detection using learning and information retrieval for Android
Overview
MADLIRA is a tool for Android malware detection. It consists of two components: a TFIDF component and an SVM learning component. In general, it takes as input a set of malicious and benign applications and then either extracts the malicious behaviors (TFIDF component) or computes a training model (SVM classifier). It then uses this knowledge to detect malicious behaviors in Android applications.
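To give a rough intuition for the TFIDF idea mentioned above, the following is a minimal sketch of generic TF-IDF weighting in Python. It is not MADLIRA's implementation: the representation of an application as a flat list of API-call names, the function name tfidf, and the sample traces are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(documents)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical API-call traces, one list per application (illustrative only).
apps = [
    ["getDeviceId", "sendTextMessage", "openConnection"],  # imagined malicious app
    ["openConnection", "setContentView"],                  # imagined benign app
    ["openConnection", "getLastKnownLocation"],            # imagined benign app
]
weights = tfidf(apps)
```

In this sketch a call that occurs in every application (openConnection) receives weight zero, while calls that occur only in the first application receive a positive weight, which is the property that makes such terms candidates for "characteristic" behavior.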
Installing
Download the file MADLIRA.7z and decompress it.
Installed Data:
- MADLIRA.jar is the main application.
- noAPI.txt declares the prefix of APIs.
- family.txt lists malware by family.
- Folder TrainData contains the training configuration and training model.
- Folder Samples contains sample data.
- Folder TempData contains data for kernel computation.
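After decompressing the archive, the layout listed above can be verified with a short script. This is only a sketch: the file and folder names come from the list above, while the function name missing_entries and the idea of checking the layout at all are additions for illustration.

```python
from pathlib import Path

# Names taken from the "Installed Data" list.
EXPECTED_FILES = ["MADLIRA.jar", "noAPI.txt", "family.txt"]
EXPECTED_DIRS = ["TrainData", "Samples", "TempData"]

def missing_entries(install_dir):
    """Return the expected files and folders that are absent from install_dir."""
    root = Path(install_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

Calling missing_entries(".") from the directory where the archive was extracted should return an empty list if everything is in place.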
Functionality
This tool has two main components: the TFIDF component and the SVM component.
TFIDF component
Command: MADLIRA TFIDF
This component has two functions: the training function (malicious behavior extraction) and the test function (malicious behavior detection).
Malicious behavior extraction
- Collect benign applications and malicious applications and put them in folders named benignApkFolder and maliciousApkFolder, respectively.
- Prepare the training data by packing it into two files named benignPack and maliciousPack with the command:
MADLIRA TFIDF packAPK -PB benignApkFolder -B benignPack -PM maliciousApkFolder -M maliciousPack
- Extract malicious behaviors from the two packed files (benignPack and maliciousPack) by using the command:
MADLIRA TFIDF train -B benignPack -M maliciousPack
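The two training steps above can be assembled programmatically, for example when they need to be repeated for several data sets. The sketch below only builds the argument lists; the sub-commands and flags are those documented above, but the launcher java -jar MADLIRA.jar is an assumption (this README writes the command simply as MADLIRA), and the helper names are hypothetical.

```python
import shlex

# Assumption: the tool is started through the JVM. Adjust LAUNCHER if the
# distribution provides a different entry point (e.g. a wrapper script).
LAUNCHER = ["java", "-jar", "MADLIRA.jar"]

def pack_command(component, benign_dir, benign_pack, malicious_dir, malicious_pack):
    """Build the documented packAPK call for the given component (TFIDF or SVM)."""
    return LAUNCHER + [component, "packAPK",
                       "-PB", benign_dir, "-B", benign_pack,
                       "-PM", malicious_dir, "-M", malicious_pack]

def train_command(component, benign_pack, malicious_pack):
    """Build the documented train call for the given component."""
    return LAUNCHER + [component, "train", "-B", benign_pack, "-M", malicious_pack]

pack = pack_command("TFIDF", "benignApkFolder", "benignPack",
                    "maliciousApkFolder", "maliciousPack")
train = train_command("TFIDF", "benignPack", "maliciousPack")
for cmd in (pack, train):
    print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

Keeping the commands as lists rather than strings avoids quoting problems if a folder name contains spaces.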
Malicious behavior detection
- Collect new applications and put them in a folder named checkApk.
- Detect malicious behaviors of applications in the folder checkApk by using the command:
MADLIRA TFIDF check -S checkApk
Examples:
Training new data:
- First, collect training applications (APK files) and store them in folders named MalApkFolder and BenApkFolder.
- Pack training applications into archive files named MalPack and BenPack by using this command:
MADLIRA TFIDF packAPK -PB BenApkFolder -B BenPack -PM MalApkFolder -M MalPack
- Clear old training data:
MADLIRA TFIDF clear
- Compute the malicious graphs from the training packs (BenPack and MalPack):
MADLIRA TFIDF train -B BenPack -M MalPack
Checking new applications:
- Put these applications in a folder named checkApk and use this command:
MADLIRA TFIDF check -S checkApk
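The example sequence above (pack, clear, train, check) could be scripted roughly as follows. This is a sketch under assumptions: the launcher java -jar MADLIRA.jar is a guess at how the jar is invoked, and the function names and the dry-run switch are hypothetical. Only the sub-commands and flags are taken from the examples.

```python
import subprocess

# Assumption: MADLIRA.jar is run through the JVM from the install directory.
LAUNCHER = ["java", "-jar", "MADLIRA.jar"]

def tfidf_workflow(ben_dir, ben_pack, mal_dir, mal_pack, check_dir):
    """Return the four documented TFIDF steps as argument lists, in order."""
    return [
        LAUNCHER + ["TFIDF", "packAPK", "-PB", ben_dir, "-B", ben_pack,
                    "-PM", mal_dir, "-M", mal_pack],
        LAUNCHER + ["TFIDF", "clear"],
        LAUNCHER + ["TFIDF", "train", "-B", ben_pack, "-M", mal_pack],
        LAUNCHER + ["TFIDF", "check", "-S", check_dir],
    ]

def run_all(steps, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for step in steps:
        print(" ".join(step))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(step, check=True)

steps = tfidf_workflow("BenApkFolder", "BenPack", "MalApkFolder", "MalPack", "checkApk")
run_all(steps)  # dry run: prints the commands without executing them
```

Running with dry_run=False executes the steps one after another, so training does not start if packing fails.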
SVM component
Command: MADLIRA SVM
For this component, there are two functions: the training function and the test function.
Training phase
- Collect benign applications in a folder named benignApkFolder and malicious applications in a folder named maliciousApkFolder.
- Prepare the training data by using the command:
MADLIRA SVM packAPK -PB benignApkFolder -B benignPack -PM maliciousApkFolder -M maliciousPack
- Compute the training model by using the command:
MADLIRA SVM train -B benignPack -M maliciousPack
Malicious behavior detection
- Collect new applications and put them in a folder named checkApk.
- Detect malicious behaviors of applications in the folder checkApk by using the command:
MADLIRA SVM check -S checkApk
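Before running the detection step it can help to confirm that the folder actually contains applications. The sketch below builds the documented SVM check call only if at least one .apk file is present. The launcher java -jar MADLIRA.jar, the function name, and the pre-check itself are assumptions and not part of the tool.

```python
from pathlib import Path

# Assumption: MADLIRA.jar is run through the JVM from the install directory.
LAUNCHER = ["java", "-jar", "MADLIRA.jar"]

def svm_check_command(check_dir):
    """Build the documented `SVM check` call, failing early on an empty folder."""
    # Note: the pattern is case-sensitive on most Linux file systems.
    apks = sorted(p.name for p in Path(check_dir).glob("*.apk"))
    if not apks:
        raise ValueError(f"no .apk files found in {check_dir!r}")
    return LAUNCHER + ["SVM", "check", "-S", str(check_dir)]
```

The returned list can be passed to subprocess.run in the same way as any other command.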
Copyright (C) 2018 dkhuuthe
Source: https://github.com/dkhuuthe/