ELF Miner

This is an approximate implementation of the ELF Miner framework described in this paper. Unlike the paper, this model classifies a given ELF file as Malware or Benign but does not further classify the malware into the five types the paper describes, because of limitations of the dataset used.

Requirements

Usage

Put the ELF files to be analyzed into the folder elfs.
Install the dependencies with pip install -r requirements.txt
From the root of the project, run python run_system.py

This prints the predicted class (Malware or Benign) for each ELF file, in the same order as the files appear in the generated final.csv in the elfs folder.
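Because the predictions are matched to final.csv only by position, it can help to pair them up explicitly. The following is a minimal sketch, not part of run_system.py: the column name filename, the feature column e_shnum, and the sample rows are all hypothetical, since the layout of final.csv is not described here.

```python
import csv
import io


def pair_predictions(csv_text, predictions, name_column="filename"):
    """Pair each CSV row's file name with the prediction at the same position.

    name_column is a hypothetical column name; adjust it to the real header.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if len(rows) != len(predictions):
        raise ValueError("row count and prediction count differ")
    return [(row[name_column], label) for row, label in zip(rows, predictions)]


# Made-up stand-in for final.csv and for the printed predictions.
sample_csv = "filename,e_shnum\nls,29\ndropper.bin,3\n"
paired = pair_predictions(sample_csv, ["Benign", "Malware"])
for name, label in paired:
    print(f"{name}: {label}")
```

The length check guards against silently misaligned output if a file failed to parse and produced no row.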

Steps involved

Run the ELF Miner framework for feature extraction on the given ELF files. The dataset and the extracted features are explained in detail in the presentation linked below. A total of 343 features (342 by a second count) are extracted initially.
Perform some postprocessing on the CSV to convert values for certain attributes to a form suitable for applying Machine Learning.
Perform Feature Selection on the CSV file. The features to remove were determined using Information Gain. For this we used WEKA. The features to remove are given in feature_selection/weka_features_toremove.txt. These are the ones which have 0 Information Gain. This reduces the number of features to 147.
Use the saved models (obtained by training multiple classifiers in WEKA) to make predictions. The saved models and their details are in the models folder.
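The feature selection step can be illustrated in a few lines of plain Python. This is only a sketch of the idea, not the WEKA procedure used in this project: it treats every attribute value as a discrete category (numeric attributes would need binning first), and the attribute names has_symtab and constant_field are made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Class entropy minus the entropy remaining after splitting on the attribute."""
    total = len(labels)
    by_value = {}
    for value, label in zip(values, labels):
        by_value.setdefault(value, []).append(label)
    remainder = sum(len(subset) / total * entropy(subset) for subset in by_value.values())
    return entropy(labels) - remainder


def select_features(table, labels, eps=1e-12):
    """Keep only attributes whose information gain is above zero (within eps)."""
    return [name for name, column in table.items() if information_gain(column, labels) > eps]


# Hypothetical toy data: one informative attribute, one constant attribute.
labels = ["Malware", "Malware", "Benign", "Benign"]
table = {
    "has_symtab": [0, 0, 1, 1],
    "constant_field": [7, 7, 7, 7],
}
kept = select_features(table, labels)
print(kept)
```

An attribute that takes the same value for every file cannot separate the classes, so its information gain is zero and it is dropped, which is the criterion described in the step above.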

Two classes of classifiers are used in the paper:

Non-Evolutionary Classifiers
- JRip
- J48
- PART
- Random Forest
Evolutionary Classifiers
- UCS
- XCS
- GAssist-Adi

For the Non-Evolutionary Classifiers we used the WEKA toolkit, and for the Evolutionary Classifiers we used the KEEL toolkit. The accuracy of each of these classifiers (on a 70-30 train-test split) is given in keel/results/results.txt.
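For readers unfamiliar with the evaluation setup, a 70-30 split and the accuracy measure can be sketched as follows. This is an illustration only: the actual splits were made with WEKA and KEEL, and the function names, the fixed seed, and the sample labels below are assumptions.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of rows and cut it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


rows = list(range(10))  # stand-in for ten feature vectors
train, test = train_test_split(rows)
print(len(train), len(test))

# Made-up labels for the three held-out samples.
score = accuracy(["Malware", "Benign", "Benign"], ["Malware", "Benign", "Malware"])
print(score)
```

Fixing the seed makes the split reproducible, which matters when comparing several classifiers on the same held-out data.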

However, the end-to-end system incorporates a voting classifier based only on the Non-Evolutionary classifiers, due to the availability of WEKA's Java API.
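The combination rule used by the voting classifier is not stated here. Assuming plain majority voting over the four Non-Evolutionary classifiers, the idea looks like the sketch below. The tie-break towards Malware is also an assumption, chosen because four voters can split 2-2.

```python
from collections import Counter


def majority_vote(votes, tie_break="Malware"):
    """Return the most common label; fall back to tie_break on a tie for first place."""
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break
    return counts[0][0]


# Hypothetical per-classifier outputs for one ELF file,
# in the order JRip, J48, PART, Random Forest.
votes = ["Malware", "Benign", "Malware", "Malware"]
decision = majority_vote(votes)
print(decision)
```

Resolving ties towards Malware favours caution in a malware detector. The opposite choice would reduce false alarms at the cost of missed detections.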

Link to Presentation

If you find any issues or bugs, feel free to open an issue or open a pull request if you wish to make an improvement.

shreyansh26/ELF-Miner
