4.4 Article

Entropy analysis to classify unknown packing algorithms for malware detection

Journal

INTERNATIONAL JOURNAL OF INFORMATION SECURITY
Volume 16, Issue 3, Pages 227-248

Publisher

SPRINGER
DOI: 10.1007/s10207-016-0330-4

Keywords

Entropy analysis; Original entry point (OEP); Symbolic aggregate approximation (SAX); Piecewise aggregate approximation (PAA)

Funding

  1. National Research Foundation of Korea [2015-003689]
  2. National Research Foundation of Korea [2015R1A2A2A01003689] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

Ask authors/readers for more resources

The proportion of packed malware has been growing rapidly and now comprises more than 80 % of all existing malware. In this paper, we propose a method for classifying the packing algorithms of given unknown packed executables, regardless of whether they are malware or benign programs. First, we scale the entropy values of a given executable and convert the entropy values of a particular location of memory into symbolic representations. Our proposed method uses symbolic aggregate approximation (SAX), which is known to be effective for large data conversions. Second, we classify the distribution of symbols using supervised learning classification methods, i.e., naive Bayes and support vector machines for detecting packing algorithms. The results of our experiments involving a collection of 324 packed benign programs and 326 packed malware programs with 19 packing algorithms demonstrate that our method can identify packing algorithms of given executables with a high accuracy of 95.35 %, a recall of 95.83 %, and a precision of 94.13 %. We propose four similarity measurements for detecting packing algorithms based on SAX representations of the entropy values and an incremental aggregate analysis. Among these four metrics, the fidelity similarity measurement demonstrates the best matching result, i.e., a rate of accuracy ranging from 95.0 to 99.9 %, which is from 2 to 13 higher than that of the other three metrics. Our study confirms that packing algorithms can be identified through an entropy analysis based on a measure of the uncertainty of the running processes and without prior knowledge of the executables.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.4
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available