CANDYMAN: Classifying Android malware families by modelling dynamic traces with Markov chains

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2018)

Journal

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Volume 74, Issue -, Pages 121-133

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.engappai.2018.06.006

Keywords

Android malware; Dynamic analysis; Classification; Deep Learning; Markov chains

Funding

Comunidad Autónoma de Madrid [S2013/ICE-3095]
Spanish Ministry of Science and Education and Competitivity (MINECO)
European Regional Development Fund (FEDER) [TIN2014-56494-C4-4-P, TIN2017-85727-C4-3-P]
Justice Programme of the European Union [723180 - RiskTrack - JUST-2015-JCOO-AG/JUST-2015-JCOO-AG-1]

Abstract

Malware writers tend to target the platforms most widely used by ordinary users, with the aim of attacking as many devices as possible. For this reason, Android has been heavily attacked for years. Efforts to combat Android malware concentrate mainly on detection, in order to prevent malicious software from being installed on a target device. However, it is equally important to automatically classify the type, or family, of a malware sample, in order to establish which actions are necessary to mitigate the damage caused. In this paper, we present CANDYMAN, a tool that classifies Android malware families by combining dynamic analysis and Markov chains. Dynamic analysis extracts representative information about a malware sample in the form of a sequence of states, while a Markov chain models the transition probabilities between the states of that sequence; these probabilities are used as features in the classification process. The resulting feature space is used to train classical Machine Learning algorithms, including methods for imbalanced learning, and Deep Learning algorithms over a dataset of malware samples from different families, in order to evaluate the proposed method. Starting from a collection of 5,560 malware samples grouped into 179 families (extracted from the Drebin dataset), and after a selection based on a minimum number of relevant and valid samples, a final set of 4,442 samples grouped into 24 malware families was used. The experimental results indicate a precision of 81.8% on this dataset.
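
The feature construction described in the abstract, where transition probabilities between states become classification features, can be illustrated with a minimal sketch. This is not the authors' implementation: the state names, the sample trace, and the helper `transition_features` are hypothetical, and a first-order Markov chain estimated by simple counting is assumed.

```python
from collections import Counter
from itertools import product


def transition_features(trace, states):
    """Estimate first-order transition probabilities from one trace and
    flatten them into a fixed-length feature vector (row-major order)."""
    # Count each pair of consecutive states observed in the trace.
    counts = Counter(zip(trace, trace[1:]))
    # Count how often each state appears as the source of a transition.
    totals = Counter(trace[:-1])
    return [
        counts[(a, b)] / totals[a] if totals[a] else 0.0
        for a, b in product(states, repeat=2)
    ]


# Hypothetical state set and trace, chosen only for illustration.
states = ["open", "read", "send"]
trace = ["open", "read", "read", "send", "open", "read"]
features = transition_features(trace, states)
print(features)
```

Each sample yields a vector of length len(states) squared, so traces of different lengths map to the same feature space and can be passed to any standard classifier.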

Privacy-preserving malware detection in Android-based IoT devices through federated Markov chains

Gianni D'Angelo, Eslam Farsimadan, Massimo Ficco, Francesco Palmieri, Antonio Robustelli

Summary: The emergence of new and sophisticated malware targeting Android-based IoT devices poses security risks and the need for effective detection models and strategies. Federated Learning-based solutions, which use Machine Learning models without sharing user data, are being developed. However, these methods are affected by non-independent and identically distributed data. Privacy-preserving approaches using Markov chains and associative rules are proposed to handle malware classification in the IoT scenario. The approach achieves high accuracy and comparable runtime performance with centralized methods.

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2023)