4.7 Article

CANDYMAN: Classifying Android malware families by modelling dynamic traces with Markov chains

Journal

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.engappai.2018.06.006

Keywords

Android malware; Dynamic analysis; Classification; Deep Learning; Markov chains

Funding

  1. Comunidad Autónoma de Madrid [S2013/ICE-3095]
  2. Spanish Ministry of Science and Education and Competitivity (MINECO)
  3. European Regional Development Fund (FEDER) [TIN2014-56494-C4-4-P, TIN2017-85727-C4-3-P]
  4. Justice Programme of the European Union [723180 - RiskTrack - JUST-2015-JCOO-AG/JUST-2015-JCOO-AG-1]

Malware writers usually focus on the platforms most widely used by ordinary users, with the aim of attacking as many devices as possible. For this reason, Android has been heavily attacked for years. Efforts to combat Android malware concentrate mainly on detection, in order to prevent malicious software from being installed on a target device. However, it is equally important to automatically classify the type, or family, of a malware sample, in order to establish which actions are necessary to mitigate the damage caused. In this paper, we present CANDYMAN, a tool that classifies Android malware families by combining dynamic analysis and Markov chains. Dynamic analysis extracts representative information from a malware sample in the form of a sequence of states, while a Markov chain models the transition probabilities between the states of the sequence, which are then used as features in the classification process. The resulting feature space is used to train classical Machine Learning algorithms, including methods for imbalanced learning, and Deep Learning algorithms over a dataset of malware samples from different families, in order to evaluate the proposed method. Starting from a collection of 5,560 malware samples grouped into 179 different families (extracted from the Drebin dataset), and after a selection based on a minimum number of relevant and valid samples, a final set of 4,442 samples grouped into 24 different malware families was used. The experimental results indicate a precision of 81.8% over this dataset.
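
The step of turning a sequence of states into transition-probability features can be illustrated with a minimal sketch. This is not the CANDYMAN implementation: the state names (`file`, `net`, `sms`), the function name `transition_features`, and the choice of a first-order chain flattened row by row are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from itertools import product


def transition_features(trace, states):
    """Estimate first-order transition probabilities from one trace
    and flatten them into a fixed-length feature vector."""
    # Count each observed (source, destination) pair of consecutive states.
    pair_counts = Counter(zip(trace, trace[1:]))
    # Count how often each state appears as the source of a transition.
    outgoing = Counter(trace[:-1])
    features = []
    for src, dst in product(states, repeat=2):
        total = outgoing[src]
        # States that never appear as a source get a row of zeros.
        features.append(pair_counts[(src, dst)] / total if total else 0.0)
    return features


# Hypothetical state alphabet and trace, for illustration only.
states = ["file", "net", "sms"]
trace = ["file", "net", "net", "sms", "file", "net"]
features = transition_features(trace, states)
```

Each sample then yields a vector of length `len(states) ** 2`, which could be passed to any standard classifier. Using a fixed state ordering keeps the feature positions comparable across samples.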

Recommended

Article Engineering, Aerospace

A deep learning approach to solar radio flux forecasting

Emma Stevenson, Victor Rodriguez-Fernandez, Edmondo Minisci, David Camacho

Summary: This study employs a novel deep learning method, N-BEATS, for predicting solar proxy index in space operations a few days ahead. The experimental results show that this method performs well in single point forecasting and can generate uncertainty estimates. The N-BEATS model outperforms baseline models and statistical methods, demonstrating significant advantages in performance.

ACTA ASTRONAUTICA (2022)

Review Computer Science, Artificial Intelligence

A survey on extremism analysis using natural language processing: definitions, literature review, trends and challenges

Javier Torregrosa, Gema Bello-Orgaz, Eugenio Martinez-Camara, Javier Del Ser, David Camacho

Summary: This article discusses the issue of extremism as a global problem and explores the application of natural language processing (NLP) in extremism research. The article reviews the definition of extremism, the characteristics of extremist discourse, and the application and achievements of NLP techniques. It also suggests future research directions and challenges in this field.

JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Fusing CNNs and statistical indicators to improve image classification

Javier Huertas-Tato, Alejandro Martin, Julian Fierrez, David Camacho

Summary: This paper proposes an ensemble method for accurate image classification, which combines automatically detected features and statistical indicators to achieve better performance. Testing on various datasets shows that including additional indicators and using an ensemble classification approach can improve performance.

INFORMATION FUSION (2022)

Article Computer Science, Artificial Intelligence

A Mixed Approach for Aggressive Political Discourse Analysis on Twitter

Javier Torregrosa, Sergio D'Antonio-Maceiras, Guillermo Villar-Rodriguez, Amir Hussain, Erik Cambria, David Camacho

Summary: Political tensions have increased in Europe since the beginning of the new century, leading to social movements and political changes in various countries. This study examines the political discourse and underlying tensions during Madrid's elections in May 2021, using a mixed methodology approach. The findings suggest that the electoral campaign is not as negative as perceived by the citizens, and that ideologically extreme parties tend to use more aggressive language.

COGNITIVE COMPUTATION (2023)

Article Chemistry, Multidisciplinary

AWMC: Abnormal-Weather Monitoring and Curation Service Based on Dynamic Graph Embedding

Yuxuan Gu, Jiakai Gu, Gen Li, Heeseung Yun, Jason J. Jung, Sojung An, David Camacho

Summary: This paper presents a system called the abnormal-weather monitoring and curation service (AWMC), which analyzes weather datasets to show abnormal conditions in specific cities on certain dates. The system uses a dynamic graph-embedding-based anomaly detection method to measure anomaly scores, and evaluations show high precision, recall, and F1 score for all cities monitored by AWMC.

APPLIED SCIENCES-BASEL (2022)

Article Automation & Control Systems

Guest Editorial: Scientific and Physics-Informed Machine Learning for Industrial Applications

Francesco Piccialli, Fabio Giampaolo, David Camacho, Gang Mei

Summary: Deep learning technology is driving the in-depth development of industrial automation. Wang et al. interpret the decision process of convolutional neural networks (CNNs) using a percolation model from a statistical physics perspective. They introduce the concept of differentiation degree and present an empirical formula for quantifying it.

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS (2023)

Article Computer Science, Artificial Intelligence

Countering malicious content moderation evasion in online social networks: Simulation and detection of word camouflage

Alvaro Huertas-Garcia, Alejandro Martin, Javier Huertas-Tato, David Camacho

Summary: Content moderation is crucial in stopping unacceptable behaviors in online platforms. This article presents an innovative approach involving the simulation and detection of content evasion techniques using a multilingual transformer model. The developed multilingual tool, pyleetspeak, allows for the generation and simulation of content evasion through word camouflage, while a multilingual NER model is designed for the detection of such evasion techniques.

APPLIED SOFT COMPUTING (2023)

Article Computer Science, Artificial Intelligence

BERTuit: Understanding Spanish language in Twitter with transformers

Javier Huertas-Tato, Alejandro Martin, David Camacho

Summary: The emergence of complex attention-based language models like BERT, RoBERTa or GPT-3 has enabled the tackling of highly complex tasks in various scenarios. However, these models face significant difficulties when applied to specific domains, such as social networks like Twitter. In order to address the challenges of natural language processing in this domain, we present BERTuit, the largest transformer proposed for the Spanish language, pre-trained on a massive dataset of Spanish tweets. Our motivation is to provide a powerful resource for better understanding Spanish Twitter and combating the spread of misinformation. BERTuit is evaluated and compared against competitive multilingual transformers, showing its utility through applications like visualizing groups of hoaxes and profiling authors spreading disinformation.

EXPERT SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Evolving Generative Adversarial Networks to improve image steganography

Alejandro Martin, Alfonso Hernandez, Moutaz Alazab, Jason Jung, David Camacho

Summary: Images are commonly used for hiding information using steganography techniques. A wide range of steganography methods and steganalysis techniques are available, with recent techniques relying on Convolutional Neural Networks to minimize visual changes. This article demonstrates the use of a Generative Adversarial Network (GAN) to enhance a spatial domain steganalysis method and insert secret information with minimal image alteration. The results show that this approach successfully avoids detection by a state-of-the-art Deep Learning steganalysis architecture.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Computer Science, Artificial Intelligence

DeepVATS: Deep Visual Analytics for Time Series

Victor Rodriguez-Fernandez, David Montalvo-Garcia, Francesco Piccialli, Grzegorz J. Nalepa, David Camacho

Summary: Deep Visual Analytics (DVA) is a field that aims to develop Visual Interactive Systems supported by deep learning for large-scale data processing and implementation across different data and domains. This paper presents DeepVATS, an open-source tool for time series data that uses a self-supervised masked time series autoencoder to discover patterns and anomalies.

KNOWLEDGE-BASED SYSTEMS (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Android Malware Detection Through a Pre-trained Model for Code Understanding

Eva Garcia-Soto, Alejandro Martin, Javier Huertas-Tato, David Camacho

Summary: This study uses the CodeT5 pre-trained language model to generate context- and semantic-aware embeddings that better represent the behavior of Android applications. It shows how these embeddings can be used to train a recurrent neural network for malware detection tasks, and presents promising results.

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING & AMBIENT INTELLIGENCE (UCAMI 2022) (2023)

Article Computer Science, Artificial Intelligence

Generation and detection of manipulated multimodal audiovisual content: Advances, trends and open challenges

Helena Liz-Lopez, Mamadou Keita, Abdelmalik Taleb-Ahmed, Abdenour Hadid, Javier Huertas-Tato, David Camacho

Summary: Generative deep learning techniques have been widely discussed in the public, but the slow progress in applying these techniques to counter disinformation is concerning. With the ease and credibility of manipulating multimedia content, developing effective forensic techniques becomes invaluable. This survey comprehensively describes modern manipulation and forensic techniques, focusing on their applications in video, audio, and multimodal fusion. The classification of manipulation techniques and the generation of datasets using generative techniques are provided for forensic purposes. The review and comparative analysis of forensic techniques from 2018 to 2023, as well as the comparison of end-to-end forensic tools for end-users, are presented. Clear trends and challenges, such as multilinguality, multimodality, and improving data quality, are identified for future research in an ever-changing adversarial environment.

INFORMATION FUSION (2024)

Article Computer Science, Information Systems

YoungRes: A Serious Game-Based Intervention to Increase Youngsters Resilience Against Extremist Ideologies

Angel Panizo-Lledot, Javier Torregrosa, Raquel Menendez-Ferreira, Daniel Lopez-Fernandez, Pedro P. Alarcon, David Camacho

Summary: Extremist ideologies are spreading in today's society, affecting both the political and social levels. Young people, in particular, are vulnerable to these influences due to their developmental stage. Therefore, it is crucial to equip them with psychological skills to rationalize and resist these ideologies. Video games, already a popular technology among young generations, can be used as an innovative approach to motivate and engage youngsters in interventions to increase psychological resilience. This study adapted a traditional emotional intelligence training program into a serious game-based intervention called YoungRes and evaluated its impact on students. The findings showed that the intervention was well received by students, especially those who frequently play video games, and resulted in improvements in emotional intelligence competences and knowledge about the Islamic culture.

IEEE ACCESS (2022)

Article Computer Science, Information Systems

Addressing Evolutionary-Based Dynamic Problems: A New Methodology for Evaluating Immigrants Strategies in MOGAs

Angel Panizo-Lledot, Martin Pedemonte, Gema Bello-Orgaz, David Camacho

Summary: Multi-Objective Genetic Algorithms (MOGAs) have been successfully applied to dynamic problems in various domains, but they often require special adaptation to work properly in such environments. Different techniques, including immigrant strategies, have been proposed to address the challenges of dynamic environments. This work proposes a new methodology that evaluates the performance of immigrant strategies in two levels: coarse-grain evaluation based on quality, stability, and speed, and fine-grain study of the status of immigrant individuals during the algorithm evolution. A visualization technique for population mixing analysis is also proposed. The proposed methodology is validated in the context of the Dynamic Community Detection problem.

IEEE ACCESS (2022)

Article Automation & Control Systems

Walk as you feel: Privacy preserving emotion recognition from gait patterns

Carmen Bisogni, Lucia Cimmino, Michele Nappi, Toni Pannese, Chiara Pero

Summary: This paper presents a gait-based emotion recognition method that does not rely on facial cues, achieving competitive performance on small and unbalanced datasets. The proposed approach utilizes advanced deep learning architecture and achieves high recognition and accuracy rates.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

Satellite constellation method for ground targeting optimized with K-means clustering and genetic algorithm

Soung Sub Lee

Summary: This study proposed a satellite constellation method that utilizes machine learning and customized repeating ground track orbits to optimize satellite revisit performance for each target.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

A method of user recruitment and adaptation degree improvement via community collaboration in sparse mobile crowdsensing systems

Jian Wang, Xiuying Zhan, Yuping Yan, Guosheng Zhao

Summary: This paper proposes a method of user recruitment and adaptation degree improvement via community collaboration to solve the task allocation problem in sparse mobile crowdsensing. By matching social relationships and perception task characteristics, the entire perceptual map can be accurately inferred.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

Robotic assembly control reconfiguration based on transfer reinforcement learning for objects with different geometric features

Yuhang Gai, Bing Wang, Jiwen Zhang, Dan Wu, Ken Chen

Summary: This paper investigates how to reconfigure existing compliance controllers for new assembly objects with different geometric features. By using the proposed Equivalent Theory of Compliance Law (ETCL) and Weighted Dimensional Policy Distillation (WDPD) method, the learning cost can be reduced and better control performance can be achieved.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

Progress and prospects of future urban health status prediction

Zhihao Xu, Zhiqiang Lv, Benjia Chu, Zhaoyu Sheng, Jianbo Li

Summary: Predicting future urban health status is crucial for identifying urban diseases and planning cities. By applying an improved meta-analysis approach and considering the complexity of cities as systems, this study selects eight urban factors and explores suitable prediction methods for these factors.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

A localized decomposition evolutionary algorithm for imbalanced multi-objective optimization

Yulong Ye, Qiuzhen Lin, Ka-Chun Wong, Jianqiang Li, Zhong Ming, Carlos A. Coello Coello

Summary: This paper proposes a localized decomposition evolutionary algorithm (LDEA) to tackle imbalanced multi-objective optimization problems (MOPs). LDEA assigns a local region for each subproblem using a localized decomposition method and restricts the solution update within the region to maintain diversity. It also speeds up convergence by evolving only the best-associated solution in each subproblem while balancing the population's diversity.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

LDD-Net: Lightweight printed circuit board defect detection network fusing multi-scale features

Longxin Zhang, Jingsheng Chen, Jianguo Chen, Zhicheng Wen, Xusheng Zhou

Summary: This study proposes a lightweight PCB image defect detection network (LDD-Net) that achieves high accuracy by designing a novel lightweight feature extraction network, multi-scale aggregation network, and lightweight decoupling head. Experimental results show that LDD-Net outperforms state-of-the-art models in terms of accuracy, computation, and detection speed, making it suitable for edge systems or resource-constrained embedded devices.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

Adaptive stable backstepping controller based on support vector regression for nonlinear systems

Kemal Ucak, Gulay Oke Gunel

Summary: This paper introduces a novel adaptive stable backstepping controller based on support vector regression for nonlinear dynamical systems. The controller utilizes SVR to identify the dynamics of the nonlinear system and integrates stable BSC behavior. The experimental results demonstrate successful control performance for both nonlinear systems.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

A non-dominated sorting genetic algorithm III using competition crossover and opposition-based learning for the optimal dispatch of the combined cooling, heating, and power system with photovoltaic thermal collector

Dexuan Zou, Mengdi Li, Haibin Ouyang

Summary: In this study, a photovoltaic thermal collector is integrated into a combined cooling, heating, and power system to reduce primary energy consumption, operation cost, and carbon dioxide emission. By applying a novel genetic algorithm and constraint handling approach, it is found that the CCHP scenarios with PV/T are more efficient and achieve the lowest energy consumption.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

Identification and optimization of material constitutive equations using genetic algorithms

Abhinav Pandey, Litton Bhandari, Vidit Gaur

Summary: This research proposes a novel model-agnostic framework based on genetic algorithms to identify and optimize the set of coefficients of the constitutive equations of engineering materials. The framework demonstrates solution convergence, scalability, and high explainability for a wide range of engineering materials. The experimental validation shows that the proposed framework outperforms commercially available software in terms of optimization efficiency.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

A generalized visibility graph algorithm for analyzing biological time series having rotation in polar plane

Zahra Ramezanpoor, Adel Ghazikhani, Ghasem Sadeghi Bajestani

Summary: Time series analysis is a method used to analyze phenomena with temporal measurements. Visibility graphs are a technique for representing and analyzing time series, particularly when dealing with rotations in the polar plane. This research proposes a visibility graph algorithm that efficiently handles biological time series with rotation in the polar plane. Experimental results demonstrate the effectiveness of the proposed algorithm in both synthetic and real world time series.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

Mutual dimensionless improved bearing fault diagnosis based on Bp-increment broad learning system in computer vision

ChunLi Li, Qintai Hu, Shuping Zhao, Jigang Wu, Jianbin Xiong

Summary: Efficient and accurate diagnosis of rotating machinery in the petrochemical industry is crucial. However, the nonlinear and non-stationary vibration signals generated in harsh environments pose challenges in distinguishing fault signals from normal ones. This paper proposes a BP-Incremental Broad Learning System (BP-INBLS) model to address these challenges. The effectiveness of the proposed method in fault diagnosis is demonstrated through validation and comparative analysis with a published method.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

Influence of cost/loss functions on classification rate: A comparative study across diverse classifiers and domains

Fatemeh Chahkoutahi, Mehdi Khashei

Summary: The classification rate is the most important factor in selecting an appropriate classification approach. In this paper, the influence of different cost/loss functions on the classification rate of different classifiers is compared, and empirical results show that cost/loss functions significantly affect the classification rate.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Article Automation & Control Systems

A partition-based problem transformation algorithm for classifying imbalanced multi-label data

Jicong Duan, Xibei Yang, Shang Gao, Hualong Yu

Summary: The study proposes a novel partition-based imbalanced multi-label learning algorithm, MLHC, which divides the original label space into disconnected subspaces using hierarchical clustering. It successfully tackles the class imbalance problem in multi-label data and outperforms other class imbalance multi-label learning algorithms.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)

Review Automation & Control Systems

A review of retinal vessel segmentation for fundus image analysis

Qing Qin, Yuanyuan Chen

Summary: This paper offers a comprehensive review of retinal vessel automatic segmentation research, including both traditional methods and deep learning methods. In particular, supervised learning methods are summarized and analyzed based on CNN, GAN, and UNet. The advantages and disadvantages of existing segmentation methods are also outlined.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2024)