Article
Computer Science, Software Engineering
Kui Liu, Jingtang Zhang, Li Li, Anil Koyuncu, Dongsun Kim, Chunpeng Ge, Zhe Liu, Jacques Klein, Tegawende F. Bissyande
Summary: Fix pattern-based patch generation is a promising direction in automated program repair (APR). The performance of pattern-based APR systems depends on the fix ingredients mined from fix changes in development histories. Collecting a reliable set of bug fixes in repositories can be challenging.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2023)
Article
Computer Science, Software Engineering
Yilin Yang, Tianxing He, Yang Feng, Shaoying Liu, Baowen Xu
Summary: The study proposed a mining approach to identify fix patterns of Python programs by extracting fine-grained bug-fixing code changes. Results showed that 13 out of 101 real bugs could be fixed without human intervention, and in the wild, 15% of the bug code in complex bugs could be fixed and 37% could be matched by fix patterns.
EMPIRICAL SOFTWARE ENGINEERING
(2022)
Review
Biochemical Research Methods
Mohamed Nadif, Francois Role
Summary: Biomedical scientific literature is growing rapidly, making it challenging to identify relevant results; automated information extraction tools based on text mining techniques are essential; deep neural networks have significantly advanced this research field.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Computer Science, Hardware & Architecture
Renjian Pan, Zhaobo Zhang, Xin Li, Krishnendu Chakrabarty, Xinli Gu
Summary: In this article, a two-stage unsupervised root-cause analysis method is proposed, which combines a decision tree model and frequent pattern mining to achieve accurate root-cause clustering without requiring historical test data.
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
(2022)
Article
Computer Science, Information Systems
Mona Moradi, Mohammad Rahmanimanesh, Ali Shahzadi, Reza Monsefi
Summary: Collecting sufficient labeled data in streaming data applications can be time-consuming and infeasible. Unsupervised domain adaptation has emerged as a reasonable solution. However, most existing research has focused on stationary environments and overlooked the uncertainties caused by mismatched distributions. In this study, a heterogeneous unsupervised domain adaptation method is proposed to address the classification of samples in streaming data with concept drift. Through a fuzzy rough set-based sample weighting approach, the influence of uncertainties on feature alignment in non-stationary environments is modulated. The proposed method demonstrates advantages in terms of avoiding excessive alignment, optimizing training cost, and gradually reducing dependency on the source domain for domain adaptation, as shown in experiments on benchmark datasets.
INFORMATION SCIENCES
(2023)
Article
Computer Science, Artificial Intelligence
Jiang Lu, Lei Li, Changshui Zhang
Summary: The remarkable progress in deep learning largely relies on large-scale supervised data. Ensuring intra-class modality diversity in the training set is crucial for the generalization capability of cutting-edge deep models, but it requires heavy manual labor for data collection and annotation. Additionally, current models may perform worse when rare or unexpected modalities emerge.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Zhihao Zhang, Yuan Zuo, Chenghua Lin, Junjie Wu
Summary: This paper proposes a novel unsupervised context-aware quality phrase mining framework called LMPhrase, which is built upon large pre-trained language models. The framework mines quality phrases as silver labels using a parameter-free probing technique on the pre-trained language model BERT, and formalizes the phrase tagging task as a sequence generation problem by fine-tuning the sequence-to-sequence pre-trained language model BART. The results of extensive experiments show that LMPhrase consistently outperforms existing competitors in two phrase mining tasks of different granularity.
KNOWLEDGE-BASED SYSTEMS
(2024)
Article
Computer Science, Artificial Intelligence
Siqiang Hao, Di Liu, Simone Baldi, Wenwu Yu
Summary: A botnet is a network of infected computers that can send spam, spread viruses, or stage denial-of-service attacks without consent. To improve detection, this paper proposes a system using association analysis and the FP-growth algorithm, achieving over 94% recognition accuracy in detecting various botnet activities.
COMPLEX & INTELLIGENT SYSTEMS
(2022)
Article
Chemistry, Multidisciplinary
Faria Ferooz, Malik Tahir Hassan, Sajid Mahmood, Hira Asim, Muhammad Idrees, Muhammad Assam, Abdullah Mohamed, El-Awady Attia
Summary: To reduce crime rates, this study examines the occurrence patterns of crimes using the crime dataset of Lahore, Pakistan. Visualization and unsupervised data mining techniques are utilized to facilitate crime investigation and future risk analysis.
APPLIED SCIENCES-BASEL
(2022)
Article
Computer Science, Artificial Intelligence
Jongbin Ryu, Ming-Hsuan Yang, Jongwoo Lim
Summary: Transfer learning has gained attention for its ability to adapt well-trained models to new domains, with fine-tuning being a commonly used method. This paper introduces a fully unsupervised self-tuning algorithm for learning visual features in different domains, which significantly improves network performance. The algorithm updates pre-trained models with only unlabeled data in the target domain, demonstrating effectiveness across various benchmark datasets.
Article
Automation & Control Systems
Yousef Kowsar, Masud Moshtaghi, Eduardo Velloso, Christopher Leckie, Lars Kulik
Summary: This article presents an efficient algorithm for finding and tracking repeating patterns in a time-series data stream. The algorithm is able to accurately detect intervals of repeating activities in real-time and shows robustness to variations in the signal of recurrence.
IEEE TRANSACTIONS ON CYBERNETICS
(2022)
Article
Geography
Steven Logan
Summary: This article discusses the impact of self-driving cars and tech-driven alternatives on the urban future, highlighting the importance of a logic of repair and pointing out the injustices and inequalities that technology solutions may bring. It calls for attention to repairing infrastructural relations to avoid reproducing the historical injustices of automobility.
Article
Computer Science, Artificial Intelligence
Wanxia Deng, Qing Liao, Lingjun Zhao, Deke Guo, Gangyao Kuang, Dewen Hu, Li Liu
Summary: The Joint Clustering and Discriminative Feature Alignment (JCDFA) approach proposed in this paper aims to simultaneously mine discriminative features of target data and align cross-domain discriminative features to enhance performance in Unsupervised Domain Adaptation (UDA). The method integrates supervised classification of labeled source data and discriminative clustering of unlabeled target data, as well as optimizing supervised contrastive learning and conditional Maximum Mean Discrepancy (MMD) for feature alignment. Experimental results on real-world benchmarks demonstrate the superiority of JCDFA over state-of-the-art domain adaptation methods.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2021)
Article
Construction & Building Technology
Yunchun Yang, Wenjie Gang, Jiaqi Yuan, Zhenying Zhang, Changqing Tian
Summary: This study proposes a three-stage strategy to identify and analyze the energy consumption patterns of college dormitories in Wuhan, China. The results show that a small percentage of heavy energy users consume a significant portion of the total energy, while the majority of occupants consume a smaller amount. The study also identifies factors such as gender and location that influence energy consumption in different weather conditions.
Article
Engineering, Civil
Bubryur Kim, N. Yuvaraj, K. T. Tse, Dong-Eun Lee, Gang Hu
Summary: This study utilized clustering algorithms to investigate wind pressures on buildings, revealing distinct pressure patterns for different building models. The clustering algorithms were effective in identifying unknown wind pressure patterns on buildings, offering a promising machine-learning technique for wind engineering.
JOURNAL OF WIND ENGINEERING AND INDUSTRIAL AERODYNAMICS
(2021)
Article
Computer Science, Software Engineering
Jinhan Kim, Robert Feldt, Shin Yoo
Summary: The rapid adoption of Deep Learning (DL) systems in safety critical domains necessitates the testing of their correctness and robustness. In this article, we propose Surprise Adequacy (SA) as a test adequacy criterion, which measures the difference between the behavior of a DL system for a given input and its behavior for training data. We demonstrate that SA can predict model behavior correctness and detect adversarial examples.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2023)
Article
Computer Science, Software Engineering
Ahmed Khanfir, Anil Koyuncu, Mike Papadakis, Maxime Cordy, Tegawende F. Bissyande, Jacques Klein, Yves Le Traon
Summary: This study introduces a fault injection tool called iBiR, which injects realistic faults by exploring change patterns associated with user-reported faults. Experimental results show that iBiR outperforms traditional mutation testing in terms of semantic similarity and test effectiveness.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2023)
Article
Computer Science, Software Engineering
Imen Sayar, Alexandre Bartel, Eric Bodden, Yves Le Traon
Summary: The increasing use of deserialization in applications poses a security risk due to the potential for remote code execution attacks originating from untrusted sources. Deserialization vulnerabilities are a critical concern in web applications, often caused by development process faults and library flaws. This study explores attack gadgets in Java libraries and vulnerabilities in Java applications, examining how these weaknesses are introduced and patched, and how long they persist. The analysis reveals that even a minor change in a class can introduce a gadget, and a significant portion of libraries remain unpatched, leaving them vulnerable to future attacks.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2023)
Article
Computer Science, Software Engineering
Joe Lorentz, Thomas Hartmann, Assaad Moawad, Francois Fouquet, Djamila Aouada, Yves Le Traon
Summary: This article introduces CalcGraph, a model abstraction of differentiable programming layers, which can simulate the usage of computational resources and automatically schedule execution according to given specifications. We propose a novel method for switching models between storage and preallocated memory zones efficiently, maximizing the number of model executions given the available resources. The efficiency of our approach is demonstrated by consuming fewer resources than state-of-the-art frameworks like TensorFlow and PyTorch for single-model and multi-model execution.
SOFTWARE AND SYSTEMS MODELING
(2023)
Article
Computer Science, Software Engineering
Haoye Tian, Kui Liu, Yinghua Li, Abdoul Kader Kabore, Anil Koyuncu, Andrew Habib, Li Li, Junhao Wen, Jacques Klein, Tegawende F. Bissyande
Summary: This study explores the use of learned code representations to identify correct patches. The experimental results show that deep learned embeddings can outperform existing methods that rely on dynamic information.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2023)
Review
Computer Science, Artificial Intelligence
Sylvain Kubler, Matthieu Renard, Sankalp Ghatpande, Jean-Philippe Georges, Yves Le Traon
Summary: Blockchain technologies are being explored in various applications, but selecting the right platform is challenging. This paper conducts a systematic literature review and develops a decision support tool based on recommended assessment criteria to aid in platform selection.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Computer Science, Software Engineering
Shangwen Wang, Ming Wen, Bo Lin, Yepang Liu, Tegawende F. Bissyande, Xiaoguang Mao
Summary: Method naming is a challenging task in object-oriented programming, and automated tool support has been developed to assist developers in this task. However, current approaches assume the availability of method implementation to infer its name, while methods are usually named before their implementations. This work fills the gap by developing an approach that predicts the names of all methods to be implemented within a class based on the class name. A large-scale empirical analysis is conducted to validate the approach, and a hybrid big code-driven approach, Mario, is proposed to predict method names. The experiments show promising results, outperforming existing models and baselines.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2023)
Article
Computer Science, Software Engineering
Zhongxing Yu, Matias Martinez, Zimin Chen, Tegawende F. Bissyande, Martin Monperrus
Summary: This article presents the first approach for structurally predicting code transforms at the level of AST nodes using conditional random fields (CRFs). The approach learns a probabilistic model offline and uses it to predict code transforms for new, unseen code snippets. The experimental evaluation shows that considering code structure is crucial for achieving good prediction accuracy.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Software Engineering
Milos Ojdanic, Aayush Garg, Ahmed Khanfir, Renzo Degiovanni, Mike Papadakis, Yves Le Traon
Summary: Fault seeding is commonly used in empirical studies to evaluate and compare test techniques. Recent research has used machine learning techniques to seed faults that look like real ones, raising the question of whether syntactically similar faults result in semantically similar faults. By employing different fault-seeding techniques, the study demonstrates that syntactic similarity does not reflect semantic similarity.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Software Engineering
Pei Liu, Li Li, Kui Liu, Shane McIntosh, John Grundy
Summary: Build systems are crucial in software development to convert source code into executable software. However, the quality and evolution of build systems for mobile apps, particularly on the Android platform, have not been extensively studied. This paper presents an empirical study of 5222 Android projects to investigate the quality and evolution of their build systems.
JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS
(2023)
Proceedings Paper
Computer Science, Software Engineering
Jinhan Kim, Nargiz Humbatova, Gunel Jahangirova, Paolo Tonella, Shin Yoo
Summary: As the use of Deep Neural Networks (DNNs) in large software systems continues to grow, there is an increasing need for software developers to design, train, and deploy these models. However, little attention has been given to addressing the difficulties developers face when designing and training such models. This paper surveys and evaluates existing techniques for repairing model performance, using real-world mistakes made by developers and artificial faulty models as benchmarks. The findings suggest that a random baseline performs as well as, or even outperforms, existing techniques, and that for larger and more complicated models, all repair techniques fail to find fixes. Further research is needed to develop more sophisticated Deep Learning repair techniques.
2023 IEEE CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION, ICST
(2023)
Article
Computer Science, Information Systems
Farah Batool, Abdul Rehman, Dongsun Kim, Assad Abbas, Raheel Nawaz, Tahir Mustafa Madni
Summary: The authors propose an informed greedy search approach to efficiently identify influencer nodes in the social Internet of Things that provide legitimate information. This approach minimizes network size and eliminates undesirable connections by ranking and prioritizing nodes. Nodes with a ranking greater than 0.5 are considered authentic influencers, while nodes with lower rankings are discarded. The algorithm traverses the pruned network to obtain the desired information from the authentic nodes. Experimental results demonstrate the effectiveness of the approach in terms of time consumption and network traversal.
CMC-COMPUTERS MATERIALS & CONTINUA
(2023)
Article
Computer Science, Information Systems
Tahir Sher, Abdul Rehman, Dongsun Kim
Summary: COVID-19, a contagious disease, has put pressure on various sectors, but data mining with IoT and SIoT has played a crucial role in overcoming it. This study used different machine learning algorithms to develop a model for analyzing and predicting the existence of COVID-19. The decision tree model performed the best, achieving an accuracy of 98.42%.
CMC-COMPUTERS MATERIALS & CONTINUA
(2023)