Article
Automation & Control Systems
Bruno Sotto-Mayor, Amir Elmishali, Meir Kalech, Rui Abreu
Summary: This paper studies the performance of defect prediction models and compares models using Design code smells, Traditional smells, and a combination of both. The results show that models trained with both Design code smells and Traditional smells performed the best.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
Article
Computer Science, Software Engineering
Mouna Abidi, Md Saidur Rahman, Moses Openja, Foutse Khomh
Summary: Modern applications are developed using components written in different programming languages and technologies, which presents challenges in terms of development and maintenance due to the increased number of languages. Design smells can impact software quality and are associated with a higher risk of future bugs.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2021)
Review
Computer Science, Software Engineering
Xiaofeng Han, Amjed Tahir, Peng Liang, Steve Counsell, Kelly Blincoe, Bing Li, Yajing Luo
Summary: Code review is a crucial part of software quality control. This study examines the identification and resolution of code smells in modern code reviews. The results show that code smells are not commonly found, are usually caused by violations of coding conventions, and are generally fixed promptly when detected. The study also highlights the importance of following coding conventions and considering program context when addressing code smells.
EMPIRICAL SOFTWARE ENGINEERING
(2022)
Article
Computer Science, Software Engineering
Emanuele Iannone, Roberta Guadagni, Filomena Ferrucci, Andrea De Lucia, Fabio Palomba
Summary: Software vulnerabilities are weaknesses in source code that can be exploited to cause harm. However, there is a lack of knowledge on how vulnerabilities are introduced and removed during the software engineering life cycle. This study investigates the life cycle of known vulnerabilities in open-source software projects, finding that vulnerabilities often require multiple contributions before being introduced and remain unfixed for significant periods of time. The study provides practical implications for vulnerability detectors to assist developers in identifying and addressing these issues in a timely manner.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Software Engineering
Lucas Francisco da Matta Vegi, Marco Tulio Valente
Summary: This paper studies internal quality issues in systems implemented in the Elixir language. It discovers and documents new code smells for the language through interaction with the Elixir developer community and mining of GitHub repositories. The authors propose a catalog of 35 code smells, 23 of which are specific to Elixir and 12 of which are traditional code smells. The relevance and prevalence of each smell in the catalog were validated through a survey with 181 experienced Elixir developers.
EMPIRICAL SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Software Engineering
Beyza Eken, Francis Palma, Ayse Basar, Ayse Tosun
Summary: Community-aware metrics and code smells have been studied in software bug prediction, showing improvement in prediction performance. Future research should focus on communication patterns and cross-project bug prediction settings.
SOFTWARE QUALITY JOURNAL
(2021)
Article
Computer Science, Software Engineering
Ahmed Samir Imam Mahmoud, Tapajit Dey, Alexander Nolte, Audris Mockus, James D. Herbsleb
Summary: Research shows that 9.14% of code blobs in hackathon repositories and 8% of lines of code (LOC) are created during hackathons, with around a third of hackathon code being reused in other projects. Higher numbers of associated technologies and participants in a hackathon increase the probability of code reuse.
EMPIRICAL SOFTWARE ENGINEERING
(2022)
Review
Computer Science, Software Engineering
Dong Wang, Yuki Ueda, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto
Summary: This study investigates the potential of benchmarking in code review research, identifying trends in common methodologies, papers with replicable potential, and the use of specific metric sets in different research topics. While it is currently not feasible to benchmark code review studies, a common benchmark could foster innovation among new researchers and the development of established methodologies.
JOURNAL OF SYSTEMS AND SOFTWARE
(2021)
Article
Computer Science, Software Engineering
Ruben Opdebeeck, Ahmed Zerouali, Camilo Velazquez-Rodriguez, Coen De Roover
Summary: Ansible roles are recommended to adhere to the semantic versioning format for new releases, but the criteria for what counts as a breaking change or a feature addition are unclear. An empirical study was conducted on over 81,000 version increments across 8,500 roles to analyze the state of semantic versioning and the most commonly changed elements. Structural difference metrics were used to train a classifier for predicting version bumps, and a developer survey confirmed that version increments are not always applied consistently. Guidelines were formulated from these insights to ensure a clear interpretation of version increments for Ansible roles.
JOURNAL OF SYSTEMS AND SOFTWARE
(2021)
Article
Computer Science, Software Engineering
Jirat Pasuksmit, Patanamon Thongtanunam, Shanika Karunasekera
Summary: This study aims to minimize the negative impact of Story Points (SP) changes on sprint planning in Agile software development. An analysis of 19,349 work items from seven open-source projects found that approximately 10% of work items undergo SP changes, and that unchanged SP more reliably reflect development time. The study suggests reviewing SP and the scope of work prior to or during sprint planning and introduces a classifier for predicting SP changes.
EMPIRICAL SOFTWARE ENGINEERING
(2022)
Article
Computer Science, Software Engineering
Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li
Summary: This article proposes CodeMatcher, an IR-based model that inherits the advantages of DeepCS and achieves fast and accurate code search. Experimental results show that CodeMatcher performs well on the MRR metric and outperforms existing online search engines.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2022)
Article
Computer Science, Software Engineering
Jose Amancio M. Santos, Gadiel Xavier Antunes Petronilo
Summary: Code smells refer to potential problems in software design, while design patterns describe good design solutions. Both concepts serve as metaphors for understanding and communication in software design. This study aimed to empirically investigate the relationship between code smells and design patterns. Through mining software repositories and studying software evolution, the researchers analyzed 61 software systems and identified classes linked to both code smells and design patterns. The findings showed that the relationship between smells and design patterns varied depending on the software, and that the instability metric weakly reflected this relationship.
JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS
(2022)
Article
Computer Science, Software Engineering
Dong Wang, Tao Xiao, Teyon Son, Raula Gaikovina Kula, Takashi Ishio, Yasutaka Kamei, Kenichi Matsumoto
Summary: This study analyzed a large number of pull requests on GitHub and found correlations between emoji reactions and review time, first-time contributors, comment intentions, and sentiment consistency. The results suggest that emoji reactions not only reduce commenting noise, but also play a positive role in facilitating collaborative communication during the review process.
EMPIRICAL SOFTWARE ENGINEERING
(2023)
Review
Computer Science, Information Systems
Dong Wang, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto
Summary: Contemporary code review tools are popular for software quality assurance, with the ability to post linkages between patches during discussions. Patch linkage notifications exhibit latency, with patches having Alternative Solution linkages undergoing quicker reviews and fewer revisions. Detection models show promising recall rates for Alternative Solution linkages, but precision can be improved.
INFORMATION AND SOFTWARE TECHNOLOGY
(2021)
Article
Computer Science, Software Engineering
Wesley K. G. Assuncao, Jacob Kruger, Sebastien Mosser, Sofiane Selaoui
Summary: Microservice architectures are widely used in the industry for developing scalable software systems. However, their design and maintenance present challenges to software engineers. To gain insights into the evolution of microservices, a large-scale empirical study was conducted on 11 open-source systems, revealing recurring patterns of evolution and analyzing the dependence between microservices.
JOURNAL OF SYSTEMS AND SOFTWARE
(2023)
Article
Computer Science, Software Engineering
Fengcai Wen, Csaba Nagy, Michele Lanza, Gabriele Bavota
Summary: Most changes during software maintenance are not atomic and developers may omit needed changes, leading to technical debt or bugs. A study on quick remedy commits found that developers tend to quickly fix issues introduced by omitted changes in previous commits. These quick remedy commits are important for improving code quality and must be considered in mining software repositories for accurate findings.
EMPIRICAL SOFTWARE ENGINEERING
(2022)
Review
Computer Science, Software Engineering
Cody Watson, Nathan Cooper, David Nader Palacio, Kevin Moran, Denys Poshyvanyk
Summary: This article presents a systematic literature review of the intersection of software engineering (SE) and Deep Learning (DL), analyzing 128 papers across 23 SE tasks. It provides an overview of the current state and future directions of DL techniques applied to SE research, outlining a research roadmap for this cross-cutting area.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2022)
Article
Computer Science, Software Engineering
Stefano Dalla Palma, Chiel van Asseldonk, Gemma Catolino, Dario Di Nucci, Fabio Palomba, Damian A. Tamburri
Summary: Infrastructure-as-code (IaC) is crucial for providing and managing infrastructures through configuration files, but these files may suffer from code smells that impact quality and maintenance. This paper investigates the application of a traditional implementation code smell, Large Class or Blob Blueprint, in the context of TOSCA, and compares metrics-based and unsupervised learning-based detectors on a large dataset. The results suggest a new research direction for dealing with this problem and highlight the effectiveness of metrics-based detectors in detecting Blob Blueprints.
JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS
(2023)
Review
Computer Science, Theory & Methods
Morteza Zakeri-Nasrabadi, Saeed Parsa, Ehsan Esmaili, Fabio Palomba
Summary: The accuracy of code smell-detecting tools varies depending on the dataset used for evaluation. The adequacy of a dataset highly depends on relevant properties such as size, severity level, project types, and the number of each type of smell. Existing datasets often suffer from imbalanced samples, lack of severity level support, and restriction to Java language.
ACM COMPUTING SURVEYS
(2023)
Article
Computer Science, Software Engineering
Giovanni Rosa, Simone Scalabrino, Gabriele Bavota, Rocco Oliveto
Summary: This article defines a taxonomy of quality features for Docker artifacts through literature review and empirical study, and explores the influence of externally observable features on developers' preferences and their relationship with configuration-related features.
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
(2023)
Article
Computer Science, Software Engineering
Antonio Mastropaolo, Emad Aghajani, Luca Pascarella, Gabriele Bavota
Summary: Identifiers, such as method and variable names, play a significant role in source code comprehension. Existing techniques, mostly data-driven or based on static code analysis, have been proposed to support meaningful identifier recommendations. However, limited empirical investigations have been conducted to evaluate the effectiveness of these techniques, potentially leading to rename refactoring operations. This study explores the potential of data-driven approaches in automated variable renaming and presents promising results, along with identified limitations that require further research.
EMPIRICAL SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Software Engineering
Mattia Fazzini, Kevin Moran, Carlos Bernal-Cardenas, Tyler Wendland, Alessandro Orso, Denys Poshyvanyk
Summary: This article introduces a new bug reporting approach called EBUG, which assists users in writing easily readable and conveniently reproducible bug reports by analyzing natural language information entered in real-time and linking it to information extracted via program analyses. Two user studies were conducted to evaluate EBUG, and the results showed that users were able to construct bug reports faster and the reports were more reproducible compared to a baseline bug reporting system. The predictive models of EBUG also outperformed other approaches.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Software Engineering
Antonio Mastropaolo, Nathan Cooper, David Nader Palacio, Simone Scalabrino, Denys Poshyvanyk, Rocco Oliveto, Gabriele Bavota
Summary: This paper evaluates the performance of the T5 model in supporting four different code-related tasks and studies the impact of pre-training and multi-task fine-tuning. The results show that the T5 model outperforms state-of-the-art baselines and that not all tasks benefit from multi-task fine-tuning despite the advantages of pre-training.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Software Engineering
Xiaozhou Li, Sergio Moreschini, Zheying Zhang, Fabio Palomba, Davide Taibi
Summary: Software vulnerabilities pose significant risks, such as the loss and manipulation of private data. The software engineering research community has conducted empirical studies and proposed automated techniques to detect and remove vulnerabilities. In this paper, a systematic mapping study is conducted to analyze popular vulnerability databases, adoption goals, other information sources, methods and techniques, and proposed tools. Understanding these aspects can help researchers make informed decisions and practitioners establish reliable sources of information for security policies and standards.
JOURNAL OF SYSTEMS AND SOFTWARE
(2023)
Article
Computer Science, Software Engineering
Stefano Lambiase, Gemma Catolino, Fabiano Pecorelli, Damian A. Tamburri, Fabio Palomba, Willem-Jan van den Heuvel, Filomena Ferrucci
Summary: This paper contributes to the existing body of knowledge on factors affecting productivity in software development by studying the cultural and geographical dispersion of a development community. The results show that cultural and geographical dispersion significantly impact productivity, suggesting that managers and practitioners should consider these aspects throughout the software development lifecycle.
JOURNAL OF SYSTEMS AND SOFTWARE
(2024)
Proceedings Paper
Computer Science, Artificial Intelligence
Matteo Ciniselli, Luca Pascarella, Gabriele Bavota
Summary: Deep learning models are widely used for code completion, but it is unclear whether the code they generate violates licenses. A study found that roughly 0.1% to 10% of the code generated by DL models has similarities with instances in the training set.
2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022)
(2022)
Proceedings Paper
Computer Science, Software Engineering
Rosalia Tufano, Simone Masiero, Antonio Mastropaolo, Luca Pascarella, Denys Poshyvanyk, Gabriele Bavota
Summary: Code review is a widely adopted practice in open source and industrial projects. This paper introduces a method for automating code review tasks using deep learning models and demonstrates that a pre-trained T5 model can outperform previous DL models. Furthermore, experiments were conducted on a larger, more realistic, and challenging dataset of code review activities.
2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022)
(2022)
Proceedings Paper
Computer Science, Software Engineering
Rosalia Tufano, Simone Scalabrino, Luca Pascarella, Emad Aghajani, Rocco Oliveto, Gabriele Bavota
Summary: This study explores the possibility of using reinforcement learning for load testing video games. It proposes a method to train agents that can play games like humans while identifying areas that cause a drop in frame rate. The feasibility of this approach is demonstrated through experiments on three games.
2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022)
(2022)
Proceedings Paper
Computer Science, Software Engineering
Antonio Mastropaolo, Luca Pascarella, Gabriele Bavota
Summary: This paper presents LANCE, an approach that supports developers in making decisions related to logging. LANCE utilizes a trained model to automatically identify the position for logging, select the appropriate log level, and generate correct log statements.
2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022)
(2022)