Article

When and Why Your Code Starts to Smell Bad (and Whether the Smells Go Away)

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
Volume 43, Issue 11, Pages 1063-1088

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TSE.2017.2653105

Keywords

Code smells; empirical study; mining software repositories

Funding

  1. NSF [CCF-1525902, CCF-1253837]
  2. University of Molise
  3. Directorate for Computer and Information Science and Engineering
  4. Division of Computing and Communication Foundations [1253837] Funding Source: National Science Foundation

Technical debt is a metaphor introduced by Cunningham to indicate "not quite right code which we postpone making it right". One noticeable symptom of technical debt is code smells, defined as symptoms of poor design and implementation choices. Previous studies showed the negative impact of code smells on the comprehensibility and maintainability of code. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced, what their survivability is, and how they are removed by developers. To empirically corroborate such anecdotal evidence, we conducted a large empirical study over the change history of 200 open source projects. This study required the development of a strategy to identify smell-introducing commits, the mining of over half a million commits, and the manual analysis and classification of over 10K of them. Our findings mostly contradict common wisdom, showing that most of the smell instances are introduced when an artifact is created and not as a result of its evolution. At the same time, 80 percent of smells survive in the system. Also, among the 20 percent of removed instances, only 9 percent are removed as a direct consequence of refactoring operations.

Recommended

Article Computer Science, Software Engineering

Quick remedy commits and their impact on mining software repositories

Fengcai Wen, Csaba Nagy, Michele Lanza, Gabriele Bavota

Summary: Most changes during software maintenance are not atomic and developers may omit needed changes, leading to technical debt or bugs. A study on quick remedy commits found that developers tend to quickly fix issues introduced by omitted changes in previous commits. These quick remedy commits are important for improving code quality and must be considered in mining software repositories for accurate findings.

EMPIRICAL SOFTWARE ENGINEERING (2022)

Article Computer Science, Software Engineering

The Secret Life of Software Vulnerabilities: A Large-Scale Empirical Study

Emanuele Iannone, Roberta Guadagni, Filomena Ferrucci, Andrea De Lucia, Fabio Palomba

Summary: Software vulnerabilities are weaknesses in source code that can be exploited to cause harm. However, there is a lack of knowledge on how vulnerabilities are introduced and removed during the software engineering life cycle. This study investigates the life cycle of known vulnerabilities in open-source software projects, finding that vulnerabilities often require multiple contributions before being introduced and remain unfixed for significant periods of time. The study provides practical implications for vulnerability detectors to assist developers in identifying and addressing these issues in a timely manner.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Review Computer Science, Software Engineering

A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research

Cody Watson, Nathan Cooper, David Nader Palacio, Kevin Moran, Denys Poshyvanyk

Summary: This article presents a systematic literature review of the intersection of software engineering (SE) and Deep Learning (DL), analyzing 128 papers across 23 SE tasks. It provides an overview of the current state and future directions of DL techniques applied to SE research, outlining a research roadmap for this cross-cutting area.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2022)

Article Computer Science, Software Engineering

Through the looking-glass ... An empirical study on blob infrastructure blueprints in the Topology and Orchestration Specification for Cloud Applications

Stefano Dalla Palma, Chiel van Asseldonk, Gemma Catolino, Dario Di Nucci, Fabio Palomba, Damian A. Tamburri

Summary: Infrastructure-as-code (IaC) is crucial for providing and managing infrastructures through configuration files, but these files may suffer from code smells that impact quality and maintenance. This paper investigates the application of a traditional implementation code smell, Large Class or Blob Blueprint, in the context of TOSCA, and compares metrics-based and unsupervised learning-based detectors on a large dataset. The results suggest a new research direction for dealing with this problem and highlight the effectiveness of metrics-based detectors in detecting Blob Blueprints.

JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS (2023)

Review Computer Science, Theory & Methods

A Systematic Literature Review on the Code Smells Datasets and Validation Mechanisms

Morteza Zakeri-Nasrabadi, Saeed Parsa, Ehsan Esmaili, Fabio Palomba

Summary: The accuracy of code smell-detecting tools varies depending on the dataset used for evaluation. The adequacy of a dataset highly depends on relevant properties such as size, severity level, project types, and the number of each type of smell. Existing datasets often suffer from imbalanced samples, lack of severity level support, and restriction to Java language.

ACM COMPUTING SURVEYS (2023)

Article Computer Science, Software Engineering

What Quality Aspects Influence the Adoption of Docker Images?

Giovanni Rosa, Simone Scalabrino, Gabriele Bavota, Rocco Oliveto

Summary: This article defines a taxonomy of quality features for Docker artifacts through literature review and empirical study, and explores the influence of externally observable features on developers' preferences and their relationship with configuration-related features.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2023)

Article Computer Science, Software Engineering

Automated variable renaming: are we there yet?

Antonio Mastropaolo, Emad Aghajani, Luca Pascarella, Gabriele Bavota

Summary: Identifiers, such as method and variable names, play a significant role in source code comprehension. Existing techniques, mostly data-driven or based on static code analysis, have been proposed to support meaningful identifier recommendations. However, limited empirical investigations have been conducted to evaluate the effectiveness of these techniques, potentially leading to rename refactoring operations. This study explores the potential of data-driven approaches in automated variable renaming and presents promising results, along with identified limitations that require further research.

EMPIRICAL SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

Enhancing Mobile App Bug Reporting via Real-Time Understanding of Reproduction Steps

Mattia Fazzini, Kevin Moran, Carlos Bernal-Cardenas, Tyler Wendland, Alessandro Orso, Denys Poshyvanyk

Summary: This article introduces a new bug reporting approach called EBUG, which assists users in writing easily readable and conveniently reproducible bug reports by analyzing natural language information entered in real-time and linking it to information extracted via program analyses. Two user studies were conducted to evaluate EBUG, and the results showed that users were able to construct bug reports faster and the reports were more reproducible compared to a baseline bug reporting system. The predictive models of EBUG also outperformed other approaches.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

Using Transfer Learning for Code-Related Tasks

Antonio Mastropaolo, Nathan Cooper, David Nader Palacio, Simone Scalabrino, Denys Poshyvanyk, Rocco Oliveto, Gabriele Bavota

Summary: This paper evaluates the performance of the T5 model in supporting four different code-related tasks and studies the impact of pre-training and multi-task fine-tuning. The results show that the T5 model outperforms state-of-the-art baselines and that not all tasks benefit from multi-task fine-tuning despite the advantages of pre-training.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

The anatomy of a vulnerability database: A systematic mapping study

Xiaozhou Li, Sergio Moreschini, Zheying Zhang, Fabio Palomba, Davide Taibi

Summary: Software vulnerabilities pose significant risks, such as the loss and manipulation of private data. The software engineering research community has conducted empirical studies and proposed automated techniques to detect and remove vulnerabilities. In this paper, a systematic mapping study is conducted to analyze popular vulnerability databases, adoption goals, other information sources, methods and techniques, and proposed tools. Understanding these aspects can help researchers make informed decisions and practitioners establish reliable sources of information for security policies and standards.

JOURNAL OF SYSTEMS AND SOFTWARE (2023)

Article Computer Science, Software Engineering

An Empirical Investigation Into the Influence of Software Communities' Cultural and on

Stefano Lambiase, Gemma Catolino, Fabiano Pecorelli, Damian A. Tamburri, Fabio Palomba, Willem-Jan van den Heuvel, Filomena Ferrucci

Summary: This paper contributes to the existing body of knowledge on factors affecting productivity in software development by studying the cultural and geographical dispersion of a development community. The results show that cultural and geographical dispersion significantly impact productivity, suggesting that managers and practitioners should consider these aspects throughout the software development lifecycle.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Proceedings Paper Computer Science, Artificial Intelligence

To What Extent do Deep Learning-based Code Recommenders Generate Predictions by Cloning Code from the Training Set?

Matteo Ciniselli, Luca Pascarella, Gabriele Bavota

Summary: Deep learning models are widely used for code completion, but it is unclear whether the code they generate violates licenses. A study found that around 10% to 0.1% of the code generated by DL models has similarities with instances in the training set.

2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022) (2022)

Proceedings Paper Computer Science, Software Engineering

Using Pre-Trained Models to Boost Code Review Automation

Rosalia Tufano, Simone Masiero, Antonio Mastropaolo, Luca Pascarella, Denys Poshyvanyk, Gabriele Bavota

Summary: Code review is a widely adopted practice in open source and industrial projects. This paper introduces a method for automating code review tasks using deep learning models and demonstrates that a pre-trained T5 model can outperform previous DL models. Furthermore, experiments were conducted on a larger, more realistic, and challenging dataset of code review activities.

2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022) (2022)

Proceedings Paper Computer Science, Software Engineering

Using Reinforcement Learning for Load Testing of Video Games

Rosalia Tufano, Simone Scalabrino, Luca Pascarella, Emad Aghajani, Rocco Oliveto, Gabriele Bavota

Summary: This study explores the possibility of using reinforcement learning for load testing video games. It proposes a method to train agents that can play games like humans while identifying areas that cause a drop in frame rate. The feasibility of this approach is demonstrated through experiments on three games.

2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022) (2022)

Proceedings Paper Computer Science, Software Engineering

Using Deep Learning to Generate Complete Log Statements

Antonio Mastropaolo, Luca Pascarella, Gabriele Bavota

Summary: This paper presents LANCE, an approach that supports developers in making decisions related to logging. LANCE utilizes a trained model to automatically identify the position for logging, select the appropriate log level, and generate correct log statements.

2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022) (2022)
