Article

When and Why Your Code Starts to Smell Bad (and Whether the Smells Go Away)

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2017)

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Volume 43, Issue 11, Pages 1063-1088

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TSE.2017.2653105

Keywords

Code smells; empirical study; mining software repositories

Categories

Computer Science, Software Engineering; Engineering, Electrical & Electronic

Funding

NSF [CCF-1525902, CCF-1253837]
University of Molise
Direct For Computer & Info Scie & Enginr
Division of Computing and Communication Foundations [1253837] Funding Source: National Science Foundation

Abstract

Technical debt is a metaphor introduced by Cunningham to describe not-quite-right code whose correction we postpone. One noticeable symptom of technical debt is code smells, defined as symptoms of poor design and implementation choices. Previous studies showed the negative impact of code smells on the comprehensibility and maintainability of code. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced, how long they survive, and how developers remove them. To empirically corroborate such anecdotal evidence, we conducted a large empirical study over the change history of 200 open source projects. This study required the development of a strategy to identify smell-introducing commits, the mining of over half a million commits, and the manual analysis and classification of over 10K of them. Our findings mostly contradict common wisdom, showing that most smell instances are introduced when an artifact is created and not as a result of its evolution. At the same time, 80 percent of smells survive in the system. Also, among the 20 percent of removed instances, only 9 percent are removed as a direct consequence of refactoring operations.


Recommended

Article Automation & Control Systems

Exploring Design smells for smell-based defect prediction

Bruno Sotto-Mayor, Amir Elmishali, Meir Kalech, Rui Abreu

Summary: This paper studies the performance of defect prediction models and compares models using Design code smells, Traditional smells, and a combination of both. The results show that models trained with both Design code smells and Traditional smells performed the best.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2022)

Article Computer Science, Software Engineering

Are Multi-Language Design Smells Fault-Prone? An Empirical Study

Mouna Abidi, Md Saidur Rahman, Moses Openja, Foutse Khomh

Summary: Modern applications are developed using components written in different programming languages and technologies, which presents challenges in terms of development and maintenance due to the increased number of languages. Design smells can impact software quality and are associated with a higher risk of future bugs.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2021)

Review Computer Science, Software Engineering

Code smells detection via modern code review: a study of the OpenStack and Qt communities

Xiaofeng Han, Amjed Tahir, Peng Liang, Steve Counsell, Kelly Blincoe, Bing Li, Yajing Luo

Summary: Code review is a crucial part of software quality control. This study examines the identification and resolution of code smells in modern code reviews. The results show that code smells are not commonly found, are usually caused by violations of coding conventions, and are generally fixed promptly when detected. The study also highlights the importance of following coding conventions and considering program context when addressing code smells.

EMPIRICAL SOFTWARE ENGINEERING (2022)

Article Computer Science, Software Engineering

The Secret Life of Software Vulnerabilities: A Large-Scale Empirical Study

Emanuele Iannone, Roberta Guadagni, Filomena Ferrucci, Andrea De Lucia, Fabio Palomba

Summary: Software vulnerabilities are weaknesses in source code that can be exploited to cause harm. However, there is a lack of knowledge on how vulnerabilities are introduced and removed during the software engineering life cycle. This study investigates the life cycle of known vulnerabilities in open-source software projects, finding that vulnerabilities often require multiple contributions before being introduced and remain unfixed for significant periods of time. The study provides practical implications for vulnerability detectors to assist developers in identifying and addressing these issues in a timely manner.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

Understanding code smells in Elixir functional language

Lucas Francisco da Matta Vegi, Marco Tulio Valente

Summary: This paper studies internal quality issues of systems implemented in the Elixir language, discovering and documenting new code smells for the language through interaction with the Elixir developer community and mining of GitHub repositories. The result is a catalog of 35 code smells, 23 of which are specific to Elixir and 12 of which are traditional code smells. The relevance and prevalence of each smell in the catalog were validated through a survey with 181 experienced Elixir developers.

EMPIRICAL SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

An empirical study on the effect of community smells on bug prediction

Beyza Eken, Francis Palma, Basar Ayse, Tosun Ayse

Summary: Community-aware metrics and code smells have been studied in software bug prediction, showing improvement in prediction performance. Future research should focus on communication patterns and cross-project bug prediction settings.

SOFTWARE QUALITY JOURNAL (2021)

Article Computer Science, Software Engineering

One-off events? An empirical study of hackathon code creation and reuse

Ahmed Samir Imam Mahmoud, Tapajit Dey, Alexander Nolte, Audris Mockus, James D. Herbsleb

Summary: Research shows that 9.14% of code blobs in hackathon repositories and 8% of lines of code (LOC) are created during hackathons, with around a third of hackathon code getting reused in other projects. The number of associated technologies and the number of participants in a hackathon both increase the probability of code reuse.

EMPIRICAL SOFTWARE ENGINEERING (2022)

Review Computer Science, Software Engineering

Can we benchmark Code Review studies? A systematic mapping study of methodology, dataset, and metric

Dong Wang, Yuki Ueda, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto

Summary: This study investigates the potential of benchmarking in code review research, identifying trends in common methodologies, papers with replicable potential, and the use of specific metric sets in different research topics. While it is currently not feasible to benchmark code review studies, a common benchmark could foster innovation among new researchers and the development of established methodologies.

JOURNAL OF SYSTEMS AND SOFTWARE (2021)

Article Computer Science, Software Engineering

On the practice of semantic versioning for Ansible galaxy roles: An empirical study and a change classification model

Ruben Opdebeeck, Ahmed Zerouali, Camilo Velazquez-Rodriguez, Coen De Roover

Summary: Ansible roles are recommended to adhere to the semantic versioning format for new releases, but the criteria for breaking changes or feature additions are unclear. An empirical study was conducted on over 81000 version increments across 8500 roles to analyze the state of semantic versioning and the most commonly changed elements. Structural difference metrics were used to train a classifier for predicting version bumps, and a developer survey confirmed that version increments are not always applied consistently. Guidelines were formulated based on the insights gained to ensure clear interpretation of version increments for Ansible roles.

JOURNAL OF SYSTEMS AND SOFTWARE (2021)

Article Computer Science, Software Engineering

Story points changes in agile iterative development: An empirical study and a prediction approach

Jirat Pasuksmit, Patanamon Thongtanunam, Shanika Karunasekera

Summary: This study aims to minimize the negative impact of Story Points (SP) changes on sprint planning in Agile software development. Through analysis of 19,349 work items from seven open-source projects, it was found that approximately 10% of work items undergo SP changes, with unchanged SP being more reliable in reflecting development time. The study suggests reviewing SP and scope of work prior to or during sprint planning and introduces a classifier for predicting SP changes.

EMPIRICAL SOFTWARE ENGINEERING (2022)

Article Computer Science, Software Engineering

CodeMatcher: Searching Code Based on Sequential Semantics of Important Query Words

Chao Liu, Xin Xia, David Lo, Zhiwe Liu, Ahmed E. Hassan, Shanping Li

Summary: This article proposes CodeMatcher, an IR-based model that inherits the advantages of DeepCS and achieves fast and accurate code search. Experimental results showed that CodeMatcher performed well on the MRR metric and outperformed existing online search engines.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2022)

Article Computer Science, Software Engineering

Building empirical knowledge on the relationship between code smells and design patterns: An exploratory study

Jose Amancio M. Santos, Gadiel Xavier Antunes Petronilo

Summary: Code smells refer to potential problems in software design, while design patterns describe good design solutions. Both concepts serve as metaphors for understanding and communication in software design. This study aimed to empirically investigate the relationship between code smells and design patterns. Through mining software repositories and studying software evolution, the researchers analyzed 61 software systems and identified classes linked to both code smells and design patterns. The findings showed that the relationship between smells and design patterns varied depending on the software, and that the instability metric weakly reflected this relationship.

JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS (2022)

Article Computer Science, Software Engineering

More than React: Investigating the Role of Emoji Reaction in GitHub Pull Requests

Dong Wang, Tao Xiao, Teyon Son, Raula Gaikovina Kula, Takashi Ishio, Yasutaka Kamei, Kenichi Matsumoto

Summary: This study analyzed a large number of pull requests on GitHub and found correlations between emoji reactions and review time, first-time contributors, comment intentions, and sentiment consistency. The results suggest that emoji reactions not only reduce commenting noise, but also play a positive role in facilitating collaborative communication during the review process.

EMPIRICAL SOFTWARE ENGINEERING (2023)

Review Computer Science, Information Systems

Automatic patch linkage detection in code review using textual content and file location features

Dong Wang, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto

Summary: Contemporary code review tools are popular for software quality assurance, with the ability to post linkages between patches during discussions. Patch linkage notifications exhibit latency, with patches having Alternative Solution linkages undergoing quicker reviews and fewer revisions. Detection models show promising recall rates for Alternative Solution linkages, but precision can be improved.

INFORMATION AND SOFTWARE TECHNOLOGY (2021)

Article Computer Science, Software Engineering

How do microservices evolve? An empirical analysis of changes in open-source microservice repositories

Wesley K. G. Assuncao, Jacob Kruger, Sebastien Mosser, Sofiane Selaoui

Summary: Microservice architectures are widely used in the industry for developing scalable software systems. However, their design and maintenance present challenges to software engineers. To gain insights into the evolution of microservices, a large-scale empirical study was conducted on 11 open-source systems, revealing recurring patterns of evolution and analyzing the dependence between microservices.

JOURNAL OF SYSTEMS AND SOFTWARE (2023)

Article Computer Science, Software Engineering

Quick remedy commits and their impact on mining software repositories

Fengcai Wen, Csaba Nagy, Michele Lanza, Gabriele Bavota

Summary: Most changes during software maintenance are not atomic and developers may omit needed changes, leading to technical debt or bugs. A study on quick remedy commits found that developers tend to quickly fix issues introduced by omitted changes in previous commits. These quick remedy commits are important for improving code quality and must be considered in mining software repositories for accurate findings.

EMPIRICAL SOFTWARE ENGINEERING (2022)

Review Computer Science, Software Engineering

A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research

Cody Watson, Nathan Cooper, David Nader Palacio, Kevin Moran, Denys Poshyvanyk

Summary: This article presents a systematic literature review of the intersection of software engineering (SE) and Deep Learning (DL), analyzing 128 papers across 23 SE tasks. It provides an overview of the current state and future directions of DL techniques applied to SE research, outlining a research roadmap for this cross-cutting area.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2022)

Article Computer Science, Software Engineering

Through the looking-glass ... An empirical study on blob infrastructure blueprints in the Topology and Orchestration Specification for Cloud Applications

Stefano Dalla Palma, Chiel van Asseldonk, Gemma Catolino, Dario Di Nucci, Fabio Palomba, Damian A. Tamburri

Summary: Infrastructure-as-code (IaC) is crucial for providing and managing infrastructures through configuration files, but these files may suffer from code smells that impact quality and maintenance. This paper investigates the application of a traditional implementation code smell, Large Class or Blob Blueprint, in the context of TOSCA, and compares metrics-based and unsupervised learning-based detectors on a large dataset. The results suggest a new research direction for dealing with this problem and highlight the effectiveness of metrics-based detectors in detecting Blob Blueprints.

JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS (2023)

Review Computer Science, Theory & Methods

A Systematic Literature Review on the Code Smells Datasets and Validation Mechanisms

Morteza Zakeri-Nasrabadi, Saeed Parsa, Ehsan Esmaili, Fabio Palomba

Summary: The accuracy of code smell-detecting tools varies depending on the dataset used for evaluation. The adequacy of a dataset highly depends on relevant properties such as size, severity level, project types, and the number of each type of smell. Existing datasets often suffer from imbalanced samples, lack of severity level support, and restriction to Java language.

ACM COMPUTING SURVEYS (2023)

Article Computer Science, Software Engineering

What Quality Aspects Influence the Adoption of Docker Images?

Giovanni Rosa, Simone Scalabrino, Gabriele Bavota, Rocco Oliveto

Summary: This article defines a taxonomy of quality features for Docker artifacts through literature review and empirical study, and explores the influence of externally observable features on developers' preferences and their relationship with configuration-related features.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2023)

Article Computer Science, Software Engineering

Automated variable renaming: are we there yet?

Antonio Mastropaolo, Emad Aghajani, Luca Pascarella, Gabriele Bavota

Summary: Identifiers, such as method and variable names, play a significant role in source code comprehension. Existing techniques, mostly data-driven or based on static code analysis, have been proposed to support meaningful identifier recommendations. However, limited empirical investigations have been conducted to evaluate the effectiveness of these techniques, potentially leading to rename refactoring operations. This study explores the potential of data-driven approaches in automated variable renaming and presents promising results, along with identified limitations that require further research.

EMPIRICAL SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

Enhancing Mobile App Bug Reporting via Real-Time Understanding of Reproduction Steps

Mattia Fazzini, Kevin Moran, Carlos Bernal-Cardenas, Tyler Wendland, Alessandro Orso, Denys Poshyvanyk

Summary: This article introduces a new bug reporting approach called EBUG, which assists users in writing easily readable and conveniently reproducible bug reports by analyzing natural language information entered in real-time and linking it to information extracted via program analyses. Two user studies were conducted to evaluate EBUG, and the results showed that users were able to construct bug reports faster and the reports were more reproducible compared to a baseline bug reporting system. The predictive models of EBUG also outperformed other approaches.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

Using Transfer Learning for Code-Related Tasks

Antonio Mastropaolo, Nathan Cooper, David Nader Palacio, Simone Scalabrino, Denys Poshyvanyk, Rocco Oliveto, Gabriele Bavota

Summary: This paper evaluates the performance of the T5 model in supporting four different code-related tasks and studies the impact of pre-training and multi-task fine-tuning. The results show that the T5 model outperforms state-of-the-art baselines and that not all tasks benefit from multi-task fine-tuning despite the advantages of pre-training.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

The anatomy of a vulnerability database: A systematic mapping study

Xiaozhou Li, Sergio Moreschini, Zheying Zhang, Fabio Palomba, Davide Taibi

Summary: Software vulnerabilities pose significant risks, such as the loss and manipulation of private data. The software engineering research community has conducted empirical studies and proposed automated techniques to detect and remove vulnerabilities. In this paper, a systematic mapping study is conducted to analyze popular vulnerability databases, adoption goals, other information sources, methods and techniques, and proposed tools. Understanding these aspects can help researchers make informed decisions and practitioners establish reliable sources of information for security policies and standards.

JOURNAL OF SYSTEMS AND SOFTWARE (2023)

Article Computer Science, Software Engineering

An Empirical Investigation Into the Influence of Software Communities' Cultural and on

Stefano Lambiase, Gemma Catolino, Fabiano Pecorelli, Damian A. Tamburri, Fabio Palomba, Willem-Jan van den Heuvel, Filomena Ferrucci

Summary: This paper contributes to the existing body of knowledge on factors affecting productivity in software development by studying the cultural and geographical dispersion of a development community. The results show that cultural and geographical dispersion significantly impact productivity, suggesting that managers and practitioners should consider these aspects throughout the software development lifecycle.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Proceedings Paper Computer Science, Artificial Intelligence

To What Extent do Deep Learning-based Code Recommenders Generate Predictions by Cloning Code from the Training Set?

Matteo Ciniselli, Luca Pascarella, Gabriele Bavota

Summary: Deep learning models are widely used for code completion but it is unclear if the code they generate violates licenses. A study found that around 10% to 0.1% of the code generated by DL models has similarities with instances in the training set.

2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022) (2022)

Proceedings Paper Computer Science, Software Engineering

Using Pre-Trained Models to Boost Code Review Automation

Rosalia Tufano, Simone Masiero, Antonio Mastropaolo, Luca Pascarella, Denys Poshyvanyk, Gabriele Bavota

Summary: Code review is a widely adopted practice in open source and industrial projects. This paper introduces a method for automating code review tasks using deep learning models and demonstrates that a pre-trained T5 model can outperform previous DL models. Furthermore, experiments were conducted on a larger, more realistic, and challenging dataset of code review activities.

2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022) (2022)

Proceedings Paper Computer Science, Software Engineering

Using Reinforcement Learning for Load Testing of Video Games

Rosalia Tufano, Simone Scalabrino, Luca Pascarella, Emad Aghajani, Rocco Oliveto, Gabriele Bavota

Summary: This study explores the possibility of using reinforcement learning for load testing video games. It proposes a method to train agents that can play games like humans while identifying areas that cause a drop in frame rate. The feasibility of this approach is demonstrated through experiments on three games.

2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022) (2022)

Proceedings Paper Computer Science, Software Engineering

Using Deep Learning to Generate Complete Log Statements

Antonio Mastropaolo, Luca Pascarella, Gabriele Bavota

Summary: This paper presents LANCE, an approach that supports developers in making decisions related to logging. LANCE utilizes a trained model to automatically identify the position for logging, select the appropriate log level, and generate correct log statements.

2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022) (2022)
