Mining Fix Patterns for FindBugs Violations

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2021)

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Volume 47, Issue 1, Pages 165-188

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TSE.2018.2884955

Keywords

Fix pattern; pattern mining; program repair; FindBugs violation; unsupervised learning

Funding

Fonds National de la Recherche (FNR), Luxembourg [FIXPATTERN C15/IS/9964569, RECOMMEND C15/IS/10449467]
Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) - Ministry of Science, ICT [2017M3C4A7068179]

Automated Summary

Several static analysis tools have been proposed to detect security vulnerabilities or bad programming practices, but their adoption is hindered by high false positive rates. By analyzing distributions of violations and their fixes, an automated approach using convolutional neural networks and clustering can identify fix patterns and apply them to unresolved violations effectively.

Abstract

Several static analysis tools, such as Splint or FindBugs, have been proposed to the software development community to help detect security vulnerabilities or bad programming practices. However, the adoption of these tools is hindered by their high false positive rates. If the false positive rate is too high, developers may become acclimated to violation reports from these tools, causing concrete and severe bugs to be overlooked. Fortunately, some violations are actually addressed and resolved by developers. We claim that violations that are recurrently fixed are likely to be true positives, and that an automated approach can learn to repair similar unseen violations. However, two things are lacking: a systematic way to investigate the distributions of existing and fixed violations in the wild, which could provide insights into prioritizing violations for developers; and an effective way to mine code and fix patterns, which could help developers easily understand the reasons leading to violations and how to fix them. In this paper, we first collect and track a large number of fixed and unfixed violations across revisions of software. The empirical analyses reveal discrepancies between the distributions of violations that are detected and those that are fixed, in terms of occurrences, spread, and categories, which can provide insights into prioritizing violations. To automatically identify patterns in violations and their fixes, we propose an approach that utilizes convolutional neural networks to learn features and clustering to regroup similar instances. We then evaluate the usefulness of the identified fix patterns by applying them to unfixed violations. The results show that developers will accept and merge a majority (69/116) of fixes generated from the inferred fix patterns. It is also noteworthy that the yielded patterns are applicable to four real bugs in the Defects4J major benchmark for software testing and automated repair.
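
The "regroup similar instances" step in the abstract can be illustrated with a small sketch. This is not the paper's method: it swaps the CNN-learned features for plain token counts and uses a simple greedy threshold clustering over cosine similarity. The function names (`tokenize`, `cosine`, `cluster_fixes`), the `threshold` value, and the three sample fixes are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(change):
    """Split a code-change string into identifier tokens and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\S", change)


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_fixes(changes, threshold=0.7):
    """Greedy clustering: each change joins the first cluster whose
    representative (its first member) is similar enough; otherwise it
    starts a new cluster. Returns lists of indices into `changes`."""
    clusters = []  # list of (representative_vector, member_indices)
    for i, change in enumerate(changes):
        vec = Counter(tokenize(change))
        for rep, members in clusters:
            if cosine(rep, vec) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


# Hypothetical fixes written as "- removed code + added code".
fixes = [
    "- new Integer(x) + Integer.valueOf(x)",
    "- new Integer(count) + Integer.valueOf(count)",
    '- s.equals("") + s.isEmpty()',
]
groups = cluster_fixes(fixes)
print(groups)
```

In this toy input the two `Integer.valueOf` rewrites end up in one group and the `isEmpty` rewrite in another; each group is a candidate fix pattern. A learned embedding, as described in the abstract, would replace the token-count vectors.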

Reliable Fix Patterns Inferred from Static Checkers for Automated Program Repair

Kui Liu, Jingtang Zhang, Li Li, Anil Koyuncu, Dongsun Kim, Chunpeng Ge, Zhe Liu, Jacques Klein, Tegawende F. Bissyande

Summary: Fix pattern-based patch generation is a promising direction in automated program repair (APR). The performance of pattern-based APR systems depends on the fix ingredients mined from fix changes in development histories. Collecting a reliable set of bug fixes in repositories can be challenging.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2023)