Article

The ManyBugs and IntroClass Benchmarks for Automated Repair of C Programs

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2015)

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Volume 41, Issue 12, Pages 1236-1256

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TSE.2015.2454513

Keywords

Automated program repair; benchmark; subject defect; reproducibility; MANYBUGS; INTROCLASS

Categories

Computer Science, Software Engineering; Engineering, Electrical & Electronic

Funding

AFOSR [FA9550-07-1-0532, FA9550-10-1-0277]
US Defense Advanced Research Projects Agency (DARPA) [P-1070-113237]
US Department of Energy (DOE) [DE-AC02-05CH11231]
US National Science Foundation (NSF) [CCF-0729097, CCF-0905236, CCF-1446683, CNS-0905222]
Santa Fe Institute
US National Science Foundation (NSF), Directorate for Computer & Information Science & Engineering, Division of Computing and Communication Foundations [0954024, 1646813, 1446683, 0905373, 1446966]

Abstract

The field of automated software repair lacks a set of common benchmark problems. Although benchmark sets are used widely throughout computer science, existing benchmarks are not easily adapted to the problem of automatic defect repair, which has several special requirements. Most important of these is the need for benchmark programs with reproducible, important defects and a deterministic method for assessing if those defects have been repaired. This article details the need for a new set of benchmarks, outlines requirements, and then presents two datasets, MANYBUGS and INTROCLASS, consisting between them of 1,183 defects in 15 C programs. Each dataset is designed to support the comparative evaluation of automatic repair algorithms asking a variety of experimental questions. The datasets have empirically defined guarantees of reproducibility and benchmark quality, and each study object is categorized to facilitate qualitative evaluation and comparisons by category of bug or program. The article presents baseline experimental results on both datasets for three existing repair methods, GenProg, AE, and TrpAutoRepair, to reduce the burden on researchers who adopt these datasets for their own comparative evaluations.
