Article

Extracting classification rule of software diagnosis using modified MEPA

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 34, Issue 1, Pages 411-418

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2006.09.042

Keywords

software diagnosis; minimize entropy; C4.5; data discretization

Defective software modules cause software failures, increase development and maintenance costs, and reduce customer satisfaction. Effective defect prediction models can help developers focus quality assurance activities on defect-prone modules and thus improve software quality by using resources more efficiently. Real-world databases are highly susceptible to noisy, missing, and inconsistent data. Noise is a random error or variance in a measured variable [Han, J., & Kamber, M. (2001). Data Mining: Concepts and Techniques, San Francisco: Morgan Kaufmann Publishers]. When decision trees are built, many of the branches may reflect noisy or outlier data, so data preprocessing steps are very important. There are many methods for data preprocessing. Concept hierarchies are a form of data discretization that can be used for data preprocessing. Data discretization has many advantages; for example, it reduces and simplifies the data. Results obtained using discrete features are usually more compact, shorter, and more accurate than those obtained using continuous ones [Liu, H., Hussain, F., Tan, C.L., & Dash, M. (2002). Discretization: An enabling technique. Data Mining and Knowledge Discovery, 6(4), 393-423]. In this paper, we propose a modified minimize entropy principle approach (MEPA) and develop a modified MEPA system to partition the data and then build the classification tree model. For verification, two NASA software projects, KC2 and JM1, are used to illustrate the proposed method. We establish a prototype system to discretize the data from these projects. The error rate and the number of rules both show that the proposed approach performs better than the other methods. (c) 2006 Elsevier Ltd. All rights reserved.
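
The abstract does not spell out the modified MEPA procedure, so the following is only a generic sketch of the underlying idea of entropy-minimizing discretization: choosing the cut point on a continuous attribute that minimizes the weighted class entropy of the two resulting intervals. It is not the authors' method. The function names `entropy` and `best_threshold` and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted_entropy) for the single binary cut that
    minimizes the weighted class entropy of the two resulting intervals.
    Returns (None, inf) if all values are identical."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut is possible between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if weighted < best[1]:
            best = ((lo + hi) / 2, weighted)  # cut at the midpoint
    return best


# Hypothetical toy data: a module size metric and a defect label (1 = defective).
loc = [10, 12, 15, 40, 45, 50]
defective = [0, 0, 0, 1, 1, 1]
threshold, score = best_threshold(loc, defective)
print(threshold, score)
```

On this toy data the two classes separate cleanly, so the chosen cut falls at the midpoint between 15 and 40. Applying the same search recursively within each interval would yield a multi-interval partition, which could then be fed to a tree learner such as C4.5.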
