Abstract:
The quest for improving software quality has given rise to various studies that focus on enhancing software quality through various processes. Code smells, which are indicators of software quality, have not been studied extensively to determine their role in predicting software defects. This study investigates the role of code smells in the prediction of non-faulty classes. We examine four versions of the Eclipse software (3.2, 3.3, 3.6, and 3.7) for metrics and smells. Further, different code smells, derived subjectively through iPlasma, are taken in conjunction, and three efficient but subjective models are developed to detect code smells, one for each of the Random Forest, J48, and SVM machine learning algorithms. These models are then used to detect the absence of defects in the four Eclipse versions. The effect of balanced and unbalanced datasets is also examined for these four versions. The results suggest that code smells can be a valuable feature in discriminating the absence of defects in software.
Machine summary:
Software engineering, among other fields, uses machine learning algorithms to support various software maintenance activities, defect prediction being one of them (Lessmann, Baesens, Mues, & Pietsch, 2008).
Defect prediction models have proven to be efficient, as various studies have concluded (Cartwright & Shepperd, 2000; Catal, 2011; Hall, Beecham, Bowes, Gray, & Counsell, 2011).
Performance measures of the smell prediction models (columns: Algorithm, Source Code, Precision, Recall, F-Measure, ROC).
Creation of Smell-defect models
Since the presence of smells indicates, by the definition of code smells, that the code is prone to defects or faults, we train our models with the metrics (as already defined) that indicate smells in order to detect the absence of bugs.
The non-faulty class prediction models based on the Random Forest algorithm are trained on the smell data of the corresponding version and tested on the defect data in chronological order.
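The chronological train/test scheme can be illustrated with a minimal sketch. It assumes that a model trained on one version is only tested on later versions; whether the study also tests on the same version is not stated here. The function name `chronological_pairs` is hypothetical and only the version numbers come from the abstract.

```python
from itertools import combinations

# Eclipse versions named in the abstract, listed oldest first.
VERSIONS = ["3.2", "3.3", "3.6", "3.7"]


def chronological_pairs(versions):
    """Return (train_version, test_version) pairs with the test version newer.

    Assumption: `versions` is already sorted from oldest to newest, so every
    2-combination keeps the older version in the first position.
    """
    return list(combinations(versions, 2))


pairs = chronological_pairs(VERSIONS)
for train_version, test_version in pairs:
    # Smell data of the older version trains the model; defect data of the
    # newer version is used for testing.
    print(f"train on smells of {train_version} -> test on defects of {test_version}")
```

With four versions this yields six train/test pairs, and no pair tests a model on a version older than the one it was trained on.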
ROC curves for the Random Forest based prediction model applied to predict non-faulty classes in subsequent versions using the balanced dataset.
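The balancing technique is not named in this summary. As one possible illustration only, the sketch below balances a dataset by random undersampling, which is an assumption and not necessarily the method the study used. The function `undersample` and the toy rows are hypothetical.

```python
import random


def undersample(rows, label_of, seed=0):
    """Randomly drop rows so every class is as large as the smallest class.

    `label_of` maps a row to its class label. This is random undersampling,
    chosen here purely for illustration.
    """
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)

    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: (class_name, is_faulty); 4 faulty and 16 non-faulty rows.
rows = [(f"Class{i}", i % 5 == 0) for i in range(20)]
balanced = undersample(rows, label_of=lambda row: row[1])
```

On the toy data the result keeps all 4 faulty rows and a random 4 of the 16 non-faulty rows, so both classes are equally represented.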
Performance measures of the SVM based Smell-Defect model after balancing (columns: Algorithm, Smell Prediction Model, Applied on, Precision, Recall, F-Measure, ROC).
Figure 7.
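The precision, recall, and F-measure columns listed for the performance tables follow standard definitions, which can be sketched as below. The sketch assumes that the positive class is "non-faulty" (encoded as `True`); the function name and the sample labels are hypothetical and are not results from the study.

```python
def precision_recall_f(actual, predicted, positive=True):
    """Compute precision, recall, and F-measure for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f_measure


# Hypothetical labels: True means the class is non-faulty.
actual = [True, True, True, False, False, True]
predicted = [True, True, False, True, False, True]
precision, recall, f_measure = precision_recall_f(actual, predicted)
```

In this toy example there are three true positives, one false positive, and one false negative, so precision, recall, and F-measure all equal 0.75.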
Conclusion and Future Work
This study focuses on the viability of code smell based models as predictors of non-faulty classes, using supervised machine learning algorithms on industry-sized, object-oriented software.