The Effectiveness of the Fused Weighted Filter Feature Selection Method to Improve Software Fault Prediction

Fatemeh Alighardashi, Mohammad Ali Zare Chahooki


Improving software product quality before release through periodic testing is one of the most expensive activities in software projects. Because the resources available for testing modules are limited, it is important to identify fault-prone modules and concentrate testing resources on them. Software fault predictors based on machine learning algorithms are effective tools for identifying fault-prone modules. Extensive research in this field seeks the relationship between the features of software modules and their fault-proneness. Some features are uninformative for the predictive algorithms and reduce prediction accuracy, so feature selection methods are widely used to improve the performance of fault-proneness prediction models. In this study, we propose a feature selection method that selects features effectively by combining several filter feature selection methods; we call this combination the fused weighted filter method. The proposed method improves both the convergence rate of feature selection and the prediction accuracy. Results on ten datasets from the NASA and PROMISE repositories indicate that the proposed method is effective in improving the accuracy and convergence of software fault prediction.


Software fault prediction; Feature selection; Filter method; Machine learning
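
The abstract does not say which filter methods are fused or how their weights are chosen, so the following is only a minimal sketch of the general idea of combining filter scores, not the authors' method. It assumes two simple filter scores (absolute Pearson correlation and a Fisher-style score), min-max normalization of each score, and equal weights; every function name, the weights, and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def abs_correlation(column, labels):
    """Filter score 1: absolute Pearson correlation between a feature and the label."""
    mx, my = mean(column), mean(labels)
    cov = sum((x - mx) * (y - my) for x, y in zip(column, labels))
    sx = sum((x - mx) ** 2 for x in column) ** 0.5
    sy = sum((y - my) ** 2 for y in labels) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def fisher_score(column, labels):
    """Filter score 2: squared gap between class means over summed class variances."""
    faulty = [x for x, y in zip(column, labels) if y == 1]
    clean = [x for x, y in zip(column, labels) if y == 0]
    if not faulty or not clean:
        return 0.0
    gap = (mean(faulty) - mean(clean)) ** 2
    spread = pvariance(faulty) + pvariance(clean)
    return gap / (spread + 1e-12)  # small constant avoids division by zero


def min_max(scores):
    """Rescale one filter's scores to [0, 1] so different filters are comparable."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]


def fused_ranking(rows, labels, weights=(0.5, 0.5)):
    """Rank features by a weighted sum of normalized filter scores (assumed fusion rule)."""
    columns = list(zip(*rows))  # one tuple of values per feature
    per_filter = [
        min_max([abs_correlation(c, labels) for c in columns]),
        min_max([fisher_score(c, labels) for c in columns]),
    ]
    fused = [
        sum(w * scores[j] for w, scores in zip(weights, per_filter))
        for j in range(len(columns))
    ]
    ranking = sorted(range(len(columns)), key=lambda j: fused[j], reverse=True)
    return ranking, fused


# Toy data (hypothetical): 6 modules, 3 metrics, label 1 = faulty module.
rows = [(1, 5, 3), (2, 1, 3), (1, 4, 3), (9, 2, 3), (8, 5, 3), (9, 1, 3)]
labels = [0, 0, 0, 1, 1, 1]
ranking, fused = fused_ranking(rows, labels)
print(ranking)  # feature indices, most informative first
```

In this toy data the first metric separates faulty from clean modules and the third is constant, so the first metric ranks at the top and the constant one at the bottom. A classifier would then be trained on the top-ranked features only.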



