

Classification of Fault Prediction: A Mapping Study

Sasha Farhana Shamsul Anwar, Marshima Mohd Rosli and Nur Atiqah Sia Abdullah

Pertanika Journal of Science & Technology, Volume 30, Issue 3, July 2022

DOI: https://doi.org/10.47836/pjst.30.3.23

Keywords: Fault prediction techniques, fault prediction, software metrics, systematic mapping

Published on: 25 May 2022

Software fault prediction is an important activity in the testing phase of the software development life cycle and involves various statistical and machine learning techniques. These techniques help make accurate predictions that improve software quality. Researchers have applied different techniques to different datasets to build fault prediction models for software projects, but these techniques vary and are not generalised. As a result, it is difficult to choose a suitable technique for software fault prediction in a particular context or project. This systematic mapping study covers research on fault prediction techniques published from 1997 to 2020 and aims to classify those techniques by the types of problems that researchers need to solve. It structures and categorises the published research evidence on fault prediction, mapping a total of 82 papers to a classification scheme. The study identified research gaps and specific issues for practitioners, including the need to classify fault prediction techniques according to problem types and to provide a systematic way to identify suitable techniques for fault prediction models.


ISSN 0128-7702

e-ISSN 2231-8534

Article ID: JST-3033-2021