PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY

 

e-ISSN 2231-8526
ISSN 0128-7680


 

Robust Hybrid Classification Methods and Applications

Friday Zinzendoff Okwonu, Nor Aishah Ahad, Innocent Ejiro Okoloko, Joshua Sarduana Apanapudor, Saadi Ahmad Kamaruddin and Festus Irimisose Arunaye

Pertanika Journal of Science & Technology, Volume 30, Issue 4, October 2022

DOI: https://doi.org/10.47836/pjst.30.4.29

Keywords: Classification, least squares, linear prediction, prediction errors, robust

Published on: 28 September 2022

Sample-mean-based classifiers, such as the nearest mean classifier (NMC) and the Bayes classifier, are not robust because they are sensitive to outliers. Making these methods robust through weighting or data deletion can discard vital information. The focus of this study is therefore to develop robust hybrid univariate classifiers that rely on neither data weighting nor deletion. Two data transformation methods, the least squares approach (LSA) and the linear prediction approach (LPA), are applied to estimate the parameters of interest. The LSA and LPA estimates are used to develop two groups of univariate classifiers, and the predicted estimates from the two methods are further combined to develop four hybrid classifiers. These classifiers are applied to investigate whether horn length and base width can determine cattle gender, and whether shape can classify banana variety. The NMC, LSA, LPA, and hybrid classifiers showed that cattle gender can be determined from horn length and base width measurements, and the analysis further revealed that shape can determine banana variety. Comparative results on the two data sets showed that all the methods achieve over 90% prediction accuracy. The findings affirm that the performance of the NMC, LSA, LPA, and hybrid classifiers satisfies the data-dependent theory and that they are suitable for classifying agricultural products. The proposed methods could therefore be applied to perform classification tasks efficiently in many fields of study.
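The core ideas above can be illustrated with a minimal sketch. This is not the authors' implementation: the data values are hypothetical, and the LSA step here is a plain ordinary least squares line fit over the ordered observations, which may differ from the paper's exact formulation. The sketch shows a univariate nearest mean classifier and how least-squares-smoothed estimates can stand in for the raw sample means.

```python
from statistics import mean

def nearest_mean_classify(x, group_means):
    """Nearest mean classifier (NMC): assign x to the group whose
    mean is closest in absolute distance (univariate case)."""
    return min(range(len(group_means)), key=lambda g: abs(x - group_means[g]))

def lsa_fitted(y):
    """Least squares approach (LSA) sketch: fit a line y_i = a + b*i
    to the ordered observations and return the fitted (smoothed) values."""
    n = len(y)
    xs = list(range(n))
    xbar, ybar = mean(xs), mean(y)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, y)) / \
        sum((xi - xbar) ** 2 for xi in xs)
    a = ybar - b * xbar
    return [a + b * xi for xi in xs]

# Hypothetical horn-length measurements (cm) for two cattle groups.
male = [52.0, 55.1, 53.4, 54.2]
female = [41.3, 40.8, 42.5, 41.9]

# Group centers from raw sample means (NMC) and from LSA-smoothed values.
means_raw = [mean(male), mean(female)]
means_lsa = [mean(lsa_fitted(sorted(male))), mean(lsa_fitted(sorted(female)))]

print(nearest_mean_classify(53.0, means_raw))  # 0: nearer the male mean
print(nearest_mean_classify(42.0, means_lsa))  # 1: nearer the female mean
```

A hybrid classifier in the paper's sense would combine the LSA and LPA predicted estimates; the sketch above only shows the LSA branch feeding the nearest-mean decision rule.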



Article ID: JST-3314-2021
