e-ISSN 2231-8526
ISSN 0128-7680
Yik Siong Pang, Nor Aishah Ahad and Sharipah Soaad Syed Yahaya
Pertanika Journal of Science & Technology, Volume 30, Issue 4, October 2022
DOI: https://doi.org/10.47836/pjst.30.4.05
Keywords: Discriminant analysis, distance-based trimmed median, robust Mahalanobis squared distance
Published on: 28 September 2022
The commonly employed classical linear discriminant rule, which is based on the classical mean and covariance, is highly sensitive to outliers. The influence of outliers on location and scale estimation therefore degrades the accuracy of a discriminant rule and leads to high misclassification rates. Past studies used the classical Mahalanobis Squared Distance (MSD) to alleviate the problem. However, because the MSD is itself computed from the outlier-sensitive mean and covariance, the distance computation can still be distorted, causing masking and swamping effects. In a previous study, researchers proposed a double trimming procedure that incorporated an MSD-based α-trimmed mean into an MSD-based α-trimmed median to construct a robust classifier. However, that procedure has an overlooked flaw: it still employs the classical MSD in the computation. Thus, this study proposes employing a robust MSD in the distance-based trimmed median procedure. The improved trimmed median was then used to construct a robust linear discriminant rule, which was compared with the classical rule and existing robust rules in a simulation study. The results show that the proposed robust linear discriminant rule is more accurate and performs more consistently than the classical linear discriminant rule and two other robust linear discriminant rules.
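To make the idea of a distance-based trimmed median concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state which robust location and scatter estimators define the robust MSD, so this sketch assumes the coordinate-wise median as the centre and a diagonal scatter built from the normal-consistent MAD of each variable (which ignores correlations between variables). The function names `robust_msd` and `trimmed_median` and the default `alpha=0.2` are hypothetical.

```python
from statistics import median


def robust_msd(data):
    """Squared robust distance of each observation from the coordinate-wise median.

    Assumption: scatter is a diagonal matrix of squared MADs (scaled by 1.4826),
    used here only as a simple stand-in for a robust covariance estimate.
    """
    cols = list(zip(*data))
    centre = [median(c) for c in cols]
    scale = [1.4826 * median([abs(v - m) for v in c]) for c, m in zip(cols, centre)]
    if any(s == 0 for s in scale):
        raise ValueError("a variable has zero MAD; distances cannot be scaled")
    return [
        sum(((v - m) / s) ** 2 for v, m, s in zip(row, centre, scale))
        for row in data
    ]


def trimmed_median(data, alpha=0.2):
    """Drop the alpha fraction of observations with the largest robust
    distances, then return the coordinate-wise median of the rest."""
    d2 = robust_msd(data)
    keep = len(data) - int(alpha * len(data))
    kept_idx = sorted(range(len(data)), key=d2.__getitem__)[:keep]
    kept = [data[i] for i in kept_idx]
    return [median(c) for c in zip(*kept)]


# Nine well-behaved points around (2, 2) plus one gross outlier.
sample = [
    (1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (2.0, 2.0), (1.0, 1.0),
    (3.0, 2.0), (2.0, 3.0), (3.0, 1.0), (1.0, 3.0), (50.0, 60.0),
]
location = trimmed_median(sample, alpha=0.2)
```

Because the distances are measured from a robust centre and scale, the outlier receives by far the largest distance and is trimmed before the median is taken. Computing the distances from the classical mean and covariance instead would let the outlier inflate both, which is the masking problem the abstract describes.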