e-ISSN 2231-8526
ISSN 0128-7680
Friday Zinzendoff Okwonu, Nor Aishah Ahad, Joshua Sarduana Apanapudor and Festus Irismisose Arunaye
Pertanika Journal of Science & Technology, Volume 29, Issue 2, April 2021
DOI: https://doi.org/10.47836/pjst.29.2.16
Keywords: Coefficient of determination, Covid-19, multivariate correlation techniques, robust
Published on: 30 April 2021
Robust multivariate correlation techniques are proposed to determine the strength of the association between two or more variables of interest, because the existing multivariate correlation techniques are susceptible to outliers in the data. The performance of the proposed techniques was compared with that of the conventional multivariate correlation techniques. All techniques under study were applied to COVID-19 data sets for Malaysia and Nigeria to determine the level of association between the study variables: confirmed, discharged, and death cases. Performance was evaluated using the multivariate correlation (R), the multivariate coefficient of determination (R^2), and the Adjusted R^2. The proposed techniques gave R = 0.99, whereas the conventional methods gave R ranging from 0.44 to 0.73. The R^2 and Adjusted R^2 for the proposed methods are 0.98 and 0.97, while the conventional methods gave R^2 of 0.53, 0.44, and 0.19 and Adjusted R^2 of 0.52, 0.43, and 0.18, respectively. The proposed techniques strongly affirmed that for any patient to be discharged or to die of COVID-19, the patient must first be confirmed COVID-19 positive, whereas the conventional methods showed moderate to very weak affirmation. Based on these results, the proposed techniques are robust and show a much stronger association between the variables of interest than the conventional techniques.
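For readers unfamiliar with the three evaluation measures named in the abstract, the following is a minimal sketch of how the conventional (Pearson-based) multiple correlation R, R^2, and Adjusted R^2 are commonly computed. It is not the robust technique proposed in the paper, whose details are not given in the abstract. The function name `multiple_correlation` and the synthetic numbers are assumptions made for illustration only; they are not the Malaysian or Nigerian data.

```python
import numpy as np


def multiple_correlation(y, X):
    """Return (R, R2, adjusted R2) for y against the columns of X.

    Textbook Pearson-based formulas (an assumption, not the paper's
    robust estimator): R2 = c' Rxx^{-1} c, where c holds the
    correlations between y and each predictor and Rxx is the
    correlation matrix of the predictors.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single predictor as one column
    n, p = X.shape

    # Correlation matrix of [y, x1, ..., xp]; variables are in columns.
    corr = np.corrcoef(np.column_stack([y, X]), rowvar=False)
    c = corr[0, 1:]      # correlations of y with each predictor
    Rxx = corr[1:, 1:]   # correlations among the predictors

    r2 = float(c @ np.linalg.solve(Rxx, c))
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return float(np.sqrt(r2)), r2, adj_r2


# Synthetic illustration only: hypothetical daily counts, not real data.
rng = np.random.default_rng(0)
confirmed = rng.integers(50, 500, size=60).astype(float)
discharged = 0.6 * confirmed + rng.normal(0, 20, size=60)
deaths = 0.02 * confirmed + rng.normal(0, 2, size=60)

R, R2, adj = multiple_correlation(confirmed, np.column_stack([discharged, deaths]))
print(f"R={R:.2f}, R^2={R2:.2f}, Adjusted R^2={adj:.2f}")
```

Because this sketch relies on the ordinary Pearson correlation matrix, a few extreme observations can shift all three values noticeably, which is the sensitivity to outliers that the abstract says motivates a robust alternative.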