e-ISSN 2231-8526
ISSN 0128-7680
Nur Aziana Azwani Abdul Aziz, Masnida Hussin and Nur Raidah Salim
Pertanika Journal of Science & Technology, Volume 32, Issue 1, January 2024
DOI: https://doi.org/10.47836/pjst.32.1.14
Keywords: Naïve Bayes, privacy classification, public data attribute
Published on: 15 January 2024
The growth of the digital era, with its many electronic platforms, has made information sharing easy and fostered a culture of knowledge. Vast amounts of data and information can be reached anywhere, at any time, at one's fingertips. Much of this data is public because people willingly share it on digital platforms such as social media. However, not all information is meant to be public; some should be kept private or confidential. People often misunderstand which data needs to be secured and which can be shared. In this work, we propose an attribute-based data privacy classification model using a Naïve Bayesian classifier. It aims to identify and classify metadata (attributes) commonly accessible on digital platforms. We classified the collected attributes into three privacy classes, each representing a level of data privacy in terms of its risk of breach. Respondents from the public were selected across different age groups to gather their perspectives on the unclassified attributes. The survey input was then used in the Naïve Bayesian classifier to formulate data weights. The attributes, sorted into privacy classes, were then sent back to the respondents to obtain their agreement on the class assigned to each attribute. We compared our approach with another classifier approach, and the results show fewer conflicting reactions from the respondents to our approach. This study could make the public more aware of the implications of disclosing their information on open digital platforms.
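To make the classification step concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier that assigns survey responses to one of three privacy classes. It is not the authors' implementation. The survey rows, the feature names (`age`, `share`), the class labels (`low`, `medium`, `high`), and the use of Laplace smoothing are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict

# Hypothetical survey rows: one respondent's answer about one attribute.
# Feature names, values, and class labels are illustrative assumptions,
# not data from the study.
SURVEY = [
    ({"age": "18-29", "share": "yes"}, "low"),
    ({"age": "18-29", "share": "yes"}, "low"),
    ({"age": "30-49", "share": "yes"}, "low"),
    ({"age": "30-49", "share": "no"}, "medium"),
    ({"age": "50+", "share": "yes"}, "medium"),
    ({"age": "18-29", "share": "no"}, "medium"),
    ({"age": "50+", "share": "no"}, "high"),
    ({"age": "50+", "share": "no"}, "high"),
    ({"age": "30-49", "share": "no"}, "high"),
]


def train(rows, alpha=1.0):
    """Count classes and per-class feature values; alpha is the Laplace smoothing term."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    feature_values = defaultdict(set)    # feature name -> distinct values seen
    for features, label in rows:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
            feature_values[name].add(value)
    return class_counts, value_counts, feature_values, alpha


def posterior(model, features):
    """Return P(class | features) under the naive independence assumption."""
    class_counts, value_counts, feature_values, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for name, value in features.items():
            count = value_counts[(label, name)][value]
            k = len(feature_values[name])
            score += math.log((count + alpha) / (n + alpha * k))  # smoothed likelihood
        log_scores[label] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(weights.values())
    return {label: w / z for label, w in weights.items()}


def classify(model, features):
    """Pick the privacy class with the highest posterior probability."""
    probs = posterior(model, features)
    return max(probs, key=probs.get)


model = train(SURVEY)
print(classify(model, {"age": "18-29", "share": "yes"}))
print(posterior(model, {"age": "50+", "share": "no"}))
```

The posterior probabilities returned by `posterior` play a role loosely similar to the data weights mentioned above: they indicate how strongly the survey responses support each privacy class. With real survey data, the features would be replaced by whatever questions the respondents actually answered.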