PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY

 

e-ISSN 2231-8526
ISSN 0128-7680

An Attribute-based Data Privacy Classification Through the Bayesian Theorem to Raise Awareness in Public Data Sharing Activity

Nur Aziana Azwani Abdul Aziz, Masnida Hussin and Nur Raidah Salim

Pertanika Journal of Science & Technology, Volume 32, Issue 1, January 2024

DOI: https://doi.org/10.47836/pjst.32.1.14

Keywords: Naïve Bayes, privacy classification, public data attribute

Published on: 15 January 2024

The growth of the digital era, with its diverse electronic platforms, facilitates information sharing and fosters a culture of knowledge. Vast amounts of data and information can be reached anywhere, at any time, at one's fingertips. Much of this data is public because people willingly share it on digital platforms such as social media. However, not all information is meant to be public; some should be kept private or confidential. People often misunderstand, or are misled about, which data needs to be secured and which can be shared. In this work, we propose an attribute-based data privacy classification model using a Naïve Bayesian classifier. It aims to identify and classify the metadata (attributes) commonly accessible on digital platforms. We classify the collected attributes into three privacy classes, each representing a level of data privacy in terms of its risk of breach. Respondents from the public are selected across different age groups to gather their perspectives on the unclassified attribute data. The survey input is then used in the Naïve Bayesian classifier to formulate data weights. The attributes, sorted into privacy classes, are then sent back to the respondents to obtain their agreement on the class assigned to each attribute. We compare our approach with another classifier approach. The results show fewer conflicting reactions from the respondents to our approach. This study could raise public awareness of the implications of disclosing personal information on open digital platforms.
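To make the classification step more concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier with Laplace smoothing, assuming each survey response records an attribute, the respondent's age group, and the privacy class the respondent chose. The class names (`public`, `private`, `confidential`), the feature names, the smoothing parameter `alpha`, and the toy records are all hypothetical illustrations; the abstract does not specify them, and this is not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical labels: the abstract only states that there are three classes.
CLASSES = ("public", "private", "confidential")


def train(records, alpha=1.0):
    """Count class and feature-value frequencies from labelled records.

    records: list of (features, label) pairs, where features is a dict
    such as {"attribute": "email", "age_group": "18-25"}.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    class_counts = Counter(label for _, label in records)
    # value_counts[label][feature_name][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen per feature
    for features, label in records:
        for name, value in features.items():
            value_counts[label][name][value] += 1
            vocab[name].add(value)
    return class_counts, value_counts, vocab, alpha


def posterior(model, features):
    """Return P(class | features), treating features as independent."""
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label in CLASSES:
        n_c = class_counts[label]
        # Smoothed prior P(class)
        score = (n_c + alpha) / (total + alpha * len(CLASSES))
        for name, value in features.items():
            count = value_counts[label][name][value]
            n_values = len(vocab.get(name, ())) or 1
            # Smoothed likelihood P(value | class)
            score *= (count + alpha) / (n_c + alpha * n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


def classify(model, features):
    """Pick the class with the highest posterior probability."""
    post = posterior(model, features)
    return max(post, key=post.get)


# Toy survey responses, invented purely for illustration.
SURVEY = [
    ({"attribute": "hobby", "age_group": "18-25"}, "public"),
    ({"attribute": "hobby", "age_group": "26-40"}, "public"),
    ({"attribute": "hobby", "age_group": "41+"}, "public"),
    ({"attribute": "email", "age_group": "18-25"}, "public"),
    ({"attribute": "email", "age_group": "26-40"}, "private"),
    ({"attribute": "email", "age_group": "41+"}, "private"),
    ({"attribute": "bank_account", "age_group": "18-25"}, "confidential"),
    ({"attribute": "bank_account", "age_group": "26-40"}, "confidential"),
    ({"attribute": "bank_account", "age_group": "41+"}, "confidential"),
]

model = train(SURVEY)
print(classify(model, {"attribute": "bank_account", "age_group": "26-40"}))
print(posterior(model, {"attribute": "email", "age_group": "26-40"}))
```

The posterior probabilities returned here could play the role of the "data weights" mentioned above, though how the authors derive their weights is not stated in this abstract.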

Article ID

JST-4252-2023
