A robust privacy preserving approach for electronic health records using multiple dataset with multiple sensitive attributes
journal contribution
posted on 2024-10-14, 11:13 authored by Tehsin Kanwal, Adeel Anjum, Saif UR Malik, Haider Sajjad, Abid Khan, Umar Manzoor, Alia Asheralieva
Privacy-preserving data publishing of electronic health records (EHRs) for 1:M datasets with multiple sensitive attributes (MSAs) is an interesting and challenging issue. There is always a trade-off between privacy and utility in data publishing. Most privacy-preserving models show critical privacy disclosure issues and, hence, are not robust on practical datasets. The k-anonymity model is a widely used privacy model for analyzing privacy disclosures; however, it protects only against identity disclosure. To address the limitations of k-anonymity, a group of privacy model extensions has been proposed in past years. These include the p-sensitive k-anonymity model, the p+-sensitive k-anonymity model, and the balanced p+-sensitive k-anonymity model. However, these privacy-preserving models are not sufficient to preserve the privacy of end users in practical datasets. In this paper, we formalize the behavior of an adversary who performs identity and attribute disclosures on the balanced p+-sensitive k-anonymity model with the help of adversarial scenarios. Because the balanced p+-sensitive k-anonymity model is not sufficient for privacy preservation in 1:M datasets with MSAs, we propose an extended privacy model called “1: M MSA-(p, l)-diversity” for 1:M datasets with MSAs. We then perform formal modeling and verification of the proposed model using High-Level Petri Nets (HLPN) to confirm that the privacy attacks are invalidated. Experimental results show that our proposed “1: M MSA-(p, l)-diversity” model is efficient and provides enhanced utility of the published data.
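To make the gap between identity disclosure and attribute disclosure concrete, the following minimal Python sketch checks the textbook definitions of k-anonymity and p-sensitive k-anonymity on a toy table. It is an illustration only and is not the “1: M MSA-(p, l)-diversity” model proposed in the paper. The function names, the column names (`age`, `zip`, `disease`, `treatment`), and the records are all hypothetical.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """k-anonymity: every equivalence class holds at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_p_sensitive_k_anonymous(records, quasi_identifiers,
                               sensitive_attributes, k, p):
    """p-sensitive k-anonymity: k-anonymity holds, and every equivalence
    class has at least p distinct values for each sensitive attribute."""
    groups = equivalence_classes(records, quasi_identifiers)
    for group in groups.values():
        if len(group) < k:
            return False
        for attribute in sensitive_attributes:
            if len({record[attribute] for record in group}) < p:
                return False
    return True


# Hypothetical, already-generalized toy table with two sensitive attributes.
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu", "treatment": "rest"},
    {"age": "30-39", "zip": "130**", "disease": "flu", "treatment": "antiviral"},
    {"age": "40-49", "zip": "148**", "disease": "cancer", "treatment": "chemo"},
    {"age": "40-49", "zip": "148**", "disease": "diabetes", "treatment": "insulin"},
]

quasi_identifiers = ["age", "zip"]
sensitive_attributes = ["disease", "treatment"]

k_ok = is_k_anonymous(records, quasi_identifiers, k=2)
p_ok = is_p_sensitive_k_anonymous(
    records, quasi_identifiers, sensitive_attributes, k=2, p=2
)
print(k_ok, p_ok)
```

In this toy table every equivalence class contains two records, so the 2-anonymity check passes, yet both records in the first class share the same `disease` value. An adversary who can place a person in that class learns the diagnosis without identifying the exact record, which is the kind of attribute disclosure that k-anonymity alone does not prevent.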
Funding
National Natural Science Foundation of China (NSFC): project No. 61950410603