Page 741 - Shaping smarter and more sustainable cities - Striving for sustainable development goals
Table 7 – Anonymized medical record
Birth   Gender   ID    Problem
1970    Male     121   Cold
1970    Male     121   Obesity
1970    Male     121   Diabetes
198*    Human    12*   Diabetes
198*    Human    12*   Obesity
198*    Human    12*   Diabetes
198*    Human    12*   Cold
As these tables show, masking and generalization prevent an attacker from identifying a
specific person. Several algorithms exist for computing a suitable masking or generalization.
The most popular is a heuristic search method that uses doubly nested loops.
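As a rough illustration of masking and generalization, the following Python sketch turns raw records into the form used in Table 7 and counts how many records share each combination of quasi‐identifiers. The raw values, the function names (generalize, smallest_block) and the fixed masking rules are assumptions made for this example. It is not the heuristic search method mentioned above, which would look for a suitable generalization instead of hard‐coding one.

```python
from collections import Counter

# Hypothetical raw records (birth year, gender, ID, problem); not taken from the report.
RAW = [
    (1982, "Male", 123, "Diabetes"),
    (1985, "Female", 127, "Obesity"),
    (1988, "Female", 125, "Diabetes"),
    (1981, "Male", 129, "Cold"),
]

def generalize(record):
    """Mask the last digit of birth year and ID, and generalize gender to 'Human'."""
    birth, _gender, ident, problem = record
    return (str(birth)[:-1] + "*", "Human", str(ident)[:-1] + "*", problem)

def smallest_block(records):
    """Return the size of the smallest group sharing the same quasi-identifiers."""
    blocks = Counter(r[:3] for r in records)  # first three fields are quasi-identifiers
    return min(blocks.values())

anonymized = [generalize(r) for r in RAW]
print(smallest_block(RAW))         # 1: every raw record is unique
print(smallest_block(anonymized))  # 4: all records now fall into one block
```

The size of the smallest block is the k for which the table is k‐anonymous, so the hard‐coded rules here raise k from 1 to 4 at the cost of coarser data.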
(2) l‐diversity
l‐diversity [68] is a method designed to protect the privacy of data. Unlike k‐anonymity, this
method takes the diversity of the sensitive attributes into account.
The definition of l‐diversity is as follows: "In all q*‐blocks in a data table, there are at least l different
sensitive attributes." Here, a q*‐block is the set of records that share the same generalized
quasi‐identifier values.
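The definition can be checked mechanically. The sketch below, offered only as an illustration, groups the rows of Table 7 into q*‐blocks and counts the distinct sensitive values in each block. The function name diversity is an assumption, problem names are capitalized uniformly, and only the simple "distinct values" reading of the definition is covered.

```python
from collections import defaultdict

# Rows of Table 7: (birth, gender, ID, problem).
TABLE_7 = [
    ("1970", "Male", "121", "Cold"),
    ("1970", "Male", "121", "Obesity"),
    ("1970", "Male", "121", "Diabetes"),
    ("198*", "Human", "12*", "Diabetes"),
    ("198*", "Human", "12*", "Obesity"),
    ("198*", "Human", "12*", "Diabetes"),
    ("198*", "Human", "12*", "Cold"),
]

def diversity(rows):
    """Return the largest l for which every q*-block has l distinct sensitive values."""
    blocks = defaultdict(set)
    for *quasi_identifiers, sensitive in rows:
        blocks[tuple(quasi_identifiers)].add(sensitive)
    return min(len(values) for values in blocks.values())

print(diversity(TABLE_7))  # 3
```

Both blocks of Table 7 contain Cold, Obesity and Diabetes, so under this reading the table is 3‐diverse.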
Researchers designed this method to provide protection from the following attacks.
(i) Homogeneity attack:
Table 8 is an additional example of a medical record data table. In this case, if an attacker
has acquired Alice's quasi‐identifier, the attacker can read Alice's problem from this table
because no diversity exists for the sensitive attributes in the q*‐block.
(ii) Background knowledge attack:
Although the q*‐block in the table has a diversity of sensitive attributes, if an attacker
knows that the probability of poor circulation is very low for males, the attacker can rule
it out and read Bob's problem from the table.
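Both attacks can be sketched in a few lines of Python. The records below are hypothetical and stand in for a table of this kind; the function name candidates and its ruled_out parameter are likewise assumptions. The function returns the sensitive values an attacker still considers possible for a target whose quasi‐identifier is known.

```python
# Hypothetical anonymized records: (birth, gender, ID, problem).
RECORDS = [
    ("197*", "Human", "12*", "Diabetes"),
    ("197*", "Human", "12*", "Diabetes"),
    ("197*", "Human", "12*", "Diabetes"),
    ("198*", "Human", "13*", "Poor circulation"),
    ("198*", "Human", "13*", "Diabetes"),
    ("198*", "Human", "13*", "Poor circulation"),
]

def candidates(records, quasi_identifier, ruled_out=()):
    """Sensitive values still possible for a target with this quasi-identifier."""
    values = {r[3] for r in records if r[:3] == quasi_identifier}
    return values - set(ruled_out)

# Homogeneity attack: the block holds a single value, so it is disclosed outright.
alice = candidates(RECORDS, ("197*", "Human", "12*"))
print(alice)  # {'Diabetes'}

# Background knowledge attack: the block has two values, but the attacker
# considers one of them implausible for the target and discards it.
bob = candidates(RECORDS, ("198*", "Human", "13*"), ruled_out=["Poor circulation"])
print(bob)  # {'Diabetes'}
```

Whenever the returned set has exactly one element, the target's problem is revealed even though no record names the target.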
l‐diversity provides stronger privacy protection than k‐anonymity. However, the calculation
cost of l‐diversity is higher than that of k‐anonymity.
____________________
68 Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, Muthuramakrishnan Venkitasubramaniam; L‐diversity:
Privacy beyond k‐anonymity; ACM Transactions on Knowledge Discovery from Data (TKDD), Vol. 1, No. 1, 2007.
ITU‐T's Technical Reports and Specifications 731