Page 740 - Shaping smarter and more sustainable cities - Striving for sustainable development goals

(1)     k‐anonymity
            k‐anonymity is one of the methods used for generalization,⁶⁷ and it is the basis of l‐diversity. The
            method is explained using the definitions listed below.

            (i)     Data table:
                    A list of data organized like a database table is termed a "data table." Each of its
                    columns is termed an "attribute"; address, birth, and gender are examples of attributes.
                    The group of values in one row, corresponding to one person or one group of people, is
                    termed a "data set" or "tuple".
            (ii)    Attribute:
                    An attribute that can identify the corresponding person by itself, such as a name or
                    unique ID, is termed an "identifier." An attribute that cannot identify a person on its
                    own but can do so when combined with other attributes, such as illness, birth, or
                    gender, is termed a "quasi‐identifier".
            (iii)   Sensitive attribute:
                    An attribute that is significant for secondary use is termed a "sensitive attribute"; it
                    is selected from the attributes that are not identifiers. Anonymization does not mask
                    or generalize this attribute. Furthermore, a group of tuples that share the same
                    quasi‐identifier values is termed a "q*‐block".
            The definition of k‐anonymity is as follows: "Each q*‐block in the data table contains at least k
            tuples".
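            The definition can be illustrated with a minimal sketch. It groups tuples into q*‐blocks by
            their quasi‐identifier values and reports the size of the smallest block, which is the largest
            k the table satisfies. The function names q_blocks and k_anonymity and the dictionary layout
            are assumptions made for illustration; the rows reproduce Table 6 of this section.

```python
from collections import defaultdict

def q_blocks(rows, quasi_identifiers):
    """Group tuples that share the same quasi-identifier values (hypothetical helper)."""
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[attr] for attr in quasi_identifiers)
        blocks[key].append(row)
    return dict(blocks)

def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest q*-block, i.e. the largest k the table satisfies."""
    blocks = q_blocks(rows, quasi_identifiers)
    return min(len(block) for block in blocks.values()) if blocks else 0

# Quasi-identifiers and rows as given for Table 6; "Problem" is the sensitive attribute.
QI = ("Birth", "Gender", "ID")
TABLE_6 = [
    {"Birth": 1970, "Gender": "Male",   "ID": 121, "Problem": "Cold"},
    {"Birth": 1970, "Gender": "Male",   "ID": 121, "Problem": "Obesity"},
    {"Birth": 1970, "Gender": "Male",   "ID": 121, "Problem": "Diabetes"},
    {"Birth": 1980, "Gender": "Female", "ID": 121, "Problem": "Diabetes"},
    {"Birth": 1980, "Gender": "Female", "ID": 121, "Problem": "Obesity"},
    {"Birth": 1981, "Gender": "Male",   "ID": 125, "Problem": "Diabetes"},
    {"Birth": 1981, "Gender": "Male",   "ID": 125, "Problem": "Cold"},
]

blocks = q_blocks(TABLE_6, QI)
k = k_anonymity(TABLE_6, QI)
print(f"{len(blocks)} q*-blocks, k = {k}")
```

            Taking the minimum block size reflects the wording "at least k tuples": a single small block
            is enough to lower k for the whole table.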

            Table 6 is an example of a medical records data table. In this table, the sensitive attribute is
            "Problem" and the quasi‐identifiers are "Birth," "Gender," and "ID." With the rows numbered t1 to t7
            from top to bottom, the data consist of three q*‐blocks: t1 to t3, t4 and t5, and t6 and t7. The
            smallest block holds two tuples, so the table satisfies k=2. Even if an attacker who wants to learn
            a specific individual's problem has already obtained that individual's quasi‐identifier values, the
            attacker can narrow the candidates down to no fewer than two tuples. Table 7 shows the result of
            anonymizing Table 6 to reach k=3. It demonstrates that anonymization methods can provide the
            required level of privacy protection by means of masking or generalization.


                                                Table 6 – Medical record

                                     Birth      Gender        ID       Problem

                                     1970       Male          121      Cold
                                     1970       Male          121      Obesity
                                     1970       Male          121      Diabetes
                                     1980       Female        121      Diabetes
                                     1980       Female        121      Obesity
                                     1981       Male          125      Diabetes
                                     1981       Male          125      Cold
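            How masking and generalization can raise k may be sketched as follows. The rule used here is
            purely an assumption for illustration and is not claimed to be the one behind Table 7: "Birth"
            is generalized to its decade, and "Gender" and "ID" are masked with "*". The sensitive attribute
            "Problem" is left untouched, as the definitions above require. The names anonymize and
            k_anonymity are hypothetical.

```python
from collections import Counter

QI = ("Birth", "Gender", "ID")

# Rows of Table 6 as (Birth, Gender, ID, Problem).
TABLE_6 = [
    (1970, "Male", 121, "Cold"),
    (1970, "Male", 121, "Obesity"),
    (1970, "Male", 121, "Diabetes"),
    (1980, "Female", 121, "Diabetes"),
    (1980, "Female", 121, "Obesity"),
    (1981, "Male", 125, "Diabetes"),
    (1981, "Male", 125, "Cold"),
]
rows = [dict(zip(QI + ("Problem",), r)) for r in TABLE_6]

def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[a] for a in quasi_identifiers) for r in rows)
    return min(counts.values()) if counts else 0

def anonymize(row):
    """Assumed rule: generalize Birth to its decade, mask Gender and ID."""
    out = dict(row)
    out["Birth"] = f"{row['Birth'] // 10 * 10}s"  # generalization, e.g. 1981 -> "1980s"
    out["Gender"] = "*"                           # masking
    out["ID"] = "*"                               # masking
    return out                                    # "Problem" is left unchanged

anonymized = [anonymize(r) for r in rows]
k_before = k_anonymity(rows, QI)
k_after = k_anonymity(anonymized, QI)
print(f"k before: {k_before}, k after: {k_after}")
```

            Under this assumed rule the seven tuples fall into two q*‐blocks of three and four tuples. A
            milder rule that only generalizes "Birth" would leave k unchanged for this table, which shows
            that the choice of rule trades data utility against the protection level.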




            ____________________
            67   L. Sweeney; k‐anonymity: A model for protecting privacy; International Journal of Uncertainty, Fuzziness and
               Knowledge‐Based Systems, vol. 10, no. 5, pp. 557–570, 2002.

            730                                                      ITU‐T's Technical Reports and Specifications