Page 99 - Proceedings of the 2017 ITU Kaleidoscope
Challenges for a data-driven society
development data ecosystem, exploring both the willingness of the participants to share their data with specific stakeholders, as well as the factors that would inform their willingness to share (or not to share) their personal data.

On a continuous scale from 1 (“low willingness to share”) to 7 (“high willingness to share”), the participants are most willing (mean 6.58) to share their personal health data with their doctors, and least willing (mean 3.24) to share their data with pharmaceutical companies (Figure 4).

Further analysis was undertaken to understand how the participants’ attitudes towards sharing their personal data correlate across the different stakeholders. For this analysis a Spearman correlation matrix was derived, and agglomerative hierarchical clustering (complete-linkage method), using the Euclidean distance between the correlation scores, was subsequently undertaken to identify the main clusters for the different stakeholders (Figure 5).

Fig. 5. Data sharing entities clustering

From this analysis three primary clusters of stakeholders are noted (cutting the dendrogram in Figure 5 at the height of 1.5) and these are: Cluster 1 - individual’s doctor; Cluster 2 - NGO working on health issues, national Department of Health, National Statistics Department, a pharmaceutical company, and the World Health Organization; and Cluster 3 - family members and friends. A clear characterization emerges from these clusters, based on the relationship between the stakeholders and the individual, and the nature of the utility that accrues to the individual, as follows:
• Cluster 1 is a stakeholder that is able to use the shared personal health data towards the provisioning of an immediate health service, wherein the data can be used for health monitoring or to inform diagnosis of medical ailments. This therefore represents a direct (and future) benefit to the individual.
• Cluster 2 comprises organizational entities within the wider health sector, with clear sub-clusters of governmental, non-governmental, and international/multinational organizations. The benefits that accrue to the individual from sharing data with these entities are indirect and generally not immediate.
• Cluster 3 comprises entities with a high social proximity to the individual, where the sharing of the personal health data could be oriented more towards the associated social benefits, such as sense-making [16] and social support [32].

The findings from the survey are that these initial clusters of stakeholders not only highlight the need for differentiated data sharing arrangements with entities within the data ecosystem, but also point to the willingness of the participants to consider sharing their data across the ecosystem.

The advent of social media has meant that individuals are increasingly used to sharing their data. However, much of the voluntary and active sharing of data typically takes place in the context of the individuals’ own social networks. Currently a lot of individuals’ data is collected, without their full awareness and complicity, from individuals’ digital traces and from tracking of individuals online through surveillance. Solove suggests a taxonomy that identifies four basic groups of activities in which violations of individuals’ privacy can occur [33]: information collection – in which activities such as surveillance and interrogation can be employed (by data holders) to gather information about individuals (the data subjects); information processing – involving the aggregation and analysis of the data; information dissemination – which encapsulates activities such as breach of confidentiality, disclosure, exposure, blackmail and distortion, which would contribute towards violating individuals’ privacy; and lastly invasion – which is not about individuals’ information but rather about violating the privacy associated with individuals’ personhood. Contention over, and opposition to, the practice of mass collection of individuals’ data is growing, and civil society is increasingly pushing back, seeking increased privacy and confidentiality of individuals’ data and control over who collects the data, what data is collected, and how the data is used (i.e. increased data legibility [17]).

As such, beyond just understanding the participants’ attitudes towards sharing data with specific stakeholders, this research also sought to investigate the factors that affect the willingness of participants to share data, based on 10 pre-selected factors, each evaluated on a continuous scale from 1 (low influence) to 7 (high influence).
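
The clustering analysis described in this section (a Spearman correlation matrix, followed by complete-linkage agglomerative clustering on the Euclidean distance between correlation scores, with the dendrogram cut at a height of 1.5) can be illustrated with a minimal sketch. This is not the authors' code, and the paper does not state which software was used. The function names (`average_ranks`, `pearson`, `spearman_matrix`, `complete_linkage_clusters`), the stakeholder labels A to D, and the ratings are all hypothetical, made up purely for illustration; only the cut height of 1.5 and the choice of methods come from the text.

```python
from itertools import combinations
from math import dist, sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_matrix(columns):
    """Spearman correlation: Pearson correlation of rank-transformed columns."""
    ranked = [average_ranks(col) for col in columns]
    return [[pearson(a, b) for b in ranked] for a in ranked]


def complete_linkage_clusters(points, cut_height):
    """Merge clusters bottom-up; stop once the closest pair exceeds cut_height.

    For complete linkage this is equivalent to cutting the dendrogram
    at cut_height, because merge heights never decrease.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(c1, c2):
        # Complete linkage: distance between the two farthest members
        return max(dist(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        if gap(clusters[a], clusters[b]) > cut_height:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Made-up 1-7 ratings from six hypothetical respondents (NOT the survey data)
ratings = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 1, 3, 4, 6, 5],
    "C": [6, 5, 4, 3, 2, 1],
    "D": [5, 6, 4, 3, 1, 2],
}
names = list(ratings)
corr = spearman_matrix(list(ratings.values()))
# Each stakeholder is represented by its row of correlation scores
clusters = complete_linkage_clusters(corr, cut_height=1.5)
print([[names[i] for i in c] for c in clusters])
```

With these toy ratings, stakeholders A and B fall into one cluster and C and D into another, because each pair has nearly identical rows in the correlation matrix. In practice a statistics library would normally be used instead of hand-written routines; the explicit version is shown only to make each step of the procedure visible.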