Page 47 - Crowdsourcing AI and Machine Learning solutions for SDGs - ITU AI/ML Challenges 2024 Report
Annex 3: Data Sharing Guidelines
The success of the ITU AI/ML Challenge depends on the availability of data and on whether entities
(data owners) are willing to share data with others. Rapid and unrestricted sharing of data and
resources is essential for advancing the Challenge. However, there are cases where unrestricted
data sharing is not possible. For such cases, this document describes measures that allow data
providers to share relevant data with problem solvers or researchers under specific agreements
that ensure data integrity. An institutional data-sharing guideline is therefore the first step
towards encouraging companies, entities (data providers), collaborators, researchers, and
professionals to share relevant data for the Challenge.
NOTE – Data providers/owners are entities that have data to share for specific problem statements.
This data may be useful for training and testing AI/ML models.
This document outlines data management and sharing guidelines. These guidelines help
data owners derive maximum value from their data while protecting the interests of their
institution and its members.
1 Data Classification Categories
For the purposes of the ITU AI/ML Challenge, we consider the data classification categories [1]
below:
Table 3: Data Classification Categories

Data Category: Public/Open Data
Description: Data that can be made publicly available because disclosure is associated with
little or minimal privacy impact on individuals and/or organizations. This includes data that is
anonymous, aggregated, or non-sensitive.
NOTE – This kind of data can be shared without any restrictions.

Data Category: Restricted Data
Description: Data that is moderately sensitive and cannot be shared publicly as is, because
disclosure can have a minor privacy impact on an individual, put an individual or community at
risk of a privacy incident, or negatively affect an organization’s capacity to compete in the
market or carry out its activities. Example: measurement data obtained per access network or
per access network site.
NOTE – This kind of data needs to be pre-processed to remove the privacy impact before being
shared.
Restricted data may be available only under certain conditions set forth by the data provider.
Example-1: Restricted data may be made available after signing an NDA.
Example-2: Restricted data may be available only for use within the hosted platform and may not
be moved off it (i.e. downloading the data may be disallowed).
Example-3: Restricted data may be available only to citizens of a particular country or region,
e.g. under the data privacy regulations of the EU or China.
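The two categories and the three example conditions above could be encoded on a hosting platform roughly as follows. This is only an illustrative sketch, not part of the guidelines. All names (`Category`, `Dataset`, `Requester`, `may_access`, `may_download`) are hypothetical, and a single `region` string stands in for the citizenship or regulatory check of Example-3, which in practice would be more involved.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    PUBLIC = "public/open"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class Dataset:
    # Hypothetical record a data provider might attach to a shared dataset.
    name: str
    category: Category
    requires_nda: bool = False                 # Example-1: NDA must be signed
    platform_only: bool = False                # Example-2: no downloads
    allowed_regions: frozenset = frozenset()   # Example-3: empty means no region limit


@dataclass(frozen=True)
class Requester:
    signed_nda: bool
    region: str  # assumed simplification of citizenship/jurisdiction


def may_access(ds: Dataset, who: Requester) -> bool:
    """Return True if the requester meets every condition set by the provider."""
    if ds.category is Category.PUBLIC:
        return True  # public/open data is shared without restrictions
    if ds.requires_nda and not who.signed_nda:
        return False
    if ds.allowed_regions and who.region not in ds.allowed_regions:
        return False
    return True


def may_download(ds: Dataset, who: Requester) -> bool:
    """Downloading additionally requires that the data is not platform-only."""
    return may_access(ds, who) and not ds.platform_only


open_ds = Dataset("kpi_summary", Category.PUBLIC)
site_ds = Dataset(
    "per_site_measurements",
    Category.RESTRICTED,
    requires_nda=True,
    platform_only=True,
    allowed_regions=frozenset({"EU"}),
)
```

With these hypothetical records, a requester in the allowed region who has signed the NDA could work with `site_ds` on the platform but could not download it, while `open_ds` is available to anyone.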