Page 760 - AI for Good Innovate for Impact




                      Reduced Inequality

                      By creating tools that can process and interpret audio data in Kurdish, the project helps ensure
                      that feedback and interviews from Kurdish-speaking populations are accurately captured
                      and understood. Transcribing voice into text allows for easier translation, classification,
                      summarization, and integration into data systems such as dashboards and reports. This
                      amplifies the voices of marginalized communities and enables more inclusive decision-making
                      processes, ultimately supporting equity and representation.


                      Partnerships and Collaboration

                      The project is grounded in collaboration with key partners to maximize impact. One partnership
                      is with a local university that specializes in Kurdish computational linguistics, providing deep
                      academic insight and cultural relevance. Another is with Kobo, the organization behind
                      KoboToolbox, a widely used platform for data collection and visualization. By working with
                      these partners, the project ensures the practical deployment of the language model within
                      humanitarian data workflows and promotes adoption at scale through trusted platforms.


                      2.3     Future Work

                      Direct next steps:
                      •    Conduct Data Collection: We will commission 10 hours of synthetic data and collect
                           humanitarian domain-specific audio data using a KoboToolbox survey.
                      •    Development of Kurdish Speech to Text Language Model: We will set up pipelines for
                           data processing, develop and fine-tune the speech-to-text model, and evaluate the model
                           with test data and real-life testing.
                      •    KoboToolbox Integration: We will create an API and establish a server for hosting
                           language models, and modify KoboToolbox software to include the custom language
                           model.
                      •    Test the Integrated Model: We will test the integrated model in real-life settings and
                           capture feedback.
                      •    Communication and Dissemination: We will prepare a summary report and host a
                           dissemination conference.
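
                      The evaluation step above does not name a metric. As an illustration only, the sketch
                      below assumes word error rate (WER), a common measure for speech-to-text output, and
                      computes it as word-level edit distance divided by the reference length. The function
                      name and the placeholder transcripts are hypothetical and are not taken from the project.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length.

    Assumption: whitespace tokenization is adequate. A real Kurdish pipeline
    would likely need script and orthography normalization first.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev_row[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr_row.append(
                min(
                    prev_row[j] + 1,                      # deletion
                    curr_row[j - 1] + 1,                  # insertion
                    prev_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev_row = curr_row

    return prev_row[-1] / len(ref_words)


# Hypothetical placeholder transcripts (not project data).
reference_text = "water is needed in the camp"
model_output = "water is need in camp"
wer = word_error_rate(reference_text, model_output)
```

                      Averaging this score over a held-out test set, and separately over recordings from
                      real-life testing, would give one comparable number per model version.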

                      Further future work:

                      •    Expanding the model to support additional Kurdish dialects beyond Sorani and Kurmanji,
                           ensuring broader coverage and usability across different regions.
                      •    Exploring the development of a Kurdish Large Language Model (LLM) capable of
                           performing basic tasks such as summarization, translation, and Q&A with the transcribed
                           text. This would enhance the utility of the transcriptions and provide more comprehensive
                           language support.
                      •    Collaborating with other operations to adapt the developed template for other low-
                           resource languages. This would involve customizing the model and processes to fit the
                           specific linguistic and operational needs of different regions and languages.

                      Our project has the potential for several collaborations and expansions:

                      •    Partnering with other international organizations working in similar fields to share
                           knowledge, resources, and best practices. This can help in scaling the project to other
                           regions and similar languages.





