AI Ready – Analysis Towards a Standardized Readiness Framework
formulas will be integrated into the system. As future work, mobile and/or web applications
will be developed.
This project draws on prior research on statistical versus neural machine translation for Khmer
Braille [86][87], Khmer word segmentation using conditional random fields [88], and a Khmer
Braille book for blind people [89].
The use case aims to develop mobile and web applications that support machine translation for
Khmer Braille.
4.9.2 Live Primary Health Care African National Sign Language Translation
Tool
This use case [77][78][79] addresses communication barriers in critical health care services,
especially between deaf individuals and service providers. By providing AI-powered live sign
language translation and multi-modal content analysis, it supports both text-to-speech and
speech-to-sign-language translation. The result is an AI-powered live sign language translation
tool that translates between at least 25 African sign languages and spoken/written language in
real time.
This use case uses AutoML, which automatically prepares a dataset for model training, performs
a set of trials using open-source libraries such as scikit-learn and XGBoost, and creates a Python
notebook with the source code for each trial run so that the code can be revised, reproduced, and
modified. It also uses hyperparameter tuning to fine-tune the model. With these techniques,
speech-to-sign-language translation with facial animation, and the reverse direction, are
achievable.
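The trial-based AutoML workflow described above can be sketched as a loop that pairs candidate models with hyperparameter grids and keeps the best result. This is a minimal illustration using scikit-learn only; XGBoost estimators would be plugged in the same way. The dataset, the candidate models, and the grids are illustrative assumptions, not the use case's actual configuration.

```python
# Sketch of an AutoML-style trial loop with per-trial hyperparameter tuning.
# All model/grid choices and the dataset are illustrative stand-ins.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)  # stand-in for the sign-language dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each "trial" pairs a candidate estimator with a small hyperparameter grid.
trials = [
    (LogisticRegression(max_iter=2000), {"C": [0.1, 1.0]}),
    (RandomForestClassifier(random_state=0), {"n_estimators": [50, 100]}),
    (KNeighborsClassifier(), {"n_neighbors": [3, 5]}),
]

best_score, best_model = 0.0, None
for estimator, grid in trials:
    search = GridSearchCV(estimator, grid, cv=3)  # tune within each trial
    search.fit(X_train, y_train)
    score = search.score(X_test, y_test)
    if score > best_score:
        best_score, best_model = score, search.best_estimator_

print(f"best model: {type(best_model).__name__}, accuracy: {best_score:.3f}")
```

In a full AutoML system, each trial would additionally emit a notebook capturing its source code, which is what makes runs reproducible and modifiable as the text describes.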
The tool has been applied to the health care sector in Zimbabwe, benefiting deaf people and
service providers. An extension of the AI-based African national sign translation tool beyond
health care to the banking, finance, insurance, and investment industries in Africa is planned.
Regional sign language dialects may also be integrated to improve its usefulness in public
health care.
4.9.3 Smartphone OS-based Information Accessibility Solutions and Public
Welfare for People with Disabilities
This use case [2][85] introduces smartphone-based technologies that benefit people with
disabilities. These include automatic speech recognition, note recognition algorithms,
text-to-speech and speech-to-text translation, and sound recognition, together with Chinese
sign language recognition, which fills an accessibility gap for Chinese hearing-impaired people.
The multimodal large model uses a large language model as its base and adds a visual module
to it, enabling the model to simultaneously process data from both text and image modalities.
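The "language model base plus visual module" pattern described above is commonly realized by projecting image features into the language model's token-embedding space so that image and text tokens share one input sequence. The sketch below illustrates that idea with random stand-in weights; the dimensions, the projection, and the function names are all illustrative assumptions, not the use case's actual architecture.

```python
# Minimal sketch of an LLM base + visual module: a projection maps image
# features into the text embedding space, and the fused token sequence is
# what the language model would consume. All weights here are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
d_model = 64  # assumed text-embedding width of the base LLM

def embed_text(token_ids):
    """Stand-in for the base LLM's token embedding lookup."""
    table = rng.standard_normal((1000, d_model))
    return table[token_ids]

def visual_module(image_features):
    """Project vision-encoder features into the LLM's embedding space."""
    d_image = image_features.shape[-1]
    projection = rng.standard_normal((d_image, d_model)) * 0.02
    return image_features @ projection

text_tokens = embed_text(np.array([5, 42, 7]))               # 3 text tokens
image_tokens = visual_module(rng.standard_normal((4, 512)))  # 4 image patches

# One fused sequence lets the model attend across both modalities at once.
fused = np.concatenate([image_tokens, text_tokens], axis=0)
print(fused.shape)
```

The design choice this illustrates is that only the projection (and optionally the base model) needs training, which is why adding a visual module to an existing language model is a comparatively lightweight path to multimodality.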
Data used in the use case comes from national and international public data, internal company
data, third-party data, user-authorized data, and generated data. Image captioning provides a
text description of an image; visual question answering combines images and questions to
predict answers; and audio-visual speech recognition combines sound and video information to
identify speech content. The models used in the use case are fine-tuned on these tasks.
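The three fine-tuning tasks just named differ mainly in the shape of their training records. The sketch below shows one plausible record schema per task; every field name and file name is a hypothetical placeholder, since the source does not specify the actual data format.

```python
# Illustrative fine-tuning record formats for the three tasks above.
# Field names and file names are assumptions, not the use case's schema.
caption_example = {
    "task": "image_captioning",
    "image": "frame_001.png",               # hypothetical input image
    "target_text": "A person signs a greeting.",
}

vqa_example = {
    "task": "visual_question_answering",
    "image": "frame_002.png",
    "question": "How many people are in the room?",
    "answer": "two",
}

avsr_example = {
    "task": "audio_visual_speech_recognition",
    "audio": "clip_003.wav",                # sound channel
    "video": "clip_003.mp4",                # lip/face channel
    "transcript": "please take a seat",
}

# A multi-task fine-tuning batch interleaves all three record types.
batch = [caption_example, vqa_example, avsr_example]
print(len(batch), sorted({ex["task"] for ex in batch}))
```

Mixing the task types in one batch is what lets a single multimodal model be fine-tuned on all three objectives rather than training three separate models.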