Page 227 - AI for Good-Innovate for Impact Final Report 2024


(continued)

Domain Accessibility
4. Sound Recognition: Audio event detection, audio labeling,
and offline audio recognition help users recognize important
sounds around them. When a corresponding sound
type is recognized, hearing-impaired users are promptly
alerted through vibrations and notifications.
5. Sign Language Translator: The first application in China
to implement sign language recognition technology. Sign
language recognition utilizes deep learning algorithms
to interpret the movements in sign language videos and
transform them into text-based information, aiding communication
between hearing-impaired people and those with normal
hearing. Sign language synthesis also creates continuous
sign language actions by an AI virtual figure based on text
content, which assists the hearing-impaired community in
gaining access to and understanding information.
Technology keywords Automatic Speech Recognition, Facial Recognition, Multi-target
Tracking/Recognition, Multi-modal Large Model, Music Note
Recognition Technology, Audio Event Detection, Sign Language
Recognition Technology, Sign Language Synthesis, Virtual
Avatar, Offline Audio Recognition
Data availability 1. National and international public data

• laion5B: https://laion.ai/blog/laion-5b/
• Taisu: https://github.com/ksOAn6g5/TaiSu
• wukong: https://wukong-dataset.github.io/wukong-dataset/index.html
2. Internal company data
3. Third-party procurement data
4. User-authorized data
5. Generated data

Metadata (type of data) structured and unstructured data
Model Training and fine-tuning
• Image caption provides a text description of an image
• Visual Question Answering combines images and questions
to predict answers
• Audio-visual speech recognition combines sound and video
information to identify speech content

Case Studies None

53.2 Use Case description

53.2.1 Description

Introduction: Guided by its sustainable development vision of "Technology for a Better Future",
vivo has launched more than 10 accessibility features and products to meet the
needs of relevant groups. Beginning with a humanistic mindset, vivo actively engages in public
welfare actions, supporting over 600 impoverished people with disabilities in enhancing their
digital literacy, information competency, and employment skills and helping them achieve
