Page 747 - AI for Good Innovate for Impact
(continued)
Item: Data Availability
Details:
1. National and international public data
   • laion5B: [4]
   • Taisu: [5]
   • wukong: [6]
2. Internal company data
3. Third-party procurement data
4. User-authorized data
5. Generated data

Item: Metadata (Type of Data)
Details: Audio, Video and Image

Item: Model Training and Fine-Tuning
Details: Image caption provides a text description of an image. Visual Question Answering combines images and questions to predict answers. Audio-visual speech recognition combines sound and video information to identify speech content. The project focuses on model fine-tuning based on different scenarios.

Item: Testbeds or Pilot Deployments
Details:
• Project Accessibility Features: [7]
• vivo origin OS: [8]

2 Use Case Description
2.1 Description
Context and Background
People with disabilities and multilingual cultural groups face significant challenges when using
smart devices due to:
• Limited Functionality: Existing assistive technologies (e.g., speech-to-text, image
recognition) rely heavily on cloud-based AI processing, which leads to high latency and
makes them unavailable offline.
• Privacy Risks: Transmitting data to the cloud increases the risk of personal information
leaks, especially for the sensitive data of users with disabilities.
• High Energy Consumption: Cloud computing’s resource-intensive nature conflicts with
sustainable development goals.
These technological barriers exacerbate difficulties in communication, information access, and
environmental adaptation, directly impacting quality of life.

