Page 226 - AI for Good-Innovate for Impact Final Report 2024

Use case – 53: Vivo’s Technology for All: Bridging the Accessibility Gap

Country: China

Organization: vivo

Contact person: Li Mengzhu, limengzhu.ai@vivo.com

53.1 Use Case Summary Table

Domain: Accessibility
The Problem to be addressed: Smartphone OS-based Information Accessibility
Solutions and Public Welfare for People with Disabilities
Key aspects of the solution: vivo AI Lab, founded in 2018, is committed to
building industry-leading AI technologies and providing users with the
best possible product experience. Areas of work include computer vision,
speech technology, natural language processing, and machine learning.
We have AI R&D bases in Shenzhen, Beijing, Hangzhou, and Nanjing, a team
of over 1,000 AI engineers, dozens of papers published at top AI academic
conferences (AAAI, ICLR, ECCV, CVPR, Interspeech, etc.), and hundreds of
granted AI patents.
1. vivo Sight: Offline technologies such as automatic speech recognition
(ASR), facial recognition, optical character recognition (OCR), and
multi-target tracking/recognition are integrated with the visual
multimodal capabilities of a large AI model for image processing,
helping users "see" the world in personalized scenarios through
multi-turn Q&A. Environmental description technology converts
recognized images into spoken descriptions and reads them aloud,
improving users' understanding of both on-screen and off-screen
visual information.
2. vivo Score Reading: Using capabilities such as note recognition
algorithms, users can customize how music scores are read aloud by
note, beat, and measure. This feature aids in reading and learning
piano scores.
3. vivo Voice/Accessibility Calls: ASR and speech-to-text/text-to-speech
technologies help hearing-impaired individuals communicate fluently
in face-to-face and telephone conversations. In addition,
multilingual recognition and translation technology significantly
reduces language barriers, easing communication between users who
speak different languages.
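
To illustrate what reading a score "by note, beat, and measure" (item 2) could look like in practice, here is a minimal sketch. It is not vivo's implementation. It assumes a note recognizer has already produced a flat list of notes with a pitch name and a duration in beats, and that the score is in 4/4 time. The `Note` type and the functions `group_by_measure` and `read_aloud_text` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    # Hypothetical output of a note recognition step.
    pitch: str    # e.g. "C4"
    beats: float  # duration in beats


def group_by_measure(notes: List[Note], beats_per_measure: float = 4.0) -> List[List[Note]]:
    """Split a flat note sequence into measures of a fixed length in beats."""
    measures: List[List[Note]] = []
    current: List[Note] = []
    filled = 0.0
    for note in notes:
        current.append(note)
        filled += note.beats
        if filled >= beats_per_measure:
            # The measure is full: store it and start a new one.
            measures.append(current)
            current, filled = [], 0.0
    if current:
        # Keep a trailing, incomplete measure.
        measures.append(current)
    return measures


def read_aloud_text(notes: List[Note], unit: str = "measure",
                    beats_per_measure: float = 4.0) -> List[str]:
    """Return the strings a text-to-speech engine could speak, one per unit."""
    if unit == "note":
        return [f"{n.pitch} for {n.beats:g} beat(s)" for n in notes]
    if unit == "measure":
        measures = group_by_measure(notes, beats_per_measure)
        return [
            f"Measure {i}: " + ", ".join(n.pitch for n in measure)
            for i, measure in enumerate(measures, start=1)
        ]
    raise ValueError(f"unsupported unit: {unit!r}")


if __name__ == "__main__":
    score = [Note("C4", 1), Note("D4", 1), Note("E4", 2), Note("F4", 4)]
    for line in read_aloud_text(score, unit="measure"):
        print(line)
```

The `unit` parameter stands in for the user-selected reading granularity. A real system would pass each returned string to a text-to-speech engine and would have to handle time signature changes, rests, and chords, which this sketch ignores.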
