Page 227 - AI for Good-Innovate for Impact Final Report 2024




               (continued)

                Domain                   Accessibility
                                         4.  Sound Recognition: Audio event detection, audio labeling,
                                            and offline audio recognition help users recognize important
                                            sounds around them. When a corresponding sound
                                            type is recognized, hearing-impaired users are promptly
                                            alerted through vibrations and notifications.
                                         5.  Sign Language Translator: The first application in China
                                            to implement sign language recognition technology. Sign
                                            language recognition uses deep learning algorithms
                                            to interpret the movements in sign language videos and
                                            transform them into text, aiding communication between
                                            hearing-impaired people and those with normal hearing.
                                            Sign language synthesis, in turn, generates continuous
                                            sign language actions performed by an AI virtual figure
                                            from text content, which helps the hearing-impaired
                                            community access and understand information.
                Technology keywords      Automatic Speech Recognition, Facial Recognition, Multi-target
                                         Tracking/Recognition, Multi-modal Large Model, Music Note
                                         Recognition Technology, Audio Event Detection, Sign Language
                                         Recognition Technology, Sign Language Synthesis, Virtual
                                         Avatar, Offline Audio Recognition
                Data availability        1.  National and international public data

                                            •  laion5B: https://laion.ai/blog/laion-5b/
                                            •  Taisu: https://github.com/ksOAn6g5/TaiSu
                                            •  wukong: https://wukong-dataset.github.io/wukong-dataset/index.html
                                         2.  Internal company data
                                         3.  Third-party procurement data
                                         4.  User-authorized data
                                         5.  Generated data

                Metadata (type of data)  structured and unstructured data
                Model Training and       •  Image captioning provides a text description of an image
                fine-tuning              •  Visual Question Answering combines images and questions
                                            to predict answers
                                         •  Audio-visual speech recognition combines sound and video
                                            information to identify speech content

                Case Studies             None
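
The sound recognition flow in item 4 above (a sound type is recognized, then the user is alerted through vibration and a notification) can be illustrated with a minimal sketch. It is not vivo's implementation. The label set, the confidence threshold, and the names `Detection`, `Alert`, and `alerts_for` are all hypothetical, and the audio event detector itself is assumed to exist upstream and is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical set of sound types the user has opted in to (not from the report).
IMPORTANT_SOUNDS = {"doorbell", "fire_alarm", "baby_crying"}


@dataclass
class Detection:
    label: str         # sound type predicted by an upstream audio event detector
    confidence: float  # detector score, assumed to lie in [0, 1]


@dataclass
class Alert:
    message: str   # text shown in the notification
    vibrate: bool  # whether the device should also vibrate


def alerts_for(detections: Iterable[Detection], threshold: float = 0.8) -> List[Alert]:
    """Return one alert for each confident detection of an important sound."""
    alerts = []
    for d in detections:
        # Ignore sounds the user did not opt in to, and low-confidence detections.
        if d.label in IMPORTANT_SOUNDS and d.confidence >= threshold:
            alerts.append(Alert(message=f"Detected sound: {d.label}", vibrate=True))
    return alerts


if __name__ == "__main__":
    sample = [
        Detection("doorbell", 0.93),
        Detection("speech", 0.99),      # confident, but not an important sound
        Detection("fire_alarm", 0.40),  # important, but below the threshold
    ]
    for alert in alerts_for(sample):
        print(alert)
```

In this sketch the threshold trades missed alerts against false alarms. A real system would likely tune it per sound type.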



               53.2 Use Case description


               53.2.1  Description

               Introduction: Guided by its sustainable development vision of "Technology for a Better Future",
               vivo has launched more than 10 accessibility features and products to meet the
               needs of relevant groups. Starting from a humanistic mindset, vivo actively engages in public
               welfare actions, supporting over 600 impoverished people with disabilities in enhancing their
               digital literacy, information competency, and employment skills and helping them achieve



