Page 137 - Kaleidoscope Academic Conference Proceedings 2024

ELDERLY WELLNESS COMPANION WITH VOICE AND VIDEO-BASED HEALTH ANOMALY DETECTION




Dhananjay, Kumar¹; Mehal Sakthi, Muthusamy Sivaraja¹; Sowbarnigaa, Kogilavani Shanmugavadivel¹; Ved P., Kafle²
¹ Department of Information Technology, Anna University, MIT Campus, Chennai, India
² National Institute of Information and Communications Technology, Tokyo, Japan



ABSTRACT

Elderly healthcare requires an innovative approach to address multifaceted challenges in tracking, monitoring, and reporting in real time. The proposed solution harnesses the capabilities of voice and video-based anomaly detection systems to offer continuous monitoring, personalized support, and timely intervention for the physical and emotional well-being of elderly individuals. Central to the proposed system is the integration of real-time voice emotion recognition and video-based posture recognition modules, constructed using cutting-edge deep learning and transfer learning models, respectively. These modules are deployed on the Raspberry Pi platform, ensuring accessibility and efficiency. Moreover, attention mechanisms are incorporated to boost accuracy and effectiveness in detecting health anomalies, with a particular focus on identifying falls. The proposed elderly companion system implemented on Raspberry Pi achieves a validation accuracy of 96.34% in the voice module and 87.91% in the video module in delivering comprehensive healthcare for the elderly. The proposed solution is a potential candidate for standardization work through the ITU/WHO Focus Group on AI for Health (FG-AI4H).

Keywords – Elderly healthcare, voice emotion recognition, video-based fall detection, transfer learning

1. INTRODUCTION

Reports from the WHO and the United Nations Department of Economic and Social Affairs reveal that the elderly population aged 65 years or older, which was 727 million in 2020, is projected to double by 2050 [1-2]. This demographic shift underscores the critical need for innovative health care tracking systems tailored to the unique requirements of elderly people. An early warning system can analyze changes in voice quality, such as breathiness or hoarseness, as an indication of respiratory or cardiovascular issues. Analyzing acoustic features of speech, such as intensity and pitch variability, can provide insights into respiratory function and cardiovascular health. Furthermore, the healthcare system can sense shifts in speech patterns indicative of a fall or sudden deterioration in health and trigger alerts to caregivers or healthcare providers. Speech analysis can contribute to personalized healthcare by providing objective and quantifiable measures of health status. The system needs to accurately recognize emotions from voice samples across a diverse population, accounting for variations in tone, pitch, modulation, language, slang, and other factors. By monitoring individualized speech profiles over time, healthcare interventions can be tailored to meet the specific needs of each individual. A multi-modal approach, such as combining speech and visual data, can provide a more comprehensive understanding of an individual's health status.

Traditional approaches based on sensors [3-4], although they monitor more health parameters, lack applicability in elderly health care due to their inherent limitations. Existing state-of-the-art healthcare systems like CarePredict [5] are based on wearable sensors, presenting challenges for seniors who are not technologically savvy or are not comfortable wearing devices continuously. Additionally, the accuracy of data collected through wearables can be compromised by device malfunction or improper usage, leading to potential false alarms or missed health concerns. This necessitates the development of a touchless health monitoring system that integrates seamlessly with existing healthcare networks. Video-based solutions better meet contactless health care requirements; however, they infringe on the privacy of end users. The privacy concern can be minimized by capturing audio alone through multiple-microphone systems and capturing video only in highly abnormal cases.

The proposed elderly wellness companion offers a non-intrusive yet comprehensive monitoring solution while preserving privacy. By integrating a voice module alongside the video component, our system ensures privacy by activating the video module only upon detecting an anomaly, thus minimizing unnecessary surveillance. Moreover, our approach incorporates multi-modal confirmation of anomalies, enhancing the system's reliability and reducing false alarms. This strategic combination of technologies not only addresses privacy concerns but also enhances the effectiveness of anomaly detection.
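
The voice-first gating described in the introduction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Decision`, `monitor_step`, and `check_video` are hypothetical, the boolean `voice_anomaly` stands in for the output of the voice emotion recognition module, and raising an alert only when both modalities agree is one possible reading of "multi-modal confirmation".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    """Outcome of one monitoring cycle (hypothetical structure)."""
    video_activated: bool  # was the camera consulted in this cycle?
    alert_raised: bool     # did both modalities agree on an anomaly?


def monitor_step(voice_anomaly: bool, check_video: Callable[[], bool]) -> Decision:
    """Run one cycle of voice-first monitoring.

    voice_anomaly: assumed output of the always-on voice module.
    check_video:   assumed callback that switches on the camera and returns
                   True if the video module also sees an anomaly (e.g. a fall).
    """
    if not voice_anomaly:
        # Normal case: the camera is never touched, which preserves privacy.
        return Decision(video_activated=False, alert_raised=False)

    # The voice module flagged something: activate video to confirm.
    video_anomaly = check_video()
    return Decision(video_activated=True, alert_raised=video_anomaly)
```

Under these assumptions, `monitor_step(False, camera_check)` returns without ever calling `camera_check`, while `monitor_step(True, camera_check)` raises an alert only if the camera check also reports an anomaly, which is how the second modality suppresses false alarms from the first.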




            978-92-61-39091-4/CFP2268P @ITU 2024           – 93 –                                   Kaleidoscope