Page 137 - Kaleidoscope Academic Conference Proceedings 2024
ELDERLY WELLNESS COMPANION WITH VOICE AND VIDEO-BASED HEALTH
ANOMALY DETECTION
Dhananjay, Kumar ; Mehal Sakthi, Muthusamy Sivaraja ; Sowbarnigaa, Kogilavani Shanmugavadivel ; Ved P., Kafle
1 Department of Information Technology, Anna University, MIT Campus, Chennai, India
2 National Institute of Information and Communications Technology, Tokyo, Japan
ABSTRACT

Elderly healthcare requires an innovative approach to address multifaceted challenges in tracking, monitoring, and reporting in real time. The proposed solution harnesses the capabilities of voice and video-based anomaly detection systems to offer continuous monitoring, personalized support, and timely intervention for the physical and emotional well-being of elderly individuals. Central to the proposed system is the integration of real-time voice emotion recognition and video-based posture recognition modules, constructed using deep learning and transfer learning models respectively. These modules are deployed on the Raspberry Pi platform, ensuring accessibility and efficiency. Moreover, attention mechanisms are incorporated to boost accuracy and effectiveness in detecting health anomalies, with a particular focus on identifying falls. The proposed elderly companion system implemented on Raspberry Pi achieves a validation accuracy of 96.34% for the voice module and 87.91% for the video module. The proposed solution demonstrates potential for standardization work through the ITU/WHO Focus Group on AI for Health (FG-AI4H).

Keywords – Elderly healthcare, voice emotion recognition, video-based fall detection, transfer learning

1. INTRODUCTION

Reports from WHO and the United Nations Department of Economic and Social Affairs reveal that the elderly population aged 65 years or older, which was 727 million in 2020, is projected to double by 2050 [1-2]. This demographic shift underscores the critical need for innovative healthcare tracking systems tailored to the unique requirements of elderly people. An early warning system can analyze changes in voice quality, such as breathiness or hoarseness, as an indication of respiratory or cardiovascular issues. Analyzing acoustic features of speech, such as intensity and pitch variability, can provide insights into respiratory function and cardiovascular health. Furthermore, the healthcare system can sense shifts in speech patterns indicative of a fall or sudden deterioration in health and trigger alerts to caregivers or healthcare providers. Speech analysis can contribute to personalized healthcare by providing objective and quantifiable measures of health status. The system needs to accurately recognize emotions from voice samples across a diverse population, accounting for variations in tone, pitch, modulation, language, slang, and other factors. By monitoring individualized speech profiles over time, healthcare interventions can be tailored to meet the specific needs of each individual. A multi-modal approach, such as combining speech and visual data, can provide a more comprehensive understanding of an individual's health status.

Traditional approaches based on sensors [3-4], although they monitor more health parameters, have limited applicability in elderly healthcare due to their inherent limitations. Existing state-of-the-art healthcare systems like CarePredict [5] are based on wearable sensors, presenting challenges for seniors who are not technologically savvy or comfortable with wearing devices continuously. Additionally, the accuracy of data collected through wearables can be compromised by device malfunction or improper usage, leading to potential false alarms or missed health concerns. This necessitates the development of a touchless health monitoring system that integrates seamlessly with existing healthcare networks. Video-based solutions better meet the requirements of contactless healthcare; however, they infringe on the privacy of end users. Privacy concerns can be minimized by capturing audio alone through multiple-microphone systems and capturing video only in highly abnormal cases.

The proposed elderly wellness companion offers a non-intrusive yet comprehensive monitoring solution while preserving privacy. By integrating a voice module alongside the video component, our system protects privacy by activating the video module only upon detecting an anomaly, thus minimizing unnecessary surveillance. Moreover, our approach incorporates multi-modal confirmation of anomalies, enhancing the system's reliability and reducing false alarms. This strategic combination of technologies not only addresses privacy concerns but also enhances the effectiveness of anomaly detection.
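
The voice-first gating described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function name `monitor_step`, the `Decision` record, the score ranges, and both threshold values are hypothetical and chosen only for illustration. The sketch assumes that the voice module yields an anomaly score in [0, 1] and that the video module is exposed as a callable that is invoked only when needed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    """Outcome of one monitoring cycle (hypothetical structure)."""
    video_activated: bool  # True only if the camera path was actually run
    alert: bool            # True only if both modalities agree on an anomaly


def monitor_step(
    voice_score: float,
    run_video_check: Callable[[], float],
    voice_threshold: float = 0.8,   # assumed value, not taken from the paper
    video_threshold: float = 0.5,   # assumed value, not taken from the paper
) -> Decision:
    """Run one cycle of voice-gated, video-confirmed anomaly detection.

    The video check is never called unless the voice score crosses its
    threshold, so no video is captured during normal operation.
    """
    if voice_score < voice_threshold:
        # No voice anomaly: the camera stays off and no alert is raised.
        return Decision(video_activated=False, alert=False)

    # Voice anomaly detected: activate video to confirm or reject it.
    video_score = run_video_check()
    return Decision(video_activated=True, alert=video_score >= video_threshold)


if __name__ == "__main__":
    # Stand-in for a posture classifier; a real system would capture frames here.
    print(monitor_step(0.2, lambda: 0.9))
    print(monitor_step(0.95, lambda: 0.9))
    print(monitor_step(0.95, lambda: 0.1))
```

Passing the video check as a callable keeps the privacy property explicit in the control flow: the camera path runs only inside the branch reached after a voice anomaly, and an alert requires agreement from both modalities.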
978-92-61-39091-4/CFP2268P @ITU 2024 – 93 – Kaleidoscope