Technology Keywords: Artificial Intelligence (AI), Computer Vision, YOLOv5, Dlib, Facial Landmark Detection, Histogram of Oriented Gradients (HOG), Eye Aspect Ratio (EAR), Haar Cascades

Data Availability: Some private and some public [2]

Metadata (Type of Data): Images

Model Training and Fine-Tuning: The system uses a hybrid approach that combines YOLOv5 and Dlib to detect and track facial features for hands-free interface control. YOLOv5 is trained to detect the face and eyes in real time from video frames at 30 FPS. Once the face region is detected, Dlib extracts 68 facial landmarks, enabling accurate tracking of eye movements, blinks, and head orientation. Blink detection is performed by calculating the Eye Aspect Ratio (EAR) from these landmarks. Finally, Haar cascade classifiers are used to map the detected face and eye gestures to mouse actions (e.g., cursor movement and clicks), enabling seamless operation of digital systems without hand input. (A minimal code sketch of this pipeline follows the table.)

Testbeds or Pilot Deployments: YOLOv5 for facial and eye detection, Dlib for facial landmark extraction, and the Eye Aspect Ratio (EAR) for blink detection. [3]
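As a concrete illustration of the hybrid pipeline described above, the following is a minimal sketch, assuming a YOLOv5 model fine-tuned for a face class (the weights file face_eyes.pt and class index 0 are hypothetical), Dlib's standard 68-point landmark model, and an illustrative EAR threshold of 0.21; none of these values are taken from the project itself. The EAR used here follows the standard six-landmark definition per eye, EAR = (‖p2−p6‖ + ‖p3−p5‖) / (2‖p1−p4‖).

```python
# Minimal sketch of the YOLOv5 + Dlib + EAR pipeline. Model paths, the
# face class index, and the EAR threshold are assumptions, not project values.
import cv2
import dlib
import torch
from scipy.spatial import distance as dist

# YOLOv5 loaded via torch.hub; assumed fine-tuned so that class 0 = face.
yolo = torch.hub.load("ultralytics/yolov5", "custom", path="face_eyes.pt")
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

RIGHT_EYE, LEFT_EYE = range(36, 42), range(42, 48)  # 68-point eye indices
EAR_THRESHOLD = 0.21  # assumed blink threshold; tuned per user in practice

def eye_aspect_ratio(p):
    # EAR = (|p2-p6| + |p3-p5|) / (2|p1-p4|); drops toward 0 as the eye closes
    return (dist.euclidean(p[1], p[5]) + dist.euclidean(p[2], p[4])) / \
           (2.0 * dist.euclidean(p[0], p[3]))

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Stage 1: YOLOv5 localises the face region in the current frame.
    for x1, y1, x2, y2, conf, cls in yolo(rgb).xyxy[0].tolist():
        if int(cls) != 0:  # assumption: class 0 is "face" in the custom model
            continue
        # Stage 2: Dlib fits 68 landmarks inside the detected face box.
        box = dlib.rectangle(int(x1), int(y1), int(x2), int(y2))
        shape = predictor(gray, box)
        pts = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
        ear = (eye_aspect_ratio([pts[i] for i in LEFT_EYE]) +
               eye_aspect_ratio([pts[i] for i in RIGHT_EYE])) / 2.0
        # Stage 3: a low EAR signals a blink, mapped to a click downstream.
        if ear < EAR_THRESHOLD:
            print("blink detected")
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
```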
2 Use Case Description
2.1 Description
This use case proposes a solution that enables people with disabilities to use a computer system with ease, regardless of their difficulties with traditional input devices. We use the built-in sensors, such as the camera, microphone, and touch screen, together with the keyboard and mouse, as inputs to the system to capture user interactions and provide the desired output. To accomplish this, our system is trained on data that continually updates its model for more accurate results. Advances in AI have enabled intelligent agent systems to perform many tasks that previously only humans could. A previous research study concluded that the eyes are an excellent candidate for ubiquitous computing, since they respond well during interaction with computer systems. Exploiting this underlying information from eye movements could restore the use of computers to such users. For this purpose, we propose an eye gesture control system that is operated by the human eyes alone. The purpose of this work is to design an open-source, generic eye gesture control system that can effectively track eye movements and enable the user to perform actions mapped to specific eye movements or gestures, using sensors such as a camera and a microphone as input. The system detects the pupil in the user's face, tracks its movements, and produces the desired output, such as moving the cursor to the desired location, scrolling, zooming in, zooming out, and clicking (a minimal sketch of this cursor mapping follows below). Most importantly, it must be accurate in real time so that the user can rely on it in everyday life. This will be achieved by having the AI model update continuously with data sent from edge devices to the cloud, allowing continuous improvement of the intended human-computer interaction.
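To illustrate the gesture-to-action mapping described above, here is a minimal sketch, assuming the tracked pupil/eye centre is available in camera coordinates; pyautogui, the smoothing factor, and the function names are illustrative assumptions, not confirmed parts of the project.

```python
# Minimal sketch: mapping a tracked eye/pupil centre (camera coordinates)
# to on-screen cursor actions with pyautogui. Gain, smoothing factor, and
# function names are illustrative assumptions.
import pyautogui

SCREEN_W, SCREEN_H = pyautogui.size()
ALPHA = 0.3       # exponential-smoothing factor to damp landmark jitter
_smoothed = None  # last smoothed cursor position

def move_cursor(eye_x, eye_y, frame_w, frame_h):
    """Map an eye centre in a frame of size (frame_w, frame_h) to the screen."""
    global _smoothed
    # Normalise camera coordinates to [0, 1], then scale to screen size.
    target = (eye_x / frame_w * SCREEN_W, eye_y / frame_h * SCREEN_H)
    if _smoothed is None:
        _smoothed = target
    else:
        _smoothed = (ALPHA * target[0] + (1 - ALPHA) * _smoothed[0],
                     ALPHA * target[1] + (1 - ALPHA) * _smoothed[1])
    pyautogui.moveTo(_smoothed[0], _smoothed[1])

def on_blink():
    pyautogui.click()  # e.g. triggered by the EAR blink detector above

def on_scroll_gesture(up):
    pyautogui.scroll(120 if up else -120)  # scroll mapped to an eye/head gesture
```

The exponential smoothing is one simple way to keep raw landmark jitter from making the cursor unusable; the real system may well use a different filter.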
Use case status: The use case is part of an ongoing project.