Modified CNN Model for Hand Gesture Recognition Using Sign
Language
Rajesh Kumar Singh and Abhishek Kumar Mishra
rks2019ay@gmail.com and abhimishra2@gmail.com
Department of Computer Science & Engineering, IFTM University, Moradabad, India
Abstract: This article presents improved accuracy for hand gesture recognition using data augmentation. The proposed model is based on a CNN combined with data augmentation to recognize static hand gestures. The model was trained on 27,455 images and tested on 7,172 images. With augmented data, the model reached an accuracy of 99.76%, substantially higher than the accuracy of the CNN model without augmentation (86.87%).
Keywords: Neural Network, Static Hand Gesture Recognition, Data Augmentation, Sign Language
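By way of illustration only, the following is a minimal sketch of a CNN trained with on-the-fly data augmentation in the manner the abstract describes, assuming 28x28 grayscale gesture images and 24 static letter classes (A-Y, excluding the dynamic J and Z). The architecture, augmentation ranges, and helper names (build_model, train) are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of CNN training with data augmentation (assumed setup:
# 28x28 grayscale gesture images, 24 static ASL letter classes).
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 24  # ASL letters excluding the dynamic gestures J and Z

def build_model():
    # Small CNN; the paper's exact architecture may differ.
    return keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

# On-the-fly augmentation: small rotations, shifts, and zooms keep each
# gesture recognizable while multiplying the effective training set.
augmenter = keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
)

def train(x_train, y_train, x_test, y_test):
    # x_*: uint8 arrays of shape (N, 28, 28, 1); y_*: integer labels.
    model = build_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(augmenter.flow(x_train, y_train, batch_size=128),
              epochs=20,
              validation_data=(x_test / 255.0, y_test))
    return model
```

Because the augmentation is applied per batch, each epoch sees slightly different variants of every training image, which is what allows the augmented model to generalize better than the plain CNN.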
INTRODUCTION
People who are deaf or mute struggle to communicate because hearing people rarely learn Sign Language (SL) [1]. Individuals typically do not learn it unless someone in their social circle cannot speak or their profession requires it. Communication with a non-speaking person can therefore be challenging and time-consuming. The purpose of this study is to examine the recognition capacity of a Convolutional Neural Network (CNN) [2] and its ability to convert ASL images of hand gestures into text.
The main focus of the study is on the letters and numerical symbols of American Sign Language (ASL). The gestures J and Z are excluded because they require movement to be executed. In general, people use hand gestures for communication more frequently than other body parts. Nonverbal communication takes place while two people are conversing, expressing the meaning of the speech through hand and body motions. Several advanced sensor techniques are available to capture hand gestures. Bobick and Wilson [3] claimed that a gesture is a movement of the body designed to communicate with other agents. Most researchers suggest gesture recognition methods for creating user-friendly interfaces. People who are deaf or mute can also communicate via sign language, which uses well-known gestures or body language to convey meaning rather than sound [4]. Signing enables hearing-impaired people to communicate with one another by linking spoken-language letters, words, and phrases to hand gestures and body language. Hand gesture recognition has recently been used to replace common human-computer interaction devices such as joysticks, keyboards, and mice [5]. The signs of the alphabet are shown in Fig. 1.

Fig. 1: Sign of alphabets

Various deep-learning techniques are available for sign language recognition using the CNN approach. Most of these techniques show good performance and recognition capability, but an improved model is needed for higher recognition accuracy and lower time complexity. Using a boundary histogram, [6] demonstrated rotation-invariant posture recognition. The input image was captured with a camera, and then a skin-color detection filter, clustering, and a standard contour-tracking technique were used to determine the boundary of each group in the clustered image. The boundaries were adjusted and the image was divided into grids. The boundary was represented as a chord-size chain that was used as a histogram by dividing the image into N radially spaced regions, each with a different angle. Multi-Layer Perceptron (MLP) neural networks and Dynamic Programming (DP) matching were employed in the classification procedure. Twenty-six static postures from American Sign Language were used in the trials, which were executed on several feature formats and varied chord sizes for the histogram and FFT. The results showed recognition rates of 94% for DP matching and 98.8% for the MLP. The TDSEP method (temporal decomposition source separation, a blind source separation (BSS) technique combined with a neural network) was presented in [7] and successfully employed to classify small muscle activity for distinguishing subtle hand actions. Similar approaches were suggested in [8]-[10].
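To make the boundary-histogram representation attributed to [6] above more concrete, the following is a simplified sketch: the hand contour is extracted from a binary mask (such as the output of a skin-color filter), the angle of each boundary point relative to the contour centroid is computed, and the points are binned into N radially spaced sectors. The chord-size chain and grid details of the original method are omitted; the function name and OpenCV-based extraction are assumptions for illustration, not the cited implementation.

```python
# Rough sketch of a boundary histogram in the spirit of [6]: bin
# hand-contour points into N angular sectors around the centroid.
import cv2
import numpy as np

def boundary_histogram(mask: np.ndarray, n_bins: int = 16) -> np.ndarray:
    # mask: binary image (uint8, 0/255), e.g. from a skin-color filter.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return np.zeros(n_bins)
    # Keep the largest contour, assumed to be the hand.
    pts = max(contours, key=cv2.contourArea).reshape(-1, 2).astype(float)

    # Angle of each boundary point relative to the contour centroid.
    cx, cy = pts.mean(axis=0)
    angles = np.arctan2(pts[:, 1] - cy, pts[:, 0] - cx)  # in [-pi, pi]

    # Histogram over N radially spaced sectors, normalized so the
    # representation does not depend on contour length.
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    return hist / hist.sum()
```

Under an in-plane rotation of the hand, such a histogram shifts cyclically rather than changing arbitrarily, which suggests why [6] pairs boundary features with shift-tolerant classifiers such as DP matching and an MLP.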