Please use this identifier to cite or link to this item: http://lrcdrs.bennett.edu.in:80/handle/123456789/2018
Title: Designing an efficient algorithm for recognition of human emotions through speech
Authors: Singh, Youddha Beer
Keywords: Computer Science
Computer Science Software Engineering
Issue Date: Sep-2022
Publisher: Bennett University
Abstract: Human emotion recognition from speech is now used in many real-life applications, such as healthcare, behaviour assessment, human-robot interaction, and robot-robot interaction. Speech emotion recognition (SER) remains a challenging task because an SER system depends on three things: a suitable emotional speech database, a relevant feature vector, and a suitable classifier. The first challenge is the availability of a high-quality speech database, which is critical to the performance of machine learning algorithms that recognize emotions in a particular language; there is a lack of emotional speech databases in the Indian accent. The second challenge is to identify a relevant feature vector that classifies emotions correctly at a low computational cost. The third challenge is to identify, modify, or create a classifier for identifying emotions. In recent years, researchers have shown growing interest in deep learning approaches for SER, which have improved the recognition rate. Work in this field has focused on either hand-crafted classifiers or deep learning approaches to increase the recognition rate. The major challenge for hand-crafted classifiers is identifying a suitable feature vector. To overcome these challenges, this thesis makes three contributions: i) it reduces the computation cost of the SER model, ii) it improves the average accuracy of the SER model over the state of the art, and iii) it creates an Indian emotional speech database. The research critically analyses the SER literature in terms of speech databases, speech features, and deep learning approaches to survey current work, and it focuses on designing an efficient algorithm for recognition of human emotions through speech that overcomes the challenges and limitations of SER systems.
To that end, a novel emotional speech database, IESC (Indian Emotional Speech Corpora), is created, and efficient CNN-based architectures are proposed for SER. The IESC database was created from eight north Indian speakers (5 males and 3 females) speaking English and contains 600 samples. It was validated by more than twenty people through a subjective test. The proposed models were evaluated on the IESC dataset and on publicly available benchmark datasets, namely the Italian emotional speech database (EMOVO), the Berlin database of emotional speech (EMODB), the Surrey audio-visual expressed emotion database (SAVEE), and the Ryerson audio-visual database of emotion (RAVDESS). The experimental results show that the proposed model outperforms state-of-the-art SER models. The recent CNN-assisted SER model was also implemented on the IESC database. The average accuracy of the proposed model on IESC is 95%, which is better than that of the CNN-assisted model (88%). The average accuracy of the proposed model is also better than that of state-of-the-art SER approaches and the baseline CNN-based architectures ResNet-18 and ResNet-34.
URI: https://shodhganga.inflibnet.ac.in/handle/10603/446803
Appears in Collections: School of Computer Science Engineering and Technology (SCSET)

Files in This Item:
File: Thesis Youddha Beer Singh E16SOE806 (1).pdf
Size: 3.17 MB
Format: Adobe PDF

Contact admin for Full-Text
