Speech Emotion Recognition System
Abstract
Emotion is a reaction that arises in response to a person's actions or to certain events. Understanding a person's emotional state is important because emotions play a central role in daily life. Emotion detection can be done in two ways: through the face and through the voice. In this study, the researchers used the voice as the medium for detecting emotion. The System Development Life Cycle (SDLC) was used as the methodology; each of its phases is critical to meeting client requirements and achieving the project objectives. The SDLC is a well-defined method whose standard phases regulate the development of the application. The system was designed and implemented in the Python programming language using Visual Studio Code. It has two main features: real-time recording of the user's voice, and upload of the user's recorded files in WAV format. With this system, seven main emotions can be detected from the user's voice: Angry, Disgust, Fear, Happy, Neutral, Sad, and Surprised. The real-time feature also shows the percentage of each emotion, updated every 5 seconds.
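The upload feature and the 5-second percentage output described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the model or the features used, so `placeholder_scores` is a hypothetical stand-in for a trained classifier, and the softmax step is only one assumed way of turning raw scores into percentages. The function names `window_bounds`, `to_percentages`, and `analyze_wav` are likewise illustrative.

```python
import math
import wave

# The seven emotion labels listed in the abstract.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprised"]
WINDOW_SECONDS = 5  # update interval stated in the abstract


def window_bounds(n_frames, sample_rate, seconds=WINDOW_SECONDS):
    """Return (start, end) frame indices for consecutive fixed-length windows."""
    step = sample_rate * seconds
    return [(start, min(start + step, n_frames))
            for start in range(0, n_frames, step)]


def to_percentages(scores):
    """Convert raw per-emotion scores to percentages summing to 100 (softmax)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {name: 100.0 * e / total for name, e in zip(EMOTIONS, exps)}


def placeholder_scores(frames):
    """Hypothetical stand-in for a trained model: gives every emotion the same score."""
    return [0.0] * len(EMOTIONS)


def analyze_wav(path, score_fn=placeholder_scores):
    """Yield (start_second, percentages) for each 5-second window of a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        for start, end in window_bounds(wav.getnframes(), rate):
            wav.setpos(start)
            frames = wav.readframes(end - start)  # raw PCM bytes for this window
            yield start / rate, to_percentages(score_fn(frames))
```

A real system would replace `placeholder_scores` with a function that extracts acoustic features from the window and runs a trained model; the windowing and percentage steps would stay the same. For live recording, the same `to_percentages` step could be applied to each 5-second buffer as it is captured.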