Massachusetts Undergraduate Research Conference (MassURC)

Poster Session 3, 1:15 PM - 2:00 PM: Room 163 [C5]

Emotion Detection from Speech Audio Using Deep Learning Architectures

Presenter: Tanmay Sonawane

Faculty Sponsor: Alfa Heryudono

School: UMass Dartmouth

Research Area: Computer Science

ABSTRACT

Human speech carries rich paralinguistic information, particularly emotion, which provides valuable insight into psychological state, intent, and behavioral response. This project investigates how modern deep learning architectures can detect emotion from speech audio using time-frequency representations stored as numerical arrays. Using the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), the methodology systematically preprocesses raw audio files into normalized Mel-based NumPy (.npy) representations, then applies multimodal learning that jointly processes raw waveforms and spectrogram arrays. A pretrained ResNet-18 architecture, implemented in the PyTorch framework, serves as the primary convolutional backbone. Model performance is evaluated using accuracy, F1-score, precision, recall, and confusion matrices; the baseline reaches approximately 70% accuracy on a five-class emotion mapping.
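The evaluation metrics named above can be illustrated with a minimal, standard-library-only sketch. It is not the project's evaluation code: the function names and the toy labels are assumptions made for illustration, and the macro average shown is only one common way to summarize per-class F1.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def per_class_scores(matrix):
    """Return a (precision, recall, f1) tuple for each class."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        predicted = sum(matrix[r][k] for r in range(n))  # column sum
        actual = sum(matrix[k])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


def macro_f1(matrix):
    """Unweighted mean of per-class F1 scores."""
    scores = per_class_scores(matrix)
    return sum(f1 for _, _, f1 in scores) / len(scores)


# Toy labels, invented for illustration only (not results from the project).
labels = ["angry", "happy", "sad"]
y_true = ["angry", "angry", "happy", "sad"]
y_pred = ["angry", "happy", "happy", "sad"]
cm = confusion_matrix(y_true, y_pred, labels)
print(cm, accuracy(cm), macro_f1(cm))
```

Reporting per-class precision and recall alongside accuracy matters when emotion classes are imbalanced, because a single accuracy figure can hide weak performance on a rare class.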

To further assess robustness and generalization, this work will be extended to the Toronto Emotional Speech Set (TESS), enabling cross-dataset evaluation and combined-training strategies. In addition to the PyTorch-based models, equivalent architectures will be implemented and tested in TensorFlow to provide a comparative analysis of deep learning frameworks for speech emotion recognition. Differences in training dynamics, performance, and deployment considerations across frameworks will be systematically examined. The intended application of this research is emergency service call analysis, where real-time emotion detection can assist dispatchers by identifying heightened stress, fear, or distress in callers. By benchmarking models across datasets and frameworks, this project aims to support the development of reliable, emotion-aware systems for safety-critical, interactive, and assistive technologies.
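Cross-dataset evaluation requires both corpora to share one label set. The sketch below shows one way that harmonization might look. The RAVDESS emotion codes follow the dataset's published filename convention (emotion is the third hyphen-separated field). The TESS suffix tokens and, in particular, the five-class mapping are assumptions: the abstract does not state which five classes were used, so FIVE_CLASS is hypothetical.

```python
from pathlib import Path

# RAVDESS: the third hyphen-separated filename field is the emotion code.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

# TESS: assumed naming of the form <speaker>_<word>_<emotion>.wav,
# with "ps" standing for pleasant surprise.
TESS_EMOTIONS = {
    "angry": "angry", "disgust": "disgust", "fear": "fearful",
    "happy": "happy", "neutral": "neutral", "ps": "surprised", "sad": "sad",
}

# Hypothetical five-class target; emotions absent from this map are dropped.
FIVE_CLASS = {
    "neutral": "neutral", "calm": "neutral", "happy": "happy",
    "sad": "sad", "angry": "angry", "fearful": "fearful",
}


def ravdess_emotion(filename):
    """Parse the emotion name from a RAVDESS filename."""
    fields = Path(filename).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS filename: {filename}")
    return RAVDESS_EMOTIONS[fields[2]]


def tess_emotion(filename):
    """Parse the emotion name from a TESS filename (assumed layout)."""
    token = Path(filename).stem.rsplit("_", 1)[-1].lower()
    return TESS_EMOTIONS[token]


def to_five_class(emotion):
    """Map a dataset emotion to the shared label set, or None if excluded."""
    return FIVE_CLASS.get(emotion)


print(to_five_class(ravdess_emotion("03-01-06-01-02-01-12.wav")))
print(to_five_class(tess_emotion("OAF_back_angry.wav")))
```

Keeping the mapping in a single table makes it easy to train on one dataset and test on the other with identical class definitions, and to change the class grouping without touching the parsing code.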

