Emotional Speech Recognition

This is a 2-layer bidirectional LSTM model that classifies an audio file into one of 7 emotions:

Happy
Sad
Pleasant Surprise
Anger
Neutral
Fear
Disgust
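
A model of this shape could be sketched as below. This is only an illustration, not the code in this repository: the choice of PyTorch, the hidden size of 128, the `EmotionBiLSTM` name, and classifying from the final hidden states are all assumptions. Only the 2 layers, the bidirectionality, the 7 output classes, and the 39 input features per frame (see Data) come from this README.

```python
import torch
from torch import nn

NUM_FEATURES = 39  # MFCC + delta + delta-delta per frame (see Data)
NUM_CLASSES = 7    # the seven emotions listed above


class EmotionBiLSTM(nn.Module):
    """Hypothetical 2-layer bidirectional LSTM classifier."""

    def __init__(self, hidden_size=128):  # hidden size is an assumption
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=NUM_FEATURES,
            hidden_size=hidden_size,
            num_layers=2,
            bidirectional=True,
            batch_first=True,
        )
        # Forward and backward states are concatenated, hence 2 * hidden_size.
        self.classifier = nn.Linear(2 * hidden_size, NUM_CLASSES)

    def forward(self, x):
        # x: (batch, frames, NUM_FEATURES)
        _, (h_n, _) = self.lstm(x)
        # h_n: (num_layers * 2, batch, hidden). The last two entries are the
        # top layer's final forward and backward hidden states.
        top = torch.cat([h_n[-2], h_n[-1]], dim=1)
        return self.classifier(top)  # unnormalized scores, one per emotion
```

Calling `EmotionBiLSTM()(torch.zeros(4, 50, NUM_FEATURES))` would return a tensor of shape `(4, 7)`: one score per emotion for each of the 4 clips in the batch.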

Data

The training data is from the Toronto Emotional Speech Set (TESS). Two actresses, aged 26 and 64, read a variety of sentences structured in the format

'Say the word _____', using a different word each time.

There are 2800 audio files in total, distributed equally across the 7 classes (400 files per class).
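
The class balance can be checked with a short script like the one below. It is a sketch that assumes TESS-style filenames of the form `<speaker>_<word>_<emotion>.wav` (for example `OAF_back_angry.wav`, with `ps` standing for pleasant surprise). That naming scheme, the suffix mapping, and the helper names are assumptions, not something this README specifies.

```python
from collections import Counter
from pathlib import PurePath

# Assumed mapping from filename suffix to the class names used in this README.
SUFFIX_TO_EMOTION = {
    "happy": "Happy",
    "sad": "Sad",
    "ps": "Pleasant Surprise",
    "angry": "Anger",
    "neutral": "Neutral",
    "fear": "Fear",
    "disgust": "Disgust",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in the last underscore-separated field."""
    stem = PurePath(path).stem  # e.g. "OAF_back_angry"
    suffix = stem.rsplit("_", 1)[-1].lower()
    return SUFFIX_TO_EMOTION[suffix]


def class_counts(paths):
    """Count how many files fall into each emotion class."""
    return Counter(emotion_from_filename(p) for p in paths)


def is_balanced(counts, expected_classes=7):
    """True if every expected class is present with the same number of files."""
    return len(counts) == expected_classes and len(set(counts.values())) == 1
```

With 2800 files and 7 classes, a balanced dataset would give 400 files per class.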

Each audio file was preprocessed with python_speech_features into mel-frequency cepstral coefficients (MFCCs), delta (first difference), and delta-delta (second difference, i.e. the difference of the deltas) features, for a total of 39 features per 25 ms segment of audio.
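
To illustrate how the 39 features per segment come together, the sketch below stacks an MFCC matrix with its first and second differences using NumPy. It assumes 13 coefficients per frame (39 / 3) and uses a plain frame-to-frame difference as a simplified stand-in; it is not the exact computation performed by python_speech_features, and the function names are hypothetical.

```python
import numpy as np


def first_difference(frames):
    """Frame-to-frame difference along time; the first row is all zeros."""
    # frames: (num_frames, num_coeffs)
    return np.diff(frames, axis=0, prepend=frames[:1])


def stack_features(mfcc_frames):
    """Concatenate MFCC, delta, and delta-delta along the feature axis."""
    delta = first_difference(mfcc_frames)
    delta_delta = first_difference(delta)
    return np.hstack([mfcc_frames, delta, delta_delta])


# Random numbers stand in for real MFCCs: 100 frames x 13 coefficients (assumed).
rng = np.random.default_rng(0)
mfcc_frames = rng.normal(size=(100, 13))
features = stack_features(mfcc_frames)
print(features.shape)  # (100, 39)
```

In practice the MFCC matrix would come from the `mfcc` function in python_speech_features, which also provides a `delta` function for the derivative features.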

Test Results

Accuracy (one vs. all): 84%
F1 Score: 87%

Training Results

Accuracy (one vs. all): 93%
F1 Score: 93%
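
For reference, one-vs-all accuracy and F1 can be computed from predicted and true labels as sketched below. This README does not say how the per-class values were averaged, so the macro average and the function names here are assumptions rather than the evaluation code used for the numbers above.

```python
def one_vs_all_metrics(y_true, y_pred, labels):
    """Treat each label in turn as the positive class and score it against the rest."""
    n = len(y_true)
    per_class = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        tn = n - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = {"accuracy": (tp + tn) / n, "f1": f1}
    return per_class


def macro_average(per_class, key):
    """Unweighted mean of one metric over all classes (assumed averaging scheme)."""
    return sum(m[key] for m in per_class.values()) / len(per_class)
```

For example, `macro_average(one_vs_all_metrics(y_true, y_pred, labels), "f1")` gives a single F1 figure across all seven emotions.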

kailin-lu/emotional-speech-recognition
