This is my Master's thesis project, titled "Speaker Detection and Conversation Analysis on Mobile Devices".

wahibhaq/android-speaker-audioanalysis

"Speaker Detection and Conversation Analysis on Mobile Devices"

This was my Master's thesis project, carried out at the "Cooperative Systems" Chair of the TUM Informatics Faculty. The scope of the project was to implement, investigate and compare offline approaches to audio processing on an Android device. It involved smart probing strategies, recording, Voice Activity Detection, feature extraction and machine learning techniques for Speaker Recognition. The challenge was to perform everything on the Android device without Internet access and then investigate energy consumption (battery, memory and CPU usage). These results were then to be compared with the alternative of sending the extracted features to a server, which is already functional.
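To make the Voice Activity Detection step more concrete, here is a minimal sketch of a frame-based energy threshold detector in Java. It is not the method used in the thesis, only a common baseline; the class name `EnergyVad`, the frame size of 160 samples and the threshold of 500 are all assumptions chosen for illustration, and a real app would read its samples from the device microphone rather than from a synthetic array.

```java
import java.util.Arrays;

// Hypothetical illustration: a simple energy-based voice activity detector.
// Names, frame size and threshold are assumptions, not taken from the thesis.
public class EnergyVad {

    /** Root-mean-square energy of one frame of 16-bit PCM samples. */
    public static double rms(short[] samples, int start, int length) {
        double sum = 0.0;
        for (int i = start; i < start + length; i++) {
            sum += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sum / length);
    }

    /** Marks each complete frame as active (true) when its RMS exceeds the threshold. */
    public static boolean[] detect(short[] samples, int frameSize, double threshold) {
        int frames = samples.length / frameSize; // trailing partial frame is ignored
        boolean[] active = new boolean[frames];
        for (int f = 0; f < frames; f++) {
            active[f] = rms(samples, f * frameSize, frameSize) > threshold;
        }
        return active;
    }

    public static void main(String[] args) {
        // Synthetic input: one silent frame followed by one frame of a 440 Hz tone at 8 kHz.
        int frameSize = 160;
        short[] pcm = new short[2 * frameSize];
        for (int i = frameSize; i < pcm.length; i++) {
            pcm[i] = (short) (8000 * Math.sin(2 * Math.PI * 440 * i / 8000.0));
        }
        System.out.println(Arrays.toString(detect(pcm, frameSize, 500.0))); // prints [false, true]
    }
}
```

Only the frames flagged as active would need to be passed on to feature extraction, which is one way such a step can reduce processing cost on the device.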

This thesis project serves as one module of my adviser's PhD research (http://www11.in.tum.de/userfiles/flosch_chi2014wip.pdf), which aims to detect the owner in a conversation, evaluate the emotion/character of that conversation and manage mobile notifications on that basis. For example, if the owner is having a heated argument (labelled as anger), then it's not a good time for the phone to ring. The project covers the areas of Ubiquitous Computing, Mobile Computing and Human-Computer Interaction.

The thesis report was submitted on 15 May 2015 to the Department of Informatics, Technical University of Munich. It can be downloaded from https://www.academia.edu/12802417/Speaker_Detection_and_Conversation_Analysis_on_Mobile_Devices

The buzzwords that describe the project are Android, Performance Analysis, Machine Learning, Offline Recognition, MFCC Feature Extraction, Voice Activity Detection, Conversation Analysis and Speaker Recognition.