DBpedia-Chatlog-Analysis

Discourse analysis for the DBpedia chatbot: http://chat.dbpedia.org/

data_exploration.ipynb contains code for grouping chats by user_id and for preliminary analysis, such as finding the average conversation length and the number of users.
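
The grouping step can be illustrated with a minimal sketch. It assumes the log is a sequence of (user_id, message) pairs; the sample rows and the helper name `group_by_user` are hypothetical and not taken from the notebook, whose actual data layout may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log rows: (user_id, message). The real field names may differ.
chat_log = [
    ("u1", "Who is Albert Einstein?"),
    ("u2", "hello"),
    ("u1", "Where was he born?"),
    ("u1", "thanks"),
    ("u2", "What is DBpedia?"),
]


def group_by_user(rows):
    """Collect each user's messages, preserving their order in the log."""
    conversations = defaultdict(list)
    for user_id, message in rows:
        conversations[user_id].append(message)
    return dict(conversations)


conversations = group_by_user(chat_log)

# Number of distinct users and average number of messages per user.
num_users = len(conversations)
avg_length = mean(len(messages) for messages in conversations.values())
print(num_users, avg_length)
```

Here a "conversation" is simply all messages from one user; splitting a user's messages into separate sessions would need timestamps, which this sketch does not model.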
analysis.ipynb computes:
- the most used channel (web, Slack, or Facebook Messenger)
- the number of failed responses per conversation and the number of questions that did not satisfy users
- the conversation length after negative feedback
- the character length of user requests
- commonly asked topics, found via named entity recognition (NER)
- whether coreferences exist
- the language of user requests
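
Two of these measurements can be sketched as follows. This is an illustration only: the turn format (dicts with a `type` key and either `text` or `feedback`) and both function names are assumptions, not the notebook's actual schema.

```python
# Hypothetical conversation: a list of turns. Field names are assumptions.
conversation = [
    {"type": "request", "text": "Who is Ada Lovelace?"},
    {"type": "feedback", "feedback": "negative"},
    {"type": "request", "text": "Ada Lovelace mathematician"},
    {"type": "request", "text": "bye"},
]


def requests_after_negative_feedback(turns):
    """Count user requests sent after the first negative feedback.

    Returns None when the conversation contains no negative feedback.
    """
    for index, turn in enumerate(turns):
        if turn.get("feedback") == "negative":
            later_turns = turns[index + 1:]
            return sum(1 for t in later_turns if t["type"] == "request")
    return None


def request_lengths(turns):
    """Return the character length of every user request, in order."""
    return [len(t["text"]) for t in turns if t["type"] == "request"]


print(requests_after_negative_feedback(conversation))
print(request_lengths(conversation))
```

Returning None rather than 0 keeps conversations without negative feedback distinguishable from those where the user left immediately after giving it.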
Use dependency_parsing.ipynb to estimate the number of complex questions asked and to prepare the input (candidate pairs) for intent clustering.
The clustering folder contains two implementations (KMeans and HDBSCAN) for finding latent intents in utterance representations. Use get_sentence_embeddings.ipynb, preferably on Google Colab, to compute the sentence embeddings used to cluster user requests by their semantics.
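
To show the idea behind clustering embeddings, here is a from-scratch sketch of the k-means iteration (Lloyd's algorithm) in plain Python. It is not the implementation in the clustering folder. The 2-D "embeddings" are toy values, real sentence embeddings have hundreds of dimensions, and seeding the centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=10):
    """Cluster points with plain Lloyd iterations.

    Centroids are seeded with the first k points (a simplification;
    library implementations use smarter initialisation).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy stand-ins for sentence embeddings: two well-separated groups.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
labels, centroids = kmeans(embeddings, k=2)
print(labels)
```

Unlike k-means, HDBSCAN does not need the number of clusters up front and can leave outliers unassigned, which may suit noisy chat utterances better.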
