Conversational Collaborative Filtering using External Data and MovieSent dataset

Code and data from the NAACL'21 short paper "You Sound Like Someone Who Watches Drama Movies: Towards Predicting Movie Preferences from Conversational Interactions" by Volokhin et al.

MovieSent is a dataset of 489 movie-related conversations with fine-grained user sentiment labels for each mentioned movie. The conversations are in the MovieSent.json file.
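
For a first look at the data, a minimal sketch is given here. It assumes only that MovieSent.json is valid JSON. The schema is not described in this README, so the helper reports the top-level type and size instead of relying on any field names. The name `summarize_json` is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return (top-level type name, number of top-level entries) for a JSON file.

    Hypothetical helper: it makes no assumption about the MovieSent schema.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    # Lists and dicts both support len(); scalar JSON values have no entries.
    size = len(data) if isinstance(data, (list, dict)) else None
    return type(data).__name__, size
```

Calling `summarize_json("MovieSent.json")` from the directory that holds the file (the path is an assumption, adjust it to your checkout) shows whether the conversations are stored as a list or as a mapping before you write any code that depends on the structure.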

Reviews were collected in April 2020. First, a list of critics was compiled from more than 600 movies, whose IDs are in films_rt_ids.json. Then all reviews by those critics were scraped and stored in the reviews.tar.gz file.

To run the model:

Install the dependencies listed in requirements.txt.
Run indexing.py to create an index of reviews based on the reviews.tsv.gz file.
Run sentiment_estimation.py to create a sentiment estimation model.
Run main.py for the final model. The CF model is trained during this step, which can take a long time for an SVDpp model (KNN is much faster, about 20 seconds, if you just want to check that the code works).
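
The steps above can be sketched as a small driver script. This is an illustration, not part of the repository: the function names `build_commands` and `run_pipeline` are hypothetical, the scripts are assumed to take no command-line arguments, and the driver is assumed to run from the repository root. By default it only prints the commands (dry run).

```python
import subprocess
import sys

# Script names come from the steps above; the order is the only logic here.
PIPELINE = ["indexing.py", "sentiment_estimation.py", "main.py"]

# Fall back to "python" if the interpreter path is unavailable.
PYTHON = sys.executable or "python"


def build_commands(scripts=PIPELINE, python=PYTHON):
    """Return one command (as an argument list) per pipeline script, in order."""
    return [[python, script] for script in scripts]


def run_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    With check=True, a failing step raises CalledProcessError,
    so later steps never run on top of a broken earlier one.
    """
    commands = [[PYTHON, "-m", "pip", "install", "-r", "requirements.txt"]]
    commands += build_commands()
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_pipeline(dry_run=False)` executes the four commands in sequence. Keeping the dry run as the default makes it easy to check the order before starting the potentially long training step in main.py.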

