CornellMovieDialogsCorpus.jl

CornellMovieDialogsCorpus.jl is a Julia package that provides a thin wrapper for the Cornell Movie Dialogs Corpus.

Usage

Exported functions:

movie_conversations
movie_lines
movie_title_metadata
movie_character_metadata
movie_script_urls

Each of these loads the corresponding corpus database file.

Example

Suppose you want to train a simple chatbot using "call-and-response" dialog pairs as training data, as in this PyTorch tutorial.

using CornellMovieDialogsCorpus

First, create a Dict that maps line IDs to the raw text.

id2text = Dict(l.line_id => l.text for l in movie_lines())

Now, create a dataset of (utterance, response) pairs from the movie conversations.

utterance_pairs = [(id2text[id], id2text[conv.lines[i+1]])
                   for conv in movie_conversations()
                   for (i, id) in enumerate(conv.lines[1:end-1])]

julia> utterance_pairs[1:5]
5-element Array{Tuple{Any,Any},1}:
 ("Can we make this quick?  Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad.  Again.", "Well, I thought we'd start with pronunciation, if that's okay with you.")
 ("Well, I thought we'd start with pronunciation, if that's okay with you.", "Not the hacking and gagging and spitting part.  Please.")
 ("Not the hacking and gagging and spitting part.  Please.", "Okay... then how 'bout we try out some French cuisine.  Saturday?  Night?")
 ("You're asking me out.  That's so cute. What's your name again?", "Forget it.")
 ("No, no, it's my fault -- we didn't have a proper introduction ---", "Cameron.")
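The pairing step can also be written with zip, which avoids the index arithmetic. The following is a minimal sketch rather than part of the package: mock_lines and mock_conversations are hypothetical stand-ins for the output of movie_lines() and movie_conversations(), built as named tuples with the same field names used in the example above (line_id, text, lines), so the snippet runs without the corpus.

```julia
# Hypothetical stand-ins for corpus records; the field names mirror the example above.
mock_lines = [(line_id = "L1", text = "Hi."),
              (line_id = "L2", text = "Hello."),
              (line_id = "L3", text = "Bye.")]
mock_conversations = [(lines = ["L1", "L2", "L3"],)]

# Map each line ID to its raw text, as before.
mock_id2text = Dict(l.line_id => l.text for l in mock_lines)

# Pair every line with its successor by zipping the ID list
# against a copy of itself shifted by one position.
mock_pairs = [(mock_id2text[a], mock_id2text[b])
              for conv in mock_conversations
              for (a, b) in zip(conv.lines, conv.lines[2:end])]
```

Because zip stops at the shorter of its two arguments, a conversation with a single line simply contributes no pairs.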
