This repository contains the prototype of a recommender system that makes content-based citation recommendations based on the majority opinions of US court cases.
data
: Contains all the code for
- Data preprocessing
- Creation of the data structures required for training, validation and testing
- Data analysis
document_embedding
: Contains all the code for the hierarchical recurrent neural net (HRNN) document embedder
pretrained_word_embedding
: Contains all the code to generate the needed data structures from the pre-trained GloVe word embedding
ranking_models
: Contains the code for the ItemPopularity and EmbedTextNCF ranking models
training
: Contains the code for the training of the recommender systems
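To give a feel for what a popularity baseline does, here is a minimal sketch that ranks cases by how often they are cited in the training data. It is not taken from the ranking_models folder: the function names, the (citing opinion, cited case) pair format, and the tie-breaking rule are all assumptions made for illustration, and the repository's ItemPopularity model may work differently.

```python
from collections import Counter


def fit_item_popularity(training_pairs):
    """Count how often each case is cited (hypothetical helper).

    training_pairs is assumed to be an iterable of
    (citing_opinion_id, cited_case_id) tuples.
    """
    return Counter(cited for _, cited in training_pairs)


def recommend(popularity, k, exclude=()):
    """Return the k most cited cases, skipping any ids in exclude."""
    excluded = set(exclude)
    # Sort by descending count; break ties by id so the order is deterministic.
    ranked = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in excluded][:k]


# Toy data, invented for this example.
pairs = [
    ("op1", "caseA"),
    ("op2", "caseA"),
    ("op2", "caseB"),
    ("op3", "caseA"),
    ("op3", "caseB"),
    ("op3", "caseC"),
]
popularity = fit_item_popularity(pairs)
top = recommend(popularity, k=2)
top_without_a = recommend(popularity, k=2, exclude=["caseA"])
```

A baseline of this kind ignores the text of the opinion entirely, which is why it is commonly used as a point of comparison for content-based models.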
To run the entire training, complete the following steps:
- Create the word-embedding data structures from GloVe as described in the folder pretrained_word_embedding
- Create the training, validation and test data structures as described in the folder data (this requires access to a proprietary dataset from LawEcon)
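As a rough illustration of the first step, the sketch below builds a word-to-index map and an embedding table from GloVe's plain-text format, where each line holds a word followed by its vector components separated by spaces. The function names, the reserved padding and unknown tokens, and the lowercasing of input tokens are assumptions for this example; the actual procedure is the one described in the pretrained_word_embedding folder.

```python
def load_glove(lines, pad_token="<pad>", unk_token="<unk>"):
    """Build (word_to_index, matrix) from GloVe text lines (hypothetical helper).

    lines can be any iterable of strings, for example an open file handle.
    """
    words, vectors = [], []
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        words.append(parts[0])
        vectors.append([float(v) for v in parts[1:]])
    dim = len(vectors[0]) if vectors else 0
    # Assumption: index 0 is reserved for padding and index 1 for unknown
    # words, both represented by zero vectors.
    vocab = [pad_token, unk_token] + words
    matrix = [[0.0] * dim, [0.0] * dim] + vectors
    word_to_index = {word: index for index, word in enumerate(vocab)}
    return word_to_index, matrix


def encode(tokens, word_to_index, unk_token="<unk>"):
    """Map tokens to indices, falling back to the unknown-word index."""
    unk = word_to_index[unk_token]
    # Assumption: an uncased GloVe vocabulary, so tokens are lowercased.
    return [word_to_index.get(token.lower(), unk) for token in tokens]


# Two made-up lines in GloVe's text format, used instead of a real file.
sample = ["the 0.1 0.2 0.3", "court 0.4 0.5 0.6"]
word_to_index, matrix = load_glove(sample)
ids = encode(["The", "court", "held"], word_to_index)
```

With a downloaded GloVe file, the same function could be called on an open file handle in place of the sample list.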