lstm_language_model

A PyTorch LSTM language model, supporting class-based softmax (following the paper) and NCE (noise contrastive estimation, following the paper; thanks to Stonesjtu's amazing project) for speeding up training.

Theoretical Analysis

Class-based Softmax

In class-based softmax, each word is assigned to one class, so the probability of a word factorizes into a class term and a within-class term: P(w | h) = P(c(w) | h) · P(w | c(w), h), where c(w) is the class of word w and h is the hidden state.

Theoretically, the computational cost can be reduced from O(dk) to O(d\sqrt{k}), where d is the size of the last hidden layer and k is the vocabulary size (the optimum is reached with roughly \sqrt{k} classes of about \sqrt{k} words each).

But in practice, there is a lot of overhead (especially on GPU).
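
As a rough illustration of the two-step factorization above, here is a minimal PyTorch sketch of a class-based decoder. It is not the repository's decoder.py; the mappings class_sizes, word2class, and word2idx are hypothetical helpers, and a real implementation would batch the per-class projections rather than score one word at a time.

```python
import torch.nn as nn
import torch.nn.functional as F

class ClassBasedSoftmax(nn.Module):
    """Sketch of a class-based decoder: P(w|h) = P(c(w)|h) * P(w|c(w),h)."""

    def __init__(self, hidden_size, class_sizes, word2class, word2idx):
        super().__init__()
        # class_sizes[c] = number of words assigned to class c
        # word2class[w]  = class id of word w
        # word2idx[w]    = position of word w inside its class
        self.class_layer = nn.Linear(hidden_size, len(class_sizes))
        # One small output layer per class, covering only that class's words.
        # With ~sqrt(k) classes of ~sqrt(k) words each, scoring one word costs
        # O(d * sqrt(k)) instead of O(d * k).
        self.word_layers = nn.ModuleList(
            [nn.Linear(hidden_size, size) for size in class_sizes]
        )
        self.word2class = word2class
        self.word2idx = word2idx

    def log_prob(self, hidden, word):
        """log P(word | hidden) for a single hidden vector and word id."""
        c = self.word2class[word]
        i = self.word2idx[word]
        class_logp = F.log_softmax(self.class_layer(hidden), dim=-1)[c]
        word_logp = F.log_softmax(self.word_layers[c](hidden), dim=-1)[i]
        return class_logp + word_logp
```

The many small per-class projections are also where the GPU overhead mentioned above tends to come from: each class adds its own, poorly utilized matrix multiplication.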

NCE

NCE transforms the probability estimation problem into a binary classification problem. In NCE, we have a noise distribution, and the goal is to train the model to distinguish the target word from noise samples. The biggest trick in NCE is that the probability normalization term is treated as a constant, which saves a lot of time during both training and testing.
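
For intuition, the following is a minimal sketch of the NCE objective under that constant-normalization assumption. It is illustrative only (not the repository's NCE decoder, which follows Stonesjtu's implementation), and all tensor names are assumptions.

```python
import math
import torch.nn.functional as F

def nce_loss(target_scores, noise_scores, target_noise_logq, sample_noise_logq, k):
    """Sketch of the NCE binary-classification loss.

    target_scores:     unnormalized model log-scores s(w, h) of the true next
                       words, shape (batch,)
    noise_scores:      unnormalized model log-scores of the k sampled noise
                       words per position, shape (batch, k)
    target_noise_logq: log q(w) of the true words under the noise
                       distribution, shape (batch,)
    sample_noise_logq: log q(w) of the sampled noise words, shape (batch, k)

    The model's normalization term is treated as a constant (1), so no softmax
    over the full vocabulary is ever computed.
    """
    log_k = math.log(k)
    # logit of P(data | w, h) = s(w, h) - log(k * q(w))
    target_logits = target_scores - (target_noise_logq + log_k)
    noise_logits = noise_scores - (sample_noise_logq + log_k)

    # True words should be classified as data, noise samples as noise.
    loss_data = -F.logsigmoid(target_logits)               # shape (batch,)
    loss_noise = -F.logsigmoid(-noise_logits).sum(dim=-1)  # shape (batch,)
    return (loss_data + loss_noise).mean()
```

At test time the unnormalized score s(w, h) can be used directly, which is what makes evaluation fast as well.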

Usage

Before training the model, run the following script to build a vocabulary with class assignments:

python build_vocab_with_class.py --ncls 30 --min_count 0

The vocab built above assigns classes by word frequency; you can also build your own vocab using other methods (see the example in ./data/penn/vocab.c.txt; note that each class should be an integer).

Run the training script:

python train.py --cuda --data [data_path] --decoder [sm|nce|cls]

File Structure

  • data/: corpus and dictionary
  • params/: saved model parameters
  • data.py: custom data iterator and dictionary
  • model.py: the basic RNN model
  • decoder.py: the decoder layers (softmax, class-based softmax and NCE)
  • train.py: the training process
  • utils.py: utility functions

Performance

Experiments on the swb corpus (60k vocab). Average training time per epoch:

  • softmax: 1061s
  • nce: 471s
  • class-based softmax: 465s
