About GLOVE model #5

Open
jx00109 opened this issue Nov 8, 2017 · 1 comment

jx00109 commented Nov 8, 2017

Recently, I used torchtext to load the GloVe model. From this module I got the dictionary that maps each word to an index and the embedding matrix (shape word_count * dim, torch.FloatTensor). To create the file that train.py can use, I wrote my code like this:

t = (dictionary, embedding_matrix, dim)
torch.save(t, 'mypath/glove.pt')

Is the resulting glove.pt file in the format that your program expects?
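One way to sanity-check such a file before passing it to train.py is to reload it and verify that the three parts agree with each other. The sketch below is only an illustration: `check_glove_tuple` is a hypothetical helper (not part of this repository or of torchtext), the toy dictionary and matrix stand in for the real torchtext outputs, and it assumes the expected layout really is the (dictionary, vectors, dim) tuple described above.

```python
import os
import tempfile

import torch


def check_glove_tuple(obj):
    """Return a list of problems found in a (dictionary, vectors, dim) tuple."""
    if not (isinstance(obj, tuple) and len(obj) == 3):
        return ["expected a 3-tuple (dictionary, vectors, dim)"]
    dictionary, vectors, dim = obj
    problems = []
    if not torch.is_tensor(vectors) or vectors.dim() != 2:
        problems.append("vectors should be a 2-D tensor")
    else:
        # The second axis of the matrix must equal the stored dimension.
        if vectors.size(1) != dim:
            problems.append("dim does not match vectors.size(1)")
        # Every word index must point at an existing row of the matrix.
        if dictionary and max(dictionary.values()) >= vectors.size(0):
            problems.append("a word index points past the last row")
    return problems


# Toy stand-ins for the real torchtext outputs (hypothetical values).
toy_dictionary = {"<unk>": 0, "the": 1, "cat": 2}
toy_vectors = torch.zeros(3, 4)
toy = (toy_dictionary, toy_vectors, 4)

# Round-trip through torch.save / torch.load, as in the snippet above.
path = os.path.join(tempfile.mkdtemp(), "glove.pt")
torch.save(toy, path)
loaded = torch.load(path)

# An empty list means the tuple is internally consistent.
print(check_glove_tuple(loaded))
```

An empty list only says the tuple is self-consistent; whether train.py reads it in exactly this order is something the maintainers would have to confirm.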

@DenisDsh

This is how I created the GloVe model:


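import torch
from torchtext import data  # in torchtext 0.9 to 0.11 this API lives in torchtext.legacy

# Field definitions: tokenized text and a single (non-sequential) label.
TEXT = data.Field(sequential=True)
LABEL = data.Field(sequential=False)

# Load the train/validation/test splits from JSON files in the current directory.
train, val, test = data.TabularDataset.splits(
    path='./', train='train.json',
    validation='val.json', test='test.json', format='json',
    fields={'text': ('text', TEXT),
            'label': ('label', LABEL)})

# Build the vocabulary from the training set and attach the pretrained GloVe vectors.
TEXT.build_vocab(train, vectors="glove.42B.300d")

dictionary = TEXT.vocab.stoi           # word -> index
vectors = TEXT.vocab.vectors           # FloatTensor of shape (vocab_size, dim)
dim = TEXT.vocab.vectors.size()[1]     # 300 in this case

# Save as a (dictionary, vectors, dim) tuple.
torch.save((dictionary, vectors, dim), './GloVe/glove.42B.300d.pt')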

Took inspiration from :
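For readers wondering how a (dictionary, vectors, dim) tuple like the one saved above is typically consumed, here is a minimal sketch that wraps the matrix in an `nn.Embedding` layer and looks up a few tokens. It is not taken from train.py: the toy values, the `encode` helper, and the choice of index 0 for unknown words are all assumptions made for illustration. In practice the three values would come from `torch.load` on the saved file.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the saved tuple (hypothetical values); in practice:
# dictionary, vectors, dim = torch.load('./GloVe/glove.42B.300d.pt')
dictionary = {"<unk>": 0, "<pad>": 1, "hello": 2, "world": 3}
vectors = torch.arange(12, dtype=torch.float32).view(4, 3)
dim = vectors.size(1)

# Wrap the pretrained matrix in an embedding layer; freeze=True keeps it fixed.
embedding = nn.Embedding.from_pretrained(vectors, freeze=True)


def encode(tokens, dictionary, unk_index=0):
    """Map tokens to indices, falling back to unk_index for unseen words.

    unk_index=0 is an assumption about where '<unk>' sits in the vocabulary.
    """
    return torch.tensor([dictionary.get(t, unk_index) for t in tokens],
                        dtype=torch.long)


ids = encode(["hello", "unseen", "world"], dictionary)
embedded = embedding(ids)  # one row of `vectors` per token
print(ids.tolist(), tuple(embedded.shape))
```

Each row of `embedded` is simply the row of `vectors` selected by the corresponding index, so the dictionary and the matrix must come from the same vocabulary build.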
