
Zero loss when pretraining on floret vectors #12363

Closed
ruben-dedoncker opened this issue Mar 4, 2023 · 2 comments · Fixed by #12366
Labels: bug (Bugs and behaviour differing from documentation), feat / vectors (Feature: Word vectors and similarity)

Comments

ruben-dedoncker commented Mar 4, 2023

How to reproduce the behaviour

I'm getting zero loss when running the pretrain command using floret vectors as input.

The vectors are downloaded from https://github.com/explosion/spacy-vectors-builder/releases/tag/en-3.4.0 and initialized into a model named en_floret_md.
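
Initializing downloaded floret vectors into a standalone pipeline like this is typically done with spacy init vectors in floret mode; a sketch of such a command (the vector filename here is illustrative, not the exact file from the release):

python -m spacy init vectors en en_vectors_md.floret.gz en_floret_md --mode floret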

For fastText vectors I am able to get actual loss values. Below is a snippet of the output given by pretrain:

============== Pre-training tok2vec layer - starting at epoch 0 ==============
  #      # Words   Total Loss     Loss    w/s
  0        10927       0.0000   0.0000   8073
  0        22039       0.0000   0.0000   10960
  0        34764       0.0000   0.0000   10826
  0        45368       0.0000   0.0000   10267
  0        55951       0.0000   0.0000   10216
  0        66756       0.0000   0.0000   10909
  0        78491       0.0000   0.0000   10764
  0        89552       0.0000   0.0000   10612

The relevant parts of my config file are the following

[paths]
vectors = "en_floret_md"
raw_text = "raw_text.jsonl"

[corpora.pretrain]
@readers = "spacy.JsonlCorpus.v1"
path = ${paths.raw_text}
min_length = 0
max_length = 0
limit = 0

[pretraining]
max_epochs = 1000
dropout = 0.2
n_save_every = null
n_save_epoch = null
component = "tok2vec"
layer = ""
corpus = "corpora.pretrain"

[pretraining.batcher]
@batchers = "spacy.batch_by_words.v1"
size = 3000
discard_oversize = false
tolerance = 0.2
get_length = null

[pretraining.objective]
@architectures = "spacy.PretrainVectors.v1"
maxout_pieces = 3
hidden_size = 300
loss = "cosine"

[pretraining.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = true
eps = 0.00000001
learn_rate = 0.001

[initialize]
vectors = ${paths.vectors}
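
With this config, pretraining is launched with the standard CLI (the output directory name is illustrative):

python -m spacy pretrain config.cfg ./pretrain_output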

Your Environment

  • Operating System: Windows 11
  • Python Version Used: 3.10.6
  • spaCy Version Used: 3.4.4
  • Environment Information: WSL 2
adrianeboyd (Contributor) commented

Thanks for the report! Looking at the code, spacy.PretrainVectors.v1 was only designed to work with default vectors and wasn't updated when floret vector support was added.

We can raise an error for floret vectors for now, until the code is updated to work with them.
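
A minimal sketch of what such a check could look like (the helper name is illustrative and this is not the actual change in the linked PR; it assumes spaCy v3.4+, where the Vectors table exposes a mode attribute that is "floret" for floret tables):

def check_vectors_for_pretrain_objective(vocab):
    # Floret tables store subword hash buckets rather than one row per word,
    # so the per-lexeme target lookup used by PretrainVectors has nothing to
    # predict and the loss stays at zero. Refuse floret vectors up front.
    if vocab.vectors.mode == "floret":
        raise ValueError(
            "spacy.PretrainVectors.v1 does not support floret vectors; "
            "initialize the pipeline with default-mode vectors instead."
        )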

adrianeboyd added the bug and feat / vectors labels on Mar 6, 2023
adrianeboyd linked a pull request (#12366) on Mar 6, 2023 that will close this issue

github-actions bot commented Apr 7, 2023

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions bot locked as resolved and limited conversation to collaborators on Apr 7, 2023