[Bug]: TarsModels do redownload embeddings #3207

Closed
helpmefindaname opened this issue Apr 20, 2023 · 0 comments · Fixed by #3212
Labels
bug Something isn't working

Comments

@helpmefindaname
Collaborator

Describe the bug

Related to, but not the same as, #3167.

Saving a TarsModel locally does not save the Hugging Face config. Loading it on a new machine without an hf-cache therefore requires an internet connection and takes longer.
This happens because TARS pickles the internal SequenceTagger/TextClassifier instead of just serializing their embeddings.
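The difference between the two approaches can be illustrated with a toy sketch. This is not Flair's code: `ToyEmbedding`, `fetch_config`, `HUB`, `to_params`, and `from_params` are hypothetical names made up for the illustration, and the "download" is just a dictionary lookup that gets logged. It only shows why a pickled object that stores a model name, but not its config, has to fetch the config again on load, while an explicitly serialized one does not.

```python
import pickle

# Hypothetical stand-in for a remote model hub (assumption, not a real API).
HUB = {"bert-base-uncased": {"hidden_size": 768}}
downloads = []  # records every simulated network fetch


def fetch_config(name):
    """Simulate downloading a config and log the access."""
    downloads.append(name)
    return HUB[name]


class ToyEmbedding:
    def __init__(self, name, config=None):
        self.name = name
        # Without a stored config, the constructor has to "download" it.
        self.config = config if config is not None else fetch_config(name)

    # Pickle path: only the name survives, so loading triggers a fetch.
    def __getstate__(self):
        return {"name": self.name}

    def __setstate__(self, state):
        self.__init__(state["name"])

    # Explicit path: the config is stored next to the name.
    def to_params(self):
        return {"name": self.name, "config": self.config}

    @classmethod
    def from_params(cls, params):
        return cls(params["name"], config=params["config"])


emb = ToyEmbedding("bert-base-uncased")
downloads.clear()  # ignore the initial fetch

restored_pickled = pickle.loads(pickle.dumps(emb))
downloads_pickled = list(downloads)
downloads.clear()

restored_explicit = ToyEmbedding.from_params(
    pickle.loads(pickle.dumps(emb.to_params()))
)
downloads_explicit = list(downloads)

print("fetches when pickling the object:", downloads_pickled)
print("fetches when serializing params:", downloads_explicit)
```

In this sketch the first list contains one simulated fetch and the second is empty, which mirrors the behavior described in this report.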

To Reproduce

from flair.models import TARSClassifier

model = TARSClassifier.load("tars-base")
model.tars_embeddings.model.config._name_or_path = "bert-base-uncased"
model.tars_embeddings.base_model_name = "bert-base-uncased"
model.tars_embeddings.name = "transformer-bert-base-uncased"
model.save("local-tars-base.pt")

# clear huggingface cache or copy `local-tars-base.pt` to another machine or docker container.

model = TARSClassifier.load("local-tars-base.pt")

Expected behavior

The model should load without an internet connection, and without waiting for the embedding config and weights to download.
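One way to check this expectation without physically disconnecting the machine is to put the Hugging Face libraries into offline mode. The sketch below assumes the `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` environment variables, which are documented by huggingface_hub and transformers; they should be set before those libraries are imported, since the flags are typically read at import time. The helper name `load_tars_offline` is hypothetical, and the file name is the one from the reproduction steps.

```python
import os

# Set the offline flags before flair/transformers are imported.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"


def load_tars_offline(path):
    """Load a locally saved TARS model with Hugging Face downloads disabled.

    Hypothetical helper: flair is imported lazily so the flags above are
    already in place when transformers is first imported.
    """
    from flair.models import TARSClassifier

    return TARSClassifier.load(path)
```

Calling `load_tars_offline("local-tars-base.pt")` with an empty Hugging Face cache should then either succeed without any download, or fail with an error instead of silently fetching the config and weights.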

Logs and Stack traces

Downloading (…)okenizer_config.json: 100%|██████████| 28.0/28.0 [00:00<00:00, 17.2kB/s]
Downloading (…)lve/main/config.json: 100%|██████████| 570/570 [00:00<00:00, 517kB/s]
Downloading (…)solve/main/vocab.txt: 100%|██████████| 232k/232k [00:00<00:00, 2.12MB/s]
Downloading (…)/main/tokenizer.json: 100%|██████████| 466k/466k [00:00<00:00, 3.61MB/s]
Downloading pytorch_model.bin: 100%|██████████| 440M/440M [00:35<00:00, 12.5MB/s]

Screenshots

No response

Additional Context

No response

Environment

Versions:

Flair: 0.12.2 (master branch)
Pytorch: 2.0.0+cu117
Transformers: 4.28.1
GPU: True
