Enable custom HF BERT models with default tokenizer config #2973

Merged: 2 commits, Jan 20, 2023

geoffreyangus
Contributor

Prior to this PR, Ludwig assumed that all custom HuggingFace BERT models have tokenizer_config.json in their repository. This is not the case, particularly for these two models:

We wrap the attempt to get tokenizer_config.json in a try-except block and default to an empty configuration if it does not exist.
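
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the helper name `load_tokenizer_config` is hypothetical, and a local directory stands in for the model's HuggingFace repository so that the example runs without network access.

```python
import json
import os


def load_tokenizer_config(model_dir: str) -> dict:
    """Hypothetical helper: read tokenizer_config.json from a model directory.

    Returns an empty dict when the file is absent, so callers fall back
    to the tokenizer's default settings instead of raising.
    """
    config_path = os.path.join(model_dir, "tokenizer_config.json")
    try:
        with open(config_path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # Assumption: a missing config simply means "use defaults".
        return {}
```

Catching only `FileNotFoundError` keeps other failures, such as a malformed JSON file, visible rather than silently replacing them with an empty configuration.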

ludwig/utils/tokenizers.py (review thread resolved)
geoffreyangus merged commit 3031e4f into master on Jan 20, 2023
geoffreyangus deleted the fix-missing-tokenizer-config branch on January 20, 2023 at 18:53
4 participants