
[DOC] Improve pipeline() docstrings for config and tokenizer #8123

Merged · 3 commits · Oct 28, 2020

Conversation

@BramVanroy (Collaborator) commented Oct 28, 2020

As currently written, it was not clear to me which arguments are needed when using a non-default model in pipeline(). The docstring suggested that when you provide a non-default model, you still need to manually set the config and tokenizer, because otherwise the "task's default will be used". In practice, though, the pipeline is smart enough to automatically choose the right config/tokenizer for the given model. This PR clarifies that in the docstrings/documentation by spelling out exactly which priorities are used when loading the tokenizer. A small change was made for config, too.

Admittedly, the wording for the tokenizer part reads a bit programmatic, but it should make clear how the right tokenizer is loaded.
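To make the described priority concrete, here is a minimal sketch of the resolution order the docstring now documents. This is illustrative only, not the actual transformers implementation; `resolve_tokenizer` and its parameter names are hypothetical:

```python
# Illustrative sketch (NOT the real transformers code) of the priority
# the updated docstring describes for choosing a tokenizer in pipeline().
def resolve_tokenizer(task_default, model=None, tokenizer=None):
    """Return the tokenizer identifier a pipeline would load.

    Priority: an explicit `tokenizer` argument wins; otherwise, if
    `model` is a string identifier, the tokenizer matching that model
    is used; failing that, the task's default tokenizer is used.
    """
    if tokenizer is not None:
        # 1. An explicitly passed tokenizer always takes precedence.
        return tokenizer
    if isinstance(model, str):
        # 2. A model given as a string identifier implies its own tokenizer,
        #    so no manual tokenizer is needed.
        return model
    # 3. Otherwise (default model, or a model object with no name),
    #    fall back to the task's default tokenizer.
    return task_default

print(resolve_tokenizer("default-tok"))                                  # default-tok
print(resolve_tokenizer("default-tok", model="bert-base-cased"))         # bert-base-cased
print(resolve_tokenizer("default-tok", model="bert-base-cased",
                        tokenizer="my-custom-tok"))                      # my-custom-tok
```

In the real library, passing a model object rather than a string is the case where you do still need to supply the tokenizer yourself, which is what the old wording failed to convey.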

cc @sgugger

@BramVanroy BramVanroy requested a review from sgugger October 28, 2020 16:31
@sgugger (Collaborator) left a comment


Thanks for fixing!

Review comment on src/transformers/pipelines.py (outdated, resolved)
@BramVanroy (Collaborator, Author)

@sgugger I made the change you requested. Not sure why CI is failing on build_doc; it seems related to an environment installation step.

@sgugger sgugger merged commit 5193172 into huggingface:master Oct 28, 2020
@sgugger (Collaborator) commented Oct 28, 2020

The failure is spurious (basically, the new version of pytorch is not cached on the CI and it sometimes fails to download). Thanks for the fix!

@BramVanroy BramVanroy deleted the patch-2 branch October 28, 2020 19:47
fabiocapsouza pushed a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020
…face#8123)

* Improve pipeline() docstrings

* make style

* Update wording for config
fabiocapsouza added a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020