Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Jsonl Corpus reader path optional again #9638

Merged
merged 3 commits into from
Nov 9, 2021

Conversation

polm
Copy link
Contributor

@polm polm commented Nov 7, 2021

Description

In #8396 the path arg to the JsonlCorpus was made optional. This was done so that you could use a config with a pretraining section for training without including the raw text path. However, #9167 made the path non-optional again, apparently inadvertently.

Types of change

minor fix

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@polm polm added the feat / tok2vec Feature: Token-to-vector layer and pretraining label Nov 7, 2021
spacy/training/corpus.py Outdated Show resolved Hide resolved
@svlandeg
Copy link
Member

svlandeg commented Nov 8, 2021

Ah, this was probably an oversight on the mypy PR, yes. To make the tests succeed, you'll have to make sure that the path argument of the JsonlCorpus constructor matches this change though. And let's use Optional instead of Union[None,...] as we do in other places of the code base.

@svlandeg svlandeg added the types Issues related to typing or typing tools label Nov 8, 2021
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
@svlandeg
Copy link
Member

svlandeg commented Nov 8, 2021

Oh I think you pushed the edit to the wrong PR 🙈 #9639

@polm
Copy link
Contributor Author

polm commented Nov 9, 2021

Gah, thanks for catching that! Should be fixed now.

@svlandeg svlandeg merged commit 24cdd4c into explosion:master Nov 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat / tok2vec Feature: Token-to-vector layer and pretraining types Issues related to typing or typing tools
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants