Feat/debug data warn spread ents #9960

Merged (3 commits) on Jan 4, 2022

@DuyguA (Contributor) commented Dec 30, 2021

Description

Adds a warning to debug data for entities in the training data that cross sentence boundaries. For example:

I went there on [Friday. Aidan] came too.

An entity like this is most likely an annotation mistake rather than the intended span.
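
For readers who want to see the idea concretely, here is a minimal sketch of such a check in plain Python. It is not the code from this PR: the function name `count_boundary_crossing_ents` and the input representation (a per-token list of sentence-start flags plus half-open `(start, end)` token ranges for entities) are assumptions made for illustration only.

```python
from typing import List, Tuple


def count_boundary_crossing_ents(
    sent_starts: List[bool], ents: List[Tuple[int, int]]
) -> int:
    """Count entities that straddle a sentence boundary.

    Hypothetical helper, not spaCy's implementation.
    sent_starts[i] is True when token i begins a sentence.
    Each entity is a half-open token range (start, end).
    """
    crossing = 0
    for start, end in ents:
        # If any token after the entity's first token starts a sentence,
        # the entity spans more than one sentence.
        if any(sent_starts[i] for i in range(start + 1, end)):
            crossing += 1
    return crossing


# Tokens for: "I went there on Friday. Aidan came too."
tokens = ["I", "went", "there", "on", "Friday", ".", "Aidan", "came", "too", "."]
sent_starts = [True, False, False, False, False, False, True, False, False, False]

bad_ents = [(4, 7)]           # "Friday . Aidan" crosses the boundary
good_ents = [(4, 5), (6, 7)]  # "Friday" and "Aidan" annotated separately

bad_count = count_boundary_crossing_ents(sent_starts, bad_ents)
good_count = count_boundary_crossing_ents(sent_starts, good_ents)
```

With the example above, the single span covering "Friday. Aidan" is counted once, while the two separate annotations are not counted at all.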

Types of change

Minor enhancement

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@svlandeg added the labels "enhancement" (Feature requests and improvements) and "feat / cli" (Feature: Command-line interface) on Dec 30, 2021

@polm (Contributor) left a comment

Looks good! It'll be great to finally catch this in debug data.

@svlandeg (Member) left a comment

Looks good! I just had a few nitpicking comments about rephrasing, I'll add those and then this can be merged :-)

spacy/cli/debug_data.py (two review threads, both outdated and resolved)
@svlandeg svlandeg merged commit 55cf492 into explosion:master Jan 4, 2022
polm pushed a commit to polm/spaCy that referenced this pull request Jan 17, 2022
* added check for crossing boundaries

* formatted blacked

* Rephrasing slightly

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>