multi token patterns #19
Comments
I'm also thinking of a postprocessing trick now. If a token is detected as an entity, but it is part of a noun-chunk, we may also attempt to highlight the entire noun-chunk. This would be for a separate tutorial, but I'm curious what you think of the idea.
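To make the noun-chunk idea concrete, here is a minimal sketch in plain Python. It assumes the entity detector gives back token indices and that noun chunks are available as `(start, end)` token offsets (in spaCy these could be read off `doc.noun_chunks` via `.start` and `.end`). The function name `expand_to_noun_chunks` and the toy sentence are made up for illustration and are not part of this project.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def expand_to_noun_chunks(entity_tokens: List[int], noun_chunks: List[Span]) -> List[Span]:
    """Grow each single-token entity to the noun chunk that contains it.

    Hypothetical helper: tokens outside every noun chunk stay one-token spans.
    """
    expanded = set()
    for i in entity_tokens:
        # Default: keep the detected token as a one-token span.
        span = (i, i + 1)
        for start, end in noun_chunks:
            if start <= i < end:
                # The token sits inside this chunk, so highlight the whole chunk.
                span = (start, end)
                break
        expanded.add(span)
    # A set removes duplicates when two entity tokens share a chunk.
    return sorted(expanded)


# Toy input: offsets written by hand, standing in for a parser's output.
tokens = ["I", "bought", "a", "mechanical", "keyboard", "yesterday"]
noun_chunks = [(0, 1), (2, 5)]  # "I", "a mechanical keyboard"
entity_tokens = [4]             # only "keyboard" was detected

spans = expand_to_noun_chunks(entity_tokens, noun_chunks)
highlighted = [" ".join(tokens[s:e]) for s, e in spans]
print(highlighted)
```

Whether the determiner ("a") should be trimmed from the highlighted span is a separate design choice that this sketch leaves open.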
They are, but the n-grams do actually need to be present in the embedding model. If not, the algorithm doesn't have any input to expand over. I can see 2 solutions:
Additionally, there is a function to add a flag. This is also one of the features that isn't properly added to the documentation. Besides that, the
I generally like to compose the behaviour of the patterns along with your rule-based matcher explorer https://demos.explosion.ai/matcher.
Yeah, averaging the embeddings of inputs seems like it'll result in a bad time. But it was indeed probably the There is also a third option, one that (hopefully) will get announced next week on our YouTube channel.
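For anyone reading along, a rough sketch of why averaging can go wrong. It uses hand-made 2-d vectors, not a real embedding model, and it assumes multi-token phrases are stored under an underscore-joined key such as `hot_dog`. That key convention and the `phrase_vector` helper are assumptions for illustration only. When the phrase is in the vocabulary its own vector is used; when it is missing, the fallback average of the parts can land far away from what the phrase means.

```python
from typing import Dict, List, Optional

Vector = List[float]


def phrase_vector(phrase: str, vectors: Dict[str, Vector]) -> Optional[Vector]:
    """Look up a phrase vector, falling back to the mean of its token vectors.

    Hypothetical helper; returns None when no token of the phrase is known.
    """
    key = phrase.replace(" ", "_")  # assumed convention for n-gram keys
    if key in vectors:
        # The n-gram itself is present in the embedding table.
        return vectors[key]
    parts = [vectors[tok] for tok in phrase.split() if tok in vectors]
    if not parts:
        return None
    dim = len(parts[0])
    # Element-wise mean of the known token vectors.
    return [sum(vec[d] for vec in parts) / len(parts) for d in range(dim)]


# Toy table: the phrase vector is deliberately unlike its parts.
with_ngram = {"hot": [1.0, 0.0], "dog": [0.0, 1.0], "hot_dog": [-1.0, -1.0]}
without_ngram = {"hot": [1.0, 0.0], "dog": [0.0, 1.0]}

direct = phrase_vector("hot dog", with_ngram)
averaged = phrase_vector("hot dog", without_ngram)
print(direct, averaged)
```

In this toy setup the averaged vector points in the opposite direction from the dedicated phrase vector, which is the kind of mismatch that makes the n-gram needing to exist in the model matter.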
Now you've got me curious about the third option. But cool that you are working on a tutorial. Let me know if there are any hiccups or features you might think of.
@koaning I closed this for now. Will review the solution after your blogpost.
It will be a two-part thing, the first part will be on YouTube. The thing about the solution though is that it is already implemented in another library 😉
That library being? 😅 Or are you talking about the
Cool. I'll do some testing and look into a way to integrate this.
There are likely some other integrations inbound, but yeah, s2v is a great trick.
I might be working on a tutorial on this project, so I figured I'd double-check explicitly: are multi-token phrases supported? My impression is that they're not, and that's totally fine, but I just wanted to make sure.
This example:
Yields this error: