Matcher does not assign entity IDs to relevant spans. #615
Comments
As a related side note, I tested the merge phrases callback below, and the label assignment does not seem to work:
Hmm. The entity ID feature isn't yet 100% well thought out. There are a few complications. One implementation detail to keep in mind is that spans don't "really" exist. They're views of slices of the document. We can have labels on the spans, and I guess we can extend this to having entity IDs on the spans as well. I'm worried that if we start writing stuff to the span object, things will quickly get confusing. Consider:

    for span in doc.ents:
        span.ent_id = lookup_ent_id(span.text)

This isn't going to work. The

Do you think it's too limiting for entity IDs to stay a per-token property? If the entity ID is per token, we don't have this problem. You can always merge a span into a single token anyway.
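The concern about spans being views can be illustrated with a small toy model. This is a minimal sketch in plain Python, not spaCy's actual implementation: the `Token`, `Span`, and `Doc` classes, the `ent_bounds` field, and the ID `"Q60"` are all hypothetical names chosen for illustration. It assumes that `doc.ents` builds fresh span objects on each access, which is what "views of slices" suggests.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    ent_id: str = ""  # stored with the document, so it persists


@dataclass
class Span:
    doc: "Doc"
    start: int
    end: int
    ent_id: str = ""  # lives only on this temporary view object


@dataclass
class Doc:
    tokens: list
    ent_bounds: list = field(default_factory=list)  # (start, end) pairs

    @property
    def ents(self):
        # A fresh Span is built on every access (assumed behaviour).
        return [Span(self, s, e) for s, e in self.ent_bounds]


doc = Doc([Token(w) for w in "I like New York".split()], [(2, 4)])

# Writing to the span: the value lands on a throwaway view.
for span in doc.ents:
    span.ent_id = "Q60"
span_ids = [span.ent_id for span in doc.ents]  # new views, value is gone

# Writing to the tokens: the value lands on the document itself.
for span in doc.ents:
    for tok in doc.tokens[span.start:span.end]:
        tok.ent_id = "Q60"
token_ids = [tok.ent_id for tok in doc.tokens]
```

In this toy model the ID written to the span does not survive a second access to `doc.ents`, while the per-token ID does, which is the trade-off raised above.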
@honnibal In my view, there should be a way to establish the entity ID for the token that results from a merged span. The issue is that the merge call needs a way to take the entity metadata passed to the matcher and set it on the merged token. Perhaps something like this should be able to set the entity_id data on the merged token:
in which case, if I check the resulting token, all of the entity attributes should be set. What do you think?
I thought that already worked! It definitely should. Actually merge should support all of the token attributes.
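The behaviour being asked for, a merge that carries arbitrary attributes onto the resulting token, can be sketched as follows. This is a plain-Python illustration under stated assumptions, not spaCy's `merge`: the `merge` helper, the dict-based tokens, and the attribute values `"GPE"` and `"Q60"` are hypothetical.

```python
def merge(tokens, start, end, **attrs):
    """Replace tokens[start:end] with one token that carries attrs."""
    merged = {"text": " ".join(t["text"] for t in tokens[start:end])}
    merged.update(attrs)  # every keyword becomes a token attribute
    return tokens[:start] + [merged] + tokens[end:]


tokens = [{"text": w} for w in "I like New York".split()]
tokens = merge(tokens, 2, 4, ent_type="GPE", ent_id="Q60")
merged_token = tokens[2]
```

Because the attributes end up on a single token, a per-token entity ID is enough to describe the whole merged entity.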
If you run this in the merge function within the matcher above:
Assertion 1 passes, but assertion 2 does not, so it seems like merge isn't setting the attributes. To be sure, I also checked the ent_id of the resulting token after the matcher runs, and got the same result.
Not sure precisely when this was fixed, but the test for it is there and it's been green for some time --- so, closing this.
Related to #605 and #577
When I pass a document through a custom matcher, the matcher does tag the relevant span as an entity, but the span comprising the entity does not have entity metadata associated with it (assertion 1 passes, assertion 2 fails). I'm not sure whether this is exactly a bug, since to my understanding the matcher is meant to have minimal side effects unless the user does something explicit. However, I'm not sure how to assign entity metadata to the appropriate span myself. I've tried adding a callback to the matcher to assign the entity manually, but setting attributes on the span doesn't seem to be allowed.
Your Environment