How to relate the entity to its place in the text? #12

ali3assi · 2022-03-17T16:36:43Z

Once I have the entity annotations, how can I get each entity's start and end position in the text? I want to relate each span of text to its corresponding entity.

I do the following:

for ent in doc.ents:
    print(ent.text, ent.start_char-ent.sent.start_char, ent.end_char-ent.sent.start_char, ent.label_)

But I get the following exception:

Traceback (most recent call last):
  File "C:\Users\Admin\miniconda3\envs\projet1\lib\tkinter\__init__.py", line 1892, in __call__
    return self.func(*args)
  File "C:\Users\Admin\Documents\codePython\dbpedia\index.py", line 71, in <lambda>
    display_annotate = Button(root, height = 2, width = 20, text ="Annotate text", command = lambda:take_input()) 
  File "C:\Users\Admin\Documents\codePython\dbpedia\index.py", line 15, in take_input
    logger.warning(annotate(text_to_annotate))
  File "C:\Users\Admin\Documents\codePython\dbpedia\index.py", line 57, in annotate
    print(ent.text, ent.start_char-ent.sent.start_char, ent.end_char-ent.sent.start_char, ent.label_)
  File "spacy\tokens\span.pyx", line 429, in spacy.tokens.span.Span.sent.__get__
ValueError: [E030] Sentence boundaries unset. You can add the 'sentencizer' component to the pipeline with: `nlp.add_pipe('sentencizer')`. Alternatively, add the dependency parser or sentence recognizer, or set sentence boundaries by setting `doc[i].is_sent_start`.

MartinoMensio · 2022-04-26T09:17:52Z

Hi @ali3assi,
The error you mention happens because blank pipelines don't include the sentencizer by default, so sentence boundaries are never set and ent.sent cannot be computed.
You can add it yourself:

import spacy
nlp = spacy.blank('en')
nlp.add_pipe('sentencizer')
nlp.add_pipe('dbpedia_spotlight')
doc = nlp("This is an example text. Let's mention Natural Language Processing")
for ent in doc.ents:
    print(ent.text, ent.start_char-ent.sent.start_char, ent.end_char-ent.sent.start_char, ent.label_)
# Natural Language Processing 14 41 DBPEDIA_ENT

Alternatively, load one of the trained pipelines that already set sentence boundaries (en_core_web_sm does so through its dependency parser rather than the sentencizer):

import spacy
# this model needs to be installed first: https://spacy.io/models/en#en_core_web_sm
nlp = spacy.load('en_core_web_sm')

# the rest is the same as above
nlp.add_pipe('dbpedia_spotlight')
doc = nlp("This is an example text. Let's mention Natural Language Processing")
for ent in doc.ents:
    print(ent.text, ent.start_char-ent.sent.start_char, ent.end_char-ent.sent.start_char, ent.label_)
# Natural Language Processing 14 41 DBPEDIA_ENT
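
If you need positions relative to the whole text rather than to the sentence, ent.start_char and ent.end_char are already document-level character offsets, so no subtraction is needed. Here is a minimal sketch of both variants side by side. Note the assumptions: the entity_ruler with a hard-coded pattern is only a stand-in for 'dbpedia_spotlight' so that the example runs offline, and entity_offsets is a hypothetical helper name, not part of spaCy or of this package.

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Assumption: an entity_ruler stands in for 'dbpedia_spotlight' here,
# so the sketch does not need to call the DBpedia Spotlight web service.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns(
    [{"label": "DBPEDIA_ENT", "pattern": "Natural Language Processing"}]
)


def entity_offsets(doc):
    """Hypothetical helper: collect document- and sentence-relative offsets."""
    rows = []
    for ent in doc.ents:
        # Offsets into doc.text (the whole input string)
        doc_span = (ent.start_char, ent.end_char)
        # Offsets relative to the start of the containing sentence
        sent_start = ent.sent.start_char
        sent_span = (ent.start_char - sent_start, ent.end_char - sent_start)
        rows.append((ent.text, doc_span, sent_span, ent.label_))
    return rows


doc = nlp("This is an example text. Let's mention Natural Language Processing")
rows = entity_offsets(doc)
for text, (start, end), sent_span, label in rows:
    # Slicing the original text with the document offsets gives back the entity
    assert doc.text[start:end] == text
    print(text, (start, end), sent_span, label)
```

With the real 'dbpedia_spotlight' component in place of the ruler, the loop should work the same way, since it only relies on doc.ents and the standard Span attributes.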

MartinoMensio added the documentation label (Improvements or additions to documentation) · Feb 7, 2023
