[Bug]: Caching bugs when accessing context in Sentence #3156

Closed
alanakbik opened this issue Mar 22, 2023 · 1 comment
Labels
bug Something isn't working

Comments

alanakbik (Collaborator) commented Mar 22, 2023

Describe the bug

The right_context and left_context methods of Sentence compute the context used by all our best models. However, context expansion behaves inconsistently because of two bugs.

To Reproduce

from flair.data import Sentence

# make a sentence without context
sentence = Sentence("Luke and Leia destroyed the Death Star.")
print(sentence)
# print right context: There is none - CORRECT!
print(sentence.right_context(4))

# now make a second sentence and set it as context
other_sentence = Sentence("The Death Star then exploded.")
Sentence.set_context_for_sentences([sentence, other_sentence])
# print(sentence.next_sentence()) # verify that next sentence is correctly set (it is: CORRECT)

# now print right context. Even though context is set, right_context returns '[]': ERROR!
print(sentence.right_context(4))

# now calculate right context for some other random sentence
Sentence("Why am I here?").right_context(4)

# print right context again. Now suddenly, the correct context is returned. (ERROR because inconsistent behavior)
print(sentence.right_context(4))

Expected behavior

The correct context should always be returned.
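One way to make this expectation concrete is a cache that lives on each instance and is cleared whenever the context changes. The sketch below uses a hypothetical `CachedDoc` class with a hypothetical `next_doc` attribute. It is not flair's `Sentence`, and it is not necessarily how this issue was fixed.

```python
class CachedDoc:
    """Hypothetical stand-in for a sentence; not flair's Sentence class."""

    def __init__(self, text):
        self.text = text
        self._next_doc = None
        self._context_cache = {}  # per-instance cache, keyed by context length

    @property
    def next_doc(self):
        return self._next_doc

    @next_doc.setter
    def next_doc(self, doc):
        self._next_doc = doc
        self._context_cache.clear()  # context changed: drop stale results

    def right_context(self, n):
        # Compute once per (instance, n); reuse until the context changes.
        if n not in self._context_cache:
            if self._next_doc is None:
                tokens = []
            else:
                tokens = self._next_doc.text.split()[:n]
            self._context_cache[n] = tokens
        return self._context_cache[n]


doc = CachedDoc("Luke and Leia destroyed the Death Star.")
before = doc.right_context(4)  # no context yet

doc.next_doc = CachedDoc("The Death Star then exploded.")
after = doc.right_context(4)   # recomputed, because the setter cleared the cache

CachedDoc("Why am I here?").right_context(4)  # other instances do not interfere
again = doc.right_context(4)
```

With this design, a call on an unrelated instance cannot evict another instance's entry, and setting the context late cannot leave a stale result behind.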

Additional Context

There are likely two reasons for this:

  • We set the lru_cache size of right_context to 1: @lru_cache(maxsize=1) # cache last context, as training repeats calls. I did this because I assumed that the cache is kept per instance. But it turns out that the cache is global (shared across all instances), even though the method is part of Sentence. This is why computing the right_context for some random sentence, as in the snippet above, results in a different context being computed afterwards: the original cache entry has already been evicted. -> to fix, set a much higher cache size here
  • The main error is likely a problem with the equality definition of Sentence. Equality considers only features of the sentence itself, not its context. This is why setting a context belatedly and then calling right_context again gives the same result as before -> to fix this, the equality definition of Sentence needs to be changed
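Both effects can be reproduced without flair. The following is a minimal sketch using only the standard library. The `Doc` class and its `next_doc` attribute are hypothetical stand-ins that mimic the two properties described above: equality that ignores context, and a method decorated with `@lru_cache(maxsize=1)`.

```python
from functools import lru_cache


class Doc:
    """Hypothetical stand-in for a sentence; not flair's Sentence class."""

    def __init__(self, text):
        self.text = text
        self.next_doc = None  # context attached after construction

    def __eq__(self, other):
        # Equality looks only at the text, not at the attached context.
        return isinstance(other, Doc) and self.text == other.text

    def __hash__(self):
        return hash(self.text)

    @lru_cache(maxsize=1)  # one cache slot, shared by ALL instances
    def right_context(self, n):
        if self.next_doc is None:
            return []
        return self.next_doc.text.split()[:n]


first = Doc("Luke and Leia destroyed the Death Star.")
stale_before = first.right_context(4)   # computed with no context set

first.next_doc = Doc("The Death Star then exploded.")
stale_after = first.right_context(4)    # cache hit: the old result comes back

Doc("Why am I here?").right_context(4)  # evicts the only cache slot
fresh = first.right_context(4)          # recomputed with the context in place

print(stale_before, stale_after, fresh)
print(Doc.right_context.cache_info())
```

The cache key is built from the call arguments, including `self`, so a lookup depends on the object's `__hash__` and `__eq__`. Since neither reflects the context here, setting the context later does not change the key, and the stale entry is returned until another call evicts it.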

Environment

Python 3.8, master branch

alanakbik added the bug label Mar 22, 2023
alanakbik added a commit that referenced this issue Mar 22, 2023
alanakbik (Collaborator, Author) commented

Closed by #3157
