Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RO sync: clear LRU for all log updates #399

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

art-w
Copy link
Contributor

@art-w art-w commented May 24, 2024

As reported by @edwintorok in #398, a read-only instance would sometimes fail to find the latest value after doing a sync, because it would keep finding the old value in its LRU cache (and ignore the more recent writes).

The quick-fix is to always clear the LRU whenever sync discovers that the log has changed on disk... It's not super satisfying as we could be more fine grained and update the LRU with the new bindings from the log, but it's the strategy the code was already using for some (but not all) log updates.

@edwintorok
Copy link

Clearing all entries is probably safer indeed. Another approach might be to have a generation counter on each entry, and then compare the generation counter of the file with the generation counter of the cached entry, and consider the entry stale if it is older.

@art-w
Copy link
Contributor Author

art-w commented May 27, 2024

Hmm I'm slightly worried of the performance impact of double-checking the LRU freshness on every query, it felt more predictable to only penalize the read-only instance for calling sync by clearing its LRU. Wouldn't the generation counter be equivalent to an LRU clear as it would invalidate all entries that are older than the latest update? (note that we only call Lru.clear if sync sees that an update happened, but otherwise the LRU is preserved)

That being said, I believe it should be possible for sync to update the LRU with the newer values read from the log, but the code change is more involved... Perhaps as a follow-up PR, unless you are already seeing poor performances caused by the LRU clear?

@art-w art-w requested a review from clecat May 27, 2024 08:52
@edwintorok
Copy link

Hmm I'm slightly worried of the performance impact of double-checking the LRU freshness on every query, it felt more predictable to only penalize the read-only instance for calling sync by clearing its LRU. Wouldn't the generation counter be equivalent to an LRU clear as it would invalidate all entries that are older than the latest update? (note that we only call Lru.clear if sync sees that an update happened, but otherwise the LRU is preserved)

You are right, keeping a RO client efficiently in-sync with another one is not really possible currently, and the generation counter wouldn't really help (you need to discard all LRU entries, whenever anything changes in the index, even if we have per-entry generation counters, we'd still need to read that counter to know whether it changed or not).

That being said, I believe it should be possible for sync to update the LRU with the newer values read from the log, but the code change is more involved... Perhaps as a follow-up PR, unless you are already seeing poor performances caused by the LRU clear?

Yes, that sounds like a better approach than what I was suggesting with the generation counters. Followup PR sounds good, currently I'm just evaluating whether I can use 'index' in a project, I'm not using it yet as such.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants