Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ledger] Replace LRU cache with a FIFO queue (circular buffer) #2893

Conversation

ramtinms
Copy link
Contributor

@ramtinms ramtinms commented Jul 29, 2022

This PR

  • replaces the LRU cache used by the forest with a FIFO queue implemented with a circular buffer.
  • implements an adjusted copy of the great buffer @fxamacker has written for the checkpointer. (added lookup and mutex functionality)
  • updates the ledger to still be able to read trie removal entries from WAL but don't do anything about them and don't capture these types of entries going forward (backward compatibility for old version WALs).

But, Why?
The initial decision to use an LRU cache had roots in another type of design considered for the ledger. ideally, the ledger should only care about purging the oldest trie added to the buffer and get trie calls having any impact on which trie is going to be evicted. This would also make the checkpointer and forest to be in harmony with how they deal with trie updates.

Why LRU cache was a bad idea? in the LRU version scripts would change the order of tries kept in the cache, and this could result in issues loosing the head of the trie we need to build blocks. in other words, one might send a lot of scripts targeting older tries with the hope to purge a necessary trie in memory and halt the execution.

@ramtinms ramtinms requested a review from fxamacker July 29, 2022 16:10
@ramtinms ramtinms changed the title [Ledger] Replace LRU cache with a FIFO circular buffer [Ledger] Replace LRU cache with a FIFO queue (circular buffer) Jul 29, 2022
Copy link
Member

@fxamacker fxamacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work and amazing how insanely fast you created this PR, right after talking about it!

Some suggestions:

  1. Replace sync.Mutex with sync.RWMutex in buffer.
  2. Add forest.PurgeCacheExcept() test in forest_test.go.
  3. Replace references of LRU cache to FIFO queue in docs in ledger.go and forest.go.

First item is in the code comments but 2 and 3 are just here because it is unmodified by this PR.

Also, maybe rename Buffer to be more specific if you'd like. I don't have a better name for it 😆

ledger/complete/mtrie/forest.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/buffer.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/buffer.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/buffer.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/buffer.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/buffer.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/buffer.go Outdated Show resolved Hide resolved
@ramtinms ramtinms requested a review from fxamacker July 29, 2022 20:29

tries := make([]*trie.MTrie, tc.count)

if tc.isFull() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a random though: if we used a double-linked list, would the logic here be a bit simpler?

Copy link
Contributor

@pattyshack pattyshack Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array is much cheaper than link list

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree! that said, given that we use map for O(1) access it should not affect perf too much (esp. given how infrequently we should update the forest.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is going to be a follow up PR for improvements, going to consider your suggestions in the PR.

@SaveTheRbtz SaveTheRbtz requested a review from pattyshack July 29, 2022 20:58
ledger/complete/mtrie/trieCache.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/trieCache.go Outdated Show resolved Hide resolved
@@ -145,7 +145,6 @@ func BenchmarkLoadCheckpointAndWALs(b *testing.B) {
return err
},
func(rootHash ledger.RootHash) error {
forest.RemoveTrie(rootHash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With removal of removal, maybe we can get out of it completely, remove WALDelete (and WALUpdate since now we have only one operation, and all related code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need it for reading old WALs when we spork, we could get rid of it after next spork.

ledger/complete/mtrie/forest_test.go Show resolved Hide resolved
ledger/complete/mtrie/trieCache.go Show resolved Hide resolved
ledger/complete/mtrie/trieCache.go Outdated Show resolved Hide resolved
ledger/complete/mtrie/trieCache_test.go Show resolved Hide resolved
Comment on lines -584 to -587
// test deletion
s := led2.ForestSize()
assert.Equal(t, s, size)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't do removals any more so this part is not relevant anymore

defer tc.lock.RUnlock()

if tc.count == 0 {
return nil

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion, if you return an empty slice, it would make the job of caller easier. they won't have to check for nil anymore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @taabodim we are going to have a follow up PR for improvements, going to include your suggestions in that PR.

@ramtinms ramtinms merged commit ec132df into fxamacker/reuse-mtrie-state-for-checkpointing-2 Aug 2, 2022
@ramtinms ramtinms deleted the ramtin/replace-ledger-forest-lru-cache branch August 2, 2022 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants