
Factor out the index cache implementation, add ristretto #849

Closed
wants to merge 19 commits

Conversation

@GiedriusS (Member) commented Feb 15, 2019

SimpleLRU is a nice, simple algorithm for storing items in a cache. However, it is not the most efficient, and it does not work well in our use case under high pressure: users tend to execute ad-hoc queries and refresh Grafana dashboards quite often, which leads to the same items being loaded over and over again.

I have factored out the index cache implementation into a separate file. This lets us swap in a different implementation easily. In addition, I have added an implementation based on the ristretto library, which uses TinyLFU underneath, a relatively new algorithm that shows surprisingly good performance (see the benchmarks here: https://github.com/dgraph-io/ristretto).

The current implementation kind of works.

Verification: `go test -v`.
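To make the idea concrete, here is a minimal sketch of hiding the index cache behind an interface so that the backing implementation can be swapped; the names and the trivial map-backed cache are hypothetical and only illustrate the shape, not the PR's actual code:

```go
package storecache

// IndexCache abstracts the index cache so that the backing implementation
// (SimpleLRU, ristretto/TinyLFU, ...) can be swapped without touching callers.
type IndexCache interface {
	// Set stores an item under the given key, possibly evicting other items.
	Set(key string, value []byte)
	// Get returns the cached value and whether it was found.
	Get(key string) ([]byte, bool)
}

// mapCache is a trivial, unbounded stand-in implementation, shown only to
// demonstrate that callers depend on the interface rather than a concrete cache.
type mapCache struct {
	items map[string][]byte
}

func newMapCache() *mapCache {
	return &mapCache{items: map[string][]byte{}}
}

func (c *mapCache) Set(key string, value []byte) { c.items[key] = value }

func (c *mapCache) Get(key string) ([]byte, bool) {
	v, ok := c.items[key]
	return v, ok
}
```

A ristretto-backed type would implement the same two methods, so choosing between implementations becomes a single constructor call at startup.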

@GiedriusS GiedriusS changed the title store/cache: switch to a 2Q algorithm [IDEA/WIP] store/cache: switch to a 2Q algorithm Feb 15, 2019
@bwplotka (Member)

Nice! I have not looked at this much, but there is definitely a huge area for improvements here. Can we think of / run some benchmarks to assess whether this change is valuable? It makes sense in theory (:

@GiedriusS (Member, Author)

Yes, I will rework this to make it a (hidden) option and will add some automated benchmarks, just like the other ones that we already have.

@bwplotka (Member)

@GiedriusS (Member, Author) commented Mar 12, 2019

I haven't seen that article, but I am aware of TinyLFU. It seems to me that it is the best new cache policy. I started this PR with 2Q since there is a mature implementation of it. On the other hand, I have only seen one experimental implementation of TinyLFU in Go, and it lacks some very important features that are mentioned in that article as well. So I guess we should wait until they implement it, or I will try it myself if I have the time. Or maybe someone else wants to do that (:

@GiedriusS (Member, Author)

It seems like that new library is here: https://github.com/dgraph-io/ristretto :) Hopefully the implementation will come soon.

@GiedriusS (Member, Author)

The library has finally been implemented, so now we can resume the work on this! 🎉

@GiedriusS GiedriusS changed the title [IDEA/WIP] store/cache: switch to a 2Q algorithm Factor out the index cache implementation, add TinyLFU Sep 13, 2019
@GiedriusS GiedriusS changed the title Factor out the index cache implementation, add TinyLFU Factor out the index cache implementation, add ristretto Sep 20, 2019
@GiedriusS (Member, Author)

@bwplotka do you think we should make this configurable? I would lean towards a one-time switch, since the performance should be miles ahead with this library 😃

@GiedriusS (Member, Author)

Unfortunately, ristretto has some problems:

@karlmcguire

@GiedriusS Collision checking is being finalized. We're investigating the memory leak issue; however, it appears to only occur under rare circumstances. Thanks for checking out Ristretto!

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Encode the key's type inside the value itself in the TinyLFU case.
This is needed so that we do not lose any information in the metrics
about the different types of keys. One byte for this seems like a good
enough trade-off to me.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
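As a rough sketch of what that commit describes (hypothetical helper names and constants, not the PR's exact code): the key type is stored as a single prefix byte of the cached value, so per-type metrics can still be reported after the value is read back:

```go
package storecache

// Key types encoded into the first byte of the cached value (illustrative values).
const (
	typePostings byte = 0
	typeSeries   byte = 1
)

// encodeValue prepends the key type so it survives the cache round trip.
func encodeValue(keyType byte, data []byte) []byte {
	out := make([]byte, 0, len(data)+1)
	out = append(out, keyType)
	return append(out, data...)
}

// decodeValue splits the type byte from the payload again.
func decodeValue(v []byte) (keyType byte, data []byte) {
	if len(v) == 0 {
		return 0, nil
	}
	return v[0], v[1:]
}
```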
Add an implementation for Purge() which calls Clear(). Fix the KeyData()
return value to return `false`, which is the correct value.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Two uint64 hashes are used.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
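For context on the "two uint64 hashes" commit: the idea is to identify each cached item by a pair of 64-bit hashes, for example one for lookup and one for collision checking. A hedged sketch of deriving such a pair with the standard library (illustrative only; the PR relies on the hashes computed by the underlying library):

```go
package storecache

import "hash/fnv"

// twoHashes derives two 64-bit hashes for a cache key: FNV-1a over the key
// itself, and FNV-1a over the key with a distinguishing prefix byte. This is
// purely illustrative of the "two uint64 hashes" approach.
func twoHashes(key string) (uint64, uint64) {
	h1 := fnv.New64a()
	h1.Write([]byte(key))

	h2 := fnv.New64a()
	h2.Write([]byte{0xff})
	h2.Write([]byte(key))

	return h1.Sum64(), h2.Sum64()
}
```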
@GiedriusS (Member, Author) commented Nov 16, 2019

@bwplotka how do you think the performance of this could be tested? WDYT about the PR in general? I have been running it for a bit on one node and it seems to work well.

stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 11, 2020
@stale stale bot closed this Jan 18, 2020
@GiedriusS (Member, Author)

Indeed, this is not relevant anymore, since these things should probably be pushed to an external provider like memcached, which could have flexible control of the admission policy.
