
Factor out the index cache implementation, add ristretto #849

Closed
wants to merge 19 commits

Conversation

@GiedriusS (Member) commented Feb 15, 2019

SimpleLRU is a nice, simple algorithm for storing items in a cache. However, it is not the most efficient, and it does not work well in our use case under high pressure: users tend to execute ad-hoc queries and refresh Grafana dashboards quite often, which leads to the same items being loaded over and over again.

I have factored out the index cache implementation into a separate file. This lets us swap in a different implementation easily. In addition, I have added an implementation based on the ristretto library, which uses TinyLFU underneath, a relatively new algorithm that shows surprisingly good performance (see the benchmarks here: https://github.com/dgraph-io/ristretto).

The current implementation kind of works.

Verification: `go test -v`.
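To make the idea concrete, here is a minimal sketch of hiding the index cache behind an interface so that the backing implementation can be swapped; the names and the trivial map-backed cache are hypothetical and only illustrate the shape, not the PR's actual code:

```go
package storecache

// IndexCache abstracts the index cache so that the backing implementation
// (SimpleLRU, ristretto/TinyLFU, ...) can be swapped without touching callers.
type IndexCache interface {
	// Set stores an item under the given key, possibly evicting other items.
	Set(key string, value []byte)
	// Get returns the cached value and whether it was found.
	Get(key string) ([]byte, bool)
}

// mapCache is a trivial, unbounded stand-in implementation, shown only to
// demonstrate that callers depend on the interface rather than a concrete cache.
type mapCache struct {
	items map[string][]byte
}

func newMapCache() *mapCache {
	return &mapCache{items: map[string][]byte{}}
}

func (c *mapCache) Set(key string, value []byte) { c.items[key] = value }

func (c *mapCache) Get(key string) ([]byte, bool) {
	v, ok := c.items[key]
	return v, ok
}
```

A ristretto-backed type would implement the same two methods, so choosing between implementations becomes a single constructor call at startup.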

@GiedriusS GiedriusS changed the title store/cache: switch to a 2Q algorithm [IDEA/WIP] store/cache: switch to a 2Q algorithm Feb 15, 2019
@bwplotka (Member)

Nice! I have not looked at this much, but there is definitely a huge area for improvements here. Can we think of / run some benchmarks to assess whether this change is valuable? It makes sense in theory (:

@GiedriusS (Member, Author)

Yes, I will rework this to make it a (hidden) option and will add some automated benchmarks, just like the other ones that we already have.

@bwplotka (Member)

@GiedriusS (Member, Author) commented Mar 12, 2019

I haven't seen that article, but I am aware of TinyLFU. It seems to me that it is the best new cache policy. I started this PR with 2Q since there is a mature implementation of it. On the other hand, I have only seen one experimental implementation of TinyLFU in Go, and it lacks some very important features that are mentioned in that article as well. So I guess we should wait until they implement it, or I will try it myself if I have the time. Or maybe someone else wants to do that (:

@GiedriusS (Member, Author)

It seems like that new library is here: https://github.com/dgraph-io/ristretto :) Hopefully the implementation will come soon.

@GiedriusS (Member, Author)

The library has finally been implemented, so now we can resume the work on this! 🎉

@GiedriusS GiedriusS changed the title [IDEA/WIP] store/cache: switch to a 2Q algorithm Factor out the index cache implementation, add TinyLFU Sep 13, 2019
@GiedriusS GiedriusS changed the title Factor out the index cache implementation, add TinyLFU Factor out the index cache implementation, add ristretto Sep 20, 2019
@GiedriusS (Member, Author)

@bwplotka do you think we should make this configurable? I would lean towards a one-time switch, since the performance should be miles ahead with this library 😃

@GiedriusS (Member, Author)

Unfortunately, ristretto has some problems:

@karlmcguire

@GiedriusS Collision checking is being finalized. We're investigating the memory leak issue; however, it appears to only occur under rare circumstances. Thanks for checking out Ristretto!

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Encode the key's type inside the value itself in the TinyLFU case.
This is needed so that we do not lose any information in the metrics
about the different types of keys. One byte for this seems like a good
enough trade-off to me.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
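As a rough sketch of what that commit describes (hypothetical helper names and constants, not the PR's exact code): the key type is stored as a single prefix byte of the cached value, so per-type metrics can still be reported after the value is read back:

```go
package storecache

// Key types encoded into the first byte of the cached value (illustrative values).
const (
	typePostings byte = 0
	typeSeries   byte = 1
)

// encodeValue prepends the key type so it survives the cache round trip.
func encodeValue(keyType byte, data []byte) []byte {
	out := make([]byte, 0, len(data)+1)
	out = append(out, keyType)
	return append(out, data...)
}

// decodeValue splits the type byte from the payload again.
func decodeValue(v []byte) (keyType byte, data []byte) {
	if len(v) == 0 {
		return 0, nil
	}
	return v[0], v[1:]
}
```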
Add an implementation for Purge() which calls Clear(). Fix the KeyData()
return value to return `false`, which is the correct value.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Two uint64 hashes are used.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
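For context on the "two uint64 hashes" commit: the idea is to identify each cached item by a pair of 64-bit hashes, for example one for lookup and one for collision checking. A hedged sketch of deriving such a pair with the standard library (illustrative only; the PR relies on the hashes computed by the underlying library):

```go
package storecache

import "hash/fnv"

// twoHashes derives two 64-bit hashes for a cache key: FNV-1a over the key
// itself, and FNV-1a over the key with a distinguishing prefix byte. This is
// purely illustrative of the "two uint64 hashes" approach.
func twoHashes(key string) (uint64, uint64) {
	h1 := fnv.New64a()
	h1.Write([]byte(key))

	h2 := fnv.New64a()
	h2.Write([]byte{0xff})
	h2.Write([]byte(key))

	return h1.Sum64(), h2.Sum64()
}
```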
@GiedriusS (Member, Author) commented Nov 16, 2019

@bwplotka how do you think the performance of this could be tested? WDYT about the PR in general? I have been running it for a bit on one node and it seems to work well.

stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 11, 2020
@stale stale bot closed this Jan 18, 2020
@GiedriusS (Member, Author)

Indeed, this is not relevant anymore, since these things should probably be pushed to an external provider like memcached, which could have flexible control of the admission policy.
