Sharding for the lazylru cache #15
Conversation
```go
for _, s := range slru.shards {
	if s.IsRunning() {
		return true
	}
}
return false
```
Does atomicity matter here?
The individual `IsRunning` calls are locked internally. Since we don't need an exact count, I don't think we need a global lock. Honestly, I'm not sure anyone will ever use this function in the real world. You'd start up the cache(s), do your thing, then shut down the process -- graceful shutdown won't really be a thing. If that seemed like a more likely use case, I'd probably make the `Close` call block.
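If blocking shutdown ever did become a real use case, this is roughly what it might look like. A hedged sketch only: the `ShardedLazyLRU` type name and its fields here are illustrative assumptions, not the actual types in this PR.

```go
package lazylru

import "sync"

// Illustrative only: the type name and fields are assumptions, not the
// actual sharded type in this PR.
type ShardedLazyLRU[K comparable, V any] struct {
	shards []*LazyLRU[K, V] // assumed per-shard cache type
	wg     sync.WaitGroup   // Add(1) per shard worker at startup; Done() on exit
}

// Close signals every shard to stop, then blocks until all of their
// background workers have exited.
func (slru *ShardedLazyLRU[K, V]) Close() {
	for _, s := range slru.shards {
		s.Close() // ask the shard to shut down its reaper goroutine
	}
	slru.wg.Wait() // block until every shard worker has finished
}
```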
```go
for k, v := range slru.shards[shardIx].MGet(skeys...) {
	retval[k] = v
}
```
Just curious - Did you benchmark this in separate goroutines?
Yeah, the benchmarks are all concurrent. See the `Run` function in lazylru_benchmark_test.go.
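For anyone skimming without the repo open, here is a minimal sketch of a concurrent benchmark in that spirit. This is not the actual `Run` helper from lazylru_benchmark_test.go, just an illustration built on `b.RunParallel`, which spins up `GOMAXPROCS` goroutines by default.

```go
package lazylru_test

import (
	"strconv"
	"testing"
	"time"

	lazylru "github.com/TriggerMail/lazylru/generic"
)

// A minimal concurrent benchmark sketch, not the repo's actual Run
// helper: a pool of goroutines hammers the cache with reads.
func BenchmarkConcurrentGet(b *testing.B) {
	lru := lazylru.NewT[string, int](1000, time.Minute)
	defer lru.Close()
	for i := 0; i < 1000; i++ {
		lru.Set(strconv.Itoa(i), i) // pre-populate the cache
	}
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			lru.Get(strconv.Itoa(i % 1000)) // concurrent reads
			i++
		}
	})
}
```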
Sorry, that wasn't clear - I think (it's been a while, haha) what I meant was: did you compare performance between this way (accessing each shard serially in a loop) vs spawning a new goroutine for each shard (with appropriate locking around the map write, of course)?
(Of course, performance probably changes with the expected number of shards - if it's relatively small I wouldn't expect goroutines to be a win.)
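To make the comparison concrete, here is a hedged sketch of the fan-out variant being suggested, reusing the illustrative `ShardedLazyLRU` type from the sketch above (so `keysByShard`, the method name, and a generic `MGet` are all assumptions, not the PR's actual code; a `sync` import is assumed):

```go
// Hypothetical fan-out variant: one goroutine per shard, with a mutex
// serializing writes into the shared result map.
func (slru *ShardedLazyLRU[K, V]) mgetParallel(keysByShard map[int][]K) map[K]V {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		retval = make(map[K]V)
	)
	for shardIx, skeys := range keysByShard {
		wg.Add(1)
		go func(shardIx int, skeys []K) {
			defer wg.Done()
			found := slru.shards[shardIx].MGet(skeys...) // per-shard lookup
			mu.Lock()
			for k, v := range found {
				retval[k] = v
			}
			mu.Unlock()
		}(shardIx, skeys)
	}
	wg.Wait()
	return retval
}
```

Whether this wins depends on the shard count and the per-shard batch size; for a small number of shards, the goroutine and mutex overhead can easily outweigh the serial loop.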
The current version of the cache uses `string` as the key and `interface{}` as the value. That fits the use case for which it was designed, but it is not as flexible as it could be. Go 1.18 generics created an opportunity to change that. The [container/heap](https://pkg.go.dev/container/heap) package in the standard library doesn't support generics. I'm sure it will at some point, but for now, the source code was copied from the standard library and made generic. The `pq` and `lazylru` components were copied into a subpackage, `generic`. While the internals of the library (and especially the tests) are littered with type annotations, the external interface is pretty clean.

Previously, the cache would be used like so:

```go
// import "github.com/TriggerMail/lazylru"

lru := lazylru.New(maxItems, ttl)
lru.Set("key", "value")
if v, ok := lru.Get("key"); ok {
	vstr, ok := v.(string)
	if !ok {
		panic("something terrible has happened")
	}
	fmt.Println(vstr)
}
```

The new version is a bit cleaner:

```go
// import "github.com/TriggerMail/lazylru/generic"

lru := lazylru.NewT[string, string](maxItems, ttl)
lru.Set("key", "value")
if v, ok := lru.Get("key"); ok {
	fmt.Println(v)
}
```

It's expected that the cache is going to be created at the start of a program and accessed many times, so the real win is the lack of casting on the `Get`. It is easy to put in a value when you mean a pointer or a pointer when you mean a value, but the generic version prevents that problem. The `panic` in the sample code above is maybe overkill, but the caller is likely to do _something_ to deal with the type. There is a performance impact to the casting, but it doesn't appear to be huge.

In terms of caching performance, there was an improvement in all cases. I tested the original, interface-based implementation as well as a generic implementation of `[string, interface{}]` to mimic the interface type as closely as possible, and a generic implementation of `[string, int]` to see what the improvement would be. Tests were run on an Apple MacBook Pro M1. An excerpt of the benchmark is listed below:

```text
                           1% W, 99% R   99% W, 1% R
-------------------------  ------------  ------------
interface-based             60.94 ns/op  107.80 ns/op
generic[string,interface]   54.21 ns/op   87.76 ns/op
generic[string,int]         53.24 ns/op   93.80 ns/op
```

* Separate interface and generic versions to allow the consumer to select the generic version
* Make testing work under go 1.18
* golang:rc-buster image
* go fmt
* installing go-junit-report properly
* adding test-results to .gitignore
* Building with Earthly to make life easier on myself
* Using revive rather than golangci-lint because golangci-lint doesn't work with go 1.18 yet
* Publishing coverage results to coveralls
  * On interface (top-level) only because goveralls doesn't like submodules
* README badges for coverage
* benchmark
  * results from n2-highcpu-64, 64 cores
  * results from MacBook Pro M1 14" 2021
* thread-local math/rand - Using the shared RNG in math/rand has some locking in it that dominates the performance test at high thread counts (see the sketch below)
* README updates
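The thread-local `math/rand` bullet is worth a concrete illustration. This is a hedged sketch of the idea rather than the benchmark's actual code: each worker goroutine builds its own `rand.Rand`, so random key selection doesn't serialize on the shared source's lock.

```go
package main

import (
	"math/rand"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	for w := 0; w < 8; w++ {
		wg.Add(1)
		go func(seed int64) {
			defer wg.Done()
			// Per-goroutine RNG: rand.Rand is not safe for concurrent
			// use, but giving each worker its own avoids the global
			// lock inside the package-level rand functions.
			rng := rand.New(rand.NewSource(seed))
			for i := 0; i < 1000; i++ {
				_ = rng.Intn(1000) // e.g., pick a random key index
			}
		}(time.Now().UnixNano() + int64(w))
	}
	wg.Wait()
}
```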
```go
for k, v := range slru.shards[shardIx].MGet(skeys...) {
	retval[k] = v
}
```
Sorry, that wasn't clear - I think (it's been a while, haha) what I meant was: did you compare performance between this way (accessing each shard serially in a loop) vs spawning a new goroutine for each shard (with appropriate locking around the map write, of course)?
(Of course, performance probably changes with the expected number of shards - if it's relatively small I wouldn't expect goroutines to be a win.)
Please see sharding/README.md for details.
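For readers who don't follow that link: the central trick under review is routing each key to a shard, typically by hashing. Here is a minimal, hedged sketch of that routing; the hash choice and function name are assumptions for illustration, not necessarily what sharding/README.md describes.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor is an illustrative guess at key-to-shard routing: hash the
// key, then take the hash modulo the shard count. The actual scheme
// lives in sharding/README.md and may differ.
func shardFor(key string, nShards int) int {
	h := fnv.New64a()
	h.Write([]byte(key)) // FNV-1a over the key bytes
	return int(h.Sum64() % uint64(nShards))
}

func main() {
	for _, k := range []string{"alpha", "beta", "gamma"} {
		fmt.Printf("%q -> shard %d\n", k, shardFor(k, 8))
	}
}
```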