Sharding for the lazylru cache #15

Merged 3 commits into master from dangermike/NOTICKET/sharding on Jul 6, 2022

Conversation

dangermike (Contributor) commented:

Please see sharding/README.md for details.

@dangermike dangermike requested a review from a team as a code owner January 24, 2022 05:15
@dangermike dangermike marked this pull request as draft January 24, 2022 05:16
@dangermike dangermike force-pushed the dangermike/NOTICKET/sharding branch 4 times, most recently from e29db0e to 250bdb5 Compare February 2, 2022 07:00
@dangermike dangermike force-pushed the dangermike/NOTICKET/go1.18 branch 3 times, most recently from 0624280 to 13c6b9c Compare March 22, 2022 00:04
Base automatically changed from dangermike/NOTICKET/go1.18 to master March 22, 2022 00:07
@dangermike dangermike force-pushed the dangermike/NOTICKET/sharding branch from e182705 to 4b51a44 Compare March 22, 2022 19:47
@dangermike dangermike changed the base branch from master to dangermike/NOTICKET/benchmarks March 22, 2022 19:48
@dangermike dangermike force-pushed the dangermike/NOTICKET/sharding branch 2 times, most recently from b1955db to dfffe0b Compare March 22, 2022 20:02
@dangermike dangermike marked this pull request as ready for review March 22, 2022 20:02
@dangermike dangermike requested a review from a team as a code owner March 22, 2022 20:02
Base automatically changed from dangermike/NOTICKET/benchmarks to master March 22, 2022 20:10
generic/expected_stats_test.go (outdated review comment, resolved)
generic/sharded/sharded.go (outdated review comment, resolved)
Comment on lines +58 to +63
```go
for _, s := range slru.shards {
	if s.IsRunning() {
		return true
	}
}
return false
```
A reviewer asked:
Does atomicity matter here?

dangermike (author) replied:

The individual IsRunning calls are locked internally. Since we don't need an exact count, I don't think we need a global lock. Honestly, I'm not sure anyone will ever use this function in the real world. You'd start up the cache(s), do your thing, then shut down the process -- graceful shutdown won't really be a thing. If that seemed like a more likely use case, I'd probably make the Close call block.

Comment on lines +98 to +100
```go
for k, v := range slru.shards[shardIx].MGet(skeys...) {
	retval[k] = v
}
```
The reviewer asked:

Just curious - Did you benchmark this in separate goroutines?

dangermike (author) replied:

Yeah, the benchmarks are all concurrent. See the Run function in lazylru_benchmark_test.go.

The reviewer replied:

Sorry, that wasn't clear - I think (it's been a while, haha) what I meant was did you compare performance between this way (accessing each shard serially in a loop) vs spawning a new goroutine for each shard (with appropriate locking around the map write, of course)?

(Of course, performance probably changes with the expected number of shards - if it's relatively small I wouldn't expect goroutines to be a win.)
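
For illustration, here is a rough sketch of the fan-out approach the reviewer describes: one goroutine per shard, with a mutex serializing writes to the shared result map. The `mgetter` interface, the `keysByShard` argument, and the function name are assumptions made for this example; only `MGet` comes from the snippet above.

```go
package sharded

import "sync"

// mgetter is a stand-in for whatever type the shards actually are; it only
// needs the MGet method used in the snippet above.
type mgetter[K comparable, V any] interface {
	MGet(keys ...K) map[K]V
}

// mgetParallel is a hypothetical sketch: it queries each shard in its own
// goroutine and uses a mutex to guard the shared result map.
func mgetParallel[K comparable, V any](shards []mgetter[K, V], keysByShard [][]K) map[K]V {
	retval := make(map[K]V)
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	for shardIx, skeys := range keysByShard {
		if len(skeys) == 0 {
			continue
		}
		wg.Add(1)
		go func(shardIx int, skeys []K) {
			defer wg.Done()
			found := shards[shardIx].MGet(skeys...) // per-shard read, no shared state
			mu.Lock()                               // serialize writes to retval
			for k, v := range found {
				retval[k] = v
			}
			mu.Unlock()
		}(shardIx, skeys)
	}
	wg.Wait()
	return retval
}
```

Whether this beats the serial loop depends on the shard count and the cost of each `MGet`; with a handful of shards the goroutine and locking overhead may well dominate, which is the reviewer's caveat.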

generic/sharded/sharded.go (two more outdated review comments, resolved)
The current version of the cache uses `string` as the key and `interface{}` as the value. That fits the use case for which it was designed, but it is not as flexible as it could be. Go 1.18 generics created an opportunity to change that.

The [container/heap](https://pkg.go.dev/container/heap) package in the standard library doesn't support generics. I'm sure it will at some point, but for now, the source code was copied from the standard library and made generic. The `pq` and `lazylru` components were copied into a subpackage, `generic`. While the internals of the library (and especially the tests) are littered with type annotations, the external interface is pretty clean. Previously, the cache would be used like so:

```go
// import "github.com/TriggerMail/lazylru"

lru := lazylru.New(maxItems, ttl)
lru.Set("key", "value")
if v, ok := lru.Get("key"); ok {
	vstr, ok := v.(string)
	if !ok {
		panic("something terrible has happened")
	}
	fmt.Println(vstr)
}
```

The new version is a bit cleaner:

```go
// import "github.com/TriggerMail/lazylru/generic"

lru := lazylru.NewT[string, string](maxItems, ttl)
lru.Set("key", "value")
if v, ok := lru.Get("key"); ok {
	fmt.Println(v)
}
```

It's expected that the cache is going to be created at the start of a program and accessed many times, so the real win is the lack of casting on the `Get`. It is easy to put in a value when you mean a pointer or a pointer when you mean a value, but the generic version prevents that problem. The `panic` in the sample code above is maybe overkill, but the caller is likely to do _something_ to deal with the type. There is a performance impact to the casting, but it doesn't appear to be huge.
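
As an illustration of that point (a sketch in the style of the examples above, not code from the library), the interface-based API will happily store a `*string` where a `string` was intended, and the mistake only shows up at runtime; the generic `[string, string]` version rejects the `Set` call at compile time:

```go
// import "github.com/TriggerMail/lazylru"

value := "value"
lru := lazylru.New(maxItems, ttl)
lru.Set("key", &value) // a *string goes in where a string was intended
if v, ok := lru.Get("key"); ok {
	if _, ok := v.(string); !ok {
		// this branch runs: the stored value is a *string, so the assertion fails
		fmt.Println("wanted a string, stored a pointer")
	}
}
```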

In terms of caching performance, there was an improvement in all cases. I tested the original, interface-based implementation as well as a generic implementation of `[string, interface{}]` to mimic the interface type as closely as possible and a generic implementation of `[string, int]` to see what the improvement would be. Tests were run on an Apple MacBook Pro M1. An excerpt of the benchmark results is listed below:

```text
                           1% W, 99% R 99% W, 1% R
------------------------- ------------ ------------
interface-based            60.94 ns/op 107.80 ns/op
generic[string,interface]  54.21 ns/op  87.76 ns/op
generic[string,int]        53.24 ns/op  93.80 ns/op
```

* Separate interface and generic versions to allow the consumer to opt into the generic version
* Make testing work under go 1.18
* golang:rc-buster image
* go fmt
* installing go-junit-report properly
* adding test-results to .gitignore
* Building with Earthly to make life easier on myself
* Using revive rather than golangci-lint because golangci-lint doesn't work with go 1.18 yet
* Publishing coverage results to coveralls
   * On interface (top-level) only because goveralls doesn't like submodules
* README badges for coverage
* benchmark
	* results from n2-highcpu-64, 64 cores
	* results from MacBook Pro M1 14" 2021
	* thread-local math/rand - Using the shared RNG in math/rand has some locking in it that dominates the performance test at high thread counts (see the sketch after this list)
* README updates
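
As a hypothetical illustration of the thread-local `math/rand` item above (not the actual benchmark code from this repository), each parallel goroutine can own its own `rand.Rand`, so key selection does not contend on the mutex that guards the package-level source:

```go
package lazylru_test

import (
	"math/rand"
	"testing"
	"time"
)

// BenchmarkThreadLocalRNG is a sketch: one rand.Rand per goroutine instead of
// the shared, mutex-protected source behind the package-level rand functions.
func BenchmarkThreadLocalRNG(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		rng := rand.New(rand.NewSource(time.Now().UnixNano())) // per-goroutine RNG
		for pb.Next() {
			_ = rng.Intn(1000) // e.g. pick which key to read or write
		}
	})
}
```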
@dangermike dangermike force-pushed the dangermike/NOTICKET/sharding branch from f0088d9 to a7eb930 Compare July 6, 2022 13:51
@dangermike dangermike enabled auto-merge (squash) July 6, 2022 14:00
@dangermike dangermike merged commit cf518d8 into master Jul 6, 2022
@dangermike dangermike deleted the dangermike/NOTICKET/sharding branch July 6, 2022 14:53
@ghost ghost mentioned this pull request Jul 30, 2022
@ghost ghost mentioned this pull request Dec 5, 2022