-
Notifications
You must be signed in to change notification settings - Fork 107
refactor findCache to use fewer resources #1263
Conversation
930a1db
to
83b49de
Compare
49bbfa6
to
a41534c
Compare
this will help to optimize invalidation find calls by doing them in batches of multiple paths
also it's a multi-tenant queue
a41534c
to
460eba5
Compare
460eba5
to
ea96728
Compare
having it per-tenant means we need tenant aware invalidation processing which would just be too complicated (e.g. different batches for different tenants, how do you prioritize smartly without starving, etc) We can re-assess later.
For more info see smartystreets/goconvey#497 TLDR: without this, I was getting unit test failures like so: 0 total assertions --- FAIL: TestFindCache (0.00s) panic: Top-level calls to Convey(...) need a reference to the *testing.T. Hint: Convey("description here", t, func() { /* notice that the second argument was the *testing.T (t)! */ }) [recovered] panic: Top-level calls to Convey(...) need a reference to the *testing.T. Hint: Convey("description here", t, func() { /* notice that the second argument was the *testing.T (t)! */ }) [recovered] panic: Top-level calls to Convey(...) need a reference to the *testing.T. Hint: Convey("description here", t, func() { /* notice that the second argument was the *testing.T (t)! */ }) goroutine 11 [running]: testing.tRunner.func1(0xc0000de400) /usr/lib/go/src/testing/testing.go:830 +0x392 panic(0xa28be0, 0xc000129290) /usr/lib/go/src/runtime/panic.go:522 +0x1b5 github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.(*context).conveyInner.func2(0xc00007cd80) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/context.go:232 +0x1c6 panic(0xa28be0, 0xc000129290) /usr/lib/go/src/runtime/panic.go:522 +0x1b5 github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.conveyPanic(...) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/context.go:20 github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.rootConvey(0xc000067cb8, 0x2, 0x2) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/context.go:91 +0x35c github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.Convey(0xc000067cb8, 0x2, 0x2) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/doc.go:75 +0xbf github.com/grafana/metrictank/idx/memory.TestFindCache.func1() /home/dieter/go/src/github.com/grafana/metrictank/idx/memory/find_cache_test.go:88 +0xe7 github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.parseAction.func1(0xbce120, 0xc00007cd80) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/discovery.go:80 +0x24 github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.(*context).conveyInner(0xc00007cd80, 0xae977f, 0x17, 0xc00012cb20) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/context.go:261 +0x154 github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.rootConvey.func1() /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/context.go:110 +0xf1 github.com/grafana/metrictank/vendor/github.com/jtolds/gls._m(0x0, 0xc00000ff20) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/jtolds/gls/stack_tags.go:39 +0x31 github.com/grafana/metrictank/vendor/github.com/jtolds/gls.markS(...) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/jtolds/gls/stack_tags.go:16 github.com/grafana/metrictank/vendor/github.com/jtolds/gls.addStackTag(...) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/jtolds/gls/stack_tags.go:13 github.com/grafana/metrictank/vendor/github.com/jtolds/gls.(*ContextManager).SetValues(0xc00000fd80, 0xc0001291d0, 0xc00000ff20) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/jtolds/gls/context.go:92 +0x3fd github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.rootConvey(0xc000067f68, 0x3, 0x3) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/context.go:105 +0x22b github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey.Convey(0xc000067f68, 0x3, 0x3) /home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/smartystreets/goconvey/convey/doc.go:75 +0xbf github.com/grafana/metrictank/idx/memory.TestFindCache(0xc0000de400) /home/dieter/go/src/github.com/grafana/metrictank/idx/memory/find_cache_test.go:86 +0x99 testing.tRunner(0xc0000de400, 0xb05e78) /usr/lib/go/src/testing/testing.go:865 +0xc0 created by testing.(*T).Run /usr/lib/go/src/testing/testing.go:916 +0x35a exit status 2 FAIL github.com/grafana/metrictank/idx/memory 0.008s
should delete all orgs + make calleable from test
3d343ab
to
d59a773
Compare
235324f
to
569d24e
Compare
idx/memory/find_cache.go
Outdated
@@ -69,22 +96,26 @@ func (c *FindCache) Get(orgId uint32, pattern string) ([]*Node, bool) { | |||
func (c *FindCache) Add(orgId uint32, pattern string, nodes []*Node) { | |||
c.RLock() | |||
cache, ok := c.cache[orgId] | |||
t := c.backoff[orgId] | |||
backoff := findCacheBackoff.Peek() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach, but findCacheBackoff cant be a global, it needs to be part of the FindCache struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, findCacheBackoff metric doesnt work at all with the partitioned Index, as there are multiple findCaches that could all have different states.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why, to make it clear that no other code in the package should mess with it? if that, we could solve it with a better comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just add a "backoff" field to the FindCache struct. Then in triggerBackoff() i would
time.AfterFunc(c.backoffTime, func() {
findCacheBackoff.SetFalse
c.Lock()
c.backoff = false
c.Unlock()
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why, to make it clear that no other code in the package should mess with it
It is just a bad idea to mix structs and fields together. you can have multiple instances of a findCache at any one time, so controlling how the instance behaves through a global variable is just a recipe for disaster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, findCacheBackoff metric doesnt work at all with the partitioned Index, as there are multiple findCaches that could all have different states.
hmm I didn't consider this. good catch. will follow your suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually to accommodate PartitionedIndex, we can't use a backoff bool either. maybe it should be a gauge that is incremented and decremented.
6d19e6e
to
7aa469a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I did an experiment with docker stack to confirm the improvements. (also with #1265 applied) Did it like so:
run make and then start these scripts at the same time:
v0.11.0-185-ga7c8203dhttps://snapshot.raintank.io/dashboard/snapshot/esH0WpwdEXf3RUd2konLB21B4NOgRVVe?orgId=2
v0.11.0-219-g88830e3https://snapshot.raintank.io/dashboard/snapshot/aeX0vbOIQswQN6tycV2Xl7S9oKx9l8m8?orgId=2
conclusion:
|
fix #1262