[WIP] Add method to cache many chunks at once #940
Conversation
mdata/cache/ccache_metric.go
Outdated
}

// if the previous chunk is cached, link in both directions
if _, ok := mc.chunks[prev]; ok {
in all but the first case, we know there was a prev
in all but the last case, we know there is a next.
we can do most of the inserts such that we immediately set the right Prev and Next fields
I think we can also pre-allocate the &CCacheChunk structs in a slice of len = len(itergens), for two benefits:
- 1 allocation instead of N (faster Adds)
- at read time, we're more likely to have a sequential read (prefetched from RAM)

obviously, needs benchmarking to see if this hunch is correct. Especially number 1 is interesting. I wouldn't spend time on 2 at this point.
mdata/cache/ccache_metric.go
Outdated
@@ -69,6 +69,66 @@ func (mc *CCacheMetric) Del(ts uint32) int {
	return len(mc.chunks)
}

func (mc *CCacheMetric) AddSlice(prev uint32, itergens []chunk.IterGen) {
this function should document its assumptions/requirements, which afaik are two-fold (correct me if i'm wrong):
- the sequence of chunks should be in ascending timestamp order
- the sequence should be complete (no gaps)

also, I don't like the name AddSlice; "slice" is just an implementation detail and doesn't describe the concept well enough. Perhaps AddSequence, AddWindow or AddRange is better.
great. looks like adding 50k at once is then about 30ms. acceptable.
mdata/cache/ccache_metric.go
Outdated
	mc.chunks[prev].Next = ts
	mc.chunks[ts].Prev = prev
}
prev = ts
this is buggy: higher up we continue if a chunk already exists, but then we don't update the prev value, so the next chunk we insert will set up its links in a way that skips over the pre-existing chunks.
BEFORE:
~/g/s/g/g/m/m/cache ❯❯❯ taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne      10000   1225701 ns/op   21688 B/op   5 allocs/op
BenchmarkAddingManyChunksAtOnce      1000000      1146 ns/op     118 B/op   1 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       14.713s

AFTER:
~/g/s/g/g/m/m/cache ❯❯❯ taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne      10000   1226088 ns/op   21688 B/op   5 allocs/op
BenchmarkAddingManyChunksAtOnce      1000000      1043 ns/op     118 B/op   1 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       14.620s
taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne    1000000      1057 ns/op     145 B/op   3 allocs/op
BenchmarkAddingManyChunksAtOnce      5000000       448 ns/op     123 B/op   1 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       15.433s
5dc80e5 to 387c401
taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne    1000000      1054 ns/op     145 B/op   3 allocs/op
BenchmarkAddingManyChunksAtOnce      5000000       427 ns/op     123 B/op   0 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       12.755s
should have been part of #830
da3a59c to a675396
it, err := itgen.Get()
if err != nil {
	// TODO(replay) figure out what to do if one piece is corrupt
	tracing.Failure(span)
	tracing.Errorf(span, "itergen: error getting iter from store slice %+v", err)
	if i > 0 {
		// add all the iterators that are in good shape
		s.Cache.AddRange(ctx.AMKey, prevts, storeIterGens[:i])
that's a nice idea, to still add those that were not corrupted
}

for _, itergen := range itergens {
	c.accnt.AddChunk(metric, itergen.Ts, itergen.Size())
@Dieterbe when you looked at the cpu profiles, did this method ever show up? If so, we could batch-add those as well. The benefit is probably going to be smaller, because this function just directly pushes the data into a channel. We could also push a whole batch into the channel instead of one-by-one and then optimize the accounting to handle batches, but I'm not sure there is any benefit to that, because in the request-handling thread this would only result in fewer function calls and fewer pushes into that channel.
i was wondering the same thing. it would definitely help with "Failed to submit event to accounting, channel was blocked". that's a different issue, but it shouldn't be possible to hit that situation in normal scenarios, so from that perspective we probably want to batch-add, and also batch other things that could overload the channel (like deletes, perhaps?)
mdata/cache/ccache_metric.go
Outdated
}

// add chunk if we don't have it yet (most likely)
if _, ok := mc.chunks[ts]; !ok {
in the exceptional case when this ok == true, then mc.chunks[ts].Next should still get set to itergens[1].Ts
mdata/cache/ccache_metric.go
Outdated
// handle the 2nd until the last-but-one
for i := 1; i < len(itergens)-1; i++ {
	itergen := itergens[i]
is there a need to allocate a new variable for itergen and ts in this scope? i think it should be fine to just reuse the already allocated ones
mdata/cache/ccache_metric.go
Outdated
ts = itergen.Ts

// if nextTs() can't figure out the end date it returns ts
next := mc.nextTsCore(itergen, ts, prev, 0)
if the chunk at ts is already cached (the ok on line 165 is true), then there is no need to figure out what next should be. So the block 152-162 could be moved down into that if block.
regenerateKeys is more expensive because it allocates, iterates over the map and typically has a more random input for sort.Sort

Note: BenchmarkAddingManyChunksOneByOne is a bit slower because previously it was buggy (we didn't always add the ts)

taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne    1000000      1114 ns/op     145 B/op   3 allocs/op
BenchmarkAddingManyChunksAtOnce      5000000       427 ns/op     123 B/op   0 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       12.856s
it's a better, more consistent name
(no perf change)

taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne    1000000      1114 ns/op     145 B/op   3 allocs/op
BenchmarkAddingManyChunksAtOnce      5000000       427 ns/op     123 B/op   0 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       12.853s
taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne    1000000      1112 ns/op     145 B/op   3 allocs/op
BenchmarkAddingManyChunksAtOnce      5000000       407 ns/op     123 B/op   0 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       12.700s
taskset --cpu-list 5 go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/grafana/metrictank/mdata/cache
BenchmarkAddingManyChunksOneByOne    1000000      1063 ns/op     145 B/op   3 allocs/op
BenchmarkAddingManyChunksAtOnce      5000000       408 ns/op     123 B/op   0 allocs/op
PASS
ok      github.com/grafana/metrictank/mdata/cache       12.671s
i'm rewriting the tests, adding a bunch more, and adding more benchmarks to cover more scenarios (e.g. adding in desc order, or adding data in globally desc order but in batches of asc-ordered chunks). then i'm rewriting the history so that we can leverage these additional tests and benchmarks as the various (performance and functional) changes are introduced.
I'm going to open a new PR for this.
@Dieterbe Looks like quite a major difference, although I still need to verify that the result is really the same and I'm not creating any corruption:
Will also deduplicate the Add() and AddSlice() methods; currently there's a lot of duplication between these two.