not looking too promising.
(force-pushed from af9b8d0 to 952d194)
PLOT TWIST: see:
after applying #943: the `logger.Error` overhead is coming from the `FlatAccnt.act()` errors, so moving to batched events should save about 10% CPU.
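For context, here is a minimal sketch (not the actual metrictank code) of the non-blocking submit that produces those errors, and of a batched variant. Only the names `act`, `evnt_hit_chnks`, `HitsPayload`, and the error message come from the diff further down; the field types and everything else are assumptions for illustration:

```go
package accnt

import "log"

// FlatAccntEvent, act, and the evnt_* names mirror the diff further down in
// this thread; the field types and everything else here are assumptions.
type FlatAccntEvent struct {
	t  uint8
	pl interface{}
}

const (
	evnt_hit_chnk  uint8 = iota // one event per chunk hit
	evnt_hit_chnks              // one event per batch of chunk hits
)

// HitsPayload carries one metric and all the chunk timestamps that were hit,
// so a whole render request can be accounted for with a single event.
type HitsPayload struct {
	metric string
	chunks []uint32
}

type FlatAccnt struct {
	eventQ chan FlatAccntEvent
}

// act submits an event without blocking; when the channel is full, the event
// is dropped and an error line is printed. With one event per chunk, this
// log line can end up dominating CPU under heavy read load.
func (a *FlatAccnt) act(t uint8, pl interface{}) {
	select {
	case a.eventQ <- FlatAccntEvent{t: t, pl: pl}:
	default:
		log.Println("Failed to submit event to accounting, channel was blocked")
	}
}

// HitChunks submits a single batched event covering all hit chunks of a
// metric: fewer channel sends, less contention, and far fewer dropped-event
// error lines than one act() call per chunk.
func (a *FlatAccnt) HitChunks(metric string, chunks []uint32) {
	a.act(evnt_hit_chnks, &HitsPayload{metric: metric, chunks: chunks})
}
```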
I ran the experiment above again (with the chunkcache AddRange change merged, but not yet the batched accounting). Differences with the previous run: CPU at 250%, full cache hits, and no Cassandra hits once cached. (This doesn't match my previous observation, where I said CPU was lower and we only had partial cache hits, leading to Cassandra requests.)
https://snapshot.raintank.io/dashboard/snapshot/9j0cGMOzq3sz9b4E7zSO0yIfUa6Wb9dW?orgId=2

batched accounting:
https://snapshot.raintank.io/dashboard/snapshot/EwZbEhluRrfeaurmDwxKNpS67Ax7Z0Zx?orgId=2

conclusion: 20% CPU reduction
1) less contention on the channel
2) prevent the channel overflowing
3) less CPU spent printing `Failed to submit event to accounting, channel was blocked`

For the record, when I patch the code like so:

```diff
diff --git a/mdata/cache/accnt/flat_accnt.go b/mdata/cache/accnt/flat_accnt.go
index 4097e067..d202960c 100644
--- a/mdata/cache/accnt/flat_accnt.go
+++ b/mdata/cache/accnt/flat_accnt.go
@@ -1,7 +1,9 @@
 package accnt

 import (
+	"fmt"
 	"sort"
+	"time"

 	"github.com/grafana/metrictank/mdata/chunk"
 	"github.com/raintank/worldping-api/pkg/log"
@@ -178,6 +180,7 @@ func (a *FlatAccnt) eventLoop() {
 				},
 			)
 		case evnt_add_chnks:
+			pre := time.Now()
 			payload := event.pl.(*AddsPayload)
 			a.addRange(payload.metric, payload.chunks)
 			cacheChunkAdd.Add(len(payload.chunks))
@@ -189,6 +192,7 @@ func (a *FlatAccnt) eventLoop() {
 					},
 				)
 			}
+			fmt.Println("CODE evnt_add_chnks took", time.Since(pre).Nanoseconds())
 		case evnt_hit_chnk:
 			payload := event.pl.(*HitPayload)
 			a.lru.touch(
@@ -198,6 +202,7 @@ func (a *FlatAccnt) eventLoop() {
 				},
 			)
 		case evnt_hit_chnks:
+			pre := time.Now()
 			payload := event.pl.(*HitsPayload)
 			for _, chunk := range payload.chunks {
 				a.lru.touch(
@@ -207,6 +212,7 @@ func (a *FlatAccnt) eventLoop() {
 					},
 				)
 			}
+			fmt.Println("CODE evnt_hit_chnks took", time.Since(pre).Nanoseconds())
```

and run the workload described in #931 (queries for 2y of data across 500 series), I get these durations:

```
awk '/evnt_add_chnks/ {print $4}' code2.txt |goplot hist
 1187.00 ->  11160892.40: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 850
             22320597.80: ▇▇ 39 (upto 22 ms)
             33480303.20: ▇ 17 (upto 33 ms)
             44640008.60: ▏ 5 (upto 45 ms)
             55799714.00: ▏ 5 (upto 56 ms)
             66959419.40: ▏ 1 (upto 67 ms)
             78119124.80: ▏ 0
             89278830.20: ▏ 0
            100438535.60: ▏ 0
            111598241.00: ▏ 1 (upto 111 ms)

awk '/evnt_hit_chnks/ {print $4}' code2.txt |goplot hist
   45.00 ->  22366478.50: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 7624
             44732912.00: ▏ 65 (upto 45 ms)
             67099345.50: ▏ 29 (upto 67 ms)
             89465779.00: ▏ 8 (upto 89 ms)
            111832212.50: ▏ 1 (upto 112 ms)
            134198646.00: ▏ 1 (upto 134 ms)
            156565079.50: ▏ 0
            178931513.00: ▏ 1 (upto 179 ms)
            201297946.50: ▏ 0
            223664380.00: ▏ 1 (upto 224 ms)
```
(force-pushed from 952d194 to e8535cc)
The original problem is now addressed via #943. I just pushed these updates to this PR (NOTE: when merging, we need to add nil checks for the new ccache methods introduced in #943):
* CassandraStore.Search: instead of doing a query for each row, do a single query across all rows. With 4-week rows, the per-row approach meant a lot of queries for long-timerange queries (see the sketch after this list).
  * outcome -> readResult
  * no more need to sort readResults (there is only one); however we must now sort all chunks coming from Cassandra.
  * change: store.cassandra.chunks_per_row to store.cassandra.chunks_per_response
* pre-prepared queries for better performance
* remove a bunch of unneeded cruft
* by keeping a few properties in the Table definition we can clean up a bunch of code
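As a rough sketch of what the single cross-row read could look like (assuming gocql; the table name, the `<id>_<month>` row-key format, and the helper itself are illustrative, not the actual CassandraStore.Search code — only QueryFmtRead comes from this PR):

```go
package cassandra

import (
	"fmt"
	"sort"

	"github.com/gocql/gocql"
)

const QueryFmtRead = "SELECT ts, data FROM %s WHERE key IN ? AND ts < ?"

type chunkRow struct {
	ts   int64
	data []byte
}

// searchAllRows issues one query covering every row key in the requested
// time range, instead of one query per 4-week row.
func searchAllRows(session *gocql.Session, table, id string, months []int, until int64) ([]chunkRow, error) {
	keys := make([]string, 0, len(months))
	for _, month := range months {
		// assumed row-key format for illustration: "<id>_<month>"
		keys = append(keys, fmt.Sprintf("%s_%d", id, month))
	}

	iter := session.Query(fmt.Sprintf(QueryFmtRead, table), keys, until).Iter()
	var out []chunkRow
	var ts int64
	var data []byte
	for iter.Scan(&ts, &data) {
		// defensive copy of the chunk bytes
		out = append(out, chunkRow{ts: ts, data: append([]byte(nil), data...)})
	}
	if err := iter.Close(); err != nil {
		return nil, err
	}

	// there is only one readResult now, but since it spans several rows the
	// chunks must be sorted here rather than per row
	sort.Slice(out, func(i, j int) bool { return out[i].ts < out[j].ts })
	return out, nil
}
```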
(force-pushed from e8535cc to fa42728)
const QueryFmtRead = "SELECT ts, data FROM %s WHERE key IN ? AND ts < ?"
const QueryFmtWrite = "INSERT INTO %s (key, ts, data) values(?,?,?) USING TTL ?"

// TTLTables stores table definitions keyed by their TTL
am i right that the TTLs are in hours? if so i'd specify that, otherwise it's very easy to assume that's seconds
they are not
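For readers following along, a minimal sketch of how TTL-keyed table definitions could hang together. Only `TTLTables` and the two query format constants come from the diff above; the `Table` fields, the `ttlTables` helper, and the seconds assumption (the thread above only rules out hours) are illustrative:

```go
package cassandra

import "fmt"

const QueryFmtRead = "SELECT ts, data FROM %s WHERE key IN ? AND ts < ?"
const QueryFmtWrite = "INSERT INTO %s (key, ts, data) values(?,?,?) USING TTL ?"

// Table keeps the per-table properties needed to build queries, so callers
// don't have to re-derive them each time.
type Table struct {
	Name       string
	QueryRead  string
	QueryWrite string
	TTL        uint32 // assumed to be seconds, not hours
}

// TTLTables stores table definitions keyed by their TTL
type TTLTables map[uint32]Table

// ttlTables pre-formats the read/write queries once per table, so the hot
// path only binds values instead of rebuilding query strings.
func ttlTables(ttls []uint32, nameFormat string) TTLTables {
	tables := make(TTLTables)
	for _, ttl := range ttls {
		name := fmt.Sprintf(nameFormat, ttl)
		tables[ttl] = Table{
			Name:       name,
			QueryRead:  fmt.Sprintf(QueryFmtRead, name),
			QueryWrite: fmt.Sprintf(QueryFmtWrite, name),
			TTL:        ttl,
		}
	}
	return tables
}
```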
LGTM
problem
when customers query 2 years' worth of data, because we use 4-weekly rowkeys we read from 27 different row keys, each with a separate query. When using avgConsolidator, multiply by 2 (we read sum and cnt), so 54 queries. This is per series.
This customer was querying 743 series, resulting in 743*54 = 40k read queries (each generally completing in under 10ms, but still causing a lot of latency overall).
We speculate that by doing fewer queries, each hitting multiple rowkeys at once, we can get better performance. I'm currently testing this.
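To make the arithmetic above concrete, a quick sketch (the 4-week row window and the ×2 for avgConsolidator come from the description; the helper itself is just an illustration, not metrictank code):

```go
// Back-of-the-envelope for the numbers above: a 2-year range with 4-week
// rows spans roughly 27 row keys (the range rarely aligns with row
// boundaries, so it touches one more row than the plain division suggests),
// avgConsolidator doubles that (sum + cnt are read separately), and every
// series pays the full cost.
package main

import "fmt"

const rowWindow = 4 * 7 * 24 * 3600 // 4-week rows, in seconds

func queriesPerRequest(rangeSeconds, series int, avgConsolidator bool) int {
	rows := rangeSeconds/rowWindow + 1 // +1 for the boundary-straddling row
	perSeries := rows
	if avgConsolidator {
		perSeries *= 2
	}
	return perSeries * series
}

func main() {
	twoYears := 2 * 365 * 24 * 3600
	// 27 rows * 2 * 743 series = 40122, i.e. the ~40k read queries above
	fmt.Println(queriesPerRequest(twoYears, 743, true))
}
```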
procedure
wait until up. then (use latest fakemetrics code):
this should complete in about 3 minutes.
load MT dashboard and confirm it ingested data at 100kHz for 3 minutes
when done: in grafana look at
some.id.of.a.metric.{1,72,199,320,438};id=*;some=tag
to confirm data looks good (over 2y)

http benchmarking
when I was trying with only 1 series:
pre and post were both about 1.46 to 1.48s
same but with more series
input:
pre
post
ram/cpu/io:
https://snapshot.raintank.io/dashboard/snapshot/K3C0XYLkWl4Vjq3nZtI5oih7wLek4h0J?orgId=2
MT dashboard:
https://snapshot.raintank.io/dashboard/snapshot/OrKp30bsRTMfS2pBsXNBWw6r48rV224I?orgId=2
conclusion