Skip to content
This repository was archived by the owner on Aug 23, 2023. It is now read-only.

Panic when merging series #1095

Closed
DanCech opened this issue Oct 12, 2018 · 7 comments · Fixed by #1452
Closed

Panic when merging series #1095

DanCech opened this issue Oct 12, 2018 · 7 comments · Fixed by #1452
Assignees
Milestone

Comments

@DanCech
Copy link
Contributor

DanCech commented Oct 12, 2018

The code on https://github.com/grafana/metrictank/blob/master/api/dataprocessor.go#L624 assumes when merging series that they all have the same number of data points. If that is not true, then it can panic for "runtime error: index out of range"

@Dieterbe
Copy link
Contributor

@DanCech do you know this can be reproduced ?

@Dieterbe
Copy link
Contributor

likely same as #761

@Dieterbe
Copy link
Contributor

Dieterbe commented Oct 14, 2018

Looking at the customer's query, I see that we have series with the same name and different intervals (10 and 60), which get merged

While I didn't have a tool handily available to exercise what happens with series changing intervals, I am able to get close by sending different series and just measuring their length like so:

diff --git a/api/dataprocessor.go b/api/dataprocessor.go
index eb69a088..9932a12f 100644
--- a/api/dataprocessor.go
+++ b/api/dataprocessor.go
@@ -614,6 +614,7 @@ func mergeSeries(in []models.Series) []models.Series {
        for _, series := range seriesByTarget {
                if len(series) == 1 {
                        merged[i] = series[0]
+                       fmt.Println("len series is", len(series[0].Datapoints))
                } else {
                        //we use the first series in the list as our result.  We check over every
                        // point and if it is null, we then check the other series for a non null
diff --git a/scripts/config/storage-schemas.conf b/scripts/config/storage-schemas.conf
index 2d12a1d7..38c7188d 100644
--- a/scripts/config/storage-schemas.conf
+++ b/scripts/config/storage-schemas.conf
@@ -50,6 +50,14 @@
 # This example has 3 retention definitions, the first and last override some default options (to use 10minutely and 2hourly chunks and only keep one of them in memory
 # and the last rollup is marked as not ready yet for querying.
 
+[a]
+pattern = some.id.of.a.metric.(1|3)
+retentions = 10s:35d
+
+[b]
+pattern = some.id.of.a.metric.(2|4)
+retentions = 600s:35d
+

then, launching docker-dev,
then:

fakemetrics backfill --carbon-addr localhost:2003 --offset 120h --period 10s --speedup 10000 --mpo 4

confirm the series:

./mt-index-cat -prefix some -max-stale 0 cass -hosts localhost:9042 -schema-file ../scripts/config/schema-idx-cassandra.toml '{{.Name}} {{.Interval}}\n'
2018-10-14 23:26:58.160 [INFO] cassandra-idx: ensuring that keyspace metrictank exist.
2018-10-14 23:26:58.161 [INFO] cassandra-idx: ensuring that table metric_idx exist.
some.id.of.a.metric.1 10
some.id.of.a.metric.4 600
some.id.of.a.metric.3 10
some.id.of.a.metric.2 600
$ curl 'http://localhost:6060/render/?from=1539274680&until=1539361140&target=some.id.of.a.metric.*'

this works fine but shows that it would break if they were to be merged:

metrictank_1   | [Macaron] 2018-10-14 21:36:40: Started GET /render/?from=1539274680&until=1539361140&target=some.id.of.a.metric.* for 192.168.16.1
metrictank_1   | [Macaron] 2018-10-14 21:36:40: Started POST /index/find for 127.0.0.1
metrictank_1   | [Macaron] 2018-10-14 21:36:40: Completed /index/find 200 OK in 470.393µs
metrictank_1   | len series is 145
metrictank_1   | len series is 145
metrictank_1   | len series is 144
metrictank_1   | len series is 144
metrictank_1   | [Macaron] 2018-10-14 21:36:40: Completed /render/?from=1539274680&until=1539361140&target=some.id.of.a.metric.* 200 OK in 3.187747ms

likewise, can trigger a similar problem in sumSeries like so:

$ curl 'http://localhost:6060/render/?from=1539274680&until=1539361140&target=sum(some.id.of.a.metric.*)'
<html>
<head><title>PANIC: runtime error: index out of range</title>
<meta charset="utf-8" />
<style type="text/css">
html, body {
	font-family: "Roboto", sans-serif;
	color: #333333;
	background-color: #ea5343;
	margin: 0px;
}
h1 {
	color: #d04526;
	background-color: #ffffff;
	padding: 20px;
	border-bottom: 1px dashed #2b3848;
}
pre {
	margin: 20px;
	padding: 20px;
	border: 2px solid #2b3848;
	background-color: #ffffff;
	white-space: pre-wrap;       /* css-3 */
	white-space: -moz-pre-wrap;  /* Mozilla, since 1999 */
	white-space: -pre-wrap;      /* Opera 4-6 */
	white-space: -o-pre-wrap;    /* Opera 7 */
	word-wrap: break-word;       /* Internet Explorer 5.5+ */
}
</style>
</head><body>
<h1>PANIC</h1>
<pre style="font-weight: bold;">runtime error: index out of range</pre>
<pre>/usr/lib/go/src/runtime/panic.go:513 (0x42be48)
/usr/lib/go/src/runtime/panic.go:44 (0x42ac89)
/home/dieter/go/src/github.com/grafana/metrictank/expr/seriesaggregators.go:122 (0xac3988)
/home/dieter/go/src/github.com/grafana/metrictank/expr/func_aggregate.go:49 (0xaa56fb)
/home/dieter/go/src/github.com/grafana/metrictank/expr/plan.go:223 (0xac2f5c)
/home/dieter/go/src/github.com/grafana/metrictank/api/graphite.go:696 (0xb03161)
/home/dieter/go/src/github.com/grafana/metrictank/api/graphite.go:187 (0xafe0c7)
/home/dieter/go/src/github.com/grafana/metrictank/api/routes.go:61 (0xb19fb4)
/usr/lib/go/src/runtime/asm_amd64.s:525 (0x45a4c4)
/usr/lib/go/src/reflect/value.go:447 (0x4b7978)
/usr/lib/go/src/reflect/value.go:308 (0x4b7413)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:177 (0xa36e43)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:137 (0xa367a9)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:113 (0xa52940)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:104 (0xa5285e)
/home/dieter/go/src/github.com/grafana/metrictank/api/middleware/tracer.go:67 (0xa6f3c7)
/usr/lib/go/src/runtime/asm_amd64.s:522 (0x45a2fa)
/usr/lib/go/src/reflect/value.go:447 (0x4b7978)
/usr/lib/go/src/reflect/value.go:308 (0x4b7413)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:177 (0xa36e43)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:137 (0xa367a9)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:113 (0xa52940)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:104 (0xa5285e)
/home/dieter/go/src/github.com/grafana/metrictank/api/middleware/stats.go:76 (0xa6eb58)
/usr/lib/go/src/runtime/asm_amd64.s:522 (0x45a2fa)
/usr/lib/go/src/reflect/value.go:447 (0x4b7978)
/usr/lib/go/src/reflect/value.go:308 (0x4b7413)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:177 (0xa36e43)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:137 (0xa367a9)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:113 (0xa52940)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:104 (0xa5285e)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/recovery.go:161 (0xa6292a)
/usr/lib/go/src/runtime/asm_amd64.s:522 (0x45a2fa)
/usr/lib/go/src/reflect/value.go:447 (0x4b7978)
/usr/lib/go/src/reflect/value.go:308 (0x4b7413)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:177 (0xa36e43)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:137 (0xa367a9)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:113 (0xa52940)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:104 (0xa5285e)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/logger.go:43 (0xa61c80)
/usr/lib/go/src/runtime/asm_amd64.s:522 (0x45a2fa)
/usr/lib/go/src/reflect/value.go:447 (0x4b7978)
/usr/lib/go/src/reflect/value.go:308 (0x4b7413)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:177 (0xa36e43)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/github.com/go-macaron/inject/inject.go:137 (0xa367a9)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/context.go:113 (0xa52940)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/router.go:184 (0xa63b24)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/router.go:286 (0xa5d5ee)
/home/dieter/go/src/github.com/grafana/metrictank/vendor/gopkg.in/macaron.v1/macaron.go:177 (0xa56771)
/usr/lib/go/src/net/http/server.go:2741 (0x6f304a)
/usr/lib/go/src/net/http/server.go:1847 (0x6ef3f5)
/usr/lib/go/src/runtime/asm_amd64.s:1333 (0x45be90)
</pre>
</body>

@Dieterbe
Copy link
Contributor

Dieterbe commented Oct 14, 2018

i'll dig deeper, but it looks like series with interval=10 have points like:

<     Ts: (uint32) 1539275280
<     Ts: (uint32) 1539275880
<     Ts: (uint32) 1539276480

those with interval 600 have

>     Ts: (uint32) 1539274800
>     Ts: (uint32) 1539275400
>     Ts: (uint32) 1539276000

checked this in getTargets, and also i saw they have the same values for:
//QueryCons: (consolidation.Consolidator) NoneConsolidator,
//Consolidator: (consolidation.Consolidator) AverageConsolidator
//Interval: (uint32) 600,
//QueryFrom: (uint32) 1539274681,
//QueryTo: (uint32) 1539361141,

@woodsaj
Copy link
Member

woodsaj commented Oct 15, 2018

The code on https://github.com/grafana/metrictank/blob/master/api/dataprocessor.go#L624 assumes when merging series that they all have the same number of data points.

All series should have the same number of points as the interval of all series should be the same due to https://github.com/grafana/metrictank/blob/master/api/query_engine.go#L30

@shanson7
Copy link
Collaborator

shanson7 commented Oct 25, 2018

See my comment here In #913

Here is my local hack-around

Edit: I don't think this is correct solution as it kind of "fudges" the timestamps to match up, but it more or less works.

@Dieterbe Dieterbe added this to the 0.12.0 milestone Dec 12, 2018
@Dieterbe Dieterbe modified the milestones: 0.12.0, vnext Jan 14, 2019
@DanCech
Copy link
Contributor Author

DanCech commented May 15, 2019

Seems to be the same as #874

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants