Add support for `summarize` #837

Aergonus · 2018-01-29T15:21:19Z

There's a question to be answered before this is ready to merge.
Should the input series have a matching QueryPatt and Target?
Are QueryFrom and QueryTo equivalent to series.start and series.end in the Python code?

Also, if #833 is merged, there'll be a conflict in docs/graphite, and I'll rebase to squash the commits.

Aergonus · 2018-01-29T15:25:59Z

expr/func_summarize.go

+
+		output := models.Series{
+			Target:     newName(serie.Target),
+			QueryPatt:  newName(serie.QueryPatt), // Should this match target?


Not sure if this is right. Asking for the Grafana folks' expertise.

Aergonus · 2018-01-29T15:27:35Z

expr/func_summarize.go

+			newEnd = alignedEnd
+		}
+
+		output := models.Series{


The QA tests fail because I don't use newEnd. The Python code sets series.start and series.end to newStart and newEnd; I'm not sure whether those are equivalent to QueryFrom and QueryTo, or whether we're supposed to modify them.
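For reference, here is a minimal sketch of the kind of boundary alignment being discussed. The rounding rule (start rounded down to a multiple of the interval, end moved to the next interval boundary) is an assumption about what the Python code does and should be checked against the linked functions.py. `alignBounds` is a hypothetical helper, not a function from this PR.

```go
package main

import "fmt"

// alignBounds is a hypothetical helper illustrating bucket alignment.
// Assumption: when alignToFrom is false, start is rounded down to a
// multiple of interval and end is moved to the next interval boundary.
// interval must be non-zero.
func alignBounds(start, end, interval uint32, alignToFrom bool) (uint32, uint32) {
	if alignToFrom {
		// Buckets start exactly at the requested "from".
		return start, end
	}
	newStart := start - start%interval
	newEnd := end - end%interval + interval
	return newStart, newEnd
}

func main() {
	newStart, newEnd := alignBounds(125, 245, 60, false)
	fmt.Println(newStart, newEnd)
}
```

Whether the aligned values should also be written back to QueryFrom and QueryTo is exactly the open question above; the sketch only computes them.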

Dieterbe · 2018-01-30T22:07:09Z

> Should the input series have a matching QueryPatt and Target?

while they represent different things, they may often end up being the same (but not always)
see https://godoc.org/github.com/grafana/metrictank/api/models#Series
maybe @DanCech and @shanson7 also have an opinion on this. this stuff is formed by early graphite-reverse-engineering with some of my sauce sprinkled on. possibly those guys have better / more recent ideas on this.

> Are QueryFrom and QueryTo equivalent to series.start and series.end in the Python code?

QueryFrom and QueryTo are literally the from and to derived from the query (or the defaults of now-24h and now if not specified). I'm not sure what the semantics of those Python properties are.

> Also if #833 is merged, there'll be a conflict with the docs/graphite and I'll rebase to squash commits

merged ! :)

https://github.com/graphite-project/graphite-web/blob/master/webapp/graphite/render/functions.py#L4706

Aergonus · 2018-01-30T23:08:50Z

expr/func_summarize.go

+
+	numPoints := int(util.Min(uint32(len(serie.Datapoints)), (start-end)/interval))
+
+	for ts, i := start, 0; i < numPoints && ts < end; ts += interval {


Just to be thorough, this is different from the python code. To mirror the python code you would remove i < numPoints from this line.
The only change is that graphite-mt would add an extra NaN value at the end of the results. Adding this would not remove an extra datapoint (only the extra NaN).

Aergonus · 2018-01-30T23:11:40Z

From what I've seen, Datapoints[0].Ts is equivalent to series.start, so we wouldn't need it.

Also since fellow special function perSecond modifies QueryPatt, I'm inclined to think we should be modifying it here.

Dieterbe · 2018-01-31T07:40:41Z

yes, processing functions should update QueryPatt similar to PerSecond or others. see also
https://github.com/grafana/metrictank/blob/master/expr/NOTES#L94

shanson7 · 2018-01-31T11:59:45Z

No doubt it should be updated, but should they match? Should Target be the resolved series name and QueryPatt be the expression?

Dieterbe · 2018-01-31T12:48:37Z

TBH i'm hazy on the details / history but AFAIK they should not necessarily match (though sometimes they might) as explained in the struct documentation I linked to (if you want more info you'll have to dive into the code and see how the attributes are used).

can't we just follow the persecond example, where we just wrap around the pre-existing attributes
eg something like:

Target:     fmt.Sprintf("summarize(%s, extra params)", serie.Target),
QueryPatt:  fmt.Sprintf("summarize(%s, extra params)", serie.QueryPatt),

Aergonus · 2018-01-31T14:11:51Z

It is following the persecond example, just that the rename is dynamic based on a parameter. No changes needed so far :)

Dieterbe · 2018-01-31T17:41:56Z

ok your approach in setting QueryPatt and Target looks good to me. will get back to you later with a more complete review
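A minimal sketch of the approach agreed on here: one closure builds the new name, and it is applied to both Target and QueryPatt so that each keeps wrapping its own pre-existing value. The `Series` type is a pared-down stand-in for models.Series, and `summarizeNamer` plus the exact argument formatting are assumptions for illustration, not the PR's code.

```go
package main

import "fmt"

// Series is a simplified stand-in for metrictank's models.Series,
// keeping only the two naming fields discussed in this thread.
type Series struct {
	Target    string
	QueryPatt string
}

// summarizeNamer (hypothetical) returns a closure that wraps a name in a
// summarize(...) call. The rename is dynamic: the alignToFrom argument is
// only rendered when it is set.
func summarizeNamer(interval, fn string, alignToFrom bool) func(string) string {
	return func(oldName string) string {
		if alignToFrom {
			return fmt.Sprintf("summarize(%s, %q, %q, true)", oldName, interval, fn)
		}
		return fmt.Sprintf("summarize(%s, %q, %q)", oldName, interval, fn)
	}
}

func main() {
	in := Series{Target: "a.b.c", QueryPatt: "a.*.c"}
	newName := summarizeNamer("1h", "sum", false)

	// Target and QueryPatt are wrapped independently, so they only match
	// afterwards if they matched before.
	out := Series{
		Target:    newName(in.Target),
		QueryPatt: newName(in.QueryPatt),
	}
	fmt.Println(out.Target)
	fmt.Println(out.QueryPatt)
}
```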

Dieterbe · 2018-02-02T11:41:07Z

expr/func_summarize.go

+func (s *FuncSummarize) Signature() ([]Arg, []Arg) {
+	return []Arg{
+		ArgSeriesList{val: &s.in},
+		ArgString{key: "interval", val: &s.intervalString, validator: []Validator{IsIntervalString}},


IIRC the key attribute is only for arguments that should be made available as a keyword argument.
per http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.summarize the interval parameter has no key

shanson7 · 2018-02-02T12:08:54Z

It doesn't look like this will support all of the documented functions that the graphite version does (e.g. median)

Dieterbe · 2018-02-02T13:32:07Z

yep. either we implement those, or we remove the stable flag.
stable flag is only for functions that we know to be 100% compatible with graphite

Dieterbe · 2018-02-02T13:52:55Z

expr/func_summarize.go

+		output := models.Series{
+			Target:     newName(serie.Target),
+			QueryPatt:  newName(serie.QueryPatt),
+			Tags:       serie.Tags,


the python code also does:

series.tags['summarize'] = intervalString
series.tags['summarizeFunction'] = func

It does...which is odd IMO. Tags are something set at ingest and it seems odd to add more tags at query time (non-optionally). I get that this isn't the place to make these sorts of arguments, but it doesn't sit right with me to mess with the tags (especially in the case of name collision).

The idea was to make the info about how the series was processed available to functions further down the chain, and the rationale for modifying the tags is that when you run series x through a function f you are creating a new series f(x), and its tags should identify it and describe how it's different from x.

I get that, but it means a lot of "reserved" tag names that need to be kept track of and new ones for each function.
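To make the trade-off concrete, here is a small sketch of adding the two tags on the Go side. `withSummarizeTags` is a hypothetical helper, and copying the map (rather than mutating the input series' tags) is a design assumption, not something this thread settled.

```go
package main

import "fmt"

// withSummarizeTags (hypothetical) returns a copy of tags with the two keys
// the Python code sets. The input map is left untouched so the source
// series keeps its ingest-time tags.
func withSummarizeTags(tags map[string]string, interval, fn string) map[string]string {
	out := make(map[string]string, len(tags)+2)
	for k, v := range tags {
		out[k] = v
	}
	// Name collision: an ingest-time tag called "summarize" or
	// "summarizeFunction" would be silently overwritten here.
	out["summarize"] = interval
	out["summarizeFunction"] = fn
	return out
}

func main() {
	ingest := map[string]string{"dc": "us-east"}
	tagged := withSummarizeTags(ingest, "1h", "sum")
	fmt.Println(len(ingest), len(tagged))
}
```

The overwrite in the middle of the helper is the collision concern raised above: every function that adds tags this way effectively reserves its tag names.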

Dieterbe · 2018-02-02T16:20:55Z

expr/func_summarize.go

+func summarizeValues(serie models.Series, aggFunc batch.AggFunc, interval, start, end uint32) []schema.Point {
+	out := pointSlicePool.Get().([]schema.Point)
+
+	numPoints := int(util.Min(uint32(len(serie.Datapoints)), (start-end)/interval))


i'm confused here. the only reason to take the util.Min of these two is in case the input series has a low number of points relative to the requested interval, right?
e.g.
1 day worth of 2-hourly data
requesting summarize of 1h.
so this becomes numPoints = min(12,24) = 12
then why in the loop below do we increment by interval (1h), but only 12 times? that would cover only a 12h timerange?

If (serie.Datapoints[i].Ts < ts+interval) is violated, it doesn't increment i. So it would append NaN's for the inbetween. I'll push a test case that shows this.
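The oversampling case in this exchange (one day of 2-hourly data summarized at 1h) can be sketched as follows. This is a simplified illustration, not the PR's code: `Point` stands in for schema.Point, the aggregation is hard-coded to a sum instead of a batch.AggFunc, there is no `i < numPoints` cap, and the input is assumed to be sorted with no points before `start`.

```go
package main

import (
	"fmt"
	"math"
)

// Point is a simplified stand-in for schema.Point.
type Point struct {
	Val float64
	Ts  uint32
}

// summarizeSum (hypothetical) walks fixed-size buckets from start to end.
// The input index i only advances while a point falls inside the current
// bucket, so a bucket with no points yields NaN.
func summarizeSum(in []Point, interval, start, end uint32) []Point {
	var out []Point
	i := 0
	for ts := start; ts < end; ts += interval {
		sum, n := 0.0, 0
		for i < len(in) && in[i].Ts < ts+interval {
			sum += in[i].Val
			n++
			i++
		}
		if n == 0 {
			out = append(out, Point{Val: math.NaN(), Ts: ts})
		} else {
			out = append(out, Point{Val: sum, Ts: ts})
		}
	}
	return out
}

// countNaN reports how many output buckets were empty.
func countNaN(points []Point) int {
	c := 0
	for _, p := range points {
		if math.IsNaN(p.Val) {
			c++
		}
	}
	return c
}

func main() {
	// One day of 2-hourly data: 12 points.
	var input []Point
	for ts := uint32(0); ts < 86400; ts += 7200 {
		input = append(input, Point{Val: 1, Ts: ts})
	}
	out := summarizeSum(input, 3600, 0, 86400)
	fmt.Println(len(out), countNaN(out)) // prints "24 12"
}
```

With 12 input points and 24 hourly buckets, every other bucket is empty, so the output covers the full day with NaN in the gaps.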

Dieterbe

see comments

Simple check so we don't consolidate (batch/aggFunc) no points

Aergonus · 2018-02-08T20:32:27Z

Tags added as requested, other comment explained.
Agg funcs are in PR #847 which would allow summarize to be merged as stable

shanson7 · 2018-02-15T16:57:57Z

Not sure how important this is, but the Python implementation allows both 'true' and true for the bool argument. Should the ArgBool type support this?

shanson7 · 2018-02-16T14:24:19Z

> Not sure how important this is, but the Python implementation allows both 'true' and true for the bool argument. Should the ArgBool type support this?

Turns out this is very important since grafana sometimes uses strings for bools.

Dieterbe · 2018-02-19T18:51:24Z

> Turns out this is very important since grafana sometimes uses strings for bools.

any more details on "sometimes" ? i recently ran into the same problem with sortByName:
bec3e7c
there i observed grafana adds it without quotes

the graphite docs don't seem to mention quotes for the bool
http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.summarize
or in http://graphite.readthedocs.io/en/latest/functions.html#usage

that said, i'd be happy to hear from @DanCech whether we should follow graphite's documentation/spec or what it actually allows...

shanson7 · 2018-02-20T17:16:49Z

> any more details on "sometimes" ?

Yeah, in grafana if I build up the expression it doesn't use quotes. If I go and change one of the tag values afterwards, it adds the quotes in (not visibly to the user though).

DanCech · 2018-02-26T18:03:32Z

At least for the moment, Graphite doesn't actually enforce any type-checking on parameters, though with the addition of the inline documentation that would now be possible. Because of that, what ends up happening is that 'true' is parsed as a string and passed to the function, and when evaluated in a boolean context it is indeed true. 'false' wouldn't work, since it would also evaluate to true. The problem with changing it in Graphite is the potential for breaking existing dashboards etc., especially if there are scenarios where Grafana will pass 'true' rather than true. I'd say that it's something we might look at fixing (by checking the parameters passed in against the function's definition), but we would likely retain support for 'true' and "true" in any case to avoid breaking existing dashboards.

Dieterbe · 2018-02-26T18:05:47Z

so @DanCech conclusion: MT should support true, 'true', false and 'false' (and capitalized versions)

shanson7 · 2018-02-26T19:23:51Z

Sounds more like non-empty strings are true, empty is false. Probably similar with 0/non-zero?

DanCech · 2018-02-26T19:26:55Z

@shanson7 yeah, it appears that's how it would work today, following the Python rules for truthiness https://docs.python.org/2/library/stdtypes.html#truth-value-testing
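The two behaviours under discussion differ on inputs like 'false', which the following sketch makes explicit. Both functions are hypothetical and are not metrictank's ArgBool. `pythonTruthy` approximates Python truth-value testing for a few Go types, and `parseBoolArg` approximates the "support true, 'true', false and 'false'" proposal, with any letter casing accepted as an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// pythonTruthy (hypothetical) mimics Python truthiness for a few types a
// parsed argument might have: empty string, zero and nil are false.
func pythonTruthy(v interface{}) bool {
	switch x := v.(type) {
	case nil:
		return false
	case bool:
		return x
	case string:
		return x != ""
	case int:
		return x != 0
	case float64:
		return x != 0
	default:
		return true
	}
}

// parseBoolArg (hypothetical) accepts true/false with optional surrounding
// quotes, case-insensitively, and rejects everything else.
func parseBoolArg(raw string) (bool, error) {
	s := strings.Trim(raw, `'"`)
	switch {
	case strings.EqualFold(s, "true"):
		return true, nil
	case strings.EqualFold(s, "false"):
		return false, nil
	}
	return false, fmt.Errorf("not a boolean: %q", raw)
}

func main() {
	// The string "false" is non-empty, so it is truthy under Python rules.
	fmt.Println(pythonTruthy("false"))

	// The explicit parser treats the quoted literal as the boolean false.
	b, err := parseBoolArg("'false'")
	fmt.Println(b, err)
}
```

Strict Graphite compatibility would mean the first behaviour, where a quoted 'false' silently means true; the second is friendlier but diverges from what Graphite does today.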

shanson7 · 2018-03-08T16:01:46Z

We didn't hash out the ArgBool truthiness. I think this change might cause some issues without it.

Dieterbe · 2018-03-08T17:53:08Z

let's continue the convo in #867

Move Consolidate Validator

b9e8c71

Aergonus force-pushed the feature/summarize branch 5 times, most recently from 833baf1 to 3b00804 Compare January 30, 2018 22:59

Aergonus and others added 3 commits January 30, 2018 18:04

Port summarize

273ca4d

https://github.com/graphite-project/graphite-web/blob/master/webapp/graphite/render/functions.py#L4706

Make summarize stable

433184c

Don't use alignToEnd && modify QueryPatt

11000bb

Aergonus force-pushed the feature/summarize branch from 3b00804 to 11000bb Compare January 30, 2018 23:04

Dieterbe reviewed Feb 2, 2018

Remove key for intervalstring param

dd390ff

Dieterbe reviewed Feb 2, 2018

Dieterbe suggested changes Feb 2, 2018

Add test for oversampling

03fdc38

Aergonus force-pushed the feature/summarize branch from 6bc7ef2 to 03fdc38 Compare February 2, 2018 19:27

Handle oversampling cases without panic

1897d08

Simple check so we don't consolidate (batch/aggFunc) no points

Aergonus mentioned this pull request Feb 8, 2018

Added missing batch/agg functions #847

Merged

Aergonus force-pushed the feature/summarize branch from 9a0fbbe to 1897d08 Compare February 8, 2018 20:21

Add Tags for summarize

50cbd2f

Aergonus force-pushed the feature/summarize branch from a80464c to 50cbd2f Compare February 8, 2018 20:24

Dieterbe merged commit 5240c14 into grafana:master Mar 8, 2018

Dieterbe mentioned this pull request Mar 8, 2018

ArgBool not compatible with graphite #867

Closed

Aergonus deleted the feature/summarize branch March 8, 2018 17:54

Dieterbe mentioned this pull request May 3, 2018

summarize function returning incorrect results #903

Closed
