Reduce query planning allocations #7385

jwilder · 2016-09-29T17:51:55Z

Required for all non-trivial PRs

Rebased/mergable
Tests pass
CHANGELOG.md updated

This PR reduces allocations during query planning. The main change is to the Measurement.TagSets which was creating a lot of garbage.

The TagSets function was creating a lot of intermediate maps and
slices to calculate the sorted tag sets. It first creates a map
to group tag sets with their series, it then created an equally
sized slice of the tag keys and sorted them. Finally, it created
a new slice and added the tag sets in the original map by the ordering
of the sorted keys. It was also recreating the tags map multiple time
creating extra garbage in the loop.

This simplifies the code to create one map for grouping and than adding
the distinct sets to a slice which is then sorted. It also fixes the
multple tag maps getting created.

This could probably be optimized further using a ordered tree data structure.

1.0

$ influx_stress
Total Queries: 250
Average Query Response Time: 65.779923ms

1.1 master

$ influx_stress
Total Queries: 250
Average Query Response Time: 17.209273ms

cc @jsternberg

jsternberg · 2016-09-29T17:55:57Z

tsdb/meta.go

 		return nil, err
 	}

 	// For every series, get the tag values for the requested tag keys i.e. dimensions. This is the
 	// TagSet for that series. Series with the same TagSet are then grouped together, because for the
 	// purpose of GROUP BY they are part of the same composite series.
-	tagSets := make(map[string]*influxql.TagSet)
+	tagSets := make(map[string]*influxql.TagSet, len(filters))


This may end up overallocating. While this performs well if you have GROUP BY *, I imagine it would perform poorly when there is no GROUP BY because everything maps to the same tag set. So you could have 10,000 series ids, but only need one tag set and this would allocate a map that was ready for storing 10,000 unique entries.

I'll test my datasets w/ queries w/o group by. The problem it has without it as that it almost always has to re-allocate and grow. Maybe a smaller default of of 1024 or would be better.

I agree. This certainly needs to change in some way and the AddFilter below also needs to. I'm just not sure the proper way to do it.

jsternberg · 2016-09-29T17:57:04Z

tsdb/meta.go

 		}
-
 		// Associate the series and filter with the Tagset.
 		tagSet.AddFilter(m.seriesByID[id].Key, filter)


Related to my previous comment, this is a hot-zone I found while I was working on lazy iterators. When no GROUP BY * is used, a lot of time is spent growing the slice inside of the tag set. I have not figured out a good way to deal with this since attempting to count the number of elements ahead of time didn't really reduce the amount of time.

jsternberg · 2016-09-29T17:57:51Z

tsdb/meta.go

-		tagsAsKey := string(MarshalTags(tags))
-		tagSet, ok := tagSets[tagsAsKey]
+		tagsAsKey := MarshalTags(tags)
+		tagSet, ok := tagSets[string(tagsAsKey)]


I can't remember. Does this cause the map to use some kind of optimization for memory allocations?

Supposed to. But it reduces the []byte{} -> string conversion from 2 -> 1.

@jsternberg yeah this approach is significantly than the previous approach. Around 15ns per lookup versus 45ns from some recent benchmarking I did.

jsternberg · 2016-09-29T17:58:44Z

tsdb/meta.go

-
-	// Sort the series in each tag set.
-	for _, t := range tagSets {
-		sort.Sort(t)


Removing this, while it increases speed by a lot, will cause results to be returned non-deterministically when we use things like LIMIT.

The sort is moved to here when necessary: https://github.com/influxdata/influxdb/pull/7385/files#diff-82e4e65bd72f00208a49c4797d466ecdR1164

jsternberg · 2016-09-29T19:25:02Z

tsdb/engine/tsm1/engine.go

@@ -1161,6 +1161,12 @@ func (e *Engine) createVarRefIterator(opt influxql.IteratorOptions, aggregate bo
 				return err
 			}

+			if opt.Limit > 0 || opt.Offset > 0 || opt.SLimit > 0 || opt.SOffset > 0 {


I think this is still needed even when these conditions aren't met. Otherwise, the output is still non-deterministic for queries without a limit.

The tag sets are ordered by key still. This just order the series keys within each tagset.

Do you know of a query that could have this non-determinism?

Tag sets are ordered by the returned series. Internal to a series wouldn't be ordered so if you had this:

cpu,host=server01,region=uswest value=0 cpu,host=server01,region=useast value=1 cpu,host=server02 value=2 > SELECT * FROM cpu GROUP BY host

The host would be ordered, but the internal series grouped into the server01 bucket would not be deterministic. This is why it affects LIMIT <n> and OFFSET <n>. While I'm not sure it matters, I assume we still want that to be deterministic.

I reverted the sort removal so it will always sort as before. It seems like it is unnecessary in some cases though. For example,

SELECT SUM(value) FROM cpu GROUP BY host

Should not need to sort. Is there a way to detect this scenario here?

I think there is. We also may want to remove the sort here and instead perform the sort when we're using the tag set to create the iterators inside of the query engine.

The TagSets function was creating a lot of intermediate maps and slices to calculate the sorted tag sets. It first creates a map to group tag sets with their series, it then created an equally sized slice of the tag keys and sorted then. Finally, it created a new slice and added the tag sets in the original map by the ordering of the sorted keys. It was also recreating the tags map multiple time creating extra garbage in the loop. This simplifies the code to create one map for grouping and than adding the distinct sets to a slice which is then sorted. It also fixes the multple tag maps getting created.

e-dard

LGTM 👍

jwilder added area/queries area/performance labels Sep 29, 2016

jwilder added this to the 1.1.0 milestone Sep 29, 2016

jwilder force-pushed the jw-query-allocs branch from 7054c38 to f3388a9 Compare September 29, 2016 17:53

Avoid extra allocations when evalating binary expressions

a3fd121

jwilder force-pushed the jw-query-allocs branch from 0e5219b to 479eda6 Compare September 29, 2016 19:18

jsternberg reviewed Sep 29, 2016

View reviewed changes

jwilder force-pushed the jw-query-allocs branch from 479eda6 to 68dd312 Compare September 29, 2016 22:11

e-dard approved these changes Sep 30, 2016

View reviewed changes

jwilder merged commit f727eff into master Oct 3, 2016

jwilder deleted the jw-query-allocs branch October 3, 2016 15:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce query planning allocations #7385

Reduce query planning allocations #7385

jwilder commented Sep 29, 2016 •

edited

Loading

jsternberg Sep 29, 2016

jwilder Sep 29, 2016 •

edited

Loading

jsternberg Sep 29, 2016

jsternberg Sep 29, 2016

jsternberg Sep 29, 2016

jwilder Sep 29, 2016

e-dard Sep 30, 2016

jsternberg Sep 29, 2016

jwilder Sep 29, 2016

jsternberg Sep 29, 2016

jwilder Sep 29, 2016

jsternberg Sep 29, 2016

jwilder Sep 29, 2016

jsternberg Sep 30, 2016

e-dard left a comment

Reduce query planning allocations #7385

Reduce query planning allocations #7385

Conversation

jwilder commented Sep 29, 2016 • edited Loading

Required for all non-trivial PRs

1.0

1.1 master

Choose a reason for hiding this comment

jwilder Sep 29, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

e-dard left a comment

Choose a reason for hiding this comment

jwilder commented Sep 29, 2016 •

edited

Loading

jwilder Sep 29, 2016 •

edited

Loading