separate test to assert on early termination and update docs
jimczi committed Oct 31, 2019
1 parent ee0b122 commit 1fa763b
Showing 4 changed files with 428 additions and 309 deletions.
@@ -500,13 +500,9 @@ GET /_search

==== Early termination

The `composite` aggregation can early terminate the collection under some circumstances:

* If the primary source extracts values from a numeric or a keyword field and the query matches all documents (`match_all` query).
* If the value sources are a prefix or match entirely the <<index-modules-index-sorting,index sort>> specification.

For optimal performance the <<index-modules-index-sorting,index sort>> should be set on the index so that it matches
parts or fully the source order in the composite aggregation. For instance the following index sort:
parts or fully the source order in the composite aggregation.
For instance the following index sort:

[source,console]
--------------------------------------------------
@@ -608,6 +604,10 @@ If the order of sources do not matter for your use case you can follow these sim
* Make sure that the order of the field matches the order of the index sort.
* Put multi-valued fields last since they cannot be used for early termination.

WARNING: <<index-modules-index-sorting,index sort>> can slow down indexing, so it is very important to test index sorting
with your specific use case and dataset to ensure that it matches your requirements. If it doesn't, note that `composite`
aggregations will also try to early terminate on non-sorted indices if the query matches all documents (`match_all` query).
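
The ordering rule above (sources should follow the order of the index sort) can be illustrated with a minimal plain-Java sketch. It only counts how many leading source fields line up with the index sort fields. The class name, method name, and field names are hypothetical, and the check deliberately ignores sort direction and multi-valued fields; it is not Elasticsearch's own logic.

```java
import java.util.Arrays;
import java.util.List;

final class IndexSortPrefixSketch {

    /**
     * Returns how many leading composite sources use the same field as the
     * index sort at the same position. Hypothetical helper: it compares field
     * names only and does not look at sort direction or multi-valued fields.
     */
    static int sortedPrefixLength(List<String> sourceFields, List<String> indexSortFields) {
        int limit = Math.min(sourceFields.size(), indexSortFields.size());
        int matched = 0;
        while (matched < limit && sourceFields.get(matched).equals(indexSortFields.get(matched))) {
            matched++;
        }
        return matched;
    }

    public static void main(String[] args) {
        // Hypothetical field names, chosen only for illustration.
        List<String> indexSort = Arrays.asList("username", "timestamp");

        // Same order as the index sort: both sources line up.
        System.out.println(sortedPrefixLength(Arrays.asList("username", "timestamp"), indexSort));

        // Swapped order: the leading source already differs from the index sort.
        System.out.println(sortedPrefixLength(Arrays.asList("timestamp", "username"), indexSort));
    }
}
```

Under these assumptions, a result of zero means that not even the leading source follows the index sort, which is the situation the guidelines above try to avoid.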

==== Sub-aggregations

Like any `multi-bucket` aggregations the `composite` aggregation can hold sub-aggregations.
@@ -0,0 +1,93 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.elasticsearch.search.aggregations.bucket.composite;

import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.util.LuceneTestCase;
import org.elasticsearch.search.sort.SortOrder;


import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Suppress AssertingCodec because it doesn't work with {@link DocValues#unwrapSingleton} that is used in this test. */
@LuceneTestCase.SuppressCodecs("*")
public class CompositeAggregatorEarlyTerminationTests extends CompositeAggregatorTestCase {
    public void testEarlyTermination() throws Exception {
        final List<Map<String, List<Object>>> dataset = new ArrayList<>();
        dataset.addAll(
            Arrays.asList(
                createDocument("keyword", "a", "long", 100L, "foo", "bar"),
                createDocument("keyword", "c", "long", 100L, "foo", "bar"),
                createDocument("keyword", "a", "long", 0L, "foo", "bar"),
                createDocument("keyword", "d", "long", 10L, "foo", "bar"),
                createDocument("keyword", "b", "long", 10L, "foo", "bar"),
                createDocument("keyword", "c", "long", 10L, "foo", "bar"),
                createDocument("keyword", "e", "long", 100L, "foo", "bar"),
                createDocument("keyword", "e", "long", 10L, "foo", "bar")
            )
        );

        executeTestCase(true, false, new TermQuery(new Term("foo", "bar")),
            dataset,
            () ->
                new CompositeAggregationBuilder("name",
                    Arrays.asList(
                        new TermsValuesSourceBuilder("keyword").field("keyword"),
                        new TermsValuesSourceBuilder("long").field("long")
                    )).aggregateAfter(createAfterKey("keyword", "b", "long", 10L)).size(2),
            (result) -> {
                assertEquals(2, result.getBuckets().size());
                assertEquals("{keyword=c, long=100}", result.afterKey().toString());
                assertEquals("{keyword=c, long=10}", result.getBuckets().get(0).getKeyAsString());
                assertEquals(1L, result.getBuckets().get(0).getDocCount());
                assertEquals("{keyword=c, long=100}", result.getBuckets().get(1).getKeyAsString());
                assertEquals(1L, result.getBuckets().get(1).getDocCount());
                assertTrue(result.isTerminatedEarly());
            }
        );

        // source field and index sorting config have different order
        executeTestCase(true, false, new TermQuery(new Term("foo", "bar")),
            dataset,
            () ->
                new CompositeAggregationBuilder("name",
                    Arrays.asList(
                        // reverse source order
                        new TermsValuesSourceBuilder("keyword").field("keyword").order(SortOrder.DESC),
                        new TermsValuesSourceBuilder("long").field("long").order(SortOrder.DESC)
                    )
                ).aggregateAfter(createAfterKey("keyword", "c", "long", 10L)).size(2),
            (result) -> {
                assertEquals(2, result.getBuckets().size());
                assertEquals("{keyword=a, long=100}", result.afterKey().toString());
                assertEquals("{keyword=b, long=10}", result.getBuckets().get(0).getKeyAsString());
                assertEquals(1L, result.getBuckets().get(0).getDocCount());
                assertEquals("{keyword=a, long=100}", result.getBuckets().get(1).getKeyAsString());
                assertEquals(1L, result.getBuckets().get(1).getDocCount());
                assertTrue(result.isTerminatedEarly());
            }
        );
    }
}
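
The buckets asserted in the test above can be reproduced with a small plain-Java simulation. This is only a sketch under stated assumptions: `CompositePagingSketch`, `Key`, `Page`, and `page` are hypothetical names, the in-memory sort stands in for an index sort, doc counts are not tracked, and the real aggregator works per segment with its own data structures. The point it illustrates is that once documents are visited in composite-key order, collection can stop as soon as `size` buckets after the `after` key are filled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CompositePagingSketch {

    /** Two-part composite key mirroring the "keyword" and "long" sources of the test. */
    static final class Key {
        final String keyword;
        final long value;

        Key(String keyword, long value) {
            this.keyword = keyword;
            this.value = value;
        }

        @Override
        public String toString() {
            return "{keyword=" + keyword + ", long=" + value + "}";
        }
    }

    /** One page of buckets plus a flag telling whether the scan stopped before the end. */
    static final class Page {
        final List<Key> buckets = new ArrayList<>();
        boolean terminatedEarly = false;
    }

    /** Hypothetical helper: returns up to {@code size} distinct keys that sort strictly after {@code after}. */
    static Page page(List<Key> docs, Key after, int size, boolean descending) {
        Comparator<Key> ascending =
            Comparator.comparing((Key k) -> k.keyword).thenComparingLong(k -> k.value);
        Comparator<Key> order = descending ? ascending.reversed() : ascending;

        // Sorting in memory stands in for an index sort that matches the source order.
        List<Key> sorted = new ArrayList<>(docs);
        sorted.sort(order);

        Page page = new Page();
        for (Key key : sorted) {
            if (order.compare(key, after) <= 0) {
                continue; // at or before the after key: belongs to an earlier page
            }
            boolean sameAsLastBucket = !page.buckets.isEmpty()
                && order.compare(page.buckets.get(page.buckets.size() - 1), key) == 0;
            if (sameAsLastBucket) {
                continue; // another document for a bucket that is already collected
            }
            if (page.buckets.size() == size) {
                // A new key shows up but the page is full: every remaining key sorts later, so stop.
                page.terminatedEarly = true;
                break;
            }
            page.buckets.add(key);
        }
        return page;
    }

    public static void main(String[] args) {
        // Same (keyword, long) pairs as the dataset in the test.
        List<Key> docs = Arrays.asList(
            new Key("a", 100L), new Key("c", 100L), new Key("a", 0L), new Key("d", 10L),
            new Key("b", 10L), new Key("c", 10L), new Key("e", 100L), new Key("e", 10L));

        Page ascendingPage = page(docs, new Key("b", 10L), 2, false);
        System.out.println(ascendingPage.buckets + " terminatedEarly=" + ascendingPage.terminatedEarly);

        Page descendingPage = page(docs, new Key("c", 10L), 2, true);
        System.out.println(descendingPage.buckets + " terminatedEarly=" + descendingPage.terminatedEarly);
    }
}
```

With the test's dataset, the ascending call yields the `c/10` and `c/100` buckets and the descending call yields `b/10` and `a/100`, matching the keys asserted above; in both cases the last bucket corresponds to the asserted `afterKey`.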