-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce makers #127
Reduce makers #127
Conversation
202368e
to
0916584
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love the use of a Function
to cut down on the boilerplate. Had a few suggestsions here and there, but overall this looks great.
@@ -35,6 +38,10 @@ | |||
* takes aggregated data from a finer time grain (i.e. DefaultTimeGrain.DAY) and computes an average across a coarser | |||
* time grain (i.e. DefaultTimeGrain.WEEK). For example, given the total number of visitors to www.example.com for each | |||
* day of the week, we can compute the average number of daily visitors to example.com for the entire week. | |||
* <p> | |||
* A nested average requires the following columns, an inner query constant, and outer query sum of that constant, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a colon after following columns
instead of a comma?
Also this paragraph is extremely dense, and I couldn't keep my eyes from glazing as I looked at it. I would suggest something like:
- An inner query constant
c
to sum, and the outer query sum ofc
:cs
. - An inner query (post) aggregation
a
to sum, and an outer query sum ofa
:as
- An outer query post aggregation that computes
as / cs
.
newInnerPostAggregations.add((PostAggregation) sourceMetric); | ||
} | ||
Set<PostAggregation> newInnerPostAggregations = !(sourceMetric instanceof PostAggregation) ? | ||
Collections.emptySet() : ImmutableSet.of((PostAggregation) sourceMetric); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would flip these conditions, and make the check a positive check rather than a negative one.
return originalSourceMetric; | ||
return !(originalSourceMetric instanceof SketchAggregation) ? | ||
originalSourceMetric : | ||
sketchConverter.asSketchEstimate((SketchAggregation) originalSourceMetric); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would flip these conditions, and make the check a positive check rather than a negative one.
} | ||
return result; | ||
return originalAggregations.stream() | ||
.map(agg -> ! agg.isSketch() ? agg : sketchConverter.asInnerSketch((SketchAggregation) agg)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would flip these conditions, and make the check a positive check rather than a negative one.
return result; | ||
return originalAggregations.stream() | ||
.map(agg -> ! agg.isSketch() ? agg : sketchConverter.asInnerSketch((SketchAggregation) agg)) | ||
.collect(Collectors.toCollection(LinkedHashSet::new)); | ||
} | ||
|
||
/** | ||
* Create an Aggregation for summing on a metric from an inner query. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably toss in a blank line and a <p>
tag here, to separate the summary from the more detailed description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That hasn't been our rule. The summary ends with a period and newline. Further commentary may be offset, but need not be. The sentence end will be caught by the javadoc tool and pulled out to a summary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When one is adding an extra line or two to the summary, I'm fine with that. But since this is a rather detailed paragraph, I was just thinking that it would look better if it was offset into its own paragraph from the summary, in the method's more detailed view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough
@@ -2,22 +2,20 @@ | |||
// Licensed under the terms of the Apache license. Please see LICENSE.md file distributed with this work for terms. | |||
package com.yahoo.bard.webservice.data.config.metric.makers; | |||
|
|||
import static java.util.Collections.emptySet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the static import? A static import makes sense to me if the class name is really long and we're using a function/constant a lot, but Collections
isn't that long, and emptySet
is only used once. So I would argue that the readability benefits of Collections.emptySet()
outweighs the benefits from reducing verbosity.
query, | ||
NO_OP_MAPPER, | ||
metricName | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be wrapped?
* | ||
* @return The aggregation used by this metric | ||
*/ | ||
private Aggregation assertDependentIsAggregationMetric(final LogicalMetric sourceMetric) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final
not necessary on final LogicalMetric sourceMetric
.
.map(metrics::get) | ||
.map(LogicalMetric::getTemplateDruidQuery) | ||
.reduce(TemplateDruidQuery::merge) | ||
.orElseGet(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use orElseThrow
here instead.
|
||
/** | ||
* Metric maker which only creates aggregations. They do not depend on other metrics and cannot be | ||
* validated against the metric dictionary | ||
*/ | ||
public abstract class RawAggregationMetricMaker extends MetricMaker { | ||
|
||
private static final int DEPENDENT_METRICS_REQUIRED = 1; | ||
|
||
private final BiFunction<String, String, Aggregation> aggregationFactory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awww yeah...
👍 |
👍 Reafirming my plus one. |
079e022
to
dd6875d
Compare
} | ||
|
||
/** | ||
* Copy a set of aggregations, replacing any sketch aggregations with sketchMerge aggregations. | ||
* This is an artifact of earlier sketch code which didn't have a single sketch type that could be used without | ||
* finalization in inner queries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is an artifact and is no longer needed, should we deprecate it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not until we remove that deprecated code, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case, can we @Deprecate
this method as well, so that when we do the next sweep to remove deprecated code, this will be flagged and we can remove it?
* @param resultSetMapper The mapping function to be applied to the result that is returned by the query that is | ||
* built from the LogicalMetric which is built by this maker. | ||
* | ||
* @deprecated to override default mapping, use the Function constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're deprecating this, we should also:
- Convert the test using this constructor to use the non-deprecated version
- Also deprecate the
ColumnMapper
interface
Alternatively, if we think there's value in the existence of the ColumnMapper
interface, and the JavaDoc on it's withColumName
method, we should keep it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deprecating
* | ||
* @param metricDictionary The dictionary used to resolve dependent metrics when building the LogicalMetric | ||
* @param function The arithmetic operation performed by the LogicalMetrics constructed by this maker | ||
* @param resultSetMapper The function to be applied to the result that is returned by the query | ||
* @param resultSetMapperBuilder A builder for a function to be applied to the result that is returned by the query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're making this change, we should take the opportunity to make the param comment more clear / easier to understand. As it is, this doc doesn't really say anything about what this param is for / expected to do.
@@ -58,32 +83,32 @@ public ArithmeticMaker( | |||
*/ | |||
public ArithmeticMaker(MetricDictionary metricDictionary, ArithmeticPostAggregationFunction function) { | |||
// TODO: Deprecate me, mappers should always be specified at creation time, not implicitly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're changing a bunch of things anyways, should we deprecate this method as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if I actually agree with this TODO. Does anyone know why this is asked for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind this TODO is that we shouldn't apply a SketchRoundUpMapper
to everything that uses the ArithmeticMaker
, and instead require those who want that to add the mapper they want explicitly.
For example, someone might want to add 2 estimated sketches and have the result be a floating-point, rather than an integer. The Maker should not be opinionated.
this( | ||
metricDictionary, | ||
function, | ||
(Function<String, ResultSetMapper>) (String name) -> new SketchRoundUpMapper(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can replace lambda with method reference
given: "The name of the metric the maker depends on, and the maker itself" | ||
|
||
Aggregation sumAggregation = new LongSumAggregation("tmp_metric", DEPENDENT_METRIC_NAME) | ||
LogicalMetric metric = longSumMaker.make("longSum", DEPENDENT_METRIC_NAME) | ||
Dimension dim = new KeyValueStoreDimension("d", "des", new LinkedHashSet<DimensionField>(), Mock(KeyValueStore), Mock(SearchProvider)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified using a set literal, rather than constructing a new LHS:
Dimension dim = new KeyValueStoreDimension("d", "des", [], Mock(KeyValueStore), Mock(SearchProvider))
/** | ||
* | ||
* Created by mclawhor on 12/19/16. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not useful
def "Expected numeric aggregation is produced for #makerClass.getSimpleName()"() { | ||
setup: | ||
RawAggregationMetricMaker maker = makerClass.newInstance() | ||
expect: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blank line between blocks
def "Expected sketch aggregation is produced for #makerClass.getSimpleName()"() { | ||
setup: | ||
RawAggregationMetricMaker maker = makerClass.newInstance((MetricDictionary) null, 5) | ||
expect: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blank line between blocks
public static String FIELD_NAME = "BAR" | ||
|
||
@Unroll | ||
def "Expected numeric aggregation is produced for #makerClass.getSimpleName()"() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the call to getSimpleName()
is needed, since I believe Groovy defaults to that. That said, I also don't think method calls will work correctly with unrolling, only property-style accessors, so this should be #makerClass.simpleName
(Applies to other test as well)
4f2e364
to
dbe1a4c
Compare
|
||
- [Deprecated MetricMaker.getDependentQuery lookup method in favor of simpler direct access](https://github.com/yahoo/fili/pull/124) | ||
|
||
- [Default DimensionColumn name to use apiName instead of physicalName](https://github.com/yahoo/fili/pull/115) | ||
* Deprecated `TableUtils::getColumnNames(DataApiRequest, DruidAggregationQuery, PhysicalTable)` returning dimension physical name, | ||
* Deprecated `TableUtils::getColumnNames(DataApiRequest, DruidAggregationQuery, PhysicalTable)` returning dimension physical ltername, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo?
@@ -58,32 +83,32 @@ public ArithmeticMaker( | |||
*/ | |||
public ArithmeticMaker(MetricDictionary metricDictionary, ArithmeticPostAggregationFunction function) { | |||
// TODO: Deprecate me, mappers should always be specified at creation time, not implicitly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind this TODO is that we shouldn't apply a SketchRoundUpMapper
to everything that uses the ArithmeticMaker
, and instead require those who want that to add the mapper they want explicitly.
For example, someone might want to add 2 estimated sketches and have the result be a floating-point, rather than an integer. The Maker should not be opinionated.
*/ | ||
public class FilteredAggregationMaker extends MetricMaker { | ||
|
||
protected static final Logger LOG = LoggerFactory.getLogger(ConstantMaker.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong class
f3896d8
to
53175b6
Compare
… to create the mapper.
… the original NumberFormatException (Config Behavior Change)
…ssing in a metric instead of binding to a single metric aggregation. Made the FilteredAggregation constructor use the column name as as the metric, as per convention elsewhere.
53175b6
to
0ac197c
Compare
this( | ||
metricDictionary, | ||
function, | ||
(Function<String, ResultSetMapper>) (ignore) -> NO_OP_MAPPER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parens around ignore
(ie. the lambda input variable) are not technically needed, since there's no type on it and there's only 1 variable.
/NotABlocker
|
||
/** | ||
* The metric dictionary used to resolve dependent metrics. | ||
*/ | ||
protected final MetricDictionary metricDictionary; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can/should we kick these over to be private
instead of protected
? My gut says we should generally not encourage extension, since it makes it harder to change internal variables like these...
/NotABlocker
@@ -27,7 +27,7 @@ | |||
|
|||
private static final int DEPENDENT_METRICS_REQUIRED = 2; | |||
|
|||
final SketchSetOperationPostAggFunction function; | |||
private final SketchSetOperationPostAggFunction function; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should call this change out in the change log. It's unlikely this was extended, but just in case...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small things that could be tightened up, but nothing major. 👍
8ffc0fe
to
7f88179
Compare
Big simplification of makers