Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement MetricUnionAvailability & MetricUnionCompositeTable #193

Merged
merged 14 commits into from Apr 13, 2017
Merged

Implement MetricUnionAvailability & MetricUnionCompositeTable #193

merged 14 commits into from Apr 13, 2017

Conversation

QubitPi
Copy link
Contributor

@QubitPi QubitPi commented Mar 10, 2017

TODO

Copy link
Collaborator

@garyluoex garyluoex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modify tests accordingly.

)
.entrySet()
.stream()
.anyMatch(entry -> entry.getValue() > 1)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indention is a bit weird, otherwise good.

* @param columns The columns associated with all tables
*/
public MetricUnionAvailability(
@NotNull Map<Table, Set<String>> metricAvailability,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change Table to PhysicalTable


this.columns = columns;
this.dataSourceNames = metricAvailability.keySet().stream()
.map(table -> TableName.of(table.getName()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use getTableName method on PhysicalTable

this.metricAvailability = metricAvailability.entrySet().stream()
.collect(
Collectors.toMap(
entry -> PhysicalTable.class.cast(entry.getKey()).getAvailability(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not need to cast

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this is a strange way to cast. Generally, casts in Java are done like:

entry -> (PhysicalTable) entry.getKey()


@Override
public Map<Column, Set<Interval>> getAllAvailableIntervals() {
return metricAvailability.entrySet().stream()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might and will be repeated columns, we need to union those availabilities together I think using group by.

intersectionBetween(getRequestDimensions(), other),
intersectionBetween(getFilterDimensions(), other),
intersectionBetween(getMetricDimensions(), other),
metricNames.stream()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just need the metricNames to intersect other not the dimensions here, change the name to include metricIntersection, leave the dimensions the same.

entry -> constraint.withIntersectedColumnNames(entry.getValue())
)
).entrySet().stream()
.filter(entry -> !entry.getValue().getAllColumnNames().isEmpty())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check if the metric is empty, not including dimensions.

*
* @return The intersection, containing <tt>Dimension</tt> objects, between the two sets
*/
private static Set<Dimension> intersectionBetween(Set<Dimension> first, Set<String> second) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove, no need.

@@ -0,0 +1,203 @@
// Copyright 2016 Yahoo Inc.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

17

@QubitPi
Copy link
Contributor Author

QubitPi commented Mar 14, 2017

The refactor breaks down methods into smaller helper methods so they are more readable and can be tested better.

Tests for getAvailableIntervals will come in the next commit(Although its helper methods are all tested). I don't want the whole PR to be blocked by a single test.


class TestMetricUnionAvailability extends MetricUnionAvailability {

List<TestAvailability> testAvailabilities
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be List<Availability> instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you briefly explain why do we need a TestAvailability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it's explained in the next commit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean... explain why do we need TestAvailability to exist? Explain briefly in this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Availability is an interface. An implementation of it, TestAvailability, can return specified value from getAvailableIntervals. We need Availability.getAvailableIntervals to return something in test

Copy link
Contributor Author

@QubitPi QubitPi Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could've mocked Availability and then Availability.getAvailableIntervals >> ..., but that's not allowed in TestMetricUnionAvailability

Copy link
Collaborator

@garyluoex garyluoex Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to test MetricUnionAvailability directly using mocks for ConcretePhysicalTable or ConcreteAvailability, without having to create another TestMetricUnionAvailability. I will review and figure out how to do this later today, in the mean time try if this is possible

CHANGELOG.md Outdated
@@ -9,6 +9,10 @@ Current
-------

### Added:
- [MetricUnionAvailability](https://github.com/yahoo/fili/pull/193)
* Added `MetricUnionAvailability` which unions multiple different physical tables together into a single table
that contains all columns from all tables so that we can query different columns from different tables at the same time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect.
MetricUnionAvailability joins physical tables with the same dimension schema but non-intersecting metric schemas. Metrics on the MUA are sourced from exactly one of the participating tables.

* An availability which unions multiple different physical tables together into a single table
* that contains all columns from all tables so that we can query different columns
* from different tables at the same time.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong as per the CHANGELOG comment above.

* @return true if the map contains duplicate values or false otherwise
*/
private static boolean hasDuplicateValue(Map<PhysicalTable, Set<String>> map) {
return map.values().stream()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metric Column not string, I think

// validate metric uniqueness such that
// each table's underlying datasource schema do not have repeated metric column
if (hasDuplicateValue(metricAvailability)) {
throw new IllegalArgumentException("Metric columns are not unique.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to actually let them know which metric columns are creating the error. Perhaps collect the duplicated metric names and then format them into the error string.

* @return the new <tt>DataSourceConstraint</tt> instance with a new subset of metric names
*/
public DataSourceConstraint withMetricIntersection(Set<String> other) {
return new DataSourceConstraint(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'other' here is a set of metric names. Please change the variable name.

Set<Dimension> filterDimensions,
Set<Dimension> metricDimensions,
Set<String> metricNames,
Map<Dimension, Set<ApiFilter>> apiFilters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should DSC initialize on String metric names or FieldNames?
We need to discuss this.

@garyluoex garyluoex force-pushed the UDCore3 branch 2 times, most recently from 74b324b to f046ee2 Compare March 17, 2017 00:04
Copy link
Collaborator

@cdeszaq cdeszaq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 1 of comments. Didn't get to the tests.

);
}

dataSourceNames = physicalTables.stream()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to aggregate from availabilities not physical tables. In particular, if this is a union over another union (for example), it should collect all of the leaf names. Please make sure there's a test for this case as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.
This is collecting physical table dictionary names and not datasource names.
We need to call 'getDataSourceNames' on the the underlying availabilities.

/**
* Constructor.
*
* @param physicalTables A set of <tt>PhysicalTable</tt>'s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to give more information about what this set means. The fact that it's a set of physical tables is indicated already by the type of the parameter, so the rest of the comment should describe the semantics of the parameter (ie. the meaning)

* Constructor.
*
* @param physicalTables A set of <tt>PhysicalTable</tt>'s
* @param columns The columns associated with all tables
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what way are they associated? Do all tables need to have these columns? Is this set a superset of all columns? etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should describe this as 'The set of columns supplied by this availability'

/**
* An implementation of availability which which joins physical tables with the same dimension schema
* but non-intersecting metric schemas. Metrics on the <tt>MetricUnionAvailability</tt> are sourced from
* exactly one of the participating tables.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if a table has an availability, but an availability is joining tables, do we have a circle in this dependency? Should availabilities really only do joins of availabilities, or am I confused? @michael-mclawhorn or @garyluoex, can you shed more life on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'physical tables' here is shorthand for the availabilities of physical tables, I suppose.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote for using the longer but more descriptive description, to remove confusion and ambiguity. 5 months from now, when we forget how this is architected, we should be able to read this and understand what is going on.

public MetricUnionAvailability(@NotNull Set<PhysicalTable> physicalTables, @NotNull Set<Column> columns) {
metricColumns = columns.stream()
.filter(column -> column instanceof MetricColumn)
.map(column -> (MetricColumn) column)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use the Utils::getSubsetByType helper method here instead of doing it yourself?

.collect(
Collectors.toMap(
entry -> entry.getKey(),
entry -> entry.getValue(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be method references

@Override
public Map<Column, List<Interval>> getAllAvailableIntervals() {
return metricAvailability.keySet().stream()
.map(availability -> availability.getAllAvailableIntervals().entrySet())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be decomposed to 2 map calls:

.map(Availability::getAllAvailableIntervals)
.map(Map::entrySet)

List<Interval> combined = new ArrayList<>();
combined.addAll(value1);
combined.addAll(value2);
return combined;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this body can be simpler, not needing the extra list construction:

value1.addAll(value2);
return value1;


private final Set<TableName> dataSourceNames;
private final Set<MetricColumn> metricColumns;
private final Map<Availability, Set<MetricColumn>> metricAvailability;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give thses a name that indicates what these things mean rather than what these things are? For example, when metricColumns shows up in the code of this class, it's often unclear what this collection is being used for, and a better name would help that.

* @param allColumnNames Set of all column names
* @param apiFilters Map of dimension to its set of API filters
*/
public DataSourceConstraint(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is only supposed to be used within this class, it should be protected, not public.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-mclawhorn This constructor is a copy-constructor that doesn't go through the same defensive copies that the other constructor does. It's only purpose in life is to support the with* method that this PR added, and should not be called from outside this class.

@QubitPi QubitPi changed the title Implement MetricUnionAvailability Implement MetricUnionAvailability & MetricUnionCompositeTable Mar 24, 2017
@QubitPi QubitPi changed the base branch from UDCore3 to FieldTagging March 24, 2017 21:30
@QubitPi QubitPi changed the base branch from FieldTagging to UDCore3 March 24, 2017 21:30
@garyluoex garyluoex force-pushed the UDCore3 branch 2 times, most recently from 4c65f95 to a36562f Compare March 24, 2017 21:51
@michael-mclawhorn michael-mclawhorn mentioned this pull request Mar 27, 2017
@QubitPi QubitPi changed the base branch from UDCore3 to pr/190 March 27, 2017 22:19
@cdeszaq cdeszaq changed the base branch from pr/190 to master March 28, 2017 15:32
@cdeszaq
Copy link
Collaborator

cdeszaq commented Mar 28, 2017

The branch this had been based on got merged and deleted, which auto-closed this PR. I've re-set it's base to master and reopened it.

@cdeszaq cdeszaq reopened this Mar 28, 2017
@QubitPi QubitPi closed this Mar 29, 2017
@QubitPi QubitPi reopened this Mar 29, 2017
@QubitPi QubitPi changed the base branch from master to MoreSubstantiveApiMetricNameChange March 29, 2017 18:25
@QubitPi QubitPi changed the base branch from MoreSubstantiveApiMetricNameChange to master March 29, 2017 18:25
Copy link
Contributor

@michael-mclawhorn michael-mclawhorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need some clarifications about some of the testing.

CHANGELOG.md Outdated

- [MetricUnionAvailability](https://github.com/yahoo/fili/pull/193)
* Added `MetricUnionAvailability` which joins physical tables with the same dimension schema
but non-intersecting metric schemas. Metrics on the `MetricUnionAvailability` are sourced from exactly one of the participating tables.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space.

* @param name Name of the physical table as TableName, also used as fact table name
* @param columns The columns for this table
* @param physicalTables A set of <tt>PhysicalTable</tt>'s
* @param logicalToPhysicalColumnNames Mappings from logical to physical names
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apostrophe (') should not be here.


/**
* Returns the coarsest <tt>ZonedTimeGrain</tt> among a set of <tt>PhysicalTables</tt>'s.
* <p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No apostophe.
Maybe say 'the coarsest...that satisfies all tables'

* Returns the coarsest <tt>ZonedTimeGrain</tt> among a set of <tt>PhysicalTables</tt>'s.
* <p>
* If the set of <tt>PhysicalTables</tt>'s is empty or the coarsest <tt>ZonedTimeGrain</tt> is not
* compatible with any of the <tt>PhysicalTables</tt>'s, throw <tt>IllegalArgumentException</tt>.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No apostrophe.

*
* @param physicalTables A set of <tt>PhysicalTable</tt>'s among which the coarsest <tt>ZonedTimeGrain</tt>
* is to be returned.
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No apostophe.

class DataSourceConstraintSpec extends Specification {

def "test withIntersectedColumnNames"() {
given:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use @unroll and use #arguments in test definition to explain the parameters of each test case:
e.g.
#metricNames intersected with #other produces #newMetricNames

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ps @cdeszaq hates the use of 'Test' in test names.

interval2 = new Interval('2018-02-01/2018-03-01')
interval3 = new Interval('2019-01-01/2019-02-01')
interval4 = new Interval('2018-11-01/2018-12-01')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what to make of these intervals and the order of the intervals listed here.

Copy link
Contributor Author

@QubitPi QubitPi Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean you were confused about how they were used?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of. They aren't ordered. And they don't overlap at all. So they feel kind of arbitrary. Is there some meaning that they have relative to each other?

* @return A mapping from <tt>Availability</tt> to <tt>DataSourceConstraint</tt> with non-empty metric names
*/
private Map<Availability, DataSourceConstraint> constructSubConstraint(DataSourceConstraint constraint) {
return availabilityToAvailableColumns.entrySet().stream()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable should be plural

}

def "test getAvailableIntervals"() {
given:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test tests only a path with no overlap. Would it make sense to also test when there is an overlap?

}

def "dataSourceNames aggregates from availabilities not physical tables in MetricUnionAvailability constructor"() {
given:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand from the name what this test is testing.

Copy link
Contributor

@michael-mclawhorn michael-mclawhorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty much there.
I'm not sure I agree with the implementations you use for the toString() on the availabilities. Let's talk about that offline.
Get rid of the .get(0) thing by mapping down to grains. And if you can, clarify by name the meanings of the intervals in the test.

@QubitPi QubitPi self-assigned this Apr 11, 2017
@garyluoex garyluoex assigned cdeszaq and unassigned QubitPi Apr 11, 2017
Copy link
Collaborator

@cdeszaq cdeszaq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing big. Just a couple of very small things. I'm good with this once those get a little love.

logicalToPhysicalColumnNames,
new MetricUnionAvailability(physicalTables, columns)
);
verifyGrainSatisfiesAllTables(IntervalUtils.getCoarsestTimeGrain(physicalTables).get(), physicalTables);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than re-computing the grain, just get it from the parent, since it's already built at this point: getSchema().getTimeGrain()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful!

Utils.getSubsetByType(
availability
.getAllAvailableIntervals()
.keySet(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to wrap

/NotABlocker

* Intervals of the same metric column are associated with the same metric column key. Overlapping intervals under
* the same metric column key are collapsed into single interval.
*
* @return a map of metric column to all of its available intervals in union
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This javadoc still needs to be updated to reflect that it's more than just metric column availabilities that are being returned.

* }
* </pre>
* Note that the joining availabilities must have completely different set of metric columns.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would reword this and put it higher. I would say that metric columns available on one availability must not exist on any other availabilities.

* @return duplicate values of <tt>MetricColumn</tt>s
*/
private static Map<MetricColumn, Set<Availability>> getDuplicateValue(
Map<Availability, Set<MetricColumn>> availabilityToAvailableColumns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The should probably be getDuplicateValues (plural)

Copy link
Collaborator

@garyluoex garyluoex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestions, feel free to change.

super(
name,
IntervalUtils.getCoarsestTimeGrain(physicalTables).orElseThrow(() -> {
String message = "At least 1 physical table needs to be provided";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a better error message, at least note the current table name.

IntervalUtils.getCoarsestTimeGrain(physicalTables).orElseThrow(() -> {
String message = "At least 1 physical table needs to be provided";
LOG.error(message);
return new IllegalArgumentException(message);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, it is not clear why to throw a illegal argument here, who is receiving invalid argument? which is a good indicator of something is not right. But I am fine leaving it here.

.map(PhysicalTable::getSchema)
.map(PhysicalTableSchema::getTimeGrain)
.allMatch(grain -> grain.satisfiedBy(coarsestTimeGrain))) {
String message = String.format("There is no mutually satisfying grain among: %s", physicalTables);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably note the current physical table name in the message.

.allMatch(grain -> grain.satisfiedBy(coarsestTimeGrain))) {
String message = String.format("There is no mutually satisfying grain among: %s", physicalTables);
LOG.error(message);
throw new IllegalArgumentException(message);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not really Illegal argument, but argument that is legal but does not make sense are given. I think simply runtime exception is better.

* +---------------+---------------+
* | metricColumn1 | metricColumn2 |
* +---------------+---------------+
* | [1/10] | [20/30] |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not clear what these numbers are when looking at it, it would be nice if it is actually time interval instead of number interval

Map<MetricColumn, Set<Availability>> duplicates = getDuplicateValue(availabilitiesToAvailableColumns);
if (!duplicates.isEmpty()) {
String message = String.format(
"While constructing MetricUnionAvailability, Metric columns are not unique - %s",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should attach table name into the message.

return new SimplifiedIntervalList(
constructSubConstraint(constraints).entrySet().stream()
.map(entry -> entry.getKey().getAvailableIntervals(entry.getValue()))
.map(i -> (Set<Interval>) new HashSet<>(i))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is i? Does method reference HashSet::new work?

availability2 = Mock(Availability)

availability1.getDataSourceNames() >> Sets.newHashSet(TableName.of('source1'))
availability2.getDataSourceNames() >> Sets.newHashSet(TableName.of('source2'))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it turns out that if you wrap the right part with a parenthesis, things will work.

Copy link
Contributor

@michael-mclawhorn michael-mclawhorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few tests that probably should be added here.

I would be willing add some if you wanted to migrate the work to a branch in the main github.

def "checkDuplicateValue returns duplicate metric column names or empty, if no duplicates, from 2 physical tables in the case of #caseDescription"() {
when:
Map<MetricColumn, Set<Availability>> actual = MetricUnionAvailability.getDuplicateValue(
[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed the rename


expect:
metricUnionAvailability.constructSubConstraint(dataSourceConstraint).size() == 2
metricUnionAvailability.getAvailableIntervals(dataSourceConstraint) == new SimplifiedIntervalList(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constructSubConstraint here isn't being tested for trimming down columns. You should add a column to it that gets filtered.

['2017-01-01/2017-02-01'] | ['2017-01-01/2017-01-15'] | ['2017-01-01/2017-01-15'] | "full back overlap (0/10, 0/5)"
['2017-01-01/2017-02-01'] | ['2017-01-15/2017-01-25'] | ['2017-01-15/2017-01-25'] | "fully contain (0/10, 3/9)"
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test that the subconstraints are correctly applied to distinct subavailabilities.

e.g. Have SubAvailability-A (metric1, metric2) , SubAvailability-B (metric3, metric4) and test that DSS with columns (metric1, metric3) calls A with metric1 and B with metric3.

@cdeszaq cdeszaq merged commit 61999bb into yahoo:master Apr 13, 2017
@QubitPi QubitPi deleted the UDCore3-implement-MetricUnionAvailability branch April 13, 2017 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants